= The year 2038 problem =
All 32-bit kernels to date use a signed 32-bit time_t type, which can only represent time until January 2038. Since embedded systems running 32-bit Linux are going to survive beyond that date, we have to change all current uses, in a backwards compatible way.
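The cutoff date can be checked directly. The following is a minimal user-space sketch, not kernel code; the helper names utc_from_seconds and wrapped_next_second are made up for illustration, and it assumes a host with a 64-bit time_t and a compiler that wraps on unsigned-to-signed conversion (as GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Largest value a signed 32-bit time_t can hold. */
#define TIME32_MAX INT32_MAX

/* Break a 64-bit seconds count into UTC calendar fields. */
static struct tm utc_from_seconds(int64_t secs)
{
    /* Assumption: the host running this sketch has a 64-bit time_t. */
    assert(sizeof(time_t) >= sizeof(int64_t));
    time_t t = (time_t)secs;
    return *gmtime(&t);
}

/* Value a 32-bit seconds counter holds one second after TIME32_MAX.
 * The addition is done unsigned so that it wraps instead of overflowing;
 * the conversion back to int32_t is implementation-defined but wraps on
 * the usual Linux compilers. */
static int32_t wrapped_next_second(void)
{
    return (int32_t)((uint32_t)TIME32_MAX + 1u);
}
```

Passing TIME32_MAX to utc_from_seconds gives 2038-01-19 03:14:07 UTC, and the wrapped value one second later lands in December 1901.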
== User space interfaces ==
We will likely keep the 32-bit time_t in all user space interfaces that currently use it, but add new interfaces with a 64-bit timespec or another type that can represent later times. Most importantly that impacts system calls, but also specific ioctl commands and a few other interfaces. User space programs have to be recompiled to use the new interfaces, and the policy of whether to use the old or the new time type is left to the C library. That policy is a complex topic in itself and is not covered here.
=== System calls ===
https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T_0YiQwis has a table of all affected system calls. Here are some explanations:
==== clocks and timers ====
* clock_gettime, clock_settime, clock_adjtime, clock_getres, clock_nanosleep, timer_gettime, timer_settime, timerfd_gettime, timerfd_settime:
These should be done consistently, using either timespec64 or 64-bit nanoseconds; either one works. 64-bit nanoseconds would simplify the kernel internally quite a bit by avoiding the double timekeeping (we keep track of both nanoseconds and timespec in the timekeeper struct). The downside of nanoseconds-only is that each existing caller would need a conversion in user space, where currently we can avoid the expensive ktime_to_ts() for some cases.
* time, stime, gettimeofday, settimeofday, adjtimex, nanosleep, getitimer, setitimer:
all deprecated => wontfix
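To make the trade-off for the clock and timer calls concrete, here is a minimal user-space sketch of the two candidate representations. The struct ts64 type and both helpers are hypothetical stand-ins, not the kernel's timespec64 or its conversion functions; the point is only that going from nanoseconds back to seconds plus nanoseconds costs a 64-bit division, which is the conversion each caller would have to pay for in user space.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical stand-in for a 64-bit timespec. */
struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Seconds/nanoseconds pair to a single signed 64-bit nanosecond count. */
static int64_t ts64_to_ns(struct ts64 ts)
{
    return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Nanosecond count back to a normalized pair (0 <= tv_nsec < 1e9).
 * This direction needs a 64-bit division and remainder. */
static struct ts64 ns_to_ts64(int64_t ns)
{
    struct ts64 ts;

    ts.tv_sec = ns / NSEC_PER_SEC;
    ts.tv_nsec = ns % NSEC_PER_SEC;
    if (ts.tv_nsec < 0) {          /* C division truncates toward zero */
        ts.tv_sec -= 1;
        ts.tv_nsec += NSEC_PER_SEC;
    }
    return ts;
}
```

A signed 64-bit nanosecond count tops out at 9223372036 seconds, so it has a limited range of its own (roughly 292 years on either side of the epoch).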
==== i/o ====
* pselect6, ppoll, io_getevents, recvmmsg:
These currently pass a timespec into the kernel with *relative* timeouts. Internally, they convert it to ktime_t and back on the way out. We have three options:
a. leave as is, get the libc to convert 64-bit timespec to 32-bit timespec on the way into the kernel and back on the way out, which works because the relative timeout will not overflow
a. use ktime_t to make these more efficient in the kernel, at the expense of requiring user space to convert it (all except io_getevents pass back the remaining time).
a. leave the current behavior, but use 64-bit timespec.
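The first option could look roughly like the sketch below on the libc side. All type and function names are hypothetical, and clamping an over-long timeout to the largest 32-bit value is an assumption made here for illustration; the text above only says that realistic relative timeouts do not overflow. The input is assumed to be a valid, non-negative timeout.

```c
#include <stdint.h>

/* Hypothetical user-visible timespec with 64-bit seconds. */
struct user_ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Hypothetical layout of the existing 32-bit kernel interface. */
struct kern_ts32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* On the way in: narrow a relative timeout, clamping at about 68 years. */
static struct kern_ts32 timeout_to_ts32(struct user_ts64 in)
{
    struct kern_ts32 out;

    if (in.tv_sec > INT32_MAX) {
        out.tv_sec = INT32_MAX;
        out.tv_nsec = 999999999;
    } else {
        out.tv_sec = (int32_t)in.tv_sec;
        out.tv_nsec = (int32_t)in.tv_nsec;
    }
    return out;
}

/* On the way out: widen the remaining time reported by the kernel. */
static struct user_ts64 remaining_to_ts64(struct kern_ts32 in)
{
    struct user_ts64 out = { in.tv_sec, in.tv_nsec };
    return out;
}
```

Because the values are relative, the narrowing never loses information for any timeout shorter than 2^31 seconds.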
select, old_select, poll:
==== ipc ====
These get an *absolute* timeout, so we have to change them. Internally they use ktime_t, so that would be the natural interface, but timespec64 would work as well.
This uses a relative timeout that is converted to jiffies internally, so using ktime_t would not be as natural, unless we rewrite the function to use hrtimers.
* msgctl, semctl, shmctl:
These have an output, which is a time_t that stores the absolute seconds value of the last time something happened. Internally this comes from get_seconds(), which has to be efficient anyway. The best way forward is probably to use a structure layout for these that is compatible with what 64-bit architectures do. Note that the structures sometimes have padding to deal with the extension of time_t to 64-bit, but not all architectures have that, and some (notably big-endian arm) have it in the wrong place, so my feeling is that we're better off not using that padding and instead doing something that works for everyone.
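Why the position of the padding matters can be shown with a small user-space sketch. The struct legacy_ds below is a hypothetical layout, not one of the real IPC structures: it places the padding after the 32-bit time field, which only lines up with a 64-bit value on little-endian machines.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit layout: time field first, reserved padding second. */
struct legacy_ds {
    uint32_t stime;   /* 32-bit seconds */
    uint32_t unused;  /* padding intended for a later 64-bit extension */
};

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Read the same eight bytes as one 64-bit value, as a 64-bit layout would. */
static uint64_t read_as_64bit(const struct legacy_ds *ds)
{
    uint64_t v;

    memcpy(&v, ds, sizeof(v));
    return v;
}
```

With stime set to 1000 and the padding zeroed, read_as_64bit returns 1000 on a little-endian host but 1000 shifted left by 32 bits on a big-endian one, where the padding would have to come before the time field instead.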
==== inodes and filesystems ====
* utimensat, fstat64, fstatat64:
Inode timestamps need to represent times before 1970 and far into the future, so we need a 64-bit time_t here; I see no alternative. We therefore have to pass struct timespec64 into utimensat and create version 4 of 'struct stat' to pass into the future fstat and fstatat. I would use a version that matches the 64-bit layout of 'struct stat'.
* utime, utimes, futimesat, oldstat, oldlstat, oldfstat, newstat, newlstat, newfstat, newfstatat, stat64 and lstat64:
These are all deprecated now; we have to stop getting this wrong!
==== tasks ====
These pass a 'struct rusage' that contains a 'struct timeval' with elapsed time. Again there are multiple options:
a. We could change rusage to contain a new 'struct relative_timeval' instead, with an unchanged layout, which makes the format incompatible with a standard libc that uses a 64-bit based timeval.
a. We could make the layout the same as on 64-bit machines, as x32 does, which is again incompatible with POSIX but would work better.
a. We could make the layout what glibc expects, using 64-bit based timeval structures at the beginning.
a. We could define a new structure using pure nanosecond counters.
This passes a relative timespec value in and back out, so we could keep the current layout and have glibc convert it, or change it to something else. The kernel internally converts to jiffies to call schedule_timeout.
This passes a relative *or* absolute timespec in, so we have to change it. The kernel uses ktime_t internally here, so we could make the interface nanosecond based or stick with timespec64.
This returns a timespec with the schedule interval to user space. Using a 32-bit based format is fine here, or we could convert to timespec64. The kernel uses jiffies internally.
replaced by waitid
==== system wide ====
struct sysinfo contains '__kernel_long_t uptime', we can keep that, it's fine.
==== ioctl ====
There are numerous ioctl commands using a time argument. This list is incomplete:
* audio time stamps
* v4l time stamps
* input event time stamps
* socket time stamps
=== memory mapped packet sockets ===
Socket timestamps are exported to user space using a memory mapped interface defined in include/uapi/linux/if_packet.h. There are currently three versions of this interface, all of which use a 32-bit time type. We will likely need a version 4 to solve this.
=== Audit of include/uapi for time_t impact ===
Structure and IOCTL dependency:
struct msqid64_ds (has 2038 padding!)
struct semid64_ds (has 2038 padding!)
== File systems ==
Each file system stores its file modification times in its own format on disk, and a lot of them have the same problem.
|| '''file system''' || '''time type''' || '''expiration year''' ||
|| 9p (9P2000) || unsigned 32-bit seconds || 2106 ||
|| 9p (9P2000.L) || signed 64-bit seconds, ns || never ||
|| adfs || 40-bit cs since 1900 || 2248 ||
|| affs || u32 days/mins/(secs/50) || 11760870 ||
|| afs || unsigned 32-bit seconds || 2106 ||
|| befs || unsigned 48-bit seconds || never ||
|| bfs || unsigned 32-bit seconds || 2106 ||
|| btrfs || signed 64-bit seconds, 32-bit ns || never ||
|| ceph || unsigned 32-bit second/ns || 2106 ||
|| cifs (smb) || 7-bit years since 1980 || 2107 ||
|| cifs (modern) || 64-bit 100ns since 1601 || 30828 ||
|| coda || timespec ioctl || 2038 ||
|| cramfs || fixed || 1970 ||
|| efs || unsigned 32-bit seconds || 2106 ||
|| exofs || signed 32-bit seconds || 2038 ||
|| ext2 || signed 32-bit seconds || 2038 ||
|| ext3 || signed 32-bit seconds || 2038 ||
|| ext4 (good old inodes) || signed 32-bit seconds || 2038 ||
|| ext4 (new inodes) || 34-bit seconds / 30-bit ns (but broken) || 2038 ||
|| f2fs || 64-bit seconds / 32-bit ns || never ||
|| fat || 7-bit years since 1980, 2s resolution || 2107 ||
|| freevxfs || unsigned 32-bit seconds/u32 microseconds || 2106 ||
|| fuse || 64-bit second/32-bit ns || never ||
|| gfs2 || u64 seconds/u32 ns || never ||
|| hfs || u32 seconds since 1904 || 2040 ||
|| hfsplus || u32 seconds since 1904 || 2040 ||
|| hostfs || timespec || 2038 ||
|| hpfs || unsigned 32-bit seconds || 2106 ||
|| isofs || 'char' year since 1900 (fixable) || 2028 (!) ||
|| jffs2 || unsigned 32-bit seconds || 2106 ||
|| jfs || unsigned 32-bit seconds/ns || 2106 ||
|| logfs || signed 64-bit ns || 2262 ||
|| minix || unsigned 32-bit seconds || 2106 ||
|| ncpfs || 7-bit year since 1980 || 2107 ||
|| nfsv2,v3 || unsigned 32-bit seconds/ns || 2106 ||
|| nfsv4 || u64 seconds/u32 ns || never ||
|| nfsd || unsigned 32-bit seconds/ns || 2106 ||
|| nilfs2 || u64 seconds/u32 ns || never ||
|| ntfs || 64-bit 100ns since 1601 || 30828 ||
|| ocfs2 || 34-bit seconds/30-bit ns || 2514 ||
|| omfs || 64-bit milliseconds || never ||
|| pstore || ascii seconds || 2106 ||
|| qnx4 || unsigned 32-bit seconds || 2106 ||
|| qnx6 || unsigned 32-bit seconds || 2106 ||
|| reiserfs || unsigned 32-bit seconds || 2106 ||
|| romfs || fixed || 1970 ||
|| squashfs || unsigned 32-bit seconds || 2106 ||
|| sysv || unsigned 32-bit seconds || 2106 ||
|| ubifs || u64 second/u32 ns || never ||
|| udf || u16 year || 2038 ||
|| ufs1 || unsigned 32-bit seconds || 2106 ||
|| ufs2 || signed 64-bit seconds/u32 ns || never ||
|| xfs || signed 32-bit seconds/ns || 2038 ||
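Most of the expiration years in the table follow from the epoch, the number of value bits, and the tick length. The sketch below reproduces that arithmetic; the function name expiry_year and the use of a mean Gregorian year (365.2425 days) are choices made for this illustration, so the result is an approximation that is only meaningful to the year.

```c
#include <assert.h>
#include <stdint.h>

/* Mean Gregorian year in seconds (365.2425 days). */
#define SECONDS_PER_YEAR 31556952.0

/*
 * Approximate year in which a counter with `value_bits` bits of magnitude
 * (exclude the sign bit for signed types) overflows, given the epoch year
 * and the number of ticks per second.
 */
static int expiry_year(int epoch_year, int value_bits, double ticks_per_second)
{
    assert(value_bits > 0 && value_bits <= 64);
    assert(ticks_per_second > 0.0);

    double ticks = 1.0;
    for (int i = 0; i < value_bits; i++)
        ticks *= 2.0;                       /* 2^value_bits */

    return epoch_year + (int)(ticks / ticks_per_second / SECONDS_PER_YEAR);
}
```

For example, expiry_year(1970, 32, 1.0) gives 2106 for unsigned 32-bit seconds, expiry_year(1970, 31, 1.0) gives 2038 for signed 32-bit seconds, expiry_year(1904, 32, 1.0) gives 2040 for hfs, and expiry_year(1601, 63, 1e7) gives 30828 for signed 64-bit counts of 100 ns.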
== Tasks ==
The task list is for people who want to get involved. There will be many more tasks over time, so this is just a starting point. In the end, we should remove all instances of 'time_t', 'timespec' and 'timeval' from the kernel.
=== Small tasks ===
* Find a driver using time_t/timespec/timeval internally and convert it to ktime_t/timespec64, examples:
  * drivers/staging/media/lirc/lirc_imon.c (timeval, trivial)
  * drivers/staging/ft1000/ (time_t and timeval)
  * drivers/staging/android/sync_debug.c (timeval, very easy)
  * drivers/staging/android/timed_gpio.c (timeval, easy)
  * drivers/staging/bcm/LeakyBucket.c (timeval, slightly tricky)
  * drivers/staging/bcm/Bcmchar.c (timeval, very easy)
  * drivers/staging/comedi/drivers/comedi_test.c (timeval)
  * drivers/staging/comedi/drivers/serial2002.c (timeval, easy)
  * drivers/staging/dgnc/dgnc_tty.c (timeval, very easy)
  * drivers/staging/gdm72xx/ (timeval, easy)
  * drivers/staging/media/lirc/lirc_igorplugusb.c (timeval)
  * drivers/staging/media/lirc/lirc_parallel.c (timeval, easy)
  * drivers/staging/media/lirc/lirc_sasem.c (timeval, very easy)
  * drivers/staging/media/lirc/lirc_serial.c (timeval, easy)
  * drivers/staging/media/lirc/lirc_sir.c (timeval)
  * drivers/staging/rts5208/rtsx.h (timeval) [Status: Completed, Ksenija Stanojevic]
  * drivers/staging/olpc_dcon/olpc_dcon.c (timespec, rather broken)
  * drivers/staging/ozwpan/ozhcd.c (timespec)
  * drivers/staging/ozwpan/ozproto.c (timespec)
  * kernel/cpuset.c (time_t) [Status: Completed, Heena Sirwani]
  * fs/reiserfs/journal.c (time_t)
  * drivers/scsi/ips.c (time_t)
  * sound/pci/es1968.c (timeval) [Status: Completed, Tina Ruchandani]
  * kernel/power/hibernate.c (timeval) [Status: Completed, Tina Ruchandani]
  * drivers/s390/net/ctcm_fsms.c (timespec) [Status: Completed, Aya Mahfouz]
  * drivers/power/ab8500_fg.c (timespec) [Status: Completed, Ebru Akagunduz]
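Many of the driver conversions listed above replace a timeval that is only used to measure an interval. Kernel code cannot be run standalone, so the following is a user-space analogue only: it uses the POSIX clock_gettime() call with CLOCK_MONOTONIC and a signed 64-bit nanosecond count, which is similar in spirit to ktime_t. The helper names monotonic_ns and elapsed_us are made up for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current monotonic time as a signed 64-bit nanosecond count. */
static int64_t monotonic_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Interval between two nanosecond readings, in whole microseconds. */
static int64_t elapsed_us(int64_t start_ns, int64_t end_ns)
{
    return (end_ns - start_ns) / 1000;
}
```

Taking the difference of two monotonic readings avoids both the 32-bit seconds field and jumps caused by the wall clock being set.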
=== Medium tasks ===
* Modify an ioctl interface in a driver to support both 32- and 64-bit time interfaces, examples:
  * drivers/staging/comedi/comedi_fops.c (INSN_GTOD, timeval)
  * drivers/staging/android/alarm-dev.c (timespec)
  * include/uapi/linux/atm_zatm.h (zatm_t_hist/timeval)
  * include/uapi/linux/videodev2.h (v4l2_buffer/timespec)
* Fix the android logger time format (drivers/staging/android/logger.c)
* Convert the internal timekeeping in fs/nfsd
* Convert all 'ptp' users in the kernel
* Convert all 'struct key' users (time_t)
* Introduce known unsafe types (possibly like __kernel_time32_t, __kernel_compat_time_t etc) so we can annotate interfaces that are known to use a fixed size and cannot be changed to new types.
* Fix all time issues in drivers/staging/lustre (maybe advanced task)
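One possible way for a driver to support both 32-bit and 64-bit time in an ioctl relies on the fact that the Linux _IOR() macro encodes the argument size in the command number. The sketch below is an assumption about how such a conversion could be structured, not a description of any existing driver; the structures, the command names, and the 'x' magic number are all hypothetical.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical payloads with fixed-size time fields. */
struct demo_stamp32 {
    int32_t sec;
    int32_t usec;
};

struct demo_stamp64 {
    int64_t sec;
    int64_t usec;
};

/* Same magic and number; the encoded size makes the commands distinct. */
#define DEMO_GET_STAMP32 _IOR('x', 1, struct demo_stamp32)
#define DEMO_GET_STAMP64 _IOR('x', 1, struct demo_stamp64)

/* Payload width in bytes that a handler should copy out, or 0 if unknown. */
static unsigned int demo_stamp_width(unsigned long cmd)
{
    switch (cmd) {
    case DEMO_GET_STAMP32:
        return sizeof(struct demo_stamp32);
    case DEMO_GET_STAMP64:
        return sizeof(struct demo_stamp64);
    default:
        return 0;
    }
}
```

Old binaries keep issuing the 32-bit command and newly compiled ones issue the 64-bit command, so a single handler can serve both. Fixed-width field types play the role of the annotated "known unsafe" types mentioned in the list above.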
=== Advanced tasks ===
* Introduce a new system call family to replace one or more of the problematic calls listed above.
* Change the on-disk layout of a broken file system to optionally support longer time stamps
* Port a small C library (uClibc, newlib, ...) to optionally use 64-bit time_t and build an embedded distribution (openembedded, openwrt, buildroot, ...) with this.
=== Tasks later in the project ===
* Hook up all 32-bit architectures to use the new system calls
* Introduce a Kconfig symbol to disable all code that has not yet been converted at compile time.