The year 2038 problem
All 32-bit kernels to date use a signed 32-bit time_t type, which can only represent time until January 2038. Since embedded systems running 32-bit Linux are going to survive beyond that date, we have to change all current uses, in a backwards compatible way.
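The limit can be checked with a few lines of C. This is only a sketch: the helper names `fits_time32` and `last_time32_utc` are made up for illustration and are not kernel or libc interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Returns 1 if a 64-bit second count still fits in a signed 32-bit time_t. */
int fits_time32(int64_t seconds)
{
    return seconds >= INT32_MIN && seconds <= INT32_MAX;
}

/* Breaks the last second a signed 32-bit time_t can hold into UTC fields.
 * Returns 0 on success, -1 if gmtime() fails. */
int last_time32_utc(struct tm *out)
{
    time_t last = (time_t)INT32_MAX;   /* 2147483647 seconds after the epoch */
    const struct tm *tm = gmtime(&last);

    assert(out != NULL);
    if (tm == NULL)
        return -1;
    *out = *tm;
    return 0;
}
```

Calling `last_time32_utc()` yields 2038-01-19 03:14:07 UTC, and `fits_time32()` rejects the very next second.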
User space interfaces
We will likely keep the 32-bit time_t in all user space interfaces that currently use it, but add new interfaces with a 64-bit timespec or another type that can represent later times. Most importantly that impacts system calls, but also specific ioctl commands and a few other interfaces. User space programs have to be recompiled to use the new interfaces, and the policy whether to use the old or the new time type is left to the C library. While that policy is a complex topic itself, we don't cover it here.
System calls
https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T_0YiQwis has a table of all affected system calls. Here are some explanations:
clocks and timers
clock_gettime, clock_settime, clock_adjtime, clock_getres, clock_nanosleep, timer_gettime, timer_settime, timerfd_gettime, timerfd_settime:
- These should be done consistently, using either timespec64 or 64-bit nanoseconds; either one works. 64-bit nanoseconds would simplify the kernel internally quite a bit by avoiding the double timekeeping (we keep track of both nanoseconds and timespec in the timekeeper struct). The downside of nanoseconds-only is that each existing caller would need a conversion in user space, where currently we can avoid the expensive ktime_to_ts() for some cases.
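To make the trade-off concrete, here is a rough user space sketch of the two conversions. The `struct ts64` layout and both function names are assumptions for illustration, not the kernel's timespec64 or ktime_to_ts() code.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical user space mirror of a 64-bit timespec. */
struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;   /* 0 .. 999999999 */
};

/* timespec -> nanoseconds: one multiply and one add.
 * Only valid while the result fits in 64 bits (about +/- 292 years). */
int64_t ts64_to_ns(struct ts64 ts)
{
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
    return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* nanoseconds -> timespec: needs a 64-bit division and remainder,
 * which is the step that is costly on 32-bit CPUs. */
struct ts64 ns_to_ts64(int64_t ns)
{
    struct ts64 ts;

    ts.tv_sec = ns / NSEC_PER_SEC;
    ts.tv_nsec = ns % NSEC_PER_SEC;
    if (ts.tv_nsec < 0) {   /* keep tv_nsec non-negative before the epoch */
        ts.tv_sec -= 1;
        ts.tv_nsec += NSEC_PER_SEC;
    }
    return ts;
}
```

With a nanosecond-only interface, every caller that wants a timespec would pay for something like `ns_to_ts64()` in user space.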
time, stime, gettimeofday, settimeofday, adjtimex, nanosleep, getitimer, setitimer:
all deprecated => wontfix
i/o
pselect6, ppoll, io_getevents, recvmmsg:
- These currently pass a timespec into the kernel with *relative* timeouts. Internally, they convert it to ktime_t and back on the way out. We have three options:
  - leave as is, get the libc to convert 64-bit timespec to 32-bit timespec on the way into the kernel and back on the way out, which works because the relative timeout will not overflow
  - ... expense of requiring user space to convert it (all except io_getevents pass back the remaining time).
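The "leave as is" option could look roughly like the following inside a C library wrapper. Both structure layouts and both function names are hypothetical, and clamping an oversized timeout to INT32_MAX seconds is an assumption of this sketch, not a decided policy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout used by recompiled user space. */
struct user_ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;
};

/* Hypothetical layout expected by the unchanged 32-bit system call. */
struct kernel_ts32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Narrow a relative timeout on the way into the kernel.
 * Returns 0 on success, -1 for an invalid timeout.
 * Assumption: timeouts above INT32_MAX seconds (about 68 years) are clamped. */
int ts64_to_ts32(const struct user_ts64 *in, struct kernel_ts32 *out)
{
    assert(in && out);
    if (in->tv_sec < 0 || in->tv_nsec < 0 || in->tv_nsec >= 1000000000)
        return -1;
    if (in->tv_sec > INT32_MAX) {
        out->tv_sec = INT32_MAX;
        out->tv_nsec = 999999999;
    } else {
        out->tv_sec = (int32_t)in->tv_sec;
        out->tv_nsec = (int32_t)in->tv_nsec;
    }
    return 0;
}

/* Widen the remaining time on the way back out; this can never overflow. */
void ts32_to_ts64(const struct kernel_ts32 *in, struct user_ts64 *out)
{
    assert(in && out);
    out->tv_sec = in->tv_sec;
    out->tv_nsec = in->tv_nsec;
}
```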
select, old_select, pselect6: deprecated
ipc
mq_timedsend, mq_timedreceive: These get an *absolute* timeout, so we have to change them. Internally they use ktime_t, so that would be the natural interface, but timespec64 would work as well.
semtimedop: This uses a relative timeout that is converted to jiffies internally, so using ktime_t would not be as natural, unless we rewrite the function to use hrtimers.
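As a rough model of the jiffies conversion mentioned for semtimedop, the sketch below rounds a relative timeout up to whole ticks. The tick rate `MODEL_HZ` and the function name are assumptions; this is not the kernel's timespec_to_jiffies().

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_HZ     100             /* assumed tick rate, 10 ms per jiffy */
#define NSEC_PER_SEC 1000000000LL

/* Convert a relative timeout to a tick count, rounding the sub-second
 * part up so that the timeout never expires early. */
uint64_t model_timeout_to_jiffies(int64_t sec, int64_t nsec)
{
    const int64_t nsec_per_jiffy = NSEC_PER_SEC / MODEL_HZ;

    assert(sec >= 0 && nsec >= 0 && nsec < NSEC_PER_SEC);
    return (uint64_t)sec * MODEL_HZ +
           (uint64_t)((nsec + nsec_per_jiffy - 1) / nsec_per_jiffy);
}
```

Because the result is a coarse tick count anyway, a nanosecond-based ktime_t argument would add precision that this code path throws away again.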
msgctl, semctl, shmctl: These have an output, which is a time_t that stores the absolute seconds value of the last time something happened. Internally this comes from get_seconds(), which has to be efficient anyway. The best way forward is probably to use a structure layout for these that is compatible with what 64-bit architectures do. Note that the structures sometimes have padding to deal with the extension of time_t to 64-bit, but not all architectures have that, and some (notably big-endian arm) have it in the wrong place, so my feeling is that we're better off not using that padding and instead doing something that works for everyone.
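The padding problem can be illustrated with a small model. `struct legacy_stamp` is a made-up fragment, not one of the real IPC structures; it only shows what happens when a 32-bit time field and the padding word after it are reinterpreted as one 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fragment: a 32-bit time field followed by a padding word
 * that was reserved for a later extension to 64 bits. */
struct legacy_stamp {
    uint32_t time;
    uint32_t unused;
};

enum byte_order { ORDER_LITTLE, ORDER_BIG };

/* Value seen when both words are loaded as a single 64-bit integer
 * on a machine with the given byte order. */
uint64_t load_as_u64(const struct legacy_stamp *s, enum byte_order order)
{
    assert(s);
    if (order == ORDER_BIG)   /* first word in memory is the high half */
        return ((uint64_t)s->time << 32) | s->unused;
    return ((uint64_t)s->unused << 32) | s->time;
}
```

With the padding placed after the time field, a little-endian machine reads the expected value, while a big-endian machine sees the seconds shifted into the upper half.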
inodes and filesystems
utimensat, fstat64, fstatat64:
Inode timestamps need to represent times before 1970 and way into the future, so we need a 64-bit time_t here. I see no other alternative, so we have to pass struct timespec64 into utimensat, and create version 4 of 'struct stat' to pass into the future fstat and fstatat. I would use a version that matches the 64-bit layout of 'struct stat'.
utime, utimes, futimesat, oldstat, oldlstat, oldfstat, newstat, newlstat, newfstat, newfstatat, stat64 and lstat64: these are all deprecated now, we have to stop getting this wrong!
tasks
getrusage, waitid: these pass a 'struct rusage' that contains a 'struct timeval' with elapsed time. Again there are multiple options:
- We could change rusage to contain a new 'struct relative_timeval' instead, with an unchanged layout, which makes the format incompatible with a standard libc that uses a 64-bit based timeval.
- We could make the layout the same as on 64-bit machines, as x32 does, which is again incompatible with posix but would work better.
- We could make the layout what glibc expects, using 64-bit based timeval structures at the beginning.
- We could define a new structure using pure nanosecond counters.
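The last option might be sketched as follows. `struct rusage_ns` and both helpers are hypothetical and only cover the two time fields; the remaining rusage members are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical replacement with pure nanosecond counters (time fields only). */
struct rusage_ns {
    uint64_t utime_ns;   /* user CPU time */
    uint64_t stime_ns;   /* system CPU time */
};

/* Elapsed times are never negative, so an unsigned counter is enough. */
uint64_t timeval_to_ns(const struct timeval *tv)
{
    assert(tv->tv_sec >= 0 && tv->tv_usec >= 0 && tv->tv_usec < 1000000);
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/* Convert the time fields of a classic rusage into the nanosecond form. */
struct rusage_ns rusage_to_ns(const struct rusage *ru)
{
    struct rusage_ns out;

    out.utime_ns = timeval_to_ns(&ru->ru_utime);
    out.stime_ns = timeval_to_ns(&ru->ru_stime);
    return out;
}
```

A 64-bit nanosecond counter covers several hundred years of CPU time, so the layout would be the same on 32-bit and 64-bit machines.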
rt_sigtimedwait: This passes a relative timespec value in back out, so we could keep the current layout and have glibc convert it, or change it to something else. The kernel internally converts to jiffies to call schedule_timeout.
futex: this passes a relative *or* absolute timespec in, so we have to change it. The kernel uses ktime_t internally here, so we could make the interface nanosecond based or stick with timespec64.
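The difference between the two futex timeout kinds can be shown with a small sketch. The `struct ts64` type and both function names are illustrative assumptions, not the futex interface itself.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical 64-bit timespec used only for this illustration. */
struct ts64 {
    int64_t tv_sec;
    int64_t tv_nsec;   /* 0 .. 999999999 */
};

/* Turn a relative timeout into an absolute deadline, carrying nanoseconds. */
struct ts64 absolute_deadline(struct ts64 now, struct ts64 rel)
{
    struct ts64 d;

    assert(now.tv_nsec >= 0 && now.tv_nsec < NSEC_PER_SEC);
    assert(rel.tv_nsec >= 0 && rel.tv_nsec < NSEC_PER_SEC);
    d.tv_sec = now.tv_sec + rel.tv_sec;
    d.tv_nsec = now.tv_nsec + rel.tv_nsec;
    if (d.tv_nsec >= NSEC_PER_SEC) {
        d.tv_sec += 1;
        d.tv_nsec -= NSEC_PER_SEC;
    }
    return d;
}

/* Returns 1 if the seconds field no longer fits a signed 32-bit time_t. */
int needs_64bit_seconds(struct ts64 t)
{
    return t.tv_sec > INT32_MAX || t.tv_sec < INT32_MIN;
}
```

A ten second relative timeout always fits in 32 bits, but the absolute deadline computed from it a few seconds before the 2038 limit does not, which is why the absolute form forces an interface change.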
sched_rr_get_interval: This returns a timespec with the schedule interval to user space. Using a 32-bit based format is fine here, or we could convert to timespec64. The kernel uses jiffies internally.
wait4: replaced by waitid
system wide
sysinfo: struct sysinfo contains 'kernel_long_t uptime', we can keep that, it's fine.
There are numerous ioctl commands using a time argument. This list is incomplete.
Socket timestamps are exported to user space using a memory mapped interface defined in include/uapi/linux/if_packet.h. There are currently three versions of this interface, all use a 32-bit time type. We will likely need a version 4 to solve this.
Structure and IOCTL dependency:
Each file system stores its file modification times in its own format on disk, and a lot of them have the same problem.

file system               time type                                   expiration year
9p (9P2000)               unsigned 32-bit seconds                     2106
9p (9P2000.L)             signed 64-bit seconds, ns                   never
adfs                      40-bit cs since 1900                        2248
affs                      u32 days/mins/(secs/50)                     11760870
afs                       unsigned 32-bit seconds                     2106
befs                      unsigned 48-bit seconds                     never
bfs                       unsigned 32-bit seconds                     2106
btrfs                     signed 64-bit seconds, 32-bit ns            never
ceph                      unsigned 32-bit second/ns                   2106
cifs (smb)                7-bit years since 1980                      2107
cifs (modern)             64-bit 100ns since 1601                     30328
coda                      timespec ioctl                              2038
cramfs                    fixed                                       1970
efs                       unsigned 32-bit seconds                     2106
exofs                     signed 32-bit seconds                       2038
ext2                      signed 32-bit seconds                       2038
ext3                      signed 32-bit seconds                       2038
ext4 (good old inodes)    signed 32-bit seconds                       2038
ext4 (new inodes)         34 bit seconds / 30-bit ns (but broken)     2038
f2fs                      64-bit seconds / 32-bit ns                  never
fat                       7-bit years since 1980, 2s resolution       2107
freevxfs                  unsigned 32-bit seconds/u32 microseconds    2106
fuse                      64-bit second/32-bit ns                     never
gfs2                      u64 seconds/u32 ns                          never
hfs                       u32 seconds since 1904                      2040
hfsplus                   u32 seconds since 1904                      2040
hostfs                    timespec                                    2038
hpfs                      unsigned 32-bit seconds                     2106
isofs                     'char' year since 1900 (fixable)            2028
jffs2                     unsigned 32-bit seconds                     2106
jfs                       unsigned 32-bit seconds/ns                  2106
logfs                     signed 64-bit ns                            2262
minix                     unsigned 32-bit seconds                     2106
ncpfs                     7-bit year since 1980                       2107
nfsv2,v3                  unsigned 32-bit seconds/ns                  2106
nfsv4                     u64 seconds/u32 ns                          never
nfsd                      unsigned 32-bit seconds/ns                  2106
nilfs2                    u64 seconds/u32 ns                          never
ntfs                      64-bit 100ns since 1601                     30828
ocfs2                     34-bit seconds/30-bit ns                    2514
omfs                      64-bit milliseconds                         never
pstore                    ascii seconds                               2106
qnx4                      unsigned 32-bit seconds                     2106
qnx6                      unsigned 32-bit seconds                     2106
reiserfs                  unsigned 32-bit seconds                     2106
romfs                     fixed                                       1970
squashfs                  unsigned 32-bit seconds                     2106
sysv                      unsigned 32-bit seconds                     2106
ubifs                     u64 second/u32 ns                           never
udf                       u16 year                                    2038
ufs1                      unsigned 32-bit seconds                     2106
ufs2                      signed 64-bit seconds/u32 ns                never
xfs                       signed 32-bit seconds/ns                    2106
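Several of the expiration years for second-based formats can be reproduced with simple arithmetic. The sketch below is an approximation using the average Gregorian year length; the function name `expiry_year` is made up for illustration, and results could be off by one for counters that run out very close to a year boundary.

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_YEAR 31556952ULL   /* average Gregorian year, 365.2425 days */

/* Approximate year in which a counter of `bits` value bits, counting
 * seconds from January 1 of `epoch_year`, runs out.
 * For a signed 32-bit field pass bits = 31. */
unsigned expiry_year(unsigned epoch_year, unsigned bits)
{
    uint64_t range;

    assert(bits > 0 && bits < 64);
    range = (uint64_t)1 << bits;
    return epoch_year + (unsigned)(range / SECONDS_PER_YEAR);
}
```

For example, `expiry_year(1970, 31)` gives 2038 for signed 32-bit seconds, `expiry_year(1970, 32)` gives 2106 for unsigned 32-bit seconds, `expiry_year(1904, 32)` gives 2040 for hfs, and `expiry_year(1970, 34)` gives 2514 for the 34-bit seconds used by ocfs2.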
The task list is for people that want to get involved, there will be many more tasks over time, so this is just a starting point. In the end, we should remove all instances of 'time_t', 'timespec' and 'timeval' from the kernel.
Introduce known unsafe types (possibly like
ioctl
memory mapped packet sockets
Audit of include/uapi for time_t impact
time_t
struct msqid64_ds (has 2038 padding!)
struct semid64_ds (has 2038 padding!)
struct cyclades_idle_stats
struct video_event
VIDEO_GET_EVENT
struct msqid_ds
struct ppp_idle
PPPIOCGIDLE
struct semid_ds
union semun
struct timespec
SIOCGSTAMPNS
struct coda_vattr
...
struct scm_timestamping
struct som_hdr
struct itimerspec
struct v4l2_event
VIDIOC_DQEVENT
struct snd_pcm_status
SNDRV_PCM_IOCTL_STATUS
struct snd_pcm_mmap_status
struct snd_pcm_sync_ptr
SNDRV_PCM_IOCTL_SYNC_PTR
struct snd_rawmidi_status
SNDRV_RAWMIDI_IOCTL_STATUS
struct snd_timer_status
SNDRV_TIMER_IOCTL_STATUS
struct snd_timer_tread
struct snd_ctl_elem_value
SNDRV_CTL_IOCTL_ELEM_READ
SNDRV_CTL_IOCTL_ELEM_WRITE
struct timeval
SIOCGSTAMP
struct zatm_t_hist
struct bcm_msg_head
struct elf_prstatus
struct input_event
struct omap3isp_stat_data
VIDIOC_OMAP3ISP_STAT_REQ
PPGETTIME
PPSETTIME
struct rusage
struct itimerval
struct timex
struct v4l2_buffer
VIDIOC_QUERYBUF
VIDIOC_QBUF
VIDIOC_DQBUF
VIDIOC_PREPARE_BUF
struct utimbuf
File systems
Tasks
Small tasks
Medium tasks
Introduce known unsafe types (possibly like kernel_time32_t, kernel_compat_time_t etc) so we can annotate interfaces that are known to use a fixed size and cannot be changed to new types.
Advanced tasks
Tasks later in the project