The year 2038 problem

All 32-bit kernels to date use a signed 32-bit time_t type, which can only represent time until January 2038. Since embedded systems running 32-bit Linux are going to survive beyond that date, we have to change all current uses, in a backwards compatible way.
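The limit can be checked with a few lines of C. This is only an illustrative sketch: the helper names format_utc() and wrap_after_max() are made up for this page, and the code assumes it runs on a host whose own time_t is 64 bits wide.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Format a seconds-since-1970 value as a UTC date string.
 * Returns 0 on success, -1 if the value cannot be converted.
 * Assumption: the host time_t is 64-bit, so the cast does not truncate. */
int format_utc(int64_t secs, char *buf, size_t len)
{
	time_t t = (time_t)secs;
	struct tm tm;

	assert(buf != NULL && len > 0);
	if (gmtime_r(&t, &tm) == NULL)
		return -1;
	return strftime(buf, len, "%Y-%m-%d %H:%M:%S", &tm) > 0 ? 0 : -1;
}

/* The value a signed 32-bit time_t holds one second after its maximum.
 * The addition is done in unsigned arithmetic to avoid undefined
 * behaviour; the result is then reinterpreted as a signed value, which
 * wraps to the most negative number on two's-complement machines. */
int32_t wrap_after_max(void)
{
	uint32_t next = (uint32_t)INT32_MAX + 1u;

	return (int32_t)next;
}
```

Passing INT32_MAX to format_utc() gives 2038-01-19 03:14:07, the last second a signed 32-bit time_t can represent. Passing wrap_after_max() gives 1901-12-13 20:45:52, which is where a wrapped clock would end up.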

User space interfaces

We will likely keep the 32-bit time_t in all user space interfaces that currently use it, but add new interfaces with a 64-bit timespec or another type that can represent later times. Most importantly, that impacts system calls, but also specific ioctl commands and a few other interfaces. User space programs have to be recompiled to use the new interfaces, and the policy of whether to use the old or the new time type is left to the C library. While that policy is a complex topic in itself, we don't cover it here.

System calls

https://docs.google.com/spreadsheets/d/1HCYwHXxs48TsTb6IGUduNjQnmfRvMPzCN6T_0YiQwis has a table of all affected system calls. Here are some explanations:

clocks and timers

clock_gettime, clock_settime, clock_adjtime, clock_getres, clock_nanosleep, timer_gettime, timer_settime, timerfd_gettime, timerfd_settime:

  • These should be done consistently, using either timespec64 or 64-bit nanoseconds; either one works. 64-bit nanoseconds would simplify the kernel internally quite a bit by avoiding the double timekeeping (we keep track of both nanoseconds and timespec in the timekeeper struct). The downside of nanoseconds-only is that each existing caller would need a conversion in user space, where currently we can avoid the expensive ktime_to_ts() for some cases. Current patches use 64-bit timespec.
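To give an idea of the user space conversion that a nanoseconds-only interface would require, here is a minimal sketch. The struct ts64 type and both function names are hypothetical stand-ins, not the kernel's timespec64 definition or any libc API.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical user-space stand-in for a 64-bit timespec. */
struct ts64 {
	int64_t tv_sec;
	int64_t tv_nsec;	/* always in the range 0..999999999 */
};

/* Split a signed 64-bit nanosecond count into seconds and nanoseconds. */
struct ts64 ns_to_ts64(int64_t ns)
{
	struct ts64 ts;

	ts.tv_sec = ns / NSEC_PER_SEC;
	ts.tv_nsec = ns % NSEC_PER_SEC;
	if (ts.tv_nsec < 0) {
		/* C division truncates toward zero, so normalize values
		 * before the epoch to keep tv_nsec non-negative. */
		ts.tv_sec -= 1;
		ts.tv_nsec += NSEC_PER_SEC;
	}
	return ts;
}

/* Combine seconds and nanoseconds into one nanosecond count.
 * Only valid while the result fits in 64 bits, which is roughly
 * 292 years on either side of the epoch. */
int64_t ts64_to_ns(struct ts64 ts)
{
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < NSEC_PER_SEC);
	return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}
```

The division and modulo in ns_to_ts64() are the cost that every caller wanting a timespec would pay with a nanoseconds-only interface.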

time, stime, gettimeofday, settimeofday, adjtimex, nanosleep, getitimer, setitimer:

  • all deprecated => wontfix

i/o

pselect6, ppoll, io_getevents, recvmmsg:

  • These currently pass a timespec into the kernel with *relative* timeouts. Internally, they convert it to ktime_t and back on the way out. We have three options:

  1. Leave the current behavior, but use 64-bit timespec. This is what the current patches do.

  2. Leave as is, and get the libc to convert 64-bit timespec to 32-bit timespec on the way into the kernel and back on the way out, which works because the relative timeout will not overflow.

  3. Use ktime_t to make these more efficient in the kernel, at the expense of requiring user space to convert it (all except io_getevents pass back the remaining time).
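The libc-side conversion from the second option could look roughly like the sketch below. The struct and function names are hypothetical, and clamping an oversized timeout to the largest 32-bit value (about 68 years) is an assumption of this sketch, not something any patch set prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical layouts: what the existing syscall takes (32-bit) and
 * what a libc with 64-bit time_t would hand to applications. */
struct timeout32 { int32_t tv_sec; int32_t tv_nsec; };
struct timeout64 { int64_t tv_sec; int64_t tv_nsec; };

/* Narrow a relative timeout on the way into the kernel.
 * Returns 0 on success, -1 if the input is not a valid timeout. */
int timeout64_to_32(const struct timeout64 *in, struct timeout32 *out)
{
	assert(in != NULL && out != NULL);
	if (in->tv_sec < 0 || in->tv_nsec < 0 || in->tv_nsec >= NSEC_PER_SEC)
		return -1;
	if (in->tv_sec > INT32_MAX) {
		/* Assumption: a timeout this long is clamped, not rejected. */
		out->tv_sec = INT32_MAX;
		out->tv_nsec = NSEC_PER_SEC - 1;
		return 0;
	}
	out->tv_sec = (int32_t)in->tv_sec;
	out->tv_nsec = (int32_t)in->tv_nsec;
	return 0;
}

/* Widen the remaining time on the way back out of the kernel. */
void timeout32_to_64(const struct timeout32 *in, struct timeout64 *out)
{
	assert(in != NULL && out != NULL);
	out->tv_sec = in->tv_sec;
	out->tv_nsec = in->tv_nsec;
}
```

Because the values are relative, any realistic timeout fits into the 32-bit form unchanged; only the clamp branch treats very large values specially.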

select, old_select, poll:

  • deprecated

ipc

mq_timedsend, mq_timedreceive:

  • These get an *absolute* timeout, so we have to change them. Internally they use ktime_t, so that would be the natural interface, but timespec64 would work as well. The patches currently use a 64-bit timespec.

semtimedop:

  • This uses a relative timeout that is converted to jiffies internally, so using ktime_t would not be as natural, unless we rewrite the function to use hrtimers. The current patches simply use a 64-bit timespec.

msgctl, semctl, shmctl:

  • These have an output, which is a time_t that stores the absolute seconds value of the last time something happened. Internally this comes from get_seconds(), which has to be efficient anyway. There is sufficient padding available on all architectures (except MIPS) to extend the structure in a backwards-compatible way to 64-bit time values. Because of endianness concerns, we unfortunately cannot simply replace time_t with time64_t, but have to pass the low and high 32-bit components separately. On MIPS, we can only use 48-bit times for some structures, but that should be good enough as well.
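Passing the low and high halves separately can be sketched as follows. The function names are hypothetical; in a real structure the low half would stay in the existing time_t slot and the high half would go into the former padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit seconds value into two 32-bit halves. Storing the
 * halves as separate fields keeps the layout identical on little-endian
 * and big-endian machines. */
void split_time64(int64_t t, uint32_t *lo, uint32_t *hi)
{
	uint64_t u = (uint64_t)t;

	assert(lo != NULL && hi != NULL);
	*lo = (uint32_t)(u & 0xffffffffu);
	*hi = (uint32_t)(u >> 32);
}

/* Reassemble the 64-bit value, as a C library would do after the call. */
int64_t join_time64(uint32_t lo, uint32_t hi)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

An old program that only reads the low half keeps working until 2038, while a new program combines both halves with join_time64().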

inodes and filesystems

utimensat, fstat64, fstatat64:

  • Inode timestamps need to represent times before 1970 and way into the future, so we need 64-bit time_t here. I see no other alternatives, so we have to pass struct timespec64 into utimensat, and create version 4 of 'struct stat' to pass into the future fstat and fstatat. I would use a version that matches the 64-bit layout of 'struct stat', as posted in v1 of the patch set.

utime, utimes, futimesat, oldstat, oldlstat, oldfstat, newstat, newlstat, newfstat, newfstatat, stat64 and lstat64:

  • These are all deprecated now, we have to stop getting this wrong!

tasks

getrusage, waitid:

  • These pass a 'struct rusage' that contains a 'struct timeval' with elapsed time. Again there are multiple options:

  1. We could change rusage to contain a new 'struct relative_timeval' instead, with an unchanged layout, which makes the format incompatible with a standard libc that uses a 64-bit based timeval. There will be no y2038 overflow, but the ru_utime may overflow on systems with many cores when the total runtime of a task exceeds 68 years, e.g. after running continuously for 259 days on a 96-core machine, which is getting affordable now.

  2. We could make the layout the same as on 64-bit machines, as x32 does, which is again incompatible with posix but would work better. This is what the patch currently does.

  3. We could make the layout what glibc expects, using 64-bit based timeval structures at the beginning.

  4. We could define a new structure using pure nanosecond counters.
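The 259-day figure from the first option follows from dividing the 2^31 seconds that a signed 32-bit tv_sec can hold by the number of cores accumulating CPU time at once. A small sketch of the arithmetic, with a function name invented for this page:

```c
#include <assert.h>

/* Days of wall-clock time until a signed 32-bit seconds counter
 * overflows, when 'ncpus' cores add to it simultaneously. */
double days_until_overflow(unsigned int ncpus)
{
	const double max_seconds = 2147483648.0;	/* 2^31 */
	const double seconds_per_day = 86400.0;

	assert(ncpus > 0);
	return max_seconds / ncpus / seconds_per_day;
}
```

For 96 cores this comes out at just under 259 days, and for a single core at about 24855 days, which is the 68 years mentioned above.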

rt_sigtimedwait:

  • This passes a relative timespec value in, so we could keep the current layout and have glibc convert it, or change it to something else. The kernel internally converts to jiffies to call schedule_timeout.

futex:

  • This passes a relative *or* absolute timespec in, so we have to change it. The kernel uses ktime_t internally here, so we could make the interface nanosecond based or stick with timespec64.

sched_rr_get_interval:

  • This returns a timespec with the schedule interval to user space. Using a 32-bit based format is fine here, or we could convert to timespec64. The kernel uses jiffies internally.

wait4:

  • replaced by waitid

system wide

sysinfo:

  • struct sysinfo contains 'kernel_long_t uptime', we can keep that, it's fine.

ioctl

There are numerous ioctl commands using a time argument. This list is incomplete:

  • audio time stamps

  • v4l time stamps

  • input event time stamps

  • socket time stamps

  • ...

memory mapped packet sockets

Socket timestamps are exported to user space using a memory mapped interface defined in include/uapi/linux/if_packet.h. There are currently three versions of this interface, and all of them use a 32-bit time type. We will likely need a version 4 to solve this.

Audit of include/uapi for time_t impact

Structure and IOCTL dependency:

time_t
        struct msqid64_ds (has 2038 padding!)
        struct semid64_ds (has 2038 padding!)
        struct cyclades_idle_stats
        struct video_event
                VIDEO_GET_EVENT
        struct msqid_ds
        struct ppp_idle
                PPPIOCGIDLE
        struct semid_ds
                union semun
        struct timespec
                SIOCGSTAMPNS
                struct coda_vattr
                        ...
                struct scm_timestamping
                struct som_hdr
                struct itimerspec
                struct v4l2_event
                        VIDIOC_DQEVENT
                struct snd_pcm_status
                        SNDRV_PCM_IOCTL_STATUS
                struct snd_pcm_mmap_status
                        struct snd_pcm_sync_ptr
                                SNDRV_PCM_IOCTL_SYNC_PTR
                struct snd_rawmidi_status
                        SNDRV_RAWMIDI_IOCTL_STATUS
                struct snd_timer_status
                        SNDRV_TIMER_IOCTL_STATUS
                struct snd_timer_tread
                struct snd_ctl_elem_value
                        SNDRV_CTL_IOCTL_ELEM_READ
                        SNDRV_CTL_IOCTL_ELEM_WRITE
        struct timeval
                SIOCGSTAMP
                struct zatm_t_hist
                struct bcm_msg_head
                struct elf_prstatus
                struct input_event
                struct omap3isp_stat_data
                        VIDIOC_OMAP3ISP_STAT_REQ
                PPGETTIME
                PPSETTIME
                struct rusage
                struct itimerval
                struct timex
                struct v4l2_buffer
                        VIDIOC_QUERYBUF
                        VIDIOC_QBUF
                        VIDIOC_DQBUF
                        VIDIOC_PREPARE_BUF
        struct utimbuf

File systems

Each file system stores its file modification times in its own format on disk, and a lot of them have the same problem.

file system             time type                                 expiration year
9p (9P2000)             unsigned 32-bit seconds                   2106
9p (9P2000.L)           signed 64-bit seconds, ns                 never
adfs                    40-bit cs since 1900                      2248
affs                    u32 days/mins/(secs/50)                   11760870
afs                     unsigned 32-bit seconds                   2106
befs                    unsigned 48-bit seconds                   never
bfs                     unsigned 32-bit seconds                   2106
btrfs                   signed 64-bit seconds, 32-bit ns          never
ceph                    unsigned 32-bit second/ns                 2106
cifs (smb)              7-bit years since 1980                    2107
cifs (modern)           64-bit 100ns since 1601                   30828
coda                    timespec ioctl                            2038
cramfs                  fixed                                     1970
efs                     unsigned 32-bit seconds                   2106
exofs                   signed 32-bit seconds                     2038
ext2                    signed 32-bit seconds                     2038
ext3                    signed 32-bit seconds                     2038
ext4 (good old inodes)  signed 32-bit seconds                     2038
ext4 (new inodes)       34-bit seconds / 30-bit ns (but broken)   2038
f2fs                    64-bit seconds / 32-bit ns                never
fat                     7-bit years since 1980, 2s resolution     2107
freevxfs                unsigned 32-bit seconds/u32 microseconds  2106
fuse                    64-bit second/32-bit ns                   never
gfs2                    u64 seconds/u32 ns                        never
hfs                     u32 seconds since 1904                    2040
hfsplus                 u32 seconds since 1904                    2040
hostfs                  timespec                                  2038
hpfs                    unsigned 32-bit seconds                   2106
isofs                   'char' year since 1900 (fixable)          2028 (!)
jffs2                   unsigned 32-bit seconds                   2106
jfs                     unsigned 32-bit seconds/ns                2106
logfs                   signed 64-bit ns                          2262
minix                   unsigned 32-bit seconds                   2106
ncpfs                   7-bit year since 1980                     2107
nfsv2,v3                unsigned 32-bit seconds/ns                2106
nfsv4                   u64 seconds/u32 ns                        never
nfsd                    unsigned 32-bit seconds/ns                2106
nilfs2                  u64 seconds/u32 ns                        never
ntfs                    64-bit 100ns since 1601                   30828
ocfs2                   34-bit seconds/30-bit ns                  2514
omfs                    64-bit milliseconds                       never
pstore                  ascii seconds                             2106
qnx4                    unsigned 32-bit seconds                   2106
qnx6                    unsigned 32-bit seconds                   2106
reiserfs                unsigned 32-bit seconds                   2106
romfs                   fixed                                     1970
squashfs                unsigned 32-bit seconds                   2106
sysv                    unsigned 32-bit seconds                   2106
ubifs                   u64 second/u32 ns                         never
udf                     u16 year                                  2038
ufs1                    unsigned 32-bit seconds                   2106
ufs2                    signed 64-bit seconds/u32 ns              never
xfs                     signed 32-bit seconds/ns                  2106
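Most of the expiration years listed here follow directly from the counter width, its resolution and its epoch. The sketch below reproduces a few of them. The function name is invented for this page, and using the average Gregorian year length is an approximation that ignores the exact month and day of the overflow.

```c
#include <assert.h>
#include <stdint.h>

#define SECONDS_PER_YEAR 31556952ULL	/* average Gregorian year */

/* Approximate year in which a counter runs out. The counter starts at
 * 'epoch_year', can hold 'max_units' ticks, and advances by
 * 'units_per_second' ticks every second. */
unsigned long long expiry_year(unsigned int epoch_year, uint64_t max_units,
			       uint64_t units_per_second)
{
	assert(units_per_second > 0);
	return epoch_year + (max_units / units_per_second) / SECONDS_PER_YEAR;
}
```

For example, expiry_year(1970, 1ULL << 31, 1) gives 2038 for signed 32-bit seconds, expiry_year(1970, 1ULL << 32, 1) gives 2106 for unsigned 32-bit seconds, expiry_year(1904, 1ULL << 32, 1) gives 2040 for hfs, expiry_year(1970, 1ULL << 34, 1) gives 2514 for ocfs2, expiry_year(1970, 1ULL << 63, 1000000000ULL) gives 2262 for logfs, and expiry_year(1601, 1ULL << 63, 10000000ULL) gives 30828 for ntfs.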

Tasks

The task list is for people who want to get involved. There will be many more tasks over time, so this is just a starting point. In the end, we should remove all instances of 'time_t', 'timespec' and 'timeval' from the kernel.

Small tasks

  • Find a driver using time_t/timespec/timeval internally and convert it to ktime_t/timespec64, examples:

    • drivers/staging/media/lirc/lirc_imon.c (timeval, trivial)

    • drivers/staging/ft1000/ (time_t and timeval)

    • drivers/staging/android/sync_debug.c (timeval, very easy)

    • drivers/staging/android/timed_gpio.c (timeval, easy)

    • drivers/staging/bcm/LeakyBucket.c (timeval, slightly tricky)

    • drivers/staging/bcm/Bcmchar.c (timeval, very easy)

    • drivers/staging/comedi/drivers/comedi_test.c (timeval)

    • drivers/staging/comedi/drivers/serial2002.c (timeval, easy)

    • drivers/staging/dgnc/dgnc_tty.c (timeval, very easy)

    • drivers/staging/gdm72xx/ (timeval, easy)

    • drivers/staging/media/lirc/lirc_igorplugusb.c (timeval)

    • drivers/staging/media/lirc/lirc_parallel.c (timeval, easy)

    • drivers/staging/media/lirc/lirc_sasem.c (timeval, very easy)

    • drivers/staging/media/lirc/lirc_serial.c (timeval, easy)

    • drivers/staging/media/lirc/lirc_sir.c (timeval)

    • drivers/staging/rts5208/rtsx.h (timeval) [Status: Completed, Ksenija Stanojevic]

    • drivers/staging/olpc_dcon/olpc_dcon.c (timespec, rather broken)

    • drivers/staging/ozwpan/ozhcd.c (timespec)

    • drivers/staging/ozwpan/ozproto.c (timespec)

    • kernel/cpuset.c (time_t) [Status: Completed, Heena Sirwani]

    • fs/reiserfs/journal.c (time_t)

    • drivers/scsi/ips.c (time_t)

    • sound/pci/es1968.c (timeval) [Status: Completed, Tina Ruchandani]

    • kernel/power/hibernate.c (timeval) [Status: Completed, Tina Ruchandani]

    • drivers/s390/net/ctcm_fsms.c (timespec) [Status: Completed, Aya Mahfouz]

    • drivers/power/ab8500_fg.c (timespec) [Status: Completed, Ebru Akagunduz]
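Many of these drivers only use a timeval to measure how much time has passed between two events. The user space sketch below shows the shape such a conversion usually takes: a single signed 64-bit nanosecond count from a monotonic clock replaces the seconds/microseconds pair. It is an analogy only; mono_ns() and elapsed_us() are hypothetical names, and inside the kernel the corresponding calls would be ktime_get() and ktime_us_delta().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the monotonic clock as one signed 64-bit nanosecond count,
 * the same representation the kernel uses for ktime_t. */
int64_t mono_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Elapsed microseconds between two readings. A plain subtraction is
 * enough, with no separate carry handling for the sub-second field. */
int64_t elapsed_us(int64_t start_ns, int64_t end_ns)
{
	return (end_ns - start_ns) / 1000;
}
```

A monotonic clock also avoids the jumps that a wall-clock timeval shows when the system time is set, which is why such intervals should not be based on wall-clock time in the first place.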

Medium tasks

  • Modify an ioctl interface in a driver to support both 32- and 64-bit time interfaces, examples:

    • drivers/staging/comedi/comedi_fops.c (INSN_GTOD, timeval)

    • drivers/staging/android/alarm-dev.c (timespec)

    • include/uapi/linux/atm_zatm.h (zatm_t_hist/timeval)

    • include/uapi/linux/videodev2.h (v4l2_buffer/timespec)

  • Fix the android logger time format (drivers/staging/android/logger.c)

  • Convert the internal timekeeping in fs/nfsd

  • Convert all 'ptp' users in the kernel

  • Convert all 'struct key' users (time_t)

  • Introduce known unsafe types (possibly like kernel_time32_t, kernel_compat_time_t etc) so we can annotate interfaces that are known to use a fixed size and cannot be changed to new types.

  • Fix all time issues in drivers/staging/lustre (maybe advanced task)

Advanced tasks

  • Introduce a new system call family to replace one or more of the problematic calls listed above.

  • Change the on-disk layout of a broken file system to optionally support longer time stamps

  • Port a small C library (uClibc, newlib, ...) to optionally use 64-bit time_t and build an embedded distribution (openembedded, openwrt, buildroot, ...) with this.

Tasks later in the project

  • Hook up all 32-bit architectures to use the new system calls

  • Introduce a Kconfig symbol to disable all code that has not yet been converted at compile time.


last edited 2015-05-11 15:03:34 by arnd