Swift’s native Clocks are very inefficient – Wade Tregaskis

By which I mean, things like ContinuousClock and SuspendingClock.

In absolute terms they don’t have much overhead – think sub-microsecond for most uses. Which makes them perfectly acceptable when they’re used sporadically (e.g. only a few times per second).

However, if you need to deal with time and timing more frequently, their inefficiency can become a serious bottleneck.

I stumbled into this because of a fairly common and otherwise uninteresting pattern – throttling UI updates on an I/O operation’s progress. This might look something like:

    struct Example: View {
        let bytes: AsyncSequence<UInt8>

        @State var byteCount = 0

        var body: some View {
            Text("Bytes so far: \(byteCount.formatted(.byteCount(style: .binary)))")
                .task {
                    var unpostedByteCount = 0

                    let clock = ContinuousClock()
                    var lastUpdate = clock.now

                    for try await byte in bytes {
                        … // Do something with the byte.

                        unpostedByteCount += 1

                        let now = clock.now
                        let delta = now - lastUpdate

                        if (delta > .seconds(1)
                            || (delta > .milliseconds(100)
                                && 1_000_000 <= unpostedByteCount)) {
                            byteCount += unpostedByteCount
                            unpostedByteCount = 0
                            lastUpdate = now
                        }
                    }
                }
        }
    }

☝️ This isn’t a complete implementation, as it won’t update the byte count if the download stalls (since the lack of incoming bytes will mean no iteration on the loop, and therefore no updates even if a full second passes). But it’s sufficient for demonstration purposes here.

Why didn’t I just use throttle from swift-async-algorithms? I did, at first, and quickly discovered that its performance is horrible. While I do suspect I can ‘optimise’ it to not be atrocious, I haven’t pursued that as it was easier to just write my own throttling system.

The above seems fairly straightforward, but if you run it and have any non-trivial I/O rate – even just a few hundred kilobytes per second – you’ll find that it saturates an entire CPU core, not just wasting CPU time but limiting the I/O rate severely.

Using a SuspendingClock makes no difference.

In a nutshell, the problem is that Swift’s Clock protocol has significant overheads by design. If you look at a time profile of code like this, you’ll see things like:

That’s a lot of time wasted in function calls and struct initialisation and type conversion and protocol witnesses and all that guff. The only part that’s actually retrieving the time is the swift_get_time call (which is just a wrapper over clock_gettime, which is just a wrapper over clock_gettime_nsec_np(CLOCK_UPTIME_RAW), which is just a wrapper over mach_absolute_time).

I wrote some simple benchmarks of various alternative time-tracking methods and ran them with Swift 5.10. The results below show the median runtime of each benchmark, which is a million iterations of checking the time:

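| Method                                     | 10-core iMac Pro | M2 MacBook Air |
|--------------------------------------------|------------------|----------------|
| ContinuousClock                            | 429 ms           | 258 ms         |
| SuspendingClock                            | 430 ms           | 247 ms         |
| Date / NSDate                              | 30 ms            | 19 ms          |
| clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW) | 32 ms            | 10 ms          |
| clock_gettime_nsec_np(CLOCK_UPTIME_RAW)    | 27 ms            | 10 ms          |
| gettimeofday                               | 24 ms            | 12 ms          |
| mach_absolute_time                         | 15 ms            | 6 ms           |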

All these alternative methods are well over an order of magnitude faster than Swift’s native clock APIs, showing just how dreadfully inefficient the Swift Clock API is.
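For a rough idea of how such a comparison can be set up, here is a minimal sketch. It is not the benchmark code behind the table above: the `benchmark` helper and the `results` array are hypothetical names, it times a single run rather than taking a median, and it assumes a deployment target where ContinuousClock is available (macOS 13 / iOS 16 or later). The Darwin-only calls are guarded so the rest still compiles elsewhere.

```swift
import Foundation

/// Hypothetical harness: calls `readClock` `iterations` times and
/// returns the elapsed time in milliseconds (single run, no median).
func benchmark(iterations: Int = 1_000_000, _ readClock: () -> Void) -> Double {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations { readClock() }
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / 1_000_000
}

let continuous = ContinuousClock()
let suspending = SuspendingClock()
let reference = Date()

var results: [(String, Double)] = [
    ("ContinuousClock", benchmark { _ = continuous.now }),
    ("SuspendingClock", benchmark { _ = suspending.now }),
    ("Date", benchmark { _ = reference.timeIntervalSinceNow }),
]

#if canImport(Darwin)
// These C-level calls only exist on Apple platforms.
results.append(("clock_gettime_nsec_np(CLOCK_UPTIME_RAW)",
                benchmark { _ = clock_gettime_nsec_np(CLOCK_UPTIME_RAW) }))
results.append(("mach_absolute_time",
                benchmark { _ = mach_absolute_time() }))
#endif

for (name, milliseconds) in results {
    print("\(name): \(milliseconds) ms")
}
```

Build with optimisations enabled (e.g. `swiftc -O`), otherwise the numbers mostly reflect unoptimised Swift rather than the clocks themselves. Absolute figures will differ by machine and OS version.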

mach_absolute_time for the win

Unsurprisingly, mach_absolute_time is the fastest. It is what all these other APIs are actually based on; it is the lowest level of the time stack.

The downside to calling mach_absolute_time directly, though, is that it’s on Apple’s “naughty” list – apparently it’s been abused for device fingerprinting, so Apple require you to beg for special permission if you want to use it (even though it’s used by all these other APIs anyway, as the basis for their implementations, and there’s nothing you can get from mach_absolute_time that you can’t get from them too).
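If you would rather avoid mach_absolute_time, one option is to drive the throttling logic from earlier with clock_gettime_nsec_np(CLOCK_UPTIME_RAW), which is nearly as fast in the benchmarks and already returns nanoseconds. The following is only a sketch under that assumption: `ProgressThrottle` and `uptimeNanoseconds` are hypothetical names, not anything from Swift or Foundation, and the thresholds simply mirror the example loop (one second, or 100 ms plus a million unposted bytes).

```swift
#if canImport(Darwin)
import Darwin

/// Monotonic nanoseconds via the C API from the benchmarks (Apple platforms only).
func uptimeNanoseconds() -> UInt64 {
    clock_gettime_nsec_np(CLOCK_UPTIME_RAW)
}
#endif

/// Hypothetical helper: decides when accumulated progress should be posted.
/// Timestamps are passed in as plain nanosecond counts, so any cheap
/// monotonic source can be used.
struct ProgressThrottle {
    private var lastUpdate: UInt64
    private(set) var unposted = 0

    init(now: UInt64) {
        lastUpdate = now
    }

    /// Records one byte. Returns the number of bytes to post,
    /// or nil if it is too soon to update.
    mutating func record(now: UInt64) -> Int? {
        unposted += 1
        let delta = now &- lastUpdate

        guard delta > 1_000_000_000
            || (delta > 100_000_000 && unposted >= 1_000_000) else {
            return nil
        }

        let count = unposted
        unposted = 0
        lastUpdate = now
        return count
    }
}
```

Inside the loop this would be used roughly as `if let n = throttle.record(now: uptimeNanoseconds()) { byteCount += n }`. Keeping the time source outside the struct also makes the logic easy to test with made-up timestamps.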

I was quite surprised to see good ol’ Date (a.k.a. NSDate) performing competitively with the traditional C-level APIs, at least on x86-64. Even on arm64 it’s not bad, still running at a third to half the speed of the C APIs. This surprised me because it has the overhead of at least one Objective-C message send (for timeIntervalSinceNow), unless somehow the Swift compiler is optimising that into a static function call, or inlining it entirely…?

Update: I later looked at the disassembly, and found no message sends, only a plain function call to Foundation.Date.timeIntervalSinceNow.getter (which is only 40 instructions, on arm64, over clock_gettime and __stack_chk_fail – and the former is hundreds of instructions, so it’s adding relatively little overhead to the C API). This isn’t being done by the compiler, it’s because that’s actually how it’s implemented in Foundation. I keep forgetting that Foundation from Swift is no longer just the old Objective-C Foundation, but rather mostly the new Foundation that’s written in native Swift. So these performance results likely don’t apply once you go back far enough in Apple OS releases (to when Swift really was calling into the Objective-C code for NSDate) – but it’s safe to rely on good Date performance now and in future.
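For completeness, a Date-based interval check might look like the sketch below. `isDue` is a hypothetical helper, not a Foundation API. One caveat worth stating: Date tracks wall-clock time, so unlike the uptime-based calls it can jump if the system clock is adjusted, which is usually tolerable for UI throttling but not for precise measurement.

```swift
import Foundation

/// Hypothetical helper: true once at least `interval` seconds
/// have passed since `last`.
func isDue(since last: Date, interval: TimeInterval) -> Bool {
    // timeIntervalSinceNow is negative for dates in the past, hence the negation.
    -last.timeIntervalSinceNow >= interval
}

// Example use in a throttling loop:
var lastUpdate = Date()
if isDue(since: lastUpdate, interval: 0.1) {
    lastUpdate = Date()
}
```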