The Hidden Cost of Core Data Managed Relationships

Written by Louis Debaere, Software Engineer watchOS

Introduction: a shared database library

The Runtastic app for iPhone has an Apple Watch counterpart that offers the same great tracking experience. On iOS, our persistence layer is managed by a library on top of Apple’s Core Data framework. Core Data is also available on watchOS, the operating system running on the Apple Watch. It seemed only natural for us to leverage the same persistence framework for both platforms in one shared library, to reuse as much of our existing code as possible.

Does the performance hold up?

Before starting to use it, however, we had to make sure it performed up to our standards. To do this, we built a prototype simulating our most performance-critical database operation: saving data during an active sports session. We add data like GPS points and heart rate to a session object that is saved automatically in real time. If the app terminates unexpectedly, we can then safely recover and restore the active session.

Simulation

We created a simple prototype that simulates a typical running session. Just like in the Runtastic app, we save the session to the database every five seconds and add a new GPS location every two seconds.

private var coreDataController: CoreDataController
private var locationTimer: Timer?
private var saveTimer: Timer?

private func start() {
    coreDataController.createSession()
    // Add a new GPS location every two seconds.
    locationTimer = .scheduledTimer(withTimeInterval: 2.0, repeats: true) { [weak self] _ in
        self?.coreDataController.addLocationToSession()
    }
    // Save the session to the database every five seconds.
    saveTimer = .scheduledTimer(withTimeInterval: 5.0, repeats: true) { [weak self] _ in
        self?.coreDataController.save()
    }
}

What actually happens inside our CoreDataController class? It’s a simple Core Data wrapper that is powered by a managed object context.

import CoreData
import os.log

final class CoreDataController {
    private let context: NSManagedObjectContext
    private var session: Session?

    // The initializer and createSession() are omitted here.

    func save() {
        // Time the blocking save so the log covers the whole operation.
        let start = CFAbsoluteTimeGetCurrent()
        context.performAndWait {
            do {
                try context.save()
            } catch {
                assertionFailure("Error saving managed object context: \(error)")
            }
        }
        let stop = CFAbsoluteTimeGetCurrent()
        os_log("Saving took: %.2f", stop - start)
    }

    func addLocationToSession() {
        context.performAndWait {
            // Create a location filled with random data and attach it to the session.
            let location: Location = .random(in: context)
            session?.addToLocations(location)
        }
    }
}

We keep track of how long saving takes by logging the difference in absolute time before and after a blocking save operation. The save has to be blocking: an asynchronous call returns immediately, so the two timestamps would only tell us how long it took to schedule the work, not to perform it.
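The timing pattern from save() can be pulled out into a small helper. The following is only a sketch and not part of our library: the name measureBlocking is hypothetical, and the summing closure merely stands in for a blocking save. The measured duration is only meaningful when the closure does all of its work before returning.

```swift
import Foundation

/// Hypothetical helper: runs `work` synchronously and returns its result
/// together with the elapsed wall-clock time in seconds.
func measureBlocking<T>(_ work: () throws -> T) rethrows -> (result: T, seconds: CFAbsoluteTime) {
    let start = CFAbsoluteTimeGetCurrent()
    let result = try work()
    let stop = CFAbsoluteTimeGetCurrent()
    return (result, stop - start)
}

// Stand-in workload; in the prototype this would be the blocking save.
let measurement = measureBlocking {
    (1...1_000).reduce(0, +)
}
print(String(format: "Work took: %.4f s", measurement.seconds))
```

Wrapping an asynchronous call in this helper would compile just as well, but the reported time would then cover only the scheduling of the work.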

Each new location that gets added to the session is populated with random data.

Results

Of course, for an accurate representation of the logged data, we had to test on real devices instead of the watchOS simulator we used for development. Our simulation kept running for over an hour. The numbers are presented below. The y-axis shows the saving duration and the x-axis the elapsed time. Each generation of Apple Watch is depicted for comparison: S0 represents the original, released back in 2015, and S4 is the latest generation.

Two things immediately jump out. First, the saving duration increases linearly, but it should be constant. Our team was quite surprised by this outcome, so the next step was to find a possible cause for it.
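One way to put a number on this kind of growth is to fit a straight line through the logged durations and look at its slope. The sketch below is an illustration under assumptions, not the analysis we ran: the function name slope is hypothetical and the sample values are synthetic, not our measurements. A slope close to zero would indicate the constant behavior we expected.

```swift
import Foundation

/// Least-squares slope of `y` over `x`; returns nil when it is undefined.
func slope(x: [Double], y: [Double]) -> Double? {
    guard x.count == y.count, x.count > 1 else { return nil }
    let n = Double(x.count)
    let meanX = x.reduce(0, +) / n
    let meanY = y.reduce(0, +) / n
    var numerator = 0.0
    var denominator = 0.0
    for (xi, yi) in zip(x, y) {
        numerator += (xi - meanX) * (yi - meanY)
        denominator += (xi - meanX) * (xi - meanX)
    }
    return denominator == 0 ? nil : numerator / denominator
}

// Synthetic values for illustration only.
let elapsedMinutes: [Double] = [0, 10, 20, 30]
let growingDurations: [Double] = [10, 20, 30, 40]   // ms, grows with elapsed time
let constantDurations: [Double] = [10, 10, 10, 10]  // ms, stays flat

let growingSlope = slope(x: elapsedMinutes, y: growingDurations)
let constantSlope = slope(x: elapsedMinutes, y: constantDurations)
print(growingSlope ?? .nan, constantSlope ?? .nan)
```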

Second, the Series 4 watch performs significantly better and approaches iPhone-like performance (where we obviously don’t notice an issue).

Investigation

We took a closer look at the data model of our session and its attributes, specifically the locations relationship, which is handled entirely by Core Data.

In our code above, only one call affects the session: session?.addToLocations(location). The definition of addToLocations is shown below.

// MARK: Generated accessors for locations
extension Session {
    @objc(addLocationsObject:)
    @NSManaged public func addToLocations(_ value: Location)
    @objc(removeLocationsObject:)
    @NSManaged public func removeFromLocations(_ value: Location)
    @objc(addLocations:)
    @NSManaged public func addToLocations(_ values: NSOrderedSet)
    @objc(removeLocations:)
    @NSManaged public func removeFromLocations(_ values: NSOrderedSet)
}

It’s a function automatically generated by Core Data, along with other useful ones. These accessors manipulate the underlying data type of a Core Data relationship. The data type powering a relationship varies according to its cardinality, arrangement, and more. locations is a one-to-many relationship with an ordered arrangement.

extension Session {
    @NSManaged public var id: String?
    @NSManaged public var startDate: Date?
    @NSManaged public var locations: NSOrderedSet?
}

This means the locations associated with our session are stored in an `NSOrderedSet`, a collection data type that keeps track of the arrangement of its elements on top of guaranteeing uniqueness like a regular set.

First optimization win

After checking our requirements and inspecting our code, we realized we did not need this order characteristic, at least not at the point of saving. We could simply sort the locations according to their `unixTimestamp` property afterwards.
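Restoring the order after the fact is a one-line sort. The sketch below uses a plain struct as a stand-in for the managed Location object so that it runs on its own; only unixTimestamp comes from our model, while the type name LocationSample, the coordinate fields, and the sample values are assumptions for illustration.

```swift
import Foundation

// Stand-in for the managed `Location` object (hypothetical type and fields,
// except for `unixTimestamp`).
struct LocationSample: Hashable {
    let unixTimestamp: TimeInterval
    let latitude: Double
    let longitude: Double
}

/// Restores chronological order from an unordered collection of locations.
func chronological(_ locations: Set<LocationSample>) -> [LocationSample] {
    locations.sorted { $0.unixTimestamp < $1.unixTimestamp }
}

// Made-up samples, inserted out of order on purpose.
let unordered: Set<LocationSample> = [
    LocationSample(unixTimestamp: 1_560_000_004, latitude: 48.30, longitude: 14.29),
    LocationSample(unixTimestamp: 1_560_000_000, latitude: 48.31, longitude: 14.28),
    LocationSample(unixTimestamp: 1_560_000_002, latitude: 48.32, longitude: 14.27),
]
let ordered = chronological(unordered)
print(ordered.map { $0.unixTimestamp })
```

With an unordered relationship, Core Data exposes locations as an `NSSet?`, which can be cast with `as? Set<Location>` before sorting in the same way.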

It turned out that this single change in arrangement made a huge difference. See for yourself:

Switching the arrangement to unordered showed an immediate performance increase. On the baseline device, saving takes about 100 ms half an hour in, compared to 165 ms before, and only 150 ms after one hour, when it used to take over 300 ms!

Although it drastically reduced the saving time, we weren’t satisfied. The time needed to save was still going up, so we started wondering if Core Data relationships themselves came with some overhead.

Solution

Luckily, during WWDC, we got a chance to share our findings with an Apple engineer working on the Core Data team. He confirmed our suspicion: ordered as well as unordered sets, in combination with relationships, create a memory overhead that increases over time. Unlike the phone, the watch simply doesn’t have the resources to deal with it, at least not yet. Judging by the Series 4 results, it is catching up quickly.

As a final test we tried linking the locations to the session manually. Instead of the session holding a collection of locations as a relationship, we added a sessionID attribute to each saved location, so we could retrieve them using the ID of their respective session as the key.
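The manual link can be sketched as a plain attribute plus a fetch request. This is a minimal, self-contained illustration under assumptions, not our production code: the model is built in code with an in-memory store, objects are accessed through key-value coding instead of generated subclasses, and the helper names (makeModel, addLocation, fetchLocations) and the IDs "run-1" and "run-2" are hypothetical.

```swift
import CoreData

// Minimal model built in code: a Location entity without any relationship.
func makeModel() -> NSManagedObjectModel {
    let sessionID = NSAttributeDescription()
    sessionID.name = "sessionID"
    sessionID.attributeType = .stringAttributeType
    sessionID.isOptional = false

    let timestamp = NSAttributeDescription()
    timestamp.name = "unixTimestamp"
    timestamp.attributeType = .doubleAttributeType
    timestamp.isOptional = false

    let location = NSEntityDescription()
    location.name = "Location"
    location.properties = [sessionID, timestamp]

    let model = NSManagedObjectModel()
    model.entities = [location]
    return model
}

// In-memory store so the sketch leaves nothing on disk.
let container = NSPersistentContainer(name: "Sketch", managedObjectModel: makeModel())
let storeDescription = NSPersistentStoreDescription()
storeDescription.type = NSInMemoryStoreType
container.persistentStoreDescriptions = [storeDescription]
container.loadPersistentStores { _, error in
    if let error = error { fatalError("Failed to load store: \(error)") }
}
let context = container.viewContext

/// Inserts a location that points at its session through a plain attribute.
func addLocation(timestamp: Double, sessionID: String, in context: NSManagedObjectContext) {
    let location = NSEntityDescription.insertNewObject(forEntityName: "Location", into: context)
    location.setValue(sessionID, forKey: "sessionID")
    location.setValue(timestamp, forKey: "unixTimestamp")
}

/// Fetches the locations of one session, sorted chronologically.
func fetchLocations(sessionID: String, in context: NSManagedObjectContext) throws -> [NSManagedObject] {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Location")
    request.predicate = NSPredicate(format: "sessionID == %@", sessionID)
    request.sortDescriptors = [NSSortDescriptor(key: "unixTimestamp", ascending: true)]
    return try context.fetch(request)
}

var fetched: [NSManagedObject] = []
context.performAndWait {
    addLocation(timestamp: 20, sessionID: "run-1", in: context)
    addLocation(timestamp: 10, sessionID: "run-1", in: context)
    addLocation(timestamp: 15, sessionID: "run-2", in: context)
    do {
        try context.save()
        fetched = try fetchLocations(sessionID: "run-1", in: context)
    } catch {
        assertionFailure("Core Data error: \(error)")
    }
}
let timestamps = fetched.compactMap { $0.value(forKey: "unixTimestamp") as? Double }
print(timestamps)
```

Adding a location in this setup only inserts a new row; the session object and its collection of locations are never touched, and the sort descriptor restores the order at read time.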

This proved to be the ideal approach: we finally saw the constant performance we were aiming for from the start! 🎉

Conclusion

At the end of the day, it’s the data that matters. Once we had proved our saving performance was not optimal, we investigated the root cause and found a more optimized solution.

Apple suggested the same approach as above. We have set out to manually link all the session data we collect, such as elevation and heart rate, for optimal performance.

One other benefit of sharing our database framework is that this optimization comes to iOS as well, so we can be even more confident in the efficiency of long-running sessions on the phone.

