Core Data Made Easy: Some Code + Practices for Beginners and Experts

Core Data is Apple’s answer to “Wow! It’s difficult to store objects in an SQL database!”. It has been extended, over time, to do a lot more than just that – but that’s the core.

If you know what you’re doing – and you avoid the pit-traps along the way – it can be very good indeed. But we frequently see code written by iOS professionals that betrays a misunderstanding of what Apple intended, and misses out on some of the best features of Core Data.

Over time, we’ve coalesced some of our practices into reusable code and techniques. All the code in this article is already up on GitHub, and I’ll be maintaining it periodically with improvements from our own projects.

(this post will be followed by a few others, or else extended over time)

Alternatives

NB … if you’re not interested in alternative approaches, skip ahead to the next section.

…re-write / extend Core Data?

Before we start, I should point out that if you want “a better Core Data” there are several libraries out there. I’ve tried a few, but none of them have ever made it into our launched apps. This is *even though* I’ve been quietly impressed by some of them.

Why not? Well … here are the issues that have affected us (and may affect you too):

  1. Core Data – in fact, any Persistence Layer – is massively complex; it deliberately breaks the normal rules of the Objective-C/Cocoa platform, and it’s extremely dangerous to mess with. Some of the tech we’ve seen tries too hard to “fix” CoreData, and in the process has unexpected side-effects.
    • …when you’re dealing with USER DATA, that’s a terrifying prospect
    • Examples include: the Objective-C standard method “description” *corrupts your data if you alter it in any way*
    • Examples include: the Objective-C standard method “copy” *is hardcoded* to be unusable on CD objects – you get a nice error message at runtime from Apple if you try to use it, and your app crashes
    • …and other similar terrors. Unless you are 120% confident of your “CoreData++” library, don’t risk it.
  2. Core Data is a single standard that Apple beats developers over the head with. There’s only one version, and it’s very difficult to customize.
    • So: if you stick to standard CoreData method calls and behaviour, most developers can immediately understand your code without having to learn your proprietary setup
    • …which saves a lot of time and money on projects with multiple teams / companies involved
  3. For some projects / clients / partners, the “license” on code is a serious issue, and code that you or I would happily use gets “forbidden” by the legal teams.
    • For some clients, anything we use has to be written by us, or have a massively open license (no GPL, no LGPL – not even MIT/BSD, in some cases)
    • For a very narrow set of clients, anything we use has to be reviewed *by us* (or by the client) line by line, and we have to warranty it. i.e. the code needs to be extremely short and simple!
  4. Any code we write on top of this – can it be re-used in a “plain” CoreData project?
    • Often, you want to copy/paste snippets of code from one old project to a new one (assuming of course you – or same client – own the original code).
    • …if you’ve extended CoreData with new classes and methods, you might find this is very hard to do. (this has happened before on some iOS projects, with other libs we’ve used)

So, instead, we focus on making as few changes to CoreData as possible – and generally just sticking to Apple’s own concepts.

Making life easy, with minimal changes

We’re going to do three broad things:

  1. Clean up Apple’s code, and enable some Apple-provided features you probably didn’t know existed
  2. Very slightly extend Apple’s code (mostly: we’re going to make a start at adding some Blocks, since Apple still hasn’t!)
  3. Offer some good practices in how you write YOUR code

1. Replace “100 lines of code” with a simple, easy, encapsulated class

One new class: CoreDataStack.m

One of the best parts of Core Data is that the high-level architecture is neatly split into 4 distinct layers. Collectively, Apple terms these the Core Data “stack” (although that term never appears in code – only in docs). Unfortunately, since all those layers are needed before you can do anything, it requires 100 lines of code just to read a single item from Core Data.

Apple’s image + text showing the layers (may require login to Apple Dev network)

  1. Layer 1: file (on disk, or flash memory – Apple has NOT YET IMPLEMENTED a network version, or a remote DB one, although devs have been asking for this for many years. Unlikely to happen anytime soon)
  2. Layer 2: Persistent Object Store (the DB part)
  3. Layer 3: Persistent Store Coordinator (high-level management stuff)
  4. Layer 4: NSManagedObjectContext (this is what 99% of your app code talks to)

Apple has largely ignored this problem; their “solution” is that when you create a new project in Xcode, they dump 100 lines (well, not quite 100, but close) of crud into your AppDelegate class.

This is the wrong place to put that code – by Apple’s own rules, it should not go there. If your application is multi-threaded (90% of apps are) then it also MUST NOT go there (that’s almost guaranteed to cause data-loss bugs in the long run).

So, the first thing we did was to create a class – CoreDataStack – that encapsulates Apple’s 100 lines of boilerplate code:

GitHub link to CoreDataStack.m

You can drag/drop this into your projects (take the .h file too, of course) – it has NO DEPENDENCIES, except for CoreData.framework (which is needed in all Core Data projects, by definition).
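To give a feel for what is being wrapped, here is a minimal sketch of the four layers being wired together in one function. This is NOT the code from GitHub: the function name MakeContextForModelNamed, the choice of an SQLite file in the Documents directory, and the file-naming scheme are all assumptions made for illustration (and ARC is assumed).

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper (not part of CoreDataStack): builds
// model -> coordinator -> store -> context for a compiled model in the main bundle.
// Returns nil on any failure; outError is only filled in if adding the store fails.
static NSManagedObjectContext *MakeContextForModelNamed(NSString *modelName, NSError **outError)
{
    // Compiled models are .momd (versioned) or .mom (unversioned); try both.
    NSBundle *bundle = [NSBundle mainBundle];
    NSURL *modelURL = [bundle URLForResource:modelName withExtension:@"momd"];
    if (modelURL == nil) {
        modelURL = [bundle URLForResource:modelName withExtension:@"mom"];
    }
    if (modelURL == nil) {
        return nil; // no such model in the bundle
    }

    NSManagedObjectModel *model = [[NSManagedObjectModel alloc] initWithContentsOfURL:modelURL];
    if (model == nil) {
        return nil;
    }

    NSPersistentStoreCoordinator *coordinator =
        [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];

    // Assumption: one SQLite file per model, stored in the app's Documents directory.
    NSURL *documents = [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                               inDomains:NSUserDomainMask] lastObject];
    NSURL *storeURL = [documents URLByAppendingPathComponent:
                          [modelName stringByAppendingPathExtension:@"sqlite"]];

    if (![coordinator addPersistentStoreWithType:NSSQLiteStoreType
                                   configuration:nil
                                             URL:storeURL
                                         options:nil
                                           error:outError]) {
        return nil;
    }

    // Plain -init is the thread-confinement initialiser; later SDKs deprecate it
    // in favour of -initWithConcurrencyType:.
    NSManagedObjectContext *context = [[NSManagedObjectContext alloc] init];
    context.persistentStoreCoordinator = coordinator;
    return context;
}
```

Even this cut-down version has four separate places where it can fail, which is a fair hint as to why it deserves a class of its own rather than a corner of AppDelegate.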

Using CoreDataStack

Instead of Apple’s large number of method calls, and large number of properties, in real-world projects you need just one property and just one method.

(NB: this assumes you’ve already used Apple’s GUI Editor to create a CoreDataModel file with one or more Entities and Attributes)

To start using Core Data:

/** one line complete init */
CoreDataStack* cdStack = [CoreDataStack coreDataStackWithModelName:@"MyModelName"];

/** one property you need for all Core Data method calls */
cdStack.managedObjectContext;

…where MyModelName is the filename of the Model file (the thing you click on to get the GUI interface for editing Entities and Attributes)

That’s it! Really! Everything else can be inferred (so we do it automatically, in code).

For compatibility, all the other items you *might, theoretically* want to access (e.g. the NSPersistentStoreCoordinator, etc) are presented as properties on CoreDataStack.

Why is this a separate class, and yet not a Singleton?

This is actually very important: you DO NOT WANT CoreDataStack to be a Singleton! (although many people try to write their apps that way – c.f. Apple’s default template and its abuse of AppDelegate.m!)

Why?

Because Core Data was designed to have *multiple simultaneous instances* of NSManagedObjectContext, NSPersistentStoreCoordinator, etc. If you limit yourself to one copy of each – as per Apple’s template – you disable many of the features of CoreData.

Worse, some of those features are *required* if you’re going to use CoreData correctly on a real-world project.

The nearest you can get to making this a singleton is … have one instance per xcdatamodel file (an earlier version of the code would even cache this for you, although we have since removed that because the performance impact was invisible – not worth the complexity).

More on this (multiple CoreDataStack instances) in the next section…

Tips and tricks: Improving your own code

Use multiple CoreDataStack instances

Here’s a little secret: Core Data was designed for you to access MULTIPLE “models” at once, within a single app … but Apple’s default template for iPhone projects makes that impossible (unless you replace it).

Also, remember the most important rule of Core Data: Apple’s source code is NOT THREAD SAFE and WILL CORRUPT YOUR DATA at random intervals if your app is multithreaded in the “wrong” way (see the PS at the end of this post).

…but here’s another, bigger, secret: if you’re clever, using multiple models at once, you can *avoid* the need for thread-safe code – both your code and Apple’s code. This is the only excuse for Apple’s code being unsafe: you can avoid the need for safety.

So, with CoreDataStack, to use a second model in your app, all you do is:

/** one line complete init */
CoreDataStack* cdOtherStack = [CoreDataStack coreDataStackWithModelName:@"MyOtherModelName"];

/** one property you need for all Core Data method calls */
cdOtherStack.managedObjectContext;

So long as you only have one thread reading and writing to each CoreDataStack instance, you will avoid all the bugs caused by Apple’s unsafe code. In most multi-threaded apps, it’s easy to split your data into multiple separate models, and have each model locked to a particular thread.

Multiple models … what?

Say you’re writing an Email client, that has the following classes for Core Data:

  • Email.m : Subject, To, From, Body, isReadYet, DateReceived
  • Person.m : EmailAddress, Name

Your app does the following:

  • Shows a list of emails, that you can click on to read
  • Shows a list of people – tap a person’s name to send them an email, or tap an “Add” button to add a new person
  • Downloads emails in the background automatically
  • Syncs your contacts to an addressbook server in the background

You now have a multi-threading problem – two sets of background threads accessing Core Data. This *will corrupt your data*. Apple has never tried to make CoreData thread-safe.

Now you have two choices:

  • OPTION 1: learn how to write the bizarre code that lets CoreData function with multiple threads. Pray that none of your colleagues dares to edit any of your code – because if they do, there’s a high chance they’ll break it without realizing
  • OPTION 2: Instead of having ONE model, have TWO models. One model contains just “Email” and the other contains just “Person”. Each background thread is associated 1:1 with a separate CoreDataStack instance – and suddenly everything is Thread-Safe. Nothing to worry about!

NB: this works because of CoreData’s fundamental design: Apple’s code is *not thread-safe for a single NSManagedObjectContext, but it IS thread-safe for multiple separate NSManagedObjectContexts in memory at once*
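One way to enforce the “one thread per model” rule of OPTION 2 is to hide each context behind its own serial queue, so nothing else can touch it. The sketch below is an illustration only: ConfinedWorker and ContextFactory are hypothetical names, ARC is assumed, and it assumes a serial dispatch queue is an acceptable stand-in for a dedicated thread in your app.

```objc
#import <CoreData/CoreData.h>

// Hypothetical: a block that builds and returns the context for one model.
typedef NSManagedObjectContext *(^ContextFactory)(void);

// Hypothetical helper: one serial queue that owns exactly one context.
@interface ConfinedWorker : NSObject
- (instancetype)initWithLabel:(const char *)label factory:(ContextFactory)factory;
- (void)perform:(void (^)(NSManagedObjectContext *context))work;
@end

@implementation ConfinedWorker {
    dispatch_queue_t _queue;
    ContextFactory _factory;
    NSManagedObjectContext *_context; // only ever touched on _queue
}

- (instancetype)initWithLabel:(const char *)label factory:(ContextFactory)factory
{
    if ((self = [super init])) {
        _queue = dispatch_queue_create(label, DISPATCH_QUEUE_SERIAL);
        _factory = [factory copy];
    }
    return self;
}

- (void)perform:(void (^)(NSManagedObjectContext *context))work
{
    dispatch_async(_queue, ^{
        // Create the context lazily ON the queue that will use it,
        // never on the caller's thread.
        if (self->_context == nil) {
            self->_context = self->_factory();
        }
        work(self->_context);
    });
}
@end
```

In the email example you would have two of these, one for “Email” and one for “Person”, and each factory block would create the CoreDataStack for its model and return that stack’s managedObjectContext. The download code only ever calls perform: on the first worker, and the addressbook sync only ever calls it on the second.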

Experts only: Single CoreData model, multiple threads

Further, if you know what you’re doing with multi-threading, there are many situations where you need to have two copies in memory of the SAME object-model. You’ll be manually cross-synching (there lie dragons).

But to be thread-safe, you have to guarantee that the entire stack – NSPersistentStoreCoordinator etc – is separate for each thread. The hard way to do this is to manually manage it. The easy way is to init a separate CoreDataStack instance per thread – and this will automatically be thread-safe, because each CoreDataStack is coded to NOT share any data or references between instances.

Good practices – applies to all CoreData projects

Don’t hard-code your class-names

Apple’s example source code for using CoreData is technically correct, but practically poor. It encourages a bad habit that – for most projects – is unnecessary and the cause of many bugs over time.

When you need to reference a CD object, Core Data is written so that you don’t need to have access to the Class of that object. In theory, you can load a Class (from CoreData’s database, saved by a different app) that doesn’t exist in your project. We’re getting into some weird and freaky stuff here.

Just in case – even though 90% of coders will never use that feature – Apple tells you to use NSString to instantiate your CoreData objects. This is bad for most projects:

  1. Apple’s refactor tools in Xcode are 10 years behind everyone else’s – they don’t support CoreData, and they ignore those strings – if you refactor a CoreData class, Xcode will break your project
  2. It’s very very easy to make a typo when writing that string. Xcode 3 would auto-complete the name for you, but Xcode 4 removed this feature

So, each time you need to create a new object in the CoreData database/store, instead of this:

Email* newEmail = [NSEntityDescription insertNewObjectForEntityForName:@"Email" inManagedObjectContext:cdStack.managedObjectContext];

do this:

Email* newEmail = [NSEntityDescription insertNewObjectForEntityForName:NSStringFromClass([Email class]) inManagedObjectContext:cdStack.managedObjectContext];

…using Xcode’s autocomplete, this is the same or fewer keystrokes, even though it results in more text. More importantly, the compiler will now double-check that class for you, and refuse to build if you typo it. You’re also safe(r) with refactoring.

Similarly, when you do a Fetch with CoreData, instead of passing the string-name of the class, use the same NSStringFromClass call as above, for the same benefits.
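For the Fetch case, a minimal sketch might look like the following. It assumes the Entity in your model is called “Email” and is mapped to a class of the same name (the Email class declared here is a stand-in for the one Xcode generates from your model), and FetchAllEmails is a hypothetical helper name.

```objc
#import <CoreData/CoreData.h>

// Stand-in for the generated Email class; in a real project this comes from your model.
@interface Email : NSManagedObject
@end

@implementation Email
@end

// Hypothetical helper: fetches every Email in the given context.
// Returns nil (and logs) if the fetch fails.
static NSArray *FetchAllEmails(NSManagedObjectContext *context)
{
    // The compiler checks that the Email class exists, so a typo here
    // is a build error instead of a runtime crash.
    NSFetchRequest *request =
        [NSFetchRequest fetchRequestWithEntityName:NSStringFromClass([Email class])];

    NSError *error = nil;
    NSArray *emails = [context executeFetchRequest:request error:&error];
    if (emails == nil) {
        NSLog(@"Fetch of %@ failed: %@", NSStringFromClass([Email class]), error);
    }
    return emails;
}
```

Note that this trick only works while the Entity name and the class name are identical; if you rename one without the other, the string no longer matches anything in the model.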

Upgrading the CoreDataStack

Save Errors SHOULD NEVER BE IGNORED

Apple doesn’t like Exception Handlers and Assertions. Personally, I think 30 years of computer industry have proved them wrong there, but I’m willing to accept it.

Except for when a SAVE of CoreData fails; in this particular case, it is totally unacceptable for your app to silently ignore the error. Sadly, Apple’s default setup encourages you to do this.

How many times have you seen this in an app:

[self.managedObjectContext save:nil];

?

Sadly, I see it all the time. Because the alternative requires a minimum of 5 lines of code, including a double-pointer (that a lot of junior programmers and/or people who’ve never used C/C++ seem to feel uncomfortable with).
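For anyone who hasn’t written it out before, the long-hand version looks roughly like this. SaveAndLogErrors is a hypothetical helper name, and logging is only a placeholder for whatever error handling your app really needs.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: saves the context and reports any failure instead of discarding it.
static BOOL SaveAndLogErrors(NSManagedObjectContext *context)
{
    NSError *error = nil;                  // filled in by Core Data only if the save fails
    BOOL didSave = [context save:&error];  // &error is the "double-pointer": an NSError **
    if (!didSave) {
        NSLog(@"Core Data save FAILED: %@", error);
    }
    return didSave;
}
```

The important detail is that it tests the BOOL returned by save: and not whether the error pointer was set.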

As a bonus, Apple’s code has some nasty behaviour with that save method. They have documented this (so I guess it’s a “feature” instead of a bug – but see what you think):

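If managedObjectContext is nil, then “a save that fails” will ALWAYS seem to return “there was no error”. (In Objective-C, a message sent to nil does nothing, so the NSError you passed in is never filled in.)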

This is one of the most painful things I’ve had to debug on CoreData projects. Many times.

So, we’re going to fix both of those. CoreDataStack.m has an optional method:

-(void) saveOrFail:(void(^)(NSError* errorOrNil)) blockFailedToSave
  1. Because it’s a block, it’s *just as easy to log the error* as it is to ignore it (instead of requiring you to write error-creating + checking code). Further, because it’s a block, it’s easy to nest attempted saves, logs, and failures.
  2. If your NSManagedObjectContext is nil … instead of doing what Apple does (silent failure; pretends that the save has succeeded) … we explicitly FAIL (cf. the source code for the saveOrFail method).
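As a rough illustration of those two points, here is a sketch of how a method with that signature could behave, plus a call site. It is NOT the implementation from GitHub: SketchStack is a hypothetical stand-in for CoreDataStack, ARC is assumed, and calling the block with nil when the context is missing is my assumption about what “explicitly FAIL” could look like.

```objc
#import <CoreData/CoreData.h>

// Hypothetical stand-in for CoreDataStack, reduced to what this sketch needs.
@interface SketchStack : NSObject
@property (nonatomic, strong) NSManagedObjectContext *managedObjectContext;
- (void)saveOrFail:(void (^)(NSError *errorOrNil))blockFailedToSave;
@end

@implementation SketchStack

- (void)saveOrFail:(void (^)(NSError *errorOrNil))blockFailedToSave
{
    NSManagedObjectContext *context = self.managedObjectContext;
    if (context == nil) {
        // Messaging nil would do nothing and leave the error empty,
        // so treat a missing context as a failure in its own right.
        if (blockFailedToSave) blockFailedToSave(nil);
        return;
    }

    NSError *error = nil;
    if (![context save:&error]) {
        if (blockFailedToSave) blockFailedToSave(error);
    }
}

@end

// Call site: logging the failure costs one line inside the block.
static void SaveLoudly(SketchStack *stack)
{
    [stack saveOrFail:^(NSError *errorOrNil) {
        NSLog(@"Save failed: %@", errorOrNil ?: @"(no managedObjectContext)");
    }];
}
```

The block only runs on failure, so the happy path stays a single line while the failure path can no longer be skipped by accident.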

PS

CoreData corrupts data if used multi-threaded?


Until 2011, this wasn’t even documented, but … not only is CoreData “not thread-safe”, but if you merely allocate/init a CoreData context from a different thread you will cause it to quietly self-destruct later on – after it’s taken your precious data. It’s easy to understand once you know – but this is not an obvious side-effect, and I’ve seen it catch a few people.

e.g. one of the patterns coders previously used to handle multi-threading – because Apple hadn’t documented this requirement – involved Thread A creating the context, adding notification listeners, and then passing the context to the thread that would use it.

This works most of the time – I know, because I used this pattern back in 2009 – and I’d learnt it from someone else who’d been using it for a while themselves. But it also fails some of the time, unpredictably, with strange crashes deep in Apple’s code. Now we know better, of course.

My take-home: multi-threaded CoreData is to be avoided unless you *really* know what you’re doing with CoreData. Given that Apple has been very slow (years slow!) to document this aspect of CoreData, I don’t recommend it, unless you’re willing to spend a lot of time learning the “community wisdom” on the topic.