MacKuba

Kuba Suder's blog on Mac & iOS development

Advanced CloudKit

Categories: CloudKit, iCloud, WWDC 14

The CloudKit API is designed to be asynchronous: all calls return through a callback, because they all require a network connection

The main API (“operational API”) is based on NSOperation

You use it by creating special NSOperation objects for a given use case, e.g. CKFetchRecordsOperation, and specifying parameters and callbacks in its properties

Apart from the final result callback, you can set callbacks e.g. for reporting download progress or to get records one by one as they’re downloaded
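As a rough sketch of what setting those callbacks looks like: the helper name makeFetchOperation is made up for illustration, and the Result-based block properties used here assume a recent SDK (iOS 15 / macOS 12 or later); older SDKs expose similar callbacks under different names.

```swift
import CloudKit

/// Builds (but does not start) a fetch operation for the given record IDs.
/// `makeFetchOperation` is a hypothetical helper, not part of CloudKit.
func makeFetchOperation(for recordIDs: [CKRecord.ID]) -> CKFetchRecordsOperation {
    let operation = CKFetchRecordsOperation(recordIDs: recordIDs)

    // Progress callback: reports download progress per record (0.0 to 1.0).
    operation.perRecordProgressBlock = { recordID, progress in
        print("\(recordID.recordName): \(Int(progress * 100))%")
    }

    // Per-record callback: delivers each record (or its error) as it arrives.
    operation.perRecordResultBlock = { recordID, result in
        switch result {
        case .success(let record):
            print("Fetched \(record.recordID.recordName)")
        case .failure(let error):
            print("Failed \(recordID.recordName): \(error)")
        }
    }

    // Final callback: called once when the whole operation ends.
    operation.fetchRecordsResultBlock = { result in
        if case .failure(let error) = result {
            print("Fetch failed: \(error)")
        }
    }
    return operation
}
```

The returned operation does nothing until it is added to a database or an operation queue.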

Operation lifecycle (cancelling, suspending etc.) can be managed through standard NSOperation methods and NSOperationQueue

There are separate fetch/modify operation types for records, subscriptions, zones, users and notifications

You can set dependencies between operations (even if they’re in different queues), e.g. create a fetch operation and then a modify operation that waits for the object to load

Operations can also have different priority levels
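Dependencies and priorities are plain NSOperation features, so they can be tried out without CloudKit at all. A minimal sketch using BlockOperation stand-ins in place of real fetch/modify operations (EventLog and runDependentOperations are illustrative names only):

```swift
import Foundation

/// Thread-safe list of events, so blocks on different queues can append to it.
final class EventLog: @unchecked Sendable {
    private let lock = NSLock()
    private var events: [String] = []
    func append(_ event: String) { lock.lock(); events.append(event); lock.unlock() }
    var all: [String] { lock.lock(); defer { lock.unlock() }; return events }
}

func runDependentOperations() -> [String] {
    let log = EventLog()
    let fetchQueue = OperationQueue()
    let modifyQueue = OperationQueue()

    // Stand-ins for a CloudKit fetch operation and a modify operation.
    let fetchStandIn = BlockOperation { log.append("fetch") }
    let modifyStandIn = BlockOperation { log.append("modify") }

    // "modify" will not start until "fetch" has finished,
    // even though the two operations live in different queues.
    modifyStandIn.addDependency(fetchStandIn)
    modifyStandIn.queuePriority = .high

    // Enqueue in the "wrong" order on purpose; the dependency still holds.
    modifyQueue.addOperation(modifyStandIn)
    fetchQueue.addOperation(fetchStandIn)

    modifyQueue.waitUntilAllOperationsAreFinished()
    return log.all
}
```

With real CloudKit operations the same addDependency and queuePriority calls apply, since CKOperation is an NSOperation subclass.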

Starting an operation:

(ℹ️ Note: this wasn’t in the video, but it really should have been, because it’s not at all obvious.)

How to start an operation once you prepare the CKOperation object:

1) Use the database’s built-in queue:

let fetchOperation = ...
CKContainer.default().privateCloudDatabase.add(fetchOperation)

2) Use your own operation queue and assign a reference to the database:

let operationQueue = OperationQueue()

let fetchOperation = ...
fetchOperation.database = CKContainer.default().privateCloudDatabase
operationQueue.addOperation(fetchOperation)

Custom zones

Custom zones (in the private database) let you compartmentalize data and add some special features

Records can’t be moved between zones or have cross-zone relationships

There are some operations that can only be done in custom zones:

Atomic commits:

Objects in the CloudKit database have relationships between them, and you want to keep all data consistent

Atomic commits are kind of like transactions in a relational database: batch operations succeed or fail together

Only available in the private database (because public database may be accessed by millions of users at the same time)

If an operation fails, you get a CKErrorPartialFailure response, with the userInfo containing a dictionary of errors for specific records (under CKPartialErrorsByItemIDKey)

Error CKErrorBatchRequestFailed means that this record wasn’t saved because of a problem with another record in the batch
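A sketch of how the two error codes above might be told apart in Swift. The function names are hypothetical; the error accessors (partialErrorsByItemID, the .batchRequestFailed code) are the Swift counterparts of the constants mentioned above.

```swift
import CloudKit

/// Builds an atomic save for records that all live in one custom zone.
func makeAtomicSave(of records: [CKRecord]) -> CKModifyRecordsOperation {
    let operation = CKModifyRecordsOperation(recordsToSave: records,
                                             recordIDsToDelete: nil)
    operation.isAtomic = true   // set explicitly for clarity: all or nothing
    return operation
}

/// From a failed batch, returns the IDs of records that had a real problem,
/// skipping those rejected only because another record in the batch failed.
func rootCauseRecordIDs(in error: Error) -> [CKRecord.ID] {
    guard let ckError = error as? CKError,
          ckError.code == .partialFailure,
          let perItem = ckError.partialErrorsByItemID else { return [] }

    var result: [CKRecord.ID] = []
    for (item, itemError) in perItem {
        guard let recordID = item as? CKRecord.ID else { continue }
        if (itemError as? CKError)?.code != .batchRequestFailed {
            result.append(recordID)
        }
    }
    return result
}
```

Fixing the records returned by rootCauseRecordIDs and resubmitting the whole batch is one reasonable recovery strategy.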

Delta downloads:

Allows you to download a list of all changes since the last time the app was online, to let you perform a full sync

When a device connects, you can send a “change token” to the server asking for all changes since that version

This lets you implement an offline cache of the whole dataset and sync any changes when possible

To do that:

  • track all local changes
  • send changes to the server when connected
  • resolve conflicts
  • fetch server changes with CKFetchRecordChangesOperation
  • remember the received new server change token and send it back next time
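The token handling in the last two steps can be illustrated without any networking. This is only a toy model: FakeZone and SyncClient are invented for the example, and the Int token stands in for CloudKit's opaque server change token.

```swift
/// Hypothetical in-memory stand-in for a zone on the server.
struct FakeZone {
    private(set) var changes: [String] = []   // ordered log of changed record names

    mutating func save(_ recordName: String) { changes.append(recordName) }

    /// Returns everything recorded after `token`, plus a new token to store.
    func changes(since token: Int?) -> (changed: [String], newToken: Int) {
        let start = token ?? 0                // nil token means "send everything"
        return (Array(changes[start...]), changes.count)
    }
}

/// Client side: remembers the last token and asks only for newer changes.
struct SyncClient {
    private(set) var changeToken: Int?        // nil until the first sync
    private(set) var cache: [String] = []     // local copy of the dataset

    mutating func sync(with zone: FakeZone) -> [String] {
        let (changed, newToken) = zone.changes(since: changeToken)
        cache.append(contentsOf: changed)
        changeToken = newToken                // send this back next time
        return changed
    }
}
```

In a real app the token would also need to be persisted between launches, together with the cache it describes.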

Zone subscriptions:

Lets you subscribe to notifications about any change in the zone

When you get a notification, you request a delta download
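A minimal sketch of setting up such a subscription, assuming a current SDK where CKRecordZoneSubscription is available (iOS 10 or later). The helper name and the subscription ID format are made up for the example.

```swift
import CloudKit

/// Builds a subscription that fires on any change in the named custom zone.
/// `makeZoneSubscription` and the "<zone>-changes" ID are hypothetical.
func makeZoneSubscription(zoneName: String) -> CKRecordZoneSubscription {
    let zoneID = CKRecordZone.ID(zoneName: zoneName,
                                 ownerName: CKCurrentUserDefaultName)
    let subscription = CKRecordZoneSubscription(zoneID: zoneID,
                                                subscriptionID: "\(zoneName)-changes")

    // Ask for a silent push, so the app can request a delta download
    // in the background instead of showing an alert.
    let info = CKSubscription.NotificationInfo()
    info.shouldSendContentAvailable = true
    subscription.notificationInfo = info
    return subscription
}

// Usage (needs a signed-in iCloud account and the CloudKit entitlement):
// CKContainer.default().privateCloudDatabase
//     .save(makeZoneSubscription(zoneName: "Notes")) { _, error in /* handle error */ }
```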

Advanced record operations

Record changes:

When you change some fields in a CKRecord, the changes are automatically tracked locally and only the changed fields are transmitted when you save it

By default CloudKit performs a “locked update”, which makes sure that the update is only saved on the server if the record wasn’t modified in the meantime by another client (this uses record change tokens)

After you execute a save, the server returns your record with a new change token – so you should use that returned version for any subsequent changes

Unlocked update  ⭢  just overwrites server data regardless of what is there

Locked update  ⭢  if the record was changed in the meantime, you get back an error (CKErrorServerRecordChanged)

The userInfo of the CKErrorServerRecordChanged error contains info that lets you perform a 3-way merge:

  • CKRecordChangedErrorClientRecordKey – what you tried to save
  • CKRecordChangedErrorAncestorRecordKey – the original version
  • CKRecordChangedErrorServerRecordKey – what is currently on the server

Based on the values from these 3 copies of the record you can decide what state the record should be in, and then retry the save
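What the merge decision can look like, reduced to plain dictionaries of field values. The policy here is only one possible choice and an assumption of this sketch: a field changed on one side keeps that side's value, and if both sides changed it, the server wins.

```swift
/// Merges the fields of one record from its three versions.
/// Assumed policy: one-sided changes are kept; on a two-sided change the server wins.
func threeWayMerge(ancestor: [String: String],
                   client: [String: String],
                   server: [String: String]) -> [String: String] {
    var merged = server
    let keys = Set(ancestor.keys).union(client.keys).union(server.keys)

    for key in keys {
        let clientChanged = client[key] != ancestor[key]
        let serverChanged = server[key] != ancestor[key]
        if clientChanged && !serverChanged {
            // Only the client touched this field: keep the client's value.
            // Assigning nil removes a field that the client deleted.
            merged[key] = client[key]
        }
    }
    return merged
}
```

With real records, the three inputs would come from the three userInfo keys listed above, and the merged values would be written into the server record before retrying the save.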

You can modify the behavior with “save policies”:

  • SaveIfServerRecordUnchanged  ⭢  default, performs a locked update and sends only changed keys
  • SaveChangedKeys  ⭢  unlocked update, sends only changed keys
  • SaveAllKeys  ⭢  unlocked update, overwrites all keys in the record (note: this doesn’t affect keys that aren’t present in the local copy at all)

You should almost always use the default locked update (SaveIfServerRecordUnchanged); use unlocked updates only to forcefully resolve serious conflicts

Use SaveAllKeys if the user requests to overwrite server data with local data
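In current Swift SDKs the three policies are spelled .ifServerRecordUnchanged, .changedKeys and .allKeys. A small sketch of picking one based on what the app is trying to do (the SaveIntent enum is invented for this example):

```swift
import CloudKit

/// Hypothetical description of why the app is saving a record.
enum SaveIntent {
    case normalEdit                 // everyday save
    case forceConflictResolution    // push changed fields over a serious conflict
    case overwriteServerWithLocal   // user explicitly chose "keep my version"
}

func savePolicy(for intent: SaveIntent) -> CKModifyRecordsOperation.RecordSavePolicy {
    switch intent {
    case .normalEdit:
        return .ifServerRecordUnchanged   // locked update (the default)
    case .forceConflictResolution:
        return .changedKeys               // unlocked, changed keys only
    case .overwriteServerWithLocal:
        return .allKeys                   // unlocked, every local key
    }
}

// Usage: operation.savePolicy = savePolicy(for: .normalEdit)
```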

Partial records:

The desiredKeys field present in most operation types lets you specify that you only want to download selected keys from the server

This is useful if the whole record is very large and you don’t need all of it

Partial records can be saved normally after a change
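For example, to fetch only a single field of some large records (the "title" key and the helper name are assumptions made for this sketch):

```swift
import CloudKit

/// Builds a fetch that downloads only the "title" field of each record.
/// "title" is a hypothetical field name; use whatever keys your schema has.
func makeTitleOnlyFetch(for recordIDs: [CKRecord.ID]) -> CKFetchRecordsOperation {
    let operation = CKFetchRecordsOperation(recordIDs: recordIDs)
    operation.desiredKeys = ["title"]   // nil (the default) means "all keys"
    return operation
}
```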

CloudKit data modeling

References:

Forward reference  ⭢  a parent object keeps an array of references to children in its property

Backward reference  ⭢  only child objects have a reference to the parent

It’s recommended to use backward references – with a forward reference you need to update the parent object every time a new child is added, and you will run into conflicts if multiple clients are adding records

To get a list of all children using backward references, make a query for all child records with a predicate “owner = X”

References give you cascading deletes – when you delete the parent object, all child objects and their children are deleted

If an object has two parent references, it’s deleted when the first parent is deleted
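A sketch of the backward-reference pattern in Swift. The "owner" field name follows the predicate mentioned above; the helper names are hypothetical. The .deleteSelf action is what asks for the cascading delete.

```swift
import CloudKit

/// Creates a child record that points back at its parent.
func makeChild(type: String, parentID: CKRecord.ID) -> CKRecord {
    let child = CKRecord(recordType: type)
    // .deleteSelf: delete this child when the referenced parent is deleted.
    child["owner"] = CKRecord.Reference(recordID: parentID, action: .deleteSelf)
    return child
}

/// Builds a query for all children of the given parent ("owner = X").
func childrenQuery(type: String, parentID: CKRecord.ID) -> CKQuery {
    let predicate = NSPredicate(format: "owner == %@", parentID)
    return CKQuery(recordType: type, predicate: predicate)
}
```

Adding a child this way never touches the parent record, which is why multiple clients can add children without conflicting.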

When batch uploading a tree of objects, CloudKit makes sure that parent objects are uploaded first so that you don’t get inconsistent data during upload (important in the public database)

Your data objects:

CloudKit is only a transport mechanism and requires you to keep and manage your own local copy of all data

It’s recommended that you don’t subclass CK* objects to build your models – make your own completely independent model classes and translate to/from CloudKit objects when fetching and saving
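One way this translation layer might look, using a made-up Note model with a single "title" field:

```swift
import CloudKit

/// Plain model type, independent of CloudKit. `Note` is a hypothetical example.
struct Note: Equatable {
    var id: String
    var title: String

    init(id: String, title: String) {
        self.id = id
        self.title = title
    }

    /// CloudKit -> model. Fails if the record has the wrong type or lacks a title.
    init?(record: CKRecord) {
        guard record.recordType == "Note",
              let title = record["title"] as? String else { return nil }
        self.id = record.recordID.recordName
        self.title = title
    }

    /// Model -> CloudKit, for a record that does not exist on the server yet.
    func makeRecord() -> CKRecord {
        let record = CKRecord(recordType: "Note",
                              recordID: CKRecord.ID(recordName: id))
        record["title"] = title
        return record
    }
}
```

For updates to an existing record you would normally keep the fetched CKRecord around and copy the model's values into it, so that its change token is preserved for locked updates.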

Handling push notifications:

You need to remember that push notifications in general aren’t guaranteed to be delivered

The server only stores one push per client, so if you reconnect e.g. after a flight, you might miss some previous notifications

You can find pushes that you’ve missed in a “Notification Collection” where every notification is saved

The Notification Collection works kind of like delta updates – you ask for notifications since a given change token and you get a list of everything added since then

You can mark a notification as read, which notifies all other clients that they can ignore it

You should check the Notification Collection every time you get a push, since you never know what you might have missed (this doesn’t only happen with airplane mode)

The iCloud Dashboard

The dashboard lets you browse data saved by your app – the whole public database and the private database for your developer account (but not anyone else’s private database)

You can view saved records, run queries with any filters, and add new records

You can define roles in the public database and define for each model who can create/read/modify records (e.g. specify that records are publicly readable but only an admin can create them)

You will also see a list of all user IDs, plus the first/last names of users who marked themselves as discoverable

Schema:

The CloudKit database has two separate “environments”: development and production

The schema for each record type is created “just in time” during development, i.e. when you save a new type of record, CloudKit automatically creates a schema for that record type, recording every field type, and when you save a record with a new field, it adds that field to the list

However, once you’re ready to release a new version of your app, you need to save the schema to production and at that point it’s locked – a production version of the app can’t save records or fields that aren’t defined in the schema

CloudKit also automatically creates indexes for each field in each record type – when you’re done with development, you can delete some indexes that you won’t need so they don’t waste space in the production database

Tips & tricks

Please handle all errors :)

Remember that you can get partial errors (when atomic commits aren’t used), so some records might be saved while others aren’t

Retry any “server busy” errors (CKErrorRetryAfterKey tells you the amount of time you should wait)
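A sketch of reading that delay in Swift. Which error codes count as "server busy" and the 3-second fallback are assumptions of this example, not something stated in the session.

```swift
import CloudKit

/// Returns how long to wait before retrying, or nil if the error is not retryable.
func retryDelay(for error: Error) -> TimeInterval? {
    guard let ckError = error as? CKError else { return nil }

    switch ckError.code {
    case .serviceUnavailable, .requestRateLimited, .zoneBusy:
        // retryAfterSeconds reads CKErrorRetryAfterKey from the error's userInfo.
        // The fallback of 3 seconds is an arbitrary choice for this sketch.
        return ckError.retryAfterSeconds ?? 3
    default:
        return nil
    }
}
```

A caller could then re-enqueue a fresh copy of the failed operation after the returned delay.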

Don’t waste space in your users’ private databases in iCloud; they may be paying real money for it

Limits in the public database are mostly there to prevent abuse; they should be fine for most normal use (the limits scale with the number of users)