Custom Core ML Model with Swift & Python
You may have used or read about Apple’s Core ML, which brings machine learning to your app. In this tutorial we will be training our own custom Core ML model to recognize images of cats, emus and dolphins. By the end of this tutorial you will be able to train your own custom model to recognize anything you want! Neat!
Preparing our data
Before creating our iOS app we need to train our own custom Core ML model. To do this we will be using Turi Create, a super easy-to-use Python framework for training our own models.
First of all, if you don’t have it already, head over to the Python website and download and install Python 2.7. Then open up Terminal and run the following command:
pip install turicreate
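If you want to double check that the install worked before going any further, a tiny throwaway script like the following (the file name is just a suggestion) should import the library without complaining:

# check_install.py -- optional sanity check that turicreate is importable
import turicreate

print("turicreate imported successfully")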
This will install the turicreate library for Python, which we will be needing. Once this is done we can move on to training our model! First of all, download the dataset folder here. It contains images of cats, emus and dolphins. Extract this folder into the place where you are storing this project for the model. Next up, create a Classifier.py file at the root level of your project; the structure will look something like this:
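In other words, Classifier.py sits next to the extracted dataset folder, which contains one sub-folder per animal. Roughly:

CoreMLProject/
    Classifier.py
    dataset/
        cat/
        dolphin/
        emu/

(The project folder name above is just a placeholder; the sub-folder names come from the dataset download.)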
Now open up Classifier.py in your favorite text editor and add the following:
import turicreate as turi
import os

def getImageFromPath(path):
    # normpath will normalize the path, e.g. /a/b/c/cat/meow1.png
    # dirname will return the directories only, e.g. /a/b/c/cat
    # basename returns the last component, e.g. cat
    return os.path.basename(os.path.dirname(os.path.normpath(path)))

myPath = 'dataset'
data = turi.image_analysis.load_images(myPath, with_path = True, recursive = True)
data["animals"] = data["path"].apply(lambda path: getImageFromPath(path))
print(data.groupby("animals", [turi.aggregate.COUNT]))
data.save("animals.sframe")
data.explore()
What this piece of code is doing is essentially tagging our images so they can then be used to train a machine learning model. We tell Python to look in the ‘dataset’ folder and load the images from it. After that we run the getImageFromPath function on each image’s path, so any image in a folder will be tagged with that folder’s name.
So, for example, any images in the cat folder will be tagged as a ‘cat’, whereas any images in the emu folder will be tagged as an ’emu’, and so on.
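To make the path handling concrete, here is a small stand-alone sketch of what each os.path call returns for an image inside the cat folder (the file name is a made-up example):

import os

# Hypothetical image path inside the dataset -- the file name is just an example.
path = "dataset/cat/meow1.png"

normalized = os.path.normpath(path)       # "dataset/cat/meow1.png"
directory = os.path.dirname(normalized)   # "dataset/cat"
tag = os.path.basename(directory)         # "cat"

print(tag)  # prints: cat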
We then output the results to the console so we can see how many images we have for each tag. You will see something like the following output:
+---------+-------+
| animals | Count |
+---------+-------+
| emu     | 53    |
| dolphin | 65    |
| cat     | 34    |
+---------+-------+
[3 rows x 2 columns]
Then with data.explore() we can actually visualize this and verify that our tagging is correct. Quick note: you will need to run this script with Python (for example, python Classifier.py from the project folder).
Training our model
Now that we have prepared our data it’s time to train our model! Append the following code to what we already have in Python:
train_data, test_data = data.random_split(0.9)
model = turi.image_classifier.create(train_data, target="animals")
predictions = model.predict(test_data)
metrics = model.evaluate(test_data)
print(metrics["accuracy"])

print("Saving model")
model.save("animals.model")

print("Saving core ml model")
model.export_coreml("animals.mlmodel")

print("Done")
The first line is the most important one to take note of: it splits our data into two sets, train_data and test_data. Roughly 90% of our images will be the training data, each correctly tagged as a cat, dolphin or emu.
The test_data is used to test our model, measure its accuracy and make sure it works. Essentially the remaining 10% of images, which the model has not seen before, are picked and the model is asked to guess what is in each photo.
You can adjust the balance by changing 0.9. I would recommend staying in the range of 0.8 to 0.9, which means 80%-90% of the images are used to train the model and the rest are used to test it out.
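As a rough illustration of what random_split gives you with the 152 images counted above, you should see something like this (your exact counts will differ slightly because the split is random):

# Sketch: inspect how the 0.9 split divides the SFrame built earlier in Classifier.py.
train_data, test_data = data.random_split(0.9)

print(len(train_data))  # roughly 90% of the images, e.g. around 137 of 152
print(len(test_data))   # the remaining ~10%, e.g. around 15 of 152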
The code will then train our model and export it to the same folder. The Core ML model we will be using for the iOS app is the animals.mlmodel file.
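If you want to sanity check the saved Turi Create model before jumping into Xcode, something along these lines should work; the image path is a hypothetical example, so substitute any image from your dataset folder:

import turicreate as turi

# Load the model we saved above.
model = turi.load_model("animals.model")

# Build a tiny SFrame with a single image -- the file name here is made up.
img = turi.Image("dataset/cat/meow1.png")
sample = turi.SFrame({"image": [img]})

print(model.predict(sample))  # e.g. ['cat']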
Using a custom Core ML model in our iOS App
First of all, create a new single view application in Xcode. Then add the animals.mlmodel file into your project, ensuring ‘Copy items if needed’ is selected. Then go to the storyboard and add an image view, text label and buttons as follows:
Then go to the split view editor and connect up the objects to our ViewController.swift as follows:
// Outlets
@IBOutlet var imgGuess: UIImageView!
@IBOutlet var lblGuess: UILabel!

// Actions
@IBAction func takePhoto(_ sender: Any) {
}

@IBAction func takePhotoFromCamera(_ sender: Any) {
}
Once this is set up, import CoreML and Vision, and add the UINavigationControllerDelegate and UIImagePickerControllerDelegate protocols to our ViewController as follows:
import CoreML
import Vision

class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {
Then add the following code to our class. The takePhoto and takePhotoFromCamera functions are already created, so simply fill them out!
func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    if let pickedImage = info[UIImagePickerControllerOriginalImage] as? UIImage {
        lblGuess.text = "Thinking"

        // Set the image view
        imgGuess.contentMode = .scaleToFill
        imgGuess.image = pickedImage

        // Get the model
        guard let model = try? VNCoreMLModel(for: animals().model) else {
            fatalError("Unable to load model")
        }

        // Create vision request
        let request = VNCoreMLRequest(model: model) { [weak self] request, error in
            guard let results = request.results as? [VNClassificationObservation],
                let topResult = results.first else {
                    fatalError("Unexpected results")
            }

            // Update the main UI thread with our result
            DispatchQueue.main.async { [weak self] in
                self?.lblGuess.text = "\(topResult.identifier) with \(Int(topResult.confidence * 100))% confidence"
            }
        }

        guard let ciImage = CIImage(image: pickedImage) else { fatalError("Cannot read picked image") }

        // Run the classifier
        let handler = VNImageRequestHandler(ciImage: ciImage)
        DispatchQueue.global().async {
            do {
                try handler.perform([request])
            } catch {
                print(error)
            }
        }
    }

    picker.dismiss(animated: true, completion: nil)
}

// Get a photo from the photos library
@IBAction func takePhoto(_ sender: Any) {
    if UIImagePickerController.isSourceTypeAvailable(UIImagePickerControllerSourceType.photoLibrary) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = UIImagePickerControllerSourceType.photoLibrary
        imagePicker.allowsEditing = false
        self.present(imagePicker, animated: true, completion: nil)
    }
}

// Take a photo from the camera
@IBAction func takePhotoFromCamera(_ sender: Any) {
    if UIImagePickerController.isSourceTypeAvailable(UIImagePickerControllerSourceType.camera) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = UIImagePickerControllerSourceType.camera
        imagePicker.allowsEditing = false
        self.present(imagePicker, animated: true, completion: nil)
    }
}
When we take a photo from the Photos app or camera, it will call imagePickerController didFinishPickingMediaWithInfo. This runs the image through our animals Core ML model and then displays the top guess, along with the percentage of confidence that it is correct. One thing to remember: if you want to use the camera on a real device, add an NSCameraUsageDescription entry to your Info.plist so iOS can show the camera permission prompt.
So now you can train your own custom Core ML model – you can use any images you like for training!