Swift CoreML Image Recognition Tutorial
In this tutorial we take a look at the new Core ML framework introduced at WWDC 2017. You can use it with Swift 4 & iOS 11 onwards. Core ML allows you to use machine learning on your iPhone or iPad! You train your machine learning model outside your phone and then run it on the device. The best feature is that it doesn't require an internet connection!
Storyboard setup
First of all, set up the storyboard as follows, with constraints. We place the following:
- An image view at the top (get the image here)
- A "Thinking…" label under that
- A Take Photo button and a Pick Image button below this
Now open up the assistant editor and connect the objects we just placed to the ViewController.swift class as follows:
- Image to an outlet named myPhoto
- Thinking… label to an outlet named lblResult
- Take Photo button to an action named “takePhoto”
- Pick Image button to an action named "pickImage"
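After making these connections, the top of ViewController.swift should look roughly like the sketch below (the outlet and action names come from the list above; the action bodies are filled in later in the tutorial):

import UIKit

class ViewController: UIViewController {

    // Outlets connected from the storyboard
    @IBOutlet weak var myPhoto: UIImageView!
    @IBOutlet weak var lblResult: UILabel!

    // Actions connected from the storyboard buttons (filled in later)
    @IBAction func takePhoto(_ sender: Any) {
    }

    @IBAction func pickImage(_ sender: Any) {
    }
}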
The Core ML Model
Before we do any coding we need a model. A model is a set of data that has been trained; for example, the model we will be using is trained on 205 different categories of images. Our app will send a new image to the model and, based on that training, it will try to guess what is in it.
Download the model from the Apple open source model listing here. We are using the Places205-GoogLeNet model. Once downloaded, drag it into your Xcode project to include it.
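Once the model is in the project, Xcode generates a Swift class for it (here GoogLeNetPlaces), which is what we instantiate in the code below. Make sure ViewController.swift imports both CoreML and Vision at the top, roughly like this:

import UIKit
import CoreML   // for the generated GoogLeNetPlaces model class
import Vision   // for VNCoreMLModel, VNCoreMLRequest and friends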
Coding it
Now that we have the model, we can use it to determine the contents of an image, neat! To do this, add the following function:
func detectImageContent() {
    lblResult.text = "Thinking"

    // 1. Try and load the model
    guard let model = try? VNCoreMLModel(for: GoogLeNetPlaces().model) else {
        fatalError("Failed to load model")
    }

    // 2. Create a Vision request
    let request = VNCoreMLRequest(model: model) { [weak self] request, error in
        guard let results = request.results as? [VNClassificationObservation],
            let topResult = results.first else {
                fatalError("Unexpected results")
        }

        // 3. Update the main UI thread with our result
        DispatchQueue.main.async { [weak self] in
            self?.lblResult.text = "\(topResult.identifier) with \(Int(topResult.confidence * 100))% confidence"
        }
    }

    guard let ciImage = CIImage(image: self.myPhoto.image!) else {
        fatalError("Can't create CIImage from UIImage")
    }

    // 4. Run the GoogLeNetPlaces classifier
    let handler = VNImageRequestHandler(ciImage: ciImage)
    DispatchQueue.global().async {
        do {
            try handler.perform([request])
        } catch {
            print(error)
        }
    }
}
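One thing worth noting: the function force-unwraps myPhoto.image, which assumes the image view always has an image set (the default one from the storyboard). A more defensive sketch of that step, assuming you might not always have an image, could look like this:

// A minimal, more defensive alternative to the force unwrap above
guard let uiImage = myPhoto.image, let ciImage = CIImage(image: uiImage) else {
    lblResult.text = "No image to classify"
    return
}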
Next up, in viewDidLoad we call detectImageContent() to try to determine what is in our default image.
override func viewDidLoad() {
    super.viewDidLoad()
    // Do any additional setup after loading the view, typically from a nib.
    detectImageContent()
}
Run the app & see what the image gets classified as!
So how does the code work? Let's break it down.
- First of all we attempt to load the model; if that fails we stop (in this example by simply calling fatalError).
- We create a Vision request for the model; the image will be passed to this request later. The request returns a list of guesses for what the image contains, each with a confidence.
- Once the model has made its guesses, we update the label on the main thread with the top guess & its confidence rating. We only show the top result in this example (see the sketch after this list for showing more).
- To kick off the process we create a handler that executes our Vision request, passing the image as a CIImage. Vision needs the image in this type (rather than a plain UIImage) before it can hand it to the model.
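As a quick illustration, here is a minimal sketch (reusing the same results array from the request above) of how you could show the top three guesses instead of only the first:

let request = VNCoreMLRequest(model: model) { [weak self] request, error in
    guard let results = request.results as? [VNClassificationObservation] else { return }
    // Take the three highest-confidence guesses and join them into one string
    let topThree = results.prefix(3)
        .map { "\($0.identifier) (\(Int($0.confidence * 100))%)" }
        .joined(separator: "\n")
    DispatchQueue.main.async {
        self?.lblResult.text = topThree
    }
}

You would also want to set numberOfLines to 0 on the label so all three lines are visible.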
Adding support for photos & the camera
First of all, change the class declaration as follows so our ViewController can act as the image picker's delegate, giving us access to the user's photos & camera.
class ViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {
Next up, add the following functions, which are covered in the tutorial here on how to access the photos & camera. The only difference is that in func imagePickerController(...) we call detectImageContent() at the end to detect the content of the newly picked image.
@IBAction func takePhoto(_ sender: Any) {
    if UIImagePickerController.isSourceTypeAvailable(UIImagePickerControllerSourceType.camera) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = UIImagePickerControllerSourceType.camera
        imagePicker.allowsEditing = false
        self.present(imagePicker, animated: true, completion: nil)
    }
}

@IBAction func pickImage(_ sender: Any) {
    if UIImagePickerController.isSourceTypeAvailable(UIImagePickerControllerSourceType.photoLibrary) {
        let imagePicker = UIImagePickerController()
        imagePicker.delegate = self
        imagePicker.sourceType = UIImagePickerControllerSourceType.photoLibrary
        imagePicker.allowsEditing = true
        self.present(imagePicker, animated: true, completion: nil)
    }
}

func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String : Any]) {
    if let pickedImage = info[UIImagePickerControllerOriginalImage] as? UIImage {
        myPhoto.contentMode = .scaleToFill
        myPhoto.image = pickedImage
    }
    picker.dismiss(animated: true, completion: nil)
    detectImageContent()
}
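Note: on a real device, requesting camera access requires an NSCameraUsageDescription entry in your Info.plist (and, depending on the iOS version, NSPhotoLibraryUsageDescription for the photo library); without it, iOS will terminate the app when the picker is presented.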
The simulator only supports picking images from the Photos app. If you want to try out the camera functionality, you will need to run the app on a physical device running iOS 11.