Swift Optical Character Recognition Tutorial

By the end of this tutorial you will be able to use the Tesseract OCR library to recognize text in images. I tried the following methods on the images in this tutorial, with these success rates:

Core ML – 50%
Swift OCR – 70-80%
Tesseract OCR – 100%

Setting up Tesseract OCR

We will use CocoaPods to install the Tesseract OCR library for Swift. If you are unfamiliar with CocoaPods, check out the tutorial here. To get going, create a new Single View Application in Swift. We named ours ‘SeemuOCR’. Once you have created the project, navigate to the project folder in Terminal and run pod init to set up the Podfile.

Once the Podfile has been created, add the following line to it (use straight quotes, as curly quotes will not parse):

pod 'TesseractOCRiOS', '4.0.0'

The complete Podfile then looks like this:

# Uncomment the next line to define a global platform for your project
# platform :ios, '9.0'

target 'SeemuOCR' do
  # Comment the next line if you're not using Swift and don't want to use dynamic frameworks
  use_frameworks!

  # Pods for SeemuOCR
  pod 'TesseractOCRiOS', '4.0.0'
  
  target 'SeemuOCRTests' do
    inherit! :search_paths
    # Pods for testing
  end

  target 'SeemuOCRUITests' do
    inherit! :search_paths
    # Pods for testing
  end

end

Save the file, then back in Terminal run pod install to install TesseractOCR:

pod install

Before we go any further, download the contents needed for the tutorial here. Once you have them, go to the project folder and create a folder called ‘tessdata’ at the same level as AppDelegate and ViewController. Then add the English training data to that folder.

A quick note: this training data is a specific version (3). The latest Tesseract data files do not currently work with this library.

Next, open the .xcworkspace file. With Finder open, drag the ‘tessdata’ folder you created into Xcode under the project. Once it’s in the project the folder should be blue (this means it is a folder reference that mirrors the file system, rather than a group); otherwise it won’t work.
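
If recognition later fails, one thing worth ruling out is that the training data never made it into the app bundle. The following is a minimal sketch of such a check, not part of the library: the helper name hasTrainedData is hypothetical, and it assumes the training file is named eng.traineddata (the usual Tesseract naming for the "eng" language code).

```swift
import Foundation

/// Hypothetical helper: returns true when
/// `<directory>/tessdata/<language>.traineddata` exists on disk.
/// The file name pattern is an assumption based on Tesseract's usual naming.
func hasTrainedData(in directory: URL, language: String = "eng") -> Bool {
    let file = directory
        .appendingPathComponent("tessdata", isDirectory: true)
        .appendingPathComponent("\(language).traineddata")
    return FileManager.default.fileExists(atPath: file.path)
}

// In the app, the bundle's resource directory is the place to look.
if let resources = Bundle.main.resourceURL {
    print("tessdata bundled:", hasTrainedData(in: resources))
}
```

Calling this once at launch (for example from viewDidLoad) should print true if the blue ‘tessdata’ folder reference was copied into the bundle.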

Recognizing text in images with Swift

Add the images from the contents to the images.xcassets folder:

Then add the following code to the View Controller:

import UIKit
import TesseractOCR

class ViewController: UIViewController, G8TesseractDelegate {

    let tesseract:G8Tesseract = G8Tesseract(language: "eng")
    
    
    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        
        print("Running OCR")
        
        tesseract.delegate = self
        tesseract.charWhitelist = "0123456789"
        
        // Run OCR on each sample image from the asset catalog.
        for name in ["1", "3", "6", "9"] {
            tesseract.image = UIImage(named: name)
            tesseract.recognize()
            // Fall back to an empty string instead of force-unwrapping.
            print("The \(name) text is \(tesseract.recognizedText ?? "")")
        }
        
    }
    
    func shouldCancelImageRecognition(for tesseract: G8Tesseract!) -> Bool {
        return false
    }

}

The code is pretty self-explanatory. Run it and you will see the results printed to the console; it picks up the text in each image every time!
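
If you want to put a number on "every time", as with the success rates quoted at the top, a small helper can compare each recognized string against what the image actually contains. This is a rough sketch in plain Swift: the Sample type and successRate function are hypothetical names, and trimming whitespace assumes the OCR output may carry trailing newlines.

```swift
import Foundation

/// Hypothetical record pairing what an image contains with what OCR returned.
struct Sample {
    let expected: String
    let recognized: String
}

/// Fraction of samples whose recognized text matches the expected text
/// once surrounding whitespace and newlines are trimmed.
func successRate(of samples: [Sample]) -> Double {
    guard !samples.isEmpty else { return 0 }
    let hits = samples.filter {
        $0.recognized.trimmingCharacters(in: .whitespacesAndNewlines) == $0.expected
    }.count
    return Double(hits) / Double(samples.count)
}

// Made-up results for illustration only: three correct, one misread.
let samples = [
    Sample(expected: "1", recognized: "1\n\n"),
    Sample(expected: "3", recognized: "3\n"),
    Sample(expected: "6", recognized: "8\n"),
    Sample(expected: "9", recognized: "9"),
]
print(successRate(of: samples)) // prints 0.75
```

In the view controller you would build the samples from the image names and tesseract.recognizedText instead of the made-up strings used here.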

This library also has options you can set. For example, you will have noticed we used ‘charWhitelist’. This restricts recognition to the listed characters; essentially we are saying our images only contain numbers, so only look for those.

You can also do the opposite and set a blacklist (‘charBlacklist’) of characters you don’t want the library to detect.
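
To make the difference between the two lists concrete, here is a plain-Swift illustration of the idea. It is only a conceptual model and not how Tesseract implements these options internally; applyWhitelist and applyBlacklist are hypothetical names invented for this sketch.

```swift
/// Conceptual model only: keep just the candidates that appear in the whitelist.
func applyWhitelist(_ whitelist: String, to candidates: String) -> String {
    let allowed = Set(whitelist)
    return String(candidates.filter { allowed.contains($0) })
}

/// Conceptual model only: drop every candidate that appears in the blacklist.
func applyBlacklist(_ blacklist: String, to candidates: String) -> String {
    let banned = Set(blacklist)
    return String(candidates.filter { !banned.contains($0) })
}

// Look-alike characters an OCR engine might confuse with digits.
let candidates = "O0l1"
print(applyWhitelist("0123456789", to: candidates)) // prints 01
print(applyBlacklist("Ol", to: candidates))         // prints 01
```

A whitelist suits cases like ours where the full set of valid characters is known up front, while a blacklist is handier when only a few look-alike characters cause trouble.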
