# Finding Things In Images: A Single-Pass Connected Components Algorithm in Swift

The mature workhorse of object identification in image processing is the connected components algorithm. The algorithm was described in the venerable but still useful two-volume classic from 1982, Digital Picture Processing by Rosenfeld and Kak, and in the decades since by numerous books, presentations, and web pages about image processing.

Unless you are relying solely on a machine learning algorithm to identify and locate an object from an image, your image processing pipeline is likely to include a connected components algorithm. Typically you manipulate a color or grayscale image to generate a binarized black and white image, and then you run a connected components algorithm to find and label regions of connected pixels or “blobs.”
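As a minimal sketch of that binarization step (the function name and the fixed threshold of 128 are my own choices for illustration, not from any particular library):

```swift
import Foundation

// Binarize a grayscale image (values 0...255) into foreground/background.
// A fixed threshold is hard-coded here only for simplicity.
func binarize(_ gray: [[UInt8]], threshold: UInt8 = 128) -> [[Bool]] {
    gray.map { row in row.map { $0 >= threshold } }
}

let gray: [[UInt8]] = [
    [10, 200, 220],
    [15, 180,  20],
]
let mask = binarize(gray)
// mask is [[false, true, true], [false, true, false]]
```

In practice you would compute the threshold from the image itself, e.g. with Otsu's method, rather than hard-coding one.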

Here’s a nice example from a GPU implementation: https://mcclanahoochie.com/blog/portfolio/cuda-connected-component-labeling/

Each of those colored regions is a coin whose size and location are now known.

How you make an image simple enough to feed into a connected components algorithm is a whole ‘nother discussion. Or two.

The Wikipedia entry on the connected components algorithm provides a reasonably thorough overview.

https://en.wikipedia.org/wiki/Connected-component_labeling

Just over nine years ago I wrote about the algorithm on StackOverflow in answer to the question “Implementing 8-Connectivity Connected-Component Labeling in Python.”

Anyone who works in image processing should understand how the algorithm works. Thankfully, it can be understood without any deep exposure to graph theory, statistics, optics, or other fields related to image processing.

If you know a bit of geometry and can think of images as two-dimensional arrays of pixels, then you can understand the connected components algorithm. Google word combinations like “connected components image processing” until you find the simplest explanation. Then find a library in your programming language of choice and see what the connected components algorithm can do. You’ll have a much easier time learning the algorithm if you see it working first.
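If you’d rather see the concept in miniature before digging into a library, here is a toy sketch of my own (a simple flood fill, not the single-pass algorithm this post covers) that labels 4-connected regions of foreground pixels with 1, 2, 3, and so on:

```swift
import Foundation

// Label 4-connected regions of `true` pixels with 1, 2, 3, ...
// Stack-based flood fill: not fast, but easy to follow.
func labelRegions(_ mask: [[Bool]]) -> [[Int]] {
    let h = mask.count
    let w = h > 0 ? mask[0].count : 0
    var labels = Array(repeating: Array(repeating: 0, count: w), count: h)
    var next = 0

    for y in 0..<h {
        for x in 0..<w where mask[y][x] && labels[y][x] == 0 {
            next += 1
            labels[y][x] = next
            var stack = [(x, y)]
            while let (cx, cy) = stack.popLast() {
                for (nx, ny) in [(cx - 1, cy), (cx + 1, cy), (cx, cy - 1), (cx, cy + 1)] {
                    if nx >= 0, ny >= 0, nx < w, ny < h,
                       mask[ny][nx], labels[ny][nx] == 0 {
                        labels[ny][nx] = next
                        stack.append((nx, ny))
                    }
                }
            }
        }
    }
    return labels
}

let mask: [[Bool]] = [
    [true,  true,  false, false],
    [false, false, false, true ],
    [false, false, true,  true ],
]
let labels = labelRegions(mask)
// labels:
// [[1, 1, 0, 0],
//  [0, 0, 0, 2],
//  [0, 0, 2, 2]]
```

Two separate blobs, two labels. Real implementations avoid the recursion/stack and extra passes, which is exactly what the single-pass algorithm below is about.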

Some libraries are worth trying first; OpenCV, for example, provides connected components labeling via its connectedComponents() and connectedComponentsWithStats() functions.

# And Now, for Those of You Looking for Code…

If you found this page because you’re looking for code associated with keywords “connected component” and “Swift,” then you and I may belong to the same karass. You already know the algorithm, but rather than integrate some honkin’ big third party library to get it you just want some quick Swift code. You’re my people!

Maybe you’re just curious about the single-pass algorithm, in which case all you may need is a link to the PDF.

# The Single-Pass Algorithm

The single-pass algorithm may be useful if you’re working with limited CPU memory. A single scan of the image labels all the pixels in the image as foreground and background and yields the outer and inner contours of blobs.

The academic paper “A Linear-Time Component-Labeling Algorithm Using Contour Tracing Technique” by our heroes Chang, Chen, and Lu describes the one-pass algorithm and provides an overview of more traditional two-pass algorithms. Here’s a link to the PDF:

https://homepage.iis.sinica.edu.tw/~fchang/paper/component_labeling_cviu.pdf

The paper is eminently readable. Even so, the sample code I provide may not make sense unless you’ve read the paper first.

Long story short, the single scan works by tracing around blob edges and inside blob holes. During the trace the algorithm visits every pixel.
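Here is a small sketch of the direction bookkeeping that makes the trace work; this is my own illustration of the clockwise-neighbor convention used in the paper and in the Blob.swift code below:

```swift
import Foundation

// Clockwise 8-neighborhood around pixel P, indexed by direction d:
//   5 6 7
//   4 P 0
//   3 2 1
let dx = [1, 1, 0, -1, -1, -1, 0, 1] // x offset for direction d
let dy = [0, 1, 1, 1, 0, -1, -1, -1] // y offset for direction d

// After arriving at a contour point from direction d, the tracer restarts
// its clockwise scan two positions counterclockwise of d:
func restartDirection(afterArrivingFrom d: Int) -> Int {
    (d + 6) % 8
}
```

For example, arriving while heading right (d = 0) restarts the scan at the upper-left neighbor (d = 6), which guarantees the tracer hugs the blob boundary instead of wandering into its interior.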

Here’s a copy of an image from the 2015 paper “Video Processing Algorithms for Detection of Pedestrians” that uses the algorithm of Chang, Chen, and Lu.

# Sample Code in Swift

As usual I’m pasting code here rather than providing a link to a GitHub repo. Copy & paste away.

# Blob.swift

The Swift 5 (Xcode 12.5) code below implements the single-pass connected components algorithm of Chang, Chen, and Lu as Blob.init(data:threshold:polarity:).

The use of closures and whatnot is my attempt to be reasonably Swifty.

The code runs on the CPU. If you want anything approaching real-time performance, use a 3rd party library instead. Or do what I typically do: crop out a region of interest using CICrop, and then downsample using CILanczosScaleTransform to yield an image of no more than a few hundred pixels on a side.

How you convert the image from color to grayscale is up to you. I use CIColorThresholdOtsu for binarization and then extract data from just the red channel.

The data parameter is the two-dimensional [[UInt8]] array that gets processed. Below, in the sections for Pixels.swift and UIImage+Pixels.swift, I provide sample code to extract grayscale data from a UIImage.
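If you want to experiment without the UIImage plumbing, here is a hedged sketch of my own (the function name and parameters are mine) of pulling one channel out of a flat RGBA buffer into the data[x][y] shape that Blob.init expects:

```swift
import Foundation

// Pull one channel (offset 0 = red for RGBA data) out of a flat pixel
// buffer and shape it as data[x][y], the layout Blob.init expects.
// A toy stand-in for the Pixels.reds() helper later in this post.
func channel(_ buffer: [UInt8], width: Int, height: Int,
             bytesPerPixel: Int = 4, offset: Int = 0) -> [[UInt8]] {
    (0..<width).map { x in
        (0..<height).map { y in
            buffer[(y * width + x) * bytesPerPixel + offset]
        }
    }
}

// A 2x1 RGBA image: one red pixel, one blue pixel
let buffer: [UInt8] = [255, 0, 0, 255,   0, 0, 255, 255]
let reds = channel(buffer, width: 2, height: 1)
// reds == [[255], [0]]
```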

But first here’s Blob.swift.

```swift
//
//  Blob.swift
//  ImageTester 001
//
//  Created by GARY BARTOS on 3/29/21.
//

import Foundation
import CoreGraphics

struct Point: Equatable {
    var x: Int
    var y: Int

    var cgPoint: CGPoint {
        CGPoint(x: x, y: y)
    }

    init(_ x: Int, _ y: Int) {
        self.x = x
        self.y = y
    }
}

struct Blob {
    enum Polarity {
        case WhiteOnBlack
        case BlackOnWhite
    }

    private(set) var externalContours: [Int: [Point]] = [:]
    private(set) var internalContours: [Int: [Point]] = [:]

    private(set) var imageWidth: Int
    private(set) var imageHeight: Int

    private(set) var labelCount: Int

    private(set) var labels: [[Int]] = []

    private(set) var polarity: Polarity
    private(set) var threshold: UInt8

    func largestContour() -> (key: Int, value: [Point])? {
        externalContours.max { a, b in a.value.count < b.value.count }
    }

    init(data: [[UInt8]], threshold: UInt8, polarity: Polarity = .BlackOnWhite) {
        self.polarity = polarity
        self.threshold = threshold

        // data is indexed data[x][y]: width from the outer array, height from the inner
        let w = data.count
        let h = data.isEmpty ? 0 : data[0].count

        self.imageWidth = w
        self.imageHeight = h

        var c = 1
        var labels = Array(repeating: Array(repeating: 0, count: h), count: w)
        let visitedBackground = -1

        let insideImage = { (x: Int, y: Int) -> Bool in
            x >= 0 && y >= 0 && x < w && y < h
        }

        let outsideImage = { (x: Int, y: Int) -> Bool in
            x < 0 || y < 0 || x >= w || y >= h
        }

        let isUnlabeled = { (x: Int, y: Int) -> Bool in
            outsideImage(x, y) || labels[x][y] == 0
        }

        let isForeground: (Int, Int) -> Bool
        let isBackground: (Int, Int) -> Bool

        switch polarity {
        case .BlackOnWhite:
            isForeground = { (x: Int, y: Int) -> Bool in
                insideImage(x, y) && data[x][y] <= threshold
            }
            isBackground = { (x: Int, y: Int) -> Bool in
                outsideImage(x, y) || data[x][y] > threshold
            }
        case .WhiteOnBlack:
            isForeground = { (x: Int, y: Int) -> Bool in
                insideImage(x, y) && data[x][y] >= threshold
            }
            isBackground = { (x: Int, y: Int) -> Bool in
                outsideImage(x, y) || data[x][y] < threshold
            }
        }

        /// "P is unlabeled and its left neighbor is a white [background] pixel"
        let isExternalContour = { (x: Int, y: Int) -> Bool in
            isUnlabeled(x, y) && isBackground(x - 1, y)
        }

        /// "the right neighbor of P is an unlabeled white [background] pixel"
        let isInternalContour = { (x: Int, y: Int) -> Bool in
            isUnlabeled(x + 1, y) && isBackground(x + 1, y)
        }

        // directions (variable "d")
        // 5 6 7
        // 4 p 0
        // 3 2 1
        let i = [1, 1, 0, -1, -1, -1, 0, 1] // relative offset for x, clockwise direction d
        let j = [0, 1, 1, 1, 0, -1, -1, -1] // relative offset for y, clockwise direction d

        let tracer = { (p: Point, direction: Int) -> (p: Point, d: Int)? in
            for index in 0 ..< 8 {
                let d = (direction + index) % 8
                let x = p.x + i[d]
                let y = p.y + j[d]

                if isForeground(x, y) {
                    return (Point(x, y), d)
                }
                else if insideImage(x, y) {
                    labels[x][y] = visitedBackground
                }
            }
            return nil
        }

        // 1. start external contour: start at d = 0
        // 2. start internal contour: start at d = 1
        // 3. from a previous contour point: start at d = (previous + 6) % 8
        let contourTracing = { (x: Int, y: Int, direction: Int) -> [Point] in
            var points = [Point(x, y)]

            guard var next = tracer(points[0], direction) else {
                return points
            }
            var prev = next

            // execute until two conditions hold:
            // 1. tracer returns the original point
            // 2. the next contour point is T again
            repeat {
                prev = next
                points.append(prev.p)
                next = tracer(prev.p, (prev.d + 6) % 8)!
            } while prev.p != points[0] && next.p != points[0]

            return points
        }

        for y in 0 ..< h {
            for x in 0 ..< w {
                // only process foreground pixels; skip over background pixels
                if isBackground(x, y) {
                    continue
                }

                // case 1: P is unlabeled and its left neighbor is background => P is an external contour point
                //   execute contour tracing to find the contour containing P
                // case 2: the right neighbor of P is an unlabeled background pixel => P is an internal contour point
                //   execute contour tracing to find the contour containing P
                //   2a: P is not labeled; the left neighbor of P must have been labeled;
                //       assign all contour points the same label as the left neighbor of P
                //   2b: P is already labeled; assign all contour points the same label as P
                // case 3: none of the above; the left neighbor of P must have been labeled already;
                //   assign to P the same label as its left neighbor
                if isExternalContour(x, y) {
                    let points = contourTracing(x, y, 0)
                    for point in points {
                        labels[point.x][point.y] = c
                    }
                    externalContours[c] = points
                    c += 1
                }
                else if isInternalContour(x, y) {
                    let label = isUnlabeled(x, y) ? labels[x - 1][y] : labels[x][y]
                    let points = contourTracing(x, y, 1)
                    for point in points {
                        labels[point.x][point.y] = label
                    }
                    internalContours[c] = points
                }
                else if x > 0 && isUnlabeled(x, y) {
                    // a plain "else" condition ends up overwriting existing foreground pixels
                    labels[x][y] = labels[x - 1][y]
                }
            }
        }

        self.labelCount = c - 1
        self.labels = labels
    }
}
```

# Blob+image.swift

Once you’ve created an instance of Blob, you’ll want a UIImage with the biggest blob or maybe all the blobs.

Once you’ve initialized your Blob, call image(sourceOrientation:drawMode:) to get an optional UIImage. Whether it returns nil or an image, you can assign the return value to UIImageView.image.

This extension for UIKit (for iOS apps) would only need a few tweaks to run with AppKit (for desktop apps).

```swift
//
//  Blob+Image.swift
//  ImageTester 001
//
//  Created by GARY BARTOS on 3/30/21.
//

import UIKit

extension Blob {
    enum DrawMode: String, CaseIterable {
        case DrawAllContours
        case DrawLargestContour
        case FillAllBlobs
        case FillLargestBlob
    }

    func image(sourceOrientation: UIImage.Orientation, drawMode: DrawMode) -> UIImage? {
        if imageWidth == 0 || imageHeight == 0 {
            return nil
        }

        // create a CGRect representing the full size of our input image
        let imageRect = CGRect(x: 0, y: 0, width: imageWidth, height: imageHeight)
        let format = UIGraphicsImageRendererFormat()
        format.scale = 1
        //format.preferredRange = .extended

        // colors
        let background = UIColor.black
        let bigColor = UIColor.green
        let colors = [UIColor.blue, UIColor.red, UIColor.yellow, UIColor.magenta,
                      UIColor.orange, UIColor.brown, UIColor.cyan, UIColor.purple]

        var dict: [UIColor: [CGRect]] = [:]
        dict[bigColor] = []
        for color in colors {
            dict[color] = []
        }

        let drawRectangles = { (ctx: UIGraphicsImageRendererContext) in
            for item in dict {
                item.key.setFill()
                ctx.cgContext.fill(item.value)
            }
        }

        // renderer
        let renderer = UIGraphicsImageRenderer(size: imageRect.size, format: format)

        let result = renderer.image { ctx in
            background.setFill()
            ctx.cgContext.fill(imageRect)

            if let largest = largestContour() {
                let colorByLabel = { (label: Int) -> UIColor in
                    label == largest.key ? bigColor : colors[label % colors.count]
                }

                switch drawMode {
                case .DrawAllContours:
                    //TODO: change Blob labeling to make internal contours the same color as their external contour
                    for contour in externalContours {
                        let color = colorByLabel(contour.key)
                        let rects = contour.value.map { CGRect(x: $0.x, y: $0.y, width: 1, height: 1) }
                        dict[color]!.append(contentsOf: rects)
                    }
                    for contour in internalContours {
                        let color = colorByLabel(contour.key)
                        let rects = contour.value.map { CGRect(x: $0.x, y: $0.y, width: 1, height: 1) }
                        dict[color]!.append(contentsOf: rects)
                    }
                    drawRectangles(ctx)

                case .DrawLargestContour:
                    let color = colorByLabel(largest.key)
                    let rects = largest.value.map { CGRect(x: $0.x, y: $0.y, width: 1, height: 1) }
                    dict[color]!.append(contentsOf: rects)
                    drawRectangles(ctx)

                case .FillAllBlobs:
                    for contour in externalContours {
                        let points = contour.value

                        ctx.cgContext.beginPath()
                        ctx.cgContext.move(to: points[0].cgPoint)
                        for i in 1 ..< points.count {
                            ctx.cgContext.addLine(to: points[i].cgPoint)
                        }
                        ctx.cgContext.closePath()

                        colorByLabel(contour.key).setFill()
                        ctx.cgContext.fillPath()
                    }
                    for contour in internalContours {
                        let points = contour.value

                        ctx.cgContext.beginPath()
                        ctx.cgContext.move(to: points[0].cgPoint)
                        for i in 1 ..< points.count {
                            ctx.cgContext.addLine(to: points[i].cgPoint)
                        }
                        ctx.cgContext.closePath()

                        background.setFill()    // make it a hole
                        ctx.cgContext.fillPath()
                    }

                case .FillLargestBlob:
                    let points = largest.value

                    ctx.cgContext.beginPath()
                    ctx.cgContext.move(to: points[0].cgPoint)
                    for i in 1 ..< points.count {
                        ctx.cgContext.addLine(to: points[i].cgPoint)
                    }
                    ctx.cgContext.closePath()

                    colorByLabel(largest.key).setFill()
                    ctx.cgContext.fillPath()
                }
            }
        }

        if sourceOrientation == .up {
            return result
        }

        // if needed, see https://stackoverflow.com/questions/8915630/ios-uiimageview-how-to-handle-uiimage-image-orientation
        let rotatedImage = UIImage(cgImage: result.cgImage!, scale: 1.0, orientation: sourceOrientation)
        return rotatedImage
    }
}
```

# Pixels.swift

With apologies to the original authors whose links I can’t find at the moment (and/or the people from whom they inherited code), here’s the Pixels type that I’ve been using for pixel data extracted from a UIImage. This type is returned from a function made available in an extension to UIImage.

```swift
//
//  PixelArrays.swift
//  Point 001
//
//  Created by GARY BARTOS on 12/20/20.
//

import Foundation
import CoreGraphics
import UIKit

/// TODO: initialize with UIImage or CGImage, thus setting imageWidth and imageHeight
struct Pixels {
    var data: [UInt8]

    /// Image width. Typically CGImage.width, avoiding issues with orientation.
    var imageWidth: Int

    /// Image height. Typically CGImage.height, avoiding issues with orientation.
    var imageHeight: Int

    var bytesPerPixel: Int

    var isEmpty: Bool {
        data.isEmpty
    }

    var byteCount: Int {
        data.count
    }

    var bytesPerRow: Int {
        bytesPerPixel * imageWidth
    }

    func cgIndex(x: Int, y: Int) -> Int {
        y * bytesPerRow + x * bytesPerPixel
    }

    func uiIndex(x: Int, y: Int, orientation: UIImage.Orientation) -> Int {
        let size = CGSize(width: imageWidth, height: imageHeight)
        do {
            // cgToUITransform is a UIImage extension helper not shown in this post
            let xform = try UIImage.cgToUITransform(cgImageSize: size, orientation: orientation)
            let uiToCg = xform.inverted()
            let translated = CGPoint(x: x, y: y).applying(uiToCg)
            return cgIndex(x: Int(translated.x), y: Int(translated.y))
        }
        catch {
            print(error.localizedDescription)
            return 0
        }
    }

    private func pixel(index: Int) -> UIColor {
        UIColor(
            red: CGFloat(data[index]) / 255.0,
            green: CGFloat(data[index + 1]) / 255.0,
            blue: CGFloat(data[index + 2]) / 255.0,
            alpha: CGFloat(data[index + 3]) / 255.0)
    }

    func cgPixel(x: Int, y: Int) -> UIColor {
        if data.isEmpty {
            //TODO: throw exception?
            return UIColor()
        }
        let i = cgIndex(x: x, y: y)
        return pixel(index: i)
    }

    func red(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        return data[cgIndex(x: x, y: y)]
    }

    func green(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        return data[cgIndex(x: x, y: y) + 1]
    }

    func blue(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        return data[cgIndex(x: x, y: y) + 2]
    }

    private func arrayByOffset(offset: Int) -> [[UInt8]] {
        var start = offset
        var arr: [[UInt8]] = []
        for _ in 0 ..< imageWidth {
            var indexSet = IndexSet()
            for i in stride(from: start, to: byteCount, by: bytesPerRow) {
                indexSet.insert(i)
            }
            let column = indexSet.map { data[$0] }
            arr.append(column)
            start += bytesPerPixel
        }
        return arr
    }

    // Offsets assume RGBA data, matching red(x:y:), green(x:y:), and blue(x:y:) above.
    func reds() -> [[UInt8]] {
        arrayByOffset(offset: 0)
    }

    func greens() -> [[UInt8]] {
        arrayByOffset(offset: 1)
    }

    func blues() -> [[UInt8]] {
        arrayByOffset(offset: 2)
    }

    func uiPixel(x: Int, y: Int, orientation: UIImage.Orientation) -> UIColor {
        let i = uiIndex(x: x, y: y, orientation: orientation)
        return pixel(index: i)
    }

    private mutating func setPixel(index: Int, color: UIColor) {
        var red: CGFloat = 0
        var green: CGFloat = 0
        var blue: CGFloat = 0
        var alpha: CGFloat = 0

        color.getRed(&red, green: &green, blue: &blue, alpha: &alpha)
        data[index] = UInt8(red * 255.0)
        data[index + 1] = UInt8(green * 255.0)
        data[index + 2] = UInt8(blue * 255.0)
        data[index + 3] = UInt8(alpha * 255.0)
    }

    mutating func cgSetPixel(x: Int, y: Int, color: UIColor) {
        if data.isEmpty {
            //TODO: throw exception for x,y out of bounds
            return
        }
        let i = cgIndex(x: x, y: y)
        setPixel(index: i, color: color)
    }

    mutating func uiSetPixel(x: Int, y: Int, color: UIColor, orientation: UIImage.Orientation) {
        if data.isEmpty {
            //TODO: throw exception for x,y out of bounds
            return
        }
        let i = uiIndex(x: x, y: y, orientation: orientation)
        setPixel(index: i, color: color)
    }

    static var zero: Pixels {
        Pixels(data: [], imageWidth: 0, imageHeight: 0, bytesPerPixel: 0)
    }
}
```

# UIImage+Pixels.swift

The Pixels type is used in this extension of UIImage.

Feel free to find and/or write more efficient code! This got me going. After spending a few hours downloading and testing various techniques for extracting pixels from images, this is the code that worked best for me.

```swift
//
//  UIImage+PixelDataViaCGContext.swift
//  Point 001
//
//  Created by GARY BARTOS on 12/20/20.
//

import UIKit

extension UIImage {
    /// Generates an array of pixel values from the underlying CGImage.
    /// About 0.080 seconds to generate an array of UInt8 values, laid out RGBX
    /// (alpha byte skipped) given the noneSkipLast bitmap info below.
    /// Creates the array by drawing into a CGContext.
    ///
    /// https://stackoverflow.com/questions/33768066/get-pixel-data-as-array-from-uiimage-cgimage-in-swift
    func pixels() -> Pixels {
        guard let cgImage = self.cgImage else {
            return Pixels.zero
        }

        let width = cgImage.width
        let height = cgImage.height
        let bytesPerPixel = 4
        var pixelData = [UInt8](repeating: 0, count: width * height * bytesPerPixel)
        let colorSpace = CGColorSpaceCreateDeviceRGB()

        let context = CGContext(data: &pixelData,
                                width: width,
                                height: height,
                                bitsPerComponent: 8,
                                bytesPerRow: bytesPerPixel * width,
                                space: colorSpace,
                                bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue)

        context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
        return Pixels(data: pixelData, imageWidth: width, imageHeight: height, bytesPerPixel: bytesPerPixel)
    }

    static func fromPixels(pixels: Pixels, orientation: UIImage.Orientation, scale: CGFloat = 1) -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()
        guard let rgbData = CFDataCreate(nil, pixels.data, pixels.byteCount) else {
            return nil
        }
        guard let provider = CGDataProvider(data: rgbData) else {
            return nil
        }

        guard let rgbImageRef = CGImage(
            width: pixels.imageWidth,
            height: pixels.imageHeight,
            bitsPerComponent: 8,
            bitsPerPixel: 8 * pixels.bytesPerPixel,
            bytesPerRow: pixels.bytesPerPixel * pixels.imageWidth,
            space: colorSpace,
            bitmapInfo: CGBitmapInfo(rawValue: 0),
            provider: provider,
            decode: nil,
            shouldInterpolate: false,
            intent: CGColorRenderingIntent.defaultIntent)
        else {
            return nil
        }

        return UIImage(cgImage: rgbImageRef, scale: scale, orientation: orientation)
    }

    /// 1D array to UIImage.
    /// Passing in the array returned from pixels() should return the original UIImage.
    /// Based on https://stackoverflow.com/questions/2261177/cgimage-from-byte-array
    static func from(pixelData: [UInt8], width: Int, height: Int, orientation: UIImage.Orientation, scale: CGFloat = 1, bytesPerPixel: Int = 4) -> UIImage? {
        let pixels = Pixels(data: pixelData, imageWidth: width, imageHeight: height, bytesPerPixel: bytesPerPixel)
        return fromPixels(pixels: pixels, orientation: orientation, scale: scale)
    }
}
```

# Usage

The following code yields a UIImage with labeled blobs from a source CIImage.

```swift
func blobImage(ci: CIImage, binarizer: CIFilter, drawMode: Blob.DrawMode) -> UIImage? {
    binarizer.setValue(ci, forKey: kCIInputImageKey)
    let ciFiltered = binarizer.outputImage!

    // scaleDownFilter (e.g. a CILanczosScaleTransform instance) and
    // UIImage.fromCIImage are helpers defined elsewhere in my app.
    scaleDownFilter.setValue(ciFiltered, forKey: kCIInputImageKey)
    let ciScaled = scaleDownFilter.outputImage!
    let uiScaled = UIImage.fromCIImage(ci: ciScaled)

    let reds = uiScaled.pixels().reds()
    let blob = Blob(data: reds, threshold: 128, polarity: .WhiteOnBlack)
    let bimg = blob.image(sourceOrientation: uiScaled.imageOrientation, drawMode: drawMode)
    return bimg
}
```

That’s a very long post. One of these days I’ll push my code samples to GitHub; for now I like having it all within the Medium post itself.
