Finding Things In Images: A Single-Pass Connected Components Algorithm in Swift

Gary Bartos
13 min read · May 17, 2021


The mature workhorse of object identification in image processing is the connected components algorithm. The algorithm was described in the venerable but still useful two-volume classic from 1982, Digital Picture Processing by Rosenfeld and Kak, and in the decades since by numerous books, presentations, and web pages about image processing.

Unless you are relying solely on a machine learning algorithm to identify and locate an object from an image, your image processing pipeline is likely to include a connected components algorithm. Typically you manipulate a color or grayscale image to generate a binarized black and white image, and then you run a connected components algorithm to find and label regions of connected pixels or “blobs.”

Here’s a nice example from a GPU implementation: https://mcclanahoochie.com/blog/portfolio/cuda-connected-component-labeling/

https://mcclanahoochie.com/blog/wp-content/uploads/2011/08/coins-bwlabel.png

Each of those colored regions is a coin whose size and location are now known.

How you make an image simple enough to feed into a connected components algorithm is a whole ‘nother discussion. Or two.

The Wikipedia entry on the connected components algorithm provides a reasonably thorough overview.

https://en.wikipedia.org/wiki/Connected-component_labeling

Just over nine years ago I wrote about the algorithm on StackOverflow in answer to the question “Implementing 8-Connectivity Connected-Component Labeling in Python.”

Anyone who works in image processing should understand how the algorithm works. Thankfully, the algorithm can be understood without any deep exposure to graph theory, statistics, optics, or other fields related to image processing.

If you know a bit of geometry and you can think of images as two-dimensional arrays of pixels, then you can understand the connected components algorithm. Google word combinations like “connected components image processing” until you find the simplest explanation. Then find a library in your programming language of choice and see what the connected components algorithm can do. You’ll have a much easier time learning the algorithm if you see it working first.

Some libraries and apps worth trying:

  • ImageJ, an old app that worked just fine, the last I knew. Connected components is implemented as “particle analysis,” a term relevant to medical imaging: https://imagej.net/Particle_Analysis
  • OpenCV connectedComponents() function, if you know C++ or Python and don’t mind building up your own example from a large library of functions.
  • Commercial libraries by Cognex, National Instruments, Microscan, and other companies have had blazing fast, robust implementations of this algorithm for decades.

And Now, for Those of You Looking for Code…

If you found this page because you’re looking for code associated with the keywords “connected component” and “Swift,” then you and I may belong to the same karass. You already know the algorithm, but rather than integrate some honkin’ big third-party library to get it, you just want some quick Swift code. You’re my people!

Maybe you’re just curious about the single-pass algorithm, in which case all you may need is a link to the PDF.

The Single-Pass Algorithm

The single-pass algorithm may be useful if you’re working with limited CPU memory. A single scan of the image labels every pixel as foreground or background and yields the outer and inner contours of blobs.

The academic paper “A Linear-Time Component-Labeling Algorithm Using Contour Tracing Technique” by our heroes Chang, Chen, and Lu describes the one-pass algorithm and provides an overview of more traditional two-pass algorithms. Here’s a link to the PDF:

https://homepage.iis.sinica.edu.tw/~fchang/paper/component_labeling_cviu.pdf

The paper is eminently readable, and the sample code I provide may not make sense unless you’ve read the paper first.

Long story short, the single scan works by tracing around blob edges and inside blob holes. During the trace the algorithm visits every pixel.

Here’s a copy of an image from the 2015 paper “Video Processing Algorithms for Detection of Pedestrians” that uses the algorithm of Chang, Chen, and Lu.

Edge-following graphic illustrating the contour tracing of Chang, Chen, and Lu

Sample Code in Swift

As usual I’m pasting code here rather than providing a link to a GitHub repo. Copy & paste away.

Blob.swift

The Swift 5 (Xcode 12.5) code below implements the single-pass connected components algorithm of Chang, Chen, and Lu as Blob.init(data:threshold:polarity:).

The use of closures and whatnot is my attempt to be reasonably Swifty.

The code runs on the CPU. If you want anything approaching real-time performance, use a third-party library instead. Or do what I typically do: crop out a region of interest using CICrop, and then downsample using CILanczosScaleTransform to yield an image of no more than a few hundred pixels on a side.
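Here’s a minimal sketch of that crop-and-downsample step, written against the CIFilterBuiltins accessors. The function name and the 300-pixel cap are illustrative choices, not code from my app:

import CoreGraphics
import CoreImage.CIFilterBuiltins

// Crop to a region of interest, then Lanczos-downsample so the longer side
// is at most maxSide pixels. CIImage.cropped(to:) wraps CICrop, and
// lanczosScaleTransform() is the builtin accessor for CILanczosScaleTransform.
func croppedAndScaled(_ input: CIImage, roi: CGRect, maxSide: CGFloat = 300) -> CIImage? {
    let cropped = input.cropped(to: roi)
    let scale = min(1, maxSide / max(roi.width, roi.height))

    let filter = CIFilter.lanczosScaleTransform()
    filter.inputImage = cropped
    filter.scale = Float(scale) // uniform scale factor
    filter.aspectRatio = 1      // preserve the aspect ratio
    return filter.outputImage
}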

How you convert the image from color to grayscale is up to you. I use CIColorThresholdOtsu for binarization and then extract data from just the red channel.
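For the binarization step, here’s a sketch using the built-in accessor for CIColorThresholdOtsu (available in iOS 14 and later); the wrapper function is just for illustration:

import CoreImage.CIFilterBuiltins

// Otsu-threshold a CIImage so that every output pixel is pure black or pure white.
func binarized(_ input: CIImage) -> CIImage? {
    let filter = CIFilter.colorThresholdOtsu() // CIColorThresholdOtsu, iOS 14+
    filter.inputImage = input
    return filter.outputImage
}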

The data parameter is the two-dimensional [[UInt8]] array that gets processed. Below, in the sections for Pixels.swift and UIImage+Pixels.swift, I provide sample code to extract grayscale data from a UIImage.

But first here’s Blob.swift.

//
// Blob.swift
// ImageTester 001
//
// Created by GARY BARTOS on 3/29/21.
//

import Foundation
import CoreGraphics

struct Point: Equatable {
    var x: Int
    var y: Int

    var cgPoint: CGPoint {
        CGPoint(x: x, y: y)
    }

    init(_ x: Int, _ y: Int) {
        self.x = x
        self.y = y
    }
}

struct Blob {
    enum Polarity {
        case WhiteOnBlack
        case BlackOnWhite
    }

    private(set) var externalContours: [Int: [Point]] = [:]
    private(set) var internalContours: [Int: [Point]] = [:]

    private(set) var imageWidth: Int
    private(set) var imageHeight: Int

    private(set) var labelCount: Int

    private(set) var labels: [[Int]] = []

    private(set) var polarity: Polarity
    private(set) var threshold: UInt8

    func largestContour() -> (key: Int, value: [Point])? {
        externalContours.max { a, b in a.value.count < b.value.count }
    }

    init(data: [[UInt8]], threshold: UInt8, polarity: Polarity = .BlackOnWhite) {
        self.polarity = polarity
        self.threshold = threshold

        let w = data.count
        let h = data[0].count

        self.imageWidth = w
        self.imageHeight = h

        var c = 1
        var labels = Array(repeating: Array(repeating: 0, count: h), count: w)

        let visitedBackground = -1

        // let isLabeled = { (x: Int, y: Int) -> Bool in
        //     labels[x][y] != 0
        // }

        let insideImage = { (x: Int, y: Int) -> Bool in
            x >= 0 && y >= 0 && x < w && y < h
        }

        let outsideImage = { (x: Int, y: Int) -> Bool in
            x < 0 || y < 0 || x >= w || y >= h
        }

        let isUnlabeled = { (x: Int, y: Int) -> Bool in
            outsideImage(x, y) || labels[x][y] == 0
        }

        let isForeground: (Int, Int) -> Bool
        let isBackground: (Int, Int) -> Bool

        switch polarity {
        case .BlackOnWhite:
            isForeground = { (x: Int, y: Int) -> Bool in
                insideImage(x, y) && data[x][y] <= threshold
            }

            isBackground = { (x: Int, y: Int) -> Bool in
                outsideImage(x, y) || data[x][y] > threshold
            }
        case .WhiteOnBlack:
            isForeground = { (x: Int, y: Int) -> Bool in
                insideImage(x, y) && data[x][y] >= threshold
            }

            isBackground = { (x: Int, y: Int) -> Bool in
                outsideImage(x, y) || data[x][y] < threshold
            }
        }

        /// "P is unlabeled and its left neighbor is a white [background] pixel"
        let isExternalContour = { (x: Int, y: Int) -> Bool in
            isUnlabeled(x, y) && isBackground(x - 1, y)
        }

        /// "the right neighbor of P is an unlabeled white [background] pixel"
        let isInternalContour = { (x: Int, y: Int) -> Bool in
            isUnlabeled(x + 1, y) && isBackground(x + 1, y)
        }

        // directions (variable "d")
        // 5 6 7
        // 4 p 0
        // 3 2 1
        let i = [1, 1, 0, -1, -1, -1, 0, 1] // relative offset for x, clockwise direction d
        let j = [0, 1, 1, 1, 0, -1, -1, -1] // relative offset for y, clockwise direction d

        // Search clockwise from the given direction for the next foreground
        // neighbor of p, marking any background pixels examined along the way
        // so they won't trigger another contour trace later.
        let tracer = { (p: Point, direction: Int) -> (p: Point, d: Int)? in
            for index in 0 ..< 8 {
                let d = (direction + index) % 8
                let x = p.x + i[d]
                let y = p.y + j[d]

                if isForeground(x, y) {
                    return (Point(x, y), d)
                }
                else if insideImage(x, y) {
                    labels[x][y] = visitedBackground
                }
            }
            return nil // isolated pixel: no foreground neighbor found
        }

        //1. start external: start at d = 0
        //2. start internal: start at d = 1
        //3. previous contour point: start at d = (previous+6)%8
        let contourTracing = { (x: Int, y: Int, direction: Int) -> [Point] in
            var points = [Point(x, y)]

            guard var next = tracer(points[0], direction) else {
                return points
            }

            var prev = next

            //execute until BOTH conditions hold (per the paper):
            //1. the tracer has returned to the original point S
            //2. the next contour point is the second point T again
            repeat {
                prev = next
                points.append(prev.p)
                next = tracer(prev.p, (prev.d + 6) % 8)!
            } while !(prev.p == points[0] && next.p == points[1])

            return points
        }

        for y in 0 ..< h {
            for x in 0 ..< w {
                // only process foreground pixels; skip over background pixels
                if isBackground(x, y) {
                    continue
                }

                //case 1: P is unlabeled and its left neighbor is background => P is an external contour point
                //  execute contour tracing to find the contour containing P
                //case 2: right neighbor of P is an unlabeled background pixel => P is an internal contour point
                //  execute contour tracing to find the contour containing P
                //  2a: P is not labeled; the left neighbor of P must have been labeled; assign all contour points the same label as the left neighbor of P
                //  2b: P is already labeled; assign all contour points the same label as P
                //case 3: none of the above; the left neighbor of P must have been labeled already; assign to P the same label as its left neighbor

                if isExternalContour(x, y) {
                    let points = contourTracing(x, y, 0)
                    for point in points {
                        labels[point.x][point.y] = c
                    }
                    externalContours[c] = points

                    c += 1
                }
                else if isInternalContour(x, y) {
                    let label = isUnlabeled(x, y) ? labels[x - 1][y] : labels[x][y]
                    let points = contourTracing(x, y, 1)
                    for point in points {
                        labels[point.x][point.y] = label
                    }
                    // key internal contours by the blob's label so each hole can be
                    // matched with its external contour (keying by c would associate
                    // the hole with the next, not-yet-discovered blob)
                    internalContours[label] = points
                }
                else if x > 0 && isUnlabeled(x, y) {
                    // a plain "else" condition ends up overwriting existing foreground pixels
                    labels[x][y] = labels[x - 1][y]
                }
            }
        }

        self.labelCount = c - 1
        self.labels = labels
    }
}
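If you want to sanity-check Blob without any UIKit plumbing, feed it a hand-built array. Here’s a made-up 10 x 10 test pattern:

// A 6x6 black square with a 2x2 white hole, on a white background.
// data is indexed [x][y], matching what Blob.init expects.
let w = 10, h = 10
var data = [[UInt8]](repeating: [UInt8](repeating: 255, count: h), count: w)

for x in 2...7 {
    for y in 2...7 {
        data[x][y] = 0   // foreground: the black square
    }
}
for x in 4...5 {
    for y in 4...5 {
        data[x][y] = 255 // background again: the hole
    }
}

let blob = Blob(data: data, threshold: 128, polarity: .BlackOnWhite)
print(blob.labelCount)                      // 1: one blob found
print(blob.externalContours[1]?.count ?? 0) // points on the square's outer contour
print(blob.internalContours.count)          // 1: the contour around the hole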

Blob+Image.swift

Once you’ve created an instance of Blob, you’ll want a UIImage showing the biggest blob, or maybe all the blobs.

Call image(sourceOrientation:drawMode:) to get an optional UIImage. Whether it returns nil or an image, you can assign the return value to UIImageView.image.

This extension is for UIKit (for iOS apps); it would need only a few tweaks to run with AppKit (for desktop apps).
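For example, assuming a UIImageView named imageView somewhere in your view controller (hypothetical):

imageView.image = blob.image(sourceOrientation: .up, drawMode: .FillLargestBlob)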

//
// Blob+Image.swift
// ImageTester 001
//
// Created by GARY BARTOS on 3/30/21.
//

import UIKit

extension Blob {
    enum DrawMode: String, CaseIterable {
        case DrawAllContours
        case DrawLargestContour
        case FillAllBlobs
        case FillLargestBlob
    }

    func image(sourceOrientation: UIImage.Orientation, drawMode: DrawMode) -> UIImage? {
        if imageWidth == 0 || imageHeight == 0 {
            return nil
        }

        // create a CGRect representing the full size of our input image
        let imageRect = CGRect(x: 0, y: 0, width: imageWidth, height: imageHeight)

        let format = UIGraphicsImageRendererFormat()
        format.scale = 1
        //format.preferredRange = .extended

        //colors
        let background = UIColor.black
        let bigColor = UIColor.green
        let colors = [UIColor.blue, UIColor.red, UIColor.yellow, UIColor.magenta, UIColor.orange, UIColor.brown, UIColor.cyan, UIColor.purple]

        var dict: [UIColor: [CGRect]] = [:]
        dict[bigColor] = []

        for color in colors {
            dict[color] = []
        }

        let drawRectangles = { (ctx: UIGraphicsImageRendererContext) in
            for item in dict {
                item.key.setFill()
                ctx.cgContext.fill(item.value)
            }
        }

        // renderer
        let renderer = UIGraphicsImageRenderer(size: imageRect.size, format: format)

        let result = renderer.image { ctx in
            background.setFill()
            ctx.cgContext.fill(imageRect)

            if let largest = largestContour() {

                let colorByLabel = { (label: Int) -> UIColor in
                    label == largest.key ? bigColor : colors[label % colors.count]
                }

                switch drawMode {
                case .DrawAllContours:
                    // internal contours are keyed by their blob's label, so they
                    // draw in the same color as their external contour
                    for contour in externalContours {
                        let label = contour.key
                        let color = colorByLabel(label)
                        var rects: [CGRect] = []

                        for p in contour.value {
                            rects.append(CGRect(x: p.x, y: p.y, width: 1, height: 1))
                        }
                        dict[color]!.append(contentsOf: rects)
                    }

                    for contour in internalContours {
                        let label = contour.key
                        let color = colorByLabel(label)
                        var rects: [CGRect] = []

                        for p in contour.value {
                            rects.append(CGRect(x: p.x, y: p.y, width: 1, height: 1))
                        }
                        dict[color]!.append(contentsOf: rects)
                    }
                    drawRectangles(ctx)

                case .DrawLargestContour:
                    let points = largest.value
                    let color = colorByLabel(largest.key)
                    var rects: [CGRect] = []
                    for p in points {
                        rects.append(CGRect(x: p.x, y: p.y, width: 1, height: 1))
                    }
                    dict[color]!.append(contentsOf: rects)
                    drawRectangles(ctx)

                case .FillAllBlobs:
                    for contour in externalContours {
                        let label = contour.key
                        let points = contour.value

                        ctx.cgContext.beginPath()
                        ctx.cgContext.move(to: points[0].cgPoint)

                        for i in 1 ..< points.count {
                            ctx.cgContext.addLine(to: points[i].cgPoint)
                        }

                        ctx.cgContext.closePath()

                        let color = colorByLabel(label)
                        color.setFill()
                        ctx.cgContext.fillPath()
                    }

                    for contour in internalContours {
                        let points = contour.value

                        ctx.cgContext.beginPath()
                        ctx.cgContext.move(to: points[0].cgPoint)

                        for i in 1 ..< points.count {
                            ctx.cgContext.addLine(to: points[i].cgPoint)
                        }

                        ctx.cgContext.closePath()

                        let color = background //make it a hole
                        color.setFill()
                        ctx.cgContext.fillPath()
                    }

                case .FillLargestBlob:
                    let points = largest.value
                    let color = colorByLabel(largest.key)

                    ctx.cgContext.beginPath()
                    ctx.cgContext.move(to: points[0].cgPoint)

                    for i in 1 ..< points.count {
                        ctx.cgContext.addLine(to: points[i].cgPoint)
                    }

                    ctx.cgContext.closePath()

                    color.setFill()
                    ctx.cgContext.fillPath()
                }
            }
        }

        if sourceOrientation == .up {
            return result
        }

        let rotatedImage = UIImage(cgImage: result.cgImage!, scale: 1.0, orientation: sourceOrientation)
        return rotatedImage

        //if needed, see https://stackoverflow.com/questions/8915630/ios-uiimageview-how-to-handle-uiimage-image-orientation
    }
}

Pixels.swift

With apologies to the original authors whose links I can’t find at the moment (and/or the people from whom they inherited code), here’s the Pixels type that I’ve been using for pixel data extracted from a UIImage. This type is returned by the pixels() function defined in the UIImage extension in the next section.

//
// Pixels.swift
// Point 001
//
// Created by GARY BARTOS on 12/20/20.
//

import Foundation
import CoreGraphics
import UIKit

/// TODO initialize with UIImage or CGImage, thus setting imageWidth and imageHeight
struct Pixels {
    var data: [UInt8]

    /// Image width. Typically CGImage.width, avoiding issues with orientation.
    var imageWidth: Int

    /// Image height. Typically CGImage.height, avoiding issues with orientation.
    var imageHeight: Int

    var bytesPerPixel: Int

    var isEmpty: Bool {
        return data.isEmpty
    }

    var byteCount: Int {
        return data.count
    }

    var bytesPerRow: Int {
        return bytesPerPixel * imageWidth
    }

    func cgIndex(x: Int, y: Int) -> Int {
        y * bytesPerRow + x * bytesPerPixel
    }

    func uiIndex(x: Int, y: Int, orientation: UIImage.Orientation) -> Int {
        let size = CGSize(width: imageWidth, height: imageHeight)

        do {
            // cgToUITransform maps CGImage coordinates to UIImage coordinates
            // for a given orientation; the helper is not included in this post
            let xform = try UIImage.cgToUITransform(cgImageSize: size, orientation: orientation)
            let uiToCg = xform.inverted()
            let translated = CGPoint(x: x, y: y).applying(uiToCg)
            return cgIndex(x: Int(translated.x), y: Int(translated.y))
        }
        catch {
            print(error.localizedDescription)
            return 0
        }
    }

    private func pixel(index: Int) -> UIColor {
        return UIColor(
            red: CGFloat(data[index]) / 255.0,
            green: CGFloat(data[index + 1]) / 255.0,
            blue: CGFloat(data[index + 2]) / 255.0,
            alpha: CGFloat(data[index + 3]) / 255.0)
    }

    func cgPixel(x: Int, y: Int) -> UIColor {
        if data.isEmpty {
            //TODO throw exception?
            return UIColor()
        }

        let i = cgIndex(x: x, y: y)
        return pixel(index: i)
    }

    func red(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        let i = cgIndex(x: x, y: y)
        return data[i]
    }

    func green(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        let i = cgIndex(x: x, y: y)
        return data[i + 1]
    }

    func blue(x: Int, y: Int) -> UInt8 {
        if data.isEmpty {
            return 0
        }
        let i = cgIndex(x: x, y: y)
        return data[i + 2]
    }

    /// Returns one channel as a 2D array indexed [x][y].
    private func arrayByOffset(offset: Int) -> [[UInt8]] {
        var arr: [[UInt8]] = []

        for x in 0 ..< imageWidth {
            // walk down column x, picking out one byte per pixel
            let start = x * bytesPerPixel + offset
            let column = stride(from: start, to: byteCount, by: bytesPerRow).map { data[$0] }
            arr.append(column)
        }

        return arr
    }

    // Channel offsets match pixel(index:) and red()/green()/blue() above:
    // red is the first byte of each pixel, then green, then blue.
    func reds() -> [[UInt8]] {
        arrayByOffset(offset: 0)
    }

    func greens() -> [[UInt8]] {
        arrayByOffset(offset: 1)
    }

    func blues() -> [[UInt8]] {
        arrayByOffset(offset: 2)
    }

    func uiPixel(x: Int, y: Int, orientation: UIImage.Orientation) -> UIColor {
        let i = uiIndex(x: x, y: y, orientation: orientation)
        return pixel(index: i)
    }

    private mutating func setPixel(index: Int, color: UIColor) {
        var red: CGFloat = 0
        var green: CGFloat = 0
        var blue: CGFloat = 0
        var alpha: CGFloat = 0

        color.getRed(&red, green: &green, blue: &blue, alpha: &alpha)

        data[index] = UInt8(red * 255.0)
        data[index + 1] = UInt8(green * 255.0)
        data[index + 2] = UInt8(blue * 255.0)
        data[index + 3] = UInt8(alpha * 255.0)
    }

    mutating func cgSetPixel(x: Int, y: Int, color: UIColor) {
        if data.isEmpty {
            //TODO throw exception for x,y out of bounds
            return
        }

        let i = cgIndex(x: x, y: y)
        setPixel(index: i, color: color)
    }

    mutating func uiSetPixel(x: Int, y: Int, color: UIColor, orientation: UIImage.Orientation) {
        if data.isEmpty {
            //TODO throw exception for x,y out of bounds
            return
        }

        let i = uiIndex(x: x, y: y, orientation: orientation)
        setPixel(index: i, color: color)
    }

    static var zero: Pixels {
        Pixels(data: [], imageWidth: 0, imageHeight: 0, bytesPerPixel: 0)
    }
}
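Note that reds() returns a 2D array indexed [x][y], which is exactly the shape Blob.init expects. Once the UIImage extension in the next section is in place, you can write something like this (someUIImage is hypothetical):

let pixels = someUIImage.pixels()   // someUIImage: any UIImage
let reds: [[UInt8]] = pixels.reds() // reds[x][y], ready for Blob(data:threshold:polarity:)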

UIImage+Pixels.swift

The Pixels type is used in this extension of UIImage.

Feel free to find and/or write more efficient code! This got me going. After a few hours spent downloading and testing different ways of extracting pixels from images, this is the code that worked best for me.

//
// UIImage+Pixels.swift
// Point 001
//
// Created by GARY BARTOS on 12/20/20.
//

import UIKit

extension UIImage {
    /// Generates a byte array from the underlying CGImage by drawing it into a
    /// CGContext. Took about 0.08 seconds per image in my tests.
    /// The pixel layout is RGBX: red, green, blue, then a padding byte
    /// (CGImageAlphaInfo.noneSkipLast).
    ///
    /// https://stackoverflow.com/questions/33768066/get-pixel-data-as-array-from-uiimage-cgimage-in-swift
    func pixels() -> Pixels {
        //print("\(self.size.width) x \(self.size.height) UIImage in \(#function)")

        guard let cgImage = self.cgImage else {
            return Pixels.zero
        }

        // use the CGImage dimensions rather than self.size, which can differ
        // depending on orientation and scale
        let width = cgImage.width
        let height = cgImage.height
        let bytesPerPixel = 4
        var pixelData = [UInt8](repeating: 0, count: width * height * bytesPerPixel)
        let colorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: &pixelData,
                                width: width,
                                height: height,
                                bitsPerComponent: 8,
                                bytesPerRow: bytesPerPixel * width,
                                space: colorSpace,
                                bitmapInfo: CGImageAlphaInfo.noneSkipLast.rawValue)

        //print("\(cgImage.width) x \(cgImage.height) CGImage in \(#function)")

        context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))

        return Pixels(data: pixelData, imageWidth: width, imageHeight: height, bytesPerPixel: bytesPerPixel)
    }

    static func fromPixels(pixels: Pixels, orientation: UIImage.Orientation, scale: CGFloat = 1) -> UIImage? {
        let colorSpace = CGColorSpaceCreateDeviceRGB()

        guard let rgbData = CFDataCreate(nil, pixels.data, pixels.byteCount) else {
            return nil
        }

        guard let provider = CGDataProvider(data: rgbData) else {
            return nil
        }

        guard let rgbImageRef = CGImage(
            width: pixels.imageWidth,
            height: pixels.imageHeight,
            bitsPerComponent: 8,
            bitsPerPixel: 8 * pixels.bytesPerPixel,
            bytesPerRow: pixels.bytesPerPixel * pixels.imageWidth,
            space: colorSpace,
            bitmapInfo: CGBitmapInfo(rawValue: 0),
            provider: provider,
            decode: nil,
            shouldInterpolate: false,
            intent: CGColorRenderingIntent.defaultIntent)
        else {
            return nil
        }

        return UIImage(cgImage: rgbImageRef, scale: scale, orientation: orientation)
    }

    /// 1D array to UIImage.
    /// Passing in the array returned by pixels() should reproduce the original UIImage.
    /// Based on https://stackoverflow.com/questions/2261177/cgimage-from-byte-array
    static func from(pixelData: [UInt8], width: Int, height: Int, orientation: UIImage.Orientation, scale: CGFloat = 1, bytesPerPixel: Int = 4) -> UIImage? {
        let pixels = Pixels(data: pixelData, imageWidth: width, imageHeight: height, bytesPerPixel: bytesPerPixel)
        return fromPixels(pixels: pixels, orientation: orientation, scale: scale)
    }
}

Usage

The following code yields a UIImage with labeled blobs from a source CIImage.

func blobImage(ci: CIImage, binarizer: CIFilter, drawMode: Blob.DrawMode) -> UIImage? {
    binarizer.setValue(ci, forKey: kCIInputImageKey)
    let ciFiltered = binarizer.outputImage!

    // scaleDownFilter is a property defined elsewhere in the app, e.g. a
    // CILanczosScaleTransform configured as described earlier
    scaleDownFilter.setValue(ciFiltered, forKey: kCIInputImageKey)
    let ciScaled = scaleDownFilter.outputImage!

    // UIImage.fromCIImage(ci:) is a small helper not shown in this post
    let uiScaled = UIImage.fromCIImage(ci: ciScaled)

    let reds = uiScaled.pixels().reds()
    let blob = Blob(data: reds, threshold: 128, polarity: .WhiteOnBlack)
    let bimg = blob.image(sourceOrientation: uiScaled.imageOrientation, drawMode: drawMode)
    return bimg
}
  1. Initialize an instance of Blob with a 2D array of pixels. Set a threshold (in the UInt8 range from 0 to 255) and whether the image has a white foreground on a black background (.WhiteOnBlack) or a black foreground on a white background (.BlackOnWhite).
  2. Generate a UIImage of a particular orientation, and for a certain draw mode: just the outline of the biggest blob, all blobs filled in, or the like.
let blob = Blob(data: reds, threshold: 128, polarity: .WhiteOnBlack)
let bimg = blob.image(sourceOrientation: uiScaled.imageOrientation, drawMode: drawMode)

That’s a very long post. One of these days I’ll push my code samples off to GitHub; for now I like having it all within the Medium post itself.
