Image handling, i.e. converting from UIImage (on tvOS/iOS; NSImage on OS X) to the DeepLearningKit representation and vice versa, is one of the core pieces of functionality needed. I found a nice iOS (Swift) project on GitHub called ImagePixelFun (by kNeerajPro) that allows setting and getting RGB(A) pixels directly on a UIImage through an extension. All I needed to do was make minor updates to ImagePixelFun for Swift 2.x, after which it worked nicely on both tvOS and iOS, and I then integrated the extension into DeepLearningKit. (Note: for OS X I still haven't solved this, since the NSImage API is slightly different from UIImage.)
1. CIFAR-10 Image Handling (ref. Deep Learning model used in app examples)
CIFAR-10 images are small 32x32 pixel images with 3 one-byte channels (RGB). In the JSON file used by the tvOS and iOS app examples this is stored as a single array of length 3072 (i.e. width*height*#channels = 32x32x3) in the "input" field of the file conv1.json. The Swift code below shows how to convert from the internal DeepLearningKit format (i.e. Caffe converted to JSON with caffemodel2json) to a UIImage. The main method is setPixelColorAtPoint(CGPoint(x: j, y: i), color: UIImage.RawColorType(r, g, b, 255)). Note that the reverse method (also shown below), getPixelColorAtLocation(CGPoint(x: j, y: i)), can be used to get RGB(A) from an existing UIImage (e.g. an image taken with the camera and shown inside the app).
// file: ViewController.swift - in both tvOS and iOS DeepLearningKit app examples
// e.g. https://github.com/DeepLearningKit/DeepLearningKit/blob/master/iOSDeepLearningKitApp/iOSDeepLearningKitApp/iOSDeepLearningKitApp/ViewController.swift
func showCIFARImage(cifarImageData: [Float]) {
    let size = CGSize(width: 32, height: 32)
    let rect = CGRect(origin: CGPoint(x: 0, y: 0), size: size)
    UIGraphicsBeginImageContextWithOptions(size, false, 0)
    UIColor.whiteColor().setFill() // or custom color
    UIRectFill(rect)
    var image = UIGraphicsGetImageFromCurrentImageContext()
    UIGraphicsEndImageContext()

    // CIFAR-10 images are 32x32 in 3 channels - RGB.
    // They are stored as 3 planes of 32x32 = 1024 numbers in cifarImageData, i.e.
    //   red:   numbers from position 0 to 1024 (not inclusive)
    //   green: numbers from position 1024 to 2048 (not inclusive)
    //   blue:  numbers from position 2048 to 3072 (not inclusive)
    for i in 0..<32 {
        for j in 0..<32 {
            let r = UInt8(cifarImageData[i*32 + j])
            let g = UInt8(cifarImageData[32*32 + i*32 + j])
            let b = UInt8(cifarImageData[2*32*32 + i*32 + j])
            // set a pixel - RGBA - in the UIImage
            // (for more about RGBA, see https://en.wikipedia.org/wiki/RGBA_color_space)
            image = image.setPixelColorAtPoint(CGPoint(x: j, y: i), color: UIImage.RawColorType(r, g, b, 255))!
            // the reverse - reading an RGBA pixel from a UIImage - looks like:
            // let color = image.getPixelColorAtLocation(CGPoint(x: j, y: i))
        }
    }
    print(image.size)

    // Display the resulting image.
    let originalImageView: UIImageView = UIImageView(frame: CGRectMake(20, 20, image.size.width, image.size.height))
    originalImageView.image = image
    self.view.addSubview(originalImageView)
}
2. imageToMatrix(image: UIImage)
A contributor kindly added a function that converts a UIImage to a tuple of RGB(A) vectors (I provided a small fix to it); this is another, and likely faster, approach to getting pixels out of a UIImage than the one above. It has been added to the iOS and tvOS example apps (in the ImageUtilityFunctions.swift file of each example).
Code signature:
func imageToMatrix(image: UIImage) -> ([Float], [Float], [Float], [Float])
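The actual implementation lives in ImageUtilityFunctions.swift in the example apps. As a rough sketch of how such a conversion can work (the code below is illustrative, written in Swift 2 style, and is not the exact DeepLearningKit implementation), one can draw the UIImage into an RGBA8 bitmap context and split the raw bytes into four per-channel Float vectors, matching the signature above:

```swift
// Illustrative sketch only - not the exact DeepLearningKit code.
// Renders the UIImage into an RGBA8 bitmap context, then splits the
// interleaved bytes into four Float vectors: (r, g, b, a).
func imageToMatrix(image: UIImage) -> ([Float], [Float], [Float], [Float]) {
    let width = Int(image.size.width)
    let height = Int(image.size.height)
    let bytesPerPixel = 4
    let bytesPerRow = bytesPerPixel * width
    var rawData = [UInt8](count: height * bytesPerRow, repeatedValue: 0)
    let colorSpace = CGColorSpaceCreateDeviceRGB()
    let bitmapInfo = CGImageAlphaInfo.PremultipliedLast.rawValue
    let context = CGBitmapContextCreate(&rawData, width, height, 8,
                                        bytesPerRow, colorSpace, bitmapInfo)
    CGContextDrawImage(context,
                       CGRect(x: 0, y: 0, width: CGFloat(width), height: CGFloat(height)),
                       image.CGImage)

    var r = [Float](), g = [Float](), b = [Float](), a = [Float]()
    r.reserveCapacity(width * height)
    g.reserveCapacity(width * height)
    b.reserveCapacity(width * height)
    a.reserveCapacity(width * height)
    for pixel in 0..<(width * height) {
        let offset = pixel * bytesPerPixel
        r.append(Float(rawData[offset]))
        g.append(Float(rawData[offset + 1]))
        b.append(Float(rawData[offset + 2]))
        a.append(Float(rawData[offset + 3]))
    }
    return (r, g, b, a)
}
```

Reading the whole bitmap in one pass like this avoids the per-pixel method-call overhead of getPixelColorAtLocation, which is why this style of conversion is typically faster.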
Conclusion
The extensions allowing setting and getting RGBA values in a UIImage, as well as the imageToMatrix method, have been added to the DeepLearningKit tvOS and iOS app examples, which should make it fairly easy to do whatever image conversion you want.
Best regards,
Amund Tveit