Description
While working on my object detection applications, I found something that could be improved.
I discovered that Microsoft.ML.ImageAnalytics uses the GetPixel and SetPixel functions to access the images' bitmap data.
These functions are notoriously slow compared to raw access to the image's data buffer (up to 10 times slower).
This becomes very evident in object recognition, where a huge number of images and frames must be processed.
My proposal is to implement raw access to speed up all of these operations.
I will submit a PR with the needed changes as soon as possible.
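To illustrate the idea, here is a minimal sketch of what raw buffer access could look like with System.Drawing; the method name and details are mine for illustration only, not the actual ML.NET change:

// Minimal sketch (not the actual ML.NET implementation): read the pixel
// values through Bitmap.LockBits with one bulk copy, instead of calling
// GetPixel once per pixel.
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static byte[] ExtractPixelsRaw(Bitmap bitmap)
{
    var rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
    // Lock the bitmap's backing buffer for read access in a known pixel format.
    BitmapData data = bitmap.LockBits(rect, ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
    try
    {
        int bytes = Math.Abs(data.Stride) * data.Height;
        var buffer = new byte[bytes];
        // Single copy of the whole buffer; GetPixel would cost one call per pixel.
        Marshal.Copy(data.Scan0, buffer, 0, bytes);
        return buffer;
    }
    finally
    {
        bitmap.UnlockBits(data);
    }
}

Note that the row stride can include padding, so a consumer that needs tightly packed pixels still walks the buffer row by row, but that walk is a plain memory read rather than a method call per pixel.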
I prepared a test to check whether it really speeds up the process, and it does.
Here is the code of the test:
[TensorFlowFact]
public void TensorFlowTransformObjectDetectionLoopTest()
{
    // Saved model
    var modelLocation = @"D:\ObjectDetection\carp\TensorFlow\exported-model-SSD-MobileNET-v2-320x320\saved_model";
    // Create the estimators pipe
    var pipe =
        _mlContext.Transforms.LoadImages(
            inputColumnName: "ImagePath",
            outputColumnName: "Image",
            imageFolder: "")
        .Append(_mlContext.Transforms.ResizeImages(
            inputColumnName: "Image",
            outputColumnName: "ResizedImage",
            imageWidth: 300,
            imageHeight: 300,
            resizing: ImageResizingEstimator.ResizingKind.Fill))
        .Append(_mlContext.Transforms.ExtractPixels(
            inputColumnName: "ResizedImage",
            outputColumnName: "serving_default_input_tensor:0",
            interleavePixelColors: true,
            outputAsFloatArray: false))
        .Append(_mlContext.Model.LoadTensorFlowModel(modelLocation).ScoreTensorFlowModel(
            inputColumnNames: new[] { "serving_default_input_tensor:0" },
            outputColumnNames: new[]
            {
                "StatefulPartitionedCall:1" /* detection_boxes */,
                "StatefulPartitionedCall:2" /* detection_classes */,
                "StatefulPartitionedCall:4" /* detection_scores */
            }));
    // Collect all the paths of the images in the test directory
    var imagesLocation = @"D:\ObjectDetection\carp\TensorFlow\images\test";
    var images =
        Directory.GetFiles(imagesLocation)
        .Where(file => new[] { ".jpg", ".jfif" }
            .Any(ext => Path.GetExtension(file).ToLower() == ext))
        .Select(file => new { ImagePath = file })
        .ToArray();
    // Create the transformer
    var data = _mlContext.Data.LoadFromEnumerable(images.Take(0));
    var model = pipe.Fit(data);
    // Run the inference 1000 times, cycling over the collected images
    for (int i = 0, nImage = 0; i < 1000; i++, nImage = (nImage + 1) % images.Length)
        model.Transform(_mlContext.Data.LoadFromEnumerable(new[] { images[nImage] })).Preview();
}
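To compare the two builds, the loop can also be timed explicitly, for example with a Stopwatch; this is a sketch of one way to measure, not part of the test above:

// Sketch: time the same 1000-inference loop to compare the two builds.
var watch = System.Diagnostics.Stopwatch.StartNew();
for (int i = 0, nImage = 0; i < 1000; i++, nImage = (nImage + 1) % images.Length)
    model.Transform(_mlContext.Data.LoadFromEnumerable(new[] { images[nImage] })).Preview();
watch.Stop();
Console.WriteLine($"1000 inferences: {watch.ElapsedMilliseconds} ms");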
Benchmark results: without optimization (current) vs. with the raw access optimization.
So the speed-up ratio on a set of small images is about 182%; on larger images it could be even greater.