
Speed-up bitmap operations on images. #5856

Closed
@darth-vader-lg

Description


I found something that could be improved while I was working on my object detection applications.
I discovered that Microsoft.ML.ImageAnalytics uses the GetPixel and SetPixel functions to access the images' bitmap data.
It is well known that these functions are very slow compared to raw access to the image's data buffer (up to 10 times slower).
This becomes very noticeable in object recognition, where a huge number of images and frames must be processed.

My proposal is to implement raw access to speed up all the operations.
I'm going to submit a PR with the needed changes as soon as possible.
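
For reference, below is a minimal sketch of the kind of raw access I mean: a GetPixel loop compared with reading the locked bitmap buffer directly through Bitmap.LockBits. The bitmap size, the 24bpp BGR layout, and the checksum are just illustrative choices for the benchmark, not the actual ML.NET code:

using System;
using System.Diagnostics;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static class PixelAccessBenchmark
{
    public static void Main()
    {
        using var bmp = new Bitmap(300, 300, PixelFormat.Format24bppRgb);

        // Slow path: one managed GetPixel call per pixel.
        var sw = Stopwatch.StartNew();
        long sum = 0;
        for (int y = 0; y < bmp.Height; y++)
        {
            for (int x = 0; x < bmp.Width; x++)
            {
                var c = bmp.GetPixel(x, y);
                sum += c.R + c.G + c.B;
            }
        }
        Console.WriteLine($"GetPixel: {sw.ElapsedMilliseconds} ms (sum={sum})");

        // Fast path: lock the buffer once and read the raw bytes row by row.
        sw.Restart();
        sum = 0;
        var data = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
                                ImageLockMode.ReadOnly, PixelFormat.Format24bppRgb);
        try
        {
            var row = new byte[data.Stride];
            for (int y = 0; y < data.Height; y++)
            {
                Marshal.Copy(data.Scan0 + y * data.Stride, row, 0, data.Stride);
                for (int x = 0; x < data.Width; x++)
                {
                    // 24bpp rows are laid out as B, G, R.
                    sum += row[x * 3] + row[x * 3 + 1] + row[x * 3 + 2];
                }
            }
        }
        finally
        {
            bmp.UnlockBits(data);
        }
        Console.WriteLine($"LockBits: {sw.ElapsedMilliseconds} ms (sum={sum})");
    }
}

The same idea applies to writing pixels: filling the locked buffer directly instead of calling SetPixel per pixel.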

I prepared a test to check whether it really speeds up the process, and it does.

The code for my test:

[TensorFlowFact]
public void TensorFlowTransformObjectDetectionLoopTest()
{
    // Saved model
    var modelLocation = @"D:\ObjectDetection\carp\TensorFlow\exported-model-SSD-MobileNET-v2-320x320\saved_model";
    // Create the estimators pipe
    var pipe = 
        _mlContext.Transforms.LoadImages(
            inputColumnName: "ImagePath",
            outputColumnName: "Image",
            imageFolder: "")
        .Append(_mlContext.Transforms.ResizeImages(
            inputColumnName: "Image",
            outputColumnName: "ResizedImage",
            imageWidth: 300,
            imageHeight: 300,
            resizing: ImageResizingEstimator.ResizingKind.Fill))
        .Append(_mlContext.Transforms.ExtractPixels(
            inputColumnName: "ResizedImage",
            outputColumnName: "serving_default_input_tensor:0",
            interleavePixelColors: true,
            outputAsFloatArray: false))
        .Append(_mlContext.Model.LoadTensorFlowModel(modelLocation).ScoreTensorFlowModel(
            inputColumnNames: new[] { "serving_default_input_tensor:0" },
            outputColumnNames: new[]
            {
                "StatefulPartitionedCall:1" /* detection_boxes */,
                "StatefulPartitionedCall:2" /* detection_classes */,
                "StatefulPartitionedCall:4" /* detection_scores */
            }));

    // Collect the paths of all the images in the test directory
    var imagesLocation = @"D:\ObjectDetection\carp\TensorFlow\images\test";
    var images =
        Directory.GetFiles(imagesLocation).Where(file => new[] { ".jpg", ".jfif" }
        .Any(ext => Path.GetExtension(file).ToLower() == ext))
        .Select(file => new { ImagePath = file })
        .ToArray();

    // Fit the pipeline on an empty data view just to create the transformer
    var data = _mlContext.Data.LoadFromEnumerable(images.Take(0));
    var model = pipe.Fit(data);

    // Run inference 1000 times, cycling through the collected images
    for (int i = 0, nImage = 0; i < 1000; i++, nImage = (nImage + 1) % images.Length)
        model.Transform(_mlContext.Data.LoadFromEnumerable(new[] { images[nImage] })).Preview();
}

Without optimization (current): [screenshot: WithoutOptimization]

With raw access optimization: [screenshot: WithRawAccessOptimization]

So the speed-up ratio on a set of small images is about 182%.
On larger images it could be even greater.
