Closed
Description
System information
- OS version/distro: Windows 10.0.17134
- .NET Version (e.g., dotnet --info): .NET Core 2.2.202
Issue
- What did you do?
My intention was to implement object detection on video. Please see the attached code sample and explanation below.
- What happened?
I wasn't able to find suitable API methods for configuring the MLContext pipeline.
- What did you expect?
I expected it to be possible to pass an in-memory representation of an image (as a Bitmap object, for example) into the pipeline.
Source code / logs
Let's pretend we have a task to run an object detection model on frames coming from a video stream.
It seems natural to pass the frames one by one as they arrive from the camera.
I can't find any way to do this with the current MLContext pipeline and the API available.
Here's example pipeline from machinelearning-samples :
var pipeline = mlContext.Transforms.LoadImages(outputColumnName: "image", imageFolder: imagesFolder, inputColumnName: nameof(ImageNetData.ImagePath))
.Append(mlContext.Transforms.ResizeImages(outputColumnName: "image", imageWidth: ImageNetSettings.imageWidth, imageHeight: ImageNetSettings.imageHeight, inputColumnName: "image"))
.Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
.Append(mlContext.Transforms.ApplyOnnxModel(modelFile: modelLocation, outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput }, inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));
Having a LoadImages extension method overload that accepts an in-memory image representation (a Bitmap or similar) would definitely solve the problem:
public static ImageLoadingEstimator LoadImages(this TransformsCatalog catalog, params Bitmap[] images);
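To illustrate the request, here is a sketch of how such an overload could be used in a frame-by-frame loop. This is purely hypothetical: the overload above does not exist in ML.NET today, and FrameSource is a placeholder for whatever camera/capture API supplies Bitmap frames; the rest mirrors the sample pipeline above.

```csharp
// HYPOTHETICAL sketch: assumes the proposed LoadImages(params Bitmap[]) overload existed.
// FrameSource is a placeholder for a camera/capture source, not a real ML.NET type.
using System.Drawing;
using Microsoft.ML;

var mlContext = new MLContext();

foreach (Bitmap frame in FrameSource.Frames()) // frames arriving from the camera
{
    // Desired: feed the in-memory frame directly, with no temporary file on disk.
    var pipeline = mlContext.Transforms.LoadImages(frame) // proposed overload
        .Append(mlContext.Transforms.ResizeImages(
            outputColumnName: "image",
            imageWidth: ImageNetSettings.imageWidth,
            imageHeight: ImageNetSettings.imageHeight,
            inputColumnName: "image"))
        .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
        .Append(mlContext.Transforms.ApplyOnnxModel(
            modelFile: modelLocation,
            outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput },
            inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));

    // ... fit/transform and read the detections for this frame ...
}
```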
Am I missing some way to implement this without writing frames to disk? Or does it mean ML.NET wasn't designed for real-time video processing?
Thanks in advance.