[Mn] Roadmap update #4704

Merged
merged 12 commits into from
Jan 28, 2020
4 changes: 4 additions & 0 deletions README.md
@@ -86,6 +86,10 @@ To build ML.NET from source please visit our [developers guide](docs/project-doc
|**Windows x86**|[![Build Status](https://dev.azure.com/dnceng/public/_apis/build/status/dotnet/machinelearning/MachineLearning-CI?branchName=master&jobName=Windows_x86_NetCoreApp21&configuration=Windows_x86_NetCoreApp21%20Debug_Build)](https://dev.azure.com/dnceng/public/_build/latest?definitionId=104&branchName=master)|[![Build Status](https://dev.azure.com/dnceng/public/_apis/build/status/dotnet/machinelearning/MachineLearning-CI?branchName=master&jobName=Windows_x86_NetCoreApp21&configuration=Windows_x86_NetCoreApp21%20Release_Build)](https://dev.azure.com/dnceng/public/_build/latest?definitionId=104&branchName=master)|
|**Windows NetCore3.0**|[![Build Status](https://dev.azure.com/dnceng/public/_apis/build/status/dotnet/machinelearning/MachineLearning-CI?branchName=master&jobName=Windows_x64_NetCoreApp30&configuration=Windows_x64_NetCoreApp30%20Debug_Build)](https://dev.azure.com/dnceng/public/_build/latest?definitionId=104&branchName=master)|[![Build Status](https://dev.azure.com/dnceng/public/_apis/build/status/dotnet/machinelearning/MachineLearning-CI?branchName=master&jobName=Windows_x64_NetCoreApp30&configuration=Windows_x64_NetCoreApp30%20Release_Build)](https://dev.azure.com/dnceng/public/_build/latest?definitionId=104&branchName=master)|

## Release process and versioning

Check out the [release process documentation](docs/project-docs/release-process.md) to understand the different kinds of ML.NET releases.

## Contributing

We welcome contributions! Please review our [contribution guide](CONTRIBUTING.md).
70 changes: 12 additions & 58 deletions ROADMAP.md
@@ -1,67 +1,21 @@
# The ML.NET Roadmap

The goal of ML.NET project is to provide an easy to use, .NET-friendly ML platform. This document describes the tentative plan for the project in the short and long-term.
The goal of the ML.NET project is to make .NET developers great at machine learning. This document describes the plan for the project.

ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo. It's always a good idea to have a discussion before embarking on a large code change to make sure there is not duplicated effort.
Many of the features listed on the roadmap already exist in the internal version of the code-base. They are marked with (*). We plan to release more and more internal features to Github over time.
ML.NET is a community effort and we welcome community feedback on our plans. The best way to give feedback is to open an issue in this repo.

In the meanwhile, we are looking for contributions. An easy place to start is to look at _up-for-grabs_ issues on [Github](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs)
We also invite contributions. The [up-for-grabs issues](https://github.com/dotnet/machinelearning/issues?q=is%3Aopen+is%3Aissue+label%3Aup-for-grabs) on GitHub are a good place to start.

## Short Term
### Training Improvements
* Deep Learning Training Support
* Integrate with leading DNN package(s)
* Support for transfer learning.
* Hybrid training of pipelines containing both DNN and non-DNN predictors.
* Fast.ai like APIs.
## Goals through June 30, 2020
### Test stability
Continuous integration builds currently have a 30% pass rate. We aim to get this pass rate up to at least 80%.

### Trained Model Management
* Export models to [ONNX](https://github.com/onnx/models) (*)
### Streaming metrics
Currently, the way ML.NET computes [metrics](https://docs.microsoft.com/dotnet/machine-learning/resources/metrics) is memory-intensive. We will compute metrics in a streaming fashion instead, thereby reducing memory consumption.
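
As a rough illustration of what computing metrics "in a streaming fashion" can mean, the Python sketch below keeps only running sums while iterating over (label, score) pairs, so memory use stays constant regardless of dataset size. It is not ML.NET code: the class name `StreamingRegressionMetrics` and the function `evaluate` are hypothetical, and regression error metrics were chosen only because they are simple to accumulate.

```python
from typing import Iterable, Tuple


class StreamingRegressionMetrics:
    """Accumulates error sums one row at a time (hypothetical helper, not an ML.NET type)."""

    def __init__(self) -> None:
        self.count = 0
        self.abs_error_sum = 0.0
        self.sq_error_sum = 0.0

    def update(self, label: float, score: float) -> None:
        # Only three running values are kept; the row itself is discarded.
        error = label - score
        self.count += 1
        self.abs_error_sum += abs(error)
        self.sq_error_sum += error * error

    @property
    def mean_absolute_error(self) -> float:
        if self.count == 0:
            raise ValueError("no rows were seen")
        return self.abs_error_sum / self.count

    @property
    def mean_squared_error(self) -> float:
        if self.count == 0:
            raise ValueError("no rows were seen")
        return self.sq_error_sum / self.count


def evaluate(rows: Iterable[Tuple[float, float]]) -> StreamingRegressionMetrics:
    """Consume (label, score) pairs lazily, without materializing them in a list."""
    metrics = StreamingRegressionMetrics()
    for label, score in rows:
        metrics.update(label, score)
    return metrics
```

Because `evaluate` accepts any iterable, the rows can come from a generator that reads a file or a data view lazily.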

## Longer Term
### Multivariate anomaly detection
ML.NET already supports [univariate anomaly detection](https://docs.microsoft.com/dotnet/api/microsoft.ml.timeseriescatalog.detectanomalybysrcnn?view=ml-dotnet), but we will add the ability to detect anomalies in multiple variables over time.

### Training Improvements
* Add more learners, perhaps, including: (*)
* [ProtoNN and Bonsaii](https://www.microsoft.com/en-us/research/project/resource-efficient-ml-for-the-edge-and-endpoint-iot-devices/) for compact and efficient models.
* Integration with other ML packages
* Accord.NET
* etc.
* Additional ML tasks (*)
* _Sequence Classification_ - learns from a series of examples in a sequence, and each item is assigned a distinct label, akin to a multiclass classification task
* Additional Data source support
* Data from SQL Databases, such as SQL Server
* Data located on the cloud
* Apache Parquet
* Native Binary high-performance format
* Distributed Training
* Easily train models on the cloud
* Whole-pipeline optimizations for both training and inference
* Automation of more data science tasks
* Additional Trainers
* Additional tasks

### Featurization Improvements
* Improved data wrangling support
* Add auto-suggestion of training pipelines. The technology will provide intelligent ```LearningPipeline``` suggestions based on training data attributes (*)
* Additional natural language text preprocessing
* Time series and forecasting
* Support for Video, audio, and other data types

### Trained Model Management
* Model operationalization in the Cloud
* Model deployment on mobile platforms
* Ability to run [ONNX](https://github.com/onnx/models) models in the ```LearningPipeline```
* Support for the next version of ONNX
* Model deployment to IOT devices

### GUI Improvements
* Usability improvements
* Support of additional ML.NET features
* Improved code generation for training and inference
* Run the pipelines rather than just suggesting them; present to the user the pipelines and the metrics generated from running.
* Distributed runs, rather than sequential.

### Other
* Support for additional languages
* Published reproducible benchmarks against industry-leading ML toolkits on a variety of tasks and datasets
### ONNX Runtime exportability

We will expand the number of ML.NET transforms and estimators that are exportable to the [ONNX Runtime](https://github.com/Microsoft/onnxruntime).
Binary file added docs/images/include-prerelease.png
37 changes: 37 additions & 0 deletions docs/project-docs/release-process.md
@@ -0,0 +1,37 @@
ML.NET Release Process
======================

This document describes the different kinds of ML.NET releases, how those releases are versioned, and how they are built.

Types of releases
--------------------

ML.NET NuGets (of which there are approximately 25) are versioned with the following format: `A.B.C<-D>`, where `A`, `B`, and `C` are integers, and `D` is an optional string.

- `A` - **version number**: If `A` is 0, this NuGet is considered a **work in progress (WIP)**, and could be deleted at any time. If `A` is greater than 0, then we plan to support the corresponding NuGet indefinitely.
- `B` - **sub-version number**: This number is consistent within each GA release and within each WIP release. Therefore, all GA NuGets that are released at the same time will have the same sub-version number, and all WIP releases that are released at the same time will have the same sub-version number, but a WIP release and a GA release that are released at the same time may have different sub-version numbers.
- `C` - **patch index**: `C` starts at 0 and is incremented every time we introduce a bug fix between releases.
- `D` - **preview suffix**: `D` is an optional suffix which contains the word "preview" followed by an integer or by a datetime string and an integer. If `D` is not included and `A` is not 0, then the API surface is locked and will not change in future releases.
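
The rules above can be sketched in a few lines of Python. This is only an illustration of the `A.B.C<-D>` format as described here, not tooling that ML.NET ships; the names `NuGetVersion` and `parse_version` are hypothetical, and the sketch does not validate the internal structure of `D`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A.B.C with an optional "-D" suffix, following the format described above.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


@dataclass(frozen=True)
class NuGetVersion:
    major: int              # A: version number
    minor: int              # B: sub-version number
    patch: int              # C: patch index
    suffix: Optional[str]   # D: optional preview suffix

    @property
    def is_wip(self) -> bool:
        # A == 0 marks a work-in-progress NuGet.
        return self.major == 0

    @property
    def is_api_locked(self) -> bool:
        # No suffix and A != 0 means the API surface is locked.
        return self.major != 0 and self.suffix is None


def parse_version(text: str) -> NuGetVersion:
    """Split a version string into its A, B, C and optional D parts."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not in A.B.C<-D> form: {text!r}")
    a, b, c, d = match.groups()
    return NuGetVersion(int(a), int(b), int(c), d)
```

For example, `parse_version("1.5.0-preview-28327-2")` yields a suffix of `preview-28327-2`, so the package is neither WIP nor API-locked.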

ML.NET has four kinds of releases: daily builds, previews, periodic general availability (GA) releases, and fixes. We detail each kind of release below.

1. **Daily builds:** These can be downloaded from [this NuGet feed](https://dev.azure.com/dnceng/public/_packaging?_a=feed&feed=MachineLearning), and are built automatically each time a commit is made to the `master` branch.
1. **Preview:** These releases are built from the corresponding `A.B-preview-X` GitHub branch, and are expected to meet a higher quality bar than the daily builds. These can also be downloaded from [this NuGet feed](https://dev.azure.com/dnceng/public/_packaging?_a=feed&feed=MachineLearning), or within Visual Studio, as detailed below. When we introduce new APIs in a preview release, we avoid doing a GA release at the same time (unless there are patches required for the last GA release). If there are no new APIs, then we go straight to a GA release and skip the preview release.
1. **GA:** These releases are built from the corresponding `A.B` GitHub branch. They are rigorously tested, stable, and meant for general use. They are also the default choice when installing ML.NET via the `Install-Package Microsoft.ML` command.
1. **Fix:** These releases include patches for bugs in either the preview or GA releases.

Versioning for releases
--------------------

The table below explains how each of the elements in our versioning schema would change, relative to the previous release, for each kind of release.

| Release type | Change in `A` | Change in `B` | Change in `C` | Change in `D` |
| -------------|-------------|-------------|-------------|-------------|
| Daily build | No change | No change | No change | A date-like stamp is added, e.g. `Microsoft.Extensions.ML 1.5.0-preview-28327-2` |
| Preview | No change | No change | No change | `preview` tag added to the most recent GA release, if this is the first preview, or preview index is incremented (e.g. `A.B.C-preview` -> `A.B.C-preview2`) |
| GA | Incremented for major releases, only for non-WIP NuGets. WIP NuGets maintain an `A` value of `0` | Incremented, or reset to 0 if `A` was incremented | Reset to 0 | `preview` tag is removed |
| Fix | No change | No change | Incremented | No change |
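
To make the table concrete, here is a minimal Python sketch that applies each row to a version tuple `(A, B, C, D)`. The function `next_version` and its parameters are hypothetical. The daily-build suffix is assumed to take the form `preview-<stamp>`, based only on the single example in the table, and the stamp is passed in rather than generated because its exact format is not specified here.

```python
import re
from typing import Optional, Tuple

Version = Tuple[int, int, int, Optional[str]]  # (A, B, C, D)


def next_version(v: Version, release: str, major: bool = False,
                 stamp: Optional[str] = None) -> Version:
    """Return the version that follows `v` for the given release type."""
    a, b, c, d = v
    if release == "daily":
        # Assumption: the date-like stamp is appended after "preview-".
        if stamp is None:
            raise ValueError("a daily build needs a stamp")
        return (a, b, c, f"preview-{stamp}")
    if release == "preview":
        if d is None:
            # First preview: add the tag to the most recent GA version.
            return (a, b, c, "preview")
        match = re.fullmatch(r"preview(\d*)", d)
        if match is None:
            raise ValueError(f"unexpected suffix: {d!r}")
        # A bare "preview" counts as index 1, so the next one is "preview2".
        index = int(match.group(1)) if match.group(1) else 1
        return (a, b, c, f"preview{index + 1}")
    if release == "ga":
        if major and a > 0:
            # Major release of a non-WIP NuGet: bump A, reset B and C.
            return (a + 1, 0, 0, None)
        # Otherwise bump B; WIP NuGets keep A at 0.
        return (a, b + 1, 0, None)
    if release == "fix":
        return (a, b, c + 1, d)
    raise ValueError(f"unknown release type: {release!r}")
```

Under these assumptions, `next_version((1, 4, 0, "preview"), "preview")` gives `(1, 4, 0, "preview2")`, and a following `"ga"` release gives `(1, 5, 0, None)`.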

> Note: to install the preview packages via the NuGet Package Manager in Visual Studio, you must make sure to check the "Include prerelease" checkbox:

![include-prerelease](../images/include-prerelease.png)