Read and write binary file documentation #2811
Closed
Changes from 1 commit
Commits
14 commits
db837ca  Add sample file to read and write binary files (jwood803)
4338862  Add cookbook section for reading and writing binary files (jwood803)
57859ad  Rename class to match ML.NET names (jwood803)
3d20a99  Update based off PR feedback (jwood803)
d5ca7c4  Update to let code actually compile (jwood803)
e28823e  A cookbook test and slight update to the doc (jwood803)
aa28d83  Merge branch 'master' into binary-documentation (jwood803)
5741c40  Updated for PR feedback (jwood803)
90b19f9  Also update sample file for PR feedback (jwood803)
f1d1bf7  Update cookbook sample code (jwood803)
30c3ae0  Revert submodule maybe? (jwood803)
d107706  Merge branch 'master' into binary-documentation (jwood803)
ed2604e  Fix build error (jwood803)
7d41674  Merge branch 'master' into binary-documentation (jwood803)
Update based off PR feedback
commit 3d20a995f4f53db82be05b33da566ab596c61ecf
@@ -1025,33 +1025,36 @@ using (var fs = File.OpenRead(modelPath))
 ```

 ## How can I read and write binary data?
-Other than using text files ML.NET will allow you to read and write binary data.
+In addition to text files, ML.NET can read and write binary data. Binary files have a few advantages: you do not have to specify a schema, they can be faster to read, and they are generally smaller than text files.

 To write binary data, you first need some data to save. Specifically, you need an instance of an `IDataView`. Below is a code snippet that uses the iris data as an example.

 ```csharp
 // Data model for the iris data
 public class IrisData
 {
-    public float Label;
-    public float SepalLength;
-    public float SepalWidth;
-    public float PetalLength;
-    public float PetalWidth;
+    public float Label { get; set; }
+    public float SepalLength { get; set; }
+    public float SepalWidth { get; set; }
+    public float PetalLength { get; set; }
+    public float PetalWidth { get; set; }
 }

 // An array of iris data points
-var dataArray = new[] {
-    new IrisData{Label=1, PetalLength=1, SepalLength=1, PetalWidth=1, SepalWidth=1},
-    new IrisData{Label=0, PetalLength=2, SepalLength=2, PetalWidth=2, SepalWidth=2}
+var dataArray = new[]
+{
+    new IrisData { Label=1, PetalLength=1, SepalLength=1, PetalWidth=1, SepalWidth=1 },
+    new IrisData { Label=0, PetalLength=2, SepalLength=2, PetalWidth=2, SepalWidth=2 }
 };

 // Create the ML.NET context.
 var context = new MLContext();

 // Create the data view.
-// This method will use the definition of IrisData to understand what columns there are in the
-// data view.
+// This method will use the definition of IrisData to understand what columns there are
+// in the data view. However, the objects in ML.NET are only "promises" of data since
+// ML.NET operations are lazy. One way to get a look at the data is with Schema Comprehension.
+// Refer to this document for more information - https://github.com/dotnet/machinelearning/blob/master/docs/code/SchemaComprehension.md
 var data = context.CreateDataView(dataArray);

Use the language of Schema Comprehension here, as done above in "How do I look at the intermediate data?". Also can link to this doc: https://github.com/dotnet/machinelearning/blob/master/docs/code/SchemaComprehension.md #Resolved


 // Use a FileStream to create a file. Use the stream and the data view in the "SaveAsBinary" method.
@@ -1061,7 +1064,7 @@ using(var stream = new FileStream("./iris.idv", FileMode.Create))
 }
 ```

-To read a binary file, simply use the `context.Data.ReadFromBinary` method and pass in the path of the binary file to read in.
+To read a binary file, simply use the `context.Data.ReadFromBinary` method and pass in the path of the binary file to read in. Notice that the schema of the data does not need to be defined here.

 ```csharp
 var data = context.Data.ReadFromBinary("./iris.idv");

Cookbook examples all have corresponding tests in `Tests/Scenarios/Api/CookbookSamples/`. Could you please add one for this too? #Resolved

I must be missing something when I try to add the example. It's not finding the `CreateDataView` method on the ML context. It may be missing a reference for it, but I'm not sure which one it needs. #Resolved

@rogancarr I switched to `LoadFromEnumerable`, which is what the sample is already using. Hopefully the test looks good now. #Resolved
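
The round trip discussed in this thread can be sketched end to end as follows. This is a minimal sketch, assuming the ML.NET 1.x method names `LoadFromEnumerable`, `SaveAsBinary`, `LoadFromBinary`, and `CreateEnumerable` on `context.Data` (the diff above predates that naming and uses `CreateDataView` and `ReadFromBinary`). It also assumes the `Microsoft.ML` NuGet package is referenced. The `BinaryRoundTrip` class and its `Run` method are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;

// Data model for the iris data (same shape as in the diff above).
public class IrisData
{
    public float Label { get; set; }
    public float SepalLength { get; set; }
    public float SepalWidth { get; set; }
    public float PetalLength { get; set; }
    public float PetalWidth { get; set; }
}

// Hypothetical helper class, not part of ML.NET.
public static class BinaryRoundTrip
{
    // Writes two rows to a binary .idv file at `path`, then reads them back.
    public static List<IrisData> Run(string path)
    {
        var context = new MLContext();

        var dataArray = new[]
        {
            new IrisData { Label = 1, PetalLength = 1, SepalLength = 1, PetalWidth = 1, SepalWidth = 1 },
            new IrisData { Label = 0, PetalLength = 2, SepalLength = 2, PetalWidth = 2, SepalWidth = 2 }
        };

        // Assumption: ML.NET 1.x API. The columns are inferred from IrisData.
        IDataView data = context.Data.LoadFromEnumerable(dataArray);

        // Save the data view through a stream; the using block closes the file.
        using (var stream = new FileStream(path, FileMode.Create))
        {
            context.Data.SaveAsBinary(data, stream);
        }

        // No schema is supplied here: it is read from the binary file itself.
        IDataView loaded = context.Data.LoadFromBinary(path);

        // Print the column names and types recovered from the file.
        foreach (var column in loaded.Schema)
        {
            Console.WriteLine($"{column.Name}: {column.Type}");
        }

        // Materialize the lazy data view back into objects.
        return context.Data
            .CreateEnumerable<IrisData>(loaded, reuseRowObject: false)
            .ToList();
    }

    public static void Main()
    {
        var rows = Run("./iris.idv");
        Console.WriteLine($"Rows read back: {rows.Count}");
    }
}
```

Returning a materialized list rather than the `IDataView` keeps the sketch easy to assert on in a test, at the cost of loading every row into memory.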