-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add doc for MF #1450
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add doc for MF #1450
Conversation
src/Microsoft.ML.Recommender/doc.xml
Outdated
}); | ||
|
||
// Train a matrix factorization model. | ||
var model = pipeline.Fit(dataView); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this in the Samples project, on its own file. See:
and
https://github.com/dotnet/machinelearning/blob/master/src/Microsoft.ML.StandardLearners/Standard/SdcaStatic.cs#L41
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example is removed. I just add a pointer to that test.
In reply to: 229412969 [](ancestors = 229412969)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is not a complete example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't understand. Could you tell me which part is missing?
Following Senja's suggestion, a new example in Sample project is created.
In reply to: 229474507 [](ancestors = 229474507)
src/Microsoft.ML.Recommender/doc.xml
Outdated
Assume that <i>R</i> is a m-by-n matrix and the rank of the two factor matrices are <i>P</i> (m-by-k matrix) and <i>Q</i> (n-by-k), where k is the approximation rank. | ||
The predicted entry value at the u-th row and the v-th column in <i>R</i> would be the inner product of the u-th row of P and the v-th row of Q; that is, | ||
<i>R</i> is approximated by <i>P</i> and <i>Q</i>. This trainer implements a stochastic gradient method for finding <i>P</i> and <i>Q</i> via minimizing | ||
the distance between <i>R</i> and the product of <i>P</i>'s transpose and Q. For users interested in the mathematical details, please see the references below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use
to structure, maybe? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/Microsoft.ML.Recommender/doc.xml
Outdated
<remarks> | ||
The basic idea of matrix factorization is finding two low-rank factor marcies to apporimate the training matrix. In this module, the expected | ||
training data is a list of tuples. Every tuple consists of a column index, a row index, and the value at the location specified by the two indexes. | ||
For an example of the data structure you can use to encode those tuples, please see <a href='https://github.com/dotnet/machinelearning/pull/1407/files#diff-ce59b50bd87003b0ffb26912fc4a0e65R144'>MatrixElement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about how I feel pointing our documentation to the tests. @JRAlexander, what do yo u think? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must have code reference for this sentence and that code need to be tested. I believe a doc has to be self-contained. Because I documented that test as detailed as a doc, I'd like to reuse it here.
In reply to: 229414267 [](ancestors = 229414267)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather have an actual sample that users could use right away. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That test is the minimal end-to-end example user can use. It even includes data preparation.
In reply to: 229453835 [](ancestors = 229453835)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the link, I would just add this from the test: // The following variables defines the shape of a matrix. Its shape is _synthesizedMatrixRowCount-by-_synthesizedMatrixColumnCount.
// The variable _synthesizedMatrixFirstRowIndex indicates the integer that would be mapped to the first row index. If user data uses
// 0-based indices for rows, _synthesizedMatrixFirstRowIndex can be set to 0. Similarly, for 1-based indices, _synthesizedMatrixFirstRowIndex
// could be 1.
#Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is an explanation of what you were trying to show with the test, correct? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about data preparation, not directly related the trainer.
[Update] Yes, you're right.
In reply to: 229475909 [](ancestors = 229475909)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better for the user to include the comment instead of the link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will add a real C# data structure and comments here. In addition, links to code are removed.
[Update] In addition, we now have a reference to file which may be automatically converted to example by Microsoft's doc site.
In reply to: 229483731 [](ancestors = 229483731)
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
// line by line. | ||
namespace Microsoft.ML.Samples.Static | ||
{ | ||
public partial class TrainerSamples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TrainerSamples [](start = 25, length = 14)
in Senja code it's TrainersSamples not TrainerSamples #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Senja says up to me. TrainerSamples (this is also used in Dynamic SDCA'cs) looks better because I don't feel we need two "s" in the name.
In reply to: 229807913 [](ancestors = 229807913,229806788)
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
docs/samples/Microsoft.ML.Samples/Dynamic/MatrixFactorization.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please separate code fixes into another PR #Resolved |
Add doc for matrix factorization trainer so that users can see what it is. Fixes #1401.