Cleaning TrainCatalog and RecommenderCatalog #2973

artidoro · 2019-03-15T01:03:25Z

In this PR I clean TrainCatalog (commit 1) and RecommenderCatalog (commit 2).
The other commits are the adjustments elsewhere in the code that are needed for it to compile.

codecov · 2019-03-15T01:39:07Z

Codecov Report

Merging #2973 into master will increase coverage by 0.36%.
The diff coverage is 76.81%.

@@            Coverage Diff             @@
##           master    #2973      +/-   ##
==========================================
+ Coverage   72.35%   72.71%   +0.36%     
==========================================
  Files         803      803              
  Lines      143290   145370    +2080     
  Branches    16155    16766     +611     
==========================================
+ Hits       103673   105710    +2037     
- Misses      35191    35231      +40     
- Partials     4426     4429       +3

Flag	Coverage Δ
#Debug	`72.71% <76.81%> (+0.36%)`	⬆️
#production	`68.43% <71.42%> (+0.36%)`	⬆️
#test	`88.8% <100%> (+0.28%)`	⬆️

Impacted Files	Coverage Δ
...cenariosWithDirectInstantiation/TensorflowTests.cs	`90.81% <100%> (ø)`	⬆️
test/Microsoft.ML.Tests/Scenarios/Api/TestApi.cs	`97.61% <100%> (ø)`	⬆️
test/Microsoft.ML.Functional.Tests/Validation.cs	`100% <100%> (ø)`	⬆️
...ests/TrainerEstimators/MatrixFactorizationTests.cs	`97.17% <100%> (+0.25%)`	⬆️
...s/Api/CookbookSamples/CookbookSamplesDynamicApi.cs	`93.12% <100%> (-0.41%)`	⬇️
...soft.ML.Data/DataLoadSave/DataOperationsCatalog.cs	`71.11% <100%> (-2.13%)`	⬇️
test/Microsoft.ML.Functional.Tests/Evaluation.cs	`100% <100%> (ø)`	⬆️
src/Microsoft.ML.Recommender/RecommenderCatalog.cs	`70.83% <42.85%> (ø)`	⬆️
src/Microsoft.ML.Data/TrainCatalog.cs	`84.11% <72.72%> (ø)`	⬆️
...rosoft.ML.Data/DataLoadSave/CompositeDataLoader.cs	`73.23% <0%> (-16.35%)`	⬇️
... and 23 more

test/Microsoft.ML.Tests/Scenarios/Api/CookbookSamples/CookbookSamplesDynamicApi.cs

abgoswam

src/Microsoft.ML.Data/TrainCatalog.cs

src/Microsoft.ML.Recommender/RecommenderCatalog.cs

wschin · 2019-03-15T23:57:32Z

src/Microsoft.ML.Recommender/RecommenderCatalog.cs

-            IDataView data, IEstimator<ITransformer> estimator, int numFolds = 5, string labelColumn = DefaultColumnNames.Label,
-            string stratificationColumn = null, int? seed = null)
+            IDataView data, IEstimator<ITransformer> estimator, int numberOfFolds = 5, string labelColumnName = DefaultColumnNames.Label,
+            string samplingKeyColumn = null, int? seed = null)


samplingKeyColumn

Would you consider FoldColumnName? #Resolved

artidoro · Mar 18, 2019

Wei-Sheng suggested `partitionColumnName`, which I think reflects the role of the column better.

Ivanidzo4ka · Mar 18, 2019

We discussed the name of this column in #2537 and #2536.
If you want to change it again, I would advise asking for the opinions of @rogancarr and @TomFinley.

TomFinley · Mar 19, 2019

So, in the issues @Ivanidzo4ka linked, I also considered the merits of "partition" in my own thinking, though I did not write it down (as I had internally rejected it). I will now write down why I rejected it here, and you can see if you agree or disagree.

The major objection is that, looking forward, I anticipate a future where this same sort of key is useful in any sampling utilities. (There are more ways to sample a dataset than simple partitioning!) For the sake of that more forward-looking perspective, I wanted it named something else. And we ultimately settled on "sampling key."

So I would prefer if we did not act on this suggestion.

TomFinley · Mar 19, 2019

Incidentally, I wouldn't consider it a disaster if we did this. (Keeping it as stratification would have been a disaster! 😄) It is merely slightly misleading, I think. But of course I also see @wschin's point.

artidoro · Mar 19, 2019

Thank you @Ivanidzo4ka and @TomFinley for your comments on this. I am reverting the commit to leave the name of samplingKeyColumnName as is.
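For readers following the naming discussion above, here is a minimal sketch of what a "sampling key" does in the cross-validation case: rows that share a key value are always placed in the same fold. This is an illustration only and not ML.NET's implementation. The `SamplingKeyFolds` class and `AssignFolds` method are hypothetical names introduced here, and the random per-key assignment is an assumed strategy.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of ML.NET.
public static class SamplingKeyFolds
{
    // Returns one fold index per row. Rows with the same sampling key
    // always receive the same fold index, so they are never split
    // between a training fold and a test fold.
    public static int[] AssignFolds(IReadOnlyList<string> samplingKeys, int numberOfFolds, int seed)
    {
        if (samplingKeys == null)
            throw new ArgumentNullException(nameof(samplingKeys));
        if (numberOfFolds < 2)
            throw new ArgumentOutOfRangeException(nameof(numberOfFolds));

        var rng = new Random(seed);
        var foldOfKey = new Dictionary<string, int>();
        var folds = new int[samplingKeys.Count];

        for (int i = 0; i < samplingKeys.Count; i++)
        {
            // The first time a key is seen, draw a fold for it (assumed strategy);
            // afterwards, reuse the fold already recorded for that key.
            if (!foldOfKey.TryGetValue(samplingKeys[i], out int fold))
            {
                fold = rng.Next(numberOfFolds);
                foldOfKey[samplingKeys[i]] = fold;
            }
            folds[i] = fold;
        }
        return folds;
    }

    public static void Main()
    {
        // Example: user ids as the sampling key, so each user's rows stay together.
        var userIds = new[] { "u1", "u2", "u1", "u3", "u2" };
        int[] folds = AssignFolds(userIds, numberOfFolds: 5, seed: 42);
        for (int i = 0; i < folds.Length; i++)
            Console.WriteLine($"row {i} (key {userIds[i]}) -> fold {folds[i]}");
    }
}
```

The same key-to-group mapping would serve other sampling schemes as well (for example, a train/test split is the two-group case), which is the forward-looking argument made above for "sampling key" over "partition" or "fold".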

wschin

I only have a comment on naming. Please let me know if it doesn't make sense to you.

artidoro added the API (Issues pertaining the friendly API) label Mar 15, 2019

artidoro added this to the 0319 milestone Mar 15, 2019

artidoro self-assigned this Mar 15, 2019

artidoro requested review from Ivanidzo4ka, wschin, abgoswam and sfilipi March 15, 2019 01:03

abgoswam reviewed Mar 15, 2019

test/Microsoft.ML.Tests/Scenarios/Api/CookbookSamples/CookbookSamplesDynamicApi.cs (resolved)

abgoswam approved these changes Mar 15, 2019

abgoswam reviewed Mar 15, 2019

src/Microsoft.ML.Data/TrainCatalog.cs (outdated, resolved)

wschin reviewed Mar 15, 2019

src/Microsoft.ML.Recommender/RecommenderCatalog.cs (outdated, resolved)

wschin reviewed Mar 15, 2019

wschin approved these changes Mar 15, 2019

artidoro added 5 commits March 18, 2019 11:46

scrubbed traincatalog

bcf82d2

modified the rest of the files accordingly

3757b1b

scrubbed recommenderCatalog

2b0be59

modified the rest of the files accordingly

5ef42ab

revew comments

9021a94

artidoro force-pushed the scrubbing branch 2 times, most recently from 822686e to 9021a94 March 19, 2019 06:27

artidoro merged commit 71693b3 into dotnet:master Mar 19, 2019

wschin mentioned this pull request Mar 20, 2019

Polish train catalog (renaming only) #3030

Merged

ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
