Learning with counts (Dracula) transformer #4514

Merged Jun 4, 2020 (27 commits)
Commits
70e3480
count table transform
yaeldMS Nov 7, 2019
26169c8
Dracula with unit tests
yaeldMS Dec 2, 2019
b818101
Fix entry point catalog test
yaeldMS Dec 2, 2019
08a7108
Address code review comments
yaeldMS Dec 3, 2019
2b2428b
create estimator from trained transformer
yaeldMS Dec 10, 2019
0ef935b
switch from three dimensional array of counts to two dimensional arra…
yaeldMS Dec 11, 2019
097f2f1
change mechanism for loading a pre-trained count table
yaeldMS Dec 16, 2019
73fd1ff
Add a sample
yaeldMS Dec 18, 2019
741b5ad
fix entrypoint catalog
yaeldMS Dec 25, 2019
8d00f47
documentation
yaeldMS Dec 27, 2019
c13ff8d
count table transform
yaeldMS Nov 7, 2019
f756a20
Dracula with unit tests
yaeldMS Dec 2, 2019
8880cc5
Address code review comments
yaeldMS Dec 3, 2019
330c6c5
create estimator from trained transformer
yaeldMS Dec 10, 2019
5c33181
change mechanism for loading a pre-trained count table
yaeldMS Dec 16, 2019
c8a9df5
Add a sample
yaeldMS Dec 18, 2019
a93ea18
documentation
yaeldMS Dec 27, 2019
9c921ce
fix unit tests
yaeldMS Dec 28, 2019
d7616a7
Delete unused file
yaeldMS Jan 1, 2020
90803b7
make CountTable* classes internal
yaeldMS Jan 10, 2020
c9cf4ce
Possible solution for adding noise only when training a pipeline
yaeldMS Jan 28, 2020
3d23f80
Fix bug
yaeldMS Jan 29, 2020
96e0041
Make all APIs and classes internal.
yaeldMS Feb 5, 2020
3fc56ec
Exclude dracula sample.
yaeldMS Feb 9, 2020
66f7865
Switch to using HashingTransformer instead of HashJoiningTransform.
yaeldMS May 18, 2020
91887da
Fix EntryPointCatalog test
yaeldMS May 19, 2020
c36e5b3
Address code review comments.
yaeldMS Jun 1, 2020
Add a sample
yaeldMS committed May 18, 2020
commit c8a9df53a81212c980e0a1e21e0ac72bfb9d86f2
6 changes: 5 additions & 1 deletion src/Microsoft.ML.Transforms/Dracula/DraculaTransform.cs
@@ -123,8 +123,12 @@ internal CountTargetEncodingEstimator(IHostEnvironment env, string labelColumnNa
{
Contracts.CheckValue(env, nameof(env));
_host = env.Register(nameof(CountTargetEncodingEstimator));
_host.CheckValue(initialCounts, nameof(initialCounts));
_host.CheckNonEmpty(columns, nameof(columns));
_host.Check(columns.All(c => initialCounts.HashJoin.InputSchema.GetColumnOrNull(c.InputColumnName) != null), nameof(columns));
_host.Check(columns.All(c => initialCounts.HashJoin.OutputSchema.GetColumnOrNull(c.OutputColumnName) != null), nameof(columns));

- _estimator = new CountTableEstimator(_host, labelColumnName, initialCounts.CountTable, columns);
+ _estimator = new CountTableEstimator(_host, labelColumnName, initialCounts.CountTable, columns.Select(c => new InputOutputColumnPair(c.OutputColumnName, c.OutputColumnName)).ToArray());
_hashJoin = initialCounts.HashJoin;
}
