Ignore hidden columns in AutoML schema checks of validation data #4491

Closed
daholste opened this issue Nov 20, 2019 · 0 comments · Fixed by #4490
Labels
AutoML.NET Automating various steps of the machine learning process

daholste commented Nov 20, 2019

When the AutoML API consumes data, it validates schema consistency between the train and validation data.

There are two bugs in this logic:

  1. The API asserts that the total column counts of the train and validation data are equal. This throws an exception when the two data views have the same number of active columns but different numbers of hidden columns. The check should instead assert that the number of active (not hidden) columns in the train and validation data is equal.

  2. If either the train or validation data has a hidden column whose type differs from that of an active column of the same name, an exception is thrown. Type consistency checks should be restricted to active columns.
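
The intended behavior can be sketched roughly as follows. This is a language-neutral Python model, not the ML.NET code: `Column`, `active_columns`, and `validate_schemas` are hypothetical names introduced for illustration, and the type names are plain strings. The check for a missing column name is also an assumption about what "schema consistency" covers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for one column of a data view schema."""
    name: str
    type: str
    is_hidden: bool = False


def active_columns(schema):
    """Return only the active (not hidden) columns of a schema."""
    return [col for col in schema if not col.is_hidden]


def validate_schemas(train_schema, validation_schema):
    """Raise ValueError if the active columns of the two schemas disagree."""
    train_active = active_columns(train_schema)
    valid_active = active_columns(validation_schema)

    # Bug 1: compare the counts of active columns, not of all columns.
    if len(train_active) != len(valid_active):
        raise ValueError(
            f"Active column count mismatch: train has {len(train_active)}, "
            f"validation has {len(valid_active)}"
        )

    # Bug 2: compare types only between active columns of the same name,
    # so a hidden column with a different type is never considered.
    valid_types = {col.name: col.type for col in valid_active}
    for col in train_active:
        if col.name not in valid_types:
            # Assumption: a missing active column counts as an inconsistency.
            raise ValueError(f"Column '{col.name}' missing from validation data")
        if valid_types[col.name] != col.type:
            raise ValueError(
                f"Type mismatch for column '{col.name}': "
                f"train is {col.type}, validation is {valid_types[col.name]}"
            )
```

With this model, a train schema that carries an extra hidden `Features` column of a different type still validates against a validation schema that has only the active `Label` and `Features` columns, which is the case both bugs currently reject.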

@daholste daholste self-assigned this Nov 20, 2019
@daholste daholste added the AutoML.NET Automating various steps of the machine learning process label Nov 20, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 20, 2022