Exception is thrown if NDCG > 10 is used with LightGbm for evaluating ranking #3993

Closed
nicolehaugen opened this issue Jul 12, 2019 · 2 comments · Fixed by #4081

@nicolehaugen

  • Version: ML.NET 1.2.0

The current code in the RankingEvaluator.cs file has the MaxTruncationLevel for NDCG (Normalized Discounted Cumulative Gain) set to 10, and it throws an exception if the DCG truncation level is set to a value greater than 10. This is a blocking issue for ranking because it makes it impossible to measure ranking quality for result sets larger than 10. For example, if you rank a group of 100 results with a MaxTruncationLevel of 10, you can only measure whether the first 10 results were ranked correctly.

Here's the code:

        public RankingEvaluator(IHostEnvironment env, Arguments args)
            : base(env, LoadName)
        {
            // REVIEW: What kind of checking should be applied to labelGains?
            if (args.DcgTruncationLevel <= 0 || args.DcgTruncationLevel > Aggregator.Counters.MaxTruncationLevel)
                throw Host.ExceptUserArg(nameof(args.DcgTruncationLevel), "DCG Truncation Level must be between 1 and {0}", Aggregator.Counters.MaxTruncationLevel);
            Host.CheckUserArg(args.LabelGains != null, nameof(args.LabelGains), "Label gains cannot be null");
            ...
        }

The // REVIEW comment in the above code suggests that this validation logic was never finalized.

While I'm unsure what the MaxTruncationLevel value should be, I have seen a ranking contest on Kaggle.com that measured NDCG with a truncation level of up to 38.

I also noticed that another part of this file allows any truncation level from 1 to 99:

            public Transform(IHostEnvironment env, IDataView input, string labelCol, string scoreCol, string groupCol,
                int truncationLevel, Double[] labelGains)
                : base(env, input, labelCol, scoreCol, groupCol, RegistrationName)
            {
                Host.CheckParam(0 < truncationLevel && truncationLevel < 100, nameof(truncationLevel),
                    "Truncation level must be between 1 and 99");
                ...
            }

Also, refer to the related issue #2728, "Ranker Evaluate doesn't allow you specify metric parameters."
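
To illustrate why a cap of 10 hides ranking errors in larger result sets, here is a minimal standalone sketch of NDCG at several truncation levels. It does not use ML.NET. The class NdcgSketch, its methods, and the sample labels are hypothetical, and the gain 2^label - 1 with a log2 position discount is one common formulation that I am assuming here; the evaluator's configurable LabelGains may produce different numbers.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of ML.NET.
public static class NdcgSketch
{
    // DCG@k over relevance labels listed in the order the model ranked them.
    // Assumed formulation: gain = 2^label - 1, discount = log2(position + 1).
    public static double Dcg(int[] rankedLabels, int k)
    {
        double dcg = 0.0;
        int limit = Math.Min(k, rankedLabels.Length);
        for (int i = 0; i < limit; i++)
        {
            double gain = Math.Pow(2, rankedLabels[i]) - 1;
            dcg += gain / Math.Log(i + 2, 2); // i is 0-based, so position + 1 = i + 2
        }
        return dcg;
    }

    // NDCG@k = DCG@k of the actual ranking divided by DCG@k of the ideal ranking.
    public static double Ndcg(int[] rankedLabels, int k)
    {
        int[] ideal = rankedLabels.OrderByDescending(label => label).ToArray();
        double idealDcg = Dcg(ideal, k);
        return idealDcg == 0.0 ? 0.0 : Dcg(rankedLabels, k) / idealDcg;
    }

    public static void Main()
    {
        // Made-up query with 100 results: ten label-3 items ranked first,
        // then eighty label-0 items, then ten label-2 items misplaced at the end.
        int[] ranked = Enumerable.Repeat(3, 10)
            .Concat(Enumerable.Repeat(0, 80))
            .Concat(Enumerable.Repeat(2, 10))
            .ToArray();

        foreach (int k in new[] { 10, 38, 100 })
            Console.WriteLine($"NDCG@{k} = {Ndcg(ranked, k):F4}");
    }
}
```

With this sample data, NDCG@10 is exactly 1 because the first ten positions match the ideal ordering, while NDCG@38 and NDCG@100 fall below 1 because the label-2 items sit at the bottom of the list. A truncation level capped at 10 would never surface that problem.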

@harishsk
Contributor

@nicolehaugen Apart from fixing the exception related to the truncation level, it appears that there is also a request to make RankingEvaluator (and hence RankingEvaluator.Arguments) public, so that you don't have to set the arguments through reflection.

Is that correct?

@nicolehaugen
Author

@harishsk - Yes, you are correct - we need to be able to set the Arguments without reflection.

harishsk added a commit that referenced this issue Oct 10, 2019
Summary of changes:

    Added RankingEvaluatorOptions class to control the output of evaluation
    Removed hard coded truncation limit for the max truncation level
    Added corresponding unit tests and maml tests

Fixes #3993
@ghost ghost locked as resolved and limited conversation to collaborators Mar 21, 2022