dimension drops to 0 when number of samples is high #6
Could you share some code to reproduce this? And which estimator has this issue?
I observed a similar issue where having outliers decreased the estimated intrinsic dimension. Perhaps this happens because outliers can inflate the largest eigenvalues, which leads to a lower dimension estimate. Here's a minimal example using
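The mechanism described above can be demonstrated without the package itself. The sketch below is a hedged, minimal NumPy reimplementation of a Fukunaga–Olszewski-style PCA dimension criterion (counting eigenvalues above a fraction `alpha` of the largest one); it is an assumption for illustration, not skdim's actual implementation. A single extreme outlier inflates the top eigenvalue so much that the other eigenvalues fall below the threshold, collapsing the estimate:

```python
import numpy as np

def lpca_fo(X, alpha=0.05):
    # Fukunaga-Olszewski-style criterion (illustrative sketch):
    # count covariance eigenvalues above alpha * largest eigenvalue.
    Xc = X - X.mean(axis=0)
    evals = np.linalg.eigvalsh(np.cov(Xc.T))[::-1]  # descending order
    return int(np.sum(evals > alpha * evals[0]))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))            # isotropic 3-D cloud
print(lpca_fo(X))                        # -> 3

# One extreme outlier dominates the top eigenvalue,
# pushing the remaining eigenvalues below the 5% threshold.
X_out = np.vstack([X, [1000.0, 0.0, 0.0]])
print(lpca_fo(X_out))                    # -> 1
```

This is consistent with the hypothesis: the data's true spread is unchanged, but the relative eigenvalue threshold is now measured against an outlier-inflated axis.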
Thanks for reporting. One possibility would be to add an option to use robust PCA. With that said, lPCA has a bit of a confusing API: I kept it as a global estimator, but the original paper referenced uses it as a local PCA applied in a kNN neighborhood around each point.
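The local variant described above can be sketched as follows. This is an illustrative, hedged reimplementation assuming scikit-learn's `NearestNeighbors` and the same eigenvalue-ratio criterion as before; the helper names (`local_lpca`, `alpha`) are hypothetical, not the package's API:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_lpca(X, k=20, alpha=0.05):
    # Pointwise ID estimate: run PCA on each point's k nearest
    # neighbors and count eigenvalues above alpha * largest.
    nbrs = NearestNeighbors(n_neighbors=k).fit(X)
    _, idx = nbrs.kneighbors(X)
    ids = []
    for neighborhood in idx:
        Z = X[neighborhood] - X[neighborhood].mean(axis=0)
        evals = np.linalg.eigvalsh(np.cov(Z.T))[::-1]
        ids.append(int(np.sum(evals > alpha * evals[0])))
    return np.array(ids)

rng = np.random.default_rng(0)
# 2-D plane linearly embedded in 5-D: pointwise estimates cluster at 2.
X = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 5))
print(local_lpca(X).mean())
```

Localizing the PCA also limits the damage a single outlier can do: it only distorts the neighborhoods it belongs to, not the global covariance.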
Thanks for the clarification, and for pointing to the original paper. One final thought: when aggregating the pointwise ID estimates, the mean might not always be the best choice. Using cover sets (Algorithm 2 in [Fan2010]) may yield a more reliable (and computationally efficient) global estimate. Thanks again for your time and for maintaining this very useful package!
Hello, and thanks for the package!
I am currently running a few tests, and when I run an estimator on large datasets, the returned dimension is 0. When I lower the number of samples, it returns a higher dimension. Any idea why this happens?
Best,
Etienne