cleanlab helps you clean data and labels by automatically detecting issues in an ML dataset. To facilitate machine learning with messy, real-world data, this data-centric AI package uses your existing models to estimate dataset problems that can be fixed to train even better models. cleanlab cleans your data's labels via state-of-the-art confident learning algorithms, described in the project's paper and blog post. See some of the datasets cleaned with cleanlab at labelerrors.com. The package finds label issues and other data issues so that you can train reliable ML models. All features of cleanlab work with any dataset and any model, including PyTorch, TensorFlow, Keras, JAX, HuggingFace, OpenAI, XGBoost, and scikit-learn. If you use a scikit-learn-compatible classifier, all cleanlab methods work out of the box.
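
To make the idea of using an existing model's predictions to flag likely label errors more concrete, here is a minimal, simplified sketch in the spirit of confident learning. It is not cleanlab's implementation: the function name `find_label_issues_sketch` and the toy data are hypothetical, and the real algorithms are more involved. The sketch assumes you already have given labels and out-of-sample predicted probabilities from any classifier.

```python
def find_label_issues_sketch(labels, pred_probs):
    """Flag examples whose given label disagrees with a confident model prediction.

    labels: list of integer class labels, one per example.
    pred_probs: list of per-example probability lists (one probability per class),
        ideally out-of-sample predictions from any trained classifier.
    Returns the indices of examples flagged as possible label issues.
    """
    n_classes = len(pred_probs[0])

    # Per-class threshold: the model's average confidence in class k
    # over the examples that were labeled k.
    thresholds = []
    for k in range(n_classes):
        probs_k = [p[k] for y, p in zip(labels, pred_probs) if y == k]
        # A class with no labeled examples gets a threshold of 1.0,
        # so it is almost never counted as a confident prediction.
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    issues = []
    for i, (y, p) in enumerate(zip(labels, pred_probs)):
        # Classes the model is "confident" about for this example.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # the model is not confident about any class; skip
        guess = max(confident, key=lambda k: p[k])
        if guess != y:
            issues.append(i)  # confident prediction contradicts the given label
    return issues


# Toy data (made up for illustration): example 2 is labeled 0,
# but the model strongly predicts class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.10, 0.90],
    [0.25, 0.75],
    [0.10, 0.90],
    [0.30, 0.70],
]
print(find_label_issues_sketch(labels, pred_probs))
```

cleanlab's own `find_label_issues` function in `cleanlab.filter` takes similar inputs (given labels and predicted probabilities), so a sketch like this is only useful for building intuition; for real datasets, use the package itself.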

Features

  • Binary and multi-class classification
  • Multi-label classification (e.g. image/document tagging)
  • Token classification (e.g. entity recognition in text)
  • Classification with data labeled by multiple annotators
  • Active learning with multiple annotators (suggest which data to label or re-label to improve model most)
  • Outlier and out-of-distribution detection

License

GNU Affero General Public License (AGPL)

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Data Labeling Tool, Python Data Quality Tool

Registered

2023-05-23