🧠 YData Academy: ydata-sdk Tutorials & Use Cases

Welcome to YData Academy, a collection of hands-on tutorials and use cases built entirely with the ydata-sdk.

Whether you're just getting started with synthetic data or diving into advanced data anonymization and generative AI, this repository will guide you through each core capability of ydata-sdk.

📦 About ydata-sdk

ydata-sdk is a Python package designed to simplify data-centric AI development. It includes tools for:

  • Data exploration and profiling
  • Synthetic data generation and evaluation
  • Data anonymization and privacy preservation
  • Integrations with generative AI for document analysis and generation of question-and-answer (Q&A) pairs

🗂️ Repository Structure

| Folder | Description |
| --- | --- |
| 1. Data & Connectors | Working with connectors, datasets, schema definitions, and metadata exploration |
| 2. Data Profiling | Using ydata-sdk to profile datasets for structure, quality, and distributions |
| 3. Synthetic Data Generation | Creating synthetic data using YData's generative models |
| 4. Generative AI - Documents & Q&A | Generating synthetic documents and Q&A pairs from existing documents |
| 5. Synthetic Data Evaluation | Measuring utility, fidelity, and privacy of synthetic data |
| 6. Anonymizer | Applying anonymization techniques to protect sensitive data |
| 7. Data Preparation & Cleaning | Auxiliary methods for data preparation |
| 8. Use Cases | A set of ready-to-use use-case templates built with ydata-sdk |

🚀 Getting Started

📦 Installation

pip install ydata-sdk

🧪 Requirements

  • Python 3.9+
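
To confirm that an environment meets this requirement, a short standard-library check along the following lines may help. It is only a sketch: `check_environment` is a hypothetical helper, not part of ydata-sdk, and the distribution name `ydata-sdk` is taken from the pip command above.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)  # minimum version listed under Requirements


def check_environment(dist_name="ydata-sdk"):
    """Return (python_ok, installed_version_or_None).

    Hypothetical helper for illustration; not part of ydata-sdk.
    """
    # Compare only (major, minor) against the stated minimum.
    python_ok = sys.version_info[:2] >= MIN_PYTHON
    try:
        # Looks up installed package metadata without importing the package.
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return python_ok, installed


if __name__ == "__main__":
    python_ok, installed = check_environment()
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} meets 3.9+: {python_ok}")
    print(f"ydata-sdk installed version: {installed or 'not installed'}")
```

The lookup reads package metadata only, so it works even when the package itself would fail to import.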

Running the Notebooks

Clone the repo and start Jupyter:

git clone https://github.com/ydataai/academy
cd academy
jupyter notebook

Open any notebook under the folders to explore and run the code examples interactively.
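
To see which notebooks are available before opening Jupyter, the cloned folder can be scanned with the standard library. This is a minimal sketch that assumes the notebooks are `.ipynb` files under the repository root; `list_notebooks` is an illustrative name, not something the repository provides.

```python
from pathlib import Path


def list_notebooks(repo_root="."):
    """Return notebook paths under repo_root, grouped by top-level folder.

    Hypothetical helper for illustration; not part of the repository.
    """
    root = Path(repo_root)
    grouped = {}
    for nb in sorted(root.rglob("*.ipynb")):
        # Skip Jupyter autosave copies.
        if ".ipynb_checkpoints" in nb.parts:
            continue
        rel = nb.relative_to(root)
        # Notebooks directly in the root are grouped under ".".
        folder = rel.parts[0] if len(rel.parts) > 1 else "."
        grouped.setdefault(folder, []).append(rel)
    return grouped


if __name__ == "__main__":
    for folder, notebooks in list_notebooks(".").items():
        print(folder)
        for nb in notebooks:
            print(f"  {nb}")
```

Run it from inside the cloned `academy` directory, or pass the path to the clone as `repo_root`.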

🤝 Contributing

Contributions are welcome! Please see our contribution guide for details.

About

Tutorials for YData's Fabric platform
