A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
Modified from http://drivendata.github.io/cookiecutter-data-science/
Requirements:
- Python 2.7 or 3.5
- Cookiecutter Python package >= 1.4.0: This can be installed with pip or conda, depending on how you manage your Python packages:
$ pip install cookiecutter
or
$ conda config --add channels conda-forge
$ conda install cookiecutter
To start a new project, run:
$ cookiecutter https://github.com/syhsu/cookiecutter-data-science
The directory structure of your new project looks like this:
{{cookiecutter.repo_name}}
├── .gitignore <- GitHub's excellent Python .gitignore customized for this project
├── LICENSE <- Your project's license.
├── requirements.txt <- The required packages for reproducing the analysis environment
├── README.md <- The top-level README for developers using this project.
├── Dockerfile <- The Dockerfile for running the code in src
│
├── data
│ ├── raw <- The original, immutable data dump, or a .json/.yaml file pointing to it
│ │ └── metadata.json <- Format still to be defined; updated in place from the previous version, i.e. only one record is kept (see the sketch below the tree)
│ ├── external <- Data from third-party sources
│ │ └── metadata.json
│ ├── interim <- Intermediate data that has been transformed.
│ │ └── metadata.json
│ └── final <- The final, canonical data sets for modeling.
│ └── metadata.json
│
├── docs <- Documentation, reports, references, and all other explanatory materials
│
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `_` delimited description, e.g.
│ `01_cp_exploratory_data_analysis.ipynb`. NOTE: clear notebook outputs before pushing to git!
│
├── models <- Trained and serialized models, model predictions, or model summaries
│ └── metadata.json
│
├── pipelines <- Pipelines and data workflows. Add a subfolder per orchestration tool used, e.g. airflow, kubeflow
│ ├── kubeflow
│ │ └── {{cookiecutter.repo_name}} <- Compiled Kubeflow pipelines
│ └── airflow
│
├── src <- Source folder for training and analysis code
│ └── components <- Kubeflow components or Airflow custom operators [align with Dbox design]
│
├── tests <- Testing code
│ └── components
└── setup.py
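
The metadata.json format is still to be defined. As a hypothetical sketch (field names and paths below are assumptions, not part of the template), each data folder could keep a single record that is overwritten on every update, so only the latest version is described:

```python
# Hypothetical sketch: the metadata.json schema is not yet defined for this template.
# Keeps exactly one record per data folder, replacing the previous version on each update.
import json
from datetime import datetime, timezone
from pathlib import Path


def update_metadata(folder, source, description):
    """Overwrite <folder>/metadata.json with one record describing the latest data dump."""
    record = {
        "source": source,              # assumed field names, for illustration only
        "description": description,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    Path(folder, "metadata.json").write_text(json.dumps(record, indent=2))


# Example: point data/raw at the latest dump.
update_metadata("data/raw", "s3://my-bucket/dump-2020-01.csv", "January raw data dump")
```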
To install development requirements:
$ pip install -r requirements.txt

To run the tests:
$ py.test tests
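
Tests under `tests/components` mirror the layout of `src/components`; py.test picks up any `test_*.py` file in the tests folder. A minimal sketch of such a test (module and function names are placeholders, not part of the template):

```python
# tests/components/test_example.py -- hypothetical placeholder test.

def normalize(values):
    # Stand-in for a helper that would normally be imported from src/components.
    total = sum(values)
    return [v / total for v in values]


def test_normalize_sums_to_one():
    assert abs(sum(normalize([1, 2, 3])) - 1.0) < 1e-9
```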