soniacq/DatasetsVis


DatasetsSummarizer

Datasets Summarizer is compatible with Jupyter Notebooks. To generate the similarity plot between datasets, it needs x and y values for each dataset, computed from any similarity metric. To generate the Detail View for exploring each dataset, it supports the metadata format produced by the datamart-profiler library.

System screen

(Click a dataset in the list of results to open the Detail View.)

Demo

Live demo (Google Colab):

In Jupyter Notebook:

import DatasetsSummarizer
data = DatasetsSummarizer.get_taxi_data()
DatasetsSummarizer.plot_datasets_summary(data)

Install

Option 1: install via pip:

pip install datasets-summarizer

Custom similarity metric

Use a subset of the available similarity metrics, or add a new entry (x and y values) based on a different similarity metric. For example, here we add x and y values based on a similarity metric computed from a modified version of the titles. Note that the columns modif_title_x and modif_title_y must be included in the dataframe.

new_similarity_metrics = [{'name': 'Title', 'x': 'title_x', 'y': 'title_y'},
                          {'name': 'ModifiedTitle', 'x': 'modif_title_x', 'y': 'modif_title_y'}
                         ]

Then we can pass these new similarity metrics as a parameter to the visualization:

DatasetsSummarizer.plot_datasets_summary(dataframe, new_similarity_metrics)
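The library does not prescribe how the x and y values are computed. As one possible approach, the sketch below derives 2D coordinates from a word-overlap (Jaccard) similarity between titles using classical multidimensional scaling with NumPy. The helper names jaccard_similarity and similarity_to_xy, the sample titles, and the choice of metric are all assumptions made for illustration; they are not part of DatasetsSummarizer.

```python
import numpy as np


def jaccard_similarity(a, b):
    """Jaccard similarity between the lowercase word sets of two titles."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty titles are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def similarity_to_xy(titles):
    """Project titles to 2D with classical MDS (hypothetical helper).

    Distances are defined as 1 - similarity. Expects at least two titles.
    """
    n = len(titles)
    dist = np.array(
        [[1.0 - jaccard_similarity(a, b) for b in titles] for a in titles]
    )
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the two largest.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:2]
    # Clip small negative eigenvalues caused by non-Euclidean distances.
    coords = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
    return coords[:, 0], coords[:, 1]


# Illustrative titles only; real titles would come from the dataframe.
titles = ["NYC taxi trips 2019", "NYC taxi trips 2020", "Global weather stations"]
x, y = similarity_to_xy(titles)
```

The two returned arrays can then be stored in the dataframe as the modif_title_x and modif_title_y columns before calling plot_datasets_summary. Any other projection (for example PCA or t-SNE over text embeddings) would work equally well, as long as it yields one x and one y value per dataset.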
