computerGeologist/Project_4

Key files:

Scraping_Torah.ipynb, NLP.ipynb, visualization.py

These files are intended to be run in order.

Scraping_Torah.ipynb

When this notebook is run, it retrieves verses from the Jewish Virtual Library and packages them into Torah_Verses.csv, Torah_Chapters.csv, and Chapter_Indices.csv. It also generates a labeling scheme based on https://en.wikipedia.org/wiki/Composition_of_the_Torah, which is stored in Verse_Labels.csv.

NLP.ipynb

When this notebook is run, it uses Torah_Verses.csv and Verse_Labels.csv to produce a trained vectorizer, topic modeler, and classification algorithm, which are stored in model.p.
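As a rough illustration of the two inputs this notebook consumes, the sketch below pairs each verse with its label using only the standard library. The one-value-per-row layout, the inline placeholder data, and the helper names `read_single_column` and `pair_verses_with_labels` are assumptions made for illustration; the actual CSV files may include headers or index columns, and this is not the notebook's own code.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-ins for Torah_Verses.csv and Verse_Labels.csv.
# Assumption: one value per row, no header; the real files may differ.
VERSES_CSV = "first placeholder verse\nsecond placeholder verse\nthird placeholder verse\n"
LABELS_CSV = "p\ny\np\n"


def read_single_column(handle):
    """Return the first field of every non-empty row."""
    return [row[0] for row in csv.reader(handle) if row]


def pair_verses_with_labels(verses_handle, labels_handle):
    """Zip verses with labels, refusing to proceed if the counts differ."""
    verses = read_single_column(verses_handle)
    labels = read_single_column(labels_handle)
    if len(verses) != len(labels):
        raise ValueError(f"{len(verses)} verses but {len(labels)} labels")
    return list(zip(verses, labels))


pairs = pair_verses_with_labels(io.StringIO(VERSES_CSV), io.StringIO(LABELS_CSV))

# Count how many verses carry each label.
label_counts = Counter(label for _, label in pairs)
print(label_counts)
```

Checking that the two files have the same number of rows before training is a cheap guard against a scrape that stopped partway through.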

visualization.py

When this is run using

streamlit run visualization.py

it runs a streamlit application that allows users to enter verses and determine what

Related Files:

Torah_Chapters.csv: Contains all Torah verses, grouped by chapter. Generated by Scraping_Torah.ipynb.

Chapter_Indices.csv: Labels each chapter with its chapter number and the book it comes from. Intended for visualization purposes. Generated by Scraping_Torah.ipynb.

Torah_Verses.csv: Contains all Torah verses, as individual rows. Generated by Scraping_Torah.ipynb.

Verse_Labels.csv: A simple array containing a 'p', 'y', or 'y' for each verse, corresponding to the source of that verse. Generated by Scraping_Torah.ipynb.

model.p: A pickled tuple containing a vectorizer, topic modeler, and classification algorithm, all trained on the Torah verses. Generated by NLP.ipynb.
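For readers unfamiliar with pickled tuples, here is a minimal sketch of how a three-element tuple like the one in model.p can be written and unpacked again. The placeholder strings, the in-memory buffer, and the element order (vectorizer, topic modeler, classifier) are assumptions for illustration; the real file holds fitted objects produced by NLP.ipynb.

```python
import io
import pickle

# Placeholder objects standing in for the trained vectorizer, topic modeler,
# and classifier. Assumption: the tuple is stored in this order.
stand_in_model = ("vectorizer", "topic_modeler", "classifier")

# Write the tuple to an in-memory buffer (a stand-in for the file on disk).
buffer = io.BytesIO()
pickle.dump(stand_in_model, buffer)

# Read it back and unpack; with the real file this would be open("model.p", "rb").
buffer.seek(0)
vectorizer, topic_modeler, classifier = pickle.load(buffer)
print(vectorizer, topic_modeler, classifier)
```

Loading the real model.p requires the libraries that defined the pickled objects to be installed in the same environment, and pickle files should only be loaded from sources you trust.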
