Main file: hier_cluster.py
Main input: the 10w raw file,
an interval of string lengths, and a stopword threshold.
Main output: a dendrogram picture.
path: /sms_sp_class/model
Directories needed:
../data/ raw data stored here
../pickles/ pickled output of function create_corpus() saved here
../pic/ picture outputs saved here
../outputs/ text outputs saved here
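
The directories listed above could be created before the first run with a small helper like the one below. This is only a sketch: the function name ensure_dirs and its base_dir parameter are hypothetical and not part of hier_cluster.py; only the four directory names come from the list above.

```python
from pathlib import Path

# Directory names taken from the list above.
REQUIRED_DIRS = ("data", "pickles", "pic", "outputs")


def ensure_dirs(base_dir):
    """Create each required directory under base_dir if it is missing.

    base_dir is a hypothetical parameter: the parent of the model
    directory, i.e. ".." when the script is run from the model directory.
    Returns the list of directory paths in the order listed above.
    """
    base = Path(base_dir)
    paths = []
    for name in REQUIRED_DIRS:
        path = base / name
        # exist_ok=True leaves an existing directory untouched.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths
```

Run from the model directory, ensure_dirs("..") would create any of ../data/, ../pickles/, ../pic/ and ../outputs/ that do not yet exist; existing directories and their contents are left as they are.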
Created by Alicia Ge, 2018