Comparison of libraries to extract content from HTML
$ ./download.sh
This will store the HTML files in the html directory.
$ pip install -r requirements.txt
$ python extract.py
This will write the extracted content to the content_* directories.
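
For orientation, here is a minimal sketch of what one extraction pass over the html directory could look like. It is not the code in extract.py: it uses only the Python standard library rather than the libraries listed in requirements.txt, and the names extract_text, extract_dir and the output directory content_stdlib are hypothetical, chosen only to mirror the content_* naming above. It also assumes the downloaded files end in .html.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextCollector(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text nodes of an HTML string, one per line."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def extract_dir(src="html", dst="content_stdlib"):
    """Write one .txt file to dst for every .html file in src.

    Both directory names are assumptions made for this sketch.
    """
    out = Path(dst)
    out.mkdir(exist_ok=True)
    for page in sorted(Path(src).glob("*.html")):
        html = page.read_text(encoding="utf-8", errors="replace")
        (out / (page.stem + ".txt")).write_text(extract_text(html), encoding="utf-8")
```

Calling extract_dir() after ./download.sh would fill content_stdlib with one text file per page. A baseline like this keeps navigation and footer text, which is the kind of noise a dedicated extraction library is meant to remove.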
$ pip install -r requirements.py3.txt
$ python extract_py3.py
This will write the extracted content to the py3_content_* directories.