Contains scripts that preprocess a Wikidata dump for PropertySuggester.
- use dumpconverter.py to convert a Wikidata dump to CSV
- use analyzer.py to create a CSV file with the suggestion data, which can be loaded into an SQL table
- the PropertySuggester extension provides a maintenance script (maintenance/UpdateTable.php) that loads the CSV into the database
python dumpconverter.py wikidatawiki-20140226-pages-articles.xml.bz2 dump.csv
python analyzer.py dump.csv wbs_propertypairs.csv
php extensions/PropertySuggester/maintenance/UpdateTable.php wbs_propertypairs.csv
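The three commands above can also be chained from a small Python wrapper. This is only a minimal sketch: `build_pipeline` and `run_pipeline` are hypothetical helper names that are not part of this repository, and it assumes every path resolves from the current working directory (in practice the php step may have to be run from the MediaWiki root).

```python
import subprocess


def build_pipeline(dump_path, dump_csv="dump.csv",
                   pairs_csv="wbs_propertypairs.csv"):
    """Return the three usage commands as argument lists (hypothetical helper)."""
    return [
        # 1. convert the Wikidata dump to CSV
        ["python", "dumpconverter.py", dump_path, dump_csv],
        # 2. compute the suggestion data from that CSV
        ["python", "analyzer.py", dump_csv, pairs_csv],
        # 3. load the suggestion data into the database
        ["php", "extensions/PropertySuggester/maintenance/UpdateTable.php",
         pairs_csv],
    ]


def run_pipeline(dump_path, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for command in build_pipeline(dump_path):
        print(" ".join(command))
        if not dry_run:
            # check_call raises CalledProcessError, so a failing step
            # stops the pipeline before the next one starts.
            subprocess.check_call(command)
```

Calling `run_pipeline("wikidatawiki-20140226-pages-articles.xml.bz2")` only prints the commands; pass `dry_run=False` to actually execute them.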
sudo apt-get install build-essential python-pip python-dev
pip install -r requirements.txt
nosetests
- Consider classifying properties
- Use JSON dumps for analysis
- Generate association rules for qualifiers and references
- Improve ranking to avoid suggestions of human properties
- Remove very unlikely rules (<1%)
- Converts a Wikidata dump to a CSV file with association rules between properties
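To illustrate what an association rule between properties and the 1% cut-off could look like, here is a minimal sketch. It is not the code in analyzer.py: the function name `association_rules`, the input shape (a dict mapping an item id to its property ids) and the reading of "<1%" as a conditional probability threshold are all assumptions made for this example.

```python
from collections import Counter
from itertools import permutations


def association_rules(item_properties, min_probability=0.01):
    """Estimate P(p2 | p1) for every ordered property pair (illustrative only).

    item_properties: dict mapping an item id to an iterable of property ids
    (assumed input shape, not the real CSV layout).
    Rules whose probability is below min_probability are dropped.
    """
    property_counts = Counter()
    pair_counts = Counter()
    for properties in item_properties.values():
        unique = set(properties)  # count each property once per item
        property_counts.update(unique)
        # ordered pairs (p1, p2) with p1 != p2 that occur on the same item
        pair_counts.update(permutations(sorted(unique), 2))

    rules = {}
    for (p1, p2), count in pair_counts.items():
        # float() keeps the division exact on Python 2 as well
        probability = float(count) / property_counts[p1]
        if probability >= min_probability:
            rules[(p1, p2)] = probability
    return rules
```

A rule `(p1, p2)` with probability 0.5 then reads as: half of the items that have `p1` also have `p2`. Raising `min_probability` trades coverage for fewer unlikely suggestions.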