Commit b9acd98

host clirmatrix downloads json file on jhu server
1 parent 7b60eab commit b9acd98

2 files changed, with 6 additions and 4 deletions.

ir_datasets/datasets/clirmatrix.py

Lines changed: 3 additions & 1 deletion
@@ -29,8 +29,10 @@ def _init():
     base_path = ir_datasets.util.home_path()/NAME
 
     def _dlc_init():
+        import json, gzip
         dlc = DownloadConfig.context(NAME, base_path)
-        clirmatrix_dlc = _DownloadConfig(dlc['downloads'].path(), parser='json')
+        with gzip.open(dlc['downloads'].path(), 'rb') as f:
+            clirmatrix_dlc = _DownloadConfig(contents = json.load(f))
         return clirmatrix_dlc
 
     _dlc = ir_datasets.util.Lazy(_dlc_init)
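
The change above replaces a plain JSON parse with a decompress-then-parse step. A minimal standalone sketch of that pattern, using only the standard library, might look like the following. The helper name `load_gzipped_json` and the sample payload are hypothetical and are not part of ir_datasets; the real downloads file's structure is not shown in this commit.

```python
import gzip
import json
import os
import tempfile


def load_gzipped_json(path):
    """Open a gzip-compressed JSON file and return the parsed object."""
    # gzip.open in 'rb' mode yields decompressed bytes; json.load accepts
    # a binary file object because json.loads can decode UTF-8 bytes.
    with gzip.open(path, 'rb') as f:
        return json.load(f)


# Round-trip demo with a made-up payload (not the real CLIRMatrix file).
sample = {"example_entry": {"url": "https://example.com/data.tsv.gz",
                            "cache_path": "data.tsv.gz"}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "downloads.json.gz")
    # Write the payload as gzip-compressed text so there is something to read.
    with gzip.open(path, 'wt', encoding='utf-8') as f:
        json.dump(sample, f)
    loaded = load_gzipped_json(path)
```

Parsing inside the `with` block, as the commit does, ensures the file handle is closed once the JSON has been read into memory.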

ir_datasets/etc/downloads.yaml

Lines changed: 3 additions & 3 deletions
@@ -107,9 +107,9 @@ clinicaltrials:
 
 clirmatrix:
   downloads:
-    url: 'https://macavaney.us/clirmatrix_downloads.json' # TODO: move this to JHU server?
-    expected_md5: '9e70cd85ec45caa8c16061c42d1ce9b8'
-    cache_path: 'clirmatrix_downloads.json'
+    url: 'http://www.cs.jhu.edu/~shuosun/clirmatrix/data/downloads.json.gz'
+    expected_md5: '371cc532aca236759bd3602eb6ce2181'
+    cache_path: 'clirmatrix_downloads.json.gz'
 
 
 clueweb09:
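
Because the file moved to a new URL and is now gzip-compressed, its `expected_md5` had to change as well. As a rough illustration of what such a checksum comparison involves, here is a standard-library sketch. The function names `md5_hex` and `verify_md5` are hypothetical; how ir_datasets itself performs the check is not shown in this commit.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_hex(path) == expected_md5.strip().lower()
```

A caller would pass the cached file's path together with the `expected_md5` value from the YAML entry, for example `verify_md5('clirmatrix_downloads.json.gz', '371cc532aca236759bd3602eb6ce2181')`, and treat a `False` result as a corrupted or outdated download.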

0 commit comments