Commit 2c4211a

Merge branch 'master' of github.com:weichenzhao/CS544_Project
2 parents 2f32e0b + 2e0167e commit 2c4211a

1 file changed: 21 additions, 0 deletions

Tag_Extract.py

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# This code directly extracts tags from the title.
+import json
+import sys
+from collections import defaultdict
+
+input=sys.argv[1] # train data
+tag_dic=sys.argv[2] # dictionary for tag 100
+
+data=json.load(open(input))
+dic=json.load(open(tag_dic))
+
+Y=defaultdict()
+
+for i in data:
+    title=data[i][0]
+    Y[i]=[]
+    for j in title.split():
+        if j in dic:
+            Y[i].append(dic[j])
+for z in Y:
+    print(z,Y[z])
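
The lookup in Tag_Extract.py can be sketched as a small function to make its behavior easier to check. This is a minimal illustration, not the project's code: the name `extract_tags` and the toy inputs are hypothetical, and it assumes (as the script appears to) that each record's title is its first field and that the tag dictionary maps a token to a tag value.

```python
from typing import Dict, List


def extract_tags(data: Dict[str, list], dic: Dict[str, int]) -> Dict[str, List[int]]:
    """Map each record id to the tag values whose keys appear as tokens in its title."""
    result = {}
    for record_id, fields in data.items():
        title = fields[0]  # assumed: the title is the first field of each record
        # Whitespace split with exact matching: case and punctuation are not normalized.
        result[record_id] = [dic[token] for token in title.split() if token in dic]
    return result


# Hypothetical toy inputs, not taken from the project's data files.
toy_data = {
    "1": ["python json parsing error", "body text"],
    "2": ["How to sort a list in Python?", "body text"],
}
toy_dic = {"python": 0, "json": 1, "list": 2}

tags = extract_tags(toy_data, toy_dic)
print(tags)
```

In the second toy title, the token `Python?` does not match the key `python`, because the split is on whitespace only and the comparison is case-sensitive. Whether titles should be lowercased or stripped of punctuation first depends on how the tag dictionary was built, which the commit does not show.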
