-            soup_base: a BeautifulSoup object with the html data. At the moment this method assumes that the soup_base was prepared statically.
+            soup_base: a BeautifulSoup object with the html data.
+                At the moment this method assumes that the soup_base was prepared statically.
             max: the maximum number of pages to be scraped.
         Returns:
             The number of pages to be scraped.
@@ -148,7 +149,9 @@ def get_title(self, soup):
         Args:
             soup: BeautifulSoup base to scrape the title from.
         Returns:
-            The job title scraped from soup. Note that this function may throw an AttributeError if it cannot find the title. The caller is expected to handle this exception.
+            The job title scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the title.
+            The caller is expected to handle this exception.
         """
         return soup.find('a', attrs={
             'data-tn-element': 'jobTitle'}).text.strip()
@@ -159,7 +162,9 @@ def get_company(self, soup):
         Args:
             soup: BeautifulSoup base to scrape the company from.
         Returns:
-            The company scraped from soup. Note that this function may throw an AttributeError if it cannot find the company. The caller is expected to handle this exception.
+            The company scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the company.
+            The caller is expected to handle this exception.
         """
         return soup.find('span', attrs={
             'class': 'company'}).text.strip()
@@ -170,7 +175,9 @@ def get_location(self, soup):
         Args:
             soup: BeautifulSoup base to scrape the location from.
         Returns:
-            The job location scraped from soup. Note that this function may throw an AttributeError if it cannot find the location. The caller is expected to handle this exception.
+            The job location scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the location.
+            The caller is expected to handle this exception.
         """
         return soup.find('span', attrs={
             'class': 'location'}).text.strip()
@@ -181,7 +188,9 @@ def get_tags(self, soup):
         Args:
             soup: BeautifulSoup base to scrape the location from.
         Returns:
-            The job location scraped from soup. Note that this function may throw an AttributeError if it cannot find the location. The caller is expected to handle this exception.
+            The job location scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the location.
-            The job date scraped from soup. Note that this function may throw an AttributeError if it cannot find the date. The caller is expected to handle this exception.
+            The job date scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the date.
+            The caller is expected to handle this exception.
         """
         return soup.find('span', attrs={
             'class': 'date'}).text.strip()
@@ -205,7 +216,9 @@ def get_id(self, soup):
         Args:
             soup: BeautifulSoup base to scrape the id from.
         Returns:
-            The job id scraped from soup. Note that this function may throw an AttributeError if it cannot find the id. The caller is expected to handle this exception.
+            The job id scraped from soup.
+            Note that this function may throw an AttributeError if it cannot find the id.
+            The caller is expected to handle this exception.
         """
         # id regex quantifiers
         id_regex = re.compile(r'id=\"sj_([a-zA-Z0-9]*)\"')
@@ -218,7 +231,9 @@ def get_link(self, job_id):
         Args:
             job_id: The id to be used to construct the link for this job.
         Returns:
-            The constructed job link. Note that this function does not check the correctness of this link. The caller is responsible for checking correctness.
+            The constructed job link.
+            Note that this function does not check the correctness of this link.
+            The caller is responsible for checking correctness.