Commit cac543c

web scrapper beautiful soup
1 parent bfffb8d commit cac543c

1 file changed: +20 -0 lines changed

File.scrapper-beautisoup

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+from bs4 import BeautifulSoup
+import requests
+
+url = "https://www.crummy.com/software/BeautifulSoup/"
+r = requests.get(url)
+html_doc = r.text
+soup = BeautifulSoup(html_doc, features="html.parser")
+
+# print the full html code
+# print(soup.prettify())
+
+# extract only the title
+print(soup.title)
+
+# extract only the text with no markup
+print(soup.get_text())
+
+# extract only the links
+for link in soup.find_all('a'):
+    print(link.get("href"))
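
Note on the link loop: `link.get("href")` returns `None` for anchors without an `href` attribute, and many of the values it does return are relative paths or fragments. A minimal sketch of one way to post-process those values is shown here, using only the standard library. The helper name `resolve_links` and the sample `href` list are hypothetical and are not part of this commit; in the script, the input would be the values collected from `soup.find_all('a')`.

```python
from urllib.parse import urljoin, urlparse


def resolve_links(hrefs, base_url):
    """Return absolute http(s) URLs built from raw href values.

    Hypothetical helper: drops missing hrefs, non-web schemes, and duplicates,
    keeping the first occurrence of each URL in order.
    """
    seen = set()
    resolved = []
    for href in hrefs:
        if not href:
            # None (no href attribute) or an empty string: nothing to resolve
            continue
        absolute = urljoin(base_url, href)
        if urlparse(absolute).scheme not in ("http", "https"):
            # skip mailto:, javascript:, and similar non-web links
            continue
        if absolute not in seen:
            seen.add(absolute)
            resolved.append(absolute)
    return resolved


if __name__ == "__main__":
    base = "https://www.crummy.com/software/BeautifulSoup/"
    # Assumed sample values standing in for link.get("href") results
    sample_hrefs = ["bs4/doc/", None, "#Download", "mailto:someone@example.com", "bs4/doc/"]
    for link_url in resolve_links(sample_hrefs, base):
        print(link_url)
```

With the committed script, the same helper could be called as `resolve_links((a.get("href") for a in soup.find_all("a")), url)`.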

0 commit comments
