Article-Scraper

Currently this only works for the site GeeksforGeeks, but the idea is to scrape the sites I frequent for Python-specific articles and collect them in their own HTML file, with the headline, summary, and a link to the full article for each one.
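
As a rough illustration of the output side, here is a minimal sketch of turning scraped articles into an HTML page using only the standard library. The function names (`render_articles`, `write_report`), the dictionary keys, and the `articles.html` filename are assumptions made for this example; they are not taken from geeksScrape.py.

```python
from html import escape


def render_articles(articles, title="Python Articles"):
    """Return an HTML page listing each article's headline, summary, and link.

    `articles` is assumed to be a list of dicts with the hypothetical keys
    "headline", "summary", and "link".
    """
    items = []
    for article in articles:
        # Escape scraped text so stray markup cannot break the page.
        headline = escape(article["headline"])
        summary = escape(article["summary"])
        link = escape(article["link"], quote=True)
        items.append(
            f"<article><h2>{headline}</h2><p>{summary}</p>"
            f'<a href="{link}">Read the full article</a></article>'
        )
    body = "\n".join(items)
    page_title = escape(title)
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{page_title}</title></head>\n'
        f"<body>\n<h1>{page_title}</h1>\n{body}\n</body></html>\n"
    )


def write_report(articles, path="articles.html"):
    """Write the rendered page to `path` (hypothetical default) and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_articles(articles))
    return path
```

Keeping rendering separate from scraping means the page layout can be changed without touching the code that fetches articles.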

Modules Used

  • BeautifulSoup 4 for web scraping.
  • Requests library for making HTTP requests.
  • lxml as the parser passed to the BeautifulSoup object.
  • os for opening the generated HTML file.
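
The sketch below shows one way these modules could fit together: Requests downloads the page and BeautifulSoup parses it with lxml. The CSS selectors (`div.article`, `h2 a`, `p.summary`) and the function names are hypothetical placeholders. They do not reflect the real GeeksforGeeks markup or the contents of geeksScrape.py and would need adjusting for an actual site.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_articles(html, parser="lxml"):
    """Extract headline, summary, and link from listing-page HTML.

    The selectors are hypothetical; replace them with ones that match
    the markup of the site being scraped.
    """
    soup = BeautifulSoup(html, parser)
    articles = []
    for block in soup.select("div.article"):  # hypothetical container class
        link_tag = block.select_one("h2 a")
        summary_tag = block.select_one("p.summary")  # hypothetical class
        if link_tag is None or not link_tag.get("href"):
            continue  # skip blocks without a usable link
        articles.append({
            "headline": link_tag.get_text(strip=True),
            "summary": summary_tag.get_text(strip=True) if summary_tag else "",
            "link": link_tag["href"],
        })
    return articles
```

A typical call would be `parse_articles(fetch_html(url))`, where `url` is the listing page to scrape. Splitting fetching from parsing makes it possible to check the parsing logic against saved HTML without hitting the network.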

To-Do

  • Scrape more than just the one site for articles.
  • Work on the overall display of the HTML file; it is not well formatted at the moment.
  • Test on Linux. The os.startfile() line is expected to fail there, because os.startfile() is only available on Windows.
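
For the Linux item, one possible approach is to open the report through the standard-library `webbrowser` module instead of `os.startfile()`. This is a sketch of an alternative, not what the script currently does; `report_uri` and `open_report` are hypothetical helper names.

```python
import webbrowser
from pathlib import Path


def report_uri(path):
    """Convert a file path into an absolute file:// URI."""
    return Path(path).resolve().as_uri()


def open_report(path):
    """Ask the default browser to open the report.

    Returns True if a browser could be launched, False otherwise.
    """
    return webbrowser.open(report_uri(path))
```

Calling `open_report("articles.html")` should behave the same way on Windows, macOS, and Linux, provided a browser is available on the machine.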

Usage

Download the files, open a terminal in the project folder, and run:

python geeksScrape.py

About

Scrapes some of my favourite sites for articles about Python.