Created by: hydrargyrum
What is this Python project?
It's a framework for scraping HTML sites and aggregating data from multiple sites in the same category (e.g. banking sites, news sites, video sites). It ships with ready-made modules for popular websites and ready-made applications to interact with them. Think youtube-dl, applied to domains other than video!
What's the difference between this Python project and similar ones?
- It's possible to scrape new websites with declarative-style extraction rules
- It provides a standardized API for each category of sites, built around a dedicated task (e.g. banking, web forums, video sites, news sites, music lyrics sites)
- Scraped websites are grouped into those categories
- The project comes with many existing backends for real-life websites
- It has an internal upgrade system
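
To give a rough idea of what "declarative-style extraction rules" and "a standardized API per category" can look like together, here is a minimal sketch using only the Python standard library. It is not the project's actual API: `CapNews`, `ExampleNewsModule`, `item_path`, `fields` and the sample page are all hypothetical names made up for this illustration.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod

# Hypothetical, well-formed page standing in for a real news site.
PAGE = """
<html><body>
  <div class="article"><h2> First title </h2><span class="views">12</span></div>
  <div class="article"><h2>Second title</h2><span class="views">7</span></div>
</body></html>
""".strip()


class CapNews(ABC):
    """Hypothetical shared API that every news-site module would implement."""

    @abstractmethod
    def iter_articles(self):
        """Yield one dict per article, with the same keys for every site."""


class ExampleNewsModule(CapNews):
    # Declarative rules: where the items live, and for each field
    # which sub-path to read and how to convert the extracted text.
    item_path = ".//div[@class='article']"
    fields = {
        "title": ("h2", str.strip),
        "views": ("span[@class='views']", int),
    }

    def __init__(self, html):
        # ElementTree only parses well-formed markup; real pages would
        # need a more tolerant HTML parser.
        self.root = ET.fromstring(html)

    def iter_articles(self):
        # Generic engine: apply the declared rules to every matching item.
        for node in self.root.iterfind(self.item_path):
            yield {
                name: convert(node.findtext(path))
                for name, (path, convert) in self.fields.items()
            }


articles = list(ExampleNewsModule(PAGE).iter_articles())
print(articles)
```

The point of the split is that supporting another news site would only mean writing a new class with different `item_path` and `fields`, while any application written against `CapNews` keeps working unchanged.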