-
spider_chromiumoxide_cdp
Contains all the generated types for chromiumoxide
-
git-leave
Check for unsaved or uncommitted changes on your machine
-
spider-client
Spider Cloud client
-
kodegen_tools_citescrape
KODEGEN.ᴀɪ: Memory-efficient, Blazing-Fast, MCP tools for code generation agents
-
firecrawl
Rust SDK for Firecrawl API
-
spider-cloud-cli
The Spider Cloud CLI for web crawling and scraping
-
siteprobe
Rust-based CLI tool that fetches all URLs from a given sitemap.xml URL, checks their existence, and generates a performance report. It supports various features such as authentication…
-
mq-crawler
Directory crawler for batch Markdown file processing
-
toai
Path crawler that copies all SRC files into a single output to send to an AI (toai)
-
fav_core
Fav's core crate; A collection of traits
-
yps
Yggdrasil Port Scanner
-
robotstxt
A native Rust port of Google's robots.txt parser and matcher C++ library
-
iocaine
The deadliest poison known to AI
-
spider_transformations
Transformation utils to use for spider
-
pulsarss
RSS Aggregator for Gemini Protocol
-
spider_chromiumoxide_types
Contains the essential types necessary for using chromiumoxide
-
wdict
Create dictionaries by scraping webpages or crawling local files
-
tarzi
Rust-native lite search for AI applications
-
product-os-crawler
Product OS : Crawler is a browser-based crawler that utilises Product OS : Browser to perform advanced URL crawling, leveraging headless browsing and automation
-
firecrawl-sdk
Rust SDK for Firecrawl API
-
spider_worker
The fastest web crawler as a worker or proxy
-
crawlurls
A fast async Rust crawler that discovers and filters URLs by pattern without scraping content
-
spider_cli
The fastest web crawler CLI written in Rust
-
seaward
grep-like tool for the web
-
website_crawler
gRPC tokio based web crawler built with spider
-
ferrisfetcher
A cutting-edge, high-level web scraping library crafted in Rust
-
wrake
Collect links from the given URL
-
actix_block_ai_crawling
Actix Middleware that blocks Generative AI crawlers
-
ungoliant
The pipeline for the OSCAR corpus
-
robotxt
Robots.txt (or URL exclusion) protocol with the support of crawl-delay, sitemap and universal match extensions
-
omnivore-cli
Universal web scraper and code extractor CLI - crawl websites, analyze repositories, build knowledge graphs
-
capp
Common things I use to build Rust CLI tools for web crawlers
-
crawly
A lightweight async Web crawler in Rust, optimized for concurrent scraping while respecting robots.txt rules
-
shader-prepper
Shader include parser and crawler
-
aquatic-crawler
Crawler tool for the Aquatic BitTorrent tracker API
-
scoutlang
A web crawling programming language
-
unobtanium-crawler
The default web-crawler for unobtanium
-
coma
lightweight command-line tool designed for crawling websites
-
unobtanium
Opinionated Web search engine library with crawler and viewer companion
-
s5_importer_http
HTTP importer for S5
-
headless_chrome_fork
Control Chrome programmatically
-
json-crawler
Wrapper for serde_json that provides nicer errors when crawling through large json files
-
voyager
Web crawler and scraper
-
turboscraper
A high-performance, concurrent web scraping framework for Rust with built-in support for retries, storage backends, and concurrent request handling
-
mitsuba
Lightweight 4chan board archive software (like Foolfuuka), in Rust
-
recursive_scraper
Constant-frequency recursive CLI web scraper with frequency, filtering, file directory, and many other options for scraping HTML, images and other files
-
scout-lexer
A web crawling programming language
-
firecrawl_rs
Rust SDK for Firecrawl API
-
surly-spider
A command line interface for crawling websites
-
spider_scraper
A css scraper using html5ever
-
chan-downloader
CLI to download all images/webms of a 4chan thread
-
wappu
fast and flexible web scraping library for Rust, designed to efficiently navigate and extract data from websites. Perfect for data mining, content aggregation, and web automation tasks.
-
indexea
OpenAPI of Indexea
-
frangipani
Scraping framework for rust
-
omnivore-core
Core crawler and knowledge graph engine for Omnivore - web scraping, AI extraction, browser automation
-
quick_crawler
QuickCrawler is a Rust crate that provides a completely async, declarative web crawler with domain-specific request rate-limiting built-in
-
brchd
Data exfiltration toolkit
-
async_job
async cron job crate for Rust
-
spacebar
An anti-plagiarism tool based on null width characters
-
maman
Rust Web Crawler
-
rsfile
operate files or web pages easily and quickly
-
web-crawler
Finds every page, image, and script on a website (and downloads it)
-
spidery
Rust SDK for Spidery API
-
ntwrk
TODO
-
ptt-crawler
A crawler for the web version of PTT, the largest online community in Taiwan
-
finde-rs
Multi-threaded filesystem crawler
-
jsdom
javascript dom parser for web scraping
-
stream_crawler
scraping web pages and extracting URLs and endpoints
-
url-crawl
URL crawler for HTML code
-
krate
Get information and metadata for published Rust crates
-
actix-prerender
Actix middleware that sends requests to Prerender.io or a custom Prerender service URL
-
ssufid
SSU Announcement Crawler for Everyone
-
od-get
recursively crawling & downloading data from open directories
-
rust-rock-rover
Concert web crawler in Rust
-
kodict
Korean dictionary implementation and crawler for Rust
-
doublesite
Alternative for httrack
-
wls
Easily crawl multiple sitemaps and list URLs
-
crawl
Rust crawl
-
lolchive
local liminal archiver for webpages
-
source-demo-tool-crawler
WIP: a gui tool for opening (editing planned) source engine demo files
-
spire
The flexible scraper framework powered by tokio and tower
-
emails
A web scraper to extract email addresses from websites
-
sws-crawler
Web crawler with pluggable scraping logic
-
scraper_query
Ergonomic Query for HTML with Scraper
-
flatcrawl-crawler
set of webpage crawlers. New crawlers can be easily configured and the output can be written to an AMQP queue.
-
task_deport
Organize simple task queue
-
gar-crawl
High level HTML crawler with concise builder
-
ac_crawler_types
normalized types for the anti capital public data crawlers
-
labisu
implementing algorithms finding large bipartite subgraphs
-
dblp_crawler
DBLP Crawler
-
spire-macros
Macros for spire
-
crawler_data_client
client for programmatic download of crawler data
-
pop-os/apt-repo-crawler
crawling through files in an apt repo
-
scrupy
fast, modern spider framework written in and for Rust. The framework implements the functionalities of Scrapy, but is low-level and typesafe. It exposes an elegant API and uses zero unsafe code.
-
karkinos
Powerful and flexible web scraper with YAML configuration, supporting pagination, data transformations, caching, and multiple output formats