#web-crawler

  1. spider_chromiumoxide_cdp

    Contains all the generated types for chromiumoxide

    v0.7.7 3.0K #chromiumoxide #cdp #generated #dev-tools #protocols #chrome #web-crawler
  2. git-leave

    Check for unsaved or uncommitted changes on your machine

    v1.6.4 1.0K #web-crawler #uncommitted #cli
  3. spider-client

    Spider Cloud client

    v0.1.82 #artificial-intelligence #web-crawler #web-indexer
  4. kodegen_tools_citescrape

    KODEGEN.ᴀɪ: Memory-efficient, Blazing-Fast, MCP tools for code generation agents

    v0.10.19 #web-crawler #mcp #claude #terminal
  5. firecrawl

    Rust SDK for Firecrawl API

    v1.2.1 #web-crawler #artificial-intelligence #sdk #ai-api #format #markdown #llm #web-search #structured-data #scrape
  6. spider-cloud-cli

    The Spider Cloud CLI for web crawling and scraping

    v0.1.82 #web-scraping #web-indexer #web-crawler
  7. siteprobe

    Rust-based CLI tool that fetches all URLs from a given sitemap.xml url, checks their existence, and generates a performance report. It supports various features such as authentication…

    v1.1.0 #sitemap #web-crawler #performance #http-monitoring #url-checker
  8. mq-crawler

    Directory crawler for batch Markdown file processing

    v0.5.8 #web-crawler #query #markdown #jq
  9. toai

    Path crawler that copies all SRC files into a single output to send to an AI (toai)

    v0.2.0 #artificial-intelligence #path #stdout #src #ignore #web-crawler #relative-path
  10. fav_core

    Fav's core crate; A collection of traits

    v0.1.8 2.0K #resources #status-flags #web-crawler #fetch #protobuf #cookies #collection-traits #logout #visualize #fetched
  11. yps

    Yggdrasil Port Scanner

    v0.1.3 180 #tcp #yggdrasil #udp #web-crawler #search
  12. robotstxt

    A native Rust port of Google's robots.txt parser and matcher C++ library

    v0.3.0 6.8K #web-crawler #parser
  13. iocaine

    The deadliest poison known to AI

    v3.1.0 #poison #reverse-proxy #artificial-intelligence #web-scraping #generator #web-crawler #scripting-engine #costs
  14. spider_transformations

    Transformation utils to use for spider

    v2.37.112 1.3K #web-crawler #transformation #crawler
  15. pulsarss

    RSS Aggregator for Gemini Protocol

    v0.1.4 180 #gemini-protocol #rss #web-crawler #gemini-gemtext
  16. spider_chromiumoxide_types

    Contains the essential types necessary for using chromiumoxide

    v0.7.4 3.1K #chromiumoxide #chromium #dev-tools #protocols #chrome #websocket #async-api #headless-chrome #web-crawler
  17. wdict

    Create dictionaries by scraping webpages or crawling local files

    v0.1.20 #web-page #dictionary #web-crawler #local #word-list
  18. tarzi

    Rust-native lite search for AI applications

    v0.1.6 1.4K #web-crawler #rag #search
  19. product-os-crawler

    Product OS : Crawler is a browser-based crawler that utilises Product OS : Browser to perform advanced URL crawling leveraging headless browsing and automation

    v0.0.16 450 #product-os #web-crawler
  20. firecrawl-sdk

    Rust SDK for Firecrawl API

    v0.3.1 550 #sdk #scrape #web-crawler #firecrawl #api #cargo-run
  21. spider_worker

    The fastest web crawler as a worker or proxy

    v2.38.110 #web-scraping #web-crawler #spider-cli
  22. crawlurls

    A fast async Rust crawler that discovers and filters URLs by pattern without scraping content

    v0.1.1 #web-crawler #url #web #rust
  23. spider_cli

    The fastest web crawler CLI written in Rust

    v2.38.110 130 #web-crawler #web-scraping #crawler
  24. seaward

    grep-like tool for the web

    v1.1.0 290 #web-crawler #rustcrawler #cli
  25. website_crawler

    gRPC tokio based web crawler built with spider

    v0.9.9 #web-indexer #web-crawler #site-map-generator #crawler
  26. ferrisfetcher

    A cutting-edge, high-level web scraping library crafted in Rust

    v0.1.0 #web-scraping #web-crawler #scraper
  27. wrake

    Collect links from the given URL

    v0.4.2 #web-crawler #web #crawler
  28. actix_block_ai_crawling

    Actix Middleware that blocks Generative AI crawlers

    v0.2.11 #artificial-intelligence #generative-ai #web-crawler #block #actix-middleware #ip-address #user-agent #openai
  29. ungoliant

    The pipeline for the OSCAR corpus

    v2.0.0 #corpus #common-crawl #oscar #pipeline #web-crawler #fasttext #gz #packaging #identification
  30. robotxt

    Robots.txt (or URL exclusion) protocol with the support of crawl-delay, sitemap and universal match extensions

    v0.6.1 600 #web-crawler #web-framework #scraper
  31. omnivore-cli

    Universal web scraper and code extractor CLI - crawl websites, analyze repositories, build knowledge graphs

    v0.2.0 #git #web-crawler #code-analysis #web-scraping
  32. capp

    Common things I use to build Rust CLI tools for web crawlers

    v0.4.3 650 #web-crawler #async-executor #async
  33. crawly

    A lightweight async Web crawler in Rust, optimized for concurrent scraping while respecting robots.txt rules

    v0.1.9 #web-crawler #robots-txt #web-scraping #rate-limiting #builder-pattern #concurrency #depth-first-search #respecting
  34. shader-prepper

    Shader include parser and crawler

    v0.3.0-pre.3 5.3K #shader-compiler #web-crawler #build-system #virtual-filesystem #provider #graphics
  35. aquatic-crawler

    Crawler tool for the Aquatic BitTorrent tracker API

    v0.1.0 #bittorrent #web-crawler #parser #magnet #aquatic
  36. scoutlang

    A web crawling programming language

    v0.7.2 370 #web-crawler #web-scraping #programming-language #web-crawling
  37. unobtanium-crawler

    The default web-crawler for unobtanium

    v3.0.0 #web-crawler #unobtanium #index #search-engine
  38. coma

    lightweight command-line tool designed for crawling websites

    v0.2.3 310 #web-crawler #web-scraping #web-discovery
  39. unobtanium

    Opinionated Web search engine library with crawler and viewer companion

    v3.0.0 #search-engine #web-crawler #web-search #database #shared-data-structures
  40. s5_importer_http

    HTTP importer for S5

    v1.0.0-beta.1 #importer #s5 #base-url #http #web-crawler #parallel-processing #content-length
  41. headless_chrome_fork

    Control Chrome programmatically

    v1.0.2 #headless-chrome #headless-browser #dev-tools #puppeteer #web-scraping #web-crawler #fetching
  42. json-crawler

    Wrapper for serde_json that provides nicer errors when crawling through large json files

    v0.0.11 220 #serde-json #web-crawler #json-error #youtube-music #pointers
  43. voyager

    Web crawler and scraper

    v0.2.1 #web-crawler #web-scraping #state-machine
  44. turboscraper

    A high-performance, concurrent web scraping framework for Rust with built-in support for retries, storage backends, and concurrent request handling

    v0.1.1 190 #web-scraping #web-crawler #web #async
  45. mitsuba

    Lightweight 4chan board archive software (like Foolfuuka), in Rust

    v1.10.0 #downloader #archive #web-crawler #web-archive
  46. recursive_scraper

    Constant-frequency recursive CLI web scraper with frequency, filtering, file directory, and many other options for scraping HTML, images and other files

    v0.6.2 #web-scraping #recursion #web-crawler #web #scraper
  47. scout-lexer

    A web crawling programming language

    v0.7.2 #web-crawler #web-scraping #programming-language #web-crawling
  48. firecrawl_rs

    Rust SDK for Firecrawl API

    v0.1.1 #web-crawler #sdk #structured-data #llm #api-sdk #markdown #web-data
  49. surly-spider

    A command line interface for crawling websites

    v1.0.2 #command-line-interface #web-crawler #surly #domain #flags
  50. spider_scraper

    A css scraper using html5ever

    v0.1.2 1.7K #web-scraping #css-selectors #html-parser #serialization #element-attributes #web-crawler
  51. chan-downloader

    CLI to download all images/webms of a 4chan thread

    v0.3.0 #download #web-crawler #4chan #4plebs #cli
  52. wappu

    fast and flexible web scraping library for Rust, designed to efficiently navigate and extract data from websites. Perfect for data mining, content aggregation, and web automation tasks.

    v0.3.0 490 #web-scraping #html-parser #web-content #web-crawler #extract #data-mining #web-page #web-data #fetch-and-parse #navigate
  53. indexea

    OpenAPI of Indexea

    v1.0.0 #oauth #widgets #payment #apps-api #account-api #search-api #logging #invoice #web-crawler #autocomplete
  54. frangipani

    Scraping framework for rust

    v0.3.1 #web-scraping #continuous-crawler #web-crawler #scraper #scraping
  55. omnivore-core

    Core crawler and knowledge graph engine for Omnivore - web scraping, AI extraction, browser automation

    v0.1.1 #web-crawler #knowledge-graph #browser #async #web-scraping
  56. quick_crawler

    QuickCrawler is a Rust crate that provides a completely async, declarative web crawler with domain-specific request rate-limiting built-in

    v0.1.2 #web-crawler #rate-limiting #domain-specific #web-scraping #web-page
  57. brchd

    Data exfiltration toolkit

    v0.1.0 #toolkit #exfiltration #upload #uploader #0-1 #web-crawler
  58. async_job

    async cron job crate for Rust

    v0.1.4 1.1K #cron-job #web-crawler #crawler
  59. spacebar

    An anti-plagiarism tool based on null width characters

    v0.3.0-rc1 #character #database #clipboard #tool #width #blog #http-errors #web-crawler
  60. maman

    Rust Web Crawler

    v0.13.1 #http #web-crawler #web #crawler
  61. rsfile

    operate files or web pages easily and quickly

    v0.1.2 #web-crawler #file-utility #web-page #text-file #web-page-helper #text-file-helper #csv-file-helper #crawler
  62. web-crawler

    Finds every page, image, and script on a website (and downloads it)

    v0.1.3 #download #web-page #find #image #script
  63. spidery

    Rust SDK for Spidery API

    v1.0.0 #sdk #web-crawler #llm #crawl #format #scrape #data-url
  64. ntwrk

    TODO

    v0.1.1 #browser-automation #web-crawler #debugging #web-scraping #debugging-tool
  65. ptt-crawler

    A crawler for the web version of PTT, the largest online community in Taiwan

    v0.1.0 #web-crawler #ptt #crawler
  66. finde-rs

    Multi-threaded filesystem crawler

    v0.1.4 #web-crawler #thread-pool #channel #cli
  67. jsdom

    javascript dom parser for web scraping

    v0.0.11-alpha.1 120 #web-scraping #web-crawler
  68. stream_crawler

    scraping web pages and extracting URLs and endpoints

    v0.1.1 #web-crawler #web-scraping #endpoint #web
  69. url-crawl

    URL crawler for HTML code

    v0.2.0 #kvarn #url #web-crawler #crawl #push #web-server
  71. krate

    Get information and metadata for published Rust crates

    v1.0.0 #metadata #io-api #contract #data-model #web-crawler
  72. actix-prerender

    Actix middleware that sends requests to Prerender.io or a custom Prerender service URL

    v0.2.4 #web-crawler #service-url #prerender #send #io #actix-web #user-agent #actix-middleware
  73. ssufid

    SSU Announcement Crawler for Everyone

    v0.1.0 #ssu #web-crawler #announcement
  74. od-get

    recursively crawling & downloading data from open directories

    v0.3.1 #download #web-crawler #open-directory #recursion-depth #file-pattern #verbosity #logging
  75. rust-rock-rover

    Concert web crawler in Rust

    v0.1.0 #concert #web #web-crawler #cargo-generate #template #git #ci
  76. kodict

    Korean dictionary implementation and crawler for Rust

    v0.2.1 #dictionary #korean #web-crawler #hangul
  77. doublesite

    Alternative for httrack

    v0.1.0 #content #httrack #loading #website #cli #backup #har #mirroring #web-crawler
  78. wls

    Easily crawl multiple sitemaps and list URLs

    v0.1.0 #sitemap #web-crawler #url
  79. crawl

    Rust crawl

    v0.2.1 #web-crawler #http #spider
  80. lolchive

    local liminal archiver for webpages

    v0.2.0 #web-page #archiver #local #liminal #web-crawler #date
  81. source-demo-tool-crawler

    WIP: a gui tool for opening (editing planned) source engine demo files

    v0.8.2 #demo-file #source-engine #web-crawler #tool #editing #changelog #file-content
  82. spire

    The flexible scraper framework powered by tokio and tower

    v0.1.0 #web-framework #web-crawler #scraper
  83. emails

    A web scraper to extract email addresses from websites

    v1.0.0 #email #web-crawler #web
  84. sws-crawler

    Web crawler with pluggable scraping logic

    v0.1.0 #web-crawler #web-scraping-logic #sws #sitemap #seed #plugable #scrap #web-page
  85. scraper_query

    Ergonomic Query for HTML with Scraper

    v0.4.0 200 #web-scraping #query #html #document #class #web-crawler
  86. flatcrawl-crawler

    set of webpage crawlers. New crawlers can be easily configured and the output can be written to an AMQP queue.

    v1.0.0 #amqp #web-crawler #web-scraping #flatcrawl #flats #web-page
  87. task_deport

    Organize simple task queue

    v0.1.0 #task-queue #redis #in-memory-storage #processing #web-crawler #redis-queue #health-check #health-monitoring #concurrency
  88. gar-crawl

    High level HTML crawler with concise builder

    v0.1.16 #web-crawler #high #level #propagator #builder #allow-list
  89. ac_crawler_types

    normalized types for the anti capital public data crawlers

    v0.1.5 #web-crawler #capital #public #normalized #anti
  90. labisu

    implementing algorithms finding large bipartite subgraphs

    v0.1.1 #subgraph #web-crawler #bipartite #algorithm #finding #undirected-graph
  91. dblp_crawler

    DBLP Crawler

    v0.1.2 #web-crawler #dblp #chatgpt #database
  92. spire-macros

    Macros for spire

    v0.1.0 #web-framework #web-crawler #scraper
  93. crawler_data_client

    client for programmatic download of crawler data

    v0.0.9 #web-crawler #download #client #market #programmatic #zstd
  94. pop-os/apt-repo-crawler

    crawling through files in an apt repo

    GitHub 0.1.0 #web-crawler #apt #repo
  95. scrupy

    fast, modern spider framework written in and for Rust. The framework implements the functionalities of Scrapy, but is low-level and typesafe. It exposes an elegant API and uses zero unsafe code.

    v0.1.6 #framework #low-level #type-safe #elegant #downloader #scrapy #web-crawler
  96. karkinos

    Powerful and flexible web scraper with YAML configuration, supporting pagination, data transformations, caching, and multiple output formats

    v0.0.1 #web-scraping #web-crawler #html-parser #scraper