DrQA download | SourceForge.net

DrQA is an open-domain question answering system that reads large text corpora—famously Wikipedia—to answer natural language questions with extractive spans. It follows a two-stage pipeline: a fast document retriever first narrows down candidate articles, and a neural machine reader then predicts the exact answer span from those passages. The retriever relies on classic IR features (like TF-IDF and n-gram statistics) to remain lightweight and scalable to millions of documents. The reader is a neural model trained on supervised QA data to estimate start and end positions within a paragraph, and it can be adapted to new domains through fine-tuning or distant supervision. The repository includes scripts to build the Wikipedia index, train the reader, and evaluate end-to-end performance. DrQA popularized a practical recipe for combining IR and neural reading, and it remains a strong baseline for open-domain QA research and production prototypes.
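The retrieval stage described above can be illustrated with a small sketch. This is not DrQA's own code: it is a minimal, standard-library-only TF-IDF ranker over unigrams, and it ignores the n-gram statistics the description mentions. The function names (`tokenize`, `build_index`, `retrieve`), the smoothed IDF formula, and the toy corpus are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build sparse TF-IDF vectors for a dict mapping doc_id -> text.

    Returns (vectors, idf). The smoothed IDF used here is an assumption
    for this sketch, not DrQA's actual weighting.
    """
    n_docs = len(docs)
    tokenized = {doc_id: tokenize(text) for doc_id, text in docs.items()}

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    idf = {t: math.log((n_docs + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vectors = {
        doc_id: {t: c * idf[t] for t, c in Counter(tokens).items()}
        for doc_id, tokens in tokenized.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, vectors, idf, k=2):
    """Return up to k doc ids ranked by similarity to the question."""
    counts = Counter(tokenize(question))
    # Terms never seen in the corpus carry no weight and are dropped.
    q_vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = sorted(
        ((cosine(q_vec, vec), doc_id) for doc_id, vec in vectors.items()),
        reverse=True,
    )
    return [doc_id for score, doc_id in scored[:k] if score > 0.0]


# Toy corpus standing in for Wikipedia articles (hypothetical data).
DOCS = {
    "paris": "Paris is the capital of France",
    "berlin": "Berlin is the capital of Germany",
    "python": "Python is a programming language",
}

if __name__ == "__main__":
    vectors, idf = build_index(DOCS)
    print(retrieve("What is the capital of France?", vectors, idf))
```

In a full pipeline, the documents returned by `retrieve` would be split into paragraphs and passed to the reader. A sparse index over hashed n-grams would be the natural next step for a corpus with millions of documents.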

Features

Scalable TF-IDF–based retriever over large corpora
Neural span extractor trained for precise start/end predictions
End-to-end pipeline from indexing to answering questions
Tools for distant supervision and domain adaptation
Reproducible training and evaluation scripts for standard datasets
Modular components enabling IR or reader swaps and custom corpora
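The start/end prediction mentioned in the feature list can be sketched as a search over candidate spans. The sketch below assumes the reader has already produced one start probability and one end probability per token. It then picks the span that maximizes their product, subject to a maximum length. The function name `best_span`, the `max_len` default, and the example scores are hypothetical and not taken from the DrQA repository.

```python
def best_span(start_scores, end_scores, max_len=15):
    """Pick (start, end, score) maximizing start_scores[i] * end_scores[j].

    Only spans with i <= j < i + max_len are considered, so the end can
    never precede the start and answers stay short.
    """
    if len(start_scores) != len(end_scores) or not start_scores:
        raise ValueError("score lists must be non-empty and equal in length")

    best = (0, 0, float("-inf"))
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s * end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


if __name__ == "__main__":
    tokens = ["The", "capital", "of", "France", "is", "Paris"]
    # Hypothetical per-token probabilities from a trained reader.
    start = [0.05, 0.05, 0.05, 0.10, 0.05, 0.70]
    end = [0.05, 0.05, 0.05, 0.10, 0.05, 0.70]
    i, j, score = best_span(start, end)
    print(" ".join(tokens[i : j + 1]), score)
```

Scoring each pair jointly, rather than taking the two argmax positions independently, avoids invalid spans whose predicted end falls before the predicted start.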

License

BSD License

Additional Project Details

Operating Systems

Linux, Mac

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-10-07

DrQA

Reading Wikipedia to Answer Open-Domain Questions
