Tox-Agents is an AI-driven inference platform for exploring molecular toxicity at scale. The ecosystem brings together state-of-the-art deep learning models, transfer-learning frameworks, and reaction-network tooling so that chemists and toxicologists can move from raw structures to defensible insights with minimal friction. The current release (September 22, 2025) unifies the full-stack web experience with offline executables, enabling both rapid experimentation and production-grade studies.
- ToxD4C – a from-scratch toxicity model trained on a large, diverse dataset for robust endpoint prediction. (GitHub)
- Uni-Mol Transfer Learning – leverages powerful pre-trained molecular encoders to boost inference accuracy. (Uni-Mol Tools)
- Reaction network integration – tight coupling with Molreac and ReacNet Analyzer for pathway-aware toxicity analysis.
- Interactive visualization – frontend widgets expose molecular properties, descriptors, and inference diagnostics in real time.
- Tox-D4C dataset: https://doi.org/10.6084/m9.figshare.30156718.v1
- Uni-Mol processed data:
data/data/original/processed_final8k213_original.csv - Reacnet dataset: https://doi.org/10.6084/m9.figshare.30171562
Computational chemistry descriptors—especially those tied to electronic structure and reactivity—are surfaced directly in the UI to help you interpret model outputs beyond a single toxicity score.
The hosted instance is fully open source and ready to explore:
For heavy workloads, offline use delivers better stability, avoids LLM API quotas, and reduces server costs. The bundled package includes:
- The all-in-one ToxD4C inference executable.
- The Uni-Mol inference framework with support for loading multiple pre-trained checkpoints.
Author tip: We strongly recommend the standalone ToxD4C executable. Although it forgoes massive pre-training, its large in-domain dataset yields highly reliable predictions.
Getting started offline
- Download the
.exebundle from the ToxD4C releases page. - Double-click to launch and allow a minute for initialization before the interface loads.
ToxD4C excels at tracking subtle shifts in toxicity throughout a reaction pathway. Pair it with Molreac for generating reaction networks and ReacNet Analyzer for visual inspection to follow how reactants evolve into products.
Combining 3D conformer searches with reaction-network analysis delivers rich insight that SMILES-only workflows miss. Low-energy structures are directly linked to observed toxicity trends.
Curious whether BPA degradation products remain toxic? Follow this pipeline:
- Acquire a 3D structure: Download the 3D SDF for BPA from PubChem.
- Prepare the input: Copy coordinates into a new
.xyzfile, or generate the 3D structure from SMILES via an empirical force field. - Simulate the network: Load the
.xyzfile into Molreac:ONE to generate reaction pathways (.reacnet). - Analyze the network: Use ReacNet Analyzer to transform
.reacnetfiles into interactive HTML visualizations. - Extract pathways: Identify products that match experiments or select compelling branches and export the minimum-energy structures as
.xyzfiles. - Predict toxicity: Batch the
.xyzfiles through ToxD4C to evaluate how all 31 toxicity endpoints evolve along the reaction path.
With the resulting structural and descriptor data, you can perform manual interpretation, feed the outputs into Tox-Agents for further computation, or escalate to downstream platforms such as DeepSeek for meta-analysis.
(Commits on Oct 19, 2025)
Also welcome to try the Next Generation Reaction Network Explorer - a fast first principles computation and reaction network exploration tool!

- Avoid using 2D PubChem structures for inference; always obtain a 3D geometry.
- Reject unrealistic or highly distorted conformations—the predictions will not be meaningful.
- Transition states are insightful but not validated as ground-truth inputs for toxicity; treat them as exploratory evidence only.
- Optimize geometries with an empirical force field before prediction.
- Sample low-energy conformers, ideally across an entire reaction pathway, to reveal micro-level toxicity shifts.
src/– production runtime for the intelligent agent, including the FastAPI backend (frontend/backend), Next.js SPA (frontend), orchestration scripts, shared predictors, visualizers, and chatbot utilities.data/– sanitized example datasets used in demos (training corpora are hosted separately).ToxD4C_framework/,trainfordl/,trainforml/– research and training code for the ToxD4C deep model, UniMol transfer learning, and traditional ML baselines.requirements.txt/requirements_full.txt– minimal runtime stack vs. the full research environment (including optional UniMol + LightRAG extras).README_original_gradio.md– legacy documentation for the original Gradio prototype.
The previous README tracked an older file layout; the sections below reflect the modern
srcbundle.
- Python 3.8+
- Node.js 18+ with npm
- Optional: PyMOL for local 3D visualization
cd src
pip install -r ../requirements_full.txt # or requirements.txt for a lean runtime
npm install --prefix frontend- Place UniMol checkpoints under
src/models/(for example,models/ToxPred_modelmini,models/MD_model,models/refscale.npz). - Add any NPZ, CSV, or descriptor files required for your workflow.
- Missing assets no longer crash the UI—the frontend highlights the required file and target directory.
python start_full_system.pyThe launcher clears ports 3000, 8000, and 50001-50003, validates the environment, installs frontend dependencies on demand, and boots the stack:
- FastAPI backend →
http://localhost:8000 - Next.js frontend →
http://localhost:3000
Backend logs stream to the console. Once you see ✅ 后端服务启动成功, the API is ready. Terminate both services together with Ctrl+C.
- API health check:
curl http://localhost:8000/health - Frontend smoke test: open
http://localhost:3000 - End-to-end test:
python frontend/test_real_prediction.py(run fromsrc/frontend).
cd src/frontend/backend
uvicorn main_fixed:app --host 0.0.0.0 --port 8000 --reloadKey modules:
main_fixed.pylazily loads predictors fromsrc/and exposes conversion, prediction, visualization, export, and chat endpoints.simple_rag_service.pyserves a lightweight document store located atsrc/simple_rag_storage/.chatbot.pyimplements the Gradio interface and request assembly logic.
Override default model paths by exporting environment variables before launch:
export BINARY_MODEL_PATH="models/ToxPred_modelmini"
export PROPERTY_MODEL_PATH="models/MD_model"
export REFSCALE_PATH="models/refscale.npz"If a referenced model is missing, the backend returns a clear message detailing which directory to populate.
cd src/frontend
npm install # first run
npm run dev # serves http://localhost:3000To connect a remote backend, configure the API endpoint before starting the dev server:
export NEXT_PUBLIC_API_URL="https://your-backend.example.com"
npm run devOptional UI hints can be set via environment variables (the backend still governs actual model loading):
NEXT_PUBLIC_BINARY_MODEL_PATH=models/ToxPred_modelmini
NEXT_PUBLIC_PROPERTY_MODEL_PATH=models/MD_model
NEXT_PUBLIC_REFERENCE_PATH=models/refscale.npz- UniMol checkpoints – copy into
src/models/(for example, reuseToxPred_modelmini/andMD_model/from production deployments). - ToxD4C weights and datasets – download from the shared drive (TOXRIC, TDC, Wu et al.) and place under
ToxD4C_framework/datato retrain. - Sample labels –
data/DATA_labels.csvcontains cleaned labels derived from21sttox10k.
Converters in src/interface.py support XYZ, NPZ, SDF, MOL, and SMILES inputs for streamlined data preparation.
src/chatbot.py currently forwards user turns directly to the configured LLM. To align responses with the ToxD4C analysis policy, store prompt metadata in src/frontend/backend/llm_report_config.json (or another shared location) and load it before sending the first request. A representative configuration:
{
"llm_model": "TBD",
"llm_model_version": "TBD",
"prompts": {
"A1_system_prompt": {
"role": "Chem Risk Analyst aligned to the ToxD4C workflow; produce auditable, uncertainty-aware toxicity interpretations from molecular images, structured descriptor JSON, and optional assay/context files.",
"grounding": "Use SHAP thresholds (Table 1 digest) as the only quantitative rule base; do not invent data.",
"evidence_style": "All claims must be backed by an Evidence Matrix (descriptor → value → threshold → direction → reliability).",
"uncertainty_and_applicability_domain": "Note gaps (units, missing fields). If ECFP4 similarity or embedding Mahalanobis are provided, flag AD in/out; otherwise state 'AD unknown'.",
"tools": [
"User KB: ingest CSV/JSON/PDF/image; cite file names.",
"Web: search authoritative sources; cite links.",
"Optional plugins: literature.search, cheminfo.lookup, sim.qm, sim.docking, sim.md (generate job cards/protocols; never claim execution without tool confirmation)."
],
"required_outputs": [
"Quick verdict",
"Evidence Matrix",
"Mechanism hypotheses",
"AD/uncertainty note",
"Next actions (docking/MD/QM plans with parameters)",
"Reproducibility facts (seed, version if provided)"
],
"reasoning_policy": "No chain of thought; provide decision records (rules applied and outcomes)."
},
"A2_shap_thresholds_digest": [
{"descriptor": "XLogP", "threshold": 3.05893, "direction": "higher → higher risk", "reliability": 0.979},
{"descriptor": "HOMO–LUMO gap (a.u.)", "threshold": 0.33105, "direction": "lower → higher risk", "reliability": 0.999},
{"descriptor": "ALIE Ave (a.u.)", "threshold": 0.50461, "direction": "lower → higher risk", "reliability": 0.999},
{"descriptor": "Quadrupole moment (a.u.)", "threshold": 21.1766, "direction": "higher → higher risk", "reliability": 0.9998},
{"descriptor": "Weight (Da)", "threshold": 246.334, "direction": "higher → higher risk", "reliability": 0.992},
{"descriptor": "LUMO (a.u.)", "threshold": -0.00517, "direction": "more negative → higher risk", "reliability": 0.986},
{"descriptor": "ALIEmin (eV)", "threshold": 11.2949, "direction": "lower minima → higher risk", "reliability": 0.489},
{"descriptor": "Negative ESP surface (Bohr²)", "threshold": 359.924, "direction": "higher → higher risk", "reliability": 0.999},
{"descriptor": "Heavy atom count", "threshold": 14.3053, "direction": "higher → higher risk", "reliability": 0.990},
{"descriptor": "Complexity", "threshold": 184.588, "direction": "higher → higher risk", "reliability": 1.0},
{"descriptor": "Rotatable bonds", "threshold": 2.52924, "direction": "too high → entropy penalty; near-threshold optimal", "reliability": 0.9999},
{"descriptor": "ESPmin (kcal/mol)", "threshold": -36.8484, "direction": "more negative → higher risk", "reliability": 0.958},
{"descriptor": "HOMO (a.u.)", "threshold": -0.29269, "direction": "less negative (higher) → higher risk", "reliability": 0.999},
{"descriptor": "LEA Var (eV)", "threshold": 0.06576, "direction": "higher → higher risk", "reliability": 0.9996},
{"descriptor": "Molecular radius (Å)", "threshold": 6.30992, "direction": "higher → higher risk", "reliability": 0.996},
{"descriptor": "LEA Ave (a.u.)", "threshold": -0.97949, "direction": "more negative → higher risk", "reliability": 0.814}
]
}
}Implementation checklist:
- Load
A1_system_promptas the system message before the first user turn. - Make the SHAP threshold table visible to the model so the Evidence Matrix can cite it explicitly.
- Persist decision records (rules applied and threshold comparisons) alongside chat transcripts for auditing.
- Surface "AD unknown" whenever the backend does not provide applicability-domain metrics.
- ToxD4C training –
ToxD4C_framework/train.py - UniMol fine-tuning –
trainfordl/3528_datasets/3528_train.py - Classical ML baselines –
unimol_pipeline/run_fingerprint_training.py
Datasets referenced above require external downloads; consult ToxD4C_framework/README for detailed instructions.
Released under the MIT License (see LICENSE).
Need help or have feedback? Please open an issue or reach out via the project discussions. Happy experimenting!






