CTINexus is a novel framework that leverages optimized in-context learning of LLMs to enable data-efficient extraction of cyber threat intelligence and the construction of high-quality cybersecurity knowledge graphs.

eljeffeg/CTINexus

 
 


Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

License: MIT

This fork of the CTINexus demo converts it into a containerized microservice as a proof of concept: the user submits text and receives the processed framework results along with a visualization of the graph. It uses Gradio for the UI. Please see the original CTINexus project for the upstream work.


This is the repository of CTINexus, a novel framework that leverages optimized in-context learning (ICL) of large language models (LLMs) for data-efficient CTI knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction. CTINexus requires neither extensive data nor parameter tuning and can adapt to various ontologies with minimal annotated examples.


News

🌟 [2025/06/14] Community spotlight — Jeff’s fork turns CTINexus into a containerized micro-service PoC with a Gradio UI. Submit text and instantly see the extracted intel and interactive graph!

🔥 [2025/04/21] We released the camera-ready paper on arXiv.

🔥 [2025/02/12] CTINexus has been accepted at the 2025 IEEE European Symposium on Security and Privacy (Euro S&P).

Introduction

CTINexus is composed of the following modules:

  • IE: A carefully designed automatic prompt construction strategy with optimal demonstration retrieval for extracting a wide range of cybersecurity entities and relations;
  • A hierarchical entity alignment technique that canonicalizes the extracted knowledge and removes redundancy;
    • ET: Groups mentions of the same type.
    • EM: Merges mentions referring to the same entity with IOC protection.
  • LP: A long-distance relation prediction technique to further complete the CSKG with missing links.

Quick Start

1. Prerequisites

pip install -r requirements.txt

2. Cybersecurity Triplet Extraction

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform triplet extraction:
    sh tools/scripts/ie.sh
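
To make the idea of prompt construction with demonstration retrieval concrete, here is a minimal sketch. It is not the code behind tools/scripts/ie.sh: the similarity measure (bag-of-words cosine), the prompt wording, the function names, and the sample demonstrations are all assumptions chosen for illustration. It only shows the general pattern of picking the annotated examples closest to the input and placing them before the query.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_demos(query, demos, k=2):
    """Return the k demonstrations whose text is most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(demos, key=lambda d: cosine(q, Counter(tokenize(d["text"]))), reverse=True)
    return ranked[:k]


def build_prompt(query, demos):
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    parts = ["Extract (subject, relation, object) triplets from the CTI report."]
    for d in demos:
        parts.append(f"Text: {d['text']}\nTriplets: {d['triplets']}")
    parts.append(f"Text: {query}\nTriplets:")
    return "\n\n".join(parts)


# Hypothetical annotated demonstrations (illustrative only, not project data).
DEMOS = [
    {"text": "The ransomware group exploited a VPN vulnerability.",
     "triplets": [("ransomware group", "exploits", "VPN vulnerability")]},
    {"text": "The phishing email delivered a malicious macro document.",
     "triplets": [("phishing email", "delivers", "malicious macro document")]},
    {"text": "The botnet launched a DDoS attack against the bank.",
     "triplets": [("botnet", "launches", "DDoS attack")]},
]

query = "A phishing email delivered a malicious attachment to employees."
selected = retrieve_demos(query, DEMOS, k=2)
prompt = build_prompt(query, selected)
```

The resulting `prompt` string would then be sent to the LLM; that call is left out because it depends on the configuration file and API key mentioned above.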

3. Hierarchical Entity Alignment

3.1 Coarse-grained Entity Typing

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform entity typing:
    sh tools/scripts/et.sh
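
As a rough illustration of what grouping mentions by type looks like once type labels are available, consider the sketch below. The type labels and mentions are hypothetical and do not reflect the ontology or output format used by et.sh; in CTINexus the typing itself is done by the LLM, which is represented here only by a hard-coded list.

```python
from collections import defaultdict

# Hypothetical result of the typing step: (mention, coarse type) pairs.
# The label names are illustrative assumptions.
typed_mentions = [
    ("Lazarus Group", "Attacker"),
    ("WannaCry", "Malware"),
    ("Lazarus", "Attacker"),
    ("192.0.2.10", "Indicator"),
    ("WannaCry ransomware", "Malware"),
]


def group_by_type(pairs):
    """Collect mentions under their type label, preserving input order."""
    groups = defaultdict(list)
    for mention, label in pairs:
        groups[label].append(mention)
    return dict(groups)


groups = group_by_type(typed_mentions)
```

Grouping first means the later merging step only has to compare mentions within the same type, which keeps the number of comparisons small.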

3.2 Fine-grained Entity Merging

  1. Update the configuration files (config1, config2). To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to perform entity alignment:
    sh tools/scripts/em.sh
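
The sketch below shows one way "merging with IOC protection" could work in principle. It is an assumption-laden illustration, not the logic of em.sh: the IOC patterns, the string-similarity measure (difflib's `SequenceMatcher`), the 0.8 threshold, and the greedy clustering are all stand-ins. The point is that indicators such as IP addresses or hashes are only merged on exact match, even when they look nearly identical as strings.

```python
import re
from difflib import SequenceMatcher

# Assumed IOC patterns: IPv4 addresses, MD5/SHA-1/SHA-256 hex digests, CVE IDs.
IOC_PATTERNS = [
    re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$"),
    re.compile(r"^(?:[a-fA-F0-9]{32}|[a-fA-F0-9]{40}|[a-fA-F0-9]{64})$"),
    re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE),
]


def is_ioc(mention):
    """True if the mention matches one of the assumed IOC patterns."""
    return any(p.match(mention.strip()) for p in IOC_PATTERNS)


def similar(a, b, threshold=0.8):
    """Case-insensitive string similarity check."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_mentions(mentions, threshold=0.8):
    """Greedily cluster mentions; IOCs are merged only on exact match."""
    clusters = []
    for m in mentions:
        for cluster in clusters:
            rep = cluster[0]  # compare against the cluster's first mention
            if is_ioc(m) or is_ioc(rep):
                match = m == rep
            else:
                match = similar(m, rep, threshold)
            if match:
                cluster.append(m)
                break
        else:
            clusters.append([m])
    return clusters


clusters = merge_mentions(["WannaCry", "Wannacry", "192.0.2.10", "192.0.2.11", "Emotet"])
```

Here the two spellings of WannaCry end up in one cluster, while the two IP addresses stay separate even though plain string similarity would score them above the threshold.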

4. Long-Distance Relation Prediction

  1. Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
  2. Run the following script to predict long-distance relations:
    sh tools/scripts/lp.sh
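
This README does not spell out how candidate pairs for long-distance relation prediction are chosen, so the following is a sketch under a stated assumption: the graph built from extracted triplets may fall into disconnected subgraphs, and a reasonable candidate for a missing link is the pair of highest-degree ("central") entities from two different subgraphs. The entity names and helper functions are hypothetical, and the relation between each candidate pair would still have to be predicted by the LLM, which is omitted here.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(triplets):
    """Undirected adjacency sets from (subject, relation, object) triplets."""
    adj = defaultdict(set)
    for subj, _rel, obj in triplets:
        adj[subj].add(obj)
        adj[obj].add(subj)
    return adj


def connected_components(adj):
    """Depth-first search over the graph, returning a list of node lists."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append(comp)
    return components


def candidate_links(triplets):
    """Pair up the highest-degree entity of each disconnected subgraph."""
    adj = build_adjacency(triplets)
    # Ties on degree are broken by name so the result is deterministic.
    centers = [max(comp, key=lambda n: (len(adj[n]), n))
               for comp in connected_components(adj)]
    return list(combinations(centers, 2))


# Hypothetical triplets forming two disconnected subgraphs.
TRIPLETS = [
    ("APT-X", "uses", "Loader-A"),
    ("Loader-A", "drops", "Backdoor-B"),
    ("Loader-A", "contacts", "C2 server"),
    ("Phishing campaign", "targets", "Finance sector"),
    ("Phishing campaign", "delivers", "Malicious document"),
]

candidates = candidate_links(TRIPLETS)
```

Each pair in `candidates` is a place where the graph is currently split; asking the model for the relation between the two entities is what would join the subgraphs.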

Citation

@inproceedings{cheng2025ctinexusautomaticcyberthreat,
      title={CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models}, 
      author={Yutong Cheng and Osama Bajaber and Saimon Amanuel Tsegai and Dawn Song and Peng Gao},
      booktitle={2025 IEEE European Symposium on Security and Privacy (EuroS\&P)},
      year={2025},
      organization={IEEE}
}

License

The source code is licensed under the MIT License. We warmly welcome industry collaboration. If you’re interested in building on CTINexus or exploring joint initiatives, please email [email protected]—we’d be happy to set up a brief call to discuss ideas.
