Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

This is the repository for CTINexus, a novel framework that leverages optimized in-context learning (ICL) of large language models (LLMs) for data-efficient cyber threat intelligence (CTI) knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction. CTINexus requires neither extensive data nor parameter tuning and can adapt to various ontologies with minimal annotated examples.

News

🌟 [2025/06/14] Community spotlight — Jeff’s fork turns CTINexus into a containerized micro-service PoC with a Gradio UI. Submit text and instantly see the extracted intel and interactive graph!

🔥 [2025/04/21] We released the camera-ready paper on arXiv.

🔥 [2025/02/12] CTINexus was accepted at the 2025 IEEE European Symposium on Security and Privacy (Euro S&P).

Introduction

CTINexus consists of the following modules:

IE: A carefully designed automatic prompt construction strategy with optimal demonstration retrieval for extracting a wide range of cybersecurity entities and relations;
A hierarchical entity alignment technique that canonicalizes the extracted knowledge and removes redundancy;
- ET: Groups mentions of the same type.
- EM: Merges mentions referring to the same entity, with protection for indicators of compromise (IOCs).
LP: A long-distance relation prediction technique to further complete the CSKG with missing links.

Quick Start

1. Prerequisites

pip install -r requirements.txt

2. Cybersecurity Triplet Extraction

Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
Run the following script to perform triplet extraction:
```
sh tools/scripts/ie.sh
```
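To make the idea of demonstration retrieval for prompt construction more concrete, here is a minimal sketch. It is not the repository's implementation: the function names, the prompt wording, the example pool, and the bag-of-words similarity (a stand-in for whatever embedding model the IE module actually uses) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a simple stand-in for a sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_demonstrations(query, pool, k=2):
    """Return the k annotated examples most similar to the query report."""
    q = bag_of_words(query)
    ranked = sorted(pool, key=lambda d: cosine(q, bag_of_words(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, demos):
    """Assemble an in-context learning prompt: instruction, demos, then the query."""
    parts = ["Extract (subject, relation, object) triplets from the CTI report."]
    for d in demos:
        parts.append(f"Report: {d['text']}\nTriplets: {d['triplets']}")
    parts.append(f"Report: {query}\nTriplets:")
    return "\n\n".join(parts)


# Hypothetical annotated pool, invented for this example.
POOL = [
    {"text": "The ransomware encrypts files and demands payment in Bitcoin.",
     "triplets": [("ransomware", "encrypts", "files")]},
    {"text": "The phishing email delivers a malicious Word document.",
     "triplets": [("phishing email", "delivers", "malicious Word document")]},
    {"text": "The botnet launches DDoS attacks against banks.",
     "triplets": [("botnet", "launches", "DDoS attacks")]},
]

query = "A phishing email delivers a malicious PDF document to employees."
demos = retrieve_demonstrations(query, POOL, k=1)
prompt = build_prompt(query, demos)
print(prompt)
```

The resulting prompt string would then be sent to the LLM. Retrieving demonstrations per query, rather than using a fixed set, is what lets a small annotated pool cover varied reports.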

3. Hierarchical Entity Alignment

3.1 Coarse-grained Entity Typing

Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
Run the following script to perform entity typing:
```
sh tools/scripts/et.sh
```
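Conceptually, the output of this step can be thought of as mentions bucketed by their assigned type, so that the later merging step only compares mentions within the same bucket. The sketch below illustrates that grouping only. The type labels and the `group_by_type` helper are hypothetical, and the type assignment itself (done by the LLM in CTINexus) is taken as given.

```python
from collections import defaultdict


def group_by_type(typed_mentions):
    """Group (mention, type) pairs by type, dropping exact duplicate mentions."""
    groups = defaultdict(list)
    for mention, entity_type in typed_mentions:
        if mention not in groups[entity_type]:
            groups[entity_type].append(mention)
    return dict(groups)


# Hypothetical typed mentions; the labels are illustrative, not the repo's ontology.
typed = [
    ("APT28", "Attacker"),
    ("Fancy Bear", "Attacker"),
    ("X-Agent", "Malware"),
    ("APT28", "Attacker"),
]

groups = group_by_type(typed)
print(groups)
```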

3.2 Fine-grained Entity Merging

Update the configuration files (config1, config2). To use the optimal settings, simply insert your OpenAI API key.
Run the following script to perform entity alignment:
```
sh tools/scripts/em.sh
```
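The following sketch illustrates what merging with IOC protection could look like. It is an assumption-laden illustration rather than the logic in `em.sh`: string similarity from `difflib` stands in for the similarity measure the EM module actually uses, the threshold is arbitrary, and the IOC patterns cover only a few common formats. The point it shows is that indicators such as IP addresses can be nearly identical as strings while denoting different entities, so they are merged only on exact match.

```python
import re
from difflib import SequenceMatcher

# A few common IOC formats (illustrative, not exhaustive).
IOC_PATTERNS = [
    re.compile(r"^\d{1,3}(\.\d{1,3}){3}$"),                     # IPv4 address
    re.compile(r"^([a-fA-F0-9]{32}|[a-fA-F0-9]{40}|[a-fA-F0-9]{64})$"),  # MD5/SHA-1/SHA-256
    re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE),           # CVE identifier
]


def is_ioc(mention):
    """True if the mention looks like an indicator of compromise."""
    return any(p.match(mention) for p in IOC_PATTERNS)


def same_entity(a, b, threshold=0.8):
    """IOCs must match exactly; other mentions merge on string similarity."""
    if is_ioc(a) or is_ioc(b):
        return a == b  # IOC protection: never fuzzy-merge indicators
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def merge_mentions(mentions, threshold=0.8):
    """Greedily cluster mentions, comparing each against cluster representatives."""
    clusters = []
    for m in mentions:
        for cluster in clusters:
            if same_entity(m, cluster[0], threshold):
                cluster.append(m)
                break
        else:
            clusters.append([m])
    return clusters


mentions = ["Cobalt Strike", "CobaltStrike", "192.168.1.10", "192.168.1.11"]
clusters = merge_mentions(mentions)
print(clusters)
```

Without the IOC check, the two IP addresses above would exceed the similarity threshold and be collapsed into one node, which would corrupt the graph.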

4. Long-Distance Relation Prediction

Update the configuration file. To use the optimal settings, simply insert your OpenAI API key.
Run the following script to predict long-distance relations:
```
sh tools/scripts/lp.sh
```
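As a rough illustration of where missing links might be looked for, the sketch below assumes the extracted graph is fragmented into disconnected subgraphs, picks the highest-degree entity in each subgraph, and pairs those entities as candidates for relation prediction. This selection strategy, the helper names, and the toy triplets are assumptions for illustration. The relation itself would be predicted by the LLM, which is not shown.

```python
from collections import defaultdict


def connected_components(triplets):
    """Return (components, adjacency) for the undirected view of the graph."""
    adj = defaultdict(set)
    for subj, _, obj in triplets:
        adj[subj].add(obj)
        adj[obj].add(subj)
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components, adj


def candidate_links(triplets):
    """Pair the highest-degree node of each component with those of the others."""
    components, adj = connected_components(triplets)
    # sorted() makes tie-breaking between equal-degree nodes deterministic.
    centers = [max(sorted(c), key=lambda n: len(adj[n])) for c in components]
    return [(centers[i], centers[j])
            for i in range(len(centers))
            for j in range(i + 1, len(centers))]


# Toy triplets, invented for this example.
triplets = [
    ("attacker", "uses", "malware"),
    ("attacker", "targets", "bank"),
    ("phishing email", "contains", "malicious link"),
    ("phishing email", "spoofs", "vendor"),
]

pairs = candidate_links(triplets)
print(pairs)
```

Each candidate pair would be passed to the LLM together with the report text, and any predicted relation would be added to the graph as a new edge.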

Citation

We hope our work serves as a foundation for further LLM applications in the CTI analysis community. If you find it helpful for your research, please consider citing our paper! ❤️

@inproceedings{cheng2025ctinexusautomaticcyberthreat,
      title={CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models}, 
      author={Yutong Cheng and Osama Bajaber and Saimon Amanuel Tsegai and Dawn Song and Peng Gao},
      booktitle={2025 IEEE European Symposium on Security and Privacy (EuroS\&P)},
      year={2025},
      organization={IEEE}
}

License

The source code is licensed under the MIT License. We warmly welcome industry collaboration. If you’re interested in building on CTINexus or exploring joint initiatives, please email [email protected]—we’d be happy to set up a brief call to discuss ideas.

