Automating Information Extraction from Emails using Large Language Models

This project is still a work in progress.

Prerequisites

Python 3.12 installed
Ollama installed (Installation Guide)
pip installed (comes with Python)
.msg email files available for processing
LangSmith API key
Hugging Face API key
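
The two API keys are typically supplied through environment variables. The following is a minimal sketch of a pre-flight check, not part of this repository. The variable names LANGSMITH_API_KEY and HF_TOKEN are assumptions; the scripts here may read different names, so adjust the tuple to match.

```python
import os

# Assumed variable names; check the project's code for the ones it actually reads.
REQUIRED_KEYS = ("LANGSMITH_API_KEY", "HF_TOKEN")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling missing_keys() before launching a script gives a clear list of what still needs to be exported, instead of a failure deep inside a model call.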

Setup & Execution:

Create a Virtual Environment:

Run the following command in your project's root directory:

For Linux/macOS:

python3 -m venv myvenv

or

conda create -n myvenv python=3.12 -y

For Windows:

python -m venv myvenv

or

conda create -n myvenv python=3.12 -y

Activate the Virtual Environment:

For Linux/macOS:

source myvenv/bin/activate

or

conda activate myvenv

For Windows (Command Prompt):

myvenv\Scripts\activate

or

conda activate myvenv

Install Dependencies:

pip install -r requirements.txt

Download the LLM Model

Before running the program, download the required model (or any model you wish to use):

ollama pull llama3.1
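
To confirm that the pull succeeded, one option is to ask the local Ollama server which models it has. The sketch below assumes Ollama is running at its default address (http://localhost:11434) and uses its /api/tags endpoint; the helper names are hypothetical and not part of this repository.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumed default address


def installed_models(tags_payload):
    """Extract model names from the JSON returned by /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_model(tags_payload, wanted):
    """True if `wanted` matches an installed model, with or without a tag suffix."""
    for name in installed_models(tags_payload):
        if name == wanted or name.split(":", 1)[0] == wanted:
            return True
    return False


def fetch_tags(url=OLLAMA_TAGS_URL, timeout=5):
    """Query a running Ollama server for its locally available models."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server running, has_model(fetch_tags(), "llama3.1") should return True once the download has finished.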

Execute the Program:

Different approaches for preprocessing the emails

# Synchronous model inference (serial execution)
python email_preprocessing_sync.py /path/to/your/data/directory

# Asynchronous model inference
python email_preprocessing_async.py /path/to/your/data/directory

# Distributed model inference using vLLM
python email_preprocessing_vllm_unordered.py /path/to/your/data/directory

# Distributed model inference using Accelerate with distinct prompts
# (will be transformed into an agent later)
python email_preprocessing_agent.py /path/to/your/data/directory
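
The difference between the synchronous and asynchronous variants can be illustrated with a small sketch. It does not reproduce the repository's scripts: fake_inference is a hypothetical stand-in for a real model call, and max_in_flight is an illustrative parameter.

```python
import asyncio


async def fake_inference(email_text):
    """Stand-in for a model call; a real one would await an LLM client."""
    await asyncio.sleep(0.01)  # simulate model latency
    return email_text.strip().upper()


async def process_serial(emails):
    """Handle one email at a time, waiting for each result before the next starts."""
    return [await fake_inference(e) for e in emails]


async def process_concurrent(emails, max_in_flight=4):
    """Run several requests at once, capped by a semaphore; results keep input order."""
    limit = asyncio.Semaphore(max_in_flight)

    async def guarded(email_text):
        async with limit:
            return await fake_inference(email_text)

    return await asyncio.gather(*(guarded(e) for e in emails))
```

Both functions return the same results in the same order. The concurrent version overlaps the waiting time of individual requests, which is where the speed-up comes from when the model server can handle parallel calls.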

First approach: RAG on Vector Database

# Create a vector database from a deduplicated email list
python create_embeddings.py
# Run Retrieval-Augmented Generation (RAG) on the vector database
python rag_embedDB.py
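
As a rough illustration of the two steps above (deduplicate, then retrieve by similarity), here is a self-contained sketch. It is not the code in create_embeddings.py or rag_embedDB.py: it swaps the embedding model for plain word counts and the vector database for an in-memory list, and all function names are hypothetical.

```python
import hashlib
import math
import re
from collections import Counter


def deduplicate(emails):
    """Drop emails whose case- and whitespace-normalized text was already seen."""
    seen, unique = set(), []
    for text in emails:
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(emails):
    """Pair each unique email with its vector."""
    return [(text, embed(text)) for text in deduplicate(emails)]


def retrieve(query, index, k=2):
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In a RAG setup, the texts returned by retrieve would be placed into the prompt as context before the question is sent to the model.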

Second approach: RAG on Knowledge Graph

# Create Neo4j knowledge graph from emails
python create_kg.py
# Run the bot with an agent implementing RAG on the knowledge graph
python bot.py
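
To give an idea of what building a knowledge graph from emails can involve, the sketch below turns one parsed email into (subject, relation, object) triples and shows a parameterized Cypher statement for one of them. The schema (Person and Email nodes, SENT and TO relationships) and the dictionary keys are assumptions for illustration; create_kg.py may model the data differently.

```python
def email_to_triples(email):
    """Turn one parsed email (a dict) into (subject, relation, object) triples."""
    triples = [(email["sender"], "SENT", email["id"])]
    triples += [(email["id"], "TO", recipient) for recipient in email["recipients"]]
    return triples


# Parameterized Cypher for the SENT triple (hypothetical schema).
# MERGE creates a node or relationship only if it does not exist yet,
# so loading the same email twice does not duplicate it.
MERGE_SENT = (
    "MERGE (p:Person {address: $sender}) "
    "MERGE (e:Email {id: $id}) "
    "MERGE (p)-[:SENT]->(e)"
)
```

The statement and its parameters (sender, id) could then be passed to a Neo4j driver session. Keeping values in parameters rather than formatting them into the query string avoids quoting problems with addresses and subjects.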

Note:

Replace /path/to/your/data/directory with the actual path to the directory where your .msg files are stored.
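
For reference, a script that takes the data directory as its argument might collect the .msg files along these lines. This is a sketch under the assumption that the files sit directly in that directory; find_msg_files is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def find_msg_files(data_dir):
    """Return the .msg files directly inside data_dir, sorted by path."""
    root = Path(data_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{data_dir} is not a directory")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".msg"
    )
```

Failing early with NotADirectoryError makes a mistyped path obvious before any model is loaded.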

ChrysMan/llm-automated-email-parser
