Piece | What it does | Key libs / models |
---|---|---|
Vector DB | Stores chunks + embeddings | Haystack ChromaDocumentStore |
Embeddings | Ukrainian-capable sentence vectors | lang-uk/ukr-paraphrase-mpnet-base (XLM-RoBERTa-based) |
Index pipeline | ✓ convert (txt/pdf/docx/md/html) → docs ✓ clean & split → chunks ✓ embed → vectors ✓ write to Chroma | Haystack converters, splitters, embedders |
QA pipeline | 1) embed user question → vector 2) retrieve top-10 similar docs 3) build Ukrainian prompt 4) call LLM → answer | Retriever + ChatPromptBuilder → OllamaChatGenerator or OpenAIChatGenerator |
LLM choices | • Local: any Ollama model (LLM_MODEL, default llama3.1:8b, ~8 GB RAM) • Cloud: ChatGPT via OPENAI_API_KEY | Ollama / OpenAI |
Telegram bot | /start welcome → every text is passed to QA pipeline, first reply sent back | python-telegram-bot v13 |
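Step 2 of the QA pipeline (retrieve the top-10 most similar chunks) boils down to cosine similarity between the question embedding and the stored chunk embeddings. The following is a minimal plain-Python sketch of that idea, not the actual Haystack/Chroma retriever; the function names and the toy 2-D "embeddings" are illustrative only:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_top_k(query_vec, doc_vecs, k=10):
    """Return indices of the k chunks most similar to the query vector."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 2-D "embeddings" standing in for real 768-dim vectors:
docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(retrieve_top_k([1.0, 0.1], docs, k=2))  # → [0, 2]
```

In the real pipeline the embeddings come from the sentence-transformer model above, and Chroma performs this nearest-neighbour search internally instead of a full sorted scan.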
- data/fiot_files/ ← unzip faculty docs here
- chroma_dir/ ← auto-generated vector DB
- qa_pipeline.py ← RAG inference
- data_pipeline.py ← indexing/embeddings
- telegram_bot.py ← Telegram interface
- .env ← keys & model names
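Putting the variables mentioned in this README together, a `.env` file might look like the sketch below. The variable names (`TELEGRAM_TOKEN`, `LLM_MODEL`, `OPENAI_API_KEY`) are the ones this document uses; the values are placeholders you must replace with your own:

```ini
# Token issued by @BotFather for your bot
TELEGRAM_TOKEN=123456:ABC-your-token-here
# Ollama model to run locally (default per the table above)
LLM_MODEL=llama3.1:8b
# Only needed when switching to OpenAI (ChatGPT) models
OPENAI_API_KEY=sk-...
```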
- Unzip the `fiot_files.zip` file in the `data` dir so the faculty documents are placed under the `data/fiot_files` dir; you can also add other files with faculty info in txt, pdf, docx, Markdown, and HTML formats;
- Create a new Telegram bot via @BotFather and update the `TELEGRAM_TOKEN` value in the `.env` file with the new bot's token;
- Download Ollama to run free LLMs locally;
- Run `ollama run llama3.1:8b` in a terminal to download and start the llama3.1 model with 8 billion parameters locally, which requires ~8 GB of RAM. You can also choose another model from the Ollama site, then replace the `LLM_MODEL` name inside the `.env` file;
- Run `pip install -r requirements.txt` inside the working directory to install all dependencies;
- Run `python telegram_bot.py` in a terminal.
Now you can go to your Telegram bot and ask it questions about FICE :)
- If you don't have an Nvidia GPU, remove the `device = ComponentDevice.from_str("cuda")` line and the `device` variable usage from the `qa_pipeline.py` and `data_pipeline.py` files.
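Instead of deleting the device line by hand, one alternative is to pick the device at runtime. This is a sketch, not the repo's code; it assumes PyTorch is installed alongside the embedding model (the `pick_device` helper is hypothetical):

```python
def pick_device():
    """Return "cuda" when a CUDA-capable GPU is usable, else "cpu"."""
    try:
        import torch  # installed as a dependency of the sentence-transformer embedder
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    return "cpu"

device_str = pick_device()
# Then in qa_pipeline.py / data_pipeline.py:
# device = ComponentDevice.from_str(device_str)
print(device_str)
```

On a machine without an Nvidia GPU (or without torch) this falls back to `"cpu"`, so the same code runs everywhere.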
To improve the bot's answers you should use a "stronger" LLM. To run this bot with OpenAI (ChatGPT) models:
- Generate a new OpenAI API key here (you should also have credits on the account);
- Update `OPENAI_API_KEY` in the `.env` file with your OpenAI API key;
- Go to the `qa_pipeline.py` file and uncomment the `qa_pipeline.add_component("llm", OpenAIChatGenerator())` line, then comment out the `qa_pipeline.add_component("llm", OllamaChatGenerator(...))` lines;
- Restart the bot: `python telegram_bot.py`.
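Rather than commenting and uncommenting lines, the backend could be selected from the environment. A minimal sketch of that idea; `choose_llm_backend` is a hypothetical helper, while `OpenAIChatGenerator` / `OllamaChatGenerator` are the component names this README refers to:

```python
import os

def choose_llm_backend(env=None):
    """Hypothetical helper: pick the chat generator based on .env contents."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"  # would map to OpenAIChatGenerator()
    # would map to OllamaChatGenerator(model=env.get("LLM_MODEL", "llama3.1:8b"))
    return "ollama"

print(choose_llm_backend({"OPENAI_API_KEY": "sk-test"}))  # → openai
print(choose_llm_backend({}))                             # → ollama
```

With this approach the same `qa_pipeline.py` works in both modes: set `OPENAI_API_KEY` in `.env` to use ChatGPT, leave it empty to stay on the local Ollama model.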