streamlit_about
streamlit_about is an open-source application showcasing a Retrieval-Augmented Generation (RAG) system that delivers personalised, AI-driven, real-time question-and-answer experiences. Written by Miklós in Python, it combines modern Natural Language Processing (NLP) techniques with Large Language Model (LLM) engineering, vector search, and a robust, modular backend design to provide an informative and interactive user experience.
- Live App: about-miklos.streamlit.app
- Source Code: github.com/ikko/streamlit_about
- Contact: linkedin.com/in/miklosbeky
Combines vector-based semantic search with generative AI to provide accurate and contextually relevant answers.
- Vector Store: Utilizes FAISS for efficient similarity search.
- BM25 Indexing: Employs the Best Matching 25 (BM25) algorithm for keyword-based retrieval.
- Metadata Management: Stores and retrieves document metadata for enhanced context.
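To make the keyword side of the retrieval stack concrete, here is a minimal, self-contained sketch of BM25 scoring in plain Python. It is not the project's code: the `TinyBM25` class, its whitespace tokenizer, and the `k1`/`b` defaults are illustrative assumptions, and the repository's serialized index may well be built with a different library and different settings.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class TinyBM25:
    """Illustrative Okapi BM25 scorer (hypothetical helper, not the project's code)."""

    def __init__(self, documents, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in documents]
        self.avg_len = sum(len(d) for d in self.docs) / len(self.docs)
        # Document frequency: the number of documents containing each term.
        self.df = Counter(term for d in self.docs for term in set(d))

    def idf(self, term):
        n = len(self.docs)
        df = self.df.get(term, 0)
        # Smoothed IDF variant that stays non-negative for very common terms.
        return math.log(1 + (n - df + 0.5) / (df + 0.5))

    def score(self, query, index):
        doc = self.docs[index]
        tf = Counter(doc)
        total = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf[term] + self.k1 * (1 - self.b + self.b * len(doc) / self.avg_len)
            total += self.idf(term) * tf[term] * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k=3):
        """Return up to k (document index, score) pairs, best first."""
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(i, s) for s, i in scored[:k] if s > 0]


corpus = [
    "python backend developer",
    "streamlit user interface",
    "faiss vector search in python",
]
bm25 = TinyBM25(corpus)
best = bm25.top_k("vector search")
```

Here `best` holds a single hit, the third document (index 2), because it is the only one containing either query term. Documents with no matching term are dropped rather than returned with a zero score.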
Structured for scalability and maintainability, separating concerns across different modules.
- Knowledge Base: Handles data ingestion and vector index creation.
- Backend Logic: Manages query processing and response generation.
- Dynamic Pages: Generates content-rich pages dynamically based on user interactions.
Leverages Streamlit's capabilities to create an intuitive and responsive user interface.
- Dynamic Content: Updates in real time based on user queries.
- AI-Generated Knowledge Base: Unsupervised learning clusters the knowledge topics, and LLM-generated Q&A keeps the content general, easy to adopt, and well articulated.
- Multi-Page Navigation: Organizes content across multiple pages for better UX.
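The topic clustering mentioned in the list above can be illustrated with a small k-means routine. This is only a sketch under assumptions: the README does not say which clustering algorithm or embedding the project uses, so k-means, the `kmeans` function, and the two-dimensional toy points are all hypothetical stand-ins for clustering real document embeddings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means; starting centroids are passed in so the run is deterministic."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy "embeddings": two obvious groups of points.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels, centroids = kmeans(points, initial_centroids=[[0.0, 0.0], [5.0, 5.0]])
```

With these inputs the first two points share one label and the last two share the other. In a real pipeline each cluster could then be handed to an LLM to draft a question-and-answer page for that topic.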
Designed for seamless deployment and scalability.
- Streamlit Community Cloud: Easily deployable via Streamlit's hosting platform (formerly Streamlit Sharing).
- Development: Includes handy scripts for streamlined local deployment.
- Requirements Management: Specifies dependencies for reproducibility.
Integrates information retrieval with generative models to enhance response accuracy. This hybrid approach ensures that generated answers are grounded in factual data.
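One common way to ground answers is to paste the retrieved passages into the prompt and instruct the model to answer only from them. The sketch below shows that assembly step in isolation. The function name `build_grounded_prompt`, the prompt wording, and the `max_passages` limit are assumptions for illustration; the project's actual prompt is not described here.

```python
def build_grounded_prompt(question, passages, max_passages=3):
    """Assemble a prompt that asks the model to answer only from the passages."""
    selected = passages[:max_passages]
    # Number the passages so the model (and the reader) can refer back to them.
    context = "\n".join(f"[{n}] {text}" for n, text in enumerate(selected, start=1))
    return (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


retrieved = [
    "The app uses FAISS for efficient similarity search.",
    "BM25 is used for keyword-based retrieval.",
]
prompt = build_grounded_prompt("Which library handles similarity search?", retrieved)
```

The resulting string would then be sent to whichever LLM the backend is configured to use. Keeping this step as a pure function makes it easy to unit-test without calling a model.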
Adopts a clear separation of concerns, facilitating easier maintenance and scalability. Each module handles a specific responsibility, promoting code clarity.
Utilizes Streamlit to quickly develop and deploy interactive web applications, reducing the time from concept to deployment.
Employs FAISS for high-speed similarity searches in large datasets, enabling efficient retrieval of relevant information.
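Conceptually, a similarity search returns the stored vectors closest to a query vector. The pure-Python sketch below does this by exhaustively comparing the query with every vector, which is the exact search that FAISS's flat indexes perform with far more optimised code. It is a stand-in for illustration only: the function names are hypothetical, and the index type the project actually uses is not stated.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, vectors, k=2):
    """Return the k closest (index, distance) pairs, nearest first."""
    distances = ((l2_distance(query, v), i) for i, v in enumerate(vectors))
    return [(i, d) for d, i in heapq.nsmallest(k, distances)]


# Toy two-dimensional "embeddings"; real ones have hundreds of dimensions.
vectors = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 2.0],
    [3.0, 3.0],
]
hits = nearest_neighbors([0.9, 0.1], vectors, k=2)
hit_indices = [i for i, _ in hits]  # [1, 0]
```

The returned indices are what a metadata file would be keyed on: each index maps back to the source document and text chunk that produced the vector.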
- Python 3.8 or higher
- pip package manager
git clone https://github.com/ikko/streamlit_about.git
cd streamlit_about
pip install -r requirements.txt
bash start.sh
This will launch the Streamlit app locally.
streamlit_about/
├── accomplishments/ # Content related to achievements
├── backend/ # Backend logic and processing
├── dynamic_pages/ # Dynamically generated pages
├── images/ # Static images used in the app
├── pages/ # AI-generated pages
├── Knowledge_Base.py # Handles data ingestion and indexing
├── rag_bm25result.pickle # Serialized BM25 index
├── rag_openai_faiss.index # FAISS vector index
├── rag_openai_metadata.json # Metadata for indexed documents
├── requirements.txt # Python dependencies
├── start.sh # Startup script for the app
├── style.css # Custom CSS styling
└── README.md # Project documentation
- Hybrid Retrieval Techniques: Combining BM25 and vector search offers a balance between keyword matching and semantic understanding.
- Modular Architecture: Facilitates experimentation with different components, such as swapping out the vector store or language model.
- Streamlit Integration: Demonstrates how to build user-friendly interfaces for complex AI systems without extensive frontend development.
- Scalability Considerations: The separation of backend logic and frontend presentation allows for scaling individual components as needed.
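For the hybrid retrieval point above, one simple way to merge a BM25 ranking with a vector-search ranking is reciprocal rank fusion (RRF), which needs only the rank positions and not comparable scores. This is a swapped-in technique for illustration: the README does not say how the project combines the two result lists, and the document ids below are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); k damps the effect of top ranks.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically by id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_ranking = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword results
vector_ranking = ["doc_b", "doc_d", "doc_a"]  # hypothetical semantic results
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
# fused == ["doc_b", "doc_a", "doc_d", "doc_c"]
```

Documents that appear in both lists rise to the top, while documents found by only one retriever are still kept, which is the balance between keyword matching and semantic understanding that hybrid retrieval aims for.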
For any inquiries or contributions, feel free to reach out via LinkedIn.
Copyright (c) 2025 Miklós Béky
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.