When we interact with AI agents—like chatbots, assistants, or content generators—we usually expect fast and natural responses. In traditional response generation, the agent processes the input, generates the entire reply internally, and only then sends the complete result to the user. This can feel slow, especially with large responses.
Streaming in AI agents is a technique where the response is sent back piece-by-piece (usually token-by-token) as it is being generated by the model. Instead of waiting for the full output, users can see the response appear in real time—just like watching someone type it out.
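The contrast between the two approaches can be illustrated with plain Python generators; no API calls are involved, and the token list below is made up purely for the demo:

```python
def generate_full_response(prompt):
    # Traditional approach: the whole reply is assembled internally,
    # and nothing is returned until generation finishes.
    tokens = ["Streaming", " lets", " users", " read", " as", " the", " model", " writes."]
    return "".join(tokens)

def generate_streaming_response(prompt):
    # Streaming approach: yield each token as soon as it is "generated",
    # so the caller can display it immediately.
    tokens = ["Streaming", " lets", " users", " read", " as", " the", " model", " writes."]
    for token in tokens:
        yield token

# The streamed pieces reassemble into the same reply, but the first
# token is available right away instead of after the full generation.
streamed = "".join(generate_streaming_response("demo"))
assert streamed == generate_full_response("demo")
```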
This approach is particularly valuable when:
- Working with large language models (LLMs) like GPT-4, Gemini, or Claude.
- Building real-time user experiences such as chat applications, AI copilots, or live data summarizers.
- Integrating with UI frameworks that support incremental updates (e.g., Streamlit, React, WebSockets).
With streaming:
- Results are delivered as they are generated, so perceived waiting time drops and the user experience feels faster and more responsive.
- Developers can build interactive, engaging applications, and agents become more conversational, mimicking real-time communication with a human. This feels natural in chat interfaces and suits interactive assistants, chatbots, and collaborative agents.
- The frontend can render each part of the response incrementally, which is useful for building loaders, typing indicators, and other real-time displays.
- Immediate feedback lets users interrupt or adjust their input early, reducing unnecessary token generation on user-initiated stops.
- Streaming is supported by frameworks such as LangChain and FastAPI, and by the OpenAI API through its streaming endpoints.
- It works with WebSockets or server-sent events (SSE) for seamless real-time communication.
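As a rough sketch of the SSE transport mentioned above, each text chunk can be framed as a `data:` message; the `[DONE]` sentinel mirrors the convention used by OpenAI's streaming endpoints and is assumed here for illustration:

```python
def to_sse(chunks):
    """Frame each text chunk as a server-sent event (SSE) message."""
    for chunk in chunks:
        # SSE messages are "data: <payload>" terminated by a blank line.
        yield f"data: {chunk}\n\n"
    # Signal the end of the stream so the client knows to stop listening.
    yield "data: [DONE]\n\n"

frames = list(to_sse(["Hello", ", world"]))
```

In a real app these frames would be written to an open HTTP response (e.g. a FastAPI `StreamingResponse`) rather than collected into a list.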
This project demonstrates:
- Token-by-token streaming of responses using the OpenAI API.
- Real-time UI integration with a typing animation.
- Cancelling or interrupting ongoing responses.
- A multi-agent streaming pipeline (for tool-using agents).
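Interrupting an in-flight response, one of the features listed above, can be sketched with a plain generator. The `should_stop` callback below is a hypothetical stand-in for a real "stop" button in the UI:

```python
def stream_tokens():
    # Stand-in for a model's token stream.
    for token in ["This ", "reply ", "can ", "be ", "stopped ", "early."]:
        yield token

def render_until_stopped(stream, should_stop):
    shown = []
    for token in stream:
        shown.append(token)  # in a real app: append the token to the UI
        if should_stop(shown):
            break  # user pressed "stop": no further tokens are consumed
    return "".join(shown)

# Stop once three chunks have been rendered.
partial = render_until_stopped(stream_tokens(), lambda shown: len(shown) >= 3)
```

Because the generator is abandoned at the `break`, the remaining tokens are never produced, which is what saves generation cost on user-initiated stops.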
Before getting started, make sure you have the following:
- Python 3.8+ installed on your system.
- Basic knowledge of Python programming.
- Familiarity with virtual environments and dependency management.
- Git installed to clone the repository.
- 🤖 Custom AI Agent: Built using the Agent and Runner pattern for modularity and scalability.
- 🌐 Google Gemini 2.0 Flash Integration: Utilizes Google's API for access to the powerful Gemini language model.
- 🔑 Secure API Key Management: Employs `python-dotenv` for safe and convenient API key handling.
- 🧠 Dynamic Prompt Handling: Processes and responds effectively to a wide range of user prompts.
- ⚡ Asynchronous Communication: Leverages `AsyncOpenAI` for optimized asynchronous communication with OpenAI, improving speed and responsiveness.
- 🐍 Clean Python Codebase: Maintained with a focus on readability, modularity, and best practices.
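The asynchronous consumption pattern can be sketched with `asyncio`. The `fake_stream` below is only a stand-in for the async iterator that `AsyncOpenAI`'s streaming call returns; the point is the `async for` loop, which lets the app handle each delta as it arrives without blocking:

```python
import asyncio

async def fake_stream():
    # Stand-in for the async iterator returned by a streaming chat call;
    # each item represents a small text delta from the model.
    for delta in ["Async ", "streaming ", "keeps ", "the ", "app ", "responsive."]:
        await asyncio.sleep(0)  # yield control, as real network reads would
        yield delta

async def consume(stream):
    parts = []
    async for delta in stream:
        parts.append(delta)  # in a real app: render each delta immediately
    return "".join(parts)

result = asyncio.run(consume(fake_stream()))
```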
- Clone the Repository:
git clone https://github.com/waheed444/OpenAI_SDK_Streaming_Agent.git
cd OpenAI_SDK_Streaming_Agent
- Create and Activate a Virtual Environment:
python -m venv venv
source venv/bin/activate # For Windows: venv\Scripts\activate
- Install Dependencies:
pip install -r requirements.txt
or install them directly:
pip install openai-agents python-dotenv
(If a requirements file is not available, check `pyproject.toml` for dependency instructions.)
- Set up the `.env` file:
Create a `.env` file in the root directory and add your Gemini API key:
GIMINI_API_KEY = your_actual_gemini_api_key_here
Note: Replace `your_actual_gemini_api_key_here` with your actual Gemini API key.
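For illustration, this stdlib-only sketch mimics what `python-dotenv`'s `load_dotenv()` does for the single line above; the real project should simply call `load_dotenv()` and then read the variable with `os.getenv`:

```python
import os

# Simulate load_dotenv() for one line of the .env file.
line = "GIMINI_API_KEY = your_actual_gemini_api_key_here"
key, _, value = (part.strip() for part in line.partition("="))
os.environ.setdefault(key, value)

# This is the lookup the agent performs at startup.
api_key = os.getenv("GIMINI_API_KEY")
if api_key is None:
    raise RuntimeError("GIMINI_API_KEY is not set; check your .env file")
```

Failing fast with a clear error when the key is missing saves a confusing authentication failure later.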
- Run the Agent:
After setting up the project, run the AI agent by executing the `main.py` file. For example:
python main.py # For uv users: uv run main.py
This project is licensed under the MIT License - see the LICENSE file for details.
We welcome contributions to improve this project! Please follow these steps:
- Fork the repository.
- Create a new branch:
git checkout -b feature-name
- Make your changes and ensure they adhere to the project's coding style and best practices.
- Commit your changes:
git commit -m "Add feature"
- Push to the branch:
git push origin feature-name
- Submit a pull request with a clear description of your changes and their benefits.
If you find any issues or want to improve this project, feel free to open a GitHub issue or submit a pull request.
This repo is only for learning and exploring new things, so feel free to fork it, explore, and share your suggestions!
Star ⭐ the repo if it helps you!