1 unstable release
| 0.1.0 | Oct 25, 2025 |
|---|
#910 in Network programming
144 downloads per month
130KB
2K
SLoC
Rafka
A High-Performance Distributed Message Broker Built in Rust
Rafka is a blazing-fast, experimental distributed asynchronous message broker inspired by Apache Kafka. Built with Rust and leveraging Tokio's async runtime, it delivers exceptional performance through its peer-to-peer mesh architecture and custom in-memory database for unparalleled scalability and low-latency message processing.
๐ Key Features
- High-Performance Async Architecture: Built on Tokio for maximum concurrency and throughput
- gRPC Communication: Modern protocol buffers for efficient inter-service communication
- Partitioned Message Processing: Hash-based partitioning for horizontal scalability
- In-Memory Storage Engine: Custom-built storage with retention policies and metrics
- Offset Tracking: Consumer offset management for reliable message delivery
- Retention Policies: Configurable message retention based on age and size
- Real-time Metrics: Built-in monitoring and performance metrics
- Modular Design: Clean separation of concerns across multiple crates
๐๏ธ Architecture Overview
System Architecture Diagram
graph TB
subgraph "Client Layer"
P[Producer]
C[Consumer]
end
subgraph "Broker Cluster"
B1[Broker 1<br/>Partition 0]
B2[Broker 2<br/>Partition 1]
B3[Broker 3<br/>Partition 2]
end
subgraph "Storage Layer"
S1[In-Memory DB<br/>Partition 0]
S2[In-Memory DB<br/>Partition 1]
S3[In-Memory DB<br/>Partition 2]
end
P -->|gRPC Publish| B1
P -->|gRPC Publish| B2
P -->|gRPC Publish| B3
B1 -->|Store Messages| S1
B2 -->|Store Messages| S2
B3 -->|Store Messages| S3
C -->|gRPC Consume| B1
C -->|gRPC Consume| B2
C -->|gRPC Consume| B3
B1 -->|Broadcast Stream| C
B2 -->|Broadcast Stream| C
B3 -->|Broadcast Stream| C
Message Flow Sequence Diagram
sequenceDiagram
participant P as Producer
participant B as Broker
participant S as Storage
participant C as Consumer
P->>B: PublishRequest(topic, key, payload)
B->>B: Hash key for partition
B->>B: Check partition ownership
B->>S: Store message with offset
S-->>B: Return offset
B->>B: Broadcast to subscribers
B-->>P: PublishResponse(message_id, offset)
C->>B: ConsumeRequest(topic)
B->>B: Create broadcast stream
B-->>C: ConsumeResponse stream
loop Message Processing
B->>C: ConsumeResponse(message)
C->>B: AcknowledgeRequest(message_id)
C->>B: UpdateOffsetRequest(offset)
end
๐ Project Structure
rafka/
โโโ Cargo.toml # Workspace manifest
โโโ config/
โ โโโ config.yml # Configuration file
โโโ scripts/ # Demo and utility scripts
โ โโโ helloworld.sh # Basic producer-consumer demo
โ โโโ partitioned_demo.sh # Multi-broker partitioning demo
โ โโโ retention_demo.sh # Message retention demo
โ โโโ offset_tracking_demo.sh # Consumer offset tracking demo
โ โโโ kill.sh # Process cleanup script
โโโ src/
โ โโโ bin/ # Executable binaries
โ โโโ start_broker.rs # Broker server
โ โโโ start_producer.rs # Producer client
โ โโโ start_consumer.rs # Consumer client
โ โโโ check_metrics.rs # Metrics monitoring
โโโ crates/ # Core library crates
โ โโโ core/ # Core types and gRPC definitions
โ โ โโโ src/
โ โ โ โโโ lib.rs
โ โ โ โโโ message.rs # Message structures
โ โ โ โโโ proto/
โ โ โ โโโ rafka.proto # gRPC service definitions
โ โ โโโ build.rs # Protocol buffer compilation
โ โโโ broker/ # Broker implementation
โ โ โโโ src/
โ โ โโโ lib.rs
โ โ โโโ broker.rs # Core broker logic
โ โโโ producer/ # Producer implementation
โ โ โโโ src/
โ โ โโโ lib.rs
โ โ โโโ producer.rs # Producer client
โ โโโ consumer/ # Consumer implementation
โ โ โโโ src/
โ โ โโโ lib.rs
โ โ โโโ consumer.rs # Consumer client
โ โโโ storage/ # Storage engine
โ โโโ src/
โ โโโ lib.rs
โ โโโ db.rs # In-memory database
โโโ docs/
โ โโโ getting_started.md # Getting started guide
โโโ tasks/
โ โโโ Roadmap.md # Development roadmap
โโโ Dockerfile # Container configuration
โโโ LICENSE # MIT License
๐ Quick Start
Prerequisites
- Rust: Latest stable version (1.70+)
- Cargo: Comes with Rust installation
- Protocol Buffers: For gRPC compilation
Installation
- Clone the repository:
git clone https://github.com/yourusername/rafka.git
cd rafka
- Build the project:
cargo build --release
- Run the basic demo:
./scripts/helloworld.sh
Manual Setup
- Start a broker:
cargo run --bin start_broker -- --port 50051 --partition 0 --total-partitions 3
- Start a consumer:
cargo run --bin start_consumer -- --port 50051
- Send messages:
cargo run --bin start_producer -- --message "Hello, Rafka!" --key "test-key"
๐ง Configuration
Broker Configuration
The broker can be configured via command-line arguments:
cargo run --bin start_broker -- \
--port 50051 \
--partition 0 \
--total-partitions 3 \
--retention-seconds 604800
Available Options:
--port: Broker listening port (default: 50051)--partition: Partition ID for this broker (default: 0)--total-partitions: Total number of partitions (default: 1)--retention-seconds: Message retention time in seconds (default: 7 days)
Configuration File
Edit config/config.yml for persistent settings:
server:
host: "127.0.0.1"
port: 9092
log:
level: "info" # debug, info, warn, error
broker:
replication_factor: 3
default_topic_partitions: 1
storage:
type: "in_memory"
๐๏ธ Core Components
1. Core (rafka-core)
Purpose: Defines fundamental types and gRPC service contracts.
Key Components:
- Message Structures:
Message,MessageAck,BenchmarkMetrics - gRPC Definitions: Protocol buffer definitions for all services
- Serialization: Serde-based serialization for message handling
Key Files:
message.rs: Core message types and acknowledgment structuresproto/rafka.proto: gRPC service definitions
2. Broker (rafka-broker)
Purpose: Central message routing and coordination service.
Key Features:
- Partition Management: Hash-based message partitioning
- Topic Management: Dynamic topic creation and subscription
- Broadcast Channels: Efficient message distribution to consumers
- Offset Tracking: Consumer offset management
- Retention Policies: Configurable message retention
- Metrics Collection: Real-time performance metrics
Key Operations:
publish(): Accept messages from producersconsume(): Stream messages to consumerssubscribe(): Register consumer subscriptionsacknowledge(): Process message acknowledgmentsupdate_offset(): Track consumer progress
3. Producer (rafka-producer)
Purpose: Client library for publishing messages to brokers.
Key Features:
- Connection Management: Automatic broker connection handling
- Message Publishing: Reliable message delivery with acknowledgments
- Error Handling: Comprehensive error reporting
- UUID Generation: Unique message identification
Usage Example:
let mut producer = Producer::new("127.0.0.1:50051").await?;
producer.publish("my-topic".to_string(), "Hello World".to_string(), "key-1".to_string()).await?;
4. Consumer (rafka-consumer)
Purpose: Client library for consuming messages from brokers.
Key Features:
- Subscription Management: Topic subscription handling
- Stream Processing: Asynchronous message streaming
- Automatic Acknowledgment: Built-in message acknowledgment
- Offset Tracking: Automatic offset updates
- Channel-based API: Clean async/await interface
Usage Example:
let mut consumer = Consumer::new("127.0.0.1:50051").await?;
consumer.subscribe("my-topic".to_string()).await?;
let mut rx = consumer.consume("my-topic".to_string()).await?;
while let Some(message) = rx.recv().await {
println!("Received: {}", message);
}
5. Storage (rafka-storage)
Purpose: High-performance in-memory storage engine.
Key Features:
- Partition-based Storage: Separate queues per partition
- Retention Policies: Age and size-based message retention
- Offset Management: Efficient offset tracking and retrieval
- Acknowledgment Tracking: Consumer acknowledgment management
- Metrics Collection: Storage performance metrics
- Memory Optimization: Efficient memory usage with cleanup
Storage Architecture:
graph LR
subgraph "Storage Engine"
T[Topic]
P1[Partition 0]
P2[Partition 1]
P3[Partition 2]
T --> P1
T --> P2
T --> P3
P1 --> Q1[Message Queue]
P2 --> Q2[Message Queue]
P3 --> Q3[Message Queue]
end
๐ Message Flow
Publishing Flow
- Producer sends
PublishRequestto Broker - Broker hashes the message key to determine partition
- Broker checks partition ownership
- Broker stores message in Storage with unique offset
- Broker broadcasts message to subscribed consumers
- Broker returns
PublishResponsewith message ID and offset
Consumption Flow
- Consumer sends
ConsumeRequestto Broker - Broker creates broadcast stream for the topic
- Broker streams messages via gRPC to Consumer
- Consumer processes message and sends acknowledgment
- Consumer updates offset to track progress
- Storage cleans up acknowledged messages based on retention policy
๐ Performance Features
Partitioning Strategy
Rafka uses hash-based partitioning for efficient message distribution:
fn hash_key(&self, key: &str) -> u32 {
key.bytes().fold(0u32, |acc, b| acc.wrapping_add(b as u32))
}
fn owns_partition(&self, message_key: &str) -> bool {
let hash = self.hash_key(message_key);
hash % self.total_partitions == self.partition_id
}
Retention Policies
Configurable message retention based on:
- Time-based: Maximum age (default: 7 days)
- Size-based: Maximum storage size (default: 1GB)
Metrics Collection
Built-in metrics for monitoring:
- Total messages stored
- Total bytes consumed
- Oldest message age
- Consumer offset positions
๐งช Demo Scripts
1. Hello World Demo
./scripts/helloworld.sh
Basic producer-consumer interaction demonstration.
2. Partitioned Demo
./scripts/partitioned_demo.sh
Multi-broker setup with hash-based partitioning.
3. Retention Demo
./scripts/retention_demo.sh
Demonstrates message retention policies.
4. Offset Tracking Demo
./scripts/offset_tracking_demo.sh
Shows consumer offset management and recovery.
๐ ๏ธ Development
Building from Source
# Clone repository
git clone https://github.com/yourusername/rafka.git
cd rafka
# Build all crates
cargo build
# Run tests
cargo test
# Build release version
cargo build --release
Running Tests
# Run all tests
cargo test
# Run specific crate tests
cargo test -p rafka-storage
cargo test -p rafka-broker
Code Structure
The project follows Rust best practices with:
- Workspace Organization: Multiple crates in a single workspace
- Separation of Concerns: Each component in its own crate
- Async/Await: Modern async Rust with Tokio
- Error Handling: Comprehensive error types and handling
- Testing: Unit tests for all major components
๐ง Current Status
โ ๏ธ Early Development - Not Production Ready
Rafka is currently in active development. The current implementation provides:
โ Completed Features:
- Basic message publishing and consumption
- Hash-based partitioning
- In-memory storage with retention policies
- Consumer offset tracking
- gRPC-based communication
- Metrics collection
- Demo scripts and examples
๐ In Progress:
- Peer-to-peer mesh networking
- Distributed consensus algorithms
- Kubernetes deployment configurations
- Performance optimizations
๐ Planned Features:
- Replication across multiple brokers
- Fault tolerance and recovery
- Security and authentication
- Client SDKs for multiple languages
- Comprehensive monitoring and alerting
๐ค Contributing
We welcome contributions! Here are some areas where you can help:
High Priority
- P2P Mesh Implementation: Distributed node discovery and communication
- Consensus Algorithms: Leader election and cluster coordination
- Replication: Cross-broker message replication
- Fault Tolerance: Node failure detection and recovery
Medium Priority
- Performance Optimization: Message batching and compression
- Security: TLS encryption and authentication
- Monitoring: Prometheus metrics and Grafana dashboards
- Documentation: API documentation and tutorials
Getting Started
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
๐ License
This project is licensed under the MIT License - see the LICENSE file for details.
๐ Acknowledgments
- Apache Kafka for inspiration on messaging systems
- Tokio for the excellent async runtime
- Tonic for gRPC implementation
- @wyattgill9 for the initial proof of concept
- The Rust community for their excellent libraries and support
Built with โค๏ธ in Rust
Dependencies
~11โ16MB
~275K SLoC