Built a real-time analytics platform that processes user behavior events, aggregates them into time-series data, and presents actionable insights through an interactive dashboard.
The pipeline uses Kafka Streams for real-time processing and ClickHouse for analytical queries.
Architecture
Lambda architecture with a batch layer and a speed layer. Kafka Streams performs the real-time aggregation in the speed layer, and ClickHouse serves the OLAP queries.
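As a rough illustration of how batch and speed layers can be combined at query time, here is a minimal Python sketch. It is not the project's code. The function name `merge_views`, the bucket-to-count dictionaries, and the `batch_cutoff` parameter are all assumptions made for the example. The batch view is treated as authoritative for every time bucket before the cutoff, and the speed view fills in everything from the cutoff onward.

```python
def merge_views(batch_view, speed_view, batch_cutoff):
    """Merge two {bucket_start: count} views into one time series.

    batch_view   -- counts recomputed by the batch layer (hypothetical shape)
    speed_view   -- counts maintained incrementally by the speed layer
    batch_cutoff -- first bucket the batch layer has NOT yet processed
    """
    merged = {}
    # Buckets before the cutoff come from the batch layer.
    for bucket, count in batch_view.items():
        if bucket < batch_cutoff:
            merged[bucket] = count
    # Buckets at or after the cutoff come from the speed layer.
    for bucket, count in speed_view.items():
        if bucket >= batch_cutoff:
            merged[bucket] = count
    # Return buckets in chronological order.
    return dict(sorted(merged.items()))
```

With this rule, a bucket present in both views is never double-counted: exactly one layer owns it, depending on which side of the cutoff it falls.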
Key Challenges
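The main challenge was handling late-arriving events in windowed aggregations. Implemented watermark-based event-time processing, so that events are assigned to windows by when they occurred rather than when they arrived.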
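The idea behind watermark-based windowing can be sketched in a few lines of library-independent Python. This is an illustration under stated assumptions, not the pipeline's implementation: the class `TumblingWindowCounter`, its parameters, and the integer-second timestamps are hypothetical. The watermark here is the highest event time seen so far minus an allowed lateness. A window is closed once the watermark passes its end, and events for already-closed windows are dropped. Kafka Streams itself expresses the same idea through stream time and a window grace period rather than explicit watermark messages.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per fixed-size event-time window (hypothetical helper)."""

    def __init__(self, window_size, allowed_lateness):
        self.window_size = window_size
        self.allowed_lateness = allowed_lateness
        self.max_event_time = None
        self.open_windows = defaultdict(int)  # window start -> running count
        self.closed = {}                      # window start -> final count
        self.dropped = 0                      # events that arrived too late

    @property
    def watermark(self):
        # No events yet: nothing can be considered late.
        if self.max_event_time is None:
            return float("-inf")
        return self.max_event_time - self.allowed_lateness

    def process(self, event_time):
        window_start = event_time - (event_time % self.window_size)
        window_end = window_start + self.window_size
        # The window for this event has already been finalized: drop it.
        if window_end <= self.watermark:
            self.dropped += 1
            return
        self.open_windows[window_start] += 1
        # Only an event newer than anything seen so far advances the watermark.
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        self._close_expired()

    def _close_expired(self):
        # sorted() copies the keys, so removing entries while looping is safe.
        for start in sorted(self.open_windows):
            if start + self.window_size <= self.watermark:
                self.closed[start] = self.open_windows.pop(start)
```

The allowed lateness is the trade-off knob in this sketch: a larger value admits more out-of-order events but delays the moment a window's result becomes final.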
Scaling Decisions
Kafka topics are partitioned by event type. ClickHouse materialized views pre-aggregate data as it is inserted. Consumers scale horizontally.
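To make the partitioning and consumer scaling concrete, here is a small Python sketch of the two mappings involved: event type to partition, and partition to consumer. It is an approximation for illustration only. The function names are hypothetical, `zlib.crc32` stands in for the hash that Kafka's default partitioner applies to the record key, and the round-robin assignment is a simplified stand-in for a consumer group's partition assignor.

```python
import zlib


def partition_for(event_type, num_partitions):
    """Map an event type (used as the record key) to a partition.

    crc32 is an assumption for this sketch; it only needs to be a stable
    hash so that the same key always maps to the same partition.
    """
    return zlib.crc32(event_type.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions across consumers round-robin (simplified)."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

Keying by event type keeps all events of one type on a single partition, which preserves their relative order. It also means the load on each partition follows the volume of the event types mapped to it, and adding consumers helps only up to the number of partitions.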