IoT Infrastructure on Kubernetes
Kubernetes · Kafka · Kong · A.Y. 2025/2026
A cloud-native, event-driven microservices architecture for industrial IoT monitoring, built and deployed on Kubernetes. Course project for Cloud Computing Technologies (A.Y. 2025/2026).
Architecture Overview
The system is organized into three logical namespaces - `kong`, `kafka`, and `metrics` - following a strict Separation of Concerns principle. Data flows from edge sensors through an API Gateway, into a persistent streaming layer, and finally into an optimized time-series database.
The event-driven design fully decouples ingestion from processing: if the consumer slows down or goes offline, messages accumulate durably in Kafka and are processed upon recovery - with zero data loss.
Key Components
Kong API Gateway - Edge Security
Kong acts as the single entry point for all sensor traffic, implementing the Gateway Offloading pattern:
- Authentication: API Key validation at the edge - invalid requests are rejected before reaching any backend service
- Rate Limiting: 5 requests/second per client, returning `429 Too Many Requests` on excess
- Load Balancing: Round-robin distribution across Producer replicas
The entire configuration is Infrastructure as Code via Kubernetes CRDs - no manual UI, fully reproducible.
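As an illustration of that IaC approach, the edge policies above can be declared with the Kong Ingress Controller's `KongPlugin` CRD; a minimal sketch follows, where the resource names and the `kong` namespace are assumptions:

```yaml
# Hypothetical sketch: Kong edge policies as Kubernetes CRDs.
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: sensor-key-auth        # assumed name
  namespace: kong
plugin: key-auth               # rejects requests without a valid API key
---
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: sensor-rate-limit      # assumed name
  namespace: kong
plugin: rate-limiting
config:
  second: 5                    # 5 requests/second per client
  policy: local
```

Plugins declared this way are attached to a route via the `konghq.com/plugins` annotation on the Ingress, so the whole edge configuration lives in version control.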
Apache Kafka - Reliable Streaming
Kafka runs in KRaft mode (no ZooKeeper), reducing operational complexity and attack surface. Two dedicated topics with distinct configurations handle different QoS requirements:
| Topic | Partitions | Compression | Retention | Guarantee |
|---|---|---|---|---|
| `sensor-telemetry` | 3 | LZ4 | 7 days | At-least-once |
| `sensor-alerts` | 2 | None | 30 days | Zero data loss (`min.insync.replicas: 2`) |
LZ4 compression reduces JSON network traffic by up to 60% with minimal CPU overhead.
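The table above maps directly onto topic-level configuration; a sketch with the standard `kafka-topics.sh` CLI follows, where the bootstrap address and replication factor are assumptions:

```shell
# Hypothetical sketch: topic provisioning (broker address and RF are assumptions).
kafka-topics.sh --bootstrap-server kafka:9092 \
  --create --topic sensor-telemetry \
  --partitions 3 --replication-factor 3 \
  --config compression.type=lz4 \
  --config retention.ms=604800000       # 7 days

kafka-topics.sh --bootstrap-server kafka:9092 \
  --create --topic sensor-alerts \
  --partitions 2 --replication-factor 3 \
  --config min.insync.replicas=2 \
  --config retention.ms=2592000000      # 30 days
```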
MongoDB - Time Series Storage
Data is stored using MongoDB’s native Time Series Collections, which provide:
- Automatic Zstd compression (~70% storage reduction on historical data)
- Clustered indexes on `timestamp` for efficient range queries
- TTL auto-expiry at 30 days (`expireAfterSeconds: 2592000`)
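In pymongo these settings are passed to `create_collection`; a minimal sketch follows, where the collection name `telemetry` and the field names `timestamp` / `sensor_id` are assumptions:

```python
# Hypothetical sketch: options for MongoDB's create_collection(), mirroring
# the settings above. Collection and field names are assumptions.
def timeseries_options(retention_days: int = 30) -> dict:
    return {
        "timeseries": {
            "timeField": "timestamp",  # clustered index key for range queries
            "metaField": "sensor_id",  # groups measurements per sensor
            "granularity": "minutes",
        },
        "expireAfterSeconds": retention_days * 24 * 3600,  # TTL auto-expiry
    }

# Against a live deployment this would be:
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://mongodb:27017")["iot"]
#   db.create_collection("telemetry", **timeseries_options())
```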
MongoDB is deployed as a StatefulSet to ensure stable pod identity and persistent volume reattachment across crashes.
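The StatefulSet relies on `volumeClaimTemplates` so the same PVC is re-attached to the stable pod identity (`mongodb-0`) after a restart; a condensed sketch follows, where names, namespace, image tag, and storage size are assumptions:

```yaml
# Hypothetical sketch of the MongoDB StatefulSet (names and sizes are assumptions).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
  namespace: metrics
spec:
  serviceName: mongodb
  replicas: 1
  selector:
    matchLabels: {app: mongodb}
  template:
    metadata:
      labels: {app: mongodb}
    spec:
      containers:
        - name: mongodb
          image: mongo:7
          ports: [{containerPort: 27017}]
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:          # PVC survives crashes and re-attaches to mongodb-0
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi
```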
Python Microservices
Three Python services (two stateless, one stateful) handle the full pipeline:
- Producer - Flask REST API; enriches each message with a UUID idempotency token and Kafka topic routing
- Consumer - Stateful worker (3 replicas, one per partition); normalizes and persists data to MongoDB
- Metrics Service - Exposes five REST analytics endpoints, including temperature averages by zone, alert breakdown, 7-day activity trend, and firmware status
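The Producer's enrichment step can be sketched as a pure function that attaches the idempotency token and selects the target topic; field names and the alert-routing rule are assumptions, not the project's actual schema:

```python
import json
import time
import uuid

# Hypothetical sketch of the Producer's enrichment step: attach a UUID
# idempotency token and route the message to a Kafka topic.
def enrich(message: dict) -> tuple[str, bytes]:
    enriched = {
        **message,
        "idempotency_key": str(uuid.uuid4()),  # lets the Consumer deduplicate retries
        "ingested_at": time.time(),
    }
    # Assumed routing rule: alerts go to the zero-data-loss topic,
    # everything else to the high-volume telemetry topic.
    topic = "sensor-alerts" if message.get("alert") else "sensor-telemetry"
    return topic, json.dumps(enriched).encode("utf-8")

# In the Flask handler this would feed a KafkaProducer, e.g.:
#   topic, payload = enrich(request.get_json())
#   producer.send(topic, payload)
```

Keeping enrichment separate from the Kafka client makes it unit-testable without a broker.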
Non-Functional Properties
| Property | Implementation |
|---|---|
| Security | TLS on port 9093, SASL/SCRAM-SHA-512, Kubernetes Secrets (no hardcoded credentials) |
| Fault Tolerance | Kafka as async buffer; consumer crash → messages preserved → zero data loss on recovery |
| Self-Healing | Kubernetes restarts crashed pods automatically |
| Scalability | HPA configured at 50% CPU threshold; scales Producer 1→4 replicas under load |
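The scalability row corresponds to a standard `autoscaling/v2` HorizontalPodAutoscaler; a minimal sketch follows, where the resource names are assumptions:

```yaml
# Hypothetical sketch of the Producer HPA (names are assumptions).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: producer-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: producer
  minReplicas: 1
  maxReplicas: 4                 # scales 1 -> 4 replicas under load
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50 # scale out above 50% average CPU
```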