Building the Foundation: Livepeer NaaP Analytics
I'm excited to share an update on what the Livepeer Cloud SPE has been working on: our treasury proposal for the Network-as-a-Product (NaaP) MVP (SLA Metrics, Analytics, and Public Infrastructure) has passed, and we're deep into Milestone 1.
- Treasury Proposal (On-Chain): Livepeer Explorer
- Forum Discussion: Metrics and SLA Foundations for NaaP
The Problem: You Can't Optimize What You Can't Measure
Livepeer's AI video infrastructure is growing fast. Gateways route inference jobs to orchestrators. GPUs process real-time video streams. The network is doing work.
But here's the challenge: we lack a shared, network-wide view of performance, reliability, and demand that participants can use to assess Livepeer for production use.
Right now, there's no centralized way to answer questions like:
- What's the average prompt-to-first-frame latency for a given orchestrator?
- Which GPUs are delivering consistent 20+ FPS performance?
- What's the jitter coefficient across the network under load?
- Are orchestrators meeting SLA targets for uptime and reliability?
Without this data, we're flying blind. Gateway providers can't set SLAs. Orchestrators can't benchmark themselves. And external developers struggle to evaluate Livepeer for serious workloads.
This proposal changes that.
What We're Building
The NaaP MVP delivers a focused, end-to-end metrics system for observability and learning. It's designed to make the Livepeer network measurable, comparable, and trustworthy.
The Architecture
┌────────────────────────────────────────────────────────────────┐
│                        LIVEPEER NETWORK                        │
│    ┌──────────────┐                  ┌──────────────┐          │
│    │   Gateway    │                  │ Orchestrator │          │
│    │  (Daydream)  │                  │  (GPU Node)  │          │
│    └──────┬───────┘                  └──────┬───────┘          │
│           │ Events                          │ Events           │
└───────────┼─────────────────────────────────┼──────────────────┘
            └────────────────┬────────────────┘
                             │
                  ┌──────────▼──────────┐
                  │        KAFKA        │  ← Event streaming backbone
                  └──────────┬──────────┘
                             │
                  ┌──────────▼──────────┐
                  │    APACHE FLINK     │  ← Stream processing & correlation
                  └──────────┬──────────┘
                             │
                  ┌──────────┴──────────┐
                  ▼                     ▼
            ┌────────────┐         ┌───────────┐
            │ ClickHouse │         │   MinIO   │
            │   (hot)    │         │  (cold)   │
            └─────┬──────┘         └───────────┘
                  │
          ┌───────┴─────────────────┐
          ▼                         ▼
    ┌────────────┐        ┌───────────────────┐
    │  Grafana   │        │    Public APIs    │
    │ Dashboard  │        │   /gpu/metrics    │
    └────────────┘        │  /sla/compliance  │
                          └─────────┬─────────┘
                                    │
                                    ▼
                      ┌───────────────────────────┐
                      │      Apps & Gateways      │
                      │ (Orchestrator Selection)  │
                      └───────────────────────────┘
The Stack
| Component | Role |
|---|---|
| Apache Kafka 3.9 | Durable event log; ingests all streaming events |
| Apache Flink 1.20 | Parses, transforms, and correlates events in real time |
| ClickHouse 24.11 | Fast columnar database for analytics queries |
| MinIO | S3-compatible cold storage for audit trail and replay |
| Grafana 11.x | Visualization layer; dashboards for operators and the network |
This isn't a toy. It's designed to handle 1000+ events per second, with sub-second query latency and 90-day retention.
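To give a feel for what flows through this pipeline, here is a minimal sketch of a streaming event as it might be serialized before being published to Kafka. The struct fields, event type name, and values are illustrative assumptions, not the project's actual schema; the real event definitions live in the Cloud-SPE/livepeer-naap-analytics repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// StreamEvent is a hypothetical shape for one of the streaming events the
// pipeline ingests; the real schemas are defined in the project repository.
type StreamEvent struct {
	EventType      string    `json:"event_type"`      // e.g. "frame_delivered" (assumed name)
	StreamID       string    `json:"stream_id"`       // correlates gateway and orchestrator events
	OrchestratorID string    `json:"orchestrator_id"` // placeholder identifier
	GPUModel       string    `json:"gpu_model"`
	FPS            float64   `json:"fps"`
	LatencyMS      float64   `json:"latency_ms"`
	Timestamp      time.Time `json:"timestamp"`
}

func main() {
	ev := StreamEvent{
		EventType:      "frame_delivered",
		StreamID:       "stream-123",
		OrchestratorID: "orch-example-1",
		GPUModel:       "RTX 4090",
		FPS:            24.3,
		LatencyMS:      187.5,
		Timestamp:      time.Now().UTC(),
	}

	// Serialize to JSON; in the real pipeline a payload like this is produced
	// to a Kafka topic, parsed and correlated by Flink, and stored in ClickHouse.
	payload, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(payload))
}
```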
Proposal Deliverables
The NaaP MVP covers five key areas:
1. Core SLA Metrics (MVP Scope)
A standardized set of network, performance, and reliability metrics sufficient to evaluate orchestrator and GPU behavior across workflows.
2. Network Test & Verification Signals
Reference load-test gateways generating consistent, reproducible performance signals. Public test scenarios captured in GitHub for community verification.
3. Analytics & Aggregation Layer
Lightweight ETL pipelines transforming raw metrics into network-level views. Derived indicators like jitter coefficient, latency percentiles, and uptime scores (a sketch of one such computation follows this list).
4. Public Dashboard & APIs
A standalone public dashboard presenting live and historical metrics. Crucially, read-only APIs that any application can consume for aggregate SLA scores, GPU performance data, and network demand metrics.
5. Operations & Stewardship
Ongoing operation of testing, analytics, and dashboard infrastructure. Maintenance and community support for 1 year.
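To make deliverable 3 concrete, here is a minimal sketch of two of the derived indicators mentioned above: a jitter coefficient computed as the coefficient of variation (standard deviation over mean) of latency samples, and a nearest-rank latency percentile. These formulas are reasonable assumptions for illustration; the exact definitions the SPE ships may differ.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// jitterCoefficient returns stddev/mean of a series of latency samples.
// Assumption: this is one common way to define a "jitter coefficient".
func jitterCoefficient(latenciesMS []float64) float64 {
	if len(latenciesMS) == 0 {
		return 0
	}
	var sum float64
	for _, v := range latenciesMS {
		sum += v
	}
	mean := sum / float64(len(latenciesMS))
	if mean == 0 {
		return 0
	}

	var sqDiff float64
	for _, v := range latenciesMS {
		sqDiff += (v - mean) * (v - mean)
	}
	stddev := math.Sqrt(sqDiff / float64(len(latenciesMS)))
	return stddev / mean
}

// percentile returns the p-th percentile (0-100) using the nearest-rank method.
func percentile(latenciesMS []float64, p float64) float64 {
	if len(latenciesMS) == 0 {
		return 0
	}
	sorted := append([]float64(nil), latenciesMS...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p/100*float64(len(sorted)))) - 1
	if rank < 0 {
		rank = 0
	}
	return sorted[rank]
}

func main() {
	samples := []float64{180, 195, 210, 185, 450, 190, 205}
	fmt.Printf("jitter coefficient: %.3f\n", jitterCoefficient(samples))
	fmt.Printf("p95 latency: %.0f ms\n", percentile(samples, 95))
}
```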
The API Layer: Enabling Smart Orchestrator Selection
This is the part I'm most excited about. The APIs aren't just for dashboards; they're building blocks for the entire ecosystem.
The proposal includes public API endpoints that applications can consume:
| Endpoint | Purpose |
|---|---|
| /gpu/metrics | Real-time per-GPU performance metrics (FPS, latency, jitter) |
| /network/demand | Aggregate network demand and capacity data |
| /sla/compliance | SLA compliance scores for any orchestrator |
| /datasets | Public load-test datasets for verification |
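As one example of how an application might consume these endpoints, the sketch below fetches /gpu/metrics and decodes the response. The base URL, response fields, and field names are assumptions for illustration; the actual API contract will be published alongside the dashboard.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// GPUMetric is a hypothetical response row from /gpu/metrics; the real field
// names will be defined by the published API.
type GPUMetric struct {
	OrchestratorID string  `json:"orchestrator_id"`
	GPUModel       string  `json:"gpu_model"`
	AvgFPS         float64 `json:"avg_fps"`
	P95LatencyMS   float64 `json:"p95_latency_ms"`
	Jitter         float64 `json:"jitter"`
}

// fetchGPUMetrics GETs the /gpu/metrics endpoint from an assumed base URL and
// decodes the JSON array it is presumed to return.
func fetchGPUMetrics(baseURL string) ([]GPUMetric, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(baseURL + "/gpu/metrics")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var metrics []GPUMetric
	if err := json.NewDecoder(resp.Body).Decode(&metrics); err != nil {
		return nil, err
	}
	return metrics, nil
}

func main() {
	// Placeholder base URL; the real endpoint host is not yet published.
	metrics, err := fetchGPUMetrics("https://example.invalid/api/v1")
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	for _, m := range metrics {
		fmt.Printf("%s (%s): %.1f fps, p95 %.0f ms\n",
			m.OrchestratorID, m.GPUModel, m.AvgFPS, m.P95LatencyMS)
	}
}
```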
What This Enables
For Gateway Providers:
- Query real-time SLA scores before routing workloads
- Select high-performing orchestrators automatically
- Avoid underperforming GPUs based on historical data
For Orchestrators:
- Benchmark against network averages
- Identify performance gaps
- Prove their reliability with verifiable data
For Builders:
- Build custom tools on top of the metrics API
- Create specialized dashboards for specific use cases
- Integrate network health into their applications
This is the data layer that will power intelligent workload routing. When a gateway needs to select an orchestrator, it won't be guessing; it'll be querying live performance data.
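As a sketch of what SLA-aware selection could look like on the gateway side, the function below ranks orchestrators by a compliance score of the kind /sla/compliance would expose, filters out candidates below a threshold, and picks the best remaining one. The score fields and threshold are assumptions; real selection logic would also weigh price, region, and capacity.

```go
package main

import (
	"fmt"
	"sort"
)

// SLAScore is a hypothetical per-orchestrator record, e.g. decoded from a
// /sla/compliance response.
type SLAScore struct {
	OrchestratorID string
	UptimePct      float64 // rolling uptime, 0-100
	Compliance     float64 // aggregate SLA compliance score, 0-1
}

// selectOrchestrator drops candidates below minCompliance and returns the
// highest-scoring remaining one, breaking ties by uptime.
func selectOrchestrator(candidates []SLAScore, minCompliance float64) (SLAScore, bool) {
	eligible := make([]SLAScore, 0, len(candidates))
	for _, c := range candidates {
		if c.Compliance >= minCompliance {
			eligible = append(eligible, c)
		}
	}
	if len(eligible) == 0 {
		return SLAScore{}, false
	}
	sort.Slice(eligible, func(i, j int) bool {
		if eligible[i].Compliance != eligible[j].Compliance {
			return eligible[i].Compliance > eligible[j].Compliance
		}
		return eligible[i].UptimePct > eligible[j].UptimePct
	})
	return eligible[0], true
}

func main() {
	candidates := []SLAScore{
		{OrchestratorID: "orch-a", UptimePct: 99.2, Compliance: 0.97},
		{OrchestratorID: "orch-b", UptimePct: 99.9, Compliance: 0.91},
		{OrchestratorID: "orch-c", UptimePct: 97.5, Compliance: 0.88},
	}
	if best, ok := selectOrchestrator(candidates, 0.90); ok {
		fmt.Println("routing workload to", best.OrchestratorID)
	}
}
```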
Timeline & Milestones
Duration: ~6 months (work began November 2025)
| Milestone | Target | Status |
|---|---|---|
| M1: Metrics Collection & Aggregation | February 2026 | 🟡 In Progress |
| M2: Test Signals & Derived Analytics | March 2026 | Upcoming |
| M3: Stabilization & Review | April 2026 | Upcoming |
Milestone 1 Progress (Current Focus)
We've built the initial infrastructure for data ingestion, processing, and deployment:
- ✅ Event ingestion pipeline (Kafka + Flink + ClickHouse)
- ✅ Schema design for all 7 event types
- ✅ Basic Grafana dashboard provisioning
- ✅ End-to-end data flow validated
- 🔄 E2E latency calculation (stream trace correlation; sketched below)
- 🔄 DLQ (Dead Letter Queue) for failed event parsing
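For the in-progress E2E latency work, the core idea is correlating a gateway-side "request sent" event with the "first frame" event that shares the same stream ID, which yields the prompt-to-first-frame latency mentioned earlier. Below is a minimal, stdlib-only sketch of that correlation step; the event kinds, fields, and single-process map are simplifications of what Flink does with state over the live Kafka stream.

```go
package main

import (
	"fmt"
	"time"
)

// TraceEvent is a simplified event carrying only what the correlation needs.
// The kind names are assumptions for illustration.
type TraceEvent struct {
	StreamID  string
	Kind      string // "request_sent" or "first_frame"
	Timestamp time.Time
}

// correlateE2ELatency pairs request_sent and first_frame events by StreamID
// and returns prompt-to-first-frame latency per stream. In the real pipeline
// Flink performs this join over the event stream with state and watermarks;
// this in-memory map is only an illustration of the logic.
func correlateE2ELatency(events []TraceEvent) map[string]time.Duration {
	started := make(map[string]time.Time)
	latencies := make(map[string]time.Duration)
	for _, ev := range events {
		switch ev.Kind {
		case "request_sent":
			started[ev.StreamID] = ev.Timestamp
		case "first_frame":
			if t0, ok := started[ev.StreamID]; ok {
				latencies[ev.StreamID] = ev.Timestamp.Sub(t0)
			}
		}
	}
	return latencies
}

func main() {
	t0 := time.Now()
	events := []TraceEvent{
		{StreamID: "stream-123", Kind: "request_sent", Timestamp: t0},
		{StreamID: "stream-123", Kind: "first_frame", Timestamp: t0.Add(420 * time.Millisecond)},
	}
	for id, d := range correlateE2ELatency(events) {
		fmt.Printf("%s: prompt-to-first-frame %v\n", id, d)
	}
}
```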
Code: Cloud-SPE/livepeer-naap-analytics
Project Board: GitHub Projects
Milestone Tracking: GitHub Milestone
The Bigger Picture: Network-as-a-Product Vision
This work is part of a broader vision from Livepeer Inc and the Livepeer Foundation to transform the protocol into a true Network-as-a-Product (NaaP).
The NaaP vision defines three core components:
- Permissionless Livepeer Protocol: Orchestrators enroll GPUs with clear SLA requirements and compensation structures
- Public Monitorable SLA Framework: Users get assurance that inference requests meet pre-agreed, published SLAs
- Workload Management Utility: Tools for deploying, executing, analyzing, and managing AI workloads
What we're building (Milestone 1) is the foundation for Component #2: the measurement layer that makes everything else possible.
Future Milestones (Beyond This Proposal)
The NaaP roadmap extends well beyond metrics:
- Milestone 2: SLA-based scoring, selection algorithms, and incentive frameworks
- Milestone 3: Workload control plane with deployment, lifecycle, and security management
- Milestone 4: Complex multi-GPU workload handling with cluster-based redundancy
Our job right now is to get the fundamentals right. You can't build SLA-aware routing without SLA data. You can't score orchestrators without metrics. You can't scale intelligently without visibility.
Why This Matters
Livepeer is transitioning from a transcoding network to a full AI video compute platform. That's a massive shift.
But you can't run a production-grade network without production-grade observability. SLAs require data. Optimization requires benchmarks. Trust requires transparency.
This infrastructure lays the groundwork for:
- Gateway SLAs: Providers can offer quality guarantees backed by real data
- Orchestrator accountability: Operators can prove their performance
- Demand routing: Route jobs to the best-performing GPUs
- Network visibility: Everyone can see how the network behaves
As the proposal states: "By focusing on shared measurement rather than enforcement or protocol change, this work aims to give the Livepeer ecosystem a common understanding of network behavior today, and a solid foundation for deciding what to build next."
Get Involved
This is a Cloud SPE initiative, but the code is open and contributions are welcome.
- On-Chain Proposal: Livepeer Explorer
- Forum Thread: forum.livepeer.org
- Code: github.com/Cloud-SPE/livepeer-naap-analytics
- Project Board: Cloud-SPE Projects
If you're an orchestrator operator, a gateway provider, or just interested in decentralized infrastructure, I'd love to hear your thoughts.
This is my primary focus over the next few weeks. Building the foundation. Shipping milestones. Making the network measurable.
Let's keep the conversation going!
Share your thoughts or ask questions on Twitter (now X.com) at @mikezupper.