Enabling 8GB+ NVIDIA GPUs to Run LLM Workloads on the Livepeer Network

I just published a new, highly detailed post on the Livepeer Cloud SPE website — and it’s an important one for the ecosystem.

👉 Read the full guide here:
https://www.livepeer.cloud/self-hosting-livepeers-llm-pipeline-deploying-an-ollama-based-gpu-runner-for-ai-orchestrators/


What This Article Covers

This deep dive shows how Livepeer AI Orchestrators can now run Large Language Model (LLM) inference on the Livepeer AI Network — even with older NVIDIA GPUs like the GTX 1080, GTX 1070 Ti, RTX 2060, and other 8–12GB VRAM cards.

The official Livepeer docs have traditionally recommended GPUs with 16GB+ of VRAM for AI workloads.
But as part of the Cloud SPE, I built and deployed a custom Ollama-based AI Runner that:

  • Works on 8GB+ GPUs
  • Is easy to deploy using Docker
  • Enables orchestrators to accept LLM inference jobs
  • Uses efficient Ollama quantized models to reduce VRAM usage
  • Lets underutilized GPUs get back into action — especially older cards previously dedicated to transcoding

This dramatically lowers the barrier for orchestrators to participate in the growing Livepeer AI ecosystem.
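To see why quantization is the key enabler here, a quick back-of-envelope calculation shows the weight footprint of an 8B-parameter model at different precisions. These are my own illustrative numbers, not figures from the article:

```python
# Back-of-envelope VRAM estimate for LLM weights at different precisions.
# Real usage is higher (KV cache, activations, CUDA overhead), so treat
# these as lower bounds, not exact requirements.

def weight_vram_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate GB needed just to hold the model weights."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# Llama 3.1 8B at common precisions:
fp16 = weight_vram_gb(8, 16)   # full precision
q4   = weight_vram_gb(8, 4.5)  # roughly Ollama's default ~4-bit quantization

print(f"fp16: {fp16:.1f} GB, ~4-bit: {q4:.1f} GB")
# prints: fp16: 16.0 GB, ~4-bit: 4.5 GB
```

The fp16 weights alone would not fit on an 8GB card, while a ~4-bit quantized build leaves headroom for the KV cache and runtime overhead on something like a GTX 1080.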


Why This Matters

Livepeer is expanding far beyond video transcoding — into:

  • AI video generation
  • LLM inference
  • Real-time AI video
  • Image generation
  • and much more!

Livepeer Node Operators already have idle GPUs capable of running these new workloads.
By making LLM inference viable on mid-range cards, we increase:

  • Network decentralization
  • Operator participation
  • Job throughput
  • Earnings opportunities

This is a huge step toward a broader, more inclusive GPU marketplace.


What You’ll Learn in the Full Article

The Livepeer Cloud article includes:

  • Full Docker Compose stack for the Ollama GPU runner
  • Detailed steps to deploy the runner
  • How to pull the Llama 3.1 8B model with Ollama
  • How to configure your AI Orchestrator to accept LLM jobs
  • Required settings for aiModels.json
  • How to confirm jobs are flowing correctly
  • What healthy logs look like for both the runner and Ollama
  • GPU verification using nvidia-smi
  • An architecture diagram of the full pipeline
  • FAQ and tips for orchestrators
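For orientation, an aiModels.json entry for the LLM pipeline generally looks something like the sketch below. Treat the exact keys, model ID, and price as illustrative assumptions based on the general Livepeer orchestrator config format — the full article spells out the required settings precisely:

```json
[
  {
    "pipeline": "llm",
    "model_id": "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "price_per_unit": 1000000,
    "warm": true
  }
]
```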

If you’re an operator or builder in the Livepeer ecosystem, this guide will help you start receiving LLM jobs in just a few minutes.
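At a high level, the deployment pattern is a standard Compose file that exposes Ollama's HTTP API port and reserves a GPU for the container. The sketch below uses the public ollama/ollama image and is illustrative only — the article's actual stack also runs the Livepeer AI runner service alongside it, and service names and tags here are placeholders:

```yaml
# Minimal sketch: an Ollama service with one NVIDIA GPU reserved.
# The real stack in the article also includes the Livepeer AI runner.
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"             # Ollama's HTTP API
    volumes:
      - ollama-data:/root/.ollama # persist pulled models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

volumes:
  ollama-data:
```

Persisting /root/.ollama matters in practice: without the volume, a multi-gigabyte model pull would repeat on every container restart.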


Want to Get Involved or Need Help?

If you’re interested in running AI workloads on your GPU — or have questions about the setup — you can join the Livepeer Discord:

👉 https://discord.gg/xpKATpA7

Ask for @mike_zoop in the #orchestrating channel.


Thanks for reading — and stay tuned for more GPU infrastructure guides, AI workflows, and Livepeer ecosystem content.