Build a Text-to-Image Generation Service with the Reve 2.0 API

June 19, 2026 | 7 min read | 12 sections | By Tensor Labs
Introduction

On June 3, 2026, Reve 2.0 jumped to second place on the text-to-image Arena leaderboard. A leaderboard position is a benchmark, not a product. What turns a strong model into something your application can use is the boring layer around it: an endpoint to call, a queue to absorb the load, and durable storage for whatever comes back. This tutorial builds that layer. By the end you will have a FastAPI service that accepts a prompt, calls Reve 2.0, and returns a stored image URL, with the slow generation handled out of the request path. The complete example at the end is well under 100 lines.

Most teams wire an image model straight into the request-response cycle, so the user waits while the model generates, and then wonder why the endpoint times out under load. Generation is slow and bursty. The service has to be built around that fact, not in spite of it.

What Reve 2.0 is

Reve 2.0 is a text-to-image model exposed through a REST API. You send a prompt and parameters, you get back an image. The Arena ranking tells you the output quality is competitive with the top closed models as of June 2026. For this build, treat it as what it is: an HTTP endpoint that takes seconds, not milliseconds, to respond. Everything about the service design follows from that latency.

What a generation service actually needs

A real text-to-image service is three things the model is not. A submit step that returns immediately with a job id, so the client is not holding a connection open for ten seconds. A worker that does the slow call and stores the result. A status step the client polls or subscribes to. Skip any of these and you have a demo that falls over the first time two people click generate at once.
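Before any web framework is involved, the three steps above can be sketched with the standard library alone. This is a minimal illustration, not the service built below: `submit_job`, `worker`, `job_status`, and the `slow_call` parameter are hypothetical names, and a thread pool stands in for the real worker.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# Hypothetical in-process stand-ins; none of these names come from the Reve API.
jobs: dict = {}
pool = ThreadPoolExecutor(max_workers=2)


def worker(job_id: str, slow_call: Callable[[str], str], prompt: str) -> None:
    """Worker step: run the slow call and record the outcome either way."""
    try:
        jobs[job_id] = {"status": "done", "url": slow_call(prompt)}
    except Exception as exc:
        jobs[job_id] = {"status": "failed", "error": str(exc)}


def submit_job(slow_call: Callable[[str], str], prompt: str) -> str:
    """Submit step: register the job and return its id without waiting."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "pending"}
    pool.submit(worker, job_id, slow_call, prompt)
    return job_id


def job_status(job_id: str) -> Optional[dict]:
    """Status step: report whatever the worker has recorded so far."""
    return jobs.get(job_id)
```

The caller gets a job id back immediately and learns the outcome only through `job_status`, which is the same contract the HTTP version exposes.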

Set up the project

code
python -m venv venv && source venv/bin/activate
pip install fastapi uvicorn requests
export REVE_API_KEY="your-reve-api-key"

FastAPI for the service, requests for the Reve 2.0 call, uvicorn to run it. No database in this version; jobs live in memory so you can see the whole machine in one file.

Call Reve 2.0

The first build step is the model call itself, isolated in one function. Keep the transport layer separate from the web layer so you can test it on its own.

python
import os

import requests

REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"


def generate_image(prompt: str, width: int = 1024, height: int = 1024) -> bytes:
    """Ask Reve 2.0 for one image and return its raw bytes."""
    response = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={
            "model": "reve-2.0",
            "prompt": prompt,
            "width": width,
            "height": height,
        },
        timeout=120,
    )
    response.raise_for_status()
    image_url = response.json()["data"][0]["url"]
    # Fetch the generated file; fail loudly rather than store an error page.
    download = requests.get(image_url, timeout=60)
    download.raise_for_status()
    return download.content

The timeout=120 is generous on purpose. Image generation under load can sit in a queue on the provider side, and a tighter limit such as 30 seconds will fail valid requests during a traffic spike. Always pass one explicitly: requests applies no timeout by default and will wait indefinitely. Confirm the exact request and response shape against the current Reve API reference; the pattern is what transfers, not the field names.
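Because the field names may differ from what this tutorial assumes, one option is to isolate the response parsing in a small helper that fails with a readable message. The sketch below is hypothetical: `extract_image_url` is not part of any Reve SDK, and the `{"data": [{"url": ...}]}` shape is the tutorial's assumption, not a confirmed contract.

```python
def extract_image_url(payload) -> str:
    """Pull the image URL out of a parsed response body.

    Assumes the {"data": [{"url": ...}]} shape used in this tutorial,
    which may not match the real Reve response.
    """
    try:
        url = payload["data"][0]["url"]
    except (KeyError, IndexError, TypeError) as exc:
        # Report what was actually received so the mismatch is easy to spot.
        seen = sorted(payload) if isinstance(payload, dict) else type(payload).__name__
        raise ValueError(f"unexpected response shape: {seen!r}") from exc
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError(f"unexpected image URL: {url!r}")
    return url
```

If the API changes, the job fails with "unexpected response shape" instead of a bare KeyError, and only this one function needs updating.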

Store the result somewhere durable

Returning raw image bytes from your API is a mistake that surfaces later as memory pressure and uncacheable responses. Write the image once, hand back a URL. This version writes to local disk; the same function signature swaps to S3 with three lines.

python
import uuid
from pathlib import Path

OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)


def store_image(image_bytes: bytes) -> str:
    name = f"{uuid.uuid4().hex}.png"
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"

A UUID filename avoids two requests overwriting each other, which a timestamp-based name does not guarantee under concurrency.
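The difference is easy to see by generating many names in a tight loop. The snippet below is an illustrative sketch: `timestamp_name` is a hypothetical naive scheme with one-second resolution, compared against the UUID scheme used by store_image.

```python
import time
import uuid


def timestamp_name() -> str:
    """Naive scheme: every request in the same second gets the same name."""
    return f"{int(time.time())}.png"


def uuid_name() -> str:
    """Scheme used in store_image: a random 128-bit identifier."""
    return f"{uuid.uuid4().hex}.png"


# Generate 1000 names with each scheme and count how many are distinct.
timestamp_names = {timestamp_name() for _ in range(1000)}
uuid_names = {uuid_name() for _ in range(1000)}
print(len(timestamp_names), "distinct timestamp names")
print(len(uuid_names), "distinct UUID names")
```

Finer-grained timestamps shrink the collision window but do not close it, which is why the random identifier is the safer default.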

Move generation off the request path

This is the step most tutorials skip, and it is the one that makes the service hold up. FastAPI's BackgroundTasks runs the slow work after the response is already sent. The submit endpoint returns a job id in milliseconds; the worker fills in the result.

python
from fastapi import FastAPI, BackgroundTasks
from pydantic import BaseModel

app = FastAPI()
JOBS: dict[str, dict] = {}


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def run_job(job_id: str, req: GenerateRequest):
    try:
        image_bytes = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(image_bytes)}
    except Exception as exc:  # the worker must never die silently
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)
    return {"job_id": job_id}

Catching Exception in the worker is what keeps a failed Reve call from leaving a job stuck on "pending" forever. A failed job that says so is recoverable. A silent one is a support ticket.

Add the status endpoint

The client submitted a prompt and got a job id. Now it needs to ask whether the image is ready.

python
from fastapi import HTTPException


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

A client polls GET /status/{job_id} every second or two until it sees done with a URL, or failed with a reason. That is the whole contract.
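On the client side, that contract can be sketched as a small polling loop. This is one possible shape, not a prescribed client: `wait_for_job` and its `fetch_status` parameter are hypothetical, and the transport is injected so the loop can be exercised without a running server.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 1.5,
    timeout: float = 180.0,
) -> dict:
    """Poll until the job reports done or failed, or give up after `timeout`."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still pending after {timeout}s")
        time.sleep(interval)
```

Against the service in this tutorial, `fetch_status` could be something like `lambda job_id: requests.get(f"http://localhost:8000/status/{job_id}", timeout=10).json()`, assuming the default uvicorn port.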

Reve 2.0 API vs self-hosting a diffusion model

The build is identical either way. What differs is who runs the GPU.

Dimension              | Reve 2.0 API        | Self-hosted diffusion model
Time to first image    | Minutes             | Days, plus a GPU
Cost at low volume     | Pay per image       | Idle GPU you pay for anyway
Cost at high volume    | Scales with usage   | Fixed once saturated
Output quality         | Arena #2, no tuning | Depends on the model you pick
Control over the model | None                | Full, including fine-tunes

The API wins until your volume is steady and high enough that a rented GPU stays busy. Because the slow call sits behind one function, the day the economics flip you replace generate_image and the rest of the service does not change.
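One way to make that swap a configuration change is to treat the generator as a callable with a fixed signature and pick it by name. The sketch below is an assumption-heavy illustration: the `IMAGE_BACKEND` environment variable, `pick_backend`, and both placeholder backends are hypothetical and only show the shape of the seam.

```python
import os
from typing import Callable, Optional

# Any backend takes (prompt, width, height) and returns image bytes,
# the same signature as generate_image in this tutorial.
ImageBackend = Callable[[str, int, int], bytes]


def reve_backend(prompt: str, width: int, height: int) -> bytes:
    # Placeholder: would call the hosted API, as generate_image does.
    raise NotImplementedError("hosted API call goes here")


def local_backend(prompt: str, width: int, height: int) -> bytes:
    # Placeholder: would run a self-hosted diffusion pipeline.
    raise NotImplementedError("self-hosted pipeline goes here")


BACKENDS = {"reve": reve_backend, "local": local_backend}


def pick_backend(name: Optional[str] = None) -> ImageBackend:
    """Select a backend by name, defaulting to the hypothetical IMAGE_BACKEND env var."""
    name = name or os.environ.get("IMAGE_BACKEND", "reve")
    try:
        return BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown image backend: {name!r}") from None
```

The worker would then call `pick_backend()(prompt, width, height)` and never know which side of the table above it is running on.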

What this service does not do

The in-memory JOBS dict means a restart loses every pending and completed job, so production needs Redis or a database behind that dict. Nothing in this file serves the /images/ URLs it returns; mount the output directory (for example with FastAPI's StaticFiles) or put a web server or object store in front of it. There is no rate limiting, so one client can saturate your Reve quota and starve everyone else. There is no content filtering on the prompt, which for a public endpoint is not optional. And BackgroundTasks runs in the same process, so true scale needs a real queue like Celery or RQ. This is the smallest thing that demonstrates the right shape, not the thing you expose to the open internet.
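As an illustration of the rate-limiting gap, a per-client sliding-window limiter fits in a few lines. This is a minimal single-process sketch with a hypothetical `SlidingWindowLimiter` class; it is not thread-safe and would need a shared store such as Redis once the service runs on more than one worker.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A submit endpoint could check `limiter.allow(client_id)` before creating a job and answer with HTTP 429 when it returns False; how a client is identified (API key, IP address) is left open here.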

When to use it

Build this when you need image generation inside a product and you do not want users staring at a spinner on a held-open connection: a marketing tool that drafts visuals, an app that generates thumbnails, an internal service that turns briefs into concepts. The submit-and-poll shape is the part that survives contact with real traffic, and it is worth getting right even for a prototype, because retrofitting async onto a synchronous endpoint is more work than building it async from the start. Keep the model call behind one function, keep the storage behind another, and the service outlives any single model's turn at the top of the leaderboard.

Full working example

python
import os
import uuid
from pathlib import Path

import requests
from fastapi import BackgroundTasks, FastAPI, HTTPException
from pydantic import BaseModel

REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"
OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)

app = FastAPI()
JOBS: dict[str, dict] = {}  # in-memory job table; lost on restart


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def generate_image(prompt: str, width: int, height: int) -> bytes:
    """Call Reve 2.0 and return the generated image's raw bytes."""
    resp = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={"model": "reve-2.0", "prompt": prompt,
              "width": width, "height": height},
        timeout=120,
    )
    resp.raise_for_status()
    url = resp.json()["data"][0]["url"]
    download = requests.get(url, timeout=60)
    download.raise_for_status()  # do not store an error page as a PNG
    return download.content


def store_image(image_bytes: bytes) -> str:
    """Write the image under a collision-free name and return its URL path."""
    name = f"{uuid.uuid4().hex}.png"
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"


def run_job(job_id: str, req: GenerateRequest):
    """Background worker: always leaves the job as done or failed."""
    try:
        data = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(data)}
    except Exception as exc:
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)
    return {"job_id": job_id}


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

Save the file as main.py and run it with uvicorn main:app --reload. POST a prompt to /generate, poll /status/{job_id}, and you have a text-to-image service built around Reve 2.0 that does not fall over the moment two requests arrive together.