Build a Text-to-Image Generation Service with the Reve 2.0 API

Introduction
On June 3, 2026, Reve 2.0 jumped to second place on the text-to-image Arena leaderboard. A leaderboard position is a benchmark, not a product. What turns a strong model into something your application can use is the boring layer around it: an endpoint to call, a queue to absorb the load, and durable storage for whatever comes back. This tutorial builds that layer. By the end you will have a FastAPI service that accepts a prompt, calls Reve 2.0, and returns a stored image URL, with the slow generation handled out of the request path. About 130 lines.
Most teams wire an image model straight into the request-response cycle, so the user waits while the model generates, and then they wonder why the endpoint times out under load. Generation is slow and bursty. The service has to be built around that fact, not in spite of it.
What Reve 2.0 is
Reve 2.0 is a text-to-image model exposed through a REST API. You send a prompt and parameters, you get back an image. The Arena ranking tells you the output quality is competitive with the top closed models as of June 2026. For this build, treat it as what it is: an HTTP endpoint that takes seconds, not milliseconds, to respond. Everything about the service design follows from that latency.
What a generation service actually needs
A real text-to-image service is three things the model is not. A submit step that returns immediately with a job id, so the client is not holding a connection open for ten seconds. A worker that does the slow call and stores the result. A status step the client polls or subscribes to. Skip any of these and you have a demo that falls over the first time two people click generate at once.
Set up the project
python -m venv venv && source venv/bin/activate
pip install fastapi uvicorn requests
export REVE_API_KEY="your-reve-api-key"

FastAPI for the service, requests for the Reve 2.0 call, uvicorn to run it. No database in this version; jobs live in memory so you can see the whole machine in one file.
Call Reve 2.0
The first build step is the model call itself, isolated in one function. Keep the transport layer separate from the web layer so you can test it on its own.
import os

import requests

REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"


def generate_image(prompt: str, width: int = 1024, height: int = 1024) -> bytes:
    """Request one image from Reve 2.0 and return its raw bytes."""
    response = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={
            "model": "reve-2.0",
            "prompt": prompt,
            "width": width,
            "height": height,
        },
        timeout=120,  # generous: the provider may queue requests under load
    )
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    image_url = response.json()["data"][0]["url"]
    # The API hands back a URL, so a second request fetches the image itself.
    image = requests.get(image_url, timeout=60)
    image.raise_for_status()  # do not store an error page as if it were a PNG
    return image.content

The timeout=120 is generous on purpose. Image generation under load can sit in a queue on the provider side, and a tighter 30-second timeout will fail valid requests during a traffic spike. Confirm the exact request shape against the current Reve API reference; the pattern is what transfers, not the field names.
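Keeping the model call in one function is what makes it testable without a network. The following is a minimal sketch, assuming you are willing to pass the two HTTP calls in as parameters. `generate_image_with` and `FakeResponse` are hypothetical names introduced for illustration, the auth header is left out for brevity, and the response shape (`data[0].url`) is the same assumption the tutorial's function makes.

```python
from typing import Any, Callable


class FakeResponse:
    """Hypothetical stand-in for the parts of requests.Response this call uses."""

    def __init__(self, payload: Any = None, content: bytes = b"", status: int = 200):
        self._payload = payload
        self.content = content
        self.status_code = status

    def json(self) -> Any:
        return self._payload

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")


def generate_image_with(post: Callable[..., Any], get: Callable[..., Any],
                        prompt: str, width: int = 1024, height: int = 1024) -> bytes:
    """Same flow as generate_image, with the HTTP calls injected (an assumption)."""
    response = post(
        "https://api.reve.com/v1/image/generations",
        json={"model": "reve-2.0", "prompt": prompt, "width": width, "height": height},
        timeout=120,
    )
    response.raise_for_status()
    image_url = response.json()["data"][0]["url"]
    return get(image_url, timeout=60).content


# Fakes that record what they were called with instead of touching the network.
calls = []


def fake_post(url, **kwargs):
    calls.append(("POST", url, kwargs["json"]["prompt"]))
    return FakeResponse(payload={"data": [{"url": "https://example.invalid/a.png"}]})


def fake_get(url, **kwargs):
    calls.append(("GET", url))
    return FakeResponse(content=b"fake-png-bytes")


image = generate_image_with(fake_post, fake_get, "a red bicycle")
print(image)
print(calls)
```

In the running service you would pass `requests.post` and `requests.get` instead of the fakes. The test then exercises the parsing and the call order, which are the parts most likely to break when the provider changes its response shape.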
Store the result somewhere durable
Returning raw image bytes from your API is a mistake that surfaces later as memory pressure and uncacheable responses. Write the image once, hand back a URL. This version writes to local disk; the same function signature swaps to S3 with three lines.
import uuid
from pathlib import Path

OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)


def store_image(image_bytes: bytes) -> str:
    """Write the image to disk once and return a URL path for it."""
    name = f"{uuid.uuid4().hex}.png"  # random name, so concurrent jobs cannot collide
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"

A UUID filename avoids two requests overwriting each other, which a timestamp-based name does not guarantee under concurrency.
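The collision risk is easy to see in isolation. This is a small sketch using only the standard library; `timestamp_name` and `uuid_name` are hypothetical helpers, and the second-resolution timestamp is just one naive naming scheme chosen for the comparison.

```python
import time
import uuid


def timestamp_name() -> str:
    # Second-resolution timestamp, a common naive choice for filenames.
    return f"{int(time.time())}.png"


def uuid_name() -> str:
    # Same scheme as store_image: 32 random hex characters.
    return f"{uuid.uuid4().hex}.png"


N = 1000
timestamp_names = {timestamp_name() for _ in range(N)}
uuid_names = {uuid_name() for _ in range(N)}

print(f"timestamp: {len(timestamp_names)} distinct names out of {N}")
print(f"uuid4:     {len(uuid_names)} distinct names out of {N}")
```

Every name generated within the same second is identical, so the timestamp set collapses to one or two entries while the UUID set keeps all of them. A finer-grained timestamp narrows the window but does not close it.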
Move generation off the request path
This is the step most tutorials skip, and it is the one that makes the service hold up. FastAPI's BackgroundTasks runs the slow work after the response is already sent. The submit endpoint returns a job id in milliseconds; the worker fills in the result.
from fastapi import FastAPI, BackgroundTasks
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

app = FastAPI()
# Serve the stored files so the /images/... paths returned by store_image resolve.
app.mount("/images", StaticFiles(directory=OUTPUT_DIR), name="images")

JOBS: dict[str, dict] = {}  # job_id -> {"status": ..., plus "url" or "error"}


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def run_job(job_id: str, req: GenerateRequest):
    try:
        image_bytes = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(image_bytes)}
    except Exception as exc:  # the worker must never die silently
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)  # runs after the response is sent
    return {"job_id": job_id}

Catching Exception in the worker is what keeps a failed Reve call from leaving a job stuck on "pending" forever. A failed job that says so is recoverable. A silent one is a support ticket.
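To see what the except clause buys you, here is a standard-library sketch that runs the same worker shape with and without it. It makes several assumptions: a `ThreadPoolExecutor` stands in for BackgroundTasks, `flaky_generate` is a hypothetical stub for the model call, and `run_job_unguarded` exists only for contrast.

```python
from concurrent.futures import ThreadPoolExecutor

JOBS: dict[str, dict] = {}


def flaky_generate(prompt: str) -> bytes:
    """Hypothetical stub for the model call: fails on an empty prompt."""
    if not prompt:
        raise ValueError("empty prompt")
    return b"image-bytes"


def run_job(job_id: str, prompt: str) -> None:
    try:
        data = flaky_generate(prompt)
        JOBS[job_id] = {"status": "done", "size": len(data)}
    except Exception as exc:
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


def run_job_unguarded(job_id: str, prompt: str) -> None:
    # No try/except: an exception here never reaches the JOBS dict.
    data = flaky_generate(prompt)
    JOBS[job_id] = {"status": "done", "size": len(data)}


for job_id in ("ok", "guarded", "unguarded"):
    JOBS[job_id] = {"status": "pending"}

with ThreadPoolExecutor(max_workers=2) as pool:
    pool.submit(run_job, "ok", "a red bicycle")
    pool.submit(run_job, "guarded", "")
    pool.submit(run_job_unguarded, "unguarded", "")
# Leaving the with-block waits for all three tasks to finish.

for job_id, job in JOBS.items():
    print(job_id, job)
```

The guarded job ends as failed with a reason the client can read. The unguarded one stays pending, and a polling client has no way to tell it apart from a job that is still running.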
Add the status endpoint
The client submitted a prompt and got a job id. Now it needs to ask whether the image is ready.
from fastapi import HTTPException


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

A client polls GET /status/{job_id} every second or two until it sees done with a URL, or failed with a reason. That is the whole contract.
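The client side of that contract might look like the sketch below. `wait_for_job` is a hypothetical helper, and the interval and timeout defaults are assumptions. The status fetch is passed in as a function so the loop can be exercised without a running server.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[str], dict], job_id: str,
                 interval: float = 1.5, timeout: float = 180.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job is done or failed; raise TimeoutError if neither happens."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status in ("done", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


# Against the real service, fetch_status could be something like:
#   lambda job_id: requests.get(f"{BASE_URL}/status/{job_id}", timeout=10).json()
# where BASE_URL is wherever you run the app. Here a scripted fake stands in.
scripted = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"status": "done", "url": "/images/abc123.png"},
])
result = wait_for_job(lambda job_id: next(scripted), "abc123", sleep=lambda s: None)
print(result)
```

The overall deadline matters as much as the interval: without it, a job that never leaves pending keeps the client polling forever.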
Reve 2.0 API vs self-hosting a diffusion model
The build is identical either way. What differs is who runs the GPU.
| Dimension | Reve 2.0 API | Self-hosted diffusion model |
|---|---|---|
| Time to first image | Minutes | Days, plus a GPU |
| Cost at low volume | Pay per image | Idle GPU you pay for anyway |
| Cost at high volume | Scales with usage | Fixed once saturated |
| Output quality | Arena #2, no tuning | Depends on the model you pick |
| Control over the model | None | Full, including fine-tunes |
The API wins until your volume is steady and high enough that a rented GPU stays busy. Because the slow call sits behind one function, the day the economics flip you replace generate_image and the rest of the service does not change.
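A rough way to find the point where the economics flip is to divide the fixed GPU bill by the per-image price. The numbers below are placeholders chosen only to make the arithmetic concrete; they are not Reve or cloud-provider prices, and the function name is hypothetical.

```python
def break_even_images_per_month(gpu_cost_per_month: float,
                                api_price_per_image: float) -> float:
    """Monthly volume at which a fixed GPU bill equals the pay-per-image bill."""
    return gpu_cost_per_month / api_price_per_image


# Placeholder numbers for illustration only; substitute your own quotes.
GPU_COST_PER_MONTH = 1200.00   # hypothetical rented-GPU bill, in dollars
API_PRICE_PER_IMAGE = 0.04     # hypothetical per-image price, in dollars

threshold = break_even_images_per_month(GPU_COST_PER_MONTH, API_PRICE_PER_IMAGE)
print(f"break-even at about {threshold:,.0f} images per month")
```

This ignores the engineering time to run the model and assumes a single GPU can actually serve that volume, so treat the result as a lower bound on when self-hosting starts to make sense.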
What this service does not do
The in-memory JOBS dict means a restart loses every pending and completed job, so production needs Redis or a database behind it. There is no rate limiting, so one client can saturate your Reve quota and starve everyone else. There is no content filtering on the prompt, which for a public endpoint is not optional. And BackgroundTasks runs in the same process as the web server, so scaling beyond one process needs a real queue like Celery or RQ. This is the smallest thing that demonstrates the right shape, not the thing you expose to the open internet.
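Of those gaps, rate limiting is the smallest to sketch. The following is one possible shape, not part of the service above: `SlidingWindowLimiter` is a hypothetical class, the limit and window values are arbitrary, and how you derive a client id (API key, IP address) is left open.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` calls per client within any `window` seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window=60.0, clock=lambda: fake_now[0])

first = [limiter.allow("client-a") for _ in range(3)]  # third call is over the limit
other = limiter.allow("client-b")                      # other clients are unaffected
fake_now[0] = 61.0
later = limiter.allow("client-a")                      # window has passed
print(first, other, later)
```

In the submit endpoint you would call `allow` before creating the job and raise an `HTTPException` with status 429 when it returns False. Like the JOBS dict, this limiter lives in one process and forgets everything on restart, so it shares the same production caveat.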
When to use it
Build this when you need image generation inside a product and you do not want users staring at a spinner on a held-open connection: a marketing tool that drafts visuals, an app that generates thumbnails, an internal service that turns briefs into concepts. The submit-and-poll shape is the part that survives contact with real traffic, and it is worth getting right even for a prototype, because retrofitting async onto a synchronous endpoint is more work than building it async from the start. Keep the model call behind one function, keep the storage behind another, and the service outlives any single model's turn at the top of the leaderboard.
Full working example
import os
import uuid
from pathlib import Path

import requests
from fastapi import FastAPI, BackgroundTasks, HTTPException
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"
OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)

app = FastAPI()
# Serve the stored files so the /images/... paths returned by store_image resolve.
app.mount("/images", StaticFiles(directory=OUTPUT_DIR), name="images")

JOBS: dict[str, dict] = {}  # in-memory job store; lost on restart


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def generate_image(prompt: str, width: int, height: int) -> bytes:
    """Call Reve 2.0 and return the generated image as raw bytes."""
    resp = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={"model": "reve-2.0", "prompt": prompt,
              "width": width, "height": height},
        timeout=120,
    )
    resp.raise_for_status()
    url = resp.json()["data"][0]["url"]
    image = requests.get(url, timeout=60)
    image.raise_for_status()  # do not store an error page as if it were a PNG
    return image.content


def store_image(image_bytes: bytes) -> str:
    """Write the image to disk once and return a URL path for it."""
    name = f"{uuid.uuid4().hex}.png"
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"


def run_job(job_id: str, req: GenerateRequest):
    """Background worker: generate, store, and record the outcome either way."""
    try:
        data = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(data)}
    except Exception as exc:  # the worker must never die silently
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)  # runs after the response is sent
    return {"job_id": job_id}


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

Save this as main.py and run it with uvicorn main:app --reload. POST a prompt to /generate, poll /status/{job_id}, and you have a text-to-image service built around Reve 2.0 that does not fall over the moment two requests arrive together.