Build a Text-to-Image Generation Service with the Reve 2.0 API

June 19, 2026 | 7 min read | By Ahmed Abdullah

Introduction

On June 3, 2026, Reve 2.0 jumped to second place on the text-to-image Arena leaderboard. A leaderboard position is a benchmark, not a product. What turns a strong model into something your application can use is the boring layer around it: an endpoint to call, a queue to absorb the load, and durable storage for whatever comes back. This tutorial builds that layer. By the end you will have a FastAPI service that accepts a prompt, calls Reve 2.0, and returns a stored image URL, with the slow generation handled out of the request path. The full example at the end comes in at well under 100 lines.

Most teams wire an image model straight into the request-response cycle, so the user waits while the model generates, and then wonder why the endpoint times out under load. Generation is slow and bursty. The service has to be built around that fact, not in spite of it.

What Reve 2.0 is

Reve 2.0 is a text-to-image model exposed through a REST API. You send a prompt and parameters, you get back an image. The Arena ranking tells you the output quality is competitive with the top closed models as of June 2026. For this build, treat it as what it is: an HTTP endpoint that takes seconds, not milliseconds, to respond. Everything about the service design follows from that latency.

What a generation service actually needs

A real text-to-image service is three things the model is not. A submit step that returns immediately with a job id, so the client is not holding a connection open for ten seconds. A worker that does the slow call and stores the result. A status step the client polls or subscribes to. Skip any of these and you have a demo that falls over the first time two people click generate at once.

Set up the project

code

python -m venv venv && source venv/bin/activate
pip install fastapi uvicorn requests
export REVE_API_KEY="your-reve-api-key"

FastAPI for the service, requests for the Reve 2.0 call, uvicorn to run it. No database in this version; jobs live in memory so you can see the whole machine in one file.

Call Reve 2.0

The first build step is the model call itself, isolated in one function. Keep the transport layer separate from the web layer so you can test it on its own.

python

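import os

import requests

# Assumed endpoint and payload shape; check the current Reve API reference.
REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"


def generate_image(prompt: str, width: int = 1024, height: int = 1024) -> bytes:
    """Request one image from Reve 2.0 and return its raw bytes."""
    response = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={
            "model": "reve-2.0",
            "prompt": prompt,
            "width": width,
            "height": height,
        },
        timeout=120,
    )
    response.raise_for_status()
    image_url = response.json()["data"][0]["url"]

    # Download the generated image; fail loudly if the download itself errors.
    download = requests.get(image_url, timeout=60)
    download.raise_for_status()
    return download.content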

The timeout=120 is generous on purpose. Image generation under load can sit in a queue on the provider side, and a tighter limit such as 30 seconds will fail valid requests during a traffic spike. Do not drop the argument either: requests applies no timeout unless you pass one, so a stuck call would hang the worker indefinitely. Confirm the exact request shape against the current Reve API reference; the pattern is what transfers, not the field names.
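Because the response shape here is an assumption rather than a confirmed contract, one option is to pull the URL out through a small helper that fails with a readable error when the payload looks different. This is a minimal sketch, assuming the `data[0].url` layout used in `generate_image`; the name `extract_image_url` is hypothetical and not part of any Reve SDK.

```python
def extract_image_url(payload: dict) -> str:
    """Return the first image URL from an assumed {"data": [{"url": ...}]} payload."""
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, list) or not data:
        raise ValueError("unexpected response shape: missing or empty 'data' list")

    first = data[0]
    url = first.get("url") if isinstance(first, dict) else None
    if not isinstance(url, str) or not url:
        raise ValueError("unexpected response shape: first item has no 'url'")
    return url
```

Inside `generate_image`, `extract_image_url(response.json())` would replace the chained indexing, so a changed field name surfaces as a clear `ValueError` in the job's error message instead of a bare `KeyError`.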

Store the result somewhere durable

Returning raw image bytes from your API is a mistake that surfaces later as memory pressure and uncacheable responses. Write the image once, hand back a URL. This version writes to local disk; the same function signature swaps to S3 with three lines.

python

import uuid
from pathlib import Path

OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)


def store_image(image_bytes: bytes) -> str:
    """Write the image to disk once and return the URL path for it."""
    name = f"{uuid.uuid4().hex}.png"  # random name, so concurrent writes cannot collide
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"

A UUID filename avoids two requests overwriting each other, which a timestamp-based name does not guarantee under concurrency.
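The collision is easy to see in isolation. The sketch below compares a second-resolution timestamp name with the UUID name; `timestamp_name` and the arrival times are made up for illustration.

```python
import uuid


def timestamp_name(now: float) -> str:
    # Second-resolution timestamp name: two requests in the same second collide.
    return f"{int(now)}.png"


def uuid_name() -> str:
    # Random 128-bit name, the same scheme store_image uses.
    return f"{uuid.uuid4().hex}.png"


# Two requests that arrive 0.4 seconds apart, inside the same second.
arrivals = [1_750_000_000.1, 1_750_000_000.5]
ts_names = {timestamp_name(t) for t in arrivals}
uuid_names = {uuid_name() for _ in arrivals}

print(len(ts_names))    # 1: the second write would overwrite the first
print(len(uuid_names))  # 2: each request gets its own file
```

A finer-grained timestamp narrows the window but does not close it, which is why the random name is the safer default.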

Move generation off the request path

This is the step most tutorials skip, and it is the one that makes the service hold up. FastAPI's BackgroundTasks runs the slow work after the response is already sent. The submit endpoint returns a job id in milliseconds; the worker fills in the result.

python

from fastapi import BackgroundTasks, FastAPI
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

app = FastAPI()
# Serve stored files so the "/images/..." URLs from store_image resolve.
app.mount("/images", StaticFiles(directory=OUTPUT_DIR), name="images")

# In-memory job table: job_id -> {"status": ..., "url" or "error": ...}
JOBS: dict[str, dict] = {}


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def run_job(job_id: str, req: GenerateRequest) -> None:
    try:
        image_bytes = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(image_bytes)}
    except Exception as exc:  # the worker must never die silently
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)  # runs after the response is sent
    return {"job_id": job_id}

Catching Exception in the worker is what keeps a failed Reve call from leaving a job stuck on "pending" forever. A failed job that says so is recoverable. A silent one is a support ticket.
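To watch the failure path without spending an API call, the worker can be exercised with stand-in functions. This is a sketch rather than the service code: this `run_job` takes the generator and the store as arguments, which the version above does not, and `failing_generate` and `fake_store` are hypothetical stubs.

```python
from typing import Callable

JOBS: dict[str, dict] = {}


def run_job(job_id: str, prompt: str,
            generate: Callable[[str], bytes],
            store: Callable[[bytes], str]) -> None:
    # Same try/except shape as the worker above, with the slow call injected.
    try:
        JOBS[job_id] = {"status": "done", "url": store(generate(prompt))}
    except Exception as exc:
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


def failing_generate(prompt: str) -> bytes:
    # Stand-in for a provider call that times out.
    raise TimeoutError("provider timed out")


def fake_store(image_bytes: bytes) -> str:
    # Stand-in for store_image; nothing is written to disk.
    return "/images/fake.png"


JOBS["a"] = {"status": "pending"}
run_job("a", "a red fox", failing_generate, fake_store)
print(JOBS["a"])  # {'status': 'failed', 'error': 'provider timed out'}

JOBS["b"] = {"status": "pending"}
run_job("b", "a red fox", lambda prompt: b"png-bytes", fake_store)
print(JOBS["b"])  # {'status': 'done', 'url': '/images/fake.png'}
```

Either way the job leaves "pending", so a polling client always gets a final answer.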

Add the status endpoint

The client submitted a prompt and got a job id. Now it needs to ask whether the image is ready.

python

from fastapi import HTTPException


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

A client polls GET /status/{job_id} every second or two until it sees done with a URL, or failed with a reason. That is the whole contract.
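On the client side, that contract comes down to a loop with a deadline. A minimal sketch, assuming the status payload shown above; `poll_until_finished` and its defaults are illustrative, and the HTTP call is passed in as a function so the loop can be tested without a running server.

```python
import time
from typing import Callable


def poll_until_finished(fetch_status: Callable[[], dict],
                        interval: float = 1.5,
                        deadline: float = 180.0) -> dict:
    """Call fetch_status until the job is done or failed, or the deadline passes."""
    started = time.monotonic()
    while True:
        job = fetch_status()
        if job.get("status") in ("done", "failed"):
            return job
        if time.monotonic() - started >= deadline:
            raise TimeoutError("job did not finish before the deadline")
        time.sleep(interval)
```

With requests, `fetch_status` could be `lambda: requests.get(f"{base_url}/status/{job_id}", timeout=10).json()`, where `base_url` is wherever the service is running. The deadline matters as much as the interval: a client that polls forever hides a stuck job.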

Reve 2.0 API vs self-hosting a diffusion model

The build is identical either way. What differs is who runs the GPU.

Dimension	Reve 2.0 API	Self-hosted diffusion model
Time to first image	Minutes	Days, plus a GPU
Cost at low volume	Pay per image	Idle GPU you pay for anyway
Cost at high volume	Scales with usage	Fixed once saturated
Output quality	Arena #2, no tuning	Depends on the model you pick
Control over the model	None	Full, including fine-tunes

The API wins until your volume is steady and high enough that a rented GPU stays busy. Because the slow call sits behind one function, the day the economics flip you replace generate_image and the rest of the service does not change.
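One way to keep that swap cheap is to treat every backend as a function with the same signature and pick one by name. This is a sketch under assumptions: both backend functions are placeholders, and the `IMAGE_BACKEND` environment variable is invented for the example.

```python
import os
from typing import Callable, Optional

# Any backend is a function with the same signature as generate_image.
ImageBackend = Callable[[str, int, int], bytes]


def generate_via_api(prompt: str, width: int, height: int) -> bytes:
    # Placeholder for the hosted Reve 2.0 call shown earlier.
    raise NotImplementedError("call the hosted API here")


def generate_self_hosted(prompt: str, width: int, height: int) -> bytes:
    # Placeholder for a call to a diffusion model you run yourself.
    raise NotImplementedError("call your own GPU worker here")


BACKENDS: dict[str, ImageBackend] = {
    "api": generate_via_api,
    "self_hosted": generate_self_hosted,
}


def pick_backend(name: Optional[str] = None) -> ImageBackend:
    # IMAGE_BACKEND is a hypothetical environment variable for this sketch.
    name = name or os.environ.get("IMAGE_BACKEND", "api")
    try:
        return BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown image backend: {name!r}") from None
```

The worker would then call `pick_backend()(req.prompt, req.width, req.height)`, and nothing else in the service needs to know which side of the table it is on.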

What this service does not do

The in-memory JOBS dict means a restart loses every pending and completed job, so production needs Redis or a database behind it. There is no rate limiting, so one client can saturate your Reve quota and starve everyone else. There is no content filtering on the prompt, which for a public endpoint is not optional. And BackgroundTasks runs in the same process as the web server, so true scale needs a real queue like Celery or RQ. This is the smallest thing that demonstrates the right shape, not the thing you expose to the open internet.
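Of these gaps, rate limiting is the quickest to sketch. The class below is a per-client sliding-window limiter using only the standard library. It is illustrative and not part of the service above: the class name, the limits, and the choice of client identifier (an API key or an IP address, say) are all assumptions.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` calls per client within any `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._calls: dict[str, deque] = defaultdict(deque)
        self._lock = threading.Lock()  # sync endpoints run in a thread pool

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        with self._lock:
            calls = self._calls[client_id]
            # Drop timestamps that have aged out of the window.
            while calls and now - calls[0] >= self.window:
                calls.popleft()
            if len(calls) >= self.limit:
                return False
            calls.append(now)
            return True
```

In `submit`, a check such as `if not limiter.allow(client_id): raise HTTPException(status_code=429, detail="rate limit exceeded")` would reject the excess before a job is created. Like JOBS, this state lives in one process, so it shares the same restart and multi-worker caveats.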

When to use it

Build this when you need image generation inside a product and you do not want users staring at a spinner on a held-open connection: a marketing tool that drafts visuals, an app that generates thumbnails, an internal service that turns briefs into concepts. The submit-and-poll shape is the part that survives contact with real traffic, and it is worth getting right even for a prototype, because retrofitting async onto a synchronous endpoint is more work than building it async from the start. Keep the model call behind one function, keep the storage behind another, and the service outlives any single model's turn at the top of the leaderboard.

Full working example

python

import os
import uuid
from pathlib import Path

import requests
from fastapi import BackgroundTasks, FastAPI, HTTPException
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel

# Assumed endpoint and payload shape; check the current Reve API reference.
REVE_ENDPOINT = "https://api.reve.com/v1/image/generations"
OUTPUT_DIR = Path("./generated")
OUTPUT_DIR.mkdir(exist_ok=True)

app = FastAPI()
# Serve stored files so the "/images/..." URLs returned below resolve.
app.mount("/images", StaticFiles(directory=OUTPUT_DIR), name="images")

# In-memory job table: job_id -> status record. Lost on restart.
JOBS: dict[str, dict] = {}


class GenerateRequest(BaseModel):
    prompt: str
    width: int = 1024
    height: int = 1024


def generate_image(prompt: str, width: int, height: int) -> bytes:
    resp = requests.post(
        REVE_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ['REVE_API_KEY']}"},
        json={"model": "reve-2.0", "prompt": prompt,
              "width": width, "height": height},
        timeout=120,
    )
    resp.raise_for_status()
    url = resp.json()["data"][0]["url"]
    # Fail loudly if the image download itself errors.
    download = requests.get(url, timeout=60)
    download.raise_for_status()
    return download.content


def store_image(image_bytes: bytes) -> str:
    name = f"{uuid.uuid4().hex}.png"
    (OUTPUT_DIR / name).write_bytes(image_bytes)
    return f"/images/{name}"


def run_job(job_id: str, req: GenerateRequest) -> None:
    try:
        data = generate_image(req.prompt, req.width, req.height)
        JOBS[job_id] = {"status": "done", "url": store_image(data)}
    except Exception as exc:  # record the failure instead of dying silently
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/generate")
def submit(req: GenerateRequest, background: BackgroundTasks):
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    background.add_task(run_job, job_id, req)  # runs after the response is sent
    return {"job_id": job_id}


@app.get("/status/{job_id}")
def status(job_id: str):
    job = JOBS.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="unknown job")
    return job

Save the file as main.py, run it with uvicorn main:app --reload, POST a prompt to /generate, poll /status/{job_id}, and you have a text-to-image service built around Reve 2.0 that does not fall over the moment two requests arrive together.
