
Schema before prompt

A stage director can rehearse the play to perfection. The actors hit their marks. The lighting cues are timed to the breath. But if the prop master hands the actor a sword when the script calls for a letter, the scene collapses, and the audience walks out blaming the actor.

May 5, 2026 · By Tensor Labs

Introduction

That is the story behind most “the model isn’t following instructions” tickets.

The actor is reading the right script

The instinct, when an LLM-powered feature returns the wrong answer, is to look at the prompt. Maybe it needs to be more specific. Maybe the system message should be longer. Maybe a few-shot example would help. The team rewrites. Tests pass on the cherry-picked example. Six hours later, a new class of input breaks the feature again.

This pattern repeats because the prompt is the easy lever. It is the part of the system the engineer wrote, the part they understand, and the part they can change without a migration. Of course they reach for it.

But the model is rarely the misbehaving actor. It is reading the script you gave it. The problem is the props.

When a column holds three things

On a recent client project, one field in a production database had a generic name and a single declared type. In practice, it held free-text answers for some inputs, JSON arrays of selected options for others, and JSON blobs with nested objects for the rest. There was no discriminator. No type column to switch on. Nothing in the row indicated which of the three shapes the value would take.
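To make that concrete, here is roughly what such a column can look like, with a small shape sniffer. This is a minimal sketch: the sample values and the function name are invented for illustration, not taken from the client's data.

```python
import json

# Three values that could plausibly share one overloaded column.
# These rows are invented for illustration; nothing here is real client data.
samples = [
    "The export button does nothing on Safari",                    # free text
    '["too_slow", "crashes_on_upload"]',                           # JSON array of options
    '{"rating": 2, "details": {"os": "macOS", "build": "14.2"}}',  # nested JSON blob
]

def sniff_shape(value: str) -> str:
    """Guess which of the three shapes a raw column value takes."""
    try:
        parsed = json.loads(value)
    except (json.JSONDecodeError, TypeError):
        return "free_text"
    if isinstance(parsed, list):
        return "json_array"
    if isinstance(parsed, dict):
        return "json_object"
    return "free_text"  # bare numbers or quoted strings that happen to parse

for value in samples:
    print(f"{sniff_shape(value):11} <- {value}")
```

Three shapes, one column, and nothing in the row to tell a prompt which parser to be.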

The work was to write a prompt that classified errors based on the input. The request was reasonable. The data was not. No prompt, however carefully tuned, was going to read three different shapes from one column and consistently extract the same kind of feature. The model wasn’t failing. It was being asked to do a job the schema had made impossible.

A two-day migration to split the column into typed fields produced a bigger accuracy gain than a week of prompt iteration would have. The hard part of LLM-powered features is almost never the prompt. It is the data the prompt is asked to read.
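For a sense of what that migration involves, here is a minimal sketch, assuming a SQLite table named `responses` whose overloaded column is called `answer`. Every table and column name is hypothetical, and a real migration would add batching, verification, and an eventual drop of the old column.

```python
import json
import sqlite3

# Hypothetical schema: a `responses` table whose `answer` column holds all
# three shapes. Add a discriminator plus one typed column per shape.
conn = sqlite3.connect("app.db")
conn.executescript("""
    ALTER TABLE responses ADD COLUMN answer_kind TEXT;     -- discriminator
    ALTER TABLE responses ADD COLUMN answer_text TEXT;     -- free text
    ALTER TABLE responses ADD COLUMN answer_options TEXT;  -- JSON array
    ALTER TABLE responses ADD COLUMN answer_payload TEXT;  -- JSON object
""")

# Backfill: sniff each row's shape once, then record it explicitly.
rows = conn.execute("SELECT id, answer FROM responses").fetchall()
for row_id, raw in rows:
    try:
        parsed = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        parsed = None
    if isinstance(parsed, list):
        kind, column = "options", "answer_options"
    elif isinstance(parsed, dict):
        kind, column = "payload", "answer_payload"
    else:
        kind, column = "text", "answer_text"
    conn.execute(
        f"UPDATE responses SET answer_kind = ?, {column} = ? WHERE id = ?",
        (kind, raw, row_id),
    )
conn.commit()
```

With `answer_kind` in place, downstream code, prompts included, reads exactly one shape at a time instead of guessing.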

The lever you can actually pull

Most teams that complain their model isn’t following instructions have not opened a single sample of the data they are asking it to read. They have read the prompt fifty times. They have not looked at five rows of the raw input. (Anyone who has worked on LLM-powered features long enough has done this. So has every engineer they know.) The shape of the problem reveals itself in the data, not in the prompt log.
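Opening that sample is a five-minute job. A minimal sketch, again with hypothetical database, table, and column names:

```python
import sqlite3

# Before touching the prompt: look at five raw rows.
conn = sqlite3.connect("app.db")
for row_id, raw in conn.execute("SELECT id, answer FROM responses LIMIT 5"):
    print(row_id, repr(raw))
```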

To be fair, prompt choice does matter at the margins. With a clean schema, the difference between a sloppy prompt and a careful one is real. And small teams without ML backgrounds reasonably reach for prompts first because that is the lever they understand. None of this is wrong. It just isn’t where the gain lives.

The gain lives in the schema. In the rename of an overloaded column. In the discriminator field nobody added because the original developer didn’t know the data would later be read by an LLM. In the migration that should have happened last quarter.

Walk into the prop room

The director can rehearse all night. If the prop master is handing out the wrong objects, the play does not improve. Send the director home for the evening. Walk into the prop room. Look at what is actually on the shelves. That is where the show gets fixed.

Plenty of practitioners have spent afternoons rewriting prompts that should have been afternoons of writing migrations. The model was always going to be fine. The schema was the show.