Refinery

Iterative prompt improvement powered by BRAID.

Refinery takes a draft prompt and automatically improves it through repeated A/B evaluation cycles. It keeps refining until the improved version beats the original a set number of times in a row (3 by default) or the iteration limit is reached, with no manual tuning required.

Use Refinery when you want to:

  • Improve a prompt without manually rewriting it
  • Find the best-performing variation of a prompt through automated iteration
  • Apply BRAID-structured reasoning to any prompt

What Is BRAID?

BRAID (Bounded Reasoning for Autonomous Inference and Decisions) is a structured reasoning framework that adds decision flowcharts to prompts. It helps LLMs follow complex logic more reliably by breaking reasoning into explicit steps.

If your draft doesn't already use BRAID, Refinery generates a BRAID flowchart for it automatically before starting the improvement loop.

How It Works

  1. Select a draft — Pick the prompt draft you want to improve.
  2. Configure the model — Choose the execution model, reasoning effort, and text verbosity for prompt generation.
  3. Generate variables — Raison auto-generates sample variable values from the prompt template. You can also enter values manually.
  4. Set output schema (optional) — Define a JSON schema for structured output. Raison can auto-generate one from the prompt content.
  5. Start the refinement — Refinery runs an automated loop:

The Improvement Loop

┌─────────────────────────────────────────────────┐
│  1. Generate BRAID flowchart (if needed)        │
│  2. Evaluate original prompt (baseline score)   │
│  3. Generate improved BRAID version             │
│  4. A/B evaluate: improved vs. original         │
│                    │                            │
│              ┌─────┴─────┐                      │
│              ▼           ▼                      │
│           BRAID       Original                  │
│           wins        wins                      │
│              │           │                      │
│              ▼           ▼                      │
│          Streak      Improver rewrites          │
│            +1        the BRAID prompt           │
│              │           │                      │
│              ▼           ▼                      │
│          3 wins      Loop continues             │
│          in a        with new version           │
│          row?                                   │
│              │                                  │
│              ▼                                  │
│           Done ✓                                │
└─────────────────────────────────────────────────┘

  • The original prompt is evaluated once and the output is cached as a fixed baseline.
  • Each iteration evaluates the current BRAID version against the original.
  • If the BRAID version wins, the consecutive win streak increments.
  • If the BRAID version loses, the streak resets to zero and an improver model rewrites the BRAID prompt.
  • The loop stops when the BRAID version wins 3 consecutive times (configurable) or when the run reaches the maximum iteration limit.
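
The rules above can be sketched as a short Python loop. This is an illustrative sketch only, not Refinery's implementation: refine, braid_wins, improve, and LoopResult are hypothetical names, and the judge and improver are passed in as plain callables so the control flow can be read on its own.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    status: str        # "completed" or "max_iterations"
    iterations: int    # number of A/B evaluations performed
    final_prompt: str  # BRAID prompt held when the loop stopped


def refine(
    braid_prompt: str,
    braid_wins: Callable[[str], bool],
    improve: Callable[[str], str],
    wins_required: int = 3,
    max_iterations: int = 30,
) -> LoopResult:
    """Run A/B rounds until a win streak or the iteration cap is reached.

    braid_wins(prompt): hypothetical judge call, True when the BRAID prompt
    beats the cached baseline output of the original prompt.
    improve(prompt): hypothetical improver call that returns a rewrite.
    """
    streak = 0
    for iteration in range(1, max_iterations + 1):
        if braid_wins(braid_prompt):
            streak += 1
            if streak >= wins_required:
                return LoopResult("completed", iteration, braid_prompt)
        else:
            # A loss resets the streak and triggers a rewrite.
            streak = 0
            braid_prompt = improve(braid_prompt)
    return LoopResult("max_iterations", max_iterations, braid_prompt)
```

For example, a run whose rounds go win, loss, win, win, win completes on the fifth iteration, because the loss in round two resets the streak and the three wins that follow are consecutive.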

Monitoring Progress

The detail view shows:

  • Current prompts — The original draft and the latest BRAID version side by side.
  • Charts — Scores, cost, tokens, and duration across iterations.
  • Iteration list — Navigate to any iteration to see outputs, scores, judge feedback, and the BRAID prompt used.
  • Win streak — Current consecutive wins toward the target.

You can pause a running refinement and resume it later.

Configuration

Setting                   | Default    | Range                                  | Description
Consecutive wins required | 3          | 1–10                                   | How many consecutive wins before the refinement completes
Max iterations            | 30         | 1–100                                  | Upper limit on iteration count
Execution model           | gpt-5-nano | gpt-5-nano, gpt-5-mini, gpt-5, gpt-5.2 | Model used to execute prompts
Reasoning effort          | Medium     | Varies by model                        | How much the model reasons before responding
Text verbosity            | Medium     | Low, Medium, High                      | Controls response length

The judge and improver models are fixed at gpt-5.2 with high reasoning effort to ensure accurate scoring and high-quality rewrites.
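
To make the defaults and ranges concrete, here is a minimal sketch of the settings as a validated Python object. RefineryConfig and its field names are hypothetical and do not correspond to a documented Raison API; only the defaults and allowed values come from the table above. Reasoning effort is left unvalidated because its allowed values vary by model.

```python
from dataclasses import dataclass

EXECUTION_MODELS = ("gpt-5-nano", "gpt-5-mini", "gpt-5", "gpt-5.2")
VERBOSITY_LEVELS = ("low", "medium", "high")


@dataclass(frozen=True)
class RefineryConfig:
    """Hypothetical container for the settings in the Configuration table."""

    wins_required: int = 3
    max_iterations: int = 30
    execution_model: str = "gpt-5-nano"
    reasoning_effort: str = "medium"  # allowed values vary by model; not checked
    text_verbosity: str = "medium"

    def __post_init__(self) -> None:
        if not 1 <= self.wins_required <= 10:
            raise ValueError("wins_required must be between 1 and 10")
        if not 1 <= self.max_iterations <= 100:
            raise ValueError("max_iterations must be between 1 and 100")
        if self.execution_model not in EXECUTION_MODELS:
            raise ValueError(f"unknown execution model: {self.execution_model}")
        if self.text_verbosity not in VERBOSITY_LEVELS:
            raise ValueError(f"unknown text verbosity: {self.text_verbosity}")
```

Calling RefineryConfig() with no arguments yields the defaults, while a value outside a documented range, such as wins_required=11, raises ValueError.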

Refinement Lifecycle

Status              | Description
Pending             | Refinement created, waiting to start
Generating BRAID    | Building the BRAID flowchart for the prompt
Evaluating Original | Scoring the original prompt to establish a baseline
Running             | Iterating: generating improvements and evaluating
Paused              | Manually paused; can be resumed
Completed           | Improvement loop finished (consistent winner found)
Failed              | An error occurred during refinement
Max Iterations      | Hit the iteration limit without a consistent winner
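
If you track refinements in your own tooling, the statuses map naturally onto an enum. The sketch below is an assumption-laden illustration: RefinementStatus, is_terminal, and can_resume are hypothetical names, and treating Completed, Failed, and Max Iterations as end states is an inference from their descriptions rather than something this page states.

```python
from enum import Enum


class RefinementStatus(Enum):
    PENDING = "Pending"
    GENERATING_BRAID = "Generating BRAID"
    EVALUATING_ORIGINAL = "Evaluating Original"
    RUNNING = "Running"
    PAUSED = "Paused"
    COMPLETED = "Completed"
    FAILED = "Failed"
    MAX_ITERATIONS = "Max Iterations"


# Assumption: these statuses end a refinement; the docs do not say so explicitly.
TERMINAL_STATUSES = frozenset(
    {
        RefinementStatus.COMPLETED,
        RefinementStatus.FAILED,
        RefinementStatus.MAX_ITERATIONS,
    }
)


def is_terminal(status: RefinementStatus) -> bool:
    """True when no further iterations are expected for this status."""
    return status in TERMINAL_STATUSES


def can_resume(status: RefinementStatus) -> bool:
    """Only a manually paused refinement is described as resumable."""
    return status is RefinementStatus.PAUSED
```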

Billing

Refinery iterations consume AI Prompt Builder (BRAID) credits. Each iteration counts toward your BRAID message quota for the billing period.

Plan       | BRAID messages / seat
Free       |
Team       | 100
Team Plus  | 1,000
Enterprise | Unlimited

When your BRAID message limit is reached, Refinery iterations will be blocked until the next billing period or until you upgrade your plan.
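
As a rough illustration of the quota rule, the sketch below checks whether another iteration fits within the limit. It rests on two assumptions that this page does not confirm: that per-seat allowances are pooled across seats, and that each iteration consumes exactly one BRAID message. The function and constant names are hypothetical, and the Free plan is omitted because no figure is listed for it.

```python
from typing import Dict, Optional

# Per-seat limits from the table above; None stands for "Unlimited".
BRAID_MESSAGES_PER_SEAT: Dict[str, Optional[int]] = {
    "Team": 100,
    "Team Plus": 1000,
    "Enterprise": None,
}


def iteration_allowed(plan: str, used_this_period: int, seats: int = 1) -> bool:
    """Return True if one more iteration fits in the (assumed pooled) quota."""
    if plan not in BRAID_MESSAGES_PER_SEAT:
        raise KeyError(f"no BRAID message limit listed for plan: {plan}")
    per_seat = BRAID_MESSAGES_PER_SEAT[plan]
    if per_seat is None:
        return True  # unlimited plan
    return used_this_period < per_seat * seats
```

Under these assumptions, a single-seat Team workspace that has used 100 messages is blocked, while the same usage on a two-seat workspace still has room.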

Access

Refinery is available on Team, Team Plus, and Enterprise plans.