AfterImage

Generate synthetic multi-turn chat datasets from your documents. Point AfterImage at a corpus — PDFs, markdown files, internal wikis — and get SFT-ready JSONL out.

Open Source

AfterImage is free and open source. View on GitHub →

Why AfterImage

When you fine-tune a model on a specific domain, you usually have the documents but not the conversations. Generating flat Q&A pairs is easy — but models trained on single-turn data sound like templates. They can't handle follow-ups, clarifications, or how users actually navigate a topic.

AfterImage generates multi-turn dialogues grounded strictly in your source documents. A simulated user (Correspondent) and a simulated assistant (Respondent) hold a real back-and-forth, both anchored to context chunks from your corpus.

Honest caveats:

  • Quality degrades past 5–6 turns — keep conversations short
  • Weak models (< 70B class) generate repetitive loops — use GPT-4o, Claude 3.5, or a capable local 70B
  • This doesn't replace frameworks like distilabel for general pipelines — it's a narrow tool for one thing

Installation

bash
pip install afterimage

Python 3.11+ required.

Optional extras:

bash
pip install "afterimage[embeddings]"   # quality gating via embeddings
pip install "afterimage[server]"       # FastAPI / Gradio UI
pip install "afterimage[training]"     # training tool integrations

Quick Start

CLI

bash
# Generate conversations from a folder of documents
afterimage generate --documents ./my-docs/ --num-dialogs 100

# Export to ShareGPT format for Axolotl / Unsloth
afterimage export --format sharegpt --output dataset.jsonl

# Push to HuggingFace Hub
afterimage push --repo your-org/your-dataset

Python API

python
from afterimage import ConversationGenerator
from afterimage.callbacks import ContextualInstructionGeneratorCallback

generator = ConversationGenerator(
    respondent_prompt="You are a helpful assistant for {domain} questions.",
    api_key="your-api-key",
    model_name="gpt-4o",
)

dialogs = await generator.generate(
    num_dialogs=50,
    max_turns=4,
    max_concurrency=10,
)
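generate returns the dialogs in memory; a small helper for persisting them as JSONL can sit alongside it. This is a sketch assuming each dialog serializes to a JSON-compatible dict — the exact dialog object structure is not documented on this page.

```python
import json
from pathlib import Path

def save_jsonl(records: list[dict], path: str) -> int:
    """Write one JSON object per line; return the number of records written."""
    out = Path(path)
    with out.open("w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")
    return len(records)

# Example with placeholder dialog dicts (real dialogs would carry turns)
n = save_jsonl([{"messages": []}, {"messages": []}], "dialogs.jsonl")
print(n)  # 2
```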

Conversation Generation

The ConversationGenerator class drives the dual-agent loop.

Initialization

python
from afterimage import ConversationGenerator
from afterimage.callbacks import (
    ContextualInstructionGeneratorCallback,
    PersonaInstructionGeneratorCallback,
    WithContextRespondentPromptModifier,
)

generator = ConversationGenerator(
    respondent_prompt="You are a technical support agent for {product}.",
    api_key="sk-...",
    model_name="gpt-4o",
    instruction_generator=ContextualInstructionGeneratorCallback(),
    prompt_modifier=WithContextRespondentPromptModifier(),
)

Generation Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| num_dialogs | | How many conversations to generate |
| max_turns | 4 | Max turns per conversation (keep ≤ 5) |
| max_concurrency | 10 | Parallel generation workers |
| stopping_criteria | None | Function to end a conversation early |
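The signature of stopping_criteria is not spelled out here; as a sketch, assume it is a callable that receives the messages so far and returns True to end the conversation early (both the signature and the message structure are assumptions, not confirmed by this page):

```python
def stop_when_user_says_thanks(messages: list[dict]) -> bool:
    """End the dialog once the simulated user signals closure.

    Assumes each message is a dict with "role" and "content" keys --
    the structure AfterImage actually passes is an assumption here.
    """
    if not messages:
        return False
    last = messages[-1]
    return last["role"] == "user" and "thanks" in last["content"].lower()

# Example: the criterion fires on a closing "thanks"
history = [
    {"role": "user", "content": "How do I export to ShareGPT?"},
    {"role": "assistant", "content": "Use the export command."},
    {"role": "user", "content": "Got it, thanks!"},
]
print(stop_when_user_says_thanks(history))  # True
```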

Instruction Generators (Callbacks)

These control how the simulated user (Correspondent) is prompted at each turn.

| Callback | Use when |
| --- | --- |
| ContextualInstructionGeneratorCallback | You want grounded, document-aware user questions |
| PersonaInstructionGeneratorCallback | You want persona-varied users (different styles, expertise levels) |
| ToolCallingInstructionGeneratorCallback | You're building tool-calling / function-call datasets |

Prompt Modifiers

These inject context into the Respondent (assistant) prompt.

| Modifier | Use when |
| --- | --- |
| WithContextRespondentPromptModifier | You want the assistant grounded in document chunks |
| WithRAGRespondentPromptModifier | You have a Qdrant vector store and want live retrieval |

Persona Variation

Without persona variation, your synthetic users all sound like the same prompt engineer. PersonaGenerator builds a diverse pool from your documents so conversations reflect different expertise levels, roles, and communication styles.

python
from afterimage import PersonaGenerator

persona_gen = PersonaGenerator(
    api_key="sk-...",
    model_name="gpt-4o",
)

# Generate personas from your corpus
personas = await persona_gen.generate_from_documents(
    documents=["./docs/"],
    target_count=50,
)

Then pass them to the instruction generator:

python
from afterimage.callbacks import PersonaInstructionGeneratorCallback

generator = ConversationGenerator(
    respondent_prompt="...",
    api_key="sk-...",
    model_name="gpt-4o",
    instruction_generator=PersonaInstructionGeneratorCallback(personas=personas),
)

How it works: AfterImage generates exactly 5 personas per LLM call, deduplicates, and retries. The pool size formula is S(n) = 5(5^(n+1) - 1) / 4. Deeper layers are pruned when oversupplied and depth-weighted when undersupplied.
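The pool-size formula above can be checked directly: with 5 personas per LLM call and one additional layer per depth level, the cumulative pool is a geometric series.

```python
def pool_size(n: int) -> int:
    """Cumulative persona pool after n+1 layers of 5 personas per call:
    S(n) = 5 + 5^2 + ... + 5^(n+1) = 5 * (5^(n+1) - 1) / 4.
    """
    return 5 * (5 ** (n + 1) - 1) // 4

for depth in range(3):
    print(depth, pool_size(depth))
# 0 5
# 1 30
# 2 155
```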

Export

AfterImage exports to 8 formats, covering every major fine-tuning framework.

CLI Export

bash
afterimage export --format sharegpt --output dataset.jsonl
afterimage export --format alpaca --output alpaca.jsonl
afterimage export --format dpo --output dpo.jsonl

Supported Formats

| Format | Use with |
| --- | --- |
| sharegpt | Axolotl, LLaMA-Factory |
| alpaca | General instruction tuning |
| messages | OpenAI fine-tuning API |
| oumi | Oumi training framework |
| llama_factory | LLaMA-Factory |
| openai | OpenAI Batch fine-tuning |
| dpo | DPO / RLHF / ORPO preference training |
| raw | Custom pipelines |
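For reference, a ShareGPT-style record stores each conversation as a list of from/value turns. A minimal sketch writing and re-parsing one record follows; the field names match the common ShareGPT convention, and the exact keys AfterImage emits are an assumption.

```python
import json

record = {
    "conversations": [
        {"from": "human", "value": "What does the export command do?"},
        {"from": "gpt", "value": "It writes dialogs to JSONL in the chosen format."},
    ]
}

# One JSON object per line -- the JSONL shape trainers such as Axolotl expect
line = json.dumps(record)
parsed = json.loads(line)
assert all(t["from"] in {"human", "gpt", "system"} for t in parsed["conversations"])
print(parsed["conversations"][0]["from"])  # human
```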

Axolotl Example

yaml
# axolotl.yaml
datasets:
  - path: dataset.jsonl
    type: sharegpt
    conversation: chatml

Unsloth Example

python
from datasets import load_dataset

dataset = load_dataset("json", data_files="dataset.jsonl")

Push to HuggingFace Hub

bash
afterimage push \
  --repo your-org/your-dataset \
  --format sharegpt \
  --private

Local Models

AfterImage works with any OpenAI-compatible endpoint — no API key needed for local generation.

vLLM

bash
# Start vLLM server
vllm serve Qwen/Qwen3-1.7B --port 8000

python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model_name="Qwen/Qwen3-1.7B",
    provider="local",
)

Ollama

bash
ollama pull llama3.2
ollama serve

python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_name="llama3.2",
)

llama.cpp

bash
./llama-server -m model.gguf --port 8000

python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model_name="local-model",
)

Tips for local generation:

  • Lower max_concurrency (2–4) for CPU inference
  • Keep max_turns at 2–3 for smaller models
  • Avoid models below 7B — conversations collapse into repetition past turn 2

Supported Providers

| Provider | Setup |
| --- | --- |
| OpenAI | api_key="sk-..." |
| Gemini | api_key="...", model_name="gemini-1.5-pro" |
| DeepSeek | base_url="https://api.deepseek.com/v1" |
| OpenRouter | base_url="https://openrouter.ai/api/v1" |
| Local | vLLM / Ollama / llama.cpp |

API Key Rotation

For high-volume generation, use SmartKeyPool to rotate across multiple keys:

python
from afterimage import SmartKeyPool

pool = SmartKeyPool(keys=["sk-1", "sk-2", "sk-3"])

generator = ConversationGenerator(
    respondent_prompt="...",
    api_key=pool,
    model_name="gpt-4o",
)
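SmartKeyPool's internals aren't documented here; the core idea, spreading requests across keys so no single key absorbs all traffic, can be sketched with a simple round-robin iterator. This is an illustrative stand-in, not the library's implementation.

```python
from itertools import cycle

class RoundRobinKeys:
    """Hand out API keys in rotation; a toy stand-in for a key pool."""

    def __init__(self, keys: list[str]):
        if not keys:
            raise ValueError("at least one key required")
        self._cycle = cycle(keys)

    def next_key(self) -> str:
        return next(self._cycle)

pool = RoundRobinKeys(["sk-1", "sk-2", "sk-3"])
print([pool.next_key() for _ in range(4)])  # ['sk-1', 'sk-2', 'sk-3', 'sk-1']
```

A production pool would also track per-key rate limits and back off on errors, which is presumably what the "Smart" in SmartKeyPool refers to.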

Preference Data (DPO / RLHF)

Generate chosen/rejected pairs for preference training:

bash
afterimage preference \
  --documents ./docs/ \
  --num-pairs 200 \
  --output dpo_dataset.jsonl

python
from afterimage import PreferenceGenerator

pref_gen = PreferenceGenerator(
    api_key="sk-...",
    model_name="gpt-4o",
)

pairs = await pref_gen.generate(
    documents=["./docs/"],
    num_pairs=200,
)
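A preference record pairs one prompt with a preferred and a dispreferred response. The sketch below uses the conventional DPO field names (prompt/chosen/rejected); the exact keys in AfterImage's output are an assumption.

```python
import json

pair = {
    "prompt": "Summarize the installation steps.",
    "chosen": "Install with pip install afterimage on Python 3.11+.",
    "rejected": "Just download it somewhere and run it.",
}

# Each preference pair becomes one JSONL line in the exported dataset
line = json.dumps(pair)
print(sorted(json.loads(line)))  # ['chosen', 'prompt', 'rejected']
```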

Resources

Your model. Not theirs.