# AfterImage
Generate synthetic multi-turn chat datasets from your documents. Point AfterImage at a corpus — PDFs, markdown files, internal wikis — and get SFT-ready JSONL out.
## Open Source

AfterImage is free and open source. View on GitHub →
## Why AfterImage
When you fine-tune a model on a specific domain, you usually have the documents but not the conversations. Generating flat Q&A pairs is easy — but models trained on single-turn data sound like templates. They can't handle follow-ups, clarifications, or how users actually navigate a topic.
AfterImage generates multi-turn dialogues grounded strictly in your source documents. A simulated user (Correspondent) and a simulated assistant (Respondent) hold a real back-and-forth, both anchored to context chunks from your corpus.
Honest caveats:
- Quality degrades past 5–6 turns — keep conversations short
- Weak models (< 70B class) generate repetitive loops — use GPT-4o, Claude 3.5, or a capable local 70B
- This doesn't replace frameworks like `distilabel` for general pipelines — it's a narrow tool for one thing
## Installation
```shell
pip install afterimage
```

Python 3.11+ required.
Optional extras:

```shell
pip install "afterimage[embeddings]"  # quality gating via embeddings
pip install "afterimage[server]"      # FastAPI / Gradio UI
pip install "afterimage[training]"    # training tool integrations
```

## Quick Start
### CLI
```shell
# Generate conversations from a folder of documents
afterimage generate --documents ./my-docs/ --num-dialogs 100

# Export to ShareGPT format for Axolotl / Unsloth
afterimage export --format sharegpt --output dataset.jsonl

# Push to HuggingFace Hub
afterimage push --repo your-org/your-dataset
```

### Python API
```python
from afterimage import ConversationGenerator

generator = ConversationGenerator(
    respondent_prompt="You are a helpful assistant for {domain} questions.",
    api_key="your-api-key",
    model_name="gpt-4o",
)

# generate() is async — call it from an async context (e.g. asyncio.run)
dialogs = await generator.generate(
    num_dialogs=50,
    max_turns=4,
    max_concurrency=10,
)
```

## Conversation Generation
The `ConversationGenerator` class drives the dual-agent loop.
### Initialization
```python
from afterimage import ConversationGenerator
from afterimage.callbacks import (
    ContextualInstructionGeneratorCallback,
    WithContextRespondentPromptModifier,
)

generator = ConversationGenerator(
    respondent_prompt="You are a technical support agent for {product}.",
    api_key="sk-...",
    model_name="gpt-4o",
    instruction_generator=ContextualInstructionGeneratorCallback(),
    prompt_modifier=WithContextRespondentPromptModifier(),
)
```

### Generation Parameters
| Parameter | Default | Description |
|---|---|---|
| `num_dialogs` | — | How many conversations to generate |
| `max_turns` | 4 | Max turns per conversation (keep ≤ 5) |
| `max_concurrency` | 10 | Parallel generation workers |
| `stopping_criteria` | `None` | Function to end a conversation early |
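For illustration, a stopping criterion might look like the sketch below. Note that the callback signature assumed here — a list of `{"role", "content"}` message dicts returning a bool — is a guess for illustration, not AfterImage's documented API:

```python
# Illustrative stopping criterion: end a dialog once the simulated user
# repeats an earlier question verbatim. NOTE: the assumed signature
# (list of {"role", "content"} dicts -> bool) is hypothetical.
def stop_on_repetition(messages):
    user_turns = [m["content"] for m in messages if m["role"] == "user"]
    # Stop when the latest user turn duplicates any earlier one.
    return len(user_turns) >= 2 and user_turns[-1] in user_turns[:-1]
```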
### Instruction Generators (Callbacks)
These control how the simulated user (Correspondent) is prompted at each turn.
| Callback | Use when |
|---|---|
| `ContextualInstructionGeneratorCallback` | You want grounded, document-aware user questions |
| `PersonaInstructionGeneratorCallback` | You want persona-varied users (different styles, expertise levels) |
| `ToolCallingInstructionGeneratorCallback` | You're building tool-calling / function-call datasets |
### Prompt Modifiers
These inject context into the Respondent (assistant) prompt.
| Modifier | Use when |
|---|---|
| `WithContextRespondentPromptModifier` | You want the assistant grounded in document chunks |
| `WithRAGRespondentPromptModifier` | You have a Qdrant vector store and want live retrieval |
## Persona Variation
Without persona variation, your synthetic users all sound like the same prompt engineer. `PersonaGenerator` builds a diverse pool from your documents so conversations reflect different expertise levels, roles, and communication styles.
```python
from afterimage import PersonaGenerator

persona_gen = PersonaGenerator(
    api_key="sk-...",
    model_name="gpt-4o",
)

# Generate personas from your corpus
personas = await persona_gen.generate_from_documents(
    documents=["./docs/"],
    target_count=50,
)
```

Then pass them to the instruction generator:
```python
from afterimage.callbacks import PersonaInstructionGeneratorCallback

generator = ConversationGenerator(
    respondent_prompt="...",
    api_key="sk-...",
    model_name="gpt-4o",
    instruction_generator=PersonaInstructionGeneratorCallback(personas=personas),
)
```

How it works: AfterImage generates exactly 5 personas per LLM call, deduplicates, and retries. The pool size formula is S(n) = 5(5^(n+1) − 1) / 4. Deeper layers are pruned when oversupplied and depth-weighted when undersupplied.
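The closed form is just a geometric series. Assuming, as the formula implies, that layer k of the expansion holds 5^(k+1) personas (5 per call, branching five ways) and S(n) sums layers 0 through n, it can be checked in a few lines:

```python
def persona_pool_size(n: int) -> int:
    """Closed form S(n) = 5 * (5**(n+1) - 1) / 4."""
    return 5 * (5 ** (n + 1) - 1) // 4

# Brute-force check against summing the layers directly:
# layer k contributes 5**(k+1) personas.
for n in range(6):
    assert persona_pool_size(n) == sum(5 ** (k + 1) for k in range(n + 1))
```

So one layer of expansion (n = 0) yields 5 personas, two layers yield 30, three yield 155 — the pool grows fast, which is why oversupplied layers get pruned.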
## Export
AfterImage outputs to 8 formats, covering every major fine-tuning framework.
### CLI Export
```shell
afterimage export --format sharegpt --output dataset.jsonl
afterimage export --format alpaca --output alpaca.jsonl
afterimage export --format dpo --output dpo.jsonl
```

### Supported Formats
| Format | Use with |
|---|---|
| `sharegpt` | Axolotl, LLaMA-Factory |
| `alpaca` | General instruction tuning |
| `messages` | OpenAI fine-tuning API |
| `oumi` | Oumi training framework |
| `llama_factory` | LLaMA-Factory |
| `openai` | OpenAI Batch fine-tuning |
| `dpo` | DPO / RLHF / ORPO preference training |
| `raw` | Custom pipelines |
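As a reference point, here is what single records in two of these formats conventionally look like. The field names below follow the common community conventions (Axolotl's sharegpt loader, the original Alpaca schema); AfterImage's exact output fields may differ slightly:

```python
import json

# Conventional single-record shapes (community conventions, not
# AfterImage's confirmed schema):
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "What does error E42 mean?"},
        {"from": "gpt", "value": "E42 indicates a failed license check."},
    ]
}
alpaca_record = {
    "instruction": "Explain error code E42.",
    "input": "",
    "output": "E42 indicates a failed license check.",
}

# JSONL: one JSON object per line.
jsonl = "\n".join(json.dumps(r) for r in (sharegpt_record, alpaca_record))
```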
### Axolotl Example
```yaml
# axolotl.yaml
datasets:
  - path: dataset.jsonl
    type: sharegpt
    conversation: human
```

### Unsloth Example
```python
from datasets import load_dataset

dataset = load_dataset("json", data_files="dataset.jsonl")
```

### Push to HuggingFace Hub
```shell
afterimage push \
  --repo your-org/your-dataset \
  --format sharegpt \
  --private
```

## Local Models
AfterImage works with any OpenAI-compatible endpoint — no API key needed for local generation.
### vLLM
```shell
# Start vLLM server
vllm serve Qwen/Qwen3-1.7B --port 8000
```

```python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model_name="Qwen/Qwen3-1.7B",
    provider="local",
)
```

### Ollama
```shell
ollama pull llama3.2
ollama serve
```

```python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model_name="llama3.2",
)
```

### llama.cpp
```shell
./llama-server -m model.gguf --port 8000
```

```python
generator = ConversationGenerator(
    respondent_prompt="...",
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
    model_name="local-model",
)
```

Tips for local generation:

- Lower `max_concurrency` (2–4) for CPU inference
- Keep `max_turns` at 2–3 for smaller models
- Avoid models below 7B — conversations collapse into repetition past turn 2
## Supported Providers
| Provider | Setup |
|---|---|
| OpenAI | `api_key="sk-..."` |
| Gemini | `api_key="..."`, `model_name="gemini-1.5-pro"` |
| DeepSeek | `base_url="https://api.deepseek.com/v1"` |
| OpenRouter | `base_url="https://openrouter.ai/api/v1"` |
| Local | vLLM / Ollama / llama.cpp |
## API Key Rotation
For high-volume generation, use `SmartKeyPool` to rotate across multiple keys:
```python
from afterimage import SmartKeyPool

pool = SmartKeyPool(keys=["sk-1", "sk-2", "sk-3"])

generator = ConversationGenerator(
    respondent_prompt="...",
    api_key=pool,
    model_name="gpt-4o",
)
```

## Preference Data (DPO / RLHF)
Generate chosen/rejected pairs for preference training:
```shell
afterimage preference \
  --documents ./docs/ \
  --num-pairs 200 \
  --output dpo_dataset.jsonl
```

```python
from afterimage import PreferenceGenerator

pref_gen = PreferenceGenerator(
    api_key="sk-...",
    model_name="gpt-4o",
)

pairs = await pref_gen.generate(
    documents=["./docs/"],
    num_pairs=200,
)
```
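A resulting JSONL line typically holds one prompt with a preferred and a dispreferred completion. The field names below are the common DPO convention (as used by TRL's `DPOTrainer`), not AfterImage's confirmed schema — inspect the exported file for the exact layout:

```python
import json

# Conventional DPO preference-pair record (prompt/chosen/rejected):
dpo_record = {
    "prompt": "How do I rotate my API key?",
    "chosen": "Revoke the old key under Settings, then issue a new one.",
    "rejected": "Just keep using the old key.",
}
line = json.dumps(dpo_record)  # one JSON object per JSONL line
```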
)