Artifact formats

An Artifact is the deliverable a stage produces. It is typed, previewable, downloadable, shareable, and chainable, and it comes in one of seven formats: markdown, code, html, json, csv, image, and file.

Why typing the output matters. "Write me a summary" and "write me a summary as Markdown with a frontmatter block listing 5 key facts as a YAML array" produce wildly different results. The format declaration teaches the agent what success looks like. It also enables the preview pane to do the right thing: render Markdown, lint JSON against a schema, syntax-highlight code, validate CSV columns.

How the agent produces an artifact

There are two extraction paths in the codebase, and they look superficially similar but apply in different contexts. Both end up at the same row in app_artifact; pick the one that matches where you are.

Path A: App stages (markdown-section extraction)

When a stage runs as part of an AppExecution, the executor builds a system prompt that ends with an ## Expected Outputs block, listing each artifact_defs entry as ### {{title}} with its format and description. The agent's response is then scanned by extractArtifactContent with this priority:

1. Match a markdown heading whose text equals the artifact title (case-insensitive, # through ####): take everything until the next heading of the same or a higher level (the same number of # characters or fewer).
2. For structured formats (code, json, html, csv): fall back to the first language-matched fenced code block.
3. If the stage declares exactly one artifact: fall back to the entire response text.

That's why the example below uses ## Stripe: Company Brief as the section heading rather than an XML tag: it's what the matcher actually looks for.

## Stripe: Company Brief

### Business model
Stripe runs payment infrastructure as a service...

### Recent moves
...
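The three-step priority described above can be sketched as follows. This is a minimal illustration, not the actual extractArtifactContent: the ArtifactDef shape, the helper names, and the defsInStage parameter are assumptions, and headings that appear inside fenced code are not handled.

```typescript
// Hypothetical def shape; only these fields matter for the sketch.
interface ArtifactDef {
  title: string;
  format: string; // "markdown", "code", "json", "html", "csv", ...
  language?: string; // used for code artifacts
}

const STRUCTURED_FORMATS = ["code", "json", "html", "csv"];

// Step 1: find a heading (# through ####) whose text equals the title, then
// take lines until the next heading of the same or a higher level.
function extractSection(response: string, title: string): string | null {
  const lines = response.split("\n");
  const wanted = title.trim().toLowerCase();
  for (let i = 0; i < lines.length; i++) {
    const m = /^(#{1,4})\s+(.*?)\s*$/.exec(lines[i]);
    if (m === null || m[2].toLowerCase() !== wanted) continue;
    const depth = m[1].length;
    const body: string[] = [];
    for (let j = i + 1; j < lines.length; j++) {
      const next = /^(#{1,6})\s/.exec(lines[j]);
      if (next !== null && next[1].length <= depth) break;
      body.push(lines[j]);
    }
    return body.join("\n").trim();
  }
  return null;
}

// Step 2: first fenced code block whose info string equals the language.
function extractFence(response: string, lang: string): string | null {
  const lines = response.split("\n");
  let start = -1;
  for (let i = 0; i < lines.length; i++) {
    const trimmed = lines[i].trim();
    if (start === -1) {
      if (trimmed.toLowerCase() === "```" + lang.toLowerCase()) start = i + 1;
    } else if (trimmed === "```") {
      return lines.slice(start, i).join("\n");
    }
  }
  return null;
}

// Steps 1 to 3 in priority order; null means nothing would be persisted.
function extractArtifactContent(
  response: string,
  def: ArtifactDef,
  defsInStage: number
): string | null {
  const section = extractSection(response, def.title);
  if (section !== null) return section;
  if (STRUCTURED_FORMATS.indexOf(def.format) !== -1) {
    const lang = def.format === "code" ? (def.language || "") : def.format;
    const fenced = extractFence(response, lang);
    if (fenced !== null) return fenced;
  }
  return defsInStage === 1 ? response : null;
}

const sampleResponse = [
  "Here is the brief.",
  "",
  "## Stripe: Company Brief",
  "",
  "### Business model",
  "Stripe runs payment infrastructure as a service...",
  "",
  "## Notes",
  "Not part of the artifact.",
].join("\n");

const briefContent = extractArtifactContent(
  sampleResponse,
  { title: "stripe: company brief", format: "markdown" },
  2
);
```

In this sketch the section ends at ## Notes, because that heading is at the same level as the matched one; the ### subsection stays inside the artifact.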

Each saved artifact gets an S3 key app-artifacts/{userId}/{executionId}/{stageId}/{defId}, a generated filename {appSlug}_{stageSlug}_{defSlug}.{ext}, and a MIME type derived from the format (text/markdown, application/json, text/csv, text/html, etc.). For code, the extension follows the language field on the def (python to .python, ts to .ts); for everything else, a fixed map applies. The raw content is also stored on the app_artifact row, with S3 as a fallback for large bodies.

What doesn't work in app stages. There is no {{ stages.<id>.<artifactId> }} reference inside the goal; the goal templater only substitutes form inputs. Prior-stage artifacts reach the next stage two other ways: (1) files left on the shared sandbox filesystem, and (2) the auto-injected ## Previous Stage Outputs block at the top of the system prompt.
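To make the templating rule concrete, here is a minimal sketch of a goal templater that substitutes form inputs only. The function name renderGoal and the choice to leave unknown placeholders untouched are assumptions; the real templater may treat them differently.

```typescript
// Hypothetical templater: replaces {{ name }} with the matching form input.
// Placeholders it does not recognise, including {{ stages.x.y }}, are left as-is.
function renderGoal(goal: string, inputs: { [key: string]: string }): string {
  return goal.replace(
    /\{\{\s*([A-Za-z_][\w.]*)\s*\}\}/g,
    (match: string, key: string) =>
      Object.prototype.hasOwnProperty.call(inputs, key) ? inputs[key] : match
  );
}

const renderedGoal = renderGoal(
  "Research {{ company }} using {{ stages.find.leads }}.",
  { company: "Stripe" }
);
// renderedGoal === "Research Stripe using {{ stages.find.leads }}."
```

The unresolved placeholder would reach the agent as literal text, which is why the goal should describe prior outputs in words instead.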

Path B: Chat stream (XML-tag extraction)

In free-form chat, the agent emits artifacts as XML tags directly in its output stream. The streaming parser (artifact/parser.ts) is chunk-boundary safe and reads three attributes:

<artifact title="Stripe: Company Brief" type="markdown" language="">
# Stripe: Company Brief

## Business model
Stripe runs payment infrastructure as a service...
</artifact>

Recognized attributes are title, type, and language. There is no id or format attribute on the chat-stream tag; the parser ignores anything else. An unclosed tag at end-of-stream is surfaced as literal text rather than silently dropped.
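The behavior described above (chunk-boundary safety, three recognized attributes, unclosed tags surfaced as text) can be sketched as a small buffering parser. This is an illustration under stated assumptions, not the code in artifact/parser.ts: the class name, the event shape, and the trimming of artifact content are all hypothetical.

```typescript
type StreamEvent =
  | { kind: "text"; text: string }
  | { kind: "artifact"; title: string; type: string; language: string; content: string };

const OPEN = "<artifact";
const CLOSE = "</artifact>";

// Read one attribute value from an opening tag; other attributes are ignored.
function attr(tag: string, attrName: string): string {
  const m = new RegExp("\\s" + attrName + '="([^"]*)"').exec(tag);
  return m === null ? "" : m[1];
}

// Length of the longest suffix of `text` that is a proper prefix of OPEN.
function partialOpenLength(text: string): number {
  const max = Math.min(OPEN.length - 1, text.length);
  for (let n = max; n > 0; n--) {
    if (text.slice(text.length - n) === OPEN.slice(0, n)) return n;
  }
  return 0;
}

class ArtifactStreamParser {
  private buffer = "";
  private openTag: string | null = null; // set while inside an artifact

  push(chunk: string): StreamEvent[] {
    this.buffer += chunk;
    const events: StreamEvent[] = [];
    for (;;) {
      if (this.openTag === null) {
        const start = this.buffer.indexOf(OPEN);
        if (start === -1) {
          // Hold back a possible partial "<artifact" at the end of the buffer.
          const keep = partialOpenLength(this.buffer);
          const text = this.buffer.slice(0, this.buffer.length - keep);
          if (text !== "") events.push({ kind: "text", text: text });
          this.buffer = this.buffer.slice(this.buffer.length - keep);
          return events;
        }
        if (start > 0) events.push({ kind: "text", text: this.buffer.slice(0, start) });
        this.buffer = this.buffer.slice(start);
        const tagEnd = this.buffer.indexOf(">");
        if (tagEnd === -1) return events; // opening tag still incomplete
        this.openTag = this.buffer.slice(0, tagEnd + 1);
        this.buffer = this.buffer.slice(tagEnd + 1);
      } else {
        const close = this.buffer.indexOf(CLOSE);
        if (close === -1) return events; // wait for the closing tag
        events.push({
          kind: "artifact",
          title: attr(this.openTag, "title"),
          type: attr(this.openTag, "type"),
          language: attr(this.openTag, "language"),
          content: this.buffer.slice(0, close).trim(),
        });
        this.buffer = this.buffer.slice(close + CLOSE.length);
        this.openTag = null;
      }
    }
  }

  // At end-of-stream, anything unfinished is surfaced as literal text.
  end(): StreamEvent[] {
    const rest = (this.openTag === null ? "" : this.openTag) + this.buffer;
    this.buffer = "";
    this.openTag = null;
    return rest === "" ? [] : [{ kind: "text", text: rest }];
  }
}

const demoParser = new ArtifactStreamParser();
const demoEvents = demoParser
  .push('Intro <artifact title="Brief" type="markdown">')
  .concat(demoParser.push("# Brief\n</artifact>"), demoParser.end());
```

The sketch buffers artifact content until the closing tag arrives, which keeps the logic short; a production parser would more likely stream partial content to the UI as it comes in.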

Picking the right format

Output is…	Use
A document a human reads	`markdown`
Data another machine consumes	`json` or `csv`
A page or dashboard	`html`
Source code (one file or a tree)	`code`
A picture or diagram	`image`
Anything else (PDF, ZIP, audio, video)	`file`

You set the format on the stage's artifact_defs entry. The agent will conform its output to that type or fail loudly if it can't, which is what you want.

markdown (most common)

Reports, summaries, briefs, documentation, drafts. The default for prose. Roughly 60% of artifacts in production are markdown.

Preview

Rendered with standard CommonMark + GFM extensions: tables, task lists, code fences with syntax highlighting. Math (KaTeX) and Mermaid diagrams render inline. Click any heading for a stable anchor link.

Export

.md: raw markdown source.
PDF: server-rendered with the workspace's brand stylesheet.
DOCX: Pandoc conversion; styles map to Word headings.

Chaining

Feed a markdown artifact into the next stage when the next step is summarization, translation, or restructuring the same content. The receiving stage can read the markdown verbatim or extract sections using prompt instructions.

code (single or tree)

Source files in any language. Used by Apps that generate scripts, configuration, migrations, tests, or whole project skeletons.

Preview

Syntax-highlighted view. The language is detected from the file extension or specified on the stage's artifact_def.language. Line numbers, copy button, fold blocks.

Multi-file output

A single code artifact can hold a tree of files, which is useful when the App generates a small project rather than a single file:

<artifact title="project" type="code">
<file path="src/main.ts">
import { greet } from './greet';
console.log(greet('world'));
</file>
<file path="src/greet.ts">
export const greet = (name: string) => `Hello, ${name}!`;
</file>
<file path="package.json">
{ "name": "demo", "version": "0.0.1" }
</file>
</artifact>

The preview shows the file tree on the left, selected file on the right. Download as a single file or as a ZIP of the whole tree.

html (self-contained pages)

Standalone web pages, dashboards, slide decks, branded reports. Rendered fully in the preview pane.

Preview

An iframe with the HTML rendered as-is, scripts allowed. The sandbox is strict: no access to the parent page, no access to your Connects, no cookies, no localStorage carry-over. Safe to share via public link.

Export

.html: single self-contained file.
PDF: server-side print to PDF.
PNG/JPEG screenshot: full-page or above-the-fold.

When this is the right type

Whenever you want a self-contained visual output that a human can open in a browser without any tooling. Common for: customer-facing reports, dashboard snapshots, weekly digests with charts, conference deck exports.

json (machine-to-machine)

Structured data. Used when the output is going to be consumed by another system rather than read by a human, or when the next stage in the App wants to read specific fields rather than re-parsing free text.

Preview

A collapsible tree view with type information. If the stage declares a schema in artifact_def.schema, the preview validates and shows any drift inline.

Schema declaration

{
  "id": "extracted",
  "format": "json",
  "schema": {
    "type": "object",
    "properties": {
      "company": { "type": "string" },
      "founded_year": { "type": "number" },
      "employees": { "type": "number" }
    },
    "required": ["company"]
  }
}

The schema gives the agent guardrails: it knows what shape to produce, and the preview gets a validator for free.
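As an illustration of what "validates and shows any drift" could mean for the schema above, here is a minimal sketch that checks only the subset of JSON Schema used in that example (object type, primitive property types, required). The names findSchemaDrift and ObjectSchema are hypothetical; the platform's validator is not documented here and likely covers far more.

```typescript
// Hypothetical subset of JSON Schema, enough for the example above.
interface ObjectSchema {
  type: "object";
  properties: { [key: string]: { type: "string" | "number" | "boolean" } };
  required?: string[];
}

// Returns a list of human-readable problems; an empty list means no drift.
function findSchemaDrift(value: unknown, schema: ObjectSchema): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["expected an object"];
  }
  const record = value as { [key: string]: unknown };
  const problems: string[] = [];
  for (const key of schema.required || []) {
    if (!(key in record)) problems.push("missing required field: " + key);
  }
  for (const key of Object.keys(schema.properties)) {
    if (!(key in record)) continue; // optional and absent is fine
    const expected = schema.properties[key].type;
    const actual = typeof record[key];
    if (actual !== expected) {
      problems.push(key + ": expected " + expected + ", got " + actual);
    }
  }
  return problems;
}

const extractedSchema: ObjectSchema = {
  type: "object",
  properties: {
    company: { type: "string" },
    founded_year: { type: "number" },
    employees: { type: "number" },
  },
  required: ["company"],
};

const drift = findSchemaDrift(
  JSON.parse('{"founded_year": "2010", "employees": 8000}'),
  extractedSchema
);
// drift reports the missing "company" field and the string-typed founded_year.
```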

Chaining

Multi-stage Apps almost always use json between stages: the next stage can read individual fields rather than re-parsing free text.

csv (tabular)

Tabular data. The bread-and-butter format for lead lists, account snapshots, financial line items, survey responses.

Preview

A spreadsheet-style table with sortable columns, inline column filters, and a count of rows. Wide tables get horizontal scroll; tall ones get virtual scrolling.

Export

.csv: RFC 4180 format with header row.
.xlsx: Excel file. Types are preserved when possible.
Push to Google Sheets: if Drive Connect is authorized.

Schema enforcement

If the stage declares column names and types, the agent's output is validated row by row. Rows that don't match are flagged in the preview with the reason. You can fix them in place: edit the row and save, and any downstream stage receives the corrected version.

image (visual)

Generated or rendered images. Diagrams, charts, marketing assets, mockups.

Preview

Full-size image with zoom and pan. Supports PNG, JPEG, SVG, and WebP.

How it's produced

Either:

By the generate_image tool (text-to-image via the configured provider, typically Gemini's image API if the host has a key).
By code in the agent's sandbox: Matplotlib charts, Graphviz diagrams, Pillow compositions, Playwright screenshots.

Export

Download in the source format.
Re-render at different resolutions for raster outputs.
For SVG: download as a rasterized PNG or as a PDF as well.

file (escape hatch)

Anything that doesn't fit one of the other formats: PDFs, ZIPs, audio, video, binary formats. If you can't decide what type to use, file is the safe default.

Preview

Type-aware:

PDF: renders inline with a built-in viewer.
Audio: HTML5 player with waveform.
Video: HTML5 player with scrub.
ZIP: file listing; click any entry to preview it as its own artifact.
Unknown: shows metadata (size, MIME) and a download button.

Export

Direct download. Files keep their original name and MIME type when delivered to email, Slack, or Drive. The MIME type is stored alongside the artifact in app_artifact.mime_type.

Chaining artifacts between stages

Each stage's artifacts are automatically available to the next stage, but not through a template variable. As noted under Path A, the goal templater only substitutes form inputs, so a {{ stages.<id>.<artifactId> }} placeholder is not resolved. Prior outputs arrive in the auto-injected ## Previous Stage Outputs block of the system prompt and as files on the shared sandbox filesystem, so the receiving stage's goal refers to them in plain language:

# Stage 2 goal referencing Stage 1's artifact
Take the leads CSV from Stage 1 (see Previous Stage Outputs) and
enrich each row with the company's funding history.
Output the same CSV with three new columns:
total_funding, last_round, last_round_date.

The agent receives the previous stage's output in whatever format it was produced and works out how to read it based on the type: for CSV, it reads as a table; for JSON, as a structured object; for markdown, as a document. You don't need to add any parsing logic.

How artifacts are stored: the persistence layout

Once the executor extracts content (Path A or Path B above), it writes to two places: a row in the app_artifact table and an object in the configured S3 bucket. The two are designed so that either can survive the loss of the other: the DB content is the fast path, and S3 is the durable copy and the source of truth for anything bulky.

The S3 key shape

Every artifact emitted from an App stage lands at a deterministic key:

app-artifacts/{user_id}/{execution_id}/{stage_id}/{def_id}

The shape gives you predictable batched access: every artifact for one run shares the app-artifacts/{user_id}/{execution_id}/ prefix. Every artifact a specific stage ever produced (across runs of the same App) matches app-artifacts/{user_id}/*/{stage_id}/{def_id}; because S3 list requests take a literal prefix rather than a wildcard, list under app-artifacts/{user_id}/ and filter on the trailing {stage_id}/{def_id}. Chat-emitted artifacts use a parallel prefix (artifacts/{user_id}/.../{file_name}) since they don't have an execution context.
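The key layout lends itself to a few small helpers. The sketch below is illustrative only: the function names are hypothetical, and it assumes none of the ID segments contain a slash. The cross-run selection is done client-side on keys that have already been listed, since S3 listing works on literal prefixes.

```typescript
// Key layout taken from the text above; helper names are hypothetical.
function artifactKey(
  userId: string,
  executionId: string,
  stageId: string,
  defId: string
): string {
  return ["app-artifacts", userId, executionId, stageId, defId].join("/");
}

// Prefix shared by every artifact of one run.
function executionPrefix(userId: string, executionId: string): string {
  return ["app-artifacts", userId, executionId].join("/") + "/";
}

// Select one stage/def across runs from keys already listed under the
// user's prefix (app-artifacts/{user_id}/).
function keysForStageDef(keys: string[], stageId: string, defId: string): string[] {
  return keys.filter((key) => {
    const parts = key.split("/");
    return (
      parts.length === 5 &&
      parts[0] === "app-artifacts" &&
      parts[3] === stageId &&
      parts[4] === defId
    );
  });
}

const listedKeys = [
  artifactKey("usr_1", "exec_a", "research", "competitive_analysis"),
  artifactKey("usr_1", "exec_a", "research", "sources"),
  artifactKey("usr_1", "exec_b", "research", "competitive_analysis"),
];
const acrossRuns = keysForStageDef(listedKeys, "research", "competitive_analysis");
// acrossRuns holds the exec_a and exec_b keys for the same stage and def.
```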

Filename convention

The row's file_name is built deterministically from slugged App, stage, and def titles: {app_slug}_{stage_slug}_{def_slug}.{ext}. The extension comes from the format-to-ext map below, except code uses the def's language as the extension verbatim (so a Python code artifact ends up as analysis.python, not analysis.py; that's intentional for round-tripping the type).

Format / MIME / extension reference

`format`	MIME stored on the row	Default extension
`markdown`	`text/markdown`	`md`
`json`	`application/json`	`json`
`html`	`text/html`	`html`
`csv`	`text/csv`	`csv`
`code`	`text/plain`	def's `language` (e.g. `python`, `ts`)
`image`	`application/octet-stream`	`txt` (override on save)
`file` & everything else	`text/plain`	`txt`

image and file default to a generic MIME because the actual content type isn't known until the bytes are inspected; the upload pipeline can override mime_type on the row after sniffing.
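The filename convention and the reference table can be combined into one lookup. This is a sketch under assumptions: the slug rule (lowercase, runs of other characters collapsed to underscores) is inferred from the example file names in this document, and describeArtifactFile is a hypothetical name.

```typescript
// The reference table above, transcribed into a lookup.
const FORMAT_INFO: { [format: string]: { mime: string; ext: string } } = {
  markdown: { mime: "text/markdown", ext: "md" },
  json: { mime: "application/json", ext: "json" },
  html: { mime: "text/html", ext: "html" },
  csv: { mime: "text/csv", ext: "csv" },
  code: { mime: "text/plain", ext: "txt" }, // ext is replaced by the def's language
  image: { mime: "application/octet-stream", ext: "txt" },
};
const FALLBACK_INFO = { mime: "text/plain", ext: "txt" };

// Assumed slug rule: lowercase, non-alphanumeric runs become one underscore.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

function describeArtifactFile(
  appTitle: string,
  stageTitle: string,
  defTitle: string,
  format: string,
  language?: string
): { fileName: string; mimeType: string } {
  const known = Object.prototype.hasOwnProperty.call(FORMAT_INFO, format);
  const info = known ? FORMAT_INFO[format] : FALLBACK_INFO;
  // Code artifacts use the language verbatim as the extension.
  const ext = format === "code" && language ? language : info.ext;
  const base = [appTitle, stageTitle, defTitle].map(slugify).join("_");
  return { fileName: base + "." + ext, mimeType: info.mime };
}

const briefFile = describeArtifactFile(
  "Company Brief",
  "Research",
  "Competitive Analysis",
  "markdown"
);
// briefFile.fileName === "company_brief_research_competitive_analysis.md"
```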

Inline content vs S3 fallback

For convenience, the row's content column also stores the raw artifact body; whether the artifact is hot-readable from the DB depends on how it was created and how big it is. The executor follows a deterministic rule when reading prior-stage artifacts on a resume:

If app_artifact.content is non-empty: use it directly.
Else if s3_key is set: download from S3, decode UTF-8, use that.
Else: proceed with empty content (a recoverable degenerate state).

So the worst-case end-to-end loss for an artifact is "content is empty"; the row, its metadata, and its preview structure all survive even if the S3 object disappears.
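The three-branch rule above is easy to express as a pure decision function. The sketch below only decides where the content should come from and leaves the actual S3 download to the caller; the type and function names are hypothetical, while content and s3_key are the column names used in this document.

```typescript
// Only the two columns the rule depends on.
interface StoredArtifact {
  content: string | null;
  s3_key: string | null;
}

type ContentSource =
  | { from: "row"; content: string }
  | { from: "s3"; key: string }
  | { from: "empty"; content: "" };

// Mirrors the rule: inline content first, then S3, then empty content.
function pickContentSource(row: StoredArtifact): ContentSource {
  if (row.content !== null && row.content !== "") {
    return { from: "row", content: row.content };
  }
  if (row.s3_key !== null && row.s3_key !== "") {
    return { from: "s3", key: row.s3_key };
  }
  return { from: "empty", content: "" };
}

const inlineSource = pickContentSource({ content: "# Brief", s3_key: "app-artifacts/u/e/s/d" });
const s3Source = pickContentSource({ content: "", s3_key: "app-artifacts/u/e/s/d" });
const emptySource = pickContentSource({ content: null, s3_key: null });
```

Keeping the decision separate from the download makes the degenerate "empty" branch explicit and testable without touching storage.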

JSON validation

For format: "json", the executor attempts JSON.parse on the extracted content before saving. If parsing fails, it still saves the raw content (you don't lose the agent's work), but it logs a warning. The preview marks the artifact as "invalid JSON" so you can see it at a glance.
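A minimal sketch of that save-anyway behavior, assuming a hypothetical checkJsonArtifact helper: the raw content is always returned for saving, and the parse error is kept as a warning rather than thrown.

```typescript
interface JsonCheck {
  content: string; // always the raw content, valid or not
  valid: boolean;
  warning: string | null;
}

// Attempt JSON.parse; never discard the agent's output on failure.
function checkJsonArtifact(content: string): JsonCheck {
  try {
    JSON.parse(content);
    return { content: content, valid: true, warning: null };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { content: content, valid: false, warning: "invalid JSON: " + reason };
  }
}

const strictCheck = checkJsonArtifact('{"company": "Stripe"}');
const looseCheck = checkJsonArtifact('{"company": "Stripe",}'); // trailing comma
// strictCheck.valid is true; looseCheck.valid is false but its content is kept.
```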

The `artifact` SSE event

Every time a stage's outputs are persisted, the execution's SSE stream emits an artifact event whose payload is the full AppArtifact row including inline content when present. Subscribers can render the preview without an extra fetch:

event: artifact
data: {
  "id": "art_8a3...",
  "execution_id":"exec_8f4...",
  "stage_id": "research",
  "def_id": "competitive_analysis",
  "title": "Stripe Competitive Analysis",
  "format": "markdown",
  "mime_type": "text/markdown",
  "file_name": "company_brief_research_competitive_analysis.md",
  "s3_key": "app-artifacts/usr_.../exec_.../research/competitive_analysis",
  "size_bytes": 8421,
  "content": "# Stripe Competitive Analysis\n\n..."
}
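For subscribers that read the stream as raw text rather than through a browser EventSource, the sketch below parses one SSE frame following the standard field rules (event and data fields, multiple data lines joined with newlines). The payload is shown pretty-printed above for readability; on the wire each line of a multi-line payload carries its own data: prefix. The function name and the sample values are illustrative.

```typescript
interface SseMessage {
  event: string;
  data: string;
}

// Parse one SSE frame (the lines between two blank lines).
function parseSseFrame(frame: string): SseMessage {
  let eventName = "message"; // default event type in the SSE format
  const dataLines: string[] = [];
  for (const rawLine of frame.split(/\r?\n/)) {
    if (rawLine === "" || rawLine.charAt(0) === ":") continue; // blank or comment
    const colon = rawLine.indexOf(":");
    const field = colon === -1 ? rawLine : rawLine.slice(0, colon);
    let value = colon === -1 ? "" : rawLine.slice(colon + 1);
    if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
    if (field === "event") eventName = value;
    else if (field === "data") dataLines.push(value);
  }
  return { event: eventName, data: dataLines.join("\n") };
}

// Illustrative frame; the IDs are made up for this example.
const artifactFrame = [
  "event: artifact",
  'data: {"id": "art_1", "stage_id": "research", "format": "markdown",',
  'data:  "content": "# Title"}',
].join("\n");

const parsedFrame = parseSseFrame(artifactFrame);
const artifactRow =
  parsedFrame.event === "artifact" ? JSON.parse(parsedFrame.data) : null;
// artifactRow.content can be rendered directly, with no extra fetch.
```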

Sharing artifacts publicly

Each artifact can be shared via a public link without exposing the rest of your workspace:

Open the artifact and click Share. A token-gated URL is generated: https://app.aitroop.net/s/<token>.

Configure access. Read-only or read+download. Optional expiry timestamp. Optional password gate. Optional unlock prompt ("ask for email before unlocking").

Revoke any time. Open Settings → Shares and toggle off, or call DELETE /api/shares/:id.

The `app_share` data model

A share is one row in app_share keyed by a random share_token. The row carries enforcement state plus an optional access gate:

Column	Purpose
`resource_type` / `resource_id`	What's being shared: an artifact, an execution, an App.
`access_mode`	`public` (anyone with the link) or `token` (anyone who unlocks with the right secret).
`access_token_hash`	Argon2/bcrypt hash of the unlock secret when `access_mode = 'token'`.
`expires_at`	Nullable. After this timestamp the share resolves to `expired`.
`max_views` / `view_count`	Optional view cap and the running count. Each successful resolve atomically increments the count.
`revoked_at`	Set when you revoke; resolves to `not_found` thereafter.
`permission`	Currently always `read`; the schema reserves room for higher-permission shares.

Resolution states

Hitting GET /api/public/shares/:token returns one of five outcomes; the rendering logic on the public page is driven by which:

`kind`	What happened
`public_ok`	Public link, no gate. Render the resource.
`requires_token`	Token-gated; render the unlock form.
`expired`	`expires_at` has passed.
`over_limit`	`view_count >= max_views`.
`not_found`	Bad token, deleted share, or revoked.
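The five outcomes can be derived from the app_share columns with a handful of checks. In the sketch below the order of the checks is an assumption (the text lists the outcomes, not their precedence), and resolveShareKind and ShareRow are hypothetical names.

```typescript
// The subset of app_share columns that drive resolution.
interface ShareRow {
  access_mode: "public" | "token";
  expires_at: Date | null;
  max_views: number | null;
  view_count: number;
  revoked_at: Date | null;
}

type ShareKind = "public_ok" | "requires_token" | "expired" | "over_limit" | "not_found";

// A null row stands for a bad token or a deleted share.
function resolveShareKind(row: ShareRow | null, now: Date): ShareKind {
  if (row === null || row.revoked_at !== null) return "not_found";
  if (row.expires_at !== null && row.expires_at.getTime() <= now.getTime()) {
    return "expired";
  }
  if (row.max_views !== null && row.view_count >= row.max_views) {
    return "over_limit";
  }
  return row.access_mode === "token" ? "requires_token" : "public_ok";
}

const baseShare: ShareRow = {
  access_mode: "public",
  expires_at: null,
  max_views: 10,
  view_count: 3,
  revoked_at: null,
};
const checkedAt = new Date("2025-01-15T00:00:00Z");
const openKind = resolveShareKind(baseShare, checkedAt); // "public_ok"
```

The sketch leaves out the atomic view_count increment, which would have to happen in the database on each successful resolve.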

To unlock a token-gated share, the client sends the secret to POST /api/public/shares/:token/unlock. On success the server issues a short-lived JWT and sets it as a cookie scoped to the /s path. Until the cookie expires, the user can fetch /api/public/shares/:token/blob (inline) or .../download (attachment headers).

Versioning and history

Every artifact is versioned. Each run of an App produces a new version; in chat, every reply that produces an artifact is a new version. You can step back through history in the preview pane and restore any previous version as the current one.

Artifacts are retained for the lifetime of the App (or the chat). Deleting an App archives its artifacts; archives are recoverable for 30 days. Past that they're permanently deleted from S3.

Editing an artifact

Click Edit on any artifact and you get a chat scoped to that single artifact. Ask for changes: "make the table wider", "add a row for Q4", "rephrase the second paragraph to be less formal". Save back to the same artifact or branch into a new version.

Edits to an artifact don't change the App. If you want the change to apply to future runs, open the App in the editor and update the stage's goal. The "Edit in chat" surface is also a useful debug tool: the chat opens with the artifact and the original goal in context.

FAQ & troubleshooting

The agent produced output but my run shows "0 artifacts".

For App stages, the executor extracts content by matching the artifact's title against a markdown heading (# through ####) in the response or, for structured formats, by finding a fenced code block in the right language. A stage that declares exactly one artifact also falls back to the entire response, so this symptom usually means the stage declares several artifacts and none of the headings matched. Two fixes:

Tighten the stage goal so the agent names its sections: "Write the report under a ## Stripe Competitive Analysis heading." The heading must match the artifact's title.
Check the stage's artifact_defs; the executor needs at least one entry to even start looking.

In chat, the rule is different: the streaming parser only fires on <artifact title="..." type="...">…</artifact> tags. If the agent forgot to wrap, you'll see the content inline but no row in app_artifact.

My JSON artifact has trailing commas / comments and won't parse.

The agent occasionally emits "JSON5" by accident. The preview will flag this. Fix the goal: "Output strict RFC 8259 JSON, no comments, no trailing commas." For repeated cases, attach a schema to the artifact_def so the agent gets stricter constraints.

CSV artifact has the wrong columns.

Declare the column schema in the artifact_def:

{
  "id": "leads",
  "format": "csv",
  "schema": {
    "columns": [
      { "name": "company", "type": "string", "required": true },
      { "name": "url", "type": "string" },
      { "name": "size", "type": "number" }
    ]
  }
}

The agent reads the schema as part of its instructions; mismatched rows fail validation and the preview marks them.
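Row-by-row validation against a column schema like the one above might look like the following sketch. It is deliberately naive: it splits on commas and does not handle quoted fields, and the names validateCsv, ColumnDef, and RowProblem are hypothetical rather than part of the platform.

```typescript
// Mirrors the "columns" entries in the schema above.
interface ColumnDef {
  name: string;
  type: "string" | "number";
  required?: boolean;
}

interface RowProblem {
  row: number; // 1-based data row; 0 refers to the header
  reason: string;
}

// Naive comma split: quoted fields with embedded commas are out of scope.
function validateCsv(csv: string, columns: ColumnDef[]): RowProblem[] {
  const lines = csv.trim().split(/\r?\n/);
  const header = lines[0].split(",").map((h) => h.trim());
  const expected = columns.map((c) => c.name);
  const problems: RowProblem[] = [];
  if (header.join(",") !== expected.join(",")) {
    problems.push({ row: 0, reason: "header mismatch: expected " + expected.join(",") });
    return problems;
  }
  for (let i = 1; i < lines.length; i++) {
    const cells = lines[i].split(",").map((c) => c.trim());
    if (cells.length !== columns.length) {
      problems.push({
        row: i,
        reason: "expected " + columns.length + " cells, got " + cells.length,
      });
      continue;
    }
    columns.forEach((col, j) => {
      const cell = cells[j];
      if (cell === "") {
        if (col.required) problems.push({ row: i, reason: col.name + " is required" });
      } else if (col.type === "number" && !isFinite(Number(cell))) {
        problems.push({ row: i, reason: col.name + " is not a number: " + cell });
      }
    });
  }
  return problems;
}

const leadColumns: ColumnDef[] = [
  { name: "company", type: "string", required: true },
  { name: "url", type: "string" },
  { name: "size", type: "number" },
];

const leadProblems = validateCsv(
  [
    "company,url,size",
    "Stripe,https://stripe.com,8000",
    ",https://example.com,12",
    "Acme,,big",
  ].join("\n"),
  leadColumns
);
// leadProblems flags row 2 (missing company) and row 3 (non-numeric size).
```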

I want one stage to produce multiple artifacts.

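List multiple entries in artifact_defs, each with its own title:

"artifact_defs": [
  { "id": "brief", "title": "Company Brief", "format": "markdown" },
  { "id": "data", "title": "Extracted Data", "format": "json" }
]

In an App stage, the agent emits one section per entry, each under a heading that matches the entry's title (Path A). Both get stored and are previewable separately, and both reach the next stage through the ## Previous Stage Outputs block.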

Can I attach an artifact as input to a new run?

Yes; most file-input fields accept either an upload or a reference to an existing artifact. In the form, click the artifact picker icon next to the file field; pick the artifact. The new run reads it directly from S3; no re-download / re-upload.

How big can an artifact be?

Soft limit 100 MB per artifact, hard limit 1 GB. For larger payloads, use file format and split into a ZIP. Artifacts > 50 MB are stored on slower storage tiers; previews disable inline rendering above 25 MB.

My artifact's preview is fine but downloads are corrupted.

Almost always a MIME type mismatch: the artifact was declared as text/csv but the bytes are XLSX, or vice versa. The platform infers MIME on save; if you suspect the inference is wrong, open the artifact, click Edit metadata, set the correct MIME, re-save.
