Getting Started with the DataStoryBot API
Get your first AI-generated data story in under 5 minutes. Step-by-step quickstart for the DataStoryBot API with Python, JavaScript, and curl examples.
Most CSV analysis APIs stop at summary statistics. You send data, you get back means, medians, and maybe a histogram. DataStoryBot does something different: it reads your data, finds the stories hiding in it, and returns a full narrative with charts and a filtered dataset — all through three API calls.
This guide walks you through the entire flow. By the end, you'll have a working script that uploads a CSV, picks a story angle, and downloads a polished narrative with visualizations. Total time: under five minutes.
Prerequisites
You need two things:
- A CSV file — any tabular data under 50 MB. If you don't have one handy, grab a dataset from Kaggle or use your own product analytics export.
- The API base URL — all requests go to https://datastory.bot.
No API key is required during the current open beta. That will change, so check the playground for the latest auth requirements.
The Three-Call Flow
DataStoryBot's API is built around a simple pipeline:
Upload CSV → Analyze (3 story angles) → Refine (full narrative + charts)
Each step maps to one endpoint:
| Step | Endpoint | What it does |
|---|---|---|
| 1. Upload | POST /api/upload | Sends your CSV, returns a container ID and file ID |
| 2. Analyze | POST /api/analyze | Discovers 3 story angles in your data |
| 3. Refine | POST /api/refine | Generates a full narrative, charts, and filtered dataset for the angle you pick |
The container is an ephemeral OpenAI Code Interpreter environment running GPT-4o. It lives for 20 minutes after creation, then it's gone. All your files, charts, and state disappear with it. This is by design — no data lingers on servers after you're done.
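Because the clock starts at container creation, it's worth tracking the deadline client-side rather than discovering expiry through a failed call. A minimal sketch (the `ContainerSession` helper is illustrative, not part of the API):

```python
import time

CONTAINER_TTL_SECONDS = 20 * 60  # containers expire 20 minutes after creation

class ContainerSession:
    """Tracks how long an ephemeral container has left (illustrative helper)."""

    def __init__(self, container_id, created_at=None):
        self.container_id = container_id
        # Record creation time when the /api/upload response comes back
        self.created_at = created_at if created_at is not None else time.time()

    def seconds_remaining(self):
        """Seconds until the container (and all its files) disappear."""
        return max(0.0, self.created_at + CONTAINER_TTL_SECONDS - time.time())

    def is_expired(self):
        return self.seconds_remaining() <= 0
```

Checking `seconds_remaining()` before a slow refine call lets you re-upload proactively instead of failing mid-pipeline.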
Step 1: Upload Your CSV
Send your file as multipart form data:
```bash
curl -X POST https://datastory.bot/api/upload \
  -F "file=@sales_data.csv"
```

Don't set a Content-Type header by hand here: with `-F`, curl generates `multipart/form-data` with the required boundary automatically, and overriding it drops the boundary.
The response gives you everything you need to proceed:
```json
{
  "containerId": "ctr_abc123def456",
  "fileId": "file-7x8y9z",
  "metadata": {
    "fileName": "sales_data.csv",
    "rowCount": 12840,
    "columnCount": 9,
    "columns": ["date", "region", "product", "revenue", "units", "cost", "channel", "customer_segment", "returns"]
  }
}
```
The containerId is your session handle — you'll pass it to every subsequent call. The metadata object tells you what DataStoryBot detected: row count, column count, and column names. Use this to verify your file uploaded correctly before moving on.
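A small guard that validates the response before you proceed can catch a bad upload before you spend an analyze call on it. A sketch (the helper and its error policy are illustrative, not part of any SDK):

```python
def check_upload(upload_data, required_columns):
    """Sanity-check the /api/upload response before continuing the pipeline.

    Returns the containerId, or raises if expected columns are missing
    or the file parsed to zero rows. Field names match the response above.
    """
    meta = upload_data["metadata"]
    missing = [c for c in required_columns if c not in meta["columns"]]
    if missing:
        raise ValueError(f"upload missing expected columns: {missing}")
    if meta["rowCount"] == 0:
        raise ValueError("uploaded file has no rows")
    return upload_data["containerId"]
```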
Step 2: Analyze — Discover Story Angles
Now ask DataStoryBot to find the narratives in your data:
```bash
curl -X POST https://datastory.bot/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123def456"
  }'
```
The response returns three distinct story angles:
```json
[
  {
    "id": "story_1",
    "title": "Q4 Revenue Surge Driven by Enterprise Segment",
    "summary": "Enterprise customers accounted for 68% of Q4 revenue growth, with a 42% quarter-over-quarter increase concentrated in the APAC region.",
    "chartFileId": "file-chart001"
  },
  {
    "id": "story_2",
    "title": "Rising Return Rates Signal Product Quality Issues",
    "summary": "Product returns increased 23% over the past 6 months, with the 'Pro' product line showing 3x the return rate of other lines.",
    "chartFileId": "file-chart002"
  },
  {
    "id": "story_3",
    "title": "Direct Channel Overtakes Retail for First Time",
    "summary": "Direct-to-consumer sales surpassed retail channel revenue in March, driven by a 31% increase in repeat purchases.",
    "chartFileId": "file-chart003"
  }
]
```
Each story has a preview chart you can download (more on that in Step 4). The title and summary are designed to be shown directly to end users if you're building a UI on top of this API.
Using Steering Prompts
If you already have a hypothesis or a specific angle you want explored, pass a steeringPrompt:
```bash
curl -X POST https://datastory.bot/api/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123def456",
    "steeringPrompt": "Focus on regional differences in return rates"
  }'
```
This doesn't force the output — it guides the AI to weight certain patterns more heavily. The three stories will still be data-driven, but they'll lean toward the direction you specified.
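From Python, the steering prompt is just an optional key in the request body. A sketch of a thin `analyze` wrapper (the helper names are my own, not part of any official SDK):

```python
import requests

BASE_URL = "https://datastory.bot"

def analyze_payload(container_id, steering_prompt=None):
    """Build the JSON body for POST /api/analyze; steeringPrompt is optional."""
    payload = {"containerId": container_id}
    if steering_prompt is not None:
        payload["steeringPrompt"] = steering_prompt
    return payload

def analyze(container_id, steering_prompt=None):
    """Discover story angles, optionally steered toward a hypothesis."""
    resp = requests.post(
        f"{BASE_URL}/api/analyze",
        json=analyze_payload(container_id, steering_prompt),
    )
    resp.raise_for_status()
    return resp.json()  # list of three story-angle objects
```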
Step 3: Refine — Generate the Full Narrative
Pick the story title that matters most and send it to the refine endpoint:
```bash
curl -X POST https://datastory.bot/api/refine \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123def456",
    "selectedStoryTitle": "Q4 Revenue Surge Driven by Enterprise Segment"
  }'
```
This is where the heavy computation happens. DataStoryBot runs code against your data inside the container — aggregating, filtering, building visualizations — and returns a complete package:
```json
{
  "narrative": "## Q4 Revenue Surge Driven by Enterprise Segment\n\nEnterprise customers drove a ...",
  "charts": [
    {
      "fileId": "file-chart101",
      "caption": "Quarterly revenue by customer segment (2025)"
    },
    {
      "fileId": "file-chart102",
      "caption": "Enterprise revenue growth by region"
    }
  ],
  "resultDataset": {
    "fileId": "file-ds001",
    "caption": "Filtered dataset: Enterprise segment transactions Q4 2025"
  }
}
```
The narrative field is Markdown. The charts array contains file IDs for PNG images. The resultDataset is a filtered CSV containing only the rows relevant to the story — useful for follow-up analysis or for feeding into your own BI tools.
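Since the narrative is Markdown and each chart carries a caption and file ID, stitching the refine response into a single report file is straightforward. An illustrative sketch (it assumes you have already downloaded the chart PNGs into a local directory, named after their file IDs):

```python
def build_report(result, chart_dir="charts"):
    """Assemble one Markdown report from a /api/refine response.

    Appends each chart as an image reference pointing at a locally
    downloaded PNG named `<fileId>.png` under chart_dir (an assumption
    of this sketch, not an API convention).
    """
    lines = [result["narrative"], ""]
    for chart in result["charts"]:
        lines.append(f"![{chart['caption']}]({chart_dir}/{chart['fileId']}.png)")
        lines.append("")
    ds = result["resultDataset"]
    lines.append(f"*{ds['caption']}* (`{ds['fileId']}.csv`)")
    return "\n".join(lines)
```

The output renders directly in any Markdown viewer, with the filtered-dataset caption as a provenance note at the bottom.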
You can also pass a refinementPrompt to adjust tone, length, or focus:
```bash
curl -X POST https://datastory.bot/api/refine \
  -H "Content-Type: application/json" \
  -d '{
    "containerId": "ctr_abc123def456",
    "selectedStoryTitle": "Q4 Revenue Surge Driven by Enterprise Segment",
    "refinementPrompt": "Make it executive-friendly. Keep it under 300 words. Emphasize APAC growth."
  }'
```
Step 4: Download Charts and Datasets
Every file — charts, filtered CSVs — is accessible via a single endpoint:
```bash
# Download a chart (PNG)
curl -o revenue_chart.png \
  https://datastory.bot/api/files/ctr_abc123def456/file-chart101

# Download the filtered dataset (CSV)
curl -o enterprise_q4.csv \
  https://datastory.bot/api/files/ctr_abc123def456/file-ds001
```
Remember: these files live in an ephemeral container. You have 20 minutes from the initial upload to download everything. After that, the container and all its files are deleted.
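A small download helper can turn the inevitable post-expiry 404 into a clear error instead of an empty or corrupt file. A sketch (the error-handling policy is my own, not prescribed by the API):

```python
import requests

BASE_URL = "https://datastory.bot"

def file_url(container_id, file_id):
    """Build the download URL for any file living in a container."""
    return f"{BASE_URL}/api/files/{container_id}/{file_id}"

def download_file(container_id, file_id, dest):
    """Fetch one container file; a 404 almost always means the container expired."""
    resp = requests.get(file_url(container_id, file_id))
    if resp.status_code == 404:
        raise RuntimeError(
            f"container {container_id} has likely expired; re-upload and rerun"
        )
    resp.raise_for_status()
    with open(dest, "wb") as f:
        f.write(resp.content)
    return dest
```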
Full Python Example
Here's a complete script that runs the entire pipeline:
```python
import requests

BASE_URL = "https://datastory.bot"

# Step 1: Upload
with open("sales_data.csv", "rb") as f:
    upload_resp = requests.post(
        f"{BASE_URL}/api/upload",
        files={"file": ("sales_data.csv", f, "text/csv")},
    )
upload_data = upload_resp.json()
container_id = upload_data["containerId"]
print(f"Uploaded: {upload_data['metadata']['rowCount']} rows, "
      f"{upload_data['metadata']['columnCount']} columns")

# Step 2: Analyze
analyze_resp = requests.post(
    f"{BASE_URL}/api/analyze",
    json={"containerId": container_id},
)
stories = analyze_resp.json()
print("\nStory angles found:")
for i, story in enumerate(stories):
    print(f"  {i + 1}. {story['title']}")
    print(f"     {story['summary']}\n")

# Step 3: Refine — pick the first story
selected_title = stories[0]["title"]
refine_resp = requests.post(
    f"{BASE_URL}/api/refine",
    json={
        "containerId": container_id,
        "selectedStoryTitle": selected_title,
    },
)
result = refine_resp.json()

# Save the narrative
with open("narrative.md", "w") as f:
    f.write(result["narrative"])
print(f"Narrative saved ({len(result['narrative'])} chars)")

# Download all charts
for chart in result["charts"]:
    chart_resp = requests.get(
        f"{BASE_URL}/api/files/{container_id}/{chart['fileId']}"
    )
    filename = f"chart_{chart['fileId']}.png"
    with open(filename, "wb") as f:
        f.write(chart_resp.content)
    print(f"Chart saved: {filename} — {chart['caption']}")

# Download the filtered dataset
ds = result["resultDataset"]
ds_resp = requests.get(
    f"{BASE_URL}/api/files/{container_id}/{ds['fileId']}"
)
with open("filtered_data.csv", "wb") as f:
    f.write(ds_resp.content)
print(f"Dataset saved: filtered_data.csv — {ds['caption']}")
```
JavaScript / Fetch Example
For frontend or Node.js integrations:
```javascript
const BASE_URL = "https://datastory.bot";

async function runDataStory(csvFile) {
  // Step 1: Upload
  const formData = new FormData();
  formData.append("file", csvFile);
  const uploadRes = await fetch(`${BASE_URL}/api/upload`, {
    method: "POST",
    body: formData,
  });
  const { containerId, metadata } = await uploadRes.json();
  console.log(`Uploaded: ${metadata.rowCount} rows`);

  // Step 2: Analyze
  const analyzeRes = await fetch(`${BASE_URL}/api/analyze`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ containerId }),
  });
  const stories = await analyzeRes.json();

  // Step 3: Refine
  const refineRes = await fetch(`${BASE_URL}/api/refine`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      containerId,
      selectedStoryTitle: stories[0].title,
    }),
  });
  const result = await refineRes.json();

  return {
    narrative: result.narrative,
    charts: result.charts.map(c => ({
      url: `${BASE_URL}/api/files/${containerId}/${c.fileId}`,
      caption: c.caption,
    })),
    dataset: `${BASE_URL}/api/files/${containerId}/${result.resultDataset.fileId}`,
  };
}
```
Understanding the Response
A few things worth knowing about what comes back:
Narratives are Markdown. They include headers, bullet points, bold text, and inline data references. You can render them directly with any Markdown library.
Charts are PNGs generated by matplotlib inside the container. They use a dark theme that matches the DataStoryBot UI. If you need a different style, mention it in your refinementPrompt.
Filtered datasets are CSVs containing the subset of your original data relevant to the selected story. This is useful for audit trails ("here's exactly what the narrative is based on") and for feeding into downstream pipelines.
Tips for Better Results
Use steering prompts when you have context the AI doesn't. If you know that Q3 had a pricing change, tell the analyze step. It will factor that into the story angles it generates.
Use refinement prompts to control output format. Want bullet points instead of paragraphs? Want the narrative in a specific language? Want it framed for a board meeting vs. an engineering standup? The refinement prompt handles all of this.
Handle container expiry gracefully. Containers expire 20 minutes after creation. If you get a 404 on a file download or an error on analyze/refine, the container is gone. Your code should re-upload the file and restart the pipeline. The Python example above runs fast enough that this isn't usually an issue, but batch processing workflows should account for it.
Smaller files analyze faster. If your CSV has columns you know are irrelevant, drop them before upload. A 50-column dataset with 3 useful columns will produce noisier results than a focused 3-column file.
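The expiry tip above can be sketched as a retry wrapper: if any step fails because the container is gone, re-upload and run the pipeline once more. Illustrative only; your error type and retry policy may differ:

```python
def run_with_retry(pipeline, upload, max_attempts=2):
    """Run the full flow, re-uploading if the container expires mid-run.

    `upload()` uploads the CSV and returns a fresh container_id;
    `pipeline(container_id)` runs analyze/refine/downloads and raises
    (here assumed to be RuntimeError) when the container is missing.
    """
    last_err = None
    for _ in range(max_attempts):
        container_id = upload()  # fresh container, fresh 20-minute clock
        try:
            return pipeline(container_id)
        except RuntimeError as err:  # e.g. a 404 from an expired container
            last_err = err
    raise last_err
```

For batch jobs, it also helps to keep each container's work well under the 20-minute window so the retry path stays rare.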
Next Steps
Now that you have the basics working:
- Learn how to analyze CSV data at scale in Automate CSV Analysis with AI — the conceptual framework behind what DataStoryBot does with your data.
- See how to build automated report generation pipelines using the refine endpoint.
- Explore chart generation patterns to customize the visualizations DataStoryBot produces.
- Try it live in the DataStoryBot playground — no code required.
Ready to find your data story?
Upload a CSV and DataStoryBot will uncover the narrative in seconds.
Try DataStoryBot →