
OpenAI Code Interpreter for Data Analysis: A Complete Guide

Everything you need to know about OpenAI's Code Interpreter for data analysis — how it works, what it can do, and how to build production applications with it.

By DataStoryBot Team


Code Interpreter is OpenAI's sandboxed Python execution environment. You give it a file and a question, it writes Python code, runs it in an isolated container, and returns the results — dataframes, statistics, charts, transformed datasets. No infrastructure to manage. No dependencies to install. No code to debug.

This guide covers everything you need to build production data analysis applications with Code Interpreter: the architecture, the API surface, the container lifecycle, what libraries are available, and the limitations you will hit. We will use DataStoryBot as a running example of what a production implementation looks like.

What Is Code Interpreter?

Code Interpreter is a tool available through OpenAI's API that lets GPT-4o write and execute Python code inside a sandboxed container. The container is ephemeral — it spins up on demand, runs your code, and expires after a fixed TTL.

The key distinction from asking GPT-4o to write code in a normal chat: Code Interpreter actually runs the code. It can read files you upload, execute pandas operations on real data, generate Matplotlib charts as real PNGs, and return computed results. It is not generating plausible-looking code and hoping you run it yourself. It runs, observes the output, and iterates if something fails.

This makes it qualitatively different from code generation. It is code execution with an LLM in the loop.

How the Architecture Works

The system has three layers:

Containers API — manages the sandboxed execution environments. You create a container, upload files to it, and later retrieve output files from it. Each container has a TTL (maximum 20 minutes on the current API), after which it and all its files are destroyed.

Responses API — the inference layer. You send a message to GPT-4o with the code_interpreter tool enabled and a reference to your container. The model decides when to write and execute code, does so, observes the output, and continues until it has an answer.

File retrieval — after execution, any files the code generated (charts, CSVs, etc.) can be downloaded from the container before it expires.

Here is how these pieces connect:

Upload CSV → Container (files live here)
                ↓
         Responses API (GPT-4o + code_interpreter tool)
                ↓
         Python executes inside container
                ↓
         Results: text output + generated files
                ↓
         Download files before TTL expires

The API in Detail

Creating a Container

curl -X POST https://api.openai.com/v1/containers \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "data-analysis-session",
    "expires_after": {
      "anchor": "last_activity",
      "minutes": 20
    }
  }'

The expires_after field controls the TTL. The maximum is 20 minutes anchored to last activity. After 20 minutes of no API calls referencing this container, it and all uploaded files are deleted. This is a hard limit — you cannot extend it.
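Because the clock resets on activity but cannot be extended past 20 minutes of silence, it helps to mirror the TTL client-side so your application can warn users or trigger cleanup before the container disappears. A minimal sketch (this helper is ours, not part of the OpenAI SDK):

```javascript
// Track a container's last-activity timestamp client-side so the app
// can estimate time-to-expiry without an extra API call.
const TTL_MINUTES = 20;

function createTtlTracker(ttlMinutes = TTL_MINUTES) {
  let lastActivity = Date.now();
  return {
    // Call after every API request that references the container.
    touch(now = Date.now()) {
      lastActivity = now;
    },
    // Milliseconds until the container is destroyed, clamped at zero.
    remainingMs(now = Date.now()) {
      return Math.max(0, ttlMinutes * 60000 - (now - lastActivity));
    },
    isExpired(now = Date.now()) {
      return this.remainingMs(now) === 0;
    },
  };
}
```

Call `touch()` alongside each upload, analysis, or download request; if `remainingMs()` gets low, either make a cheap keep-alive call or prompt the user before the session is lost.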

Uploading Files

curl -X POST "https://api.openai.com/v1/containers/${CONTAINER_ID}/files" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F "file=@dataset.csv"

The response includes a file_id you will reference when asking Code Interpreter to work with the file. Files uploaded to a container are accessible to any code that runs inside it.

Executing Code via the Responses API

This is where the work happens. You send a request to the Responses API with the code_interpreter tool defined:

curl -X POST https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "instructions": "Analyze the uploaded dataset. Find trends and generate visualizations.",
    "input": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Analyze sales_data.csv and identify the top 3 findings."
          }
        ]
      }
    ],
    "tools": [
      {
        "type": "code_interpreter",
        "container": {
          "id": "'"${CONTAINER_ID}"'",
          "type": "auto"
        }
      }
    ]
  }'

The model receives your prompt, sees that code_interpreter is available, and decides to use it. It writes Python code, the container executes it, the model reads the output (stdout, stderr, generated files), and formulates a response. If the code fails, the model often fixes the error and retries automatically.

Retrieving Generated Files

After execution, any files the code wrote — charts, transformed CSVs, reports — live in the container. You retrieve them by file ID:

curl -o chart.png \
  "https://api.openai.com/v1/containers/${CONTAINER_ID}/files/${FILE_ID}/content" \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Available Libraries

Code Interpreter's container comes with a substantial Python environment pre-installed. The libraries most relevant to data analysis:

Category             Libraries
-------------------  --------------------------------
Data manipulation    pandas, numpy
Visualization        matplotlib, seaborn
Statistics           scipy, statsmodels
Machine learning     scikit-learn
File handling        openpyxl, xlrd, csv
General              json, datetime, re, collections

You cannot pip install additional packages. The container has no network access. Whatever is pre-installed is what you get. For most data analysis workflows, the available libraries are sufficient. If you need something exotic — say, geopandas or prophet — you will need to run that part of the analysis outside Code Interpreter.

The Security Model

The sandboxing is the feature that makes Code Interpreter viable for production use with untrusted data:

  • No network access: Code running inside the container cannot make HTTP requests, connect to databases, or phone home. Your data stays in the sandbox.
  • Isolated filesystem: Each container has its own filesystem. Containers cannot see each other's files.
  • Automatic expiry: Containers and all their contents are destroyed after the TTL. No data persists on OpenAI's infrastructure beyond the session.
  • No persistent state: There is no way to save state between containers. Each session starts clean.

This matters when you are processing customer data, financial records, or anything sensitive. The data enters the container, gets analyzed, and the results come back. After 20 minutes of inactivity, everything is gone.

Building a Data Analysis Workflow

Here is a complete Node.js implementation of a data analysis pipeline using Code Interpreter. This is a simplified version of what runs inside DataStoryBot.

import OpenAI from "openai";
import fs from "fs";

const openai = new OpenAI();

async function analyzeDataset(csvPath) {
  // 1. Create container
  const container = await openai.containers.create({
    name: "analysis-session",
    expires_after: { anchor: "last_activity", minutes: 20 },
  });

  // 2. Upload file
  const fileStream = fs.createReadStream(csvPath);
  const file = await openai.containers.files.create(container.id, {
    file: fileStream,
  });

  // 3. Run analysis via Responses API
  const response = await openai.responses.create({
    model: "gpt-4o",
    instructions: `You are a data analyst. Analyze the uploaded CSV file.
      Identify the 3 most interesting findings. For each finding,
      generate a publication-quality chart using matplotlib with
      dark_background style, #141414 facecolor, 150 DPI, 10x6 inches.
      Save charts as PNG files.`,
    input: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Analyze this dataset and show me what's interesting.",
          },
        ],
      },
    ],
    tools: [
      {
        type: "code_interpreter",
        container: { id: container.id, type: "auto" },
      },
    ],
  });

  // 4. Extract results
  const textOutput = response.output
    .filter((item) => item.type === "message")
    .map((item) => item.content.map((c) => c.text).join(""))
    .join("\n");

  // 5. Find generated file IDs from code interpreter output
  const codeOutputs = response.output.filter(
    (item) => item.type === "code_interpreter_call"
  );

  const fileIds = [];
  for (const output of codeOutputs) {
    if (output.results) {
      for (const result of output.results) {
        if (result.type === "files") {
          fileIds.push(...result.files.map((f) => f.file_id));
        }
      }
    }
  }

  // 6. Download chart files
  const charts = [];
  for (const fileId of fileIds) {
    const content = await openai.containers.files.content(
      container.id,
      fileId
    );
    const buffer = Buffer.from(await content.arrayBuffer());
    const filename = `chart_${fileId}.png`;
    fs.writeFileSync(filename, buffer);
    charts.push(filename);
  }

  return { analysis: textOutput, charts };
}

This is the core loop. Upload, analyze, extract, download. Everything else is orchestration.

DataStoryBot: A Production Example

DataStoryBot uses this exact infrastructure but adds a structured workflow on top. Here is how the pieces map to DataStoryBot's API:

POST /api/upload creates the container and uploads the CSV. It also parses the file to extract metadata (column names, types, row count) before Code Interpreter touches it, so the UI can show the user what they uploaded immediately.
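That metadata pass does not need Code Interpreter at all; it can run synchronously on the server. A minimal sketch of this kind of pre-parse, assuming comma-separated values with a header row and no quoted fields (the function name is hypothetical, not DataStoryBot's actual implementation):

```javascript
// Hypothetical pre-upload metadata pass: read the header and data rows
// to report column names, naively inferred types, and row count.
function extractCsvMetadata(csvText) {
  const lines = csvText.trim().split("\n");
  const columns = lines[0].split(",").map((c) => c.trim());
  const rows = lines.slice(1).map((line) => line.split(","));
  // Naive type inference: a column is "number" only if every value parses.
  const types = columns.map((_, i) =>
    rows.every((r) => r[i] !== "" && !Number.isNaN(Number(r[i])))
      ? "number"
      : "string"
  );
  return { columns, types, rowCount: rows.length };
}
```

A production version would use a real CSV parser to handle quoting and escapes, but the shape of the result — columns, types, row count — is what the UI needs to render immediately.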

POST /api/analyze sends the first Responses API call. The prompt instructs GPT-4o to examine the data and return exactly three story angles — each with a title, summary, and suggested chart types. The response is structured so the frontend can render a selection UI.

POST /api/refine sends the second Responses API call with the selected story title. This time, the prompt asks for a full narrative in markdown, 2-4 charts supporting that narrative, and a filtered/transformed dataset. The model writes and executes multiple rounds of Python code: cleaning the data, computing derived metrics, generating each chart with consistent styling.

GET /api/files/[containerId]/[fileId] proxies file downloads from the container. The frontend fetches chart PNGs through this endpoint to render them inline.

The entire flow — from CSV upload to narrative with charts — typically completes in 30-60 seconds. The container stays alive for 20 minutes after the last API call, so users can request refinements or explore alternative story angles without re-uploading.

For a walkthrough of how the chart generation specifically works, see How to Generate Charts from CSV Data Automatically.

Container Lifecycle Management

The 20-minute TTL is the most important operational constraint. Here is what it means in practice:

The timer resets on activity. Every API call that references the container (uploading a file, running a Responses API call with that container, downloading a file) resets the 20-minute clock. An active session can run indefinitely as long as there is activity within each 20-minute window.

Plan for expiry. If a user uploads a CSV and walks away for 25 minutes, the container is gone. Your application needs to handle this gracefully — detect the expired container and prompt the user to re-upload.

Download results promptly. Generated charts and datasets exist only inside the container. If you need them after the session, download and store them in your own infrastructure before the container expires.

One container per session. Each container is independent. You cannot share files between containers or resume a container after it expires. Design your workflow accordingly.

// Example: handling container expiry gracefully
async function safeApiCall(containerId, apiCall) {
  try {
    return await apiCall(containerId);
  } catch (error) {
    if (error.status === 404 || error.message?.includes("expired")) {
      throw new Error(
        "SESSION_EXPIRED: Your analysis session has expired. " +
        "Please re-upload your dataset to start a new session."
      );
    }
    throw error;
  }
}

Limitations and Gotchas

After building production applications on Code Interpreter, these are the constraints that actually matter:

20-minute TTL is a hard ceiling. You cannot request a longer-lived container. For long-running analysis workflows, you need to either keep the session active with periodic API calls or accept that the user may need to re-upload.

File size limits. Large CSVs (hundreds of megabytes) will hit upload limits and slow down processing. For large datasets, consider pre-filtering or sampling before upload.
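One simple pre-filtering approach is systematic sampling: keep the header and every k-th data row so the upload stays under practical limits. A sketch (naive on purpose: it assumes no quoted fields containing newlines):

```javascript
// Downsample a CSV to at most maxRows data rows by keeping the header
// and every k-th row, preserving the original row order.
function sampleCsv(csvText, maxRows) {
  const lines = csvText.trim().split("\n");
  const [header, ...rows] = lines;
  if (rows.length <= maxRows) return csvText.trim();
  const step = Math.ceil(rows.length / maxRows);
  const sampled = rows.filter((_, i) => i % step === 0);
  return [header, ...sampled].join("\n");
}
```

Systematic sampling preserves trends in time-ordered data better than truncation; for heavily skewed data, stratified sampling may be worth the extra effort.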

No custom packages. The pre-installed library set is fixed. No pip install. If your analysis requires a library that is not available (geographic analysis with geopandas, time series forecasting with prophet), you need to handle that outside Code Interpreter.

No network access. Code inside the container cannot fetch external data, call APIs, or connect to databases. Everything the code needs must be uploaded to the container before execution.

Execution time. Complex analyses with large datasets can take 30-90 seconds. The model may write and execute multiple rounds of code, with each round taking a few seconds. This is fast for automated analysis, but users expect some form of progress indication.

Non-deterministic output. The same dataset and prompt can produce different code and different charts on each run. The analysis will be valid, but the specific chart types, color choices, and narrative framing may vary. If you need exact reproducibility, save and re-use the generated code rather than re-running the analysis.
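If you want that reproducibility, the executed Python is available in the code_interpreter_call items of the response output. A sketch, assuming each call item exposes its source in a code field (verify the field name against the current API reference):

```javascript
// Pull the Python source out of code_interpreter_call items so it can
// be stored and re-run later for an exactly reproducible analysis.
function extractExecutedCode(output) {
  return output
    .filter((item) => item.type === "code_interpreter_call")
    .map((item) => item.code)
    .filter(Boolean);
}
```

Store the returned snippets alongside the results; re-running them in your own Python environment reproduces the charts exactly, which a fresh Code Interpreter session will not.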

Error handling in code execution. Code Interpreter usually recovers from errors automatically — it reads the traceback and fixes the code. But not always. Your application should handle cases where the model fails to produce valid output after its retry attempts.
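A simple guard is to cap retries at the application level and surface a clear error when the model never converges. A minimal sketch (the wrapper is ours, not an SDK feature):

```javascript
// Retry a full analysis request up to maxAttempts times, surfacing the
// last underlying error if every attempt fails.
async function withRetries(fn, maxAttempts = 2) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  throw new Error(
    `Analysis failed after ${maxAttempts} attempts: ${lastError.message}`
  );
}
```

The attempt number is passed to the callback so a second attempt can adjust the prompt, for example by asking the model to simplify its approach.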

When to Use Code Interpreter vs. Running Your Own Python

Use Code Interpreter when:

  • You need sandboxed execution of untrusted or semi-trusted analysis tasks
  • You want the LLM to decide what analysis to run, not just execute pre-written code
  • You do not want to manage Python environments, container infrastructure, or dependency hell
  • The pre-installed libraries cover your needs

Run your own Python when:

  • You need custom packages not available in Code Interpreter
  • You need persistent state across sessions
  • You need network access from within the analysis code
  • You need execution times longer than what Code Interpreter supports
  • You need exact reproducibility

For most data analysis use cases — the kind where someone uploads a CSV and wants to understand what is in it — Code Interpreter handles the job cleanly. That is exactly the use case DataStoryBot is built for.

Getting Started

The fastest way to see Code Interpreter in action with real data is the DataStoryBot playground. Upload a CSV, pick a story, and watch it work. Behind every chart and narrative is a Code Interpreter session doing exactly what this guide describes.

If you want to understand how DataStoryBot uses Code Interpreter to analyze data before generating visualizations, read How to Analyze CSV Files with AI. For the complete API reference, see DataStoryBot API Documentation.

To build your own application on Code Interpreter, start with the Node.js example above. The core loop is straightforward: create container, upload file, call Responses API with the code_interpreter tool, download results. Everything else is prompt engineering and UX.

Ready to find your data story?

Upload a CSV and DataStoryBot will uncover the narrative in seconds.

Try DataStoryBot →