12 min read

Audience-Aware Data Narratives: Exec vs. Analyst vs. Engineer

How to use DataStoryBot's steering and refinement prompts to adjust tone and depth for different stakeholders from the same dataset.

By DataStoryBot Team


The same dataset needs to say three different things to three different people.

Your VP of Sales wants to know whether Q1 hit the number and what's driving it. Your revenue analyst wants to understand the distribution of deal sizes, the statistical significance of any regional differences, and how the methodology compares to last quarter's model. Your data engineer wants to know whether the pipeline is producing reliable data — null rates, type coercions, join fidelity — before trusting any of the numbers above.

All three stakeholders are asking about the same CSV. None of them wants the other two's version of the answer.

DataStoryBot's steeringPrompt parameter (on /api/analyze) and refinementPrompt parameter (on /api/refine) give you direct control over this. You upload once. You steer the analysis for each audience. You refine the narrative for each audience. You get three different outputs from the same underlying file.

This article covers the mechanics of that pattern: what each audience actually needs, what prompts produce the right output, and what the resulting narratives look like.

Why One Analysis Fails Three Audiences

The default behavior — no steering, no refinement — produces a balanced general-purpose narrative. It mentions the headline number, a trend, and an anomaly. It uses moderate technical language. It includes two or three charts covering different aspects of the data.

This is the analytics equivalent of a Wikipedia article: thorough, accurate, and useful to nobody in particular. The failure mode is different for each audience:

Executives receive a narrative with too much statistical detail and no clear action. They see "revenue grew 12.4% YoY with a 95% confidence interval of ±1.8pp" and cannot immediately map that to a decision. They needed "revenue beat the plan by $2.1M and the growth is coming from one region — here's what to do about it."

Analysts receive a narrative that skips the methodology and aggregates away the variance they care about. They see "West region outperformed" without knowing whether that's adjusted for seasonality, whether the sample is large enough to be meaningful, or whether data quality issues affect the conclusion.

Engineers receive a narrative about business outcomes when what they needed was a data quality report. They don't care that revenue grew — they care whether revenue was populated consistently, whether timestamps are UTC or local, and whether any joins produced duplicates.

Data storytelling is the craft of shaping a finding so it lands with a specific reader. The challenge for APIs and automated pipelines is doing this at scale — generating the right narrative for each stakeholder without a human editor in the loop.

The API Pattern

Every audience-aware pipeline starts the same way: upload once, then run separate analysis and refinement passes for each audience.

import requests

BASE = "https://datastory.bot"

# Step 1: Upload once
with open("q1_sales.csv", "rb") as f:
    upload = requests.post(
        f"{BASE}/api/upload",
        files={"file": ("q1_sales.csv", f, "text/csv")}
    ).json()

container_id = upload["containerId"]

# Step 2: Three separate analyze calls with audience-specific steering
# (analyze() and refine() are thin wrappers around /api/analyze and /api/refine)
exec_analysis    = analyze(container_id, steering=EXEC_STEERING)
analyst_analysis = analyze(container_id, steering=ANALYST_STEERING)
engineer_analysis = analyze(container_id, steering=ENGINEER_STEERING)

# Step 3: Refine each selected story with audience-specific tone
exec_story    = refine(container_id, exec_analysis, refinement=EXEC_REFINEMENT)
analyst_story = refine(container_id, analyst_analysis, refinement=ANALYST_REFINEMENT)
engineer_story = refine(container_id, engineer_analysis, refinement=ENGINEER_REFINEMENT)

The container has a 20-minute TTL. For typical datasets, three audience passes are well within range — each analyze call takes 20–40 seconds; each refine call takes 10–20 seconds.
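The analyze and refine helpers used in the snippet above are thin wrappers over the two endpoints. A minimal sketch, under stated assumptions: containerId, steeringPrompt, and refinementPrompt are the documented parameters, but the exact JSON body shape, and passing the full analyze result back to /api/refine, are assumptions about the request format.

```python
import requests

BASE = "https://datastory.bot"

def analyze(container_id: str, steering: str) -> dict:
    """Run /api/analyze with an audience-specific steering prompt."""
    resp = requests.post(
        f"{BASE}/api/analyze",
        # Body shape is an assumption; steeringPrompt is the documented parameter.
        json={"containerId": container_id, "steeringPrompt": steering},
    )
    resp.raise_for_status()
    return resp.json()

def refine(container_id: str, analysis: dict, refinement: str) -> dict:
    """Run /api/refine with an audience-specific refinement prompt."""
    resp = requests.post(
        f"{BASE}/api/refine",
        json={
            "containerId": container_id,
            "analysis": analysis,  # assumed: refine accepts the analyze result
            "refinementPrompt": refinement,
        },
    )
    resp.raise_for_status()
    return resp.json()
```

Error handling is deliberately minimal here; a production pipeline would retry on transient failures and surface container-expired errors distinctly.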

Audience 1: Executives

Executives need the answer, not the analysis. They read the narrative at 7am between two other tabs. They will not re-read it. If the action is not in the first two sentences, it does not reach them.

The core requirements are:

  • Lead with the single most consequential finding, quantified
  • Connect findings directly to business decisions or risks
  • Avoid statistical language (p-values, confidence intervals, R²) unless it's translated into plain English
  • End with a clear "so what" — a recommended action or a decision that needs to be made
  • Keep the total length under 300 words
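The length budget is easy to enforce mechanically before a narrative ships. A minimal check (the function name is ours, not part of the API):

```python
def within_exec_budget(narrative: str, max_words: int = 300) -> bool:
    """Return True if the executive narrative fits the word budget."""
    return len(narrative.split()) <= max_words

# A short headline easily fits the budget.
print(within_exec_budget("Q1 revenue hit $8.4M, 12% above plan."))  # True
```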

Steering Prompt (Executives)

EXEC_STEERING = (
    "Analyze this data for a C-level executive audience. "
    "Lead with the single most important business finding — "
    "the one that most directly affects revenue, cost, or risk. "
    "Quantify everything in dollar terms or percentage change where possible. "
    "Do not surface statistical methodology. "
    "Each finding should connect to a decision or action. "
    "If there are warning signs or risks in the data, flag them explicitly."
)

Refinement Prompt (Executives)

EXEC_REFINEMENT = (
    "Write this as an executive summary. "
    "Maximum 250 words. "
    "First sentence: the headline finding with the number. "
    "Second paragraph: why it matters and what's driving it. "
    "Final paragraph: the recommended action or the decision that needs to be made. "
    "Use plain language. No jargon. No hedging. "
    "Bold the single most important number."
)

Example Narrative Output (Executives)

Q1 revenue hit $8.4M — 12% above plan — but 71% of the beat came from a single region.

The West region drove $1.1M of the $1.4M plan overperformance, with APAC and East falling short of targets by 8% and 5% respectively. The concentration risk is material: if West softens in Q2, the company will miss the half-year number regardless of recovery in other regions.

The West outperformance is primarily expansion revenue from existing accounts (NRR hit 138%), not new logo growth. Win rates remained flat at 23% company-wide. The growth is real, but it is narrow.

Recommended action: Before Q2 planning, assess whether the West's expansion playbook can be replicated in APAC and East, or whether the regional outperformance reflects factors that are not transferable.

No methodology. No variance decomposition. The action is in the last sentence. This is what executives can act on.

Audience 2: Analysts

Analysts need to understand the data well enough to defend it — in front of stakeholders who will push back, in front of their manager who will ask about the methodology, and in front of next quarter's version of themselves who will need to reproduce the analysis.

Their requirements are almost the inverse of executives:

  • Show the methodology and any assumptions explicitly
  • Surface distributional information, not just means and totals
  • Flag statistical significance — and flag when sample sizes are too small for significance to be meaningful
  • Show how the current period compares to prior periods using consistent methodology
  • Expose any data quality issues that affect interpretation
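The distributional view can also be computed locally as a sanity check on what the narrative reports. A sketch using only the standard library:

```python
import statistics

def distribution_summary(values: list[float]) -> dict:
    """Mean, median, and quartiles — the distributional view, not just the mean."""
    q = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q[0],
        "q3": q[2],
    }
```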

Steering Prompt (Analysts)

ANALYST_STEERING = (
    "Analyze this data for a quantitative analyst audience. "
    "Surface the full distribution of key metrics — not just means, "
    "but medians, quartiles, and outlier counts. "
    "For any comparisons between groups (regions, segments, time periods), "
    "report effect sizes and whether differences are statistically significant. "
    "Flag any segments where sample size is too small for reliable conclusions. "
    "Show period-over-period comparisons using a consistent methodology. "
    "Document any assumptions made during analysis."
)

Refinement Prompt (Analysts)

ANALYST_REFINEMENT = (
    "Write this for a data analyst who will present findings to leadership "
    "and needs to defend the methodology. "
    "Include: the key finding, the statistical basis for it, "
    "the distribution of underlying data, any caveats about sample size or data quality, "
    "and how this compares to the prior period using consistent methodology. "
    "Use precise language. Show the numbers that support your claims. "
    "Length: 400-600 words."
)

Example Narrative Output (Analysts)

Q1 Revenue: West Region Outperformance — Statistical Basis and Caveats

West region revenue of $5.96M exceeded plan by $1.1M (+22.7%). The outperformance is statistically robust: based on the monthly variance observed in trailing 8 quarters, a deviation of this magnitude has a <2% probability of occurring by chance (z = 2.41).

Distribution: The West's deal-level revenue distribution is right-skewed (mean $47K, median $31K, IQR $18K–$58K). Three enterprise deals above $200K contributed $620K — 56% of the plan beat. Excluding these three outliers, West performance was 7.4% above plan, which is within normal variance. This matters for forecasting: if the Q2 plan is built from Q1 actuals without adjusting for large-deal concentration, it will be optimistic.

Period comparison (consistent methodology): Q1 2026 vs. Q1 2025 uses the same cohort definition (closed-won with invoice issued in quarter). Q1 2025 West revenue was $4.31M, giving a YoY growth rate of 38.3%. Note that the Q1 2025 figure was restated in Q3 2025 to remove two deals ($310K combined) that were reversed; the comparison is against the restated baseline.

Regions with insufficient sample size: APAC had 14 closed deals in Q1, compared to West's 127. Percentage comparisons for APAC should be treated as directional, not statistically reliable. A 5% miss on 14 deals is noise; the same percentage miss on 127 deals is signal.

This is a narrative an analyst can defend. The distribution caveat, the outlier decomposition, the restated baseline note, the sample size warning — these are the details that determine whether the analysis survives scrutiny.
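The sample-size caveat can be enforced mechanically before a regional comparison reaches a reader. A sketch (the cutoff of 30 deals is an assumed threshold, not something the API prescribes):

```python
def flag_small_segments(deal_counts: dict[str, int], min_n: int = 30) -> dict[str, str]:
    """Label each segment as 'signal' or 'directional' based on deal count."""
    return {
        region: "signal" if n >= min_n else "directional (n too small)"
        for region, n in deal_counts.items()
    }

# Using the counts from the example narrative:
print(flag_small_segments({"West": 127, "APAC": 14}))
```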

For more on how steering prompts shape statistical depth, see using steering prompts to control analysis direction.

Audience 3: Engineers

Data engineers and platform engineers reading a data narrative are usually answering a different question entirely: can I trust this data? They are evaluating pipeline health, looking for schema drift, checking whether transformations are producing correct output.

They are also the least interested in the business outcome. A data quality report that says "revenue looks unreliable because the amount column has 14% nulls" is more valuable to them than a full narrative about what the revenue trend means.

Their requirements:

  • Report null rates, type inconsistencies, and duplicate rows
  • Flag columns with unexpected value ranges (negative revenues, future timestamps, IDs with mixed formats)
  • Surface any join or aggregation artifacts
  • Report row counts and any filtering applied during analysis
  • Note anything that would cause a downstream consumer to misinterpret the data
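These checks can be mirrored locally to validate what the engineer-facing narrative reports. A minimal pure-Python sketch over rows parsed from the CSV (the revenue column name comes from the running example; it assumes uniform columns across rows):

```python
def quality_report(rows: list[dict]) -> dict:
    """Null rates per column, exact-duplicate count, and a negative-value check."""
    columns = rows[0].keys() if rows else []
    n = len(rows)
    null_rates = {
        col: sum(1 for r in rows if r.get(col) in (None, "")) / n
        for col in columns
    }
    seen, dupes = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))  # dict keys are unique, so this sorts safely
        if key in seen:
            dupes += 1
        seen.add(key)
    negatives = sum(
        1 for r in rows
        if isinstance(r.get("revenue"), (int, float)) and r["revenue"] < 0
    )
    return {
        "row_count": n,
        "null_rates": null_rates,
        "duplicate_rows": dupes,
        "negative_revenue_rows": negatives,
    }
```

A real pipeline would layer on the type-consistency and date-range checks from the list above; this covers the shape of the report.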

Steering Prompt (Engineers)

ENGINEER_STEERING = (
    "Perform a data quality assessment on this dataset before analyzing content. "
    "Report: null rates per column, columns with unexpected data types or mixed types, "
    "duplicate rows or duplicate primary keys, value range anomalies "
    "(negative values in fields that should be positive, future dates, etc.), "
    "and any column where more than 5% of values are outside the expected range. "
    "After the data quality report, analyze the content — but note which findings "
    "may be affected by the quality issues."
)

Refinement Prompt (Engineers)

ENGINEER_REFINEMENT = (
    "Format this as a data quality report followed by a summary of findings. "
    "Section 1: Data Quality — table of columns with null rates, type issues, "
    "and value range anomalies. Flag any column that should not be used in "
    "calculations without cleaning. "
    "Section 2: Findings — what the data shows, with explicit notes on "
    "which findings are affected by the quality issues. "
    "Use technical precision. Include row counts and filter conditions applied."
)

Example Narrative Output (Engineers)

Data Quality Report: q1_sales.csv

Input: 2,847 rows × 12 columns after header parsing. No encoding errors detected.

Quality issues flagged:

Column | Issue | Severity
discount_pct | 14.2% nulls (405 rows) | High — affects margin calculations
close_date | 3 rows with dates in 2031 (likely typos) | Medium — excluded from time-series
customer_id | Mixed formats: 2,801 rows CUST-XXXXX, 46 rows integer-only | Low — join risk to customer dimension
region | 12 rows with value "Unknown" | Low — excluded from regional aggregations

Duplicate check: 0 exact duplicate rows. opportunity_id is unique across all rows.

Findings (with quality caveats): Revenue totals are reliable (no nulls in revenue, value range 0–$487K with no negatives). Regional comparisons exclude 12 "Unknown" rows (0.4% of total revenue — immaterial). Margin analysis using discount_pct is unreliable without imputing the 405 null rows; the narrative's margin figures reflect the 2,442 rows where discount is populated.

Pipeline recommendation: add a NOT NULL constraint on discount_pct at ingestion, or document the null-means-zero assumption explicitly in the schema.

This output is what an engineer needs before they sign off on a dashboard being built from this data. The business narrative is present, but it is subordinate to the trust question.

Practical Considerations

Run a separate analyze call per audience; don't share one result. It is tempting to run /api/analyze once and then call /api/refine with different refinement prompts for each audience. This works when the story angle is audience-agnostic — but executives and engineers often need different story angles, not just different prose. The executive wants the business impact story; the engineer wants the data quality story. These require different analyze results, not different refinements of the same result.

Container TTL is the binding constraint. Six API calls (three analyze, three refine) must complete within 20 minutes of the upload. Parallelize the analyze calls where possible; the refine calls depend on the analyze results, so they run sequentially per audience. Download chart images and filtered dataset files before the container expires — these are only accessible via /api/files while the container is live.
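Fanning out the analyze calls is straightforward with a thread pool. The helper below takes the analyze function as a parameter, so it works with whatever wrapper your pipeline uses (the function and parameter names are ours, not part of the API):

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_all(analyze_fn, container_id: str, steerings: dict) -> dict:
    """Run one analyze call per audience concurrently; results keyed by audience."""
    with ThreadPoolExecutor(max_workers=len(steerings)) as pool:
        futures = {
            audience: pool.submit(analyze_fn, container_id, prompt)
            for audience, prompt in steerings.items()
        }
        # .result() re-raises any exception from the worker thread
        return {audience: f.result() for audience, f in futures.items()}
```

The refine calls then run sequentially per audience, each consuming its own analyze result.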

Version your prompts. Audience-steering prompts encode assumptions about what each stakeholder cares about. When those assumptions change, update the prompt. Treat these as configuration, not inline strings.
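One way to treat prompts as configuration is a versioned JSON file keyed by audience (the file layout here is an assumption, not a DataStoryBot convention):

```python
import json

# Assumed layout of prompts.json:
# {"exec": {"v1": {"steering": "...", "refinement": "..."}}, ...}

def load_prompts(path: str, audience: str, version: str) -> dict:
    """Load the steering/refinement pair for one audience at a pinned version."""
    with open(path) as f:
        return json.load(f)[audience][version]
```

Pinning the version in the pipeline config makes prompt changes reviewable and revertible like any other deploy.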

Further Reading

The steering prompt mechanics that power the audience-aware pattern are covered in detail in using steering prompts to control analysis direction.

For the foundational principles behind why narrative structure matters — and how to write a data story that actually drives decisions — see how to write a data story.

For the distinction between a data story and a data visualization, and why both are necessary for different audiences, see data storytelling vs. data visualization.

Ready to find your data story?

Upload a CSV and DataStoryBot will uncover the narrative in seconds.

Try DataStoryBot →