Guide · 2026-04-20 · 10 min

10 python-pptx limitations we hit in production (and how we solved them)

After 18 months running slideforge.dev on top of python-pptx, here are the specific limitations we kept running into — embedded-picture loss on merge, canvas overflow, chart-type gaps, fonts, memory at 1K+ slides — and the code workarounds.

python-pptx is the de facto Python library for generating PowerPoint files — MIT licensed, mature, used in production at thousands of companies. We use it internally at slideforge.dev. But over 18 months of shipping an API-first slide engine on top of it, we've hit a list of real limitations. This post is a catalog of the worst ones, with the specific code workarounds we ended up writing.

Nothing here is a critique of Steve Canny's work — python-pptx is doing exactly what it was designed for: low-level .pptx manipulation. These limitations only show up when you try to run it as a high-throughput production engine that generates consulting-grade output from structured data.

1. Merging decks silently loses embedded images

This was the first one that cost us a day of debugging. You have two decks — each has icons, logos, and maybe AI-generated images — and you want to concatenate them into one.

from pptx import Presentation
from copy import deepcopy

def merge_bad(files):
    master = Presentation(files[0])
    for f in files[1:]:
        src = Presentation(f)
        for slide in src.slides:
            # Deep-copy the slide element into master — seems right
            xml_slide = deepcopy(slide._element)
            master.slides._sldIdLst.append(...)  # etc
    return master

The output opens without errors. Text is intact. Charts are intact. But every Picture shape points to a relationship ID that doesn't exist in the master deck, so when you open it in PowerPoint, the images are blank placeholders.

The fix: extract each picture's blob, re-add it via add_picture() in the new deck, then rewrite the XML reference. We wrote _copy_slide_with_pictures() that does this slide-by-slide. ~80 lines to get right. Every .pptx spec we produce now goes through this helper rather than a naive deep-copy.

2. Right-edge overflow on 16:9 canvas (the 13.33″ budget problem)

python-pptx lets you place any shape at any coordinate — including off the slide. There's no overflow: hidden, no canvas-bounds validator. If you add a 4-inch-wide text box at left=Inches(10), it extends past the slide's right edge and gets clipped when rendered.

For an AI-generated layout, this happens constantly. The model computes coordinates based on its understanding of the layout and doesn't always reconcile against the 13.33" × 7.5" canvas.

The fix: a static linter that runs before the sandbox executes the generated code. 46 heuristic rules check every shape: canvas bounds, text overlap via containment + stacking filters with a 15% threshold, contrast ratios, font size minimums, right-edge annotation budgets. When a violation is detected, the failure message is fed back into the LLM prompt and it retries. Adding the linter dropped our broken-slide rate from ~12% to under 1%.
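To make the canvas-bounds rule concrete, here is a minimal sketch of one such check. It is not one of our 46 rules verbatim: the Box type, the function name, and the 0.01-inch tolerance are assumptions for illustration, and all measurements are in inches.

```python
from dataclasses import dataclass

SLIDE_W_IN = 13.333  # 16:9 widescreen width in inches
SLIDE_H_IN = 7.5     # 16:9 widescreen height in inches


@dataclass
class Box:
    """Hypothetical stand-in for a shape's bounding box, in inches."""
    name: str
    left: float
    top: float
    width: float
    height: float


def canvas_violations(boxes, slide_w=SLIDE_W_IN, slide_h=SLIDE_H_IN, tol=0.01):
    """Return one human-readable message per edge that leaves the canvas."""
    messages = []
    for b in boxes:
        right, bottom = b.left + b.width, b.top + b.height
        if b.left < -tol:
            messages.append(f"{b.name}: left edge {b.left:.2f}in is off-canvas")
        if b.top < -tol:
            messages.append(f"{b.name}: top edge {b.top:.2f}in is off-canvas")
        if right > slide_w + tol:
            messages.append(
                f"{b.name}: right edge {right:.2f}in exceeds slide width {slide_w:.2f}in"
            )
        if bottom > slide_h + tol:
            messages.append(
                f"{b.name}: bottom edge {bottom:.2f}in exceeds slide height {slide_h:.2f}in"
            )
    return messages
```

The 4-inch box at left=Inches(10) from the example above fails because 10 + 4 = 14 inches, past the 13.33-inch right edge. The returned strings are the kind of message that can be fed back into the retry prompt. With python-pptx shapes, the .inches property on shape.left, shape.top, shape.width, and shape.height yields the values Box expects.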

3. No chart types beyond the classic families

python-pptx supports the classic chart families (XL_CHART_TYPE.LINE, BAR_CLUSTERED, PIE, AREA, plus doughnut, radar, XY scatter, and bubble) and their variants. If you want waterfall, funnel, marimekko, sunburst, treemap, bullet, heatmap, or Gantt, you're out of luck with the native chart API.

These are exactly the charts consulting and finance teams want.

The fix: we build them as shape compositions, not native charts. A waterfall chart in slideforge.dev is a horizontal flex layout of rectangles with accent lines and value labels. It isn't a native PowerPoint chart, but it is a legitimate visual that's still editable shape-by-shape after download. 35 such composable components ship today (Metric, BarList, OrgChart, ThreeHorizons, MaturityModel, RAGScorecard, BurndownChart, CapTable, UnitEconomics, Heatmap, Gantt, Swimlane, Roadmap, etc.), and they compose: nest any component inside a SplitView or Card container for dashboards.
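As a rough illustration of "chart as shape composition", the sketch below computes only the rectangle geometry for a waterfall. It is not our component code: the function name, the gap parameter, and the tuple layout are assumptions, and accent lines and value labels are left out.

```python
def waterfall_rects(deltas, left, top, width, height, gap=0.2):
    """Map waterfall steps to (x, y, w, h, is_increase) tuples in inches.

    Each bar spans from the running total before its step to the running
    total after it. y grows downward, as on a slide.
    """
    if not deltas:
        return []
    totals = [0.0]
    for d in deltas:
        totals.append(totals[-1] + d)
    lo, hi = min(totals), max(totals)
    span = (hi - lo) or 1.0          # avoid dividing by zero on flat data
    slot = width / len(deltas)       # horizontal space per step
    bar_w = slot * (1 - gap)
    rects = []
    for i, d in enumerate(deltas):
        before, after = totals[i], totals[i + 1]
        y = top + (hi - max(before, after)) / span * height
        h = abs(d) / span * height
        x = left + i * slot + (slot - bar_w) / 2
        rects.append((x, y, bar_w, h, d >= 0))
    return rects
```

Each tuple can then be drawn with shapes.add_shape(MSO_SHAPE.RECTANGLE, Inches(x), Inches(y), Inches(w), Inches(h)) and filled by direction, which is what keeps the result editable shape-by-shape.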

4. Animation support is effectively absent

There's an open feature request for entrance/exit animations. It's been open since 2018. The underlying XML is complex and every attempt has run into edge cases.

The workaround: we don't generate animations. 100% of slideforge.dev output is animation-free by choice. For slides that genuinely need animation, the workflow is: generate in slideforge.dev, open in PowerPoint, add animations manually. Not ideal, but honest about the constraint.

5. Font substitution when the template font isn't embedded

python-pptx will happily write run.font.name = "Helvetica Neue" into the XML whether or not that font exists on the machine opening the file. PowerPoint silently substitutes a local font, which usually changes line height and where lines wrap.

For consulting-grade decks where every pixel of typography matters, this is a failure mode we have to actively defend against.

The fix: theme resolution at render time. slideforge.dev ships three theme variants (modern sans, classic serif, monospace) with PowerPoint-safe font stacks (Calibri → Arial → Aptos, Times → Cambria → Georgia, etc.) and a theme override that lets users upload a corporate .pptx and inherit its exact font stack. We also embed fonts when the user's theme explicitly requests it.
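A sketch of what stack-based resolution could look like, under stated assumptions: the theme keys and the safe_fonts set (fonts you trust to exist on the reader's machine) are hypothetical, and only the two stacks named above are included.

```python
THEME_FONT_STACKS = {
    # Stacks as listed above; the dictionary keys are illustrative names.
    "modern-sans": ["Calibri", "Arial", "Aptos"],
    "classic-serif": ["Times", "Cambria", "Georgia"],
}


def resolve_font(requested, theme, safe_fonts):
    """Pick the font name to write into the deck.

    Keeps the requested font if it is considered safe, otherwise walks the
    theme's stack in order and returns the first safe candidate.
    """
    if requested in safe_fonts:
        return requested
    for candidate in THEME_FONT_STACKS[theme]:
        if candidate in safe_fonts:
            return candidate
    raise LookupError(f"no safe font for theme {theme!r}")
```

The resolved name is what gets assigned to run.font.name, so the substitution happens deliberately at generation time instead of silently when the file is opened.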

6. Performance degrades past ~1,000 slides

python-pptx holds the entire .pptx in memory as a parsed XML tree via lxml. For a 50-slide deck this is fine (~50MB). Past 500 slides it starts to hurt. At 1,000+ slides, common for report-generation use cases, memory gets heavy and save times climb non-linearly.

We hit this with a customer doing 323-slide sprint reports. RSS peaked at 1.7GB on a 2GB container, which is right at the OOM line.

The fix: for translation and read-only operations we built an lxml-only extractor that bypasses python-pptx's object model, operates on the XML stream directly, and stays under 300MB RSS even on 500-slide decks. For generation (where we typically produce 1 slide per API call anyway), we parallelize: 5 slides rendered concurrently via asyncio.Semaphore(3), then merged via our custom merge_presentations() (see #1).
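The bounded-concurrency half of that can be sketched as follows. This is an assumption-laden outline, not our renderer: render_one is a hypothetical coroutine that produces one slide, and a real python-pptx render is synchronous, so it would need to be pushed to a thread or process from inside it.

```python
import asyncio


async def render_all(specs, render_one, limit=3):
    """Render every spec, never running more than `limit` at once.

    Results come back in the same order as `specs`, which keeps the
    later merge step deterministic.
    """
    sem = asyncio.Semaphore(limit)

    async def guarded(spec):
        # The semaphore blocks here once `limit` renders are in flight.
        async with sem:
            return await render_one(spec)

    return await asyncio.gather(*(guarded(s) for s in specs))
```

Capping in-flight renders is what keeps peak memory predictable: each render holds its own parsed deck, so the limit bounds how many trees exist at the same time.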

7. “Insert Picture from URL” doesn't exist

slide.shapes.add_picture() requires a local filesystem path or a file-like object. If your image lives at https://..., you download it first.

The fix: an add_image() helper wrapped around add_picture() that accepts either a filesystem path or an HTTPS URL, downloads to a temp file, applies our crop-to-fit aspect-ratio logic via Pillow, and cleans up. Tiny workaround, but one more thing every downstream user had to write before we added it.
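A stripped-down sketch of such a wrapper, using only the standard library for the download. It leaves out the Pillow crop-to-fit step, and the signature and the 10-second timeout are assumptions. The shapes argument is anything with python-pptx's add_picture(image_file, left, top, width, height) method, typically slide.shapes.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def add_image(shapes, source, left, top, width=None, height=None, timeout=10):
    """Add a picture from a local path or an HTTPS URL."""
    parsed = urlparse(source)
    if parsed.scheme != "https":
        # Treat anything else as a filesystem path.
        return shapes.add_picture(source, left, top, width, height)

    suffix = os.path.splitext(parsed.path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as fh:
            with urllib.request.urlopen(source, timeout=timeout) as resp:
                fh.write(resp.read())
        # add_picture() reads the file immediately, so deleting it in the
        # finally block is safe.
        return shapes.add_picture(tmp_path, left, top, width, height)
    finally:
        os.remove(tmp_path)
```

Only HTTPS is treated as remote, so plain HTTP URLs fall through to the path branch and fail there rather than being fetched.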

8. XML round-trips can drop slide master properties

If you open a customer-provided .pptx and iterate over slides, python-pptx re-serializes the XML on save. For simple decks this is lossless. For decks with uncommon features (e.g. slide masters with embedded videos, custom XML parts for interactive widgets, or SmartArt generated by PowerPoint's UI), the re-serialization strips what python-pptx doesn't understand.

The customer opens the output and finds their SmartArt is now a static image. Rare, but a failure mode when it happens.

The fix: we don't round-trip customer decks. For translation (translate_deck) we operate on the XML via lxml directly — no python-pptx parse step — and only rewrite text runs. For merging, we only merge slides we generated, which are always simple.
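To show what "no python-pptx parse step" can look like, here is the read half only: pulling text runs straight out of the package. It is a sketch with a swapped-in parser. It uses the standard library's ElementTree instead of lxml, which is fine for reading but not for writing back, since ElementTree renames namespace prefixes on serialization. The function name is illustrative.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml$")


def iter_text_runs(pptx_file):
    """Yield (part_name, text) for every <a:t> run in every slide part.

    Parts are visited by the number in their file name, which is not
    necessarily presentation order.
    """
    with zipfile.ZipFile(pptx_file) as zf:
        names = sorted(
            (n for n in zf.namelist() if SLIDE_RE.match(n)),
            key=lambda n: int(SLIDE_RE.match(n).group(1)),
        )
        for name in names:
            with zf.open(name) as fh:
                # iterparse streams the part instead of building a full tree.
                for _, el in ET.iterparse(fh):
                    if el.tag == f"{{{A_NS}}}t" and el.text:
                        yield name, el.text
                    el.clear()  # release memory held by finished elements
```

Because nothing outside the slide parts is ever parsed, masters, media, and custom XML parts are never touched, let alone re-serialized.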

9. Chart data updates via the chart.replace_data() API are fragile

Updating a chart's underlying data is supposed to be easy: get the chart, call replace_data(CategoryChartData()). In practice, the shape of the existing chart (category axis, series count, data labels) has to match exactly or you get an exception. Some chart types don't support replace_data at all.

The fix: for most charts we regenerate from scratch rather than update in place. For the rare case a user wants to update an existing .pptx chart, we strongly recommend regenerating the slide.

10. No built-in PDF export

python-pptx produces .pptx files. It has no notion of PDF.

The fix: we shell out to libreoffice --headless --convert-to pdf in a subprocess sandbox. On Azure Container Apps with 1 CPU, a 10-slide deck converts in about 4–6 seconds. Not python-pptx's job, but worth noting because every production workflow needs it eventually.
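A minimal sketch of that subprocess call, without the sandboxing. The function names, the 120-second timeout, and the assumption that the binary is on PATH as libreoffice (some installs expose it as soffice) are illustrative choices.

```python
import subprocess
from pathlib import Path


def build_pdf_command(pptx_path, out_dir, binary="libreoffice"):
    """Return the argv list for a headless PPTX-to-PDF conversion."""
    return [
        binary, "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(pptx_path),
    ]


def convert_to_pdf(pptx_path, out_dir, timeout=120):
    """Run the conversion and return the path of the resulting PDF."""
    subprocess.run(
        build_pdf_command(pptx_path, out_dir),
        check=True,            # raise if LibreOffice exits non-zero
        timeout=timeout,       # don't let a stuck conversion hang the worker
        capture_output=True,
    )
    pdf_path = Path(out_dir) / (Path(pptx_path).stem + ".pdf")
    if not pdf_path.exists():
        raise RuntimeError(f"conversion produced no file at {pdf_path}")
    return pdf_path
```

The explicit existence check is there because a zero exit code alone is not treated as proof that the PDF was written.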

Bonus — things that python-pptx actually nails

To be fair, a lot of things python-pptx handles well:

  • Text box API is excellent. Granular paragraph + run control. Font, color, alignment, hyperlinks, all exposed cleanly.
  • Shape geometry. All auto-shapes, custom shapes, connectors — solid coverage.
  • Table API. Cell-by-cell access with merged-cell support. Works.
  • Picture basics. Add, crop, reposition — reliable.
  • Slide layouts. Access to slide masters and layouts is ergonomic.
  • XML escape hatch. You can always drop down to raw XML via ._element when the Python API doesn't expose something.

For most projects, python-pptx is still the right choice. It's free, it's transparent, it works.

Where slideforge.dev fits

We built slideforge.dev as a hosted API on top of python-pptx (and lxml for the performance-critical paths) with all ten of the above workarounds baked in. You POST a brief or a structured spec, and the engine handles the canvas math, font stack, chart composition, picture re-embedding, visual QA, and PDF export automatically. If the ten workarounds above resonate with code you've written yourself, save the time: call our API for $0.03/slide or hit the MCP server from Claude Desktop.

But if you're doing something python-pptx already covers well and you don't need consulting-grade defaults, use python-pptx. It's genuinely a great library. We'd probably still be using it as-is if we weren't shipping a hosted service.
