How to remove all images from a PDF (text-only export)

Strip every image from a PDF — text-only extraction, image-object removal, or re-export — and verify nothing visual remains.

By ScoutMyTool Editorial Team · Last updated: 2026-05-21

I once needed the text of a 90-page image-heavy report to feed into a search index, and I burned an hour fighting tools that "removed" the images by hiding them: the file barely shrank and the pictures were still extractable underneath. My mistake was not the choice of tool; it was failing to decide which of two different jobs I actually had. "Remove all images" can mean "give me only the words" or "give me the same document with the pictures gone", and the two call for different tools. This guide separates the two, shows how to do each cleanly, and explains how to verify that nothing visual is left behind.

Methods compared

Method | Keeps layout? | Result | Best for
Extract text only | No (reflows to plain text) | A .txt / text-only document, zero images | Feeding text to search, NLP, or accessibility
Remove image objects | Yes (pages keep structure) | Same PDF, images deleted, gaps remain | Lighter file that still looks like the original
Re-export without images | Mostly | Clean PDF rebuilt text-only | A tidy text-only PDF to share
Print to PDF (text only) | Partial | Depends on source app settings | Quick one-off from a desktop app
OCR a scan to text | No | Text recovered from an image-only PDF | Scanned PDFs that are all image

Step by step — produce a clean text-only export

  1. Decide which job you have. Need only the words? Choose text extraction. Need the same document minus pictures? Choose image-object removal. This one decision determines everything that follows.
  2. Check for a text layer. Try selecting text in the PDF. If it highlights, the text is extractable directly. If nothing selects, the page is most likely a scan (or its text was converted to outlines), and you will need OCR first to create text before you can export it.
  3. Extract the text (text-only path). Run the PDF through a text extractor to output the words as plain text. By definition this drops every image and graphic, giving a guaranteed image-free result ready for indexing, analysis, or accessibility.
  4. Or remove image objects and rebuild (keep-layout path). Use a tool that deletes the embedded image objects and then re-writes the document, so the freed space is actually reclaimed and the file shrinks — not just covered over.
  5. Verify nothing visual remains. Open the output and scroll every page; for the keep-layout path, try to extract images from the new file and confirm none come out, and check the file size dropped. Trust the result only after this check.
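
The verification in step 5 can be roughed out in a few lines of Python. This is a minimal sketch, not a definitive check: it scans the raw bytes of a PDF for the `/Subtype /Image` entry that marks an embedded image object and compares file sizes. The function names (`count_image_markers`, `verify_images_removed`) are made up for illustration, and the scan is a heuristic: it will not see inline images tucked inside compressed page content, so treat a zero count as a good sign rather than proof.

```python
import re
from pathlib import Path

# Matches the dictionary entry that marks an image object, with or without
# whitespace between the two names (hypothetical helper constant).
IMAGE_MARKER = re.compile(rb"/Subtype\s*/Image\b")


def count_image_markers(pdf_bytes: bytes) -> int:
    """Count image-object markers found in the raw bytes of a PDF."""
    return len(IMAGE_MARKER.findall(pdf_bytes))


def verify_images_removed(original: Path, stripped: Path) -> dict:
    """Compare image-marker counts and file sizes of two PDF files."""
    before = original.read_bytes()
    after = stripped.read_bytes()
    markers_after = count_image_markers(after)
    return {
        "markers_before": count_image_markers(before),
        "markers_after": markers_after,
        "bytes_before": len(before),
        "bytes_after": len(after),
        # Heuristic verdict: no markers left and the file got smaller.
        "looks_clean": markers_after == 0 and len(after) < len(before),
    }
```

Calling `verify_images_removed(Path("report.pdf"), Path("report-noimages.pdf"))` (both file names are placeholders) returns the counts and sizes side by side. Scrolling every page is still worth doing, since the script only looks at bytes.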

Don’t lose meaning that lived in the pictures

The honest caveat: any information carried only by an image — a chart, a diagram, a scanned figure, a signature — disappears when the image does, in both approaches. Text-only extraction discards it because it keeps only characters; image-object removal discards it because the object is deleted. Before you strip images from anything important, ask whether a figure carries content you need. If it does, capture it as text or a description first (OCR a chart’s data table, write alt text for a diagram), then remove the image. Otherwise you end up with a clean, light, text-only file that is also missing something it should have kept.

FAQ

Does "remove all images from a PDF" really cover two different jobs?
Yes, and choosing the wrong one wastes time. Job one is text-only extraction: you want just the words, with no images and usually no layout, producing a plain text document for search, analysis, or accessibility. Job two is image-object removal: you want the same PDF — same pages, same text positions — but with the embedded pictures deleted to shrink the file or strip visuals, leaving blank space where images were. Decide first whether you need the words alone or the document-minus-pictures, because the tools and the results are completely different.
How do I get a true text-only version with no images at all?
Use text extraction. A text extractor reads the PDF’s text layer and outputs the words as plain text, discarding every image, vector graphic, and layout element by definition — there is nothing visual left because only the characters are exported. This is the cleanest way to guarantee zero images. The trade-off is that you lose formatting and positioning; you get a linear stream of text. That is exactly what you want for indexing, language processing, or screen-reader-friendly content, and exactly what you do not want if you need the document to still look like the original.
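
As a rough illustration of the text-only path, here is a minimal Python sketch. It assumes the third-party pypdf package is installed; the article itself does not prescribe a library, so treat pypdf as one possible choice. The helper names (`has_text_layer`, `extract_page_texts`, `export_text_only`) and the 20-character threshold are hypothetical. The sketch refuses to write a file when the PDF appears to have no text layer, which is the case where OCR is needed first.

```python
from pathlib import Path


def has_text_layer(page_texts: list[str], min_chars: int = 20) -> bool:
    """Return True if the pages hold at least min_chars non-whitespace characters."""
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars


def extract_page_texts(pdf_path: Path) -> list[str]:
    """Extract the text of each page; images are simply never read."""
    # Assumption: pypdf is installed (pip install pypdf). Imported lazily so
    # the pure helper above can be used without it.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    return [page.extract_text() or "" for page in reader.pages]


def export_text_only(pdf_path: Path, txt_path: Path) -> bool:
    """Write a plain-text export; return False if no text layer was found."""
    texts = extract_page_texts(pdf_path)
    if not has_text_layer(texts):
        return False  # probably a scan: run OCR before extracting
    txt_path.write_text("\n\n".join(texts), encoding="utf-8")
    return True
```

Because the output is a plain `.txt` file, there is nowhere for an image to survive. The cost, as noted above, is that layout and positioning are gone.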
Why does my PDF stay almost the same size after I "delete" the images?
Usually because the images were not truly removed, only hidden or covered, or because the file was not rebuilt afterward. Drawing a white box over an image leaves the image data in the file. Even genuinely removing image objects can leave the freed space until the PDF is re-saved/optimised, which rewrites the file without the orphaned data. To actually shrink the file, use a tool that deletes the image objects and then re-writes the document, and confirm by checking the new file size and trying to extract images afterward.
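
The delete-then-rewrite step can be sketched in Python as follows. This is one possible approach under stated assumptions, not the only way to do it: it assumes the third-party pypdf package, whose `PdfWriter` provides a `remove_images()` method, and it writes the result to a new file so the document is rebuilt rather than patched. The function names and the percentage helper are hypothetical. How much the file actually shrinks depends on the PDF and the pypdf version, so the returned percentage is something to check, not a promise.

```python
from pathlib import Path


def size_reduction_percent(bytes_before: int, bytes_after: int) -> float:
    """Percentage by which the file shrank (negative if it grew)."""
    if bytes_before <= 0:
        raise ValueError("bytes_before must be positive")
    return round(100.0 * (bytes_before - bytes_after) / bytes_before, 1)


def remove_images_and_rewrite(src: Path, dst: Path) -> float:
    """Copy src into a fresh PDF without its images; return the size reduction in %."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader, PdfWriter

    writer = PdfWriter()
    writer.append(PdfReader(str(src)))  # copy the pages into a new document
    writer.remove_images()              # drop the images from the copied pages
    with dst.open("wb") as handle:
        writer.write(handle)            # write a rebuilt file, not an in-place patch
    return size_reduction_percent(src.stat().st_size, dst.stat().st_size)
```

A result close to zero is the warning sign described above: the images were probably hidden rather than removed, or the orphaned data is still in the file.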
My PDF is a scan — it is entirely images. How do I make it text-only?
A scanned PDF has no text layer; every page is an image of the page, so there is no text to extract until you create one. Run OCR first to recognise the characters in the images, which generates a text layer, then extract that text. Quality depends on the scan: 300 DPI or higher and good contrast give the best recognition. After OCR-and-extract, you have a text-only document, but verify it against the source, because OCR introduces recognition errors that direct extraction from a born-digital PDF largely avoids.
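
To check the 300 DPI guideline before running OCR, the arithmetic is simple enough to sketch in Python. The names here are hypothetical, and the calculation assumes the scanned image fills the full width of the page and that the page uses the default PDF unit of 72 points per inch.

```python
POINTS_PER_INCH = 72  # default PDF unit: 72 points to the inch


def effective_dpi(image_width_px: int, page_width_pt: float) -> float:
    """Resolution of a full-width page image, in pixels per inch."""
    if page_width_pt <= 0:
        raise ValueError("page_width_pt must be positive")
    page_width_in = page_width_pt / POINTS_PER_INCH
    return image_width_px / page_width_in


def good_enough_for_ocr(image_width_px: int, page_width_pt: float,
                        min_dpi: float = 300.0) -> bool:
    """True if the scan meets the minimum resolution suggested for OCR."""
    return effective_dpi(image_width_px, page_width_pt) >= min_dpi
```

For example, a US Letter page is 612 points (8.5 inches) wide, so a page image 2550 pixels wide works out to 300 DPI, while one 1275 pixels wide is only 150 DPI and is likely to give weaker recognition.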
Will removing images break the document’s text or accessibility?
Text-only extraction can improve accessibility, since a clean text stream is exactly what screen readers and indexers want. Image-object removal keeps the text and layout but leaves gaps where visuals were, which can confuse a reader who expected a figure. In both cases, meaning carried only by an image (a chart, a diagram, a signature, a photo) is lost when the image is, so capture any essential image content as text or alt text first.
Is it safe to strip images from a confidential PDF online?
Only if the work happens on your own device. Server-side tools upload the document to a remote machine to process it, so a confidential file leaves your control and may be cached. Client-side (in-browser) tools extract text or remove image objects locally, so the file never leaves your computer; ScoutMyTool’s PDF tools work this way. For anything sensitive, confirm that the tool processes files client-side before you open the document in it, or use offline desktop software such as a local Ghostscript pipeline.

Export a PDF to text only, no images

ScoutMyTool PDF to Text extracts just the words — every image and graphic dropped by definition — and runs entirely in your browser, so the document never leaves your computer.
