How to edit a scanned PDF — extract and modify text

Five real approaches to editing a scanned PDF, with the limits of each and the free tools that handle them.

By ScoutMyTool Editorial Team · Last updated: 2026-05-20

Introduction

A landlord I know asked me last month to "fix one typo" in a scanned tenancy agreement: page 3, line 9, "tennant" should be "tenant". His instinct was to open the PDF and click on the word; my instinct was to look at the file structure. The PDF was a scanned image with no text layer, so clicking did nothing. The edit ended up taking forty seconds with a two-step workflow (OCR first, then in-place edit), but only because we picked the right approach. This article explains why "edit a scanned PDF" is more nuanced than it sounds, walks through the five real approaches, and shows which one fits which use case.

Why editing a scanned PDF is different from editing a regular PDF

A regular PDF exported from Word, Google Docs, or any digital authoring tool contains its text as characters in a font: every glyph on the page is text in the file structure, addressable per the ISO 32000-1 specification [1]. Editing it is straightforward: click a word, type new text, and the file's text content updates.

A scanned PDF contains only pixels. There is no text in the file structure — the page is a photograph of a document. Clicking on a word does nothing because there is no word to click; there is only an image of a word. To "edit" a scanned PDF you first have to do something to convert pixels into text, or you have to overlay new text on top of the image, or you have to accept that you are not really editing the original but creating a new document. All five approaches below are variations on those three core strategies.
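One way to check which kind of file you have is to try extracting text from each page and see whether anything comes back. The sketch below assumes the third-party pypdf library for extraction; the function names and the 20-character threshold are illustrative choices, not part of any tool mentioned in this article.

```python
from pathlib import Path


def extract_page_texts(pdf_path: Path) -> list[str]:
    """Return the extractable text of each page (empty string if none)."""
    # Lazy import so the classifier below works without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(str(pdf_path))
    return [(page.extract_text() or "") for page in reader.pages]


def classify_pages(page_texts: list[str], min_chars: int = 20) -> list[str]:
    """Label each page 'text' or 'image-only'.

    min_chars is an arbitrary threshold (an assumption): a stray page
    number or stamp should not count as a real text layer.
    """
    labels = []
    for text in page_texts:
        visible = "".join(text.split())  # drop all whitespace
        labels.append("text" if len(visible) >= min_chars else "image-only")
    return labels


def needs_ocr(page_texts: list[str]) -> bool:
    """True if at least one page has no usable text layer."""
    return "image-only" in classify_pages(page_texts)
```

Calling `needs_ocr(extract_page_texts(Path("contract.pdf")))` on a raw scan should return `True`. A scan that has already been through OCR will report as text, which is the desired outcome for the editing approaches below.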

Five approaches — and when each one fits

| Approach | What it does | Best for | Limits |
| --- | --- | --- | --- |
| 1. OCR + small in-place text edits | OCR the scan to add a searchable text layer; use a "PDF edit text" tool to modify individual words or short phrases in place. | Fixing a typo, updating a date, replacing one phrase with another. Single-word or single-line changes. | The visual font on the page is rasterised. The edited text uses a substitute font that may not match exactly: small changes look fine, larger edits look conspicuous. |
| 2. OCR + extract to Word + re-author + export | OCR the scan, convert the searchable PDF to a Word document, edit in Word, export back to PDF. | Substantial text changes, restructuring, formatting updates. The right path when you need to rewrite a paragraph or add new content. | Visual fidelity degrades: the resulting PDF looks like a Word document, not the original scan. For documents where appearance matters, this loses the original look entirely. |
| 3. Image edit + flatten back to PDF | Convert each page to an image, edit the image in a raster editor (Photoshop, GIMP, Affinity Photo), and re-assemble as a PDF. | Documents where the original visual must be preserved exactly: old scanned forms, certificates, illustrated content. | Time-intensive. The edited text remains a raster image (not text), so the resulting PDF is still image-only and not searchable unless re-OCRed afterwards. |
| 4. Annotate without modifying the original | Add text annotations, sticky notes, or overlay text boxes onto the scan without changing the underlying image. | Marking up a document for review, redlining, adding comments. The original page stays untouched; annotations sit on top. | Not actually editing the source text; it adds layered annotations. The reviewer can still see the original under the annotation. |
| 5. Re-create from scratch | Type the document content into a fresh Word / Google Docs / Markdown file and export to PDF. | Documents where the original is illegible, badly damaged, or the change is large enough that starting fresh is cheaper than fighting the conversion. | Time investment scales with document length. Trade-off vs Approach 2: more control, more time. |

Approach 1 in detail — OCR + in-place text edit

The right approach for small fixes (typos, dates, single phrases) in scanned documents you want to keep visually identical. Two steps:

  1. OCR the scan. Use ScoutMyTool's PDF OCR, ocrmypdf on the command line, or Adobe Acrobat Pro's built-in OCR. The result is a searchable PDF that looks identical to the scan but has a text layer underneath that the editor can target.
  2. Edit in place. ScoutMyTool's PDF Edit Text tool identifies the text in the OCRed layer, lets you click on a word, type a replacement, and re-renders the page with the new text in roughly the same position. Adobe Acrobat Pro's "Edit PDF" tool does the same with better font matching but at subscription cost.
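Step 1 can be scripted when there are many files to process. This is a minimal sketch, assuming ocrmypdf is installed and on the PATH. The helper names are hypothetical, and the flags shown (`--language`, `--deskew`, `--skip-text`) are one reasonable combination rather than a configuration this article prescribes.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path, language: str = "eng") -> list[str]:
    """Build an ocrmypdf invocation that adds a text layer to a scan."""
    return [
        "ocrmypdf",
        "--language", language,  # Tesseract language code, e.g. "eng" or "deu"
        "--deskew",              # straighten slightly rotated pages before OCR
        "--skip-text",           # leave pages that already have text untouched
        str(src),
        str(dst),
    ]


def run_ocr(src: Path, dst: Path, language: str = "eng") -> None:
    """Run ocrmypdf; raises if the tool is missing or exits with an error."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(build_ocr_command(src, dst, language), check=True)
```

For example, `run_ocr(Path("contract.working.pdf"), Path("contract.ocr.pdf"))` writes a searchable copy next to the working file and leaves the input untouched.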

Limits: the original scan was a photograph using fonts whose exact specification is not embedded in the file. The substitute font used for the edited text may differ slightly. Small changes look fine; larger ones look conspicuous because the new text does not match the surrounding font exactly. For mission-critical visual fidelity, Adobe Acrobat Pro's font-matching is the best in the field; for casual use, the free tools are sufficient.

Approach 2 in detail — OCR + Word round-trip

The right approach when the edit is substantial (rewriting paragraphs, adding new sections, restructuring). Three steps:

  1. OCR the scan as above.
  2. Convert the searchable PDF to Word using ScoutMyTool's PDF to Word or another converter. The result is an editable .docx where the OCRed text is laid out as paragraphs.
  3. Edit in Word, then export back to PDF using Word to PDF. The result is a clean Word-style PDF with your edits applied.

The trade-off versus Approach 1: visual fidelity degrades — the output looks like a Word document, not the original scan. For documents where the original appearance matters (signed contracts, certificates), this is the wrong path. For documents where the content matters more than the visual (internal memos, draft policy documents, anything you would re-flow in Word anyway), this is the right path. See Convert scanned PDF to Word for the detailed version of this workflow.
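OCR output usually arrives as hard-wrapped lines, so any PDF-to-Word conversion has to rebuild paragraphs before the text is comfortable to edit. The sketch below illustrates that reflow step with a simple heuristic. It is not how ScoutMyTool or any particular converter works, and the function name is hypothetical.

```python
def reflow_ocr_text(raw: str) -> list[str]:
    """Join hard-wrapped OCR lines into paragraphs.

    Blank lines separate paragraphs. A line ending in a hyphen is glued to
    the next line without a space. This is a heuristic: it also merges
    genuine hyphenated compounds that happen to break at a line end.
    """
    paragraphs: list[str] = []
    current = ""
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            # Blank line: close the paragraph in progress, if any.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-"):
            current = current[:-1] + line  # rejoin a word split across lines
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs
```

Each returned string can then become one paragraph in the Word document. Because the hyphen rule is a guess, words such as "well-known" that break at a line end will need a manual check during proofreading.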

When NOT to edit — annotate instead

Many "I need to edit this scanned PDF" requests are actually requests to mark up the document (add comments, highlight sections, note proposed changes) rather than to modify the original. For those, annotation is the right tool, not editing. ScoutMyTool's annotation flow is spread across several tools: Sign PDF for signatures and Add Watermark for repeated header text; highlighting is built into most PDF readers. The annotation sits on top of the original; the underlying scan stays pristine. Reviewers can see both the original and the proposed changes.

Use annotation when: you are reviewing rather than authoring; the recipient needs to see both the original and your additions; the original document carries authority you do not want to disturb (a signed contract being commented on, a historical record being annotated for context).

A safe edit workflow — six steps

  1. Make a working copy. Never edit the original. Naming convention: contract.original.pdf and contract.working.pdf.
  2. Decide which approach. Match the edit size to the approach matrix above: tiny fixes → Approach 1, substantial changes → Approach 2, visual fidelity required → Approach 3, review-only → Approach 4, complete rewrite → Approach 5.
  3. OCR the working copy if needed. Approaches 1 and 2 require OCR as the first step; Approach 3 edits the page image directly and is OCRed afterwards (step 6). Use PDF OCR with the correct language settings.
  4. Apply the edits. Use the matching tool. Test on one or two instances first before applying to the entire document.
  5. Verify visually. Open the edited PDF in a different reader than the editor used (Preview if you edited in Acrobat; Acrobat if you edited in a browser tool). Catches font-substitution issues and any layout shift.
  6. Re-OCR if the edit was raster-based. Approach 3 produces an image-only edited PDF. Run PDF OCR on the result so it remains searchable and accessible.
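Step 1 of the workflow above is easy to automate. The following is a small sketch using only the Python standard library and the naming convention from step 1. The function names are hypothetical, and the SHA-256 check is an added suggestion for confirming that the preserved original never changed, not something the workflow requires.

```python
import hashlib
import shutil
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 digest of a file, used to confirm it has not changed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_working_copy(source: Path) -> tuple[Path, Path]:
    """Copy contract.pdf to contract.original.pdf and contract.working.pdf.

    Returns (original, working). The source file itself is left in place.
    """
    original = source.with_name(f"{source.stem}.original.pdf")
    working = source.with_name(f"{source.stem}.working.pdf")
    shutil.copy2(source, original)   # preserved copy, never edited
    shutil.copy2(original, working)  # the only file the edit tools touch
    return original, working
```

Record `file_sha256(original)` before editing and compare it again at the end. If the two digests match, every change landed in the working copy.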

Frequently asked questions

Why can't I just open a scanned PDF in Word and edit the text?
Because a scanned PDF contains pixels, not text. Microsoft Word can open a regular text-based PDF (one exported from a word processor) and import it as an editable document. A scanned PDF is just an image of a page — Word imports the image but cannot edit the words inside it. The workaround is to OCR the scan first (which extracts the text from the pixels and adds a text layer), then Word can read the text layer and produce an editable document. Most "scanned PDF to Word" workflows are doing exactly this two-step pipeline behind the scenes.
How accurate is OCR for edit workflows?
For clean 300 DPI scans of modern printed text, 95–99 percent character accuracy per page is typical. That sounds high, but a 500-word page holds roughly 3,000 characters including spaces, so even 99 percent accuracy (one error per 100 characters) leaves about 30 character errors to correct by hand, and 95 percent leaves about 150. For mission-critical text accuracy (legal contracts, medical records), allocate roughly as much time to post-OCR proofreading as to the conversion itself. For "good enough to edit" workflows the accuracy is usually fine.
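As a rough check on how character accuracy translates into errors per page, the arithmetic can be written out. The sketch assumes about six characters per word including the trailing space, which is an assumption for illustration rather than a measured figure, and the function name is hypothetical.

```python
def expected_ocr_errors(
    words: int, char_accuracy: float, chars_per_word: float = 6.0
) -> int:
    """Estimate character errors on a page from per-character accuracy."""
    if not 0.0 <= char_accuracy <= 1.0:
        raise ValueError("char_accuracy must be between 0 and 1")
    total_chars = words * chars_per_word
    # Each character is wrong with probability (1 - accuracy).
    return round(total_chars * (1.0 - char_accuracy))


print(expected_ocr_errors(500, 0.99))  # 30
print(expected_ocr_errors(500, 0.95))  # 150
```

Errors tend to cluster in names, numbers, and punctuation, so the proofreading effort is not spread evenly across the page.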
What is "edit-in-place" and when does it work?
Edit-in-place tools (Adobe Acrobat Pro's "Edit PDF" tool; ScoutMyTool's PDF Edit Text) let you click on existing text in a PDF and type to modify it. For a born-digital PDF with a real text layer, this works well — the original font is in the file, the substitution looks seamless. For a scanned PDF, the underlying text is OCR-recognised; the original font is not embedded; the substitute font Acrobat or ScoutMyTool uses for the edited text may not perfectly match. Small fixes (a typo, a date) usually look fine; longer changes start to look conspicuous because the new text does not match the surrounding font exactly.
My scanned PDF has handwriting. Can OCR read it?
Sometimes. Modern cloud OCR engines (Google Document AI, Amazon Textract, Microsoft Azure Form Recognizer, now Azure AI Document Intelligence) handle hand-printed block letters reasonably well; cursive remains hard for everything. Open-source OCR (Tesseract) struggles on handwriting in general. If the handwriting is critical, expect to transcribe manually rather than relying on OCR. For mixed printed-and-handwritten documents (a form with a printed label plus a handwritten answer), the printed portion OCRs cleanly and the handwritten portion may need manual entry afterwards.
Are scanned PDFs uploaded when I use a free edit tool?
Depends on the tool. ScoutMyTool's PDF Edit Text, PDF OCR, and PDF to Word tools all run in your browser tab without uploading the file. Adobe Acrobat Pro runs locally on your desktop. Server-uploading services (Smallpdf, iLovePDF, OnlineCamScanner) require the file to transit their infrastructure. For confidential scanned documents (contracts, medical records, identity documents), the client-side or desktop options are the right default; the server-based ones carry a privacy cost worth being deliberate about.
How do I preserve the original look of the scanned document?
If the goal is to keep the original visual exactly while making text changes, your options are limited. Approach 1 (edit-in-place) preserves layout but may substitute fonts. Approach 3 (raster image edit) preserves visual fidelity at the cost of time and re-OCR. Approach 4 (annotate without modifying) keeps the original pristine but adds overlay annotations rather than true edits. The most-faithful "edit-in-place" outputs come from Adobe Acrobat Pro's edit tool, which can match the original font more closely than free alternatives because it has a larger font database to match against. For mission-critical visual fidelity, the paid Acrobat Pro tier still wins; for everyday workflows, the free tools are good enough.
How do I edit a scanned table or form?
Tables and forms are harder because the cell structure adds spatial constraints that OCR and re-rendering tools handle inconsistently. Two practical paths. (1) For occasional cell edits: OCR + edit-in-place. The numbers and labels become editable text; the visual grid remains as a background image. (2) For structural changes: rebuild the table in Word or Excel, drop the old image, and re-export as PDF. For fillable forms specifically, run the scan through ScoutMyTool's Fillable Form Builder which adds AcroForm fields on top of the original page image — the recipient can then type into the fields rather than editing the underlying scan.

References

  1. ISO 32000-1:2008, Document management — Portable document format — Part 1: PDF 1.7. opensource.adobe.com PDF32000_2008. The text-showing operators that make text-based PDFs editable are described in §9; image-only pages (§8.9) are the reason scanned PDFs require OCR first.
  2. Tesseract OCR project, Tesseract User Manual. tesseract-ocr.github.io (accessed May 2026). The open-source OCR engine that powers ScoutMyTool, ocrmypdf, and most free scan-to-text workflows.
  3. ISO/IEC 29500-1:2016, Office Open XML File Formats — Part 1. iso.org standard 71691 (accessed May 2026). The Word format that Approach 2 round-trips through.