PDF for lawyers — document review and bates numbering basics

Practical PDF workflow for legal document review — bates, OCR, true redaction, privilege log, bundle production.

By ScoutMyTool Editorial Team · Last updated: 2026-05-20

Introduction

Discovery in modern litigation is a PDF problem at heart. Thousands of pages arrive as a production set; each page must be uniquely identifiable, privileged content must be properly withheld, and the responsive set must be produced back with redactions where required. A small firm or solo practitioner who is not paying for Relativity or Concordance can cover this with five free PDF utilities (merge, OCR, bates numbering, redaction, and protect) applied in a specific sequence. This article walks through the document-review workflow, the bates and redaction conventions that hold up in court, and the privacy constraints that distinguish privilege-safe tools from risky ones.

The document-review workflow

Step | What | Why
1. Ingest | Receive production PDFs from opposing counsel or client | Establishes the document universe for review; metadata preservation matters
2. Bates stamp | Apply unique bates number to every page | Lets you cite specific pages in correspondence, motions, and depositions
3. OCR | Run optical character recognition on image-only pages | Without OCR, keyword search and TAR (technology-assisted review) cannot run
4. Privilege review | First-pass tag for privilege, responsiveness, key issues | Privileged docs must be logged and withheld; non-responsive culled
5. Redact | Apply true redactions to privileged or PII content | Rectangle annotations leak; true redaction removes the underlying text
6. Produce | Compile the responsive, non-privileged set as a deliverable | Format and bates ranges must match opposing counsel's production specification
7. Log | Generate privilege log listing withheld documents by bates | Privilege log required by Fed. R. Civ. P. 26(b)(5) and most state rules

Step by step — produce a bates-stamped set

  1. Merge source PDFs into one production file. Use Merge PDF to combine in the agreed production order. Verify page count matches the source.
  2. OCR the merged file so every page has searchable text. Use Make PDF Searchable with the appropriate language pack(s). Verify by searching (Cmd-F, or Ctrl-F on Windows) for a phrase you know appears in the file.
  3. Bates-stamp the file. In Bates Numbering, set prefix (e.g. SMTH), padding (6–7 digits), start number (per the production specification), position (bottom-right is conventional). Apply.
  4. Apply true redactions to privileged or PII content. Use Redact PDF — draw rectangles over content to redact, choose "Apply", verify the underlying text is gone. Save and label as the redaction-applied version.
  5. Generate the privilege log and produce. Export the privilege-tagged metadata to a spreadsheet (bates range, date, author, recipient, description, privilege type). Send the redacted production set and the privilege log to opposing counsel via the agreed transmission method.
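
The merge and stamping steps above imply a bates range for each source document, and it can help to compute those ranges before stamping so they can be checked against the production specification. The following is a minimal Python sketch, assuming you already know each source file's page count. The file names, the `assign_ranges` helper, and the `SMTH` prefix are illustrative and are not part of any ScoutMyTool API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatesRange:
    name: str
    first: str
    last: str


def assign_ranges(docs, prefix, width, start=1):
    """Assign a bates range to each (name, page_count) pair, in production order."""
    ranges = []
    number = start
    for name, pages in docs:
        if pages < 1:
            raise ValueError(f"{name} has no pages")
        first = f"{prefix}{number:0{width}d}"
        last = f"{prefix}{number + pages - 1:0{width}d}"
        ranges.append(BatesRange(name, first, last))
        number += pages  # the next document continues the sequence
    return ranges


# Hypothetical source files and page counts, for illustration only.
docs = [("contract.pdf", 12), ("emails.pdf", 340), ("invoice.pdf", 3)]
ranges = assign_ranges(docs, prefix="SMTH", width=6)
for r in ranges:
    print(r.name, r.first, r.last)
```

The last label of the final range should equal the start number plus the merged file's page count minus one, which gives a quick check that no page was dropped during the merge.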

FAQ

How do I bates-stamp a production set so the numbers match opposing counsel's format?
Match three things to the specification: prefix, padding width, and starting number. The prefix is typically the producing party's initials plus a session identifier (e.g. SMTH-2026-000001 for "Smith production session 2026, page 000001"). Padding width is usually 6 or 7 digits, zero-padded so that labels sort lexicographically in page order; move to 8 digits if the production will pass 9,999,999 pages. The starting number is whatever the production specification says: often "0000001" for a clean production, or a continuation number from a prior production session. In ScoutMyTool Bates Numbering, set the prefix, padding, and start number explicitly; the tool stamps each page in sequence and exports a bates-tagged PDF ready for the production deliverable.
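
To see why the padding width matters, here is a small Python sketch of label formatting. The `bates_label` helper is hypothetical (it does not describe how any particular stamping tool works); it refuses numbers that overflow the agreed width and shows that only zero-padded labels sort in page order.

```python
def bates_label(prefix: str, number: int, width: int) -> str:
    """Return prefix + zero-padded number, refusing numbers that overflow the width."""
    if number < 0:
        raise ValueError("bates numbers cannot be negative")
    digits = str(number).zfill(width)
    if len(digits) > width:
        raise ValueError(f"{number} does not fit in {width} digits")
    return prefix + digits


numbers = [1, 2, 10, 100]
padded = [bates_label("SMTH-2026-", n, 6) for n in numbers]
unpadded = [f"SMTH-2026-{n}" for n in numbers]

print(padded[0])                     # SMTH-2026-000001
print(sorted(padded) == padded)      # True: padded labels sort in page order
print(sorted(unpadded) == unpadded)  # False: "10" sorts before "2" as text
```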
Why is "true redaction" different from drawing a black box?
A black rectangle drawn over text with a PDF annotation tool is a separate object layered above the page contents; the underlying text is still present in the PDF's content stream. Anyone with Acrobat Pro (or any PDF parser) can remove the annotation or extract the text underneath. True redaction is destructive: it removes the underlying text and permanently replaces it with the redaction colour. ScoutMyTool Redact PDF performs true redaction; after Apply, the redacted text cannot be recovered. Overlay-only "redactions" have failed publicly in court filings, for example in US v. Manafort, where text under the black boxes in a January 2019 defence filing could be copied out, and reportedly in Burgess v. Sansom (2021). Always verify by trying to select and copy the redacted region: the paste should contain nothing, never the original characters.
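
The copy-paste check can be supplemented with a scripted one. The sketch below assumes you have already extracted the text layer of the redacted PDF with a tool of your choice (the extraction step is not shown) and simply looks for terms that should no longer be there. The `find_leaks` helper and the sample strings are hypothetical.

```python
import re


def find_leaks(extracted_text, redacted_terms):
    """Return the redacted terms that still appear in the extracted text layer."""
    def normalise(s):
        # Collapse whitespace and ignore case so line breaks do not hide a match.
        return re.sub(r"\s+", " ", s).casefold()

    haystack = normalise(extracted_text)
    return [term for term in redacted_terms if normalise(term) in haystack]


# Made-up page text in which one supposedly redacted phrase is still present.
page_text = "Memo from J. Smith re: settlement\nauthority of $250,000"
print(find_leaks(page_text, ["settlement authority", "Jane Doe"]))
# ['settlement authority']
```

An empty list is necessary but not sufficient: it only covers the terms you thought to list, so the manual select-and-copy check is still worth doing.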
How do I create a privilege log from a tagged document set?
Export the privilege-tagged documents to a spreadsheet with one row per withheld document. FRCP 26(b)(5)(A) requires a description sufficient for the other parties to assess the privilege claim without revealing the privileged information itself; the columns conventionally used to meet that standard are bates range, document date, author, recipients, subject/description, privilege type (attorney-client, work product, joint defence), and grounds for withholding. Most review platforms (Relativity, Concordance, Everlaw) export this directly. For solo and small-firm work without a review platform, build the log in Excel keyed to the bates numbers of the withheld set. Cross-check that every bates number logged as withheld is genuinely absent from the production set; a common error is logging bates numbers that were actually produced.
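
For a log kept outside a review platform, the export and the cross-check can both be scripted. The following Python sketch uses only the standard library; the column names, the helper functions, and the sample row are assumptions made for illustration, not a required log format.

```python
import csv
import io
import re

FIELDS = ["bates_begin", "bates_end", "date", "author",
          "recipients", "description", "privilege_type", "basis"]


def split_label(label):
    """Split 'SMTH000013' into ('SMTH', 13, 6): prefix, number, digit width."""
    match = re.fullmatch(r"(.*?)(\d+)", label)
    if match is None:
        raise ValueError(f"no trailing number in {label!r}")
    return match.group(1), int(match.group(2)), len(match.group(2))


def expand_range(begin, end):
    """List every bates label from begin to end inclusive."""
    prefix, first, width = split_label(begin)
    end_prefix, last, _ = split_label(end)
    if prefix != end_prefix or last < first:
        raise ValueError(f"bad range {begin}..{end}")
    return [f"{prefix}{n:0{width}d}" for n in range(first, last + 1)]


def logged_but_produced(log_rows, produced_labels):
    """Return labels that the log says are withheld but that were produced."""
    produced = set(produced_labels)
    return [label
            for row in log_rows
            for label in expand_range(row["bates_begin"], row["bates_end"])
            if label in produced]


# Made-up log entry, for illustration only.
log_rows = [{
    "bates_begin": "SMTH000013", "bates_end": "SMTH000015",
    "date": "2025-11-03", "author": "A. Counsel", "recipients": "B. Client",
    "description": "Email giving legal advice on lease terms",
    "privilege_type": "attorney-client", "basis": "Confidential legal advice",
}]

buffer = io.StringIO()  # swap for open("privilege_log.csv", "w", newline="")
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(log_rows)
print(buffer.getvalue())

produced = ["SMTH000012", "SMTH000015", "SMTH000016"]
print(logged_but_produced(log_rows, produced))  # ['SMTH000015']
```

Any label the check returns needs resolving before the log goes out: either the page was produced by mistake or the log entry is wrong.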
What is the difference between PDF redaction and just deleting the page?
Redaction preserves the page in the production set with the privileged content removed, so opposing counsel sees the page with redactions in place and knows that content was withheld. Deleting the page removes it from the production entirely, which leaves a bates-number gap that opposing counsel will notice. The right approach depends on the rule: most US federal courts expect redactions over deletions because the page-existence signal is valuable. State practice varies, for example between California and Texas, so check the applicable rules before deleting pages. In practice, redact when you can safely show the page context (date, header, addresses); delete only when the entire page is privileged with no non-privileged context worth retaining.
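
Because deleted pages show up as gaps in the bates sequence, it is worth listing those gaps yourself before opposing counsel does. Here is a minimal Python sketch, assuming the numeric part of each produced bates label has already been parsed into an integer; `find_gaps` is a hypothetical helper.

```python
def find_gaps(produced_numbers):
    """Return (first_missing, last_missing) ranges inside the produced sequence."""
    ordered = sorted(set(produced_numbers))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous + 1, current - 1))
    return gaps


produced = [1, 2, 3, 7, 8, 10]
print(find_gaps(produced))  # [(4, 6), (9, 9)]
```

Each gap should correspond to an entry on the privilege log; a gap with no log entry is a page that went missing without explanation.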
Can I use a free browser-based PDF tool for privileged client documents?
Only if the tool runs client-side, or the vendor is bound by an agreement that covers privileged material. Sending a privileged document to an outside vendor's server without such an agreement makes the privilege argument harder to defend. ScoutMyTool's redaction, OCR, bates, merge, and protect tools run entirely in your browser tab; the file is never uploaded. Smallpdf, iLovePDF, Adobe online, and other server-side tools should not be used for privileged content unless you have a vendor agreement (NDA and data-processing terms) that explicitly addresses attorney-client privilege. For non-privileged production-set work, either approach is defensible, but the safe default is client-side.
How do I generate a hearing bundle (a tabbed, paginated bundle for court)?
Five-step workflow. First, organise tabs in folders (01_Tab_A, 02_Tab_B, ...) with one PDF per tab. Second, create a cover page and tab divider for each section. Third, merge in order with ScoutMyTool Merge PDF. Fourth, apply continuous page numbering across the merged file (every page gets a unique number; tab dividers count). Fifth, add a hyperlinked table of contents at the front linking to each tab. For UK courts, the hearing bundle convention is more prescriptive (specific page formats, paragraph numbering); follow the local rule. For US courts, the standard is "neat, searchable, paginated, and indexed" — exact formatting varies by judge.
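
The continuous pagination in the fourth step determines the page numbers that the table of contents must point to. The sketch below computes them in Python under stated assumptions: two front-matter pages (cover and contents), one divider page per tab, and each contents entry pointing at its tab's divider. The tab titles, page counts, and the `build_index` helper are illustrative.

```python
def build_index(tabs, front_pages=2):
    """Return ([(title, start_page), ...], total_pages) for a merged bundle.

    tabs is an ordered list of (title, page_count); each tab is preceded by
    one divider page, and the index entry points at that divider.
    """
    index = []
    page = front_pages + 1  # first page after the cover and contents
    for title, page_count in tabs:
        index.append((title, page))
        page += 1 + page_count  # divider plus the tab's own pages
    return index, page - 1


tabs = [("Tab A: Pleadings", 24),
        ("Tab B: Witness statements", 61),
        ("Tab C: Exhibits", 130)]
index, total = build_index(tabs)
for title, start in index:
    print(f"{title} ... {start}")
print("Total pages:", total)  # Total pages: 220
```

Comparing the computed total with the merged file's actual page count is a cheap way to catch a missing divider or a tab merged twice.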
How do I OCR a 1000-page production set efficiently?
Browser-based OCR can stall on very large single files because of memory pressure, so chunk the job. ScoutMyTool Make PDF Searchable handles up to ~200 pages comfortably in a single pass on a modern laptop; for a 1000-page set, split into chunks of 50–100 pages, OCR each chunk separately, then re-merge in the original order. After OCR, spot-check three pages: search (Cmd-F, or Ctrl-F on Windows) for a word you can see on the page, and the search should jump there. For accuracy-critical productions (high-stakes litigation), run OCR twice with different settings and compare the outputs; the diff highlights pages where the OCR was uncertain.
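
Both the chunking and the two-pass comparison can be planned with a few lines of Python. This is a sketch under assumptions: page ranges are 1-based and inclusive, the per-page text of each OCR pass has already been extracted into a list of strings, and the 0.95 similarity threshold is an arbitrary starting point rather than a recommended value.

```python
from difflib import SequenceMatcher


def chunk_ranges(total_pages, chunk_size=100):
    """Return 1-based inclusive (first, last) page ranges covering the file."""
    return [(start, min(start + chunk_size - 1, total_pages))
            for start in range(1, total_pages + 1, chunk_size)]


def uncertain_pages(pass_a, pass_b, threshold=0.95):
    """Return page numbers where two OCR passes disagree more than the threshold allows."""
    if len(pass_a) != len(pass_b):
        raise ValueError("both passes must cover the same pages")
    flagged = []
    for page_no, (a, b) in enumerate(zip(pass_a, pass_b), start=1):
        if SequenceMatcher(None, a, b).ratio() < threshold:
            flagged.append(page_no)
    return flagged


print(chunk_ranges(250))  # [(1, 100), (101, 200), (201, 250)]

# Made-up OCR output: the second pass misreads digits on page 2.
pass_a = ["Exhibit A: lease agreement", "Total due 1,500"]
pass_b = ["Exhibit A: lease agreement", "Tota1 due l,5OO"]
print(uncertain_pages(pass_a, pass_b))  # [2]
```

Flagged pages are candidates for a manual read or a rescan at higher resolution, not proof of an error.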

Citations

  1. Federal Rules of Civil Procedure, Rule 26(b)(5)(A) — privilege log requirements in US federal litigation.
  2. Federal Rules of Civil Procedure, Rule 34 — production of documents and electronically stored information.
  3. The Sedona Conference — "Cooperation Proclamation" and best-practice guidelines on e-discovery.
  4. ISO 32000-1:2008 — "Document management — Portable document format" — base PDF specification.
  5. NIST Special Publication 800-88 — "Guidelines for Media Sanitization" — applicable principles for digital redaction.

A privilege-safe PDF toolkit

Every ScoutMyTool PDF tool runs entirely in your browser tab. Privileged client documents are never transmitted to a third-party server.
