GDPR and PDFs — personal data handling guide

How GDPR applies to PDFs — lawful basis, minimisation, retention, redaction, access requests.

By ScoutMyTool Editorial Team · Last updated: 2026-05-20

In most GDPR compliance work I have seen, PDF archives end up as the forgotten corner of the data inventory. Personal data sits in old contracts, employment records, customer correspondence, and signed agreements, none of it inside a structured database that compliance tooling can easily survey. This article maps GDPR's key principles to their PDF-specific implications, the practical controls that satisfy each, and the workflows for handling subject-access and erasure requests.

This article is general information, not legal advice. Consult a qualified data protection professional for your specific situation.

GDPR principles → PDF implications

| GDPR principle | PDF implication | Practical control |
| --- | --- | --- |
| Lawful basis (Art. 6) | Need a lawful basis to hold a PDF containing EU personal data | Document why each PDF type is kept (contract, legitimate interest, consent) |
| Data minimisation (Art. 5(1)(c)) | Keep only the personal data necessary for the purpose | Redact unnecessary PII before storage; archive minimised versions |
| Accuracy (Art. 5(1)(d)) | Personal data in PDFs must be accurate and up-to-date | Process for rectification when an individual notifies an error |
| Storage limitation (Art. 5(1)(e)) | Retain only as long as necessary for the purpose | Per-document-type retention schedule; automated deletion at end of period |
| Integrity (Art. 5(1)(f)) | Protect against unauthorised access and accidental loss | Password-protect or encrypt; access-controlled storage |
| Right of access (Art. 15) | Individuals can request copies of their personal data | Searchable, indexed archive so per-individual extraction is feasible |
| Right to erasure (Art. 17) | Individuals can request deletion (with exceptions) | Locate-and-delete process per individual; preserve audit log of deletions |

Step by step — handle a subject-access request

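  1. Verify the requester's identity before searching. GDPR Art. 12(6) allows you to ask for additional information where you have reasonable doubts about identity; keep the check proportionate, for example by confirming via the contact channel on file.
  2. Search the PDF archive by name, email, and known identifiers. Run the search on a full-text index that covers OCR'd scans, not just born-digital files.
  3. Compile the relevant PDFs. Filter out false positives (other people with the same name).
  4. Redact other individuals' personal data from documents that contain mixed PII. Use destructive redaction, which removes the underlying text rather than covering it (ScoutMyTool, Acrobat).
  5. Deliver without undue delay and at the latest within one month of receipt (Art. 12(3)), via the channel the requester used or specified. The deadline can be extended by two further months for complex or numerous requests, provided you tell the requester within the first month. Document the response in a request log.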
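
Steps 2, 3, and 5 can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the text of each PDF has already been extracted (including OCR) into a dictionary keyed by file path, and the names `Hit`, `find_candidates`, and `response_due` are hypothetical. Files where only the name matched are flagged for manual review, since they may belong to a namesake. The due-date helper applies the plain "same day next month" rule and ignores weekends, public holidays, and extensions.

```python
import calendar
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    path: str
    matched: list[str]   # identifiers found in this file
    needs_review: bool   # True when only the name matched (possible namesake)


def find_candidates(texts: dict[str, str], name: str, strong_ids: list[str]) -> list[Hit]:
    """texts maps file path -> extracted text; strong_ids are emails, customer numbers, etc."""
    # Tolerate line breaks or extra spaces between name parts, common in OCR output.
    name_re = re.compile(
        r"\b" + r"\s+".join(re.escape(part) for part in name.split()) + r"\b",
        re.IGNORECASE,
    )
    hits = []
    for path, text in texts.items():
        lowered = text.lower()
        matched = [i for i in strong_ids if i.lower() in lowered]
        name_found = bool(name_re.search(text))
        if name_found:
            matched.insert(0, name)
        if matched:
            only_name = name_found and len(matched) == 1
            hits.append(Hit(path, matched, needs_review=only_name))
    return hits


def response_due(received: date) -> date:
    """One calendar month after receipt, clamped to the last day of short months."""
    year = received.year + received.month // 12
    month = received.month % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(received.day, last_day))


print(response_due(date(2026, 1, 31)))  # 2026-02-28
```

Treating a name-only match as "needs review" rather than as a confirmed hit keeps the false-positive filter in step 3 a human decision.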

Retention schedules per PDF type

GDPR's storage limitation principle requires a defensible retention period for each document type, tied to its lawful basis. Typical durations, which vary by jurisdiction: employment contracts, duration of employment plus 6 years (aligned with the statute of limitations); customer invoices, 6–10 years (tax-law requirement); marketing consent records, until consent is withdrawn plus 1 year; CCTV-derived PDFs, 30 days (legitimate interest weighed against privacy). Document the schedule in a written retention policy, train the team on it, and automate deletion through DMS retention rules where possible. Manual deletion is workable for small archives; automation matters at scale.
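
A retention schedule like the one above can be expressed as data, which makes the deletion date for any document a simple calculation. The sketch below is illustrative only: `RetentionRule`, `SCHEDULE`, `delete_after`, and `is_due` are hypothetical names, the document-type keys are invented, and the invoice rule arbitrarily takes the upper end (10 years) of the typical range. Real periods should come from your written retention policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    trigger: str    # the event that starts the retention clock
    years: int = 0
    days: int = 0


# Assumed values mirroring the typical durations above; adjust to your own policy.
SCHEDULE = {
    "employment_contract": RetentionRule("employment ended", years=6),
    "customer_invoice": RetentionRule("invoice issued", years=10),
    "marketing_consent": RetentionRule("consent withdrawn", years=1),
    "cctv_export": RetentionRule("recording made", days=30),
}


def delete_after(doc_type: str, trigger_date: date) -> date:
    """Last day the document may be kept, counted from its trigger event."""
    rule = SCHEDULE[doc_type]  # KeyError means the type has no documented period yet
    target_year = trigger_date.year + rule.years
    try:
        shifted = trigger_date.replace(year=target_year)
    except ValueError:
        # 29 February shifted into a non-leap year: fall back to 28 February.
        shifted = trigger_date.replace(year=target_year, day=28)
    return shifted + timedelta(days=rule.days)


def is_due(doc_type: str, trigger_date: date, today: date) -> bool:
    """True once the retention period has fully elapsed."""
    return today > delete_after(doc_type, trigger_date)


print(delete_after("employment_contract", date(2020, 3, 31)))  # 2026-03-31
```

Letting an unknown document type raise an error, instead of defaulting to some period, is deliberate in this sketch: a PDF type with no entry in the schedule needs a policy decision, not silent retention.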

Note that "retain forever" is rarely a defensible policy under GDPR. Once data is no longer necessary for its original purpose, either delete it or anonymise it. Anonymisation (removing all directly and indirectly identifying information) is the GDPR escape hatch: anonymised data is no longer personal data and falls outside GDPR scope. True anonymisation of a PDF is harder than redaction, because you also have to remove patterns that could re-identify someone; for most retention purposes, deletion is simpler.

Documentation that supports compliance

Four documents make a GDPR compliance posture for PDFs defensible:

  1. A record of processing activities (Art. 30) listing each PDF type held, its lawful basis, retention period, and security measure.
  2. A data protection impact assessment for any high-risk processing (Art. 35). This is typically not required for routine PDF workflows but is worth doing for sensitive ones (HR personnel files, health records).
  3. An information notice (Art. 13/14) shared with data subjects when their data is collected, including what is stored in PDFs and for how long.
  4. An incident response plan covering PDF-data breaches: what to do if a confidential PDF is exposed, whom to notify, and on what timeline.

These four documents fit on roughly ten pages in total for a typical SMB; they are written once and updated annually. They are among the first things a regulator asks for in an audit, and having them ready shows that the rest of the compliance posture is taken seriously.
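
The record of processing activities is the easiest of the four to keep in a structured form. The sketch below models only the four fields this article names for each PDF type; Art. 30 requires more than that (purposes, categories of data subjects, recipients, and so on), so treat it as a starting point. `ProcessingRecord` and `incomplete_fields` are hypothetical names, and the sample values are invented.

```python
from dataclasses import dataclass, fields


@dataclass
class ProcessingRecord:
    pdf_type: str
    lawful_basis: str
    retention_period: str
    security_measure: str


def incomplete_fields(record: ProcessingRecord) -> list[str]:
    """Names of fields left blank, so gaps surface before an audit does."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


register = [
    ProcessingRecord("Employment contract", "contract", "employment + 6 years", "encrypted DMS"),
    ProcessingRecord("Customer correspondence", "legitimate interest", "", "access-controlled share"),
]

for record in register:
    gaps = incomplete_fields(record)
    if gaps:
        print(f"{record.pdf_type}: missing {', '.join(gaps)}")
```

Run against the sample register, the loop reports the missing retention period for the customer correspondence entry.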

FAQ

Does GDPR apply to PDFs my business already had before May 2018?
Yes — GDPR applies to all processing of EU personal data regardless of when the data was first collected. Historical PDFs (employee records, customer contracts, archived correspondence) containing EU personal data are subject to the same principles as new ones. The practical implication is that legacy archives need an audit: identify which PDFs contain EU personal data, document the lawful basis for each, set a retention schedule, and have a process for handling subject-access requests against the archive. For most businesses this is a one-time audit followed by an annual refresh.
A customer requested a copy of all their personal data — how do I find it across our PDF archive?
Use a two-step process. First, search by name across the entire archive using a full-text search tool (Recoll, DocFetcher, or your DMS's search). Make sure every PDF has been OCR'd so the search finds text inside scanned documents; un-OCR'd scans are invisible to text search, and any personal data hidden in them will be missed. Second, manually review the search results to confirm that the PDFs actually contain personal data about the requesting individual rather than a namesake. Then compile the relevant PDFs, redact other individuals' personal data from any documents that contain mixed PII, and deliver within one month of receipt (GDPR Art. 12(3)).
Can I rely on password-protecting PDFs as my GDPR security measure?
Partially. GDPR Art. 32 requires "appropriate technical and organisational measures" — password protection is one such measure but rarely sufficient on its own. For PDFs containing routine personal data (employee contracts, customer invoices) AES-256 password protection plus controlled storage is appropriate. For special-category data (health, biometric, political opinion — Art. 9), password protection is the floor not the ceiling; consider also access control, audit logging, and (for high-volume processing) encryption at rest of the underlying storage. Document your risk assessment so the choice of measures is defensible.
Do I need to redact personal data in old PDFs to comply with data minimisation?
Yes, when the original purpose has ended and the data is now excessive. Example: an employee's original employment contract contained their bank account number for payroll setup. Five years after they left, the bank account number is no longer necessary, so redact it. The contract itself stays in the archive, but only in redacted form: run the unredacted version through a destructive redaction step and keep the redacted copy for the rest of the retention period. The redacted archive version satisfies retention for legal and tax purposes without exceeding data minimisation. ScoutMyTool Redact PDF runs client-side and produces true destructive redactions suitable for compliance.
My PDFs are stored on Dropbox / Google Drive — is that GDPR-compliant?
Depends on the data-processing agreement and the data location. Both Dropbox Business and Google Workspace offer GDPR-compliant data-processing agreements (DPAs) and EU-region storage options — but only on certain paid tiers, and you must explicitly opt into EU storage. Free / consumer tiers may store data in the US and have weaker DPA terms. Audit your account: is it Business tier with the DPA in place, with EU-region storage enabled? If yes, processing through these vendors is generally GDPR-compliant. If no, either upgrade the account, switch to an EU-based provider, or move sensitive PDFs to on-premises storage.

Citations

  1. EU General Data Protection Regulation (Regulation (EU) 2016/679).
  2. European Data Protection Board — guidelines on the rights of access and erasure.
  3. ICO (UK) — guidance on data minimisation and retention.
  4. CNIL (France) — guidance on document retention and erasure obligations.
  5. ISO 32000-1:2008 — PDF specification including encryption and redaction mechanics.
