8 min read

Free PDF to Excel converter — extract tables to spreadsheet

By ScoutMyTool Editorial Team · Last updated: 2026-05-17

I spent half a Sunday afternoon re-typing rows from a vendor pricing PDF into Excel before realising I was on the wrong side of a fifty-year-old computing problem. The data was right there, perfectly aligned on the page; I just needed a tool that could turn the visual layout back into a spreadsheet. A few attempts at free online converters later — one paywalled me, one mangled the columns, one wanted my email three times — I had the workflow I now use for every vendor list, financial statement, and exported report that lands in my inbox as a PDF. Below is what survives the round-trip, what reliably needs cleanup, and how to set expectations before you even click Convert.

Step-by-step: convert a PDF table to Excel

ScoutMyTool's converter runs server-side (high-quality table reconstruction needs heavy detection libraries that aren't browser-friendly). Files are uploaded over HTTPS, converted, returned to you, and deleted immediately — nothing is retained.

Open the tool. Go to scoutmytool.com/pdf/pdf-to-excel. Static HTML, no account screen — the upload zone is interactive within about a second on broadband.
Pick the output format. Choose XLSX (recommended — modern Excel, smaller files, more rows per sheet) or XLS (legacy Excel 97-2003, only if a downstream system specifically requires it). XLSX is the public Office Open XML standard.¹
Add your PDF. Drag-and-drop or click to pick. One PDF per pass, up to 50 MB. For larger files, run them through split-pdf first and convert the parts; this is also faster, since conversion time scales with page count.
Click "Convert". Two paths split based on what's inside the PDF:
- Native PDF: text and vector content. The underlying text stream is extracted, glyph positions are clustered into rows and column candidates, and the column-detector resolves the layout into a real spreadsheet. Typical time: 5-15 seconds for a few-page financial statement.
- Scanned PDF: page images, no extractable text. Each page is OCR'd first to extract text, then fed into the same column-detection pipeline. Per-page OCR adds roughly a second; very dense pages can take longer.
Download and open. The output filename uses your source name with the chosen extension (e.g. vendor-prices.xlsx). It opens in Excel, Google Sheets, Numbers, and LibreOffice Calc — every spreadsheet program that speaks Office Open XML, which is essentially all of them since 2007.
Skim before you trust. Look at three things in order: do the column headers line up correctly with the data below; are any numeric columns accidentally left-aligned (a hint that they were parsed as text, not numbers); do any rows wrap into the next row by mistake. These three checks take about a minute and catch most of the common issues.
Clean up if needed. Common five-minute fixes: re-format left-aligned numeric columns to General or Number; merge or unmerge a header row; delete a stray empty row that came from a PDF page break. Realistic cleanup ranges: clean tabular PDFs (financial statements, exported reports) usually need zero touch; complex multi-table PDFs with merged headers need 2-10 minutes.
If the PDF is password-protected. The converter cannot read encrypted PDF streams and will return a clear error instead of corrupted Excel. Unlock the PDF first via unlock-pdf — or open it in any viewer and re-save without a password — then convert the unlocked copy.
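The clustering step mentioned for native PDFs can be illustrated with a small sketch. This is not ScoutMyTool's actual detector, only a minimal gap-based heuristic under stated assumptions: each extracted fragment is a hypothetical (left x, top y, text) tuple, and the `row_gap` and `col_gap` thresholds are made-up values in page units.

```python
from typing import List, Tuple

# (left x, top y, text) for one extracted cell fragment. This tuple shape is
# hypothetical; real PDF text extractors expose positions in their own formats.
Word = Tuple[float, float, str]


def cluster_positions(values: List[float], max_gap: float) -> List[List[float]]:
    """Group 1-D positions, starting a new group when a gap exceeds max_gap."""
    groups: List[List[float]] = []
    for value in sorted(values):
        if groups and value - groups[-1][-1] <= max_gap:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


def group_index(groups: List[List[float]], value: float) -> int:
    """Return the index of the group whose span contains value."""
    for i, group in enumerate(groups):
        if group[0] <= value <= group[-1]:
            return i
    raise ValueError(f"{value} does not fall in any group")


def words_to_grid(words: List[Word], row_gap: float = 3.0,
                  col_gap: float = 20.0) -> List[List[str]]:
    """Assign each word to a (row, column) cell by clustering y and x."""
    rows = cluster_positions([y for _, y, _ in words], row_gap)
    cols = cluster_positions([x for x, _, _ in words], col_gap)
    grid = [["" for _ in cols] for _ in rows]
    for x, y, text in words:
        r, c = group_index(rows, y), group_index(cols, x)
        # Fragments landing in the same cell are joined with a space.
        grid[r][c] = f"{grid[r][c]} {text}".strip()
    return grid


# Illustrative data: a three-column table with slightly jittered x positions.
SAMPLE_WORDS: List[Word] = [
    (50, 100, "Item"), (200, 100, "Qty"), (300, 100, "Price"),
    (50, 115, "Widget"), (202, 115, "4"), (301, 115, "9.50"),
    (50, 130, "Gasket"), (201, 130, "12"), (300, 130, "1.25"),
]

if __name__ == "__main__":
    for row in words_to_grid(SAMPLE_WORDS):
        print(row)
```

With these sample values the default thresholds separate the three columns. Raising `col_gap` above the width of the gutters collapses every row into a single cell, which is the same failure you see when a source PDF has very tight spacing between columns.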

What survives the conversion (and what doesn't)

PDF tables are not "tables" the way Excel knows them — they're glyphs at coordinates that happen to look table-shaped to a human. Reconstructing the spreadsheet involves informed guesses. Here is what you can realistically expect:

Cell values: survive faithfully on native PDFs; OCR'd values are 98-99% accurate on clean scans, lower on faxes or low-DPI photos.
Column boundaries: detected from the spacing pattern between values. Reliable when the source used clear gutters; weaker when columns are tight or when text wraps inside cells.
Row boundaries: detected from vertical position. Very reliable for clean tables; can stumble when rows have inconsistent heights or when one logical row wraps across multiple visual lines.
Numeric formatting: currency symbols, thousands separators, and decimals come through as text. You'll usually want to re-format columns to General or Number in Excel so you can run calculations on them.
Dates: come through as text strings. Run Data → Text to Columns → Date in Excel to convert to real dates if you need to sort or compute on them.
Formulas: never preserved — they aren't in the PDF to begin with. PDFs store only the displayed values.
Merged cells & nested headers: the hardest case. Often the simplest approach is to convert the data and re-build the merged-header layout in Excel by hand.
Charts & sparklines: exported from Excel into a PDF as drawn graphics (vector shapes or raster images), not as chart objects. They cannot be reversed back into editable charts; only the underlying numeric data (if visible in a nearby table) can be reconstructed.
Multi-table PDFs: currently concatenated into a single sheet. Split the PDF first if you need tables on separate sheets.
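If you would rather script the numeric and date cleanup than do it by hand in Excel, the idea can be sketched in Python. This is a minimal illustration, not part of the tool: it assumes US-style numbers (comma as thousands separator, dot as decimal point), treats parentheses as a negative amount, and tries only the few date layouts listed in the hypothetical `DATE_FORMATS` tuple.

```python
from datetime import date, datetime
from decimal import Decimal, InvalidOperation
from typing import Optional

# Assumed date layouts; extend this tuple to match your own documents.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_amount(text: str) -> Optional[Decimal]:
    """Turn '$1,234.50' or '(250.00)' into a Decimal; None if not numeric."""
    cleaned = text.strip()
    # Accounting style: a value wrapped in parentheses is negative.
    negative = cleaned.startswith("(") and cleaned.endswith(")")
    if negative:
        cleaned = cleaned[1:-1]
    # Drop currency symbols and thousands separators (US-style assumption).
    for ch in "$€£,":
        cleaned = cleaned.replace(ch, "")
    try:
        value = Decimal(cleaned.strip())
    except InvalidOperation:
        return None
    if not value.is_finite():
        return None  # reject strings such as 'NaN' or 'Infinity'
    return -value if negative else value


def parse_date(text: str) -> Optional[date]:
    """Try each assumed layout in turn; None if none of them match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    print(parse_amount("$1,234.50"), parse_amount("(250.00)"), parse_amount("n/a"))
    print(parse_date("2026-05-17"), parse_date("05/17/2026"), parse_date("soon"))
```

Returning `None` for anything that fails to parse keeps label cells and stray text untouched, so you can review them separately instead of silently coercing them.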

How ScoutMyTool compares to Smallpdf, iLovePDF and PDF2Go

All four tools are server-side conversions. The meaningful differences are quota, size cap, OCR availability on the free tier, and how long files are retained after download.

Feature	ScoutMyTool	Smallpdf	iLovePDF	PDF2Go
Free for unlimited conversions	Yes	2 per day, then paywall	1 file per task on free tier	Yes, up to 100 MB
No signup required	Yes	Required after 2 tasks	Required for files >50 MB	Yes
Per-file size limit	50 MB	5 GB Pro / 100 MB free	200 MB free	100 MB free
Scanned-PDF OCR on free tier	Yes (English)	Pro tier only	Pro tier only	Yes
File retention after download	Deleted immediately	1 hour (then deleted)	2 hours (then deleted)	24 hours (then deleted)

Third-party tool quotas are taken from each vendor's public pricing page as of May 2026 and may change.

Related PDF data-extraction tools

PDF to CSV — plain-text CSV for scripts, data pipelines, or any downstream tool that prefers text to XLSX.
PDF to Word — for documents where you need the prose, not the tables.
PDF to Text — strip formatting entirely and extract just the raw text.
Excel to PDF — the round-trip back to PDF when you're done editing.
Split PDF — required first step for PDFs over 50 MB, or to isolate a single table page.
Unlock PDF — required first step for password-protected PDFs.

Frequently asked questions

My table came out misaligned or merged into one column — why?: Detecting column boundaries from a PDF is fundamentally a heuristic problem. The original PDF only contains glyph positions, not "this is a table" semantics. When the source used merged cells, single-space gutters between columns, or wrapped text inside cells, the column-detector has to guess. The tool gets you into the right ballpark — usually a five-minute hand-tidy in Excel finishes the job, which is still vastly faster than re-typing.
Are formulas preserved?: No, and this is a hard limit of the PDF format rather than a tool limitation. PDFs only store the displayed values that were rendered when the spreadsheet was exported — the underlying =SUM() or =VLOOKUP() expressions are not part of the file. The Excel output is a flat data table; you re-apply formulas in Excel if you need them.
Does it work on scanned PDFs (page images, not text)?: Yes. When the PDF has no extractable text, the tool runs OCR (optical character recognition) on each page image first, then feeds the recognised text into the same table-detection pipeline. English is supported. The summary panel says "Scanned PDF (text recognised from images)" so you know which path was used and can expect slightly looser column alignment than a native PDF.
Can I convert just one page out of a multi-page PDF?: Not directly — the entire PDF is converted in one pass. The standard workaround: use our /pdf/split-pdf tool first to extract just the page (or page range) containing the table you want, then convert that smaller PDF. This is also faster, since conversion time scales with page count.
What's the difference between XLSX and XLS?: XLSX is the modern Office Open XML format introduced with Excel 2007 — smaller files, more rows per sheet (1,048,576 vs 65,536), more features, and the de facto standard for current Excel, Google Sheets, Apple Numbers, and LibreOffice Calc. XLS is the legacy 1997-2003 binary format, useful only when your downstream system is too old to read XLSX. Pick XLSX unless you specifically know otherwise.
What is the file size limit?: 50 MB per file. For larger PDFs, split them with our /pdf/split-pdf tool and convert each part separately. Per-table conversion is also a good idea regardless of file size — large PDFs with many unrelated tables produce a single concatenated sheet that is awkward to navigate.
If I just need raw text rows, do I need XLSX?: No — for plain text data that you want to pipe into another tool, scripts, or a database, use /pdf/pdf-to-csv instead. CSV is one human-readable text file per table, no Excel install needed, and trivially scriptable in Python, R, or shell.
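As a rough illustration of the scripting point in the last answer, the sketch below reads a CSV with Python's standard library and totals a column. The column names (`item`, `qty`, `unit_price`) and the sample rows are invented for the example; a real export will have whatever headers your PDF table had.

```python
import csv
import io
from decimal import Decimal

# Invented sample standing in for a CSV produced from a pricing table.
SAMPLE_CSV = """item,qty,unit_price
Widget,4,9.50
Gasket,12,1.25
"""


def order_total(csv_text: str) -> Decimal:
    """Sum qty * unit_price over every data row of the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(
        (Decimal(row["qty"]) * Decimal(row["unit_price"]) for row in reader),
        Decimal("0"),
    )


if __name__ == "__main__":
    print(order_total(SAMPLE_CSV))
```

For a file on disk, pass the result of `open(path, newline="")` to `csv.DictReader` instead of the `io.StringIO` wrapper.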

Ready to extract?

No signup. For this server-side conversion, your file is removed immediately after you download the spreadsheet.

Open the free PDF to Excel converter →