How to Extract Data from Multiple PDF Files to Excel (3 Methods)

Dataset Overview

We are going to use the below PDF file dataset to demonstrate the extraction of data from PDF to Excel sheets.


Method 1 – Using the Power Query Tool to Extract Data from Multiple PDF Files to Excel

To open three PDF files and load all the tables from each file into an Excel worksheet, follow these steps:

  • Go to the Data tab and click on Get Data.

Extract Data from Multiple PDF Files Using Power Query

  • Select From File and then choose From Folder.

Extract Data from Multiple PDF Files Using Power Query

  • Browse to the folder where your PDF files are saved and click Open.

  • In the Power Query window, you’ll see the loaded PDF files listed in the Name column.

Extract Data from Multiple PDF Files Using Power Query

  • Click on the “Combine” option and choose either Combine & Transform Data or Combine & Load.

  • A new query window will open, allowing you to preview and select tables from each file.
  • In that window, you can cycle through files by the Sample File window.

  • You can choose a table of a particular file for the preview.
  • Click OK.

  • All of the tables are now loaded into a new query window. The left-most column of the table denotes the source of the data, whether it is from Dataset_1 or Dataset_2, or Dataset_3.

  • Once all tables are loaded, go to the Home tab and click Close & Load.

  • In the Import Data window, select Existing Worksheet and specify the data location (e.g., $B$5:$F$29).
  • Click OK after this.

  • The data table will now be loaded into the specified location as an Excel table.
  • Convert the table back to a range by going to Table Design and selecting Convert to Range from the Tools group.

  • The loaded data is now converted into range.

  • Delete the source column (e.g., cells B4:B35) to complete the extraction.

The Excel worksheet now has extracted data from multiple PDF files.

Read More: How to Extract Data from PDF to Excel


Method 2 – Extracting Data From Multiple PDF Files to Excel Using Microsoft Word

We are going to import the below PDF file into Excel.

  • Open Microsoft Word and click Open from the file menu.

  • Browse to the location of your PDF files and select the desired file.
  • Click Open.

  • The PDF file will open in Word in an editable form.

  • Select the table, right-click, and choose Copy.

Using Microsoft Word oextract data from multiple pdf files to Excel

  • Switch to your Excel worksheet, select cell B4, and right-click.
  • Choose the Keep Source Formatting (K) icon to paste the table from the Word file.

Using Microsoft Word oextract data from multiple pdf files to Excel

  • Adjust cell formatting as needed to match the PDF format.

  • The below is the final image after tweaking some formatting such as text color, column width, cell color, etc.

  • You can repeat the same process for other files.
  • Below is the second PDF file’s data that has been extracted into the Excel worksheet.

  • Below is the third PDF file’s data that has been extracted into the Excel worksheet.

Read More: How to Export Data from Fillable PDF to Excel


Method 3 – Applying Copy and Paste Tool to Extract Data from Multiple PDFs

  • Open the PDF file containing the table data you want to extract.
  • Press Ctrl+A to select all the data in the PDF.
  • Right-click and choose Copy from the context menu.

Copying and Pasting Directly from the PDF File to extract data in Excel

  • Switch to your Excel worksheet and select cell B4.
  • Right-click again and click on the Keep Source Formatting (K) icon from the Paste Options.

Copying and Pasting Directly from the PDF File to extract data in Excel

  • The PDF table will now be pasted into the Excel sheet.
  • Note that the table may not look exactly the same as in the PDF due to fixed cell sizes in Excel.
  • Apply formatting adjustments (e.g., text color, cell color, cell width) to make it match the original PDF layout.

Copying and Pasting Directly from the PDF File to extract data in Excel

After applying formatting, the imported table now looks like the PDF one.

Copying and Pasting Directly from the PDF File to extract data in Excel

Read More: How to Link PDF Form to Excel Database


Download Practice Workbook

You can download the practice workbooks from here:


Related Articles

<< Go Back to Import PDF to Excel | Importing Data in Excel | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Rubayed Razib Suprov
Rubayed Razib Suprov

Rubayed Razib, holding a BSC degree in Naval Architecture & Engineering from Bangladesh University of Engineering and Technology, serves as a devoted member of the ExcelDemy project. He has contributed significantly by authoring numerous articles and showcasing proficiency in VBA. Razib efficiently automates Excel challenges using VBA macros and actively participates in the ExcelDemy forum, providing valuable solutions for user interface challenges. Apart from creating Excel tutorials, he is interested in Data Analysis with MS Excel,... Read Full Bio

4 Comments
  1. how to Convert Multiple PDF files from PDF to excel using power BI and power query. When i used your video and sample file it was working fine. But i downloaded 3 Annual reports of TVS Motors each about 500 pages and tried to convert multiple PDF files to excel using power BI. But the data is not getting combined. Instead of Table001 and page001 i get multiple Tables and pages. Can you guide me how to do it.

    • Hello P Vinayakam,

      Thanks for your question! Handling large PDF files in Power BI and Power Query can be tricky, especially with lengthy reports like the ones you mentioned. When Power BI detects multiple tables and pages instead of a single combined dataset, try the following steps:

      Use the Navigator Pane Efficiently: Select only the required tables from each PDF. If there’s a pattern in table names (like Table001, Table002), select them carefully.

      Combine Tables Dynamically: Once you load the tables, go to Home >> Append Queries in Power Query to combine the tables. This will help merge the data from multiple pages or tables.

      Standardize Table Structures: Inconsistent table structures often prevent proper merging. Use Remove Columns, Rename Columns, and Align Headers to ensure the tables have a uniform structure.

      Filter Tables: If you notice irrelevant tables (like Table003, Page001), apply filters in Power Query to target only necessary data.

      Regards
      ExcelDemy

  2. Iam a equity Analyst. I have ten annual reports of TVS Motors in PDF format from FY2014 to FY2024 downloaded in a folder in desktop. Now i want my Profit & Loss account, Balance sheet and Cash flow statement to be arranged in ascendinig order in excel format using Power BI. Do you teach this in your Power BI course. If so what is the fees?

    • Hello P Vinayakam,

      Thank you for reaching out! Extracting and organizing financial statements from multiple annual reports using Power BI is a valuable task for equity analysts. While we don’t specifically offer a structured Power BI course, I can share some key steps to help you get started:

      Load PDF Files: Use Get Data > PDF in Power BI to load all your annual reports from the folder.
      Data Cleanup: Use Transform Data in Power Query to extract the Profit & Loss Account, Balance Sheet, and Cash Flow tables.
      Sort Data: Apply sorting functions to organize the data in ascending order by fiscal year.
      Create Visuals and Reports: Build comprehensive tables or dashboards for better insights.

      Regards
      ExcelDemy

Leave a reply

ExcelDemy
Logo

Advanced Excel Exercises with Solutions PDF