How to Extract Specific Data from PDF to Excel Using VBA

 

Extract Specific Data from PDF to Excel Using VBA (Code)

Sub Extract_Data_from_PDF()

Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
Application_Path = "C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe"
PDF_Path = "E:\ExcelDemy\standardnormaltable.pdf"

Shell_Path = Application_Path & " """ & PDF_Path & """"
Call Shell(pathname:=Shell_Path, windowstyle:=vbNormalFocus)

Application.Wait Now + TimeValue("0:00:03")

SendKeys "%vpc"
SendKeys "^a"
SendKeys "^c"

MyWorksheet.Range("A1").PasteSpecial Paste:=xlPasteAll

Call Shell("TaskKill /F /IM Acrobat.exe", vbHide)

End Sub

VBA Code to Extract Specific Data from PDF to Excel Using VBA


Extract Specific Data from PDF to Excel Using VBA: Step-by-Step Code Explanation

Consider the PDF file called standardnormaltable.pdf that contains a table of the normal distribution.

PDF File to Extract Specific Data from PDF to Excel Using VBA

And we’ve opened a worksheet called Sheet1 in an Excel workbook where we’ll copy the data from the PDF file.

Blank Worksheet to Extract Specific Data from PDF to Excel Using VBA

⧪ Step 1: Declaring the Necessary Inputs

These include the worksheet name, the range of the cells, the location of the application through which the PDF file will be opened (Adobe Reader in this example), and the location of the PDF file.

Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
Application_Path = "C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe"
PDF_Path = "E:\ExcelDemy\standardnormaltable"

⧪ Step 2: Opening the PDF File (by Using the VBA Shell Command)

Shell_Path = Application_Path & " """ & PDF_Path & """"
Call Shell(pathname:=Shell_Path, windowstyle:=vbNormalFocus)

Opening PDF to Extract Specific Data from PDF to Excel Using VBA

⧪ Step 3 (Optional): Waiting for a Few Moments

This allows your computer to finish the earlier task successfully and start a new task.

Application.Wait Now + TimeValue("0:00:03")

Waiting Some Time to Extract Specific Data from PDF to Excel Using VBA

This waits for 30 seconds. But if you want, you can wait for more.

⧪ Step 4: Using SendKeys to Copy Data from the PDF File

  • ALT + V, P, C: This is mainly for enabling the scrolling in the PDF. For small files, this is not necessary. But for larger files, this becomes necessary for selecting the whole file.
  • CTRL + A: This selects the whole file.
  • CTRL + C: Copies the selection.

Therefore, the lines of code will be:

SendKeys "%vpc"
SendKeys "^a"
SendKeys "^c"

⧪ Step 5: Pasting the Data in the Excel File

MyWorksheet.Range("A1").PasteSpecial Paste:=xlPasteAll

Pasting Data to Extract Specific Data from PDF to Excel Using VBA

This pastes into cell A1 of the worksheet and will populate the rest of the cells accordingly.

⧪ Step 6 (Optional): Closing the PDF File (Ending the Application)

Call Shell("TaskKill /F /IM Acrobat.exe", vbHide)

Read More: How to Extract Data from PDF to Excel


Example to Extract Specific Data from PDF to Excel Using VBA

We’ve seen the step-by-step procedure to extract data from a PDF file to an Excel worksheet using VBA.

Therefore, the complete VBA code to extract data from the PDF file called standardnormaltable to Sheet1 will be:

VBA Code:

Sub Extract_Data_from_PDF()

Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
Application_Path = "C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe"
PDF_Path = "E:\ExcelDemy\standardnormaltable.pdf"

Shell_Path = Application_Path & " """ & PDF_Path & """"
Call Shell(pathname:=Shell_Path, windowstyle:=vbNormalFocus)

Application.Wait Now + TimeValue("0:00:03")

SendKeys "%vpc"
SendKeys "^a"
SendKeys "^c"

MyWorksheet.Range("A1").PasteSpecial Paste:=xlPasteAll

Call Shell("TaskKill /F /IM Acrobat.exe", vbHide)

End Sub

VBA Code to Extract Specific Data from PDF to Excel Using VBA

Output:

Run this code. And it’ll copy data from the PDF file called “standardnormaltable” to the worksheet called “Sheet1” in the active workbook.

Read More: How to Extract Data from Multiple PDF Files to Excel


Things to Remember

  • The workbook in which you’ll copy the data from the PDF file must be kept open during running the code. Otherwise, you’ll have to use the name of the workbook in the code.
  • The name of the application that you are using inside the code (Adobe Acrobat DC here) must be installed on your computer. Otherwise, you’ll receive an error.
  • For large data sets of PDF files, the process may take some time to copy all the data and paste. So be patient and wait until the process completes successfully.

Download Practice Workbook

Download this practice workbook to exercise while you are reading this article.


Related Articles

<< Go Back to Import PDF to Excel | Importing Data in Excel | Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Rifat Hassan
Rifat Hassan

Rifat Hassan, BSc, Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, has worked with the ExcelDemy project for almost 2 years. Within these 2 years, he has written over 250 articles. He has also conducted a few Boot Camp sessions on effective coding, especially Visual Basic for Applications (VBA). Currently, he is working as a Software Developer to develop and deploy additional add-ins to enhance the customers with a more sophisticated experience with Microsoft Office Suits,... Read Full Bio

12 Comments
  1. It doesn’t work. It runs the SendKeys commands inside the Visual Basic Excel Editor and not te PDF file. The output si “Acrobat.exe” text in the A cell. Somehow you have to move the focus(make active) the PDF file before executing SendKeys commands.

    • Hello Marian,

      There are some unwanted issues that may arise while dealing with the VBA code. In that case, you need to utilize the following code.

      Sub Extract_Data_from_PDF()
      
      Dim MyWorksheet As Worksheet
      
      Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
      Application_Path = "C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe"
      PDF_Path = "C:\Users\USER\Desktop\Durjoy\Comments\New folder\standardnormaltable.pdf"
      
      Shell_Path = Application_Path & " """ & PDF_Path & """"
      Call Shell(pathname:=Shell_Path, windowstyle:=vbNormalFocus)
      
      Application.Wait Now + TimeValue("0:00:03")
      
      SendKeys "%vpc"
      SendKeys "^a"
      SendKeys "^c"
      
      On Error Resume Next
      Range("A1").PasteSpecial
      
      Call Shell("TaskKill /F /IM Acrobat.exe", vbHide)
      
      End Sub

      Here, the Application path and pdf path need to be defined accurately. Then, when you run the code, you will find the copied items in the clipboard.
      Then, click on Paste all to extract values.

      null

      Thanks
      Author, ExcelDemy

  2. Hi, the VBA code stops before “SendKeys “%vpc””, the PDF is read only and cannot be copied, can help ?

  3. hi, the PDF cannot be copied as Read Only, can help to revise the VBA code ? thx.

    • Hello, Billy!
      Thanks for sharing your problem with us!
      Actually, this code perfectly works for me. This code extracts specific data from pdf to Excel properly. Please, make sure you use the accurate Application and PDF paths.
      Can you please send me your excel file via email? ([email protected]).
      So that, I can solve your problem.

      Good Luck!

      Regards,
      Sabrina Ayon
      Author, ExcelDemy
      .

  4. hi, Sabrina Ayon,

    Thx for your quick response. But using your VBA code, some PDF files can be copied successfully, but some PDF is failed. It requests me using OCR as the PDF is a scanned copy. Or I need to use the button Edit PDF manually.

    Thank you.

  5. hi, Sabrina Ayon,

    when the PDF is non-readable or scanned, do you have the VBA code to extract PDF text?

    Regards,

    Billy

    • Dear BILLY,
      Your PDF file should be readable in order to extract data. The application needs to recognize data. Please ensure a readable copy.
      Regards
      ExcelDemy Team

  6. Hi, when I run the first version of the code, I only get the path of the filename appearing in cell A1.

    I can see it opening up the pdf and get no error issues but obviously don’t get the output I was looking for.

    If you can assist that would be great thanks

    • I am getting the exact same thing. And tried your altered code above in the comments and pretty much everything I could think of to eliminate areas of issue. I actually went so far as to make a new PDF FROM EXCEL with literally 3 columns of 5 rows of data (numbers between 1 and 20 even) with a 1 Row header. Still get one of two things every time… either the letter c in A1 (lowercase) and all the rest of the cells colored in gray. Or the file path pasted in A1.

      I originally typed/copied the code step by step with your (very good) instructions. But have even tried copying the entire thing (and only editing the paths) but still no luck.

      Have you been able to identify what is causing this for people!? I’d otherwise LOVE to use this and appreciate the work!

      • Hi JEFF Z,
        Thank you for sharing your problem with us. We have got a very simple solution to your problem. You just have to delete the last line (Call Shell(“TaskKill /F /IM Acrobat.exe”, vbHide)) from the code.
        And, the revised code is given as follows:

        Sub Extract_Data_from_PDF()
        
        Set MyWorksheet = ActiveWorkbook.Worksheets("Sheet1")
        Application_Path = "C:\Program Files\Adobe\Acrobat DC\Acrobat\Acrobat.exe"
        PDF_Path = "C:\Users\PC 21\Downloads\sample.pdf"
        
        Shell_Path = Application_Path & " """ & PDF_Path & """"
        Call Shell(pathname:=Shell_Path, windowstyle:=vbNormalFocus)
        
        Application.Wait Now + TimeValue("0:00:03")
        
        SendKeys "%vpc"
        SendKeys "^a"
        SendKeys "^c"
        
        MyWorksheet.Range("A1").PasteSpecial Paste:=xlPasteAll
        
        End Sub

        The problem with the previous code was that the code would close the PDF file before pasting it into Excel. So, we have removed the last line so that, the code doesn’t close the pdf file at all. In this case, you have to close the pdf file manually.
        Make sure to close the pdf file before running the code. And, also, make sure to clear the contents of Column 1 of Sheet 1 before running the code.

        Regards,
        Sourav Kundu
        ExcelDemy.

    • Reply Avatar photo
      Naimul Hasan Arif Mar 1, 2023 at 5:12 PM

      Hello Mark,
      Glad to hear from you. There might be one of several reasons for not having the desired output. If the file is corrupted, protected, or scanned, there will be problem in extracting data. It will be helpful for me to specify the problem if you could send me the file and the code at [email protected].

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo