In this article, learn how to
– Scrape tabular data from web in Excel.
– Scrape entire page data including hyperlinks from web.
While writing this article, we applied all procedures using Microsoft 365, but you can also try these in other versions.
Excel, a popular spreadsheet program, offers a user-friendly and accessible platform for scraping web data. Moreover, we can also use different functions and features to analyze those data to understand the nature and characteristics of an event like how the dollar price rate will be in the next few months or how the stock price of the share market will change.
Here is an overview of a data table after web scraping.
Download Practice Workbook
Web Scraping in Excel: 2 Suitable Ways
In this section, we will show you two practical methods for scraping data from a website and importing it into Excel. To demonstrate, we select a table from a website that we want to export into Excel for additional examination.
1. Utilizing Get Data From Web Feature for Scraping Data from a Website
- First, navigate to the Data tab and select From Web under the Get & Transform Data section of your Excel workbook.
- A window will pop up. Insert the URL of the website whose data you want to scrape.
- After that, click OK.
- Next, a window will pop up saying how you want to access the web. Select Anonymous and click Connect.
- The data of the website will be linked to Excel. If the website contains several tables, you will see a list of them. Select one that you want to work with and click on the Transform Data button.
- You will see the data table in the Power Query Editor. Remove unnecessary columns and make suitable changes to the table. As the table contains a lot of columns, we cannot show it completely in the Power Query Editor.
- To load the table in a sheet, click on Close & Load button or drop-down. Select Close & Load To to load the table in a specific sheet.
- After that, insert the cell reference where you want to extract the table in the Import Data dialog box and click OK.
Later, the data table will be imported to the desired cell of the sheet.
- A little trick can be used to update the table if there’s any change in the website. Just select any cell on the extracted table and go to Data >> Refresh All Drop Down >> Connection Properties… If you want to refresh the data table instantly or manually, just choose the Refresh or Refresh All.
- Next, the Query Properties window will appear. Check the Refresh every option and set a suitable time. Here, we set the timer to 10 So the data table will update every 10 minutes.
2. Use of VBA for Web Scraping in Excel
2.1 Scraping Data from Web as Table
- First, press Alt+F11 or select Developer >> Visual Basic to open the VBA editor.
- Next, open a VBA Module following the image below.
- Insert the code below in the Module.
Option Explicit
Public Sub ExtractStockData()
Call ClearSheet
Call UseQueryTable
End Sub
Private Sub ClearSheet()
Dim aA_table As QueryTable
For Each aA_table In Sheet5.QueryTables
aA_table.Delete
Next aA_table
Sheet5.Cells.Clear
End Sub
Private Sub UseQueryTable()
Dim aA_URL As String
aA_URL = "https://www.investing.com/equities/"
Dim aA_table As QueryTable
Set aA_table = Sheet5.QueryTables.Add("URL;" & aA_URL, Sheet5.Range("A1"))
With aA_table
.WebSelectionType = xlSpecifiedTables
.WebTables = "1"
.WebFormatting = xlWebFormattingNone
.Refresh
End With
End Sub
🎓How Does the Code Work?
- In this case, we utilized the Private Sub Procedure UseQueryTable() to extract tabular data from the website of interest.
- A With Statement helps to organize the data on the Sheet5 sheet, which we titled Result (see the worksheet).
- The Private Sub Procedure ClearSheet deletes previously used tables, making it easier to scrape new data from the URL.
- Finally, the Public Sub Procedure (ExtractStockData) takes the commands from these Private Sub Procedures to scrape the tabular data from the website.
You can directly use the code; you just need to change the website link if needed.
- Press Alt + F8 to run the Macro.
The table will be scraped to the sheet named Result.
Next, format the data table to make it look better.
2.2 Scraping Entire Page Data from Web Including Hyperlinks
Here is another code to scrape the entire page.
Option Explicit
Public Sub ScrapeFullPage()
Call ClearSheet
Call UseQueryTable
End Sub
Private Sub ClearSheet()
Dim aA_table As QueryTable
For Each aA_table In Sheet6.QueryTables
aA_table.Delete
Next aA_table
Sheet6.Cells.Clear
End Sub
Private Sub UseQueryTable()
Dim aA_URL As String
aA_URL = "https://www.investing.com/equities/"
Dim aA_table As QueryTable
Set aA_table = Sheet6.QueryTables.Add("URL;" & aA_URL, Sheet6.Range("A1"))
With aA_table
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingAll
.Refresh
End With
End Sub
🎓How Does the Code Work?
- We utilized the Private Sub Procedure UseQueryTable() to retrieve tabular data from the website of our choice.
- A With Statement helps in organizing the data on the Sheet6 page, which we labeled Entire Page (see the worksheet).
- The Private Sub Procedure ClearSheet deletes previously used tables, allowing new data to be scraped from the URL.
- Finally, the Public Sub Procedure (ScrapeFullPage) scrapes the full webpage from the website using the commands from these Private Sub Procedures.
- Run the macro similarly to scrape the entire webpage. The name of the Macro here is ScrapeFullPage.
After running the code, you will see all the data (including links) of the website in your Excel sheet named Entire Page (Sheet6). The following picture shows the first parts of the data in the website.
And here is the image after a bit of scrolling, showing the data table we scraped earlier using the Macro named ExtractStockData.
Things to Remember
- The VBA code technique is not suitable for every webpage. Rather, we must adjust the code each time according to the layout of the webpage. As a result, it can be time-consuming and inconvenient.
- In the 1st method, we can scrape multiple objects from a webpage at a time.
Frequently Asked Questions
1. How does web scraping in Excel work?
Answer: Web scraping in Excel typically involves using Excel’s built-in web query functionality or utilizing external tools and libraries that can fetch data from websites and import it into Excel. You can specify the URLs to scrape, define the data elements you want to extract, and configure the scraping process accordingly.
2. Are there any limitations or legal considerations when web scraping in Excel?
Answer: Yes, there are limitations and legal considerations to keep in mind when web scraping in Excel. Some websites may have terms of service or robots.txt files that prohibit scraping. It’s important to respect these rules and ensure that your scraping activities comply with applicable laws and regulations. Additionally, websites may have rate limits or employ anti-scraping measures that can affect the success and speed of your scraping process.
Conclusion
In a nutshell, we can conclude that you will be an expert in web scraping in Excel after going through the topics and practicing the problems from the downloadable files from the articles above. If you have any questions or feedback regarding this article, please share them in the comment section. This will help me enrich my upcoming articles.
Web Scraping in Excel: Knowledge Hub
<< Go Back to Importing Data in Excel | Learn Excel
Get FREE Advanced Excel Exercises with Solutions!