Excel Data Mining (All Things You Need to Know)

In this tutorial, we will walk through data mining examples in Excel. We will also show how to prepare data before mining it and how to install a Data Mining Add-in in Excel.

Businesses produce large amounts of data, which can yield useful insights when analyzed well. Data mining finds patterns in large datasets and is an effective way to extract those insights. Microsoft Excel offers tools for data analysis and data mining as well. The following sections cover them step by step.

Before that, see the overview image of the data mining Excel example below.

Overview of data mining Excel example


Download Practice Workbook

You can download the practice workbook for free.


What is Data Mining in Excel?

Data mining means finding patterns and useful insights in large datasets using mathematical and computational techniques. It examines data from different sources, finds patterns and relationships, and turns them into useful information. This supports informed decisions, improves business performance, and helps a business stay ahead of competitors.

Here are 4 professional skills to gain for data mining in Excel:

  • Exploratory data analysis (EDA)
  • Market basket analysis
  • Cluster analysis using K-means
  • Naive Bayes classification

How to Install Data Mining Add-in in Excel

To access data mining tools, we need to install the Data Mining Add-in first. Here is an easy way to load the add-in:

  • Click on Insert >> Get Add-ins.

Clicking Get Add-ins option from Insert tab

  • In the Search bar of the Office Add-ins window, type Data Mining >> select the desired add-in, Analytic Solver Data Mining >> click Add.

Searching Analytic Solver Data Mining

  • Check the I agree to all the above terms & conditions box >> Continue.

Checking I agree to all the above terms & conditions

  • Consequently, the Data Mining tab appears.

Data Mining Add-in appears in Excel


How to Prepare Data Before Data Mining in Excel

Preparing data is a crucial part of data mining. It means cleaning, transforming, and organizing the data before analyzing it. Raw data is often incomplete or incorrect, which leads to wrong conclusions. Data preparation makes the data correct and reliable, which in turn improves the insights gained from data mining.

Step 1: Use Excel Tools for Data Cleaning

We can clean data by removing duplicate records or by filtering it down to only the rows we need.

Tool 1: Remove Duplicates

  • To use the tool for removing duplicates, select range B5:E13 >> go to Data tab >> select Remove Duplicates.

Accessing Remove Duplicates from Data tab

  • In the Remove Duplicates dialog, check Location >> OK.

Checking Location column in Remove Duplicates dialog box

  • Click OK in the Excel warning box.

Tapping OK to close warning

  • As a result, the rows with duplicate values in the Location column are removed.

Two rows with redundancy disappear
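
For readers who also script their data cleaning, the same idea can be sketched in Python. This is only an illustration under assumptions: the sample rows and the remove_duplicates helper are hypothetical and are not taken from the practice workbook. Like Remove Duplicates, the sketch keeps the first row for each distinct Location and drops the later ones.

```python
# Hypothetical sample rows (not the workbook's actual data).
rows = [
    {"Temperature": 30, "Location": "Texas", "Sales": 120},
    {"Temperature": 32, "Location": "Ohio", "Sales": 90},
    {"Temperature": 35, "Location": "Texas", "Sales": 150},
    {"Temperature": 28, "Location": "Utah", "Sales": 80},
]


def remove_duplicates(records, key):
    """Keep the first record for each distinct value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


deduped = remove_duplicates(rows, "Location")
print([r["Location"] for r in deduped])  # ['Texas', 'Ohio', 'Utah']
```

Note that this sketch compares text exactly, so "Texas" and "texas" would count as different values here.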


Tool 2: Filter

We can also keep only the desired data using the Filter feature. Here are the steps:

  • Select the range B4:E13 >> click Data >> Filter.

Clicking Filter command in Data tab

  • Click the Filter dropdown icon beside Bottle Size (ml) column >> check 500 >> OK.

Selecting 500 in Filter dropdown

  • Hence, only the rows with 500 in the Bottle Size (ml) column remain visible.

Filtered data appears
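
The filter step can be sketched in Python as well. The rows and the filter_rows helper below are hypothetical and serve only to illustrate the logic. Unlike Excel's Filter, which hides the other rows, this sketch builds a new list.

```python
# Hypothetical sample rows (not the workbook's actual data).
rows = [
    {"Location": "Texas", "Bottle Size (ml)": 500, "Sales": 120},
    {"Location": "Ohio", "Bottle Size (ml)": 250, "Sales": 90},
    {"Location": "Utah", "Bottle Size (ml)": 500, "Sales": 80},
    {"Location": "Iowa", "Bottle Size (ml)": 1000, "Sales": 60},
]


def filter_rows(records, column, value):
    """Return only the records whose `column` equals `value`."""
    return [record for record in records if record[column] == value]


filtered = filter_rows(rows, "Bottle Size (ml)", 500)
print([r["Location"] for r in filtered])  # ['Texas', 'Utah']
```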


Step 2: Transform Data Using PivotTable in Excel

Once the data is clean, the next step is to transform it using a PivotTable in Excel. To do so,

  • Go to Insert tab >> click on PivotTable >> From Table/Range.

Inserting PivotTable using From Table/Range

  • In the PivotTable from table or range dialog box, input B4:E13 as Table/Range >> click on New Worksheet >> OK.

Selecting range for Pivot table

  • In the PivotTable Fields pane, drag Temperature (°C) to the Filters field, Location to Rows field, Sales to Values field.

Dragging columns to fields in PivotTable Fields

  • Subsequently, the Pivot Table pops up.

Pivot table appears
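
As a rough illustration of what the PivotTable computes, the sketch below groups rows by Location and sums Sales. The data and the pivot_sum helper are hypothetical. The sketch assumes the Values field is summarized by Sum and leaves out the Temperature filter.

```python
from collections import defaultdict

# Hypothetical sample rows (not the workbook's actual data).
rows = [
    {"Location": "Texas", "Sales": 120},
    {"Location": "Ohio", "Sales": 90},
    {"Location": "Texas", "Sales": 150},
    {"Location": "Utah", "Sales": 80},
]


def pivot_sum(records, row_field, value_field):
    """Sum `value_field` for each distinct value of `row_field`."""
    totals = defaultdict(int)
    for record in records:
        totals[record[row_field]] += record[value_field]
    # Sort by row label so the output order is stable.
    return dict(sorted(totals.items()))


summary = pivot_sum(rows, "Location", "Sales")
print(summary)  # {'Ohio': 90, 'Texas': 270, 'Utah': 80}
```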


Step 3: Insert Excel Power Query for Data Integration

Data integration means combining data from different sources into one place. We can easily bring in data from other Excel files using Power Query. Follow the steps:

  • Initially, go to Data >> Get Data >> From File >> From Excel Workbook.

Clicking From Excel Workbook in Get Data for data mining excel example

  • In the Import Data window, select the source file >> Import.

Importing desired file

  • In the Navigator window, select List >> click Load.

Selecting List in Navigator window

  • Eventually, we get the List query.

List query appears in another sheet for data mining excel example output

Now, you can append, merge, or connect this query with the main query table for analysis.
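
To show what a merge does conceptually, here is a minimal Python sketch of a left join. Everything in it is an assumption made for illustration: the two tables, the Location key, the Region column, and the left_merge helper are all hypothetical, since the contents of the List query are not shown here.

```python
# Hypothetical main table and lookup table (not the workbook's actual data).
sales = [
    {"Location": "Texas", "Sales": 120},
    {"Location": "Ohio", "Sales": 90},
    {"Location": "Utah", "Sales": 80},
]
regions = [
    {"Location": "Texas", "Region": "South"},
    {"Location": "Ohio", "Region": "Midwest"},
]


def left_merge(left, right, key):
    """Attach columns from `right` to every row of `left`, matching on `key`."""
    index = {row[key]: row for row in right}
    extra_columns = [col for col in right[0] if col != key] if right else []
    merged = []
    for row in left:
        match = index.get(row[key], {})
        combined = dict(row)
        for col in extra_columns:
            # Rows without a match get None, similar to a null in a left join.
            combined[col] = match.get(col)
        merged.append(combined)
    return merged


merged = left_merge(sales, regions, "Location")
for row in merged:
    print(row)
```

Every row of the main table is kept, and rows without a match in the second table get an empty value for the added column.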


Step 4: Use Excel AVERAGE Function for Data Reduction

Data reduction means shrinking the amount of data to analyze while keeping its essential information. You can use aggregate functions like SUM and AVERAGE to summarize data. In this case, we will average the values.

  • In cell D15, enter the following AVERAGE formula and press Enter.
=AVERAGE(B5:B13)
  • Then, enter the following AVERAGE formula in D16.
=AVERAGE(E5:E13)

Using AVERAGE function for data reduction for data mining excel example output

The formulas return the average temperature and the average sales, respectively.
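
The same reduction can be sketched in Python with the standard library. The two lists below are hypothetical stand-ins for the temperature and sales columns, not the workbook's values.

```python
from statistics import mean

# Hypothetical stand-ins for the temperature and sales columns.
temperatures = [30, 32, 35, 28]
sales = [120, 90, 150, 80]

avg_temperature = mean(temperatures)  # plays the role of =AVERAGE(B5:B13)
avg_sales = mean(sales)               # plays the role of =AVERAGE(E5:E13)

print(avg_temperature, avg_sales)
```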


Examples of Data Mining in Excel

In this section, we will discuss some practical examples of data mining in Excel. Suppose we want to predict sales of cold water bottles based on the weather forecast. We will apply several data mining techniques to this problem.

Example 1: Application of Regression Analysis

First, we will apply regression analysis to predict sales of cold water bottles based on the weather forecast. To do so,

  • Go to Data tab >> Data Analysis (available once the Analysis ToolPak add-in is enabled) >> select Regression in Data Analysis dialog box >> OK.

Selecting Regression option in Data Analysis

  • Input the dependent-variable range (B5:B13 here) in the Input Y Range box >> the independent-variable range (E5:E13 here) in Input X Range >> OK.

Inputting range of X and Y axis

  • Eventually, we obtain the summary output.

Summary of Regression analysis for data mining excel example output
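
To make the key numbers in the summary output easier to interpret, the sketch below computes the slope, intercept, and R Square of a simple linear regression by ordinary least squares. It is a minimal illustration and not the Analysis ToolPak's implementation. The data and the linear_fit helper are hypothetical.

```python
# Hypothetical data: temperature as the predictor, sales as the response.
temperatures = [25, 30, 35, 40]
sales = [100, 130, 150, 180]


def linear_fit(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared = 1 - (residual sum of squares / total sum of squares).
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_residual / ss_total
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(temperatures, sales)
print(round(slope, 4), round(intercept, 4), round(r_squared, 4))
```

An R Square close to 1 suggests that temperature explains most of the variation in sales for this sample.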


Example 2: Data Visualization with One Independent Variable

Now, let’s plot Sales against Temperature, treating Temperature as the single independent variable and Sales as the dependent variable. In that case,

  • Select the ranges B4:B13 and E4:E13 (hold Ctrl to select both) >> move to Insert tab >> Insert Line or Area chart >> select Line chart.

Selecting data to create a Line chart

  • Thus, the Line chart appears. There, right-click on a data point >> click Add Trendline from the Context Menu.

Selecting Add Trendline in Line chart for data mining excel example

  • In the Format Trendline pane, select Linear as Trendline Options.

Clicking Linear option in Format Trendline

  • Hence, the linear trendline appears, showing the overall trend in the data.

Trendline appears in Line chart as data mining excel example output


Example 3: Use LINEST Function

We can also use the LINEST function for data mining in Excel. With this function, we will estimate how sales change with temperature and use that to predict future sales.

  • Write the LINEST formula in E5:
=LINEST(C5:C13,B5:B13,TRUE,TRUE)

Using LINEST function for data mining Excel example

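Here, m is the slope, that is, the change in sales for each 1°C rise in temperature, and b is the intercept, that is, the predicted sales at 0°C. Because the last argument is TRUE, LINEST also returns additional regression statistics, such as the standard errors and R², in the rows below m and b.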
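
Once m and b are known, a forecast is just the line equation: sales = m × temperature + b. The sketch below illustrates this in Python. The values of m and b and the forecast temperatures are hypothetical placeholders, not the results from the workbook.

```python
# Hypothetical slope and intercept, standing in for the m and b that LINEST returns.
m = 5.2    # change in sales per 1 degree Celsius
b = -29.0  # predicted sales at 0 degrees Celsius


def predict_sales(temperature, slope, intercept):
    """Evaluate the fitted line: sales = slope * temperature + intercept."""
    return slope * temperature + intercept


# Hypothetical forecast temperatures in degrees Celsius.
forecast_temperatures = [38, 42, 45]
forecasts = [predict_sales(t, m, b) for t in forecast_temperatures]
print([round(f, 1) for f in forecasts])
```

Forecasts far outside the temperature range used to fit the line should be treated with caution, since the linear relationship may not hold there.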


Advantages and Disadvantages of Data Mining in Excel

Advantages of Data Mining in Excel:

  • User-friendly interface: People can explore data mining techniques even without much programming knowledge.
  • Versatile: People can apply various data mining methods such as clustering, prediction, and association analysis. They can also build their own calculations and rules.

Disadvantages of Data Mining in Excel:

  • Lack of Scalability: Excel has limits in handling big data (a worksheet holds at most 1,048,576 rows). For large organizations whose data keeps growing, it might not be the best choice.
  • Slow Processing: Excel analyzes large datasets slowly, and the results can be less precise.

Things to Remember

  • Along with PivotTable, use date and time functions and mathematical transformations for data transformation.
  • Data connection is an alternative to Power Query for data integration.

Frequently Asked Questions

1. Can we customize our data analysis result in Excel?
Yes, you can customize analysis results through Excel formulas or rules.

2. What is the function of Outliers in Excel data mining?
The Outliers command helps clean the data by flagging values that lie far outside the typical range, so you can review, remove, or replace them.

3. What is the data mining cycle?
Business Understanding ⇆ Data Understanding ⇆ Data Preparation ⇆ Modeling → Evaluation → Deployment → Business Understanding.


Conclusion

In summary, we have discussed several examples of data mining in Excel. We have also learned how to clean, transform, integrate, and reduce data as preparation for data mining. Hopefully, you can apply these examples in your own data analysis. If you have any suggestions or queries, let us know in the comment box below. Thank you for reading.


Yousuf Khan