In this tutorial, we will walk through a data mining example in Excel. We will also explore how to prepare data before data mining in Excel, and you will learn how to install the Data Mining Add-in.
Businesses produce large amounts of data, which can yield useful insights when analyzed well. Data mining finds patterns in large data sets and is a powerful way to extract those insights. Microsoft Excel offers tools for data analysis and data mining too. The following sections cover them in detail.
Before that, see the overview image of the data mining Excel example below.
Download Practice Workbook
You can download the practice workbook for free.
What is Data Mining in Excel?
Data mining means finding patterns and useful insights in large data sets using statistical and computational methods. It draws on data from different sources, identifies patterns and relationships, and turns them into useful information. This supports better decisions, improves business performance, and helps a business stay ahead of competitors.
Here are 4 professional skills to gain for data mining in Excel:
- Exploratory data analysis (EDA)
- Market basket analysis
- Cluster analysis using K-means
- Naive Bayes classification
How to Install Data Mining Add-in in Excel
To access data mining tools, we need to install the Data Mining Add-in first. Here is an easy way to load the add-in:
- Click on Insert >> Get Add-ins.
- In the Search bar of the Office Add-ins window, type Data Mining >> select the Analytical Solver Data Mining add-in >> click Add.
- Check the I agree to all the above terms & conditions box >> Continue.
- Consequently, the Data Mining tab appears.
How to Prepare Data Before Data Mining in Excel
Preparing data is a crucial part of data mining. It involves cleaning, transforming, and organizing data before analysis. Raw data is often incomplete or incorrect, which leads to inaccurate analysis. Data preparation ensures that the data is correct and reliable, which improves the insights that data mining produces.
Step 1: Use Excel Tools for Data Cleaning
We can clean data by removing duplicate records or by filtering for only the data we need.
Tool 1: Remove Duplicates
- To use the tool for removing duplicates, select range B5:E13 >> go to Data tab >> select Remove Duplicates.
- In the Remove Duplicates dialog, check Location >> OK.
- Click OK in the Excel warning box.
- As a result, Excel removes the rows with duplicate values in the Location column and keeps the first occurrence of each.
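For readers who want to see the logic outside Excel, the Remove Duplicates steps above can be sketched in Python. This is a minimal illustration, not what Excel runs internally. The column names come from this tutorial, but the sample values and the `remove_duplicates` helper are hypothetical.

```python
# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"Temperature (°C)": 30, "Location": "Park", "Bottle Size (ml)": 500, "Sales": 120},
    {"Temperature (°C)": 32, "Location": "Beach", "Bottle Size (ml)": 1000, "Sales": 150},
    {"Temperature (°C)": 30, "Location": "Park", "Bottle Size (ml)": 250, "Sales": 90},
    {"Temperature (°C)": 25, "Location": "Mall", "Bottle Size (ml)": 500, "Sales": 60},
]


def remove_duplicates(records, key):
    """Keep the first row for each distinct value of `key`, preserving row order."""
    seen = set()
    unique = []
    for record in records:
        value = record[key]
        if value not in seen:
            seen.add(value)
            unique.append(record)
    return unique


deduped = remove_duplicates(rows, "Location")
print([row["Location"] for row in deduped])
```

Only the Location value decides whether a row counts as a duplicate here, which matches checking only Location in the dialog.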
Tool 2: Filter
We can also display only the desired data using the Filter feature. Here are the steps:
- Select the range B4:E13 >> click Data >> Filter.
- Click the Filter dropdown icon beside Bottle Size (ml) column >> check 500 >> OK.
- Hence, the table shows only the rows where Bottle Size (ml) is 500.
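The filter steps above amount to keeping the rows that match a condition. A minimal Python sketch of that idea follows. The sample values and the `filter_rows` helper are hypothetical, and only the column names are taken from this tutorial.

```python
# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"Location": "Park", "Bottle Size (ml)": 500, "Sales": 120},
    {"Location": "Beach", "Bottle Size (ml)": 1000, "Sales": 150},
    {"Location": "Park", "Bottle Size (ml)": 250, "Sales": 90},
    {"Location": "Mall", "Bottle Size (ml)": 500, "Sales": 60},
]


def filter_rows(records, column, wanted):
    """Return a new list with only the rows whose `column` equals `wanted`."""
    return [record for record in records if record[column] == wanted]


only_500 = filter_rows(rows, "Bottle Size (ml)", 500)
print([row["Location"] for row in only_500])
```

Like the Excel filter, this leaves the original data intact: `rows` still holds every record, and `only_500` is a separate view of the matches.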
Step 2: Transform Data Using PivotTable in Excel
Once the data is clean, the next thing is to transform data using the Pivot Table in Excel. To do so,
- Go to Insert tab >> click on PivotTable >> From Table/Range.
- In the PivotTable from table or range dialog box, input B4:E13 as Table/Range >> click on New Worksheet >> OK.
- In the PivotTable Fields pane, drag Temperature (°C) to the Filters field, Location to Rows field, Sales to Values field.
- Subsequently, the Pivot Table pops up.
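The PivotTable layout above (Location in Rows, Sales in Values, Temperature as a filter) can be sketched as a grouped sum in Python. This is an illustrative equivalent under stated assumptions: the sample values and the `pivot_sum` helper are hypothetical, and Sum is assumed as the aggregation because it is the PivotTable default for numeric fields.

```python
from collections import defaultdict

# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"Temperature (°C)": 30, "Location": "Park", "Sales": 120},
    {"Temperature (°C)": 32, "Location": "Beach", "Sales": 150},
    {"Temperature (°C)": 30, "Location": "Park", "Sales": 90},
    {"Temperature (°C)": 25, "Location": "Mall", "Sales": 60},
]


def pivot_sum(records, row_field, value_field, filter_field=None, filter_value=None):
    """Sum `value_field` per distinct `row_field`, optionally keeping one filter value."""
    totals = defaultdict(int)
    for record in records:
        if filter_field is not None and record[filter_field] != filter_value:
            continue  # row excluded by the report filter
        totals[record[row_field]] += record[value_field]
    return dict(totals)


sales_by_location = pivot_sum(rows, "Location", "Sales")
sales_at_30 = pivot_sum(rows, "Location", "Sales", "Temperature (°C)", 30)
print(sales_by_location)
print(sales_at_30)
```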
Step 3: Insert Excel Power Query for Data Integration
Data integration means combining data from different sources. We can easily bring in data from other Excel files using Power Query. Follow the steps:
- Initially, go to Data >> Get Data >> From File >> From Excel Workbook.
- In the Import Data window, select the source file >> Import.
- In the Navigator window, select List >> click Load.
- Eventually, we get the List query.
Now, you can combine, merge, or connect this query with the main query table for analysis.
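To illustrate what merging two tables means, here is a minimal Python sketch of a left merge on a shared column. It does not reproduce Power Query itself. The contents of both tables, the `Region` column, and the `left_merge` helper are hypothetical, since the tutorial does not show what the List query contains.

```python
# Hypothetical main table and imported table; all values are made up for illustration.
main_rows = [
    {"Location": "Park", "Sales": 120},
    {"Location": "Beach", "Sales": 150},
    {"Location": "Mall", "Sales": 60},
]
imported_rows = [
    {"Location": "Park", "Region": "North"},
    {"Location": "Beach", "Region": "South"},
]


def left_merge(left, right, key):
    """Attach the columns of `right` to each row of `left` that shares the same `key`.

    Rows of `left` without a match keep their data and get None for the extra columns.
    """
    lookup = {record[key]: record for record in right}
    extra_columns = [column for column in right[0] if column != key] if right else []
    merged = []
    for row in left:
        match = lookup.get(row[key])
        combined = dict(row)
        for column in extra_columns:
            combined[column] = match[column] if match else None
        merged.append(combined)
    return merged


merged_rows = left_merge(main_rows, imported_rows, "Location")
for row in merged_rows:
    print(row)
```

Every row of the main table survives the merge, so an unmatched location such as the hypothetical "Mall" shows an empty value rather than disappearing.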
Step 4: Use Excel AVERAGE Function for Data Reduction
Data reduction means reducing the amount of data to analyze. You can use aggregate functions like SUM and AVERAGE to summarize data. In this case, we will average the values.
- In cell D15, enter the following AVERAGE formula and press Enter.
=AVERAGE(B5:B13)
- Similarly, enter the following AVERAGE formula in D16:
=AVERAGE(E5:E13)
The formulas return averages of temperature and sales respectively.
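The same reduction can be sketched in Python with the standard `statistics` module. The two lists stand in for the temperature and sales columns. Their values are hypothetical and are not taken from the practice workbook.

```python
from statistics import mean

# Hypothetical column values; made up for illustration only.
temperatures = [30, 32, 28, 25]
sales = [120, 150, 90, 60]

# Each column is reduced to a single summary number, as AVERAGE does in the sheet.
avg_temperature = mean(temperatures)
avg_sales = mean(sales)
print(avg_temperature, avg_sales)
```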
Examples of Data Mining in Excel
In this section, we will discuss some practical examples of data mining in Excel. Suppose we want to predict sales of cold water bottles based on the weather forecast. We will apply several data mining techniques to this problem.
Example 1: Application of Regression Analysis
First, we will apply the Regression Analysis to predict sales of cold water bottles based on the weather forecast. To do so,
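- Go to Data tab >> Data Analysis >> select Regression in Data Analysis dialog box >> OK. If Data Analysis is not on the Data tab, enable the Analysis ToolPak add-in first.
- Input E5:E13 (Sales, the value to predict) in Input Y Range box >> B5:B13 (Temperature) in Input X Range >> OK.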
- Eventually, we obtain the summary output.
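To show what the main numbers in the summary output represent, here is a minimal Python sketch of simple linear regression by ordinary least squares. It computes only the slope, the intercept, and R Square, not the full table that Excel reports. The data values and the `simple_regression` helper are hypothetical.

```python
# Hypothetical data; temperature is the X variable and sales is the Y variable.
temperatures = [20, 25, 30, 35]
sales = [50, 75, 85, 110]


def simple_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared is the share of the variation in y that the fitted line explains.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


slope, intercept, r_squared = simple_regression(temperatures, sales)
print(slope, intercept, r_squared)
```

An R Square close to 1 suggests that temperature explains most of the variation in sales for this made-up sample.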
Example 2: Data Visualization with One Independent Variable
Now, let’s consider Temperature as the single independent variable and Sales as the dependent variable. In that case,
- Select the range B4:B13 & E4:E13 >> move to Insert tab >> Insert Line or Area chart >> select Line chart.
- Thus, the Line chart appears. There, right-click on a data point >> click Add Trendline from the Context Menu.
- In the Format Trendline pane, select Linear under Trendline Options.
- Hence, the linear trendline appears, showing the overall trend in the data.
Example 3: Use LINEST Function
We can also use the LINEST function in the data mining example in Excel. We will predict our future sales with the change in temperatures using this function.
- Write the LINEST formula in E5:
=LINEST(C5:C13,B5:B13,TRUE,TRUE)
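Here, the m column holds the slope, which is the change in sales for each one-unit change in temperature, and the b column holds the intercept. To predict sales for a given temperature, multiply m by the temperature and add b.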
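The way m and b turn into a forecast can be sketched in Python. This is a hypothetical `linest` helper that returns only the slope and intercept, not the extra statistics Excel adds when the last argument is TRUE. It takes the y values first to mirror the argument order of the LINEST formula above. The data values are made up.

```python
def linest(known_ys, known_xs):
    """Return (m, b) for the least-squares line y = m * x + b.

    The y values come first, mirroring the argument order of Excel's LINEST.
    """
    n = len(known_xs)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_xs, known_ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data; made up for illustration only.
temperatures = [20, 25, 30, 35]
sales = [50, 70, 90, 110]

m, b = linest(sales, temperatures)
# Forecast sales for a temperature that is not in the data.
predicted_sales_at_40 = m * 40 + b
print(m, b, predicted_sales_at_40)
```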
Advantages and Disadvantages of Data Mining in Excel
Advantages of Data Mining:
- User-friendly interface: People can explore ways to do data mining, even without much programming knowledge.
- Versatile: People can do various data mining methods like grouping, prediction, and finding connections. They can also make their calculations and rules.
Disadvantages of Data Mining:
- Lack of Scalability: Excel has limits in handling big data. For big organizations needing to grow, it might not be the best choice.
- Slow Processing: Excel provides slower analysis and less exact outcomes for large data.
Things to Remember
- Along with PivotTable, use date and time functions and mathematical transformations for data transformation.
- Data connection is an alternative to Power Query for data integration.
Frequently Asked Questions
1. Can we customize our data analysis result in Excel?
Yes, you can customize analysis results through Excel formulas or rules.
2. What is the function of Outliers in Excel data mining?
The Outliers command helps clean the data by identifying extreme values so that you can remove or replace them.
3. What is the data mining cycle?
Business Understanding ⇆ Data Understanding ⇆ Data Preparation ⇆ Modeling → Evaluation → Deployment → Business Understanding.
Conclusion
In summary, we have discussed a data mining example in Excel. We have also learned how to clean, transform, integrate, and reduce data as preparation for data mining. Hopefully, you can apply these examples in your own data analysis. If you have any suggestions or queries, let us know in the comment box below. Thank you for reading.