How to Use the Data Analysis Toolpak in Excel – 13 Features

 

Steps to Enable Data Analysis Toolpak in Excel

 Steps:

  • In File, go to Options.

Installing  Data Analysis Toolpak

  • Select Add-ins.
  • In Manage, select Excel Add-ins.
  • Click Go.

How to Use Data Analysis Toolpak in Excel

  • In the new window, check Analysis ToolPak.
  • Click OK.

  • In the Data tab, select Analyze and choose Data Analysis.

How to Use Data Analysis Toolpak in Excel


 

Feature 1 – Anova Analysis

ANOVA is a statistical method used to analyze variance observed within a dataset by dividing it into two sections: 1) Systematic factors and 2) Random factors.

Anova can determine which factors have a significant effect on a given set of data and supports additional analysis on methodological factors and on regression analysis.

The Formula of Anova is:

F=MSE / MST

Here:

F = Anova coefficient

MST = Mean sum of squares due to treatment

MSE = Mean sum of squares due to error

Anova is of two types: single factor and two factors, according to variance analysis.

  • In two factors, there are multiple dependent variables and in one factor, there will be one dependent variable.
  • Single-factor Anova calculates the effect of a single factor on a single variable.
  • Single-factor Anova identifies the differences that are statistically significant between the average means of numerous variables.

1.1 Single-Factor Anova Analysis

The sample dataset showcases a group of factors.

Single Factor Anova Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

  • In the Data Analysis window, select Anova: Single Factor.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • In the Anova: Single Factor window, enter data in Input Range.
  • Check Labels in First Row.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check  Labels in the first row (if you select the input data range with the label).
  • Click OK.

  • This is the output..

How to Use Data Analysis Toolpak in Excel

Interpretation of the Result:

  • In the Summary table, you will find each group’s average and variances. The average level is 60.4 for Group A but the variance is 315.15 which is lower than other groups: elements in that group are less valuable.
  • Anova results are not very significant as you are calculating only the variances.
  • P-values interpret the relation between the columns: the values are greater than 0.05, which is not statistically significant.

1.2 Anova: Two Factor with Replication

The dataset showcases data on different exam scores on two shifts in that school.

You want analyze data to find a relation between the shifts and students’ marks.

Anova: Two Factor with Replication

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Anova: Two-Factor With Replication.
  • Click OK.

  • In the new window, enter data in Input Range.
  • Enter 4 in Rows per sample as you have 4 rows per shift.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply (here) or in a new workbook by selecting New Workbook.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • A new worksheet was created.
  • The two-way Anova result is displayed.

Interpretation of the Result:

  • The average score on the Morning shift in Math score is 65.5 but on the Day shift is 83.75
  • In the Chemistry exam, the average score on the Morning shift is 87.25, but on the Day shift is 77.25.
  • Variance is very high at 91 on the morning shift in the Math exam.
  • You will get a complete overview in the summary.

How to Use Data Analysis Toolpak in Excel

You can summarize the interactions and individual effects.

  • The P-value of Columns is 0.037 which is statistically significant:  there is an effect of shifts on the performance of the students in the exam. But the value is close to the alpha value of 0.05 (the effect is less significant).
  • The P-value of interactions is 0.000967 which is much less than the alpha value. It is very significant statistically: the effect of the shift on both exams is very high.

How to Use Data Analysis Toolpak in Excel


1.3 Anova: Two Factor without Replication

You have data on different exam scores on two shifts in that school. T

To find a relation between shifts and students’ marks:

Anova: Two Factor without Replication

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

  • In the Data Analysis window, select “Anova: Two-Factor Without Replication”.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • In the new window, enter data in Input Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply (here) or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Click OK.

  • A new worksheet was created.
  • The two-way Anova result is displayed.

How to Use Data Analysis Toolpak in Excel

Interpretation of the Result:

  • The average score on the Morning shift in Math score is 65.5 but on the Day shift is 83.75
  • In the Chemistry exam, the average score on the Morning shift is 87 but on the Day shift is 77.25.
  • The P-value of Columns is 0.24, which is statistically significant:  there is an effect of shifts on the performance of the students. The value is close to the alpha value of 0.05, so the effect is less significant.

Read More: How to Use Analyze Data in Excel


Feature 2. Correlation Analysis

In statistics, correlation or correlation coefficient is the parameter to show coherence between two variables in response to the continuous fluctuating quantity of another. Its value ranges from -1 to +1. It has three states of defining variable relations:

  • -1 indicates a negative correlation, which means the variables change in the opposite direction.
  • +1 indicates a positive correlation, which means the variables change in the same direction.
  • 0 indicates no correlation, which means no apparent movements in any direction of a variable upon changing other variables’ values.

The dataset showcases two stock prices in different periods of time.

Correlation Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Correlation.
  • Click OK.

  • In the new window, enter data in Input Range.
  • Check  Columns in Grouped By.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels in the first row.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • This is the correlation result.

There is a positive correlation, which means the variables change in the same direction.

Read More: How to Install Data Analysis in Excel


Feature 3 – Covariance Analysis

The covariance of two variables is a measure of how one of them influences the other. It’s a necessary evaluation of the deviation between two variables (they do not have to be dependent on each other).

Covariance Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Covariance.
  • Press Enter.
  • Click OK.

  • In the new window, enter data in Input Range.
  • Check Columns in Grouped By.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels in the first row.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • This is the covariance result.

Explanation of the Result:

The highlighted section in the following image indicates the variance for each subject.

  • The variance of Math is 154.9375 and the variance of Physics is 76.484375. The variance of Chemistry is 154.9375.

How to Use Data Analysis Toolpak in Excel

The highlighted section in the following image indicates the values of variance between the subjects: Math and Physics, Maths and Chemistry, and Physics and History have a variance value of respectively 60.9375, 3.65625, and 8.8125. When the covariance is positive, it indicates that the variables are proportionate: when one increases, the other tends to rise.

Read More: How to Make Histogram Using Analysis ToolPak 


Feature 4 – Descriptive Statistics Analysis

To do a descriptive statistics analysis.:

Descriptive Statistics Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Descriptive Statistics.
  • Click OK.

  • In the new window, enter data in Input Range.
  • Check Columns in Grouped By.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Summary statistics.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The covariance result will be displayed.

  • The result gives the characteristics of two variables, mean, median, standard deviation, and maximum and minimum value of the dataset, which are respectively 77.25, 80, 19.088, 45, and 100.

Feature 5 – Exponential Smoothing Analysis

The dataset contains items sold in different weeks by a manufacturing company. To get an exponential smoothing analysis:

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Exponential Smoothing.
  • Click OK.

  • In the new window, enter data in Input Range.
  • Enter 0.9 in Damping factor. Here, 1-alpha.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Check Chart Output.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The exponential smoothing result and the chart for 0.1 alpha are displayed.

How to Use Data Analysis Toolpak in Excel

Excel can not provide data for the first value in this method. If you use a large damping factor, you will get more smooth peaks and valleys.


Feature 6 – F-Test Two-Sample for Variances Analysis

To do an F-test for two sample variances:

F-Test for Two-Sample Variances Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

 

  • In the Data Analysis window, select F-Test Two-Sample for Variances.
  • Click OK.

  • In the new window, enter data in Input Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The F-test for two sample variances is displayed.

How to Use Data Analysis Toolpak in Excel

Explanation of the Result:

  • The mean value of variable 2 is greater than variable 1.
  • If F is greater than F Critical one-tail, it doesn’t follow the null hypothesis. In the above data, the value of F is .8016 and the value of F Critical one-tail is 0.3145 which indicates F is greater than F critical one-tail:  the variances between the two variables don’t match.

Feature 7 – Moving Average Analysis

The moving average means the period of time of the average is the same, but it keeps moving when new data is added. A moving average smooths irregularities (peaks and valleys) from data to easily recognize trends. The larger the interval period is, the more smoothing occurs, as more data points are included in each calculated average.

The dataset contains items sold in different weeks by a manufacturing company. To do a moving average analysis:

Moving Average Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

  • In the Data Analysis window, select Moving Average.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

In the Moving Average box:

  • Enter data in Input Range.
  • Enter the number of intervals in Interval.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • To see the trendline with a chart, check Chart Output.
  • Check Labels in the first row.
  • Click OK.

  • The moving average and the Excel trendline will be displayed, showing both the original data and the moving average value with smooth fluctuations.

How to Use Data Analysis Toolpak in Excel


Feature 8 – Random Number Generation

To generate random numbers with different criteria:

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

Random Number Generation

  • In the Data Analysis, select Descriptive Statistics.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

In the Random Number Generation box:

  • Enter the Number of Variables (the number of columns of random numbers that you want in your worksheet. Here, 2).
  • Enter the Number of Random Numbers (the number of rows that you want in your worksheet. Here, 7).
  • Select  Uniform in Distribution.
  • In parameters, indicate the boundaries of your distribution. Here 40 to 60.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Click OK.

  • Random numbers will be generated.

How to Use Data Analysis Toolpak in Excel


Feature 9 – Rank and Percentile Analysis

The dataset showcases students’ ID and their Math exam scores. To calculate the rank and percentile of each student’s math exam score:

Rank and Percentile Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Rank and Percentile.
  • Click OK.

In the Rank and Percentile box:

  • Enter data in Input Range.
  • Check Columns in Grouped By section.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels in the first row.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The rank and percentile for each student’s exam score will be displayed.

Read More: [Fixed!] Data Analysis Not Showing in Excel


Feature 10 – Regression Analysis

Regression analysis helps to predict values depending on two or more variables.

The dataset contains the player name, the number of matches played, and the number of goals each player scored.

Regression Analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select Regression.
  • Click OK.

In the Regression box,

  • Enter data in Input Range: the data ranges in Input X Range and Input Y Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Check Residuals.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The result of the regression analysis is displayed.

Explanation of the Regression Analysis Result:

Regression Statistics:

How to Use Data Analysis Toolpak in Excel

Regression statistics is an array of various parameters that describe how well the measured linear regression is.

  • Multiple R is a correlation coefficient parameter that indicates the correlation between variables. Its value ranges from -1 to +1. The bigger the value, the stronger the correlative relationships are.
  • R Square symbolizes the coefficient of determination. It indicates the scale by how well the data model fits the regression analysis.
  • The adjusted R square is used in multiple variables in regression analysis.
  • Standard Error is another parameter that shows a healthy fit of any regression analysis. The smaller the standard error the more accurate the linear regression equation. It shows the average distance of data points from the linear equation.

Anova:

How to Use Data Analysis Toolpak in Excel

It analyses the variance of the data model.

  • Here, df represents the degree of freedom.
  • SS( sum of squares) symbolizes the good-to-fit parameter.
  • MS means the Mean Square.
  • F refers to the Null Hypothesis. It tests the overall significance of the regression model
  • Significance of F means the P-value of F.

Co-efficient Outcome:

It helps to determine Y values.

Residual Output:

How to Use Data Analysis Toolpak in Excel

It compares the estimated value with the calculated value.


Feature 11 – t-Test Analysis

There are three types of T-Test:

  • Paired two samples for means
  • Two samples assuming equal variances
  • Two- samples using unequal variances

 

11.1 t-Test: Paired Two Sample for Means

The dataset contains Students’ IDs and each student’s Math and Physics scores.

To do a t-Test: Paired Two Sample for Means:

t-Test: Paired Two Sample for Means

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

  • In the Data Analysis window, select t-Test: Paired Two Samples for Means.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

In the t-Test: Paired Two Sample for Means box:

  • Enter data in Input: the data ranges in Variable 1 Range and Variable 2 Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Click OK.

  • The result of the t-Test: Paired Two Sample for Means will be displayed.

How to Use Data Analysis Toolpak in Excel

Explanation of the Result:

  • The Mean Value of the Math Score is greater than the mean value of the Physics Score.
  • The variance of the Math Score is also greater than the variance of the Physics Score.
  • If t Stat is greater than t Critical two-tail, you can’t eliminate null hypothesis. In the above calculation, t Stat and t Critical two-tail value are respectively 1.603 and 2.36464. That means 1.603< 2.36464: it doesn’t match the null hypothesis. The variances between the two variables don’t match.

11.2 t-Test Two-Sample Assuming Equal Variances

To do a t-Test: Two-Sample Equal Variances:

t-Test Two-Sample Assuming Equal Variances

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select t-Test: Two-Sample Equal Variances.
  • Click OK.

In the t-Test: Paired Two Sample Equal Variances  box,

  • Enter data in Input: the data ranges in Variable 1 Range and Variable 2 Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Click OK.

  • The t-Test: Two-Sample Equal Variances will be displayed.

How to Use Data Analysis Toolpak in Excel

Explanation of the Result:

  • The Mean Value of the Math Score is greater than the mean value of the Physics Score.
  • The variance of the Math Score is also greater than the variance of the Physics Score.
  • If t Stat is greater than t Critical two-tail, you can’t eliminate the null hypothesis. In the above calculation, t Stat is and t Critical two-tail values are respectively 1.48 and 2.144. 1.48< 2.144: it doesn’t match the null hypothesis. The variances between the two variables don’t match.

11.3 t-Test: Two-Sample Assuming Unequal Variances

To do a t-Test: Two-Sample Unequal Variances:

t-Test: Two-Sample Assuming Unequal Variances

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select t-Test: Two-Sample Unequal Variances.
  • Click OK.

In the t-Test: Paired Two Sample Unequal Variances box,

  • Enter data in Input: the data ranges in Variable 1 Range and Variable 2 Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Check Labels.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The t-Test: Two-Sample Unequal Variances will be displayed.

Explanation of the Result:

  • The Mean Value of the Math Score is greater than the mean value of the Physics Score.
  • The variance of the Math Score is also greater than the variance of the Physics Score.
  • If t Stat is greater than t Critical two-tail, you can’t eliminate null hypothesis. In the above calculation, t Stat and t Critical two-tail values are respectively 79 and 2.1311.79< 2.131: it doesn’t match the null hypothesis. The variances between the two variables don’t match.

Feature 12 – z-Test: Two Sample for Means

Use the VAR.P function to calculate the variance of both variables.

z-Test: Two Sample for Means

Steps:

  • To calculate the variance of the Math score, use the following formula in D14:

=VAR.P(C5:C12)

  • Press Enter.
  • You will see the variance of Match Score.

How to Use Data Analysis Toolpak in Excel

  • To calculate the variance of the Physics score, use the following formula in D15:

=VAR.P(D5:D12)

  • Press Enter.
  • You will see the variance of Physics Score.

  • Go to the Data tab.
  • Select Data Analysis.

How to Use Data Analysis Toolpak in Excel

  • In the Data Analysis window, select z-Test: Two-Sample Means.
  • Click OK.

In the z-Test: Two-Sample Means box:

  • Enter data in Input: the data ranges in Variable 1 Range and Variable 2 Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Enter the value variance of Math and Physics Score in Variable 1 Variance (known) and Variable 2 Variance (known)
  • Check Labels.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

  • The z-Test: Two-Sample Means will be displayed.

How to Use Data Analysis Toolpak in Excel

Explanation of the Result:

  • The Mean Value of the Math Score is greater than the mean value of the Physics Score.
  • The variance of the Math Score is also greater than the variance of the Physics Score.
  • If Z is less than Z critical two-tail, you can’t eliminate the null hypothesis. In the above calculation,  z and z Critical two-tail values are 52 and 1.95. 1.52 < 1.95:  matches the null hypothesis. The variances between the two variables match.

Feature 13. Sampling Analysis

To do a sampling analysis:

Sampling analysis

Steps:

  • Go to the Data tab.
  • Select Data Analysis.

  • In the Data Analysis window, select Sampling.
  • Click OK.

How to Use Data Analysis Toolpak in Excel

In the Sampling box,

  • Enter data in the Input Range.
  • In the Output Range box, enter the data range. You can also see the output in a new worksheet by selecting New Worksheet Ply or in a new workbook by selecting New Workbook.
  • Enter data in Number of Samples.
  • Check Labels.
  • Click OK.

  • The Sampling analysis will be displayed: six samples were taken from the Total Sales column.

How to Use Data Analysis Toolpak in Excel


Download Practice Workbook

Download the practice workbook to exercise.


Related Articles


<< Go Back to Data Analysis with Excel Learn Excel

Get FREE Advanced Excel Exercises with Solutions!
Saquib Ahmad Shuvo
Saquib Ahmad Shuvo

Saquib Ahmad Bhuiyan, holding a BSc degree in Naval Architecture & Marine Engineering from Bangladesh University of Engineering and Technology, skillfully integrates engineering expertise with a passion for Excel. He provides solutions to complex issues as an Excel & VBA Content Developer for ExcelDemy Forum. He demonstrates an enthusiastic mindset and adeptness in managing critical situations with finesse, showcasing his commitment to excellence. Apart from creating Excel tutorials, he is interested in Data Analysis with MS Excel, Rhinoceros,... Read Full Bio

We will be happy to hear your thoughts

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo