Statistical data analysis lets us analyze and interpret data in a scientific way. It helps us find trends in the data and make decisions based on those trends. Most fundamental data analysis operations can be accomplished with Microsoft Excel, which also offers effective tools for statistical analysis. In this article, we are going to learn five simple yet useful methods to statistically analyze data in Excel. So, let’s start and explore these methods.
The above image represents applications of some common statistical functions used in Excel. Later, we’ll discuss in detail.
Download Practice Workbook
Download the following practice workbook and practice along with it.
Introduction to Statistical Data Analysis
Statistical data analysis involves collecting and analyzing a group of data, interpreting it in a way that provides valuable and meaningful information on the behavior and patterns of the data, and ultimately using this information to make informed decisions.
Types:
There are two major types of statistical data analysis. They are described in brief in the following section.
- Descriptive Statistics
Descriptive statistics are utilized to depict the basic features of a dataset in a study. This includes determining measures of variability such as standard deviation, range, and variance, along with measures of central tendency such as mean, median, and mode.
- Inferential Statistics
This is applied to make inferences and predictions about a large population of data based on a sample dataset. Hypothesis testing, regression analysis, ANOVA, etc. are included in this type of statistical data analysis.
Advantages:
- Statistical data analysis helps to identify patterns and trends in the dataset.
- It reduces the chance of making biased decisions, thus making the decision-making process better and more accurate.
- Statistical data analysis also allows us to test hypotheses and make calculated decisions rather than intuitive ones.
How to Statistically Analyze Data in Excel: 5 Useful Ways
In this section of the article, we will learn five easy ways to statistically analyze data in Excel. Each of these five ways will cover numerous Excel functions to statistically analyze data.
1. Calculating Central Tendency and Variability
This is our dataset, which will be used in this method. Here, the marks of 7th grade students are given based on Math, History, Science, and Literature subjects.
Central Tendency and Variability are two useful tools of statistics to summarize a set of data. The Central Tendency refers to the middle position of a dataset. On the other hand, Variability indicates how spread out the data is.
1.1 Using AVERAGE Function
In the above image, you see the Average Marks for Ethan in cell C20.
Here, we have used the AVERAGE function, which returns the arithmetic mean of a set of data. The formula used in cell C20 is:
=AVERAGE(C9:F9)
1.2 Employing AVERAGEIF and AVERAGEIFS Functions
Now, we will learn about the AVERAGEIF and AVERAGEIFS functions of Excel to statistically analyze data. The AVERAGEIF function simply calculates the average of a selected range of values based on a single criterion, whereas the AVERAGEIFS function finds the average based on multiple criteria.
The above image demonstrates the diverse uses of the AVERAGEIF and AVERAGEIFS functions. Let’s explore the facts in detail.
Let’s say, you want to find the average of the marks obtained by the students of Grade 6. To do this, we have applied the following formula in the B22 cell.
=AVERAGEIF(B5:B16,B19,D5:D16)
Let’s assume another situation where we want to find the Average Marks of the students based on two criteria: their Grade and their Section. The formula used in cell C24 is:
=AVERAGEIFS($D$5:$D$16,$B$5:$B$16,B19,$C$5:$C$16,C19)
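For readers who want to cross-check the logic outside Excel, here is a minimal Python sketch of what the two conditional averages compute. The rows and the helper names `average_if` and `average_ifs` are hypothetical and only illustrate the idea; they are not the workbook's data or any Excel API.

```python
from statistics import mean

# Hypothetical rows of (grade, section, marks); not the practice workbook's data.
rows = [
    (6, "A", 80),
    (6, "B", 90),
    (7, "A", 70),
    (6, "A", 60),
]


def average_if(rows, grade):
    """Average the marks of rows that meet one criterion (like AVERAGEIF)."""
    return mean(marks for g, _, marks in rows if g == grade)


def average_ifs(rows, grade, section):
    """Average the marks of rows that meet every criterion (like AVERAGEIFS)."""
    return mean(marks for g, s, marks in rows if g == grade and s == section)


grade6_avg = average_if(rows, 6)          # (80 + 90 + 60) / 3
grade6_a_avg = average_ifs(rows, 6, "A")  # (80 + 60) / 2
print(grade6_avg, grade6_a_avg)
```

Each extra criterion simply narrows the set of rows before the average is taken, which is why AVERAGEIFS accepts its criteria as range and value pairs.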
1.3 Utilizing HARMEAN and GEOMEAN Functions
The HARMEAN function gives us the Harmonic Mean of a selected set of values. The harmonic mean is defined by the reciprocal of the average of the reciprocals of values in a dataset. For instance, let’s say we have six numbers as our data. The numbers are 1,2,3,4,5 and 6. Then our harmonic mean value will be as follows.
Harmonic Mean = 6 / (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6) = 6 / 2.45 ≈ 2.449
Moreover, we will learn the application of the GEOMEAN function in Excel. The GEOMEAN function calculates the Geometric Mean of a selected dataset. The geometric mean is calculated by finding the nth root after multiplying the n values of a dataset. Here, n is the total number of values in a dataset. For instance, let’s say we have 5 numbers as our dataset. These are 1, 2, 3, 4, and 5. So, the Geometric Mean will be,
Geometric Mean = (1 × 2 × 3 × 4 × 5)^(1/5) = 120^(1/5) ≈ 2.605
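The two worked examples above can be verified with a short Python sketch. It uses the standard library's `statistics` module (Python 3.8 or later) purely as an independent check of the arithmetic, not as part of the Excel workflow.

```python
from statistics import geometric_mean, harmonic_mean

# Harmonic mean of 1..6: the count divided by the sum of reciprocals.
hm = harmonic_mean([1, 2, 3, 4, 5, 6])

# Geometric mean of 1..5: the fifth root of the product 1*2*3*4*5 = 120.
gm = geometric_mean([1, 2, 3, 4, 5])

print(round(hm, 3), round(gm, 3))
```

Rounded to three decimals, the results are about 2.449 and 2.605, in line with the hand calculations.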
The above image demonstrates a practical example of using both HARMEAN and GEOMEAN functions. The formula for calculating the harmonic mean in the B21 cell is:
=HARMEAN(C5:C17)
Notice that the harmonic mean is lower than the arithmetic mean of the Average Marks: the arithmetic mean is 82.58, but the harmonic mean is 82.37. The harmonic mean gives less weight to large values, so it is pulled toward the smaller marks.
In the case of finding the geometric mean, we have used the following formula in cell C22.
=GEOMEAN(C5:C17)
Like the harmonic mean, the geometric mean (82.47) is lower than the arithmetic mean (82.58). Investors often use the geometric mean because it gives a more accurate average when values, such as returns, compound across a number of periods.
1.4 Applying STANDARDIZE Function
In this section of the article, we will learn to use the STANDARDIZE function in Excel. The STANDARDIZE function returns the normalized value (z-score) of a number, computed as (x − mean) / standard deviation, so it needs the mean and the standard deviation of the selected numbers.
- Firstly, calculate the Mean and the Standard Deviation of the dataset.
- Following that, insert the following formula in cell D5.
=STANDARDIZE(C5,$C$20,$C$21)
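As a rough illustration of what STANDARDIZE computes, the sketch below applies the z-score formula in Python. The marks are hypothetical, and using the sample standard deviation (the counterpart of STDEV.S) is an assumption, since the article does not say which standard deviation sits in cell C21.

```python
from statistics import mean, stdev


def standardize(x, mean_value, standard_dev):
    """Return the z-score (x - mean) / standard deviation."""
    if standard_dev <= 0:
        # Excel's STANDARDIZE reports an error in this case as well.
        raise ValueError("standard_dev must be positive")
    return (x - mean_value) / standard_dev


marks = [70, 80, 90]   # hypothetical marks, not the workbook's data
mu = mean(marks)       # 80
sigma = stdev(marks)   # sample standard deviation (assumed): 10.0
z_scores = [standardize(x, mu, sigma) for x in marks]
print(z_scores)
```

A z-score of 0 means the mark equals the mean, and positive or negative values show how many standard deviations above or below the mean it lies.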
Similarly, you can use the MODE.SNGL, MEDIAN, VAR.S, VAR.P, STDEV.S, and STDEV.P functions to further statistically analyze data in Excel.
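To see how these descriptive functions differ, the following Python sketch computes a rough counterpart of each one on a small hypothetical dataset. The mapping from Excel names to the `statistics` module is our own illustration; the key point is the sample (.S) versus population (.P) distinction.

```python
from statistics import median, mode, pstdev, pvariance, stdev, variance

marks = [2, 4, 4, 6]  # hypothetical values with mean 4

summary = {
    "MODE.SNGL": mode(marks),     # most frequent value
    "MEDIAN": median(marks),      # middle value
    "VAR.S": variance(marks),     # sample variance, divides by n - 1
    "VAR.P": pvariance(marks),    # population variance, divides by n
    "STDEV.S": stdev(marks),      # square root of the sample variance
    "STDEV.P": pstdev(marks),     # square root of the population variance
}
for name, value in summary.items():
    print(name, value)
```

The sample versions are always a little larger than the population versions for the same data, because they divide the squared deviations by n − 1 instead of n.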
2. Computing Relative Standing
Computing relative standing is another important aspect of statistically analyzing data in Excel. In this section of the article, we will use various functions of Excel to do this. Firstly, let’s discuss the RANK.EQ and RANK.AVG functions.
The RANK.EQ function returns the relative ranks for a set of data. The criterion that separates the RANK.EQ function from the RANK.AVG function is the return value in the case of ties. In other words, if two ranks are tied, then the RANK.EQ function will return the top rank for both values.
For instance, let’s say both the 5th and 6th-ranked values are the same. In that case, the RANK.EQ function will return rank 5 for both of those values, and the next rank value will be rank 7. Here, we have the Total Marks of 7th Grade Students as our dataset.
Here, you can see that the 10th and the 11th values were tied. So, the RANK.EQ function returned rank 10 for both values. We have applied the following formula in cell D5.
=RANK.EQ(C5,$C$5:$C$17,0)
Now, let’s learn about the RANK.AVG function. The RANK.AVG function also returns the relative ranks of a dataset. But, in the case of ties, it returns an average rank for the tied values. For example, let’s say the 4th and 5th ranked values are tied. Then the RANK.AVG function will return a rank of 4.5 for both values, and the rank of the next value will be 6. In our dataset, the 10th and the 11th values were tied, so the RANK.AVG function returned an average rank of 10.5 for both values. We have used the following formula in cell E5.
=RANK.AVG(C5,$C$5:$C$17,0)
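The difference between the two tie rules can be sketched in a few lines of Python. The functions `rank_eq` and `rank_avg` and the totals are hypothetical stand-ins that mimic descending ranking (the 0 in the formulas above); they are not Excel's implementation.

```python
def rank_eq(value, values):
    """Descending rank; tied values all share the top rank of the tie."""
    return 1 + sum(v > value for v in values)


def rank_avg(value, values):
    """Descending rank; tied values share the average of the ranks they occupy."""
    higher = sum(v > value for v in values)
    ties = sum(v == value for v in values)
    return higher + (ties + 1) / 2


totals = [95, 90, 90, 85]  # hypothetical totals with one tie
eq_ranks = [rank_eq(t, totals) for t in totals]
avg_ranks = [rank_avg(t, totals) for t in totals]
print(eq_ranks)   # [1, 2, 2, 4]
print(avg_ranks)  # [1.0, 2.5, 2.5, 4.0]
```

In both cases the value after the tie gets rank 4, because the tied pair still occupies ranks 2 and 3.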
Furthermore, you can also use the PERCENTRANK.INC, PERCENTRANK.EXC, PERCENTILE.INC, PERCENTILE.EXC, QUARTILE.INC and QUARTILE.EXC functions to compute the relative standing of data in Excel.
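The .INC and .EXC suffixes refer to two interpolation conventions. As a hedged illustration, Python's `statistics.quantiles` offers `inclusive` and `exclusive` methods that, to our understanding, follow the same two conventions; the data below is hypothetical.

```python
from statistics import quantiles

marks = [1, 2, 3, 4, 5]  # hypothetical values

# Inclusive: treats the data's minimum and maximum as the 0th and 100th percentiles.
quartiles_inc = quantiles(marks, n=4, method="inclusive")

# Exclusive: assumes the population may extend beyond the sample's extremes.
quartiles_exc = quantiles(marks, n=4, method="exclusive")

print(quartiles_inc)  # [2.0, 3.0, 4.0]
print(quartiles_exc)  # [1.5, 3.0, 4.5]
```

The median is the same under both methods here; only the first and third quartiles move, and the gap shrinks as the dataset grows.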
3. Determining Correlation and Regression
Finding the correlation and regression is a significant aspect of statistical analysis in Excel. In this section of the article, we will learn about determining the correlation and regression for a set of data in Excel.
3.1 Using SLOPE, INTERCEPT, and STEYX Functions
You can simply use the SLOPE and INTERCEPT functions to find the slope and intercept of the regression line from a set of X Values and Y Values, as shown in the image below. The STEYX function gives us the standard error of the predicted Y Value for each X Value in the regression. It tells us how far the actual Y Values typically fall from the regression line, and so how much to trust a prediction made from an X Value. Now, let’s use the procedure discussed in the following section.
Here, we use the formula given below in cell C21.
=STEYX(C5:C17,B5:B17)
Then, press ENTER.
Consequently, you will have the Standard Error of the Y Values for given X Values in cell C21.
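To make the three quantities concrete, here is a minimal Python sketch of a least-squares line fit and its standard error. The helper names `linear_fit` and `steyx` and the data points are hypothetical; the argument order (Y values first, then X values) mirrors the formula above.

```python
from math import sqrt
from statistics import mean


def linear_fit(ys, xs):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def steyx(ys, xs):
    """Standard error of the predicted y: sqrt(sum of squared residuals / (n - 2))."""
    slope, intercept = linear_fit(ys, xs)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (len(xs) - 2))


xs = [1, 2, 3, 4, 5]  # hypothetical X values
ys = [2, 4, 5, 4, 5]  # hypothetical Y values
slope, intercept = linear_fit(ys, xs)
std_error = steyx(ys, xs)
print(slope, intercept, std_error)
```

For this data the fitted line is roughly y = 2.2 + 0.6x, and the standard error of about 0.89 is the typical distance of a point from that line.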
3.2 Applying CORREL Function
The CORREL function returns the correlation coefficient of two sets of data, a value between -1 and 1 that shows how closely the two sets are linearly related. You can follow the steps mentioned in this article to use the CORREL function. In our dataset, we have used the following formula in cell C19.
=CORREL(C5:C17,D5:D17)
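The correlation coefficient itself is straightforward to compute by hand. The sketch below implements the Pearson formula in Python with a hypothetical helper named `correl` and made-up data, only to show what the number represents.

```python
from math import sqrt
from statistics import mean


def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r_loose = correl([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # positive but imperfect relation
r_exact = correl([1, 2, 3], [2, 4, 6])              # perfectly linear relation
print(round(r_loose, 4), round(r_exact, 4))
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.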
4. Applying Array Functions for Statistical Analysis
Here, we have the frequencies against each Bin Range as demonstrated in the following picture. We have used the FREQUENCY function, one of the most commonly used array functions, to statistically analyze data in Excel. The formula in cell C17 is:
=FREQUENCY(C5:C14,B17:B23)
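The counting rule behind FREQUENCY can be sketched in Python as follows. The function name `frequency`, the data, and the bins are hypothetical, and the sketch assumes the bin limits are listed in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count values per bin, assuming `bins` holds ascending upper limits.

    A value falls in the first bin whose upper limit is >= the value.
    The extra last slot counts values above the highest limit.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts


marks = [55, 60, 61, 75, 90, 101]  # hypothetical data
bin_limits = [60, 80, 100]         # hypothetical upper limits
counts = frequency(marks, bin_limits)
print(counts)  # [2, 2, 1, 1]
```

Note that the result has one more entry than there are bins; Excel's FREQUENCY likewise returns an extra element for values above the highest bin.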
Furthermore, you can use the MODE.MULT function, LINEST function, TREND function, and GROWTH function to statistically analyze data in Excel.
Note: If you are using an older version of Excel, you might need to press CTRL + SHIFT + ENTER to use the array formulas. As we are using Excel 365, simply pressing ENTER will do for us.
5. Utilizing Data Analysis ToolPak to Calculate Moving Average
The above image represents the Moving Average of our dataset. In this section, we have discussed the way of finding the Moving Average using Data Analysis ToolPak, one of the most useful tools in Excel when it comes to statistically analyzing data.
However, the Data Analysis ToolPak option is not available in the Excel Ribbon by default. You will need to manually activate this feature. You can follow this article to activate the Data Analysis ToolPak and also learn about its various uses.
Now go to the Data tab from Ribbon >> choose the Data Analysis option from the Analysis group.
As a result, the Data Analysis dialogue box will appear on your worksheet as shown in the above image. Select Moving Average from the list of tools and click OK.
After that, in the Moving Average dialogue box, go to the Input Range field and select the cells of the Average Marks column >> click on the Output Range field and select cell G5 >> click OK.
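For intuition about what the tool produces, here is a minimal Python sketch of a trailing simple moving average. The function name, the interval of 3, and the values are hypothetical; `None` stands in for the cells that have too few earlier points, which the ToolPak typically shows as #N/A.

```python
def moving_average(values, interval):
    """Trailing simple moving average over the last `interval` values.

    Positions with fewer than `interval` values available are None.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < interval:
            result.append(None)
        else:
            window = values[i + 1 - interval : i + 1]
            result.append(sum(window) / interval)
    return result


average_marks = [10, 20, 30, 40, 50]  # hypothetical series
smoothed = moving_average(average_marks, 3)
print(smoothed)  # [None, None, 20.0, 30.0, 40.0]
```

A larger interval smooths the series more strongly but leaves more empty cells at the start.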
Some Common Data Analysis Tools in Excel
- Anova: Single Factor → It performs an analysis of variance on data for two or more samples.
- Anova: Two Factor with Replication → For each combination of the levels of the variables, it creates an analysis of variance with two independent variables and various observations.
- Anova: Two Factor Without Replication → For each combination of the levels of the variables, it creates an analysis of variance with two independent variables and a single observation.
- Correlation → When there are more than two measurements on a sample of people, it computes a matrix of correlation coefficients for each possible pair of measurements.
- Covariance → When there are more than two measurements on a sample of people, it computes a matrix of covariance coefficients for each possible pair of measurements.
- Descriptive Statistics → It produces a report containing a summary of the central tendency, variability, and other properties of values within a defined range of cells.
- Exponential Smoothing → It makes a prediction about the next value of a sequence, using the sequence of the previous values and previous predictions that were made.
- F-Test Two Sample for Variances → It compares two variances by performing an F-Test.
- Histogram → It builds a tabular depiction of the frequency distribution of values within a chosen cell range.
- Random Number Generation → It generates a specified quantity of random numbers based on one of seven available distributions.
- Rank and Percentile → It creates a table that displays each value in a set of values along with its ordinal rank and percentile rank.
- Regression → It creates a report of linear regression statistics from a set of data that includes one dependent variable and one or more independent variables.
- Sampling → It generates a sample of values from the cells in the specified range.
These are some of the analysis tools you will find in the Data Analysis ToolPak.
Frequently Asked Questions (FAQs)
- How do I perform a t-test in Excel?
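To perform a t-test in Excel, you can use the T.TEST function. This function takes four arguments: the two data sets to compare (array1 and array2), the number of tails (1 or 2), and the type of t-test you want to perform (1 for paired, 2 for two-sample equal variance, 3 for two-sample unequal variance).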
- How do you tell if data is statistically significant in Excel?
Your result is statistically significant if the p-value falls below your significance level. In Excel, both one-tailed and two-tailed t-tests return p-values. A one-tailed t-test can detect a difference between means in only one direction.
- How do I perform an ANOVA in Excel?
To perform an ANOVA in Excel, you can use the Analysis ToolPak. Using this data analysis tool, you can accomplish either a one-way ANOVA or a two-way ANOVA, whichever you need.
- How do I perform a regression analysis with multiple independent variables in Excel?
To perform a regression analysis with multiple independent variables in Excel, you can use the Regression tool in the Analysis ToolPak add-in. This tool lets you specify the dependent variable and multiple independent variables, and its report includes the coefficients and their p-values.
Things to Remember
- Before performing any data analysis in Excel, you have to be clear about your data type, e.g. continuous or categorical.
- Next, you must pick the appropriate statistical analysis tool for your question, e.g. t-test, ANOVA, regression, or correlation.
- Once you’ve conducted your analysis, it’s important to interpret your results in a meaningful way. This means understanding what the numbers mean and how they relate to your research question.
- Finally, it’s important to validate your results by checking for errors and ensuring that your analysis is robust. This includes checking for outliers, testing assumptions, and conducting sensitivity analyses.
Practice Section
In the Excel Workbook, we have provided a Practice Section on the right side of the worksheet. Please practice it yourself.
Sample Practice Section provided in each worksheet of the Practice Workbook.
Conclusion
So, we have come to the very end of the article. Here, we have tried to explain some useful methods to statistically analyze data in Excel. In this article, you will find a lot of associated links to different functions. We highly recommend you visit those links for a better understanding of statistically analyzing data. If you have any queries or remarks about the article, please let us know in the comment section. You can also have a look at our other helpful articles on Excel functions and formulas on our website, ExcelDemy, a one-stop Excel solution provider.