How to Statistically Analyze Data in Excel (5 Useful Ways)

Statistical data analysis is a common way to analyze and interpret data scientifically. It helps us find trends in data and make decisions based on those trends. Most fundamental data analysis operations, including statistical analysis, can be accomplished in Microsoft Excel. In this article, we are going to learn five simple yet useful methods to statistically analyze data in Excel. So, let’s start and explore these methods.

How to statistically analyze data in excel

The above image shows some common statistical functions used in Excel. We’ll discuss them in detail later.


Download Practice Workbook

Download the following practice workbook and practice along with it.


Introduction to Statistical Data Analysis

Statistical data analysis involves collecting and analyzing a group of data, interpreting it in a way that provides valuable and meaningful information on the behavior and patterns of the data, and ultimately using this information to make informed decisions.

Types:

There are two major types of statistical data analysis. They are described in brief in the following section.

  • Descriptive Statistics

Descriptive statistics are utilized to depict the basic features of a dataset in a study. This includes determining measures of variability such as standard deviation, range, and variance, along with measures of central tendency such as mean, median, and mode.

  • Inferential Statistics

This is applied to make inferences and predictions about a large population of data based on a sample dataset. Hypothesis testing, regression analysis, ANOVA, etc. are included in this type of statistical data analysis.

Advantages:

  • Statistical data analysis helps to identify patterns and trends in the dataset.
  • It reduces the chance of making biased decisions, which makes the decision-making process better and more accurate.
  • Statistical data analysis also allows us to test hypotheses and make calculated decisions rather than intuitive ones.

How to Statistically Analyze Data in Excel: 5 Useful Ways

In this section of the article, we will learn five easy ways to statistically analyze data in Excel. Each of these five ways will cover numerous Excel functions to statistically analyze data.


1. Calculating Central Tendency and Variability

Marks of 7th Grade students, a sample dataset to show statistical analysis of data in Excel

This is our dataset, which will be used in this method. Here, the marks of 7th grade students are given based on Math, History, Science, and Literature subjects.

Central Tendency and Variability are two useful tools of statistics to summarize a set of data. The Central Tendency refers to the middle position of a dataset. On the other hand, Variability indicates how spread out the data is.


1.1 Using AVERAGE Function

Using AVERAGE function to get the average marks

In the above image, you see the Average Marks for Ethan in cell C20.

Here, we have used the AVERAGE function, which returns the arithmetic mean of a set of data. The formula used in cell C20 is:

=AVERAGE(C9:F9)

Here, the range of cells C9:F9 refers to the marks of various subjects for Ethan.

1.2 Employing AVERAGEIF and AVERAGEIFS Functions

Now, we will learn about the AVERAGEIF and AVERAGEIFS functions of Excel to statistically analyze data. The AVERAGEIF function simply calculates the average of a selected range of values based on a single criterion, whereas the AVERAGEIFS function finds the average based on multiple criteria.

Using AVERAGEIF and AVERAGEIFS functions to get the average marks based on single or multiple criteria

The above image demonstrates the diverse uses of the AVERAGEIF and AVERAGEIFS functions. Let’s explore the facts in detail.

Let’s say, you want to find the average of the marks obtained by the students of Grade 6. To do this, we have applied the following formula in the B22 cell.

=AVERAGEIF(B5:B16,B19,D5:D16)

Here, cell B19 indicates the criteria based on which we will find the Average Marks.

Let’s assume another situation where we want to find the Average Marks of the students based on two criteria: their Grade and Section. The formula used in cell C24 is:

=AVERAGEIFS($D$5:$D$16,$B$5:$B$16,B19,$C$5:$C$16,C19)

Here, the range of cells $D$5:$D$16 indicates the cells of the Marks column, the range of cells $B$5:$B$16 refers to the cells of the Grade column, and the range of cells $C$5:$C$16 indicates the cells of the Section column. Cells B19 and C19 represent the two criteria.
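To make the filtering logic of the two functions concrete, here is a minimal Python sketch of what AVERAGEIF and AVERAGEIFS compute. The rows and the helper names `average_if` and `average_ifs` are hypothetical illustrations, not the data or formulas of the practice workbook.

```python
# Hypothetical rows of (grade, section, marks); not the workbook's data.
rows = [
    (6, "A", 80),
    (6, "B", 60),
    (7, "A", 70),
    (6, "A", 100),
]


def average_if(rows, grade):
    """Mean of marks where the grade matches (one criterion, like AVERAGEIF)."""
    selected = [marks for g, _, marks in rows if g == grade]
    return sum(selected) / len(selected)


def average_ifs(rows, grade, section):
    """Mean of marks where grade AND section match (like AVERAGEIFS)."""
    selected = [marks for g, s, marks in rows if g == grade and s == section]
    return sum(selected) / len(selected)


print(average_if(rows, 6))        # 80.0
print(average_ifs(rows, 6, "A"))  # 90.0
```

When no row matches the criteria, this sketch raises a ZeroDivisionError, which is comparable to the #DIV/0! error Excel shows in the same situation.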

1.3 Utilizing HARMEAN and GEOMEAN Functions

The HARMEAN function gives us the Harmonic Mean of a selected set of values. The harmonic mean is defined as the reciprocal of the average of the reciprocals of the values in a dataset. For instance, let’s say we have six numbers as our data: 1, 2, 3, 4, 5, and 6. Then our harmonic mean value will be as follows.

$$\text{Harmonic Mean} = \frac{1}{\dfrac{1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6}}{6}} \approx 2.449$$

Moreover, we will learn the application of the GEOMEAN function in Excel. The GEOMEAN function calculates the Geometric Mean of a selected dataset. The geometric mean is calculated by finding the nth root after multiplying the n values of a dataset. Here, n is the total number of values in a dataset. For instance, let’s say we have 5 numbers as our dataset. These are 1, 2, 3, 4, and 5. So, the Geometric Mean will be,

$$\text{Geometric Mean} = \sqrt[5]{1 \times 2 \times 3 \times 4 \times 5} \approx 2.605$$
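The two worked examples above can be checked outside Excel. The sketch below uses the `harmonic_mean` and `geometric_mean` functions from Python's standard `statistics` module (Python 3.8 or later) on the same numbers.

```python
from statistics import geometric_mean, harmonic_mean

# Same numbers as the two worked examples in the text.
hm = harmonic_mean([1, 2, 3, 4, 5, 6])
gm = geometric_mean([1, 2, 3, 4, 5])

print(round(hm, 4))  # 2.449
print(round(gm, 4))  # 2.6052
```

For positive values, the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean.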

Calculating Harmonic mean and Geometric mean using Excel functions

The above image demonstrates a practical example of using both HARMEAN and GEOMEAN functions. The formula for calculating the harmonic mean in the B21 cell is:

=HARMEAN(C5:C17)

Here, the range of cells C5:C17 represents the cells of the Average Marks column.

Notice that the harmonic mean in this case is lower than the arithmetic mean of the Average Marks: the arithmetic mean is 82.58, but the harmonic mean is 82.37. This is because the harmonic mean gives less weight to the large values of Average Marks.

In the case of finding the geometric mean, we have used the following formula in cell C22.

=GEOMEAN(C5:C17)

Like the harmonic mean, the geometric mean (82.47) is lower than the arithmetic mean (82.58). Investors often use the geometric mean because it gives a more representative average when values compound across a number of periods.


1.4 Applying STANDARDIZE Function

In this section of the article, we will learn to use the STANDARDIZE function in Excel. The STANDARDIZE function returns the normalized value (z-score) of a number, that is, (x - mean) / standard deviation. It tells us how many standard deviations the number lies above or below the mean.

Calculating average and standard deviation

Using STANDARDIZE function to find the standardized values

  • After calculating the mean in cell C20 and the standard deviation in cell C21, insert the following formula in cell D5.

=STANDARDIZE(C5,$C$20,$C$21)

Here, cell C5 represents the Average Marks of Mathew, cell $C$20 holds the Mean value, and cell $C$21 holds the Standard Deviation calculated with STDEV.P.
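As a rough illustration of what STANDARDIZE computes, the following Python sketch derives z-scores from a mean and a population standard deviation. The marks are hypothetical, and `standardize` is an illustrative helper rather than an Excel or library function.

```python
from statistics import mean, pstdev

# Hypothetical average marks; not the workbook's values.
marks = [70, 80, 90]

mu = mean(marks)       # plays the role of cell C20 (AVERAGE)
sigma = pstdev(marks)  # plays the role of cell C21 (STDEV.P)


def standardize(x, mean_value, standard_dev):
    """z-score: how many standard deviations x lies from the mean."""
    return (x - mean_value) / standard_dev


z_scores = [standardize(x, mu, sigma) for x in marks]
print([round(z, 4) for z in z_scores])  # [-1.2247, 0.0, 1.2247]
```

A value equal to the mean always gets a z-score of 0, and values on either side get negative or positive scores.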

Similarly, you can use the MODE.SNGL, MEDIAN, VAR.S, VAR.P, STDEV.S, and STDEV.P functions to further statistically analyze data in Excel.
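For readers who want to cross-check these functions outside Excel, Python's standard `statistics` module offers close counterparts. The mapping in the comments is our own reading of the two toolsets, and the dataset is hypothetical.

```python
from statistics import median, mode, pstdev, pvariance, stdev, variance

# Hypothetical dataset chosen so the population results are round numbers.
data = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mode": mode(data),                      # compare with MODE.SNGL
    "median": median(data),                  # compare with MEDIAN
    "sample variance": variance(data),       # compare with VAR.S (divides by n - 1)
    "population variance": pvariance(data),  # compare with VAR.P (divides by n)
    "sample std dev": stdev(data),           # compare with STDEV.S
    "population std dev": pstdev(data),      # compare with STDEV.P
}

for name, value in summary.items():
    print(name, round(value, 4))
```

The sample versions divide by n - 1 and are the usual choice when the data is a sample of a larger population, while the population versions divide by n.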


2. Computing Relative Standing

Computing relative standing is another important aspect of statistically analyzing data in Excel. In this section of the article, we will use various functions of Excel to do this. Firstly, let’s discuss the RANK.EQ and RANK.AVG functions.

The RANK.EQ function returns the rank of a value within a set of data. What separates RANK.EQ from RANK.AVG is the value returned for ties: if two values are tied, RANK.EQ returns the top rank for both of them.

For instance, let’s say both the 5th and 6th-ranked values are the same. In that case, the RANK.EQ function will return rank 5 for both of those values, and the next rank value will be rank 7. Here, we have the Total Marks of 7th Grade Students as our dataset.

Applying RANK.EQ function to compute relative standing

Here, you can see that the 10th and the 11th values were tied. So, the RANK.EQ function returned rank 10 for both values. We have applied the following formula in cell D5.

=RANK.EQ(C5,$C$5:$C$17,0)

Here, cell C5 refers to the first cell of the Total Marks column, the range of cells $C$5:$C$17 represents the cells of the Total Marks column, and the third argument 0 ranks the values in descending order.

Using RANK.AVG function

Now, let’s learn about the RANK.AVG function. The RANK.AVG function also returns the relative ranks of a dataset. But, in the case of ties, it returns the average rank for the tied values. For example, let’s say the 4th and 5th ranked values are tied. Then the RANK.AVG function will return a rank of 4.5 for both values, and the rank of the next value will be 6.

In our dataset, the 10th and the 11th values were tied. So, the RANK.AVG function returned an average rank of 10.5 for both values. We have used the following formula in cell E5.

=RANK.AVG(C5,$C$5:$C$17,0)
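The difference between the two tie rules is easy to see in a small Python sketch. The totals are hypothetical, and `rank_eq` and `rank_avg` are illustrative helpers that mimic descending ranking (third argument 0), not Excel functions.

```python
def rank_eq(value, values):
    """Descending rank; tied values all receive the top (smallest) rank."""
    return 1 + sum(1 for v in values if v > value)


def rank_avg(value, values):
    """Descending rank; tied values receive the average of their ranks."""
    higher = sum(1 for v in values if v > value)
    ties = sum(1 for v in values if v == value)
    return higher + (ties + 1) / 2


totals = [95, 90, 90, 85]  # hypothetical total marks with one tie

print([rank_eq(t, totals) for t in totals])   # [1, 2, 2, 4]
print([rank_avg(t, totals) for t in totals])  # [1.0, 2.5, 2.5, 4.0]
```

In both cases the value after the tie gets rank 4, so rank 3 is skipped.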

Furthermore, you can also use the PERCENTRANK.INC, PERCENTRANK.EXC, PERCENTILE.INC, PERCENTILE.EXC, QUARTILE.INC and QUARTILE.EXC functions to compute the relative standing of data in Excel.
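As a hedged illustration of the inclusive percentile idea, the sketch below interpolates linearly between sorted values at position (n - 1) * k, which is our understanding of how PERCENTILE.INC behaves. The helper `percentile_inc` and the scores are hypothetical.

```python
def percentile_inc(values, k):
    """k-th percentile (0 <= k <= 1) by linear interpolation, inclusive method."""
    data = sorted(values)
    position = (len(data) - 1) * k  # fractional index into the sorted data
    lower = int(position)
    fraction = position - lower
    if lower + 1 < len(data):
        return data[lower] + fraction * (data[lower + 1] - data[lower])
    return data[lower]


scores = [50, 10, 40, 20, 30]  # hypothetical marks in arbitrary order

print(percentile_inc(scores, 0.25))  # 20.0 (first quartile)
print(percentile_inc(scores, 0.5))   # 30.0 (median)
```

Passing k = 0.25, 0.5, and 0.75 gives the three quartile cut points, which is the role QUARTILE.INC plays in Excel.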


3. Determining Correlation and Regression

Finding the correlation and regression is a significant aspect of statistical analysis in Excel. In this section of the article, we will learn about determining the correlation and regression for a set of data in Excel.


3.1 Using SLOPE, INTERCEPT, and STEYX Functions

You can use the SLOPE and INTERCEPT functions to find the slope and intercept of the regression line from a set of X Values and Y Values, as shown in the image below. The STEYX function returns the standard error of the predicted Y Value for each X Value in the regression. It measures how much error to expect when we predict a Y Value from an X Value.

Using SLOPE, INTERCEPT, and STEYX Functions to find slope and intercept value

Here, we use the formula given below in cell C21.

=STEYX(C5:C17,B5:B17)

Here, the range of cells C5:C17 indicates the cells of the Y Values column and the range of cells B5:B17 refers to the cells of the X Values column.

Then, press ENTER.

Consequently, you will have the Standard Error of the Y Values for given X Values in cell C21.
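To show what these three functions calculate, here is a minimal Python sketch of a least-squares line and its standard error. The X and Y values are hypothetical, and `linear_fit` is an illustrative helper that assumes at least three data points.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (slope, intercept, standard error of predicted y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # compare with SLOPE
    intercept = mean_y - slope * mean_x    # compare with INTERCEPT
    # Residual sum of squares divided by n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    steyx = sqrt(sum(r ** 2 for r in residuals) / (n - 2))  # compare with STEYX
    return slope, intercept, steyx


xs = [1, 2, 3, 4]  # hypothetical X Values
ys = [2, 4, 5, 8]  # hypothetical Y Values

slope, intercept, steyx = linear_fit(xs, ys)
print(round(slope, 4), round(intercept, 4), round(steyx, 4))
```

A smaller standard error means the points lie closer to the fitted line, so predictions from it are more reliable.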


3.2 Applying CORREL Function

Applying CORREL Function to find how closely two sets of data are related to each other

The CORREL function returns the correlation coefficient of two sets of data, which tells us how closely they are related to each other. The result lies between -1 and 1. In our dataset, we have used the following formula in cell C19.

=CORREL(C5:C17,D5:D17)

Here, the range C5:C17 indicates the cells of the Total Marks column, and the range D5:D17 refers to the cells of the Average Marks column.
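The Pearson correlation coefficient that CORREL returns can be sketched in a few lines of Python. The totals below are hypothetical, and the averages are derived by assuming four subjects per student, which is our assumption rather than something stated for this worksheet.

```python
from math import sqrt


def correl(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


totals = [320, 360, 280, 344]        # hypothetical Total Marks
averages = [t / 4 for t in totals]   # assumed: four subjects per student

print(round(correl(totals, averages), 4))   # 1.0
print(round(correl([1, 2, 3], [3, 2, 1]), 4))  # -1.0
```

Because each average here is just the total divided by a constant, the two columns move together perfectly and the coefficient is 1. A coefficient near 0 would indicate little linear relationship.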

4. Applying Array Functions for Statistical Analysis

Applying Array Functions for Statistical Analysis

Here, we have the frequencies against each Bin Range as demonstrated in the following picture. We have used the FREQUENCY function, one of the most commonly used array functions, to statistically analyze data in Excel. The formula in cell C17 is:

=FREQUENCY(C5:C14,B17:B23)

Here, the range C5:C14 is the data array whose values are counted, and the range B17:B23 represents the first seven cells of the Bin Ranges column.

Furthermore, you can use the MODE.MULT function, LINEST function, TREND function, and GROWTH function to statistically analyze data in Excel.

Note: If you are using an older version of Excel, you might need to press CTRL + SHIFT + ENTER to use the array formulas. As we are using Excel 365, simply pressing ENTER will do for us.
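The binning rule behind FREQUENCY can be sketched in Python as follows. Each bin counts the values greater than the previous bin limit and up to its own limit, and one extra slot counts values above the last limit. The helper `frequency` and the sample data are hypothetical, and the bins are assumed to be sorted in ascending order.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Counts per bin, where bin i holds values v with bins[i-1] < v <= bins[i].

    The returned list has one more entry than `bins`; the last entry counts
    values greater than the final bin limit.
    """
    counts = [0] * (len(bins) + 1)
    for value in data:
        counts[bisect_left(bins, value)] += 1
    return counts


marks = [55, 60, 61, 75, 90, 100]  # hypothetical data values
bins = [60, 70, 80]                # hypothetical upper limits, ascending

print(frequency(marks, bins))  # [2, 1, 1, 2]
```

Note that a value equal to a bin limit, such as 60 here, is counted in that bin and not in the next one.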


5. Utilizing Data Analysis ToolPak to Calculate Moving Average

Utilizing Data Analysis ToolPak to Calculate Moving Average

The above image represents the Moving Average of our dataset. In this section, we have discussed the way of finding the Moving Average using Data Analysis ToolPak, one of the most useful tools in Excel when it comes to statistically analyzing data.

However, the Data Analysis ToolPak option is not available in the Excel Ribbon by default. You will need to activate this feature manually. To do so, go to File >> Options >> Add-ins >> choose Excel Add-ins in the Manage box >> click Go >> check Analysis ToolPak >> click OK.

Now go to the Data tab from Ribbon >> choose the Data Analysis option from the Analysis group.

Moving average window

As a result, the Data Analysis dialogue box will appear on your worksheet as shown in the above image.

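After that, select Moving Average from the list of analysis tools and click OK. In the Moving Average dialog box, go to the Input Range field to select the cells of the Average Marks column >> click on the Output Range field and select cell G5 >> click OK.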
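The calculation behind the tool can be sketched in Python as a trailing moving average. This sketch assumes an interval of 3, and the helper `moving_average` and the marks are hypothetical. Positions without enough earlier values are returned as None, similar to the #N/A cells the tool leaves at the top of its output.

```python
def moving_average(values, interval=3):
    """Trailing moving average; None where fewer than `interval` values exist."""
    result = []
    for i in range(len(values)):
        if i + 1 < interval:
            result.append(None)  # not enough data points yet
        else:
            window = values[i + 1 - interval : i + 1]
            result.append(sum(window) / interval)
    return result


marks = [10, 20, 30, 40, 50]  # hypothetical Average Marks

print(moving_average(marks))  # [None, None, 20.0, 30.0, 40.0]
```

A larger interval smooths the series more but leaves more empty positions at the start.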


Some Common Data Analysis Tools in Excel

  • Anova: Single Factor → It performs an analysis of variance on two or more samples.
  • Anova: Two Factor with Replication → For each combination of the levels of the variables, it creates an analysis of variance with two independent variables and multiple observations.
  • Anova: Two Factor Without Replication → For each combination of the levels of the variables, it creates an analysis of variance with two independent variables and a single observation.
  • Correlation → When there are more than two measurements on a sample of people, it computes a matrix of correlation coefficients for each possible pair of measurements.
  • Covariance → When there are more than two measurements on a sample of people, it computes a matrix of covariances for each possible pair of measurements.
  • Descriptive Statistics → It produces a report containing a summary of the central tendency, variability, and other properties of values within a defined range of cells.
  • Exponential Smoothing → It makes a prediction about the next value of a sequence, using the sequence of the previous values and previous predictions that were made.
  • F-Test Two Sample for Variances → It compares two variances by performing an F-Test.
  • Histogram → It builds a tabular depiction of the frequency distribution of values within a chosen cell range.
  • Random Number Generation → It generates a specified quantity of random numbers based on one of seven available distributions.
  • Rank and Percentile → It creates a table that displays each value in a set of values along with its ordinal rank and percentile rank.
  • Regression → It creates a report of the linear regression statistics for a set of data that includes one dependent variable and one or more independent variables.
  • Sampling → It generates a sample of values from the cells in the specified range.

You’ll get the following analysis tools in the Data Analysis ToolPak.

Data Analysis window


Frequently Asked Questions (FAQs)

  • How do I perform a t-test in Excel?

To perform a t-test in Excel, you can use the T.TEST function. This function takes four arguments: the two data sets to compare, the number of distribution tails (1 or 2), and the type of t-test you want to perform. It returns the p-value of the test.

  • How do you tell if data is statistically significant in Excel?

Your result is statistically significant if the p-value falls below your significance level. In Excel, both one-tailed and two-tailed t-tests return p-values. A one-tailed t-test can only detect a difference between means in one direction.

  • How do I perform an ANOVA in Excel?

To perform an ANOVA in Excel, you can use the Analysis ToolPak. Using this data analysis tool, you can accomplish either a one-way ANOVA or a two-way ANOVA, whichever you need.

  • How do I perform a regression analysis with multiple independent variables in Excel?

To perform a regression analysis with multiple independent variables in Excel, you can use the Regression tool in the Analysis ToolPak add-in. This tool allows you to specify the dependent variable and multiple independent variables, and its output includes the coefficients and p-values.


Things to Remember

  • Before performing any data analysis in Excel, you have to be clear about your data type, e.g. continuous or categorical.
  • Next, you must pick the appropriate statistical analysis tool for your data, e.g. t-test, ANOVA, regression, or correlation.
  • Once you’ve conducted your analysis, it’s important to interpret your results in a meaningful way. This means understanding what the numbers mean and how they relate to your research question.
  • Finally, it’s important to validate your results by checking for errors and ensuring that your analysis is robust. This includes checking for outliers, testing assumptions, and conducting sensitivity analyses.

Practice Section

In the Excel Workbook, we have provided a Practice Section on the right side of the worksheet. Please practice it yourself.

Sample Practice Section provided in each worksheet of the Practice Workbook.

Excel data for practice


Conclusion

So, we have come to the very end of the article. Here, we have tried to explain some useful methods to statistically analyze data in Excel. In this article, you will find a lot of associated links to different functions. We highly recommend you visit those links for a better understanding of statistically analyzing data. If you have any queries or remarks about the article, please let us know in the comment section. You can also have a look at our other helpful articles on Excel functions and formulas on our website, ExcelDemy, a one-stop Excel solution provider.

Zahid Hasan

Zahid Hassan, BSc, Industrial and Production Engineering, Bangladesh University of Engineering and Technology, has worked with ExcelDemy for 1.5 years. He has written 95+ articles for ExcelDemy. He has worked as an Excel & VBA Content Developer. He also worked as a VBA Developer for the Template team. Currently, he is working as a Junior Software Developer for the Excel Add-in project. He is interested in Software Development, Python, VBA, VB.NET, and Data Science, expanding his expertise in...
