If you are looking for ways to calculate P-value or probability value in linear regression in Excel, then you are in the right place. P-value is used to determine the probability of the results of hypothetical tests. We can analyze the results based on 2 hypotheses; the Null hypothesis and the Alternative hypothesis. Using the P-value we can determine whether the result supports the Null hypothesis or Alternative hypothesis.
So, let’s get started with the main article.
How to Calculate P Value in Linear Regression in Excel: 3 Ways
Here, we have some predicted sales values and actual sales values of some of the products of a company. We will compare these sales values and determine the probability value and then we will determine if P supports the null hypothesis or the alternative hypothesis. The null hypothesis reckons there is no difference between the two types of sales values and the alternative hypothesis will consider differences between these two sets of values.
We have used Microsoft Office 365 version here, you can use any other versions according to your convenience.
Read More: How to Do Linear Regression in Excel
Method-1: Using ‘t-Test Analysis Tool’ to Calculate P Value
Here, we will use the analysis toolpak containing the t-Test analysis tool to determine the P-value for these two sets of sales data.
Steps:
If you didn’t activate the data analysis tool then first enable this toolpak at first.
➤ Click on the File tab.
➤ Select Options.
After that, the Excel Options dialog box will appear.
➤ Select the Add-ins option on the left panel.
➤ Choose the Excel Add-ins option in the Manage box and then press Go.
Afterward, the Add-ins dialog box will pop up.
➤ Check the Analysis ToolPak option and press OK.
➤ Now, go to the Data Tab >> Analysis Group >> Data Analysis Option.
Then, the Data Analysis wizard will appear.
➤ Select the option t-Test: Paired Two Sample for Means from different options of Analysis Tools.
After that, the t-Test: Paired Two Sample for Means dialog box will open up.
➤ As Input we have to provide two variable ranges; $C$4:$C$11 for Variable 1 Range and $D$4:$D$11 for Variable 2 Range, as Output Range we have selected $E$4.
➤ You can change the value for Alpha from 0.05 (automatically generated) to 0.01 because the designated value for this constant is generally 0.05 or 0.01.
➤ Finally, press OK.
After that, you will get the P-value for two cases; the one-tail value is 0.00059568 and the two-tail value is 0.0011913. We can see the one-tail P-value is half times the two-tail P-value. Because the two-tail P-value considers both the increase and decrease of the marks whereas the one-tail P-value considers only one of these cases.
Moreover, we can see that for the Alpha value of 0.05 we are getting the P values less than 0.05 which means it neglects the null hypothesis and so the data is highly significant.
Read More: How to Interpret Linear Regression Results in Excel
Method-2: Using T.TEST Function to Calculate P Value in Linear Regression in Excel
In this section, we will be using the T.TEST function to determine the P values for tails 1 and 2.
Steps:
We will start with determining the P-value for tail 1 or in one direction.
➤ Type the following formula in cell F5.
Here, C4:C11 is the range of Predicted Sales, D4:D11 is the range of Actual Sales, 1 is the tail value and the last 1 is for the Paired type.
After pressing ENTER, we are getting the P-value 0.00059568 for tail 1.
➤ Apply the following formula in cell F6 to determine the P-value for tail 2 or in both directions.
=T.TEST(C4:C11,D4:D11,2,1)
Here, C4:C11 is the range of Predicted Sales, D4:D11 is the range of Actual Sales, 2 is the tail value and the last 1 is for the Paired type.
Read More: Multiple Linear Regression on Excel Data Sets
Method-3: Using CORREL, T.DIST.2T Functions to Calculate P Value in Linear Regression
We will determine P-value for correlation here by using the CORREL, T.DIST.2T functions.
To do this we created some columns with headers Total Item, Correl. Factor, t Value, and P value and we entered the value for total items also which is 8.
Steps:
➤ Firstly, We determine the Correl.Factor by entering the following formula in cell C14.
=CORREL(C4:C11,D4:D11)
Here, C4:C11 is the range of Predicted Sales, and D4:D11 is the range of Actual Sales.
➤ To determine the t value type the following formula in cell D14.
=(C14*SQRT(B14-2))/SQRT(1-C14*C14)
Here, C14 is the correlation factor, and B14 is the total number of products.
- SQRT(B14-2) becomes
SQRT(8-2) → SQRT(6) gives the square root of 6.
Output → 2.4494897
- C14*SQRT(B14-2) becomes
0.452421561*2.4494897
Output → 1.10820197
- 1-C14*C14 becomes
1-0.452421561*0.452421561
Output → 0.79531473
- SQRT(1-C14*C14) becomes
SQRT(0.79531473) → returns the square root of 0.79531473.
Output →0.891804199 - (C14*SQRT(B14-2))/SQRT(1-C14*C14) becomes
(1.10820197)/0.891804199
Output → 1.242651665
➤ Finally, by using the following function we will determine the P-value for correlation.
=T.DIST.2T(D14,B14-2)
Here, D14 is the t value, B14-2 or 8-2 or 6 is the degree of freedom and T.DIST.2T will return the P-value for correlation with the two-tailed distribution.
Read More: How to Do Multiple Regression Analysis in Excel
Things to Remember
⦿ Generally, we use two common Alpha values; 0.05 and 0.01.
⦿ There are two hypotheses, the null hypothesis, and the alternative hypothesis, the null hypothesis considers no difference between two sets of data and the other one takes into account the difference between two sets of data.
⦿ When the P-value is less than 0.05 it denies the null hypothesis and for values greater than 0.05 it supports the null hypothesis. By assessing the P-value we can come up with the following conclusions.
P=0.05 → significant data
P=0.05-0.1 → marginally significant data
P>0.1 → insignificant data
Practice Section
For doing practice by yourself we have provided a Practice section like below in a sheet named Practice. Please do it by yourself.
Download Workbook
Conclusion
In this article, we tried to cover the ways to calculate P-value in linear regression in Excel. Hope you will find it useful. If you have any suggestions or questions, feel free to share them in the comment section.
Hi, you present 3 methods to calculate p-values for the same data set. Supposedly, results should be independent of the method chosen. So an explanation for the difference obtained in one of the methods is needed. Thanks.
Hi Jose A. Valiente, thanks for your valuable suggestion. Actually, the third method is quite different from the first 2 methods as this method determines the P value for the correlation of the two sets of values using a correlation factor.