**Kernel Density Estimation** is a form of probability distribution function used in statistics. This method is used for nonparametric statistical data. In this article, we will discuss how to perform kernel density estimation in Excel with proper illustrations and explanations.

**Table of Contents**hide

**Download Practice Workbook**

Download this practice workbook to exercise while you are reading this article.

**What Is Kernel Density Estimation?**

The kernel density estimation is the process to estimate the probability density function for random values. This process makes the curve edges smooth based on the weighted values. It depends on the smoothing parameter called bandwidth. The mathematical formula to calculate kernel density estimation is given below:

Here,

**K**is kernel**h**is bandwidth**n**is the number of data

The kernel density estimation process is also applicable to machine learning applications.

**Steps to Perform Kernel Density Estimation in Excel**

Here, we will discuss the whole process to apply kernel density estimation in Excel. When we work with any statistical operation, there are two data available for consideration population and sample data. The population is the whole data of a system. But all-time population data is not available. For this, we mostly work with sample data. Here, we will consider the data of the share price of Microsoft. We have collected data from **finance.yahoo.com**. To extract your desired data from the following website look at this article **How to Import Stock Prices into Excel from Yahoo Finance**.

**📌 Step 1: Collect Data for Kernel Density Estimation**

- Here, we insert data for the last
**6**months of Microsoft. This data indicates the change of share price day to day in percentile.

We collected data from the mentioned website.

**Read More: ****How to Find Maximum Likelihood Estimation in Excel**

**📌 Step 2: Sort the Collected Data Using Excel SORT Function**

We will sort the raw data in this section. The **SORT function** will be used for this purpose.

- First, we add a new row on the right side.

- Then, put the following formula on
**Cell D5**.

`=SORT(C5:C131,1,1,FALSE)`

Here, **1st** argument is the range, **2nd** argument is the index number, **3rd** argument is the sorting order, and** 4th** argument for **False** for sorting based on row.

**Read More: ****How to Create Estimation Tool in Excel (with Easy Steps)**

**📌 Step 3: Add Supporting Data to Calculate Kernel Density**

We need some other data to calculate Kernel density. We will calculate those data from the raw data.

- First, add new rows for the supporting data.

- To calculate the sample size, we will use the
**COUNT function**. Put the following formula on**Cell H5**.

`=COUNT(D5:D131)`

- Assume, the value of
**bandwidth (h)**is**00%**.

**📌 Step 4: Calculate Kernel Density**

Here, we need to create a formula to calculate Kernel density estimation in Excel. The formula for Kernel density estimation is mentioned here.

- Apply the customized Excel formula on
**Cell E5**and expand that to the last cell.

`=1/($H$5*$H$6)*SUM(NORM.S.DIST((D5-D$5:D$131)/H$6,FALSE))`

**Similar Readings**

**How to Do Interior Estimation in Excel (2 Suitable Methods)****Create a Project Time Estimation Sheet in Excel****How to Calculate Residential Construction Cost Estimator in Excel****How to Make an Effort Estimation Sheet in Excel (4 Easy Methods)**

**📌 Step 5: Plot a Suitable Excel Graph of Kernel Density vs Return**

We already calculated kernel density. Now, plot that into a graph.

- Choose the
**Range D5:E131**. - Click on the
**Insert**tab. - Now, choose the
**Scatter with Smooth Lines**from the**Charts**group.

- Look at the graph.

We can see the **Density Vs. Return** graph. We get this graph for bandwidth **1.00%**. This bandwidth is an assumption.

**📌 Step 6: Compare the Kernel Density vs Return Graphs for Different Bandwidths**

Previously, we told that the value of bandwidth is an assumption. Now, we will change the bandwidth to see the change in the graph.

- Look at the image below.

1st graph is for **h=1.00%**. The line is smooth. In the 2nd graph, **h=0.01%**, the line is a bit noisy. Increase the value of h in the 3rd graph and set that to** 0.10%**. See this is much better than the 2nd one. In the 4th graph, **h=4.00%**, and the appeared graph is close to a straight line. But this assumption may not be correct all the time.

**📌 Additional Step: Calculate Bandwidth with Formula and See the Changes in Graph**

There are multiple ways to calculate **bandwidth (h)**, for Kernel density estimation. The most two popular methods are given below.

**Scott’s Rule: **This method is used for normal distribution. The formula to calculate the bandwidth is Scott’s rule is:

**Silverman’s Rule: **This method can be used for all kinds of distribution processes. It is more robust but does not work for complicated cases. The formula of bandwidth in **Silverman’s Rule** is:

**i. Calculate Standard Deviation for Scott’s Rule**

Here, we calculate the Standard Deviation using the **STDEV.S function**. Look at the following formula.

- We get the standard deviation and will use this to calculate the bandwidth.
- Now, apply the following formula on formula bandwidth (h).

`=STDEV.S(D5:D131)`

See the graph has been changed.

**ii. Apply Scott’s Rule to Calculate Bandwidth**

- We will apply the following formula for bandwidth based on
**Scott’s**rule.

`=$H$7*(4/(3*$H$5))^(1/5)`

As the bandwidth changed, the graph also changed accordingly.

**iii. Calculate IQR for Silverman’s Rule**

The **IQR** stands for interquartile range. It is the measure of the middle fifty. We can mathematically present this as,

`IQR = Q3-Q1`

This is the difference between the 3rd and 1st quarters.

- Apply the following formula to
**Cell H8**.

`=PERCENTILE.EXC(C5:C131,0.75)-PERCENTILE.EXC(C5:C131,0.25)`

We will use this **IQR** to calculate bandwidth in **Silverman’s Rule.**

**iv. Use Silverman’s Rule for Bandwidth**

- In this section, we will apply the following customized formula based on
**Silverman’s Rule**to calculate the bandwidth.

`=0.9*MIN(H7,H8/1.34)*H5^(-1/5)`

Based on this bandwidth, the kernel density and graph also change.

**Conclusion**

In this article, we described all the steps to calculate kernel density estimation in Excel with explanations. I hope this will satisfy your needs. Please have a look at our website **ExcelDemy** and give your suggestions in the comment box.