From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

In this article, we will show how to create a complete data science workflow in Excel.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel
Image by Editor
 

Data science doesn’t always require specialized programming languages or complex tools. By using Microsoft Excel’s right tools and techniques, you can execute a full data science workflow. It can handle many data science tasks with surprising power.

In this article, we will show how to create a complete data science workflow in Excel.

Consider a Sales & Marketing dataset that represents a company’s monthly performance across products and regions.

Import Your Data

Open a new Excel workbook and import your dataset. You can import data from various sources using the Get Data option in Excel.

Importing Data from Different Sources:

From CSV/Text Files:

  • Go to the Data tab >> click Get Data >> click From File >> select From Text/CSV.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • Browse your data file.
  • Select your file and click Import.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • In the preview window, verify data looks.
  • Click Load or Transform Data to open Power Query Editor.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Power Query offers easier data cleaning and automation.

From the Web:

  • Go to the Data tab >> select Get Data >> select From Web.
  • Enter the URL of the web page containing your data.
  • The Navigator will display available tables; select the one you want.
  • Click Load to import the data.

From a Database:

  • Go to Data tab >> select Get Data >> select From Database >> choose your database type.
  • Enter server credentials and connection information.
  • Select tables or write a custom query.
  • Click Load to import the data.

Data Cleaning and Transformation

Power Query Editor offers built-in features for data transformation and data cleaning.

Trim Text: 

  • Select columns.
  • Go to the Transform tab >> select Format >> select Trim.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Check Data Types: 

  • Go to the Home tab >> select Data Type.
  • Or, click on the column Header icon to change the data type.
    • Date: Date
    • Product Name, Category, Region: Text
    • Units Sold, Unit Cost, Marketing Spend, Total Sales, Net Profit: Whole Number or Decimal

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Remove Rows:

  • Remove Duplicates: 
    • Select all columns.
    • Go to the Home tab >> select Remove Rows >> select Remove Duplicates.
  • Remove Blank Rows: 
    • Select all columns.
    • Go to the Home tab >> select Remove Rows >> select Remove Blank Rows.
  • Remove Errors: 
    • Select all columns.
    • Go to the Home tab >> select Remove Rows >> select Remove Errors.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • Click Close & Load to bring the clean data into Excel.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Dataset:

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Exploratory Data Analysis (EDA)

Start by exploring key metrics using PivotTables and descriptive statistics.

Descriptive Statistics

Enable the Analysis ToolPak:

  • Go to the File tab >> select Options.
  • In the Excel Options box;
    • Select Add-ins >> select Excel Add-ins in the Manage box >> click Go.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

    • Select Analysis ToolPak >> click OK.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Summary Statistics of Units Sold:

  • Go to the Data tab >> select Data Analysis >> choose Descriptive Statistics.
  • Click OK.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • In the Descriptive Statistics box;
    • Input Range: Select Units Sold (e.g., E1:E61).
    • Select Labels in first row.
    • Output: Select New Worksheet ply.
    • Check the Summary statistics.
    • Click OK.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Output:

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Summary Statistics of Total Sales:

  • Follow the previous steps.
  • Insert Input Range: Total Sales (e.g., I1:I61).

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Create PivotTables

Let’s create a pivot table to analyze the sales dataset:

  • Select the data range.
  • Go to the Insert tab >> select PivotTable.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Sales by Product Category Analysis:

  • Create a PivotTable.
  • In the PivotTable Field list;
    • Drag Product Category to the Rows field.
    • Drag Total Sales to the Values field.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • Go to the PivotTable Analyze tab >> select PivotChart.
  • Select the Pie chart.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Sales Trends Over Time:

  • Create another PivotTable.
  • In the PivotTable Field list;
    • Drag Date to Rows field. Keep only the Month.
    • Drag Total Sales to the Values field.
  • Create a Line chart to visualize the monthly sales trend.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Regional Analysis:

  • Create a PivotTable.
  • In the PivotTable Field list;
    • Drag the Region to the Row field.
    • Drag Total Cost to the Values field.
    • Drag Total Sales to the Values field.
    • Drag Net Profit to the Values field.
  • Create a Column chart to visualize.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Product Performance:

  • Create a PivotTable
  • In the PivotTable Field list;
    • Drag Product Name to the Rows field.
    • Drag Total Sales to the Values field.
  • Sort by Total Sales descending to identify the top products.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • Show the top 5 products using Value Filters.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Output:

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Interactivity

Insert Slicers:

  • Select PivotTable.
  • Go to the PivotTable Analyze tab >> click Insert Slicer >> select Product Name.

Insert Timeline:

  • Select PivotTable.
  • Go to the PivotTable Analyze tab >> click Insert Timeline >> select Date.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Verify Interactivity:

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Simple Modeling (Linear Regression)

Let’s model Total Sales as a function of Marketing Spend.

  • Go to Data tab >> select Data Analysis.
  • Choose Regression >>click OK.
  • In the Regression box;
    • Input Y Range: Select the Total Sales column (e.g., I1:I61).
    • Input X Range: Select Marketing Spend column (e.g., H1:H61).
    • Check the Labels if you selected the headers (H1, I1).
    • Output Range: Select New Worksheet Ply.
    • Click OK.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Interpret the Output:

  • R Square: Explains variance. Higher = better fit.
  • You can predict revenue in a new column using:
=Intercept + Coefficient * [Marketing Spend]

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

  • R Square = 0.0021
    • Marketing Spend explains only 0.21% of the variation in Total Sales: a very weak relationship.
  •   P-value for Marketing Spend = 0.7302
    • Not statistically significant (p > 0.05). Changes in marketing spend don’t meaningfully impact sales.
  •   Intercept = 4305.10
    • This is the baseline sales when Marketing Spend is zero.

Coefficient Interpretation:

Predictor Coefficient Meaning
Intercept 4305.10 When Marketing Spend is 0, the predicted Total Sales is $4305.
Marketing Spend 0.1393 For each additional $1 spent on marketing, Total Sales increase by $0.14, on average.

Regression Equation:

Total Sales = 4305.10 + 0.1393 × Marketing Spend

The regression analysis shows that Marketing Spend has no statistically significant impact on Total Sales based on the dataset. Although the coefficient suggests a small positive relationship (every $1 spent increases sales by $0.14), the very high p-value (0.73) and low R-squared (0.002) indicate that this effect is weak and likely due to chance.

Insight Communication: Dashboard

Let’s create a dashboard sheet using created PivotTables, charts, and KPIs.

  • Add Title and Date section:
    • Add your company logo (if available).
    • Add dashboard title “Sales Performance”.

Add dynamic date:

="Report Generated on "&TEXT(TODAY(),"mmmm d, yyyy")
  • Add Summary Metrics Section:
    • Insert KPI Cards.
    • Total Sales:
=SUM(Sales_Dataset[Total Sales])
    • Average Marketing Spend:
=AVERAGE(Sales_Dataset[Marketing Spend])
    • Total Orders:
=COUNTA(Sales_Dataset[Unit Sold])
    • Average Order Value:
=SUM(Sales_Dataset[Total Sales]) / COUNTA(Sales_Dataset[Total Sales])
    • Pull values from your Summary Statistics sheet.
  • Add Visualization Sections:
    • Copy marketing vs sales scatter plot.
    • Copy your monthly sales trend chart.
    • Copy your product category pie chart.
    • Copy your regional sales bar chart.
    • Copy your product performance column chart.
    • Adjust size and position for optimal layout.
  • Connect Charts to Slicers and Timeline:
    • Right-click on the Slicers and Timeline >> select Report Connections.
    • Select all the PivotTables >> click OK.
  • Format and Finalize Dashboard:
    • Apply a consistent color scheme.
    • Add borders and backgrounds to sections.
    • Ensure all charts have clear titles.
    • Add data labels where appropriate.

From Spreadsheet to Insight: A Complete Data Science Workflow in Excel

Conclusion

Excel is an incredibly powerful tool for data analysis. By using advanced features and options, you can create the entire data science workflow. By combining Power Query, descriptive tools, modeling capabilities, and dashboard features, Excel transforms from a simple spreadsheet tool into a practical data science platform. Though specialized data science tools offer more advanced capabilities, Excel provides an accessible starting point for many data science workflows.

Get FREE Advanced Excel Exercises with Solutions!

Shamima Sultana
Shamima Sultana

Shamima Sultana, BSc, Computer Science and Engineering, East West University, Bangladesh, has been working with the ExcelDemy project for 3+ years. She has written and reviewed 1000+ articles for ExcelDemy. She has also led several teams with Excel VBA and Content Development works. Currently, she is working as the Project Manager and oversees the day-to-day work, leads the services team, allocates resources to the right area, etc. Her work and learning interests vary from Microsoft Office Suites, and... Read Full Bio

We will be happy to hear your thoughts

Leave a reply

Advanced Excel Exercises with Solutions PDF

 

 

ExcelDemy
Logo