Case Study : Simple Sukoon Data Analysis
By Shaurya Singh Bhati
Introduction
This is a hypothetical company which has two brands - SS and One.
They provide goods in the line of clothing and home decor.
​
I come from a non-programming background and this project is my first attempt at mysql and power BI project-work.
I sourced the data for this project from github.
The data was originally set for a retail company selling electronic goods. But, I edited some of the data in excel and then edited majority of data in mysql on Bigquery on Google cloud console.
​
Below in the table of contents I have specified what this post contains.
Give the whole post a read if you can or otherwise you can access the quick links here for:
​
Please leave a review or feedback as it would really help improve my work.
​
​
Table of contents :
-
About the Company
-
Process
-
Power BI
​
​
About the Company
SimpleSukoon is a company providing clothes and home-decor products.
I put together a report based on the data by first formulating a few KPIs, then used those KPIs to formulate queries in sql to find their answers.
Then in order to convey my findings more clearly and efficiently, I used Power BI desktop to create a report containing visualizations.
​
Process:
​
Excel :
-
First, I browsed the data properly in Excel. I read all the column names and the different sheets.
Then, I ascertained the schema of how all these sheets were connected.
-
Then I edited some minor values in the data like changed the brand name, products name and mrp values.
-
I had to save each sheet as a separate csv file so that I could import it locally as a table on Bigquery sql.
SQL :
-
In Bigquery SQL workspace on Google cloud console I first created a project simplesukoon.
-
In that I created a new dataset and then added tables to it.
-
To add tables I imported my locally saved csv files from my computer, one by one and named the tables properly for easier reference in queries.
-
I wrote the query to check and remove for any null values n the data.
-
I checked for any negative values, which were 50 rows but they were related to the return order type category.
So I couldn't remove it as that was also a good part of my data analysis to know how much products were returned and what cost it incurred.
-
Then I wrote down the following points to base my queries on based on the data:
-
Total Value of orders​
-
Total no. of orders
-
Most sold product
-
The different earning distribution of the products and their brands
-
Rate of repeat customers
-
No. of returned orders
-
City, Pincode with the highest orders
-
Most used mode of payment
-
Highest no. of deliveries by which delivery guy
-
-
Then I saved two results from the above queries as csv tables to use as datasets in power bi and I imported the created csv tables into my current dataset. This made it easier to create a live connection of my dataset to my report in Power BI.
​
Link to my github profile for all the queries.
​
​
-
Imported the necessary tables from the dataset simplesukoon by connecting to the sql server of google Bigquery.
-
After importing the data created a new report.
-
Then connected the tables from the managed relationships tab.
-
In the new report after renaming it to simplesukoon report I started adding my visuals.
-
Before adding my visuals I first tried my hand with their Q&A feature but it still needs a little more work to handle complex results.
-
Then I added visuals, formatted their design, played around with a few more calculations and variables from the data. Then, removed those that didn't really helped the visual reporting aspect.
-
Then I saved the report by publishing it in the Power Bi server.
-
But, to add it as viewable link in my portfolio project i saved the report on the Power BI service as a dashboard and from the file tab used embed feature to share a link here.​
​