Report On Data Visualization
AUG 2021
PROBLEM STATEMENT
"The selling price of a used bike depreciates each time its ownership passes from one owner
to the next. This data shows the selling price, the brand of the motorcycle, the year of make,
and the ownership degree for a specific set of bikes. In this report, we investigate the data
using graphs and charts and analyse the problem to support decision making."
A. NOMINAL DATA
It is defined as data that is used for naming or labelling variables without any quantitative
value. There is no intrinsic ordering to nominal data.
[Figure: bar chart of the number of bikes for sale by brand (x-axis: Brands)]
ANALYSIS:
From the data provided, we understand that Bajaj, Hero, Honda, Royal Enfield, and Yamaha are
the brands with the most bikes on sale. The sales are from the first owner to the second, from
the second to the third, or from the third to the fourth. Because every resale follows an
earlier purchase, the number of bikes on sale reflects the number of purchases made. Hence we
could interpret the brands above as the best-selling brands. A drawback of this analysis is
that the same figures can be read the opposite way: the most frequently listed brands may be
the ones their owners find less reliable, which is why they have been put up for sale.
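Because nominal data has no order, the summaries that make sense for it are frequency counts and the mode. The sketch below illustrates that idea only. The list `sample_brands` is a made-up sample using brand names mentioned above, not rows from the dataset, and the variable names are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical sample of brand labels; NOT taken from the report's dataset.
sample_brands = [
    "Bajaj", "Hero", "Honda", "Bajaj",
    "Yamaha", "Royal Enfield", "Bajaj", "Hero",
]

# Count how often each label occurs; order of labels carries no meaning.
brand_counts = Counter(sample_brands)

# The mode (most frequent label) is the only "average" defined for nominal data.
mode_brand, mode_count = brand_counts.most_common(1)[0]

print(brand_counts)
print(mode_brand, mode_count)
```

The same counting step, applied to the real brand column, would produce the heights of the bars in a brand chart.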
B. ORDINAL DATA
It is data where the variables have natural, ordered categories, and the distance between the
classes is unknown.
OWNER TYPE    NUMBER OF BIKES
1st Owner     924
2nd Owner     123
3rd Owner     11
4th Owner     3
TOTAL         1061
ANALYSIS:
The data provided shows that far more bikes are put up for sale by first owners than by any
subsequent owner. We can interpret from this data that first owners like to sell their bikes
to buy newer ones. The 2nd, 3rd and 4th owners tend to keep their bikes, as the graph shows
that significantly fewer of them are willing to sell their respective purchases.
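To put numbers on the comparison between owner types, the counts from the table can be turned into shares of the total. This is a minimal sketch that uses only the four counts listed in this section; the variable names are assumptions for illustration.

```python
# Owner-type counts copied from the table in this section.
owner_counts = {
    "1st Owner": 924,
    "2nd Owner": 123,
    "3rd Owner": 11,
    "4th Owner": 3,
}

total = sum(owner_counts.values())

# Share of all listed bikes per owner type, in percent (one decimal place).
owner_share = {
    owner: round(100 * count / total, 1)
    for owner, count in owner_counts.items()
}

# Ordinal categories have a meaningful order, so a cumulative share is valid.
cumulative = []
running = 0
for owner, count in owner_counts.items():
    running += count
    cumulative.append((owner, round(100 * running / total, 1)))

print(total)
print(owner_share)
print(cumulative)
```

With these counts, first owners account for about 87% of the listed bikes, and first and second owners together for nearly 99%.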
C. INTERVAL DATA
It is defined as a data type that is measured along a scale, in which each point is placed at an
equal distance from one another. Interval data always appears in the form of numbers.
Years    Number of Bikes
1988     1
1989     0
1990     0
1991     1
1992     0
1993     1
1994     0
1995     1
1996     0
1997     2
1998     3
1999     1
2000     6
2001     2
2002     3
2003     1
2004     5
2005     14
2006     20
2007     29
2008     28
2009     28
2010     60
2011     61
2012     70
2013     73
2014     91
2015     100
2016     107
2017     133
2018     131
2019     86
2020     3
Source - Motorcycle Dataset | Kaggle
ANALYSIS:
The histogram obtained is highly skewed towards the left. From the histogram, it is
understandable that the bikes put up for sale after 2004 have an increasing trend, and there
is a sharp decline after 2018.
In the line chart, the number of bikes put up for sale increases rapidly after 2004. There
might be a variety of reasons for the increase. For example, there could be a trend towards
the latest versions of bikes, or there could be a new scrappage policy by the government.
Alternatively, depreciation could rise steeply, so that changing bikes would be reasonable.
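The left skew and the rise after 2004 can be checked directly from the year counts. The sketch below uses only the counts listed in this section; the variable names and the choice of mean-versus-median as a skew check are assumptions for illustration, not part of the original analysis.

```python
# Number of bikes per year, copied from the table in this section.
year_counts = {
    1988: 1, 1989: 0, 1990: 0, 1991: 1, 1992: 0, 1993: 1, 1994: 0,
    1995: 1, 1996: 0, 1997: 2, 1998: 3, 1999: 1, 2000: 6, 2001: 2,
    2002: 3, 2003: 1, 2004: 5, 2005: 14, 2006: 20, 2007: 29, 2008: 28,
    2009: 28, 2010: 60, 2011: 61, 2012: 70, 2013: 73, 2014: 91,
    2015: 100, 2016: 107, 2017: 133, 2018: 131, 2019: 86, 2020: 3,
}

total = sum(year_counts.values())

# Year with the highest count.
peak_year = max(year_counts, key=year_counts.get)

# Share of bikes dated after 2004, in percent.
after_2004 = sum(c for y, c in year_counts.items() if y > 2004)
share_after_2004 = round(100 * after_2004 / total, 1)

# Count-weighted mean year.
mean_year = sum(y * c for y, c in year_counts.items()) / total

# Median year: first year at which the running count reaches the middle bike.
half = (total + 1) / 2
running = 0
median_year = None
for year in sorted(year_counts):
    running += year_counts[year]
    if running >= half:
        median_year = year
        break

print(peak_year, share_after_2004)
print(round(mean_year, 2), median_year)
```

A mean that falls below the median is the usual sign of a left-skewed distribution, which matches the shape described for the histogram.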
D. RATIO DATA
It has the same properties as interval data, with the addition that ratios between values are
meaningful and that an absolute zero is treated as the point of origin. In other words, there
can be no negative numerical value in ratio data.
DISTANCE TRAVELLED    NUMBER OF BIKES
0 - 15000             324
15001 - 30000         309
30001 - 45000         189
45001 - 60000         128
60001 - 75000         54
75001 - 90000         21
> 90000               36
[Figure: horizontal bar chart of the number of bikes by distance travelled]
ANALYSIS: From the horizontal bar chart, we can clearly see that most of the bikes for sale
have travelled only a short distance. The greater the distance travelled, the fewer bikes are
for sale. We could infer that some of these sellers made a mistake with their first purchase
and are selling it to buy a better one.
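The claim that listings fall as distance rises can be verified from the binned counts. The following sketch uses only the figures from the table in this section; the variable names are assumptions for illustration. It treats the open-ended "> 90000" bin separately, since it covers a wider range than the other bins.

```python
# Binned counts copied from the table in this section (bin label, number of bikes).
distance_bins = [
    ("0 - 15000", 324),
    ("15001 - 30000", 309),
    ("30001 - 45000", 189),
    ("45001 - 60000", 128),
    ("60001 - 75000", 54),
    ("75001 - 90000", 21),
    ("> 90000", 36),
]

total = sum(count for _, count in distance_bins)

# Cumulative share of bikes up to and including each bin, in percent.
cumulative_share = []
running = 0
for label, count in distance_bins:
    running += count
    cumulative_share.append((label, round(100 * running / total, 1)))

# Check the decreasing pattern over the equal-width bins only; the last bin
# is open-ended, so its count is not directly comparable.
closed_counts = [count for _, count in distance_bins[:-1]]
strictly_decreasing = all(a > b for a, b in zip(closed_counts, closed_counts[1:]))

print(total)
print(cumulative_share)
print(strictly_decreasing)
```

On these counts, the first two bins together hold about 60% of the listed bikes, and the counts fall steadily across the six equal-width bins.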