The real benefit of data analytics is its ability to inform business decisions that fast-track growth. After analyzing the data, you should be able to identify trends. These trends reveal actionable information that can be used to change your website, app, product, etc. to optimize performance.
We are happy to provide an accurate data analysis that will help you lead the automotive sector by understanding your customers.
This is the dataset that I used for the analysis.
Dataset:
| | Unnamed: 0 | Make | Model | Variant | Ex-Showroom_Price | Displacement | Cylinders | Valves_Per_Cylinder | Drivetrain | Cylinder_Configuration | Emission_Norm | Engine_Location | Fuel_System | Fuel_Tank_Capacity | Fuel_Type | Height | Length | Width | Body_Type | Doors | ... | Rear_Center_Armrest | iPod_Compatibility | ESP_(Electronic_Stability_Program) | Cooled_Glove_Box | Recommended_Tyre_Pressure | Heated_Seats | Turbocharger | ISOFIX_(Child-Seat_Mount) | Rain_Sensing_Wipers | Paddle_Shifters | Leather_Wrapped_Steering | Automatic_Headlamps | Engine_Type | ASR_/_Traction_Control | Cruise_Control | USB_Ports | Heads-Up_Display | Welcome_Lights | Battery | Electric_Range |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
535 | 535 | Mahindra | Bolero Power Plus | Lx | Rs. 7,49,192 | 1493 cc | 4.0 | 2.0 | RWD (Rear Wheel Drive) | In-line | BS IV | Front, Transverse | Injection | 60 litres | Diesel | 1880 mm | 3995 mm | 1745 mm | SUV | 5.0 | ... | Yes | NaN | NaN | NaN | NaN | NaN | Yes | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
62 | 62 | Maruti Suzuki | Celerio X | Vxi (O) | Rs. 4,81,074 | 998 cc | 3.0 | 4.0 | FWD (Front Wheel Drive) | In-line | BS IV | Front, Transverse | Injection | 35 litres | Petrol | 1560 mm | 3600 mm | 1600 mm | Hatchback | 5.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
709 | 709 | Toyota | Innova Crysta | Touring Sport 2.7 Vx 7 Str | Rs. 18,92,000 | 2393 cc | 4.0 | 4.0 | RWD (Rear Wheel Drive) | In-line | BS VI | Front, Longitudinal | Injection | 55 litres | Petrol | 1795 mm | 4735 mm | 1830 mm | MUV | 5.0 | ... | Cup Holders | Yes | NaN | Yes | NaN | NaN | Yes | Yes | Yes | NaN | Yes | Yes | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
732 | 732 | Jeep | Compass | 2.0 Limited Plus 4X4 At | Rs. 24,99,000 | 1956 cc | 6.0 | 4.0 | AWD (All Wheel Drive) | V | BS 6 | Front, Longitudinal | Injection | 60 litres | Diesel | 1640 mm | 4395 mm | 1818 mm | SUV | 5.0 | ... | Cup Holders | Yes | Yes | NaN | NaN | NaN | Yes | Yes | Yes | NaN | Yes | Yes | NaN | Yes | NaN | NaN | NaN | NaN | NaN | NaN |
1160 | 1160 | Porsche | Macan | S | Rs. 85,03,000 | 2995 cc | 6.0 | 4.0 | AWD (All Wheel Drive) | In-line | BS IV | Front, Longitudinal | Injection | 65 litres | Petrol | 1624 mm | 4696 mm | 1923 mm | SUV | 5.0 | ... | Cup Holders | Yes | Yes | Yes | NaN | All | Yes | Yes | Yes | Yes | Yes | Yes | NaN | Yes | Yes | NaN | NaN | NaN | NaN | NaN |
11 | 11 | Datsun | Redi-Go | 1.0 S Amt | Rs. 4,37,065 | 999 cc | 3.0 | 4.0 | FWD (Front Wheel Drive) | In-line | BS IV | Front, Transverse | Injection | 28 litres | Petrol | 1541 mm | 3429 mm | 1560 mm | Hatchback | 5.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
585 | 585 | Mahindra | Xuv300 | 1.2 W4 | Rs. 8,30,127 | 1197 cc | NaN | NaN | RWD (Rear Wheel Drive) | In-line | BS 6 | Front, Transverse | Injection | 42 litres | Petrol | 1617 mm | 3995 mm | 1821 mm | SUV | 5.0 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
978 | 978 | Kia | Seltos | Htx Plus 1.5 Diesel | Rs. 15,34,000 | 1493 cc | NaN | NaN | FWD (Front Wheel Drive) | In-line | BS 6 | Front, Longitudinal | Injection | 60 litres | Diesel | 1645 mm | 4315 mm | 1800 mm | SUV | 5.0 | ... | Cup Holders | Yes | NaN | Yes | NaN | Driver | Yes | Yes | Yes | Yes | Yes | Yes | NaN | NaN | Yes | NaN | NaN | NaN | NaN | NaN |
8 rows × 141 columns
The dataset statistics were as follows:
After this, we can choose the features we want for a detailed analysis.
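Most columns in the raw table are strings with units attached (`Rs. 7,49,192`, `1493 cc`, `60 litres`), so they have to be converted before any statistics can be computed. Here is a minimal sketch of that step, assuming the column names shown in the table above. The helper `to_number` and the short output names (`price`, `displacement`, `fuel_tank`) are my own illustration, not necessarily what the original notebook used.

```python
import pandas as pd

# Three rows copied from the sample table above.
raw = pd.DataFrame({
    "Make": ["Mahindra", "Maruti Suzuki", "Porsche"],
    "Ex-Showroom_Price": ["Rs. 7,49,192", "Rs. 4,81,074", "Rs. 85,03,000"],
    "Displacement": ["1493 cc", "998 cc", "2995 cc"],
    "Fuel_Tank_Capacity": ["60 litres", "35 litres", "65 litres"],
})

def to_number(series: pd.Series) -> pd.Series:
    """Hypothetical helper: drop thousands separators, then pull out the first number."""
    no_commas = series.astype(str).str.replace(",", "", regex=False)
    first_number = no_commas.str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    return pd.to_numeric(first_number, errors="coerce")

# Keep only the features of interest, converted to numeric columns.
clean = raw[["Make"]].copy()
clean["price"] = to_number(raw["Ex-Showroom_Price"])
clean["displacement"] = to_number(raw["Displacement"])
clean["fuel_tank"] = to_number(raw["Fuel_Tank_Capacity"])

# Summary statistics (count, mean, std, quartiles) for the numeric columns.
stats = clean.describe()
```

Values that cannot be parsed become `NaN` rather than raising, which fits a table that already has many missing entries.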
First we check the price distribution. We use both linear and log scales because the prices span a very wide range.
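To see why the log scale helps, here is a small sketch that bins the eight prices from the sample rows above on both scales. It only computes the histogram counts; the bin count of 4 is an arbitrary choice for illustration, and the original plots were presumably drawn with a plotting library.

```python
import numpy as np
import pandas as pd

# Ex-showroom prices (in rupees) of the eight sample rows shown above.
prices = pd.Series(
    [437065, 481074, 749192, 830127, 1534000, 1892000, 2499000, 8503000],
    name="price",
)

# Linear scale: equal-width bins over the raw prices.
linear_counts, linear_edges = np.histogram(prices, bins=4)

# Log scale: equal-width bins over log10(price).
log_counts, log_edges = np.histogram(np.log10(prices), bins=4)
```

On the linear scale, one expensive car stretches the axis and most cars pile into the first bin. On the log scale the same cars spread across several bins, which makes the shape of the distribution easier to read.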
The plot shows the price distribution of the cars.
Here I have drawn a box plot so that we can clearly examine the spread we observed in the prices.
The plot here shows the most common cars, sorted by body type.
The box plot below shows how the price of a car varies with its body type.
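The numbers behind such a box plot can be computed directly. This is a sketch using the prices and body types of the eight sample rows above; the column names `body_type` and `price` are assumed to be the cleaned versions of `Body_Type` and `Ex-Showroom_Price`.

```python
import pandas as pd

# Body type and price (in rupees) of the eight sample rows shown earlier.
cars = pd.DataFrame({
    "body_type": ["SUV", "Hatchback", "MUV", "SUV", "SUV", "Hatchback", "SUV", "SUV"],
    "price": [749192, 481074, 1892000, 2499000, 8503000, 437065, 830127, 1534000],
})

# Quartiles per body type: what a box plot draws as the box edges and median line.
summary = cars.groupby("body_type")["price"].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ["q1", "median", "q3"]

# Interquartile range: the height of each box.
summary["iqr"] = summary["q3"] - summary["q1"]
```

A large `iqr` for one body type means its prices are widely spread, which is what a tall box in the plot indicates.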
The plot below shows the fuel type of the cars.
Here is the same distribution as a pie chart.
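The shares that a pie chart displays can be computed with a normalized count. A short sketch, using the `Fuel_Type` values of the eight sample rows above:

```python
import pandas as pd

# Fuel_Type of the eight sample rows shown earlier, in table order.
fuel_type = pd.Series(
    ["Diesel", "Petrol", "Petrol", "Diesel", "Petrol", "Petrol", "Petrol", "Diesel"],
    name="Fuel_Type",
)

# Percentage share of each fuel type, largest first.
share = fuel_type.value_counts(normalize=True).mul(100).round(1)
```

These percentages are for the eight-row sample only and say nothing about the full dataset.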
Now let's see which companies control the Indian market (I say Indian because of the dataset we chose; the same approach can be applied to any dataset).
The plot below shows the top car-making companies in India.
Since we are using the Indian cars dataset, that is what the graph depicts.
The same can be done with other datasets as well.
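"Top" can be measured in more than one way, since each row of the dataset is a variant rather than a model. The sketch below counts both per make, using the eight sample rows above. Whether the original plot ranked by variants or by models is not stated, so treat this as one possible reading.

```python
import pandas as pd

# Make, Model and Variant of the eight sample rows shown earlier.
cars = pd.DataFrame({
    "Make": ["Mahindra", "Maruti Suzuki", "Toyota", "Jeep", "Porsche", "Datsun", "Mahindra", "Kia"],
    "Model": ["Bolero Power Plus", "Celerio X", "Innova Crysta", "Compass", "Macan", "Redi-Go", "Xuv300", "Seltos"],
    "Variant": ["Lx", "Vxi (O)", "Touring Sport 2.7 Vx 7 Str", "2.0 Limited Plus 4X4 At", "S", "1.0 S Amt", "1.2 W4", "Htx Plus 1.5 Diesel"],
})

# Rows per make (variants on sale) and distinct models per make.
top_makes = (
    cars.groupby("Make")
    .agg(variants=("Variant", "size"), models=("Model", "nunique"))
    .sort_values("variants", ascending=False)
)
```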
Distribution of cars by engine size.
Next we checked the horsepower of the cars.
Plot showing the relation between horsepower and price for different body types.
Next we looked into the relation between mileage and price.
Checking the overall correlation between the variables.
First we make a Pearson correlation grid.
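A Pearson grid is a table of pairwise linear correlation coefficients between the numeric columns. Here is a minimal sketch; the values are those of the five sampled rows of the cleaned table that appears later in this post, so the coefficients only illustrate the mechanics and are not the results for the full dataset.

```python
import pandas as pd

# power, mileage, displacement and price of the five sampled rows
# from the cleaned table shown later in this post.
cars = pd.DataFrame({
    "power": [126.25, 108.50, 126.25, 98.63, 80.88],
    "mileage": [19.67, 21.13, 23.90, 26.10, 21.01],
    "displacement": [1582, 1498, 1582, 1498, 1197],
    "price": [21609, 16220, 16415, 12073, 10614],
})

# Square, symmetric matrix of Pearson coefficients in [-1, 1].
corr = cars.corr(method="pearson")

# Correlation of every feature with price, strongest first.
price_corr = corr["price"].drop("price").sort_values(ascending=False)
```

The matrix `corr` is what a heatmap of the grid would display, one cell per pair of columns.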
Next, we check an extensive scatter plot grid of more numerical variables to investigate the relations in more detail.
I plotted a 3D scatter plot to check for obvious clusters, with price, horsepower and mileage as the main features.
Clustering the market takes a lot of effort because the clusters are not clearly separated.
Here I take the Toyota Corolla as an example and cluster the cars to give a better competitor analysis. Similarly, we can pick any other cluster group and analyze the market and a model's competitors in much more detail.
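```python
from sklearn.cluster import KMeans

# Keep cars with price below 60000; .copy() avoids a SettingWithCopyWarning
# when the cluster column is added below.
df = df[df.price < 60000].copy()
# Cluster on the numeric columns only.
num_cols = [i for i in df.columns if df[i].dtype != 'object']
km = KMeans(n_clusters=8, n_init=20, max_iter=400, random_state=0)
clusters = km.fit_predict(df[num_cols])
df['cluster'] = clusters
# Relabel clusters 1..8 and store them as labels rather than numbers.
df.cluster = (df.cluster + 1).astype('object')
df.sample(5)
```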
| | make | model | car | variant | body_type | fuel_type | fuel_system | type | drivetrain | displacement | cylinders | mileage | power | torque | fuel_tank | height | length | width | doors | seats | wheelbase | airbags | price | cluster |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1126 | Hyundai | Creta | Hyundai Creta | 1.6 Crdi Sx (O) | SUV | Diesel | Injection | Manual | FWD (Front Wheel Drive) | 1582 | 4 | 19.67 | 126.25 | 260 | 55.0 | 1630.0 | 4270.0 | 1780.0 | 5 | 5 | 2590.0 | 6 | 21609 | 8 |
1115 | Skoda | Rapid | Skoda Rapid | Onyx Mt Diesel | Sedan | Diesel | Injection | Manual | FWD (Front Wheel Drive) | 1498 | 4 | 21.13 | 108.50 | 250 | 55.0 | 1466.0 | 4413.0 | 1699.0 | 4 | 5 | 2552.0 | 2 | 16220 | 4 |
573 | Hyundai | Verna | Hyundai Verna | 1.6 Crdi Sx | Sedan | Diesel | Injection | Manual | FWD (Front Wheel Drive) | 1582 | 4 | 23.90 | 126.25 | 260 | 45.0 | 1445.0 | 4440.0 | 1729.0 | 4 | 5 | 2600.0 | 2 | 16415 | 4 |
190 | Ford | Aspire | Ford Aspire | 1.5 Tdci Titanium Plus | Sedan | Diesel | Injection | Manual | FWD (Front Wheel Drive) | 1498 | 4 | 26.10 | 98.63 | 215 | 40.0 | 1525.0 | 3995.0 | 1704.0 | 4 | 5 | 2490.0 | 6 | 12073 | 5 |
220 | Toyota | Glanza | Toyota Glanza | V | Hatchback | Petrol | Injection | Manual | FWD (Front Wheel Drive) | 1197 | 4 | 21.01 | 80.88 | 113 | 37.0 | 1540.0 | 3995.0 | 1745.0 | 5 | 5 | 2520.0 | 2 | 10614 | 1 |
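One thing to keep in mind about the clustering code above: k-means works on Euclidean distances, so columns with large values (price, length, wheelbase) weigh far more than columns such as mileage or airbags when the features are left unscaled. The sketch below standardizes the features first. This is a variation I am suggesting, not what the code above does, and `n_clusters=2` is used only because the toy input has five rows.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# price, power and mileage of the five sampled rows shown above.
cars = pd.DataFrame({
    "car": ["Hyundai Creta", "Skoda Rapid", "Hyundai Verna", "Ford Aspire", "Toyota Glanza"],
    "price": [21609, 16220, 16415, 12073, 10614],
    "power": [126.25, 108.50, 126.25, 98.63, 80.88],
    "mileage": [19.67, 21.13, 23.90, 26.10, 21.01],
})
features = ["price", "power", "mileage"]

# Rescale every feature to zero mean and unit variance so that
# no single column dominates the distance computation.
scaled = StandardScaler().fit_transform(cars[features])

# Assumption: 2 clusters, chosen only to suit this 5-row example.
km = KMeans(n_clusters=2, n_init=20, max_iter=400, random_state=0)
labels = km.fit_predict(scaled)
cars["cluster"] = labels + 1
```

Comparing the clusters with and without scaling is a quick way to check how much the result depends on price alone.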
Now we look at some scatter plots again, this time with the clusters added.
Price vs power
Power vs mileage
Engine size vs fuel tank
3D plot:
Checking the average price of each cluster.
Checking how many cars exist in each cluster.
Number of cars in each cluster.
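Both quantities can come from a single group-by. A minimal sketch, using the five sampled rows shown earlier, with the `price` and `cluster` column names from that table:

```python
import pandas as pd

# car, price and cluster of the five sampled rows shown earlier.
cars = pd.DataFrame({
    "car": ["Hyundai Creta", "Skoda Rapid", "Hyundai Verna", "Ford Aspire", "Toyota Glanza"],
    "price": [21609, 16220, 16415, 12073, 10614],
    "cluster": [8, 4, 4, 5, 1],
})

# Average price and number of cars per cluster, cheapest cluster first.
per_cluster = (
    cars.groupby("cluster")["price"]
    .agg(avg_price="mean", n_cars="count")
    .sort_values("avg_price")
)
```

Sorting by average price makes it easy to read the clusters as rough price tiers.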
First we find the cluster of the Toyota Corolla (and its variants).
We found that the Corolla falls in cluster 1 and also in cluster 5. We can now search these clusters and see what is interesting about them.
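The lookup can be sketched as follows. The Glanza, Aspire and Rapid rows come from the sample shown earlier. The two Corolla rows are placeholders I added to mirror the finding above (clusters 1 and 5), and the exact model string in the dataset is an assumption, which is why the match is a case-insensitive substring search.

```python
import pandas as pd

cars = pd.DataFrame({
    "make": ["Toyota", "Ford", "Skoda", "Toyota", "Toyota"],
    # The "Corolla" rows are placeholders, not copied from the dataset.
    "model": ["Glanza", "Aspire", "Rapid", "Corolla", "Corolla"],
    "cluster": [1, 5, 4, 1, 5],
})

# Rows whose model name mentions the Corolla, regardless of case or suffix.
is_corolla = cars["model"].str.contains("corolla", case=False)

# Clusters that contain at least one Corolla variant.
corolla_clusters = sorted(int(c) for c in cars.loc[is_corolla, "cluster"].unique())

# Everything else in those clusters: the candidate competitors.
peers = cars[cars["cluster"].isin(corolla_clusters) & ~is_corolla]
```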
First we look at a sample of the cars in these clusters.
Here is a more interactive chart that shows car prices across all variants (with the maximum and minimum price of each car).
Count of each body type in the targeted cluster
(here we have taken the Corolla cluster for our analysis).
It seems there are a lot of SUVs in the Toyota clusters. Is that important?
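One way to approach that question is to compare each body type's share inside the target clusters with its share in the whole dataset. If SUVs are simply common everywhere, a high count in the cluster is not remarkable. The rows below are made-up placeholders used only to show the calculation; they are not counts from the dataset.

```python
import pandas as pd

# Placeholder rows for illustration only, not taken from the dataset.
cars = pd.DataFrame({
    "body_type": ["SUV", "SUV", "SUV", "Sedan", "Hatchback",
                  "SUV", "Sedan", "Hatchback", "Hatchback", "Sedan"],
    "cluster": [1, 5, 1, 5, 1, 8, 4, 2, 2, 4],
})

# The Corolla clusters found above.
target = cars[cars["cluster"].isin([1, 5])]

# Share of each body type overall and inside the target clusters.
overall_share = cars["body_type"].value_counts(normalize=True)
target_share = target["body_type"].value_counts(normalize=True)

# Ratio above 1 means the body type is over-represented in the target clusters.
lift = (target_share / overall_share).dropna().sort_values(ascending=False)
```

A ratio well above 1 for SUVs would suggest that the Corolla really does compete in an SUV-heavy segment. A ratio near 1 would mean the count only reflects how common SUVs are in the data.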