Cars Analysis
This article demonstrates several common operations in R using the mtcars dataset, which ships with R as part of the 'datasets' package.
The mtcars data was extracted from the 1974 Motor Trend US magazine. It records fuel consumption and ten aspects of design and performance for thirty-two automobiles (1973-74 models).
The code used in this article was based on the ggplot manual.
1. Coding with the mtcars dataset
Add ggplot2 to the list of libraries available to this R runtime.
library(ggplot2)
Create factors with value labels.
mtcars$gear <- factor(mtcars$gear, levels = c(3, 4, 5), labels = c("3gears", "4gears", "5gears"))
mtcars$am <- factor(mtcars$am, levels = c(0, 1), labels = c("Automatic", "Manual"))
mtcars$cyl <- factor(mtcars$cyl, levels = c(4, 6, 8), labels = c("4cyl", "6cyl", "8cyl"))
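One caveat is worth illustrating: because the recoding overwrites columns of mtcars in place, running it a second time matches the already-labelled values against the numeric levels and turns every entry into NA. The sketch below is one way to make the step safe to re-run. The name mtcars_labelled is a hypothetical working copy, not something used elsewhere in this article.

```r
# Work on a fresh copy of the built-in data (the name is illustrative).
mtcars_labelled <- datasets::mtcars

# Recode the numeric codes as labelled factors, as in the step above.
mtcars_labelled$gear <- factor(mtcars_labelled$gear, levels = c(3, 4, 5),
                               labels = c("3gears", "4gears", "5gears"))
mtcars_labelled$am <- factor(mtcars_labelled$am, levels = c(0, 1),
                             labels = c("Automatic", "Manual"))
mtcars_labelled$cyl <- factor(mtcars_labelled$cyl, levels = c(4, 6, 8),
                              labels = c("4cyl", "6cyl", "8cyl"))

# Count rows per level; an NA entry here would mean a value matched no level.
gear_counts <- table(mtcars_labelled$gear, useNA = "ifany")
print(gear_counts)
```

Starting from datasets::mtcars each time means the result does not depend on how often the cell has been executed.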
The data can be printed directly as a table by returning it as the last value in a cell.
mtcars
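Outside a notebook cell, or when the full table is more than is needed, base R offers more compact views of the same data. A minimal sketch using only base R functions:

```r
# Dimensions: number of rows (cars) and number of columns (variables).
print(dim(mtcars))

# First few rows; the car names are stored as row names.
print(head(mtcars))

# Column types at a glance, useful for confirming which columns are factors.
str(mtcars)
```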
2. Gas Mileage Visualized in Plotly
2.1. Density Distribution
Here we visualize the distribution of average gas mileage (in mpg, miles per gallon) across the cars, grouped by the number of gears in the car's transmission.
Although the plot is built with ggplot, it is wrapped in the corresponding plotly function, ggplotly(), to make the resulting figure interactive. Interactive features provided by plotly include tooltips showing the values of data points on the plot and the ability to show or hide elements of the graph by clicking the corresponding entry in the plot legend.
# Build the static ggplot object first.
p <- ggplot(mtcars, aes(mpg, fill = gear)) +
  geom_density(alpha = 0.5) +
  labs(title = "Distribution of Gas Mileage", x = "Miles Per Gallon", y = "Density")
# Convert it to an interactive plotly figure (requires the plotly package to be installed).
plotly::ggplotly(p)
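To make concrete what the density curves represent, the sketch below computes a kernel density estimate of mpg for each gear group with stats::density(). It is assumed to approximate what geom_density() draws with its default settings, and the variable names are illustrative.

```r
# Split mpg values by number of gears (works whether gear is numeric or a factor).
mpg_by_gear <- split(mtcars$mpg, mtcars$gear)

# Kernel density estimate per group, using density()'s default Gaussian kernel.
dens_by_gear <- lapply(mpg_by_gear, density)

# Approximate the area under each curve with a simple Riemann sum.
areas <- sapply(dens_by_gear, function(d) sum(d$y) * diff(d$x[1:2]))
print(round(areas, 2))
```

Each curve encloses an area close to 1 regardless of how many cars are in the group, so the plot compares the shape of the mileage distribution across gear groups, not the number of cars in each group.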
2.2. Regression Plots
In this example, we plot separate regressions of mpg on weight for each number of cylinders.
Points and fitted lines are colored by cylinder count, and the shaded band around each line is the 95% confidence interval that geom_smooth() draws by default.
The plot is built using standard ggplot conventions.
# wt is recorded in units of 1000 lbs.
ggplot(mtcars, aes(wt, mpg, color = cyl)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Regression of MPG on Weight", x = "Weight (1000 lbs)", y = "Miles per Gallon", color = "Cylinders")
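The fitted lines can also be inspected numerically. As a minimal sketch, the code below fits the same kind of model, a linear regression of mpg on wt, separately within each cylinder group using base R's lm(). The names fits and coefs are illustrative.

```r
# Fit mpg ~ wt separately within each cylinder group.
fits <- lapply(split(mtcars, mtcars$cyl), function(d) lm(mpg ~ wt, data = d))

# One row per group: intercept and slope (change in mpg per 1000 lbs of weight).
coefs <- t(sapply(fits, coef))
print(round(coefs, 2))
```

A negative slope in every row means that, within each cylinder group, heavier cars tend to get fewer miles per gallon.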
2.3. Box Plots
In this example we show box plots of mpg, grouped by number of gears.
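The observations (points) are overlaid on the boxes and jittered horizontally for visibility.
ggplot(data = mtcars, aes(gear, mpg, fill = gear)) +
  geom_boxplot() +
  # Jitter only along the x axis so the plotted mpg values stay exact.
  geom_jitter(width = 0.2, height = 0) +
  labs(title = "Mileage by Gear Number", x = "", y = "Miles Per Gallon")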
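The numbers behind each box can be recovered directly. The sketch below computes the quartiles of mpg within each gear group using base R's quantile(). It assumes the box edges correspond to the 25th and 75th percentiles and the middle line to the median, which is how geom_boxplot() is documented to draw them. The name mpg_quartiles is illustrative.

```r
# Quartiles of mpg within each gear group; rows are 0%, 25%, 50%, 75%, 100%.
mpg_quartiles <- sapply(split(mtcars$mpg, mtcars$gear), quantile)
print(mpg_quartiles)
```

The 0% and 100% rows are the group minimum and maximum, which match the whisker ends only when a group has no points beyond 1.5 times the interquartile range from the box.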