Iris Analysis

Analysis with Nextjournal and R using the built-in Iris dataset

The Iris data comes from Ronald Fisher's 1936 paper, The Use of Multiple Measurements in Taxonomic Problems. Four features (sepal length, sepal width, petal length, and petal width, all in centimeters) were measured and recorded for three Iris species: setosa, versicolor, and virginica. The dataset contains fifty samples of each species, 150 in total.

Iris virginica

1. The data

The raw data can be viewed in a table by returning it as the final value from a cell.

iris
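Before going further, the shape of the data can be checked against the description in the introduction. The following is a small sketch using only base R; it reports the number of rows and columns, the column types, and the number of observations per species.

```r
# Dimensions: number of observations and number of variables
dim(iris)

# Column types: four numeric measurements and one factor (Species)
str(iris)

# Number of observations recorded for each species
table(iris$Species)
```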

The summary() function reports the minimum, quartiles, median, mean, and maximum of each numeric column, and the number of rows for each level of the Species factor.

summary(iris)
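Because summary() pools all three species, differences between them are hidden. As a sketch of one way to break the figures down, aggregate() from base R can compute the mean of each measurement per species. The name species_means is only illustrative and does not appear elsewhere in this notebook.

```r
# Mean of every numeric column, grouped by Species.
# The formula `. ~ Species` means "all other columns, split by Species".
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```

Swapping FUN = mean for another function such as median or sd gives the corresponding per-species statistic.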

2. Plotting

We'll use the plotting functionality built into R and attach the dataset, so that its columns can be referenced directly by name (Petal.Length rather than iris$Petal.Length).

attach(iris)

Scatter plot of petal width against petal length, colored by species, with a regression line.

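# Scatter plot of the petal measurements, one color per species
plot(Petal.Length, Petal.Width,
     col=Species, pch=19,
     main="Iris Data",
     xlab="Iris petal length",
     ylab="Iris petal width"
    )
# Overlay the least-squares fit of petal width on petal length
abline(lm(Petal.Width~Petal.Length), col="red")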
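The red line is the least-squares fit of petal width on petal length. To see the numbers behind it, the same model can be fitted and inspected directly. This is a sketch: the variable name fit is introduced here for illustration, and data = iris is passed explicitly so the snippet does not depend on attach().

```r
# Fit the same linear model that abline() draws
fit <- lm(Petal.Width ~ Petal.Length, data = iris)

# Intercept and slope of the regression line
coef(fit)

# Proportion of the variance in petal width explained by petal length
summary(fit)$r.squared
```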

Scatter matrix, with color coding by species and regression lines.

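par(xpd=TRUE)  # turn off clipping to the plot region
pairs(~Sepal.Length+Sepal.Width+Petal.Length+Petal.Width, data=iris,
      panel=function(x,y,...) {
        # pairs() has already set up each panel, so add points to it
        # instead of starting a new plot
        points(x,y,...)
        abline(lm(y~x), col="red")  # per-panel regression line
      },
      col=Species, pch=19,
      main="Iris Matrix",
      labels=c("Sepal Length", "Sepal Width", "Petal Length", "Petal Width")
     )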

legend("bottomright", fill = seq_along(levels(Species)), legend = levels(Species), bg="white")
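
The scatter matrix shows the pairwise relationships visually. As a numeric companion, the Pearson correlations between the four measurements can be computed with cor(). This is only a sketch, and cor_matrix is an illustrative name that does not appear elsewhere in this notebook.

```r
# Pearson correlation between each pair of the four numeric columns
# (column 5 is the Species factor, so it is left out)
cor_matrix <- cor(iris[, 1:4])
round(cor_matrix, 2)
```

Values close to 1 or -1 correspond to panels where the points lie close to the red regression line.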