Installation of the data.table Package in R (3 Examples)
In this tutorial you’ll learn how to install and load the data.table package in the R programming language.
Installation and Loading of data.table
First, we install the package from CRAN.
install.packages("data.table")    # Install data.table package
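If the installation line sits in a script that is run repeatedly, it can be guarded so that the package is only downloaded when it is missing. The following is a minimal sketch of that idea, not part of the original example; it relies on base R's requireNamespace() and assumes a CRAN mirror is configured.

```r
# Install data.table only if it is not already available
if (!requireNamespace("data.table", quietly = TRUE)) {
  install.packages("data.table")
}

# Show which version is installed
packageVersion("data.table")
```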
Next, we load the package.
library("data.table")             # Load data.table package
Preparing the Examples
To illustrate, we use the iris data set that ships with R.
data(iris)                        # Load iris data set
head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
We copy the iris data, which is a data.frame, and convert the copy to a data.table by reference.
iris_dt <- data.table::copy(iris) # Replicate iris data set
setDT(iris_dt)                    # Convert the copy to a data.table
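As an alternative to copying and then converting by reference, as.data.table() returns a new data.table and leaves the original data.frame untouched. The snippet below is a small sketch of that variant; the name iris_dt2 is chosen here for illustration only.

```r
library(data.table)

# as.data.table() returns a new data.table; iris itself stays a data.frame
iris_dt2 <- as.data.table(iris)

is.data.table(iris_dt2)  # check the class of the new object
is.data.table(iris)      # check that the original was not converted
```

Either route should give an object that can be used in the examples that follow.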
Example 1: Address Certain Rows
As an example of data manipulation in data.table, we select the first three rows of the iris data.table.
iris_dt[1:3, ]                    # Rows 1 to 3
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1:          5.1         3.5          1.4         0.2  setosa
# 2:          4.9         3.0          1.4         0.2  setosa
# 3:          4.7         3.2          1.3         0.2  setosa
We can also select all rows for which the variable Species is equal to virginica.
# Head of all rows where variable "Species" equals "virginica"
head(iris_dt[Species == "virginica", ])
#    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
# 1:          6.3         3.3          6.0         2.5 virginica
# 2:          5.8         2.7          5.1         1.9 virginica
# 3:          7.1         3.0          5.9         2.1 virginica
# 4:          6.3         2.9          5.6         1.8 virginica
# 5:          6.5         3.0          5.8         2.2 virginica
# 6:          7.6         3.0          6.6         2.1 virginica
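Row conditions can also be combined with the usual logical operators. The following sketch, which goes slightly beyond the original example, keeps only virginica rows whose Sepal.Length exceeds 7; the threshold and the name big_virginica are arbitrary choices for illustration.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# Combine two row conditions with & inside the first argument of [ ]
big_virginica <- iris_dt[Species == "virginica" & Sepal.Length > 7]

# Number of rows that satisfy both conditions
nrow(big_virginica)
```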
Example 2: Address Certain Rows and Columns
In this example, we select both rows and columns of a data.table. We extract the values of the variables Sepal.Length and Petal.Length for those rows in which the variable Species is equal to virginica.
# Head of columns "Sepal.Length" and "Petal.Length" for all rows where "Species" equals "virginica"
head(iris_dt[Species == "virginica", list(Sepal.Length, Petal.Length)])
#    Sepal.Length Petal.Length
# 1:          6.3          6.0
# 2:          5.8          5.1
# 3:          7.1          5.9
# 4:          6.3          5.6
# 5:          6.5          5.8
# 6:          7.6          6.6
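Two common variations on the column selection above are sketched here. Inside a data.table, .() is an alias for list(), and column names stored in a character vector can be used with the .. prefix, assuming a data.table version that supports it. The names sel_a, sel_b and cols are introduced for illustration only.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# .() is shorthand for list() inside the j argument
sel_a <- iris_dt[Species == "virginica", .(Sepal.Length, Petal.Length)]

# Column names held in a character vector, looked up with the .. prefix
cols <- c("Sepal.Length", "Petal.Length")
sel_b <- iris_dt[Species == "virginica", ..cols]

# Both selections should contain the same columns and rows
all.equal(sel_a, sel_b)
```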
Example 3: Using the By-Group Argument
As a last example, we illustrate grouping in data.table with the by argument. With the following line of code, we calculate the mean of Petal.Width for each unique value of the variable Species.
# Mean values of "Petal.Width" by unique values of "Species"
iris_dt[, mean(Petal.Width), by = Species]
#       Species    V1
# 1:     setosa 0.246
# 2: versicolor 1.326
# 3:  virginica 2.026
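The result column above receives the default name V1. As a small extension of the example, the sketch below names the aggregated column and adds the group size via .N; the names summary_dt, mean_petal_width and n are chosen here for illustration.

```r
library(data.table)

iris_dt <- as.data.table(iris)

# Name the summary column explicitly and count the rows per group with .N
summary_dt <- iris_dt[, .(mean_petal_width = mean(Petal.Width), n = .N),
                      by = Species]

summary_dt
```

Naming the columns in j tends to make the result easier to use in later steps than relying on V1.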
Note: This article was created in collaboration with Anna-Lena Wölwer. Anna-Lena is a researcher and programmer who creates tutorials on statistical methodology as well as on the R programming language. You may find more info about Anna-Lena and her other articles on her profile page.