• Ekta Aggarwal

Getting a frequency distribution in R: table function

In this tutorial we will learn how to get a frequency distribution in R using the table( ) function.


For this we will use R's built-in mtcars dataset, which gives information about 32 cars. We can view the mtcars data using the View( ) function:

View(mtcars)

Frequency Distribution for one variable:


Let us consider the variable am, which indicates whether the car is automatic or manual (0 = automatic, 1 = manual).


To find out how many automatic and manual cars there are, we can use the table( ) function. In the table function we specify the categorical column for which we want the frequency distribution:

table(mtcars$am)

Output:

 0  1
19 13


In the above output we can see that 19 is written below 0, which means there are 19 cars with am = 0. Similarly, 13 is written below 1, which means there are 13 cars with am = 1.



Note: the table( ) function for a single column returns a one-dimensional table, which behaves like a named vector. Thus, if we want to get the number of cases where am = 0, we can subset it by writing either:

table(mtcars$am)[1]

or

table(mtcars$am)["0"]

In the first command we are subsetting the first element of the vector, while in the second command we are specifying that we are looking for am = "0".

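Note that in the second command we have to enclose the 0 in quotes " ", because the names of the table are character strings. Without the quotes, [0] would be treated as a position and would return an empty result.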
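The two ways of subsetting above can be checked side by side. The following is a minimal sketch; the variable names (am_counts, by_position, by_name, n_automatic) are our own and only for illustration.

```r
# Frequency table for transmission type in the built-in mtcars data
am_counts <- table(mtcars$am)

# The result has class "table", and its names are character strings
class(am_counts)   # "table"
names(am_counts)   # "0" "1"

# Subsetting by position and by name selects the same element
by_position <- am_counts[1]
by_name     <- am_counts["0"]

# Double brackets drop the name and return just the count
n_automatic <- am_counts[["0"]]
```

Single brackets keep the label "0" attached to the count, while double brackets return only the number, which is convenient when the count is used in a later calculation.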


2-D Frequency Distribution:

Let us say we want to see the frequency distribution for 2 categorical variables together. For that we can again use the table( ) function, this time passing both columns.


Let us consider the variables cyl, which tells us the number of cylinders, and am, i.e., whether the car is automatic or manual (0 = automatic, 1 = manual):

table(mtcars$am, mtcars$cyl)

Output:

     4  6  8
  0  3  4 12
  1  8  3  2


In the above output we can see that there is now a 2 x 3 table, where the 2 rows correspond to am = 0 and am = 1 (the first argument) and the 3 columns correspond to cyl = 4, 6, and 8 (the second argument).

We can see that for am = 0 and cyl = 4, we have 3 observations. Similarly, for am = 0 and cyl = 6 we have 4 observations.
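Since the printed output does not say which variable is in the rows and which is in the columns, it can help to label the two dimensions. A minimal sketch, using the dnn argument of table( ); the object name am_by_cyl is our own choice:

```r
# Two-way frequency table with labelled dimensions:
# the first label applies to the rows, the second to the columns
am_by_cyl <- table(mtcars$am, mtcars$cyl, dnn = c("am", "cyl"))

# Printing now shows "am" next to the rows and "cyl" above the columns
print(am_by_cyl)

# The labels are stored as the names of the dimnames
names(dimnames(am_by_cyl))   # "am" "cyl"
```

The counts are the same as before; only the labelling of the rows and columns changes.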


Note: the table( ) function for 2 columns returns a two-dimensional table, which can be subset like a matrix (it is not a data frame). Thus, if we want to get the number of cases where am = 0 and cyl = 8, we can subset this table by writing either:

table(mtcars$am, mtcars$cyl)[1,3]

or

table(mtcars$am, mtcars$cyl)["0","8"]

In the first command we are selecting the first row and third column of the table, while in the second command we are specifying that we are looking for am = "0" and cyl = "8".

Note that in the second command we have to enclose these values in quotes " ", because the row and column names of the table are character strings.
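As a quick check, the object returned by table( ) for two variables can be inspected directly. This is a sketch only; the names am_cyl, n_pos, n_name, automatic_row and long_counts are hypothetical and chosen for illustration.

```r
# Two-way frequency table of transmission type by number of cylinders
am_cyl <- table(mtcars$am, mtcars$cyl)

# The object has class "table" and is not a data frame
is.data.frame(am_cyl)   # FALSE
dim(am_cyl)             # 2 3

# Position-based and name-based subsetting give the same count
n_pos  <- am_cyl[1, 3]
n_name <- am_cyl["0", "8"]

# Leaving the column index empty returns the whole row for am = 0
automatic_row <- am_cyl["0", ]

# If a data frame is needed, as.data.frame( ) converts the table
# into one row per combination, with columns Var1, Var2 and Freq
long_counts <- as.data.frame(am_cyl)
```

The converted data frame has 6 rows, one for each of the 2 x 3 combinations of am and cyl.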


