Ekta Aggarwal

Reading files / importing data in R

In this tutorial we will look at how to read / import datasets of various formats.

Let us first set up our working directory using setwd( )

setwd("C:\\Users\\Ekta\\Importing data in R")
getwd()
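
As a small, hedged illustration of the same idea: setwd( ) invisibly returns the previous working directory, so it can be stored and restored later. The sketch below uses tempdir( ) only as a stand-in folder that exists on every machine; it is not the tutorial's folder.

```r
# Switch to a stand-in folder (tempdir() is used only because it always exists)
old_wd <- setwd(tempdir())   # setwd() invisibly returns the previous directory

# Confirm where we are now
print(getwd())

# Go back to the directory we started in
setwd(old_wd)
```

Forward slashes, as in "C:/Users/Ekta/Importing data in R", are also accepted in paths on Windows, which avoids the doubled backslashes.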

Reading a csv file


To read a file with a .csv extension we use the read.csv( ) function.

By default header = T, which means the first row is treated as the header row.

data1 = read.csv("Employee_info.csv",header = T)
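
To try this without the tutorial's file, here is a minimal self-contained sketch. It writes a tiny CSV to a temporary file and reads it back; the column names and values are invented for illustration and are not the contents of Employee_info.csv.

```r
# Hypothetical sample data written to a temporary file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Dept,Salary",
             "Asha,HR,52000",
             "Ravi,IT,61000"), csv_path)

# header = TRUE: the first row supplies the column names
emp <- read.csv(csv_path, header = TRUE)

str(emp)            # 2 observations of 3 variables
print(names(emp))   # "Name" "Dept" "Salary"
```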

Defining NAs

Suppose you know that certain values in your data, such as 999, should be treated as NA. You can declare them using na.strings = "999"

data2 = read.csv("Employee_info.csv",header = T,na.strings = "999")
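
na.strings also accepts a character vector, so several missing-value codes can be declared at once. The following is a rough sketch on made-up data (the columns and the choice of codes are assumptions for the example):

```r
# Hypothetical data in which 999 marks a missing value
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Name,Age,Salary",
             "Asha,999,52000",
             "Ravi,34,999"), csv_path)

# Treat "999", "NA" and empty fields as missing
emp_na <- read.csv(csv_path, header = TRUE, na.strings = c("999", "NA", ""))

print(is.na(emp_na$Age))        # TRUE FALSE
print(colSums(is.na(emp_na)))   # Name 0, Age 1, Salary 1
```

Only fields that match a code exactly are converted, so a value such as 52000 is left untouched.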

Note that the class of datasets imported using read.csv( ) is data.frame

class(data1)
class(data2)

Reading a table


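To read a table, we use the read.table( ) function.


We define our separator manually using the sep parameter. Unlike read.csv( ), read.table( ) does not assume a header row by default (a header is inferred only when the first row has one fewer field than the remaining rows), so we pass header = T to have the first row treated as the header row.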

data4  = read.table("Employee_info.txt",sep = ",",header = T)

Note that the class of datasets imported using read.table( ) is also data.frame

class(data4)
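
The effect of the header argument in read.table( ) is easy to check on a small file. This is a minimal sketch using invented, pipe-separated data; when header is not supplied and every row has the same number of fields, read.table( ) reads the first row as data and names the columns V1, V2, and so on.

```r
# Hypothetical pipe-separated file
txt_path <- tempfile(fileext = ".txt")
writeLines(c("Name|Dept|Salary",
             "Asha|HR|52000",
             "Ravi|IT|61000"), txt_path)

# header = TRUE: first row becomes the column names
with_header <- read.table(txt_path, sep = "|", header = TRUE)
print(names(with_header))   # "Name" "Dept" "Salary"

# header not supplied: first row is read as data
no_header <- read.table(txt_path, sep = "|")
print(names(no_header))     # "V1" "V2" "V3"
print(nrow(no_header))      # 3
```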

read.delim( )

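read.delim( ) and read.delim2( ) are wrappers around read.table( ) for delimited text files. Both assume a header row and use a tab as the default separator; read.delim2( ) additionally treats a comma as the decimal mark. If your file uses a different separator, pass it through the sep parameter.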
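
A short sketch of both functions on made-up tab-separated data (the file contents are assumptions for the example):

```r
# Tab-separated file with a decimal point
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("Name\tScore", "Asha\t7.5", "Ravi\t8.25"), tsv_path)
tab_data <- read.delim(tsv_path)      # defaults: sep = "\t", header = TRUE, dec = "."

# Same layout, but with a decimal comma
tsv2_path <- tempfile(fileext = ".txt")
writeLines(c("Name\tScore", "Asha\t7,5", "Ravi\t8,25"), tsv2_path)
tab_data2 <- read.delim2(tsv2_path)   # defaults: sep = "\t", header = TRUE, dec = ","

# A different separator is passed through sep
pipe_path <- tempfile(fileext = ".txt")
writeLines(c("Name|Score", "Asha|7.5"), pipe_path)
pipe_data <- read.delim(pipe_path, sep = "|")

print(tab_data$Score)    # 7.50 8.25
print(tab_data2$Score)   # 7.50 8.25
```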



Reading an excel file


Method 1: Using readxl package

Using the read_excel function from library readxl one can import Excel files.

library(readxl)
data7= read_excel("Employee_info.xlsx",sheet = 1)

sheet = 1 denotes that the first sheet should be imported.


Method 2: Using openxlsx package

Using the read.xlsx function from library openxlsx one can import Excel files.

library("openxlsx")
data8 = read.xlsx('Employee_info.xlsx', sheet = 1)

sheet = 1 denotes that the first sheet should be imported.
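
The sheet argument of read.xlsx also accepts a sheet name. The sketch below is a self-contained round trip, assuming openxlsx is installed: it writes a two-sheet workbook to a temporary file, lists the sheet names, and reads one sheet by name. The sheet names and data are invented for the example.

```r
if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Hypothetical two-sheet workbook written to a temporary file
  xlsx_path <- tempfile(fileext = ".xlsx")
  sheets <- list(
    Staff = data.frame(Name = c("Asha", "Ravi"), Salary = c(52000, 61000)),
    Depts = data.frame(Dept = c("HR", "IT"))
  )
  openxlsx::write.xlsx(sheets, xlsx_path)   # a named list gives one sheet per element

  # List the sheets, then import one of them by name
  sheet_names <- openxlsx::getSheetNames(xlsx_path)
  print(sheet_names)                        # "Staff" "Depts"
  depts <- openxlsx::read.xlsx(xlsx_path, sheet = "Depts")
  print(depts)
}
```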


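Most suitable way for txt and csv files: data.table way


The fread( ) function of data.table is highly convenient for importing delimited text files such as txt and csv. It automatically detects the separator, and the result is a data.table (which is also a data.frame). Note that fread( ) does not read xlsx files; use readxl or openxlsx for those, as shown above.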

install.packages("data.table")
library(data.table)
data3 = fread("Employee_info.csv")
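
To see the separator detection at work, here is a minimal sketch on an invented semicolon-separated file, assuming data.table is installed. It also shows the data.table = FALSE argument, which makes fread( ) return a plain data.frame.

```r
if (requireNamespace("data.table", quietly = TRUE)) {
  # Hypothetical semicolon-separated file
  semi_path <- tempfile(fileext = ".txt")
  writeLines(c("Name;Dept;Salary",
               "Asha;HR;52000",
               "Ravi;IT;61000"), semi_path)

  # No sep argument: fread() works out the separator itself
  dt <- data.table::fread(semi_path)
  print(class(dt))   # "data.table" "data.frame"

  # Ask for a plain data.frame instead
  df <- data.table::fread(semi_path, data.table = FALSE)
  print(class(df))   # "data.frame"
}
```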

Reading a SAS dataset


A SAS dataset can be imported using the read_sas function from library haven.

library(haven)
data5 = read_sas("filename.sas7bdat")

Reading an SPSS file


Method 1: Using library haven

library(haven)
data_spss = read_spss("filename.sav")

Method 2: Using library foreign

Using the read.spss( ) function from library foreign one can easily import SPSS datasets.

install.packages("foreign")
library(foreign)
data_spss <- read.spss("filename.sav", to.data.frame=TRUE)

To save the dataset as a data frame one needs to specify to.data.frame = TRUE.

Also, if you don't want the columns containing value labels to be converted into factors, you should specify use.value.labels = FALSE.
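
A sketch of that call, reusing the tutorial's placeholder file name "filename.sav". The file.exists( ) guard is an addition for the example, so that the snippet runs harmlessly when no such file is present.

```r
sav_path <- "filename.sav"   # placeholder path, replace with your own file

if (requireNamespace("foreign", quietly = TRUE) && file.exists(sav_path)) {
  # Keep the underlying codes instead of converting labelled values to factors
  data_spss_raw <- foreign::read.spss(sav_path,
                                      to.data.frame = TRUE,
                                      use.value.labels = FALSE)
  str(data_spss_raw)
}
```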


Reading a Stata file


Using the read.dta function from library foreign one can import Stata datasets.

library(foreign)
file_stata = read.dta("filename.dta")
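
Since foreign can also write Stata files, read.dta can be tried without an existing dataset. This is a minimal round-trip sketch with invented data, assuming foreign is installed (it normally ships with R):

```r
if (requireNamespace("foreign", quietly = TRUE)) {
  # Hypothetical data written to a temporary .dta file
  dta_path <- tempfile(fileext = ".dta")
  toy <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))
  foreign::write.dta(toy, dta_path)

  # Read it back
  toy_stata <- foreign::read.dta(dta_path)
  str(toy_stata)
}
```

read.dta supports the older Stata file formats (versions 5 to 12); for files saved by newer Stata versions, read_dta from library haven is an alternative.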
