2.R Concepts - BDSM - Oct2020 PDF
Sandip Mukhopadhyay
What is R?
S Version 1
S Version 2
S Version 3
S Version 4
Developed 30 years ago for research
applied to the high-tech industry
HISTORY AND EVOLUTION OF R
The regular development of R
1990s: R developed concurrently with S
1993: R made public
Acceleration of R development
R-help and R-devel mailing lists
Creation of the R Core Group
HISTORY AND EVOLUTION OF R
• Great visualization
• Advanced statistics
Reasons to learn R
• Lack of scalability
packageDescription("ggplot2")
help(package = "ggplot2")
find.package("ggplot2")
install.packages("ggplot2")
Some basics about R coding
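• Parentheses ( ) enclose function arguments and group expressions, while
square brackets [ ] index into objects such as vectors, matrices and data
frames. Curly braces { } group statements into blocks, for example in
function bodies, loops and if statements.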
Functions and Help in R
• There are over 1,000 functions at the core of R, and new R functions
are created all the time.
• Each R function comes with its own help page. To access a function’s
help page, type a question mark followed by the function’s name in
the console.
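A minimal sketch of the help lookup described above, using the base function mean() as an arbitrary example. help() is the long form of the question-mark shortcut, and args() is shown as a quick alternative when only the argument list is needed.

```r
# Open the help page for mean() with the question-mark shortcut
?mean

# help() is the equivalent long form
help("mean")

# args() prints a function's argument list without opening the help page
args(mean)
```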
Reference materials / other R resources
1. R-blogs : https://www.r-bloggers.com
2. R tutorials : https://www.programiz.com/r-programming/
3. R Video book : https://www.r-bloggers.com/in-depth-introduction-to-machine-learning-in-15-hours-of-expert-videos/
4. Stack Overflow
5. RPubs
Operators in R
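A brief sketch of some standard R operators, grouped as assignment, arithmetic, relational and logical. The variable names a and b and their values are illustrative only.

```r
a <- 7   # assignment with <-
b <- 2

# Arithmetic operators
a + b    # addition
a - b    # subtraction
a * b    # multiplication
a / b    # division
a ^ b    # exponentiation
a %% b   # remainder (modulus)
a %/% b  # integer division

# Relational operators return TRUE or FALSE
a > b
a == b
a != b

# Logical operators
(a > 5) & (b > 5)   # AND
(a > 5) | (b > 5)   # OR
!(a > 5)            # NOT
```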
Commonly used functions in R
Summary : what we have learnt
Example:
f <- 3 # numeric
f
g <- "US" # text
g
h <- TRUE # logical
h
Types of Data Structure in R : Vector
The c function (c is short for combine) creates a new vector consisting of three
values: 4, 7, and 8.
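As a small sketch of that call (the name x is an assumption, chosen only for illustration):

```r
# Combine three values into a numeric vector
x <- c(4, 7, 8)
x          # prints the vector
length(x)  # number of elements
x[2]       # second element; R indexing starts at 1
```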
Vectors
A vector cannot hold values of different data types.
If integer, string and boolean values are placed together
in one vector, R silently coerces them all to a common
type (here, character).
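A short sketch of what happens when the three types are combined; the name v and the values are illustrative assumptions.

```r
# Mixing an integer, a string and a logical in one vector
v <- c(1L, "a", TRUE)
v         # all three elements are now character strings
class(v)  # "character"
```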
Example:
vector <- c(1,2,3,4)
f <- matrix(vector, nrow=2, ncol=2)
f
     [,1] [,2]
[1,]    1    3
[2,]    2    4
Matrices
To access the 2nd column of the matrix, simply provide the column number and
omit the row number.
To access the 2nd and 3rd columns of the matrix, simply provide the column
numbers and omit the row number.
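The column access described above can be sketched as follows, assuming a 3 x 3 matrix named m (both the name and the values are made up for illustration):

```r
# A 3 x 3 matrix filled column by column with 1..9
m <- matrix(1:9, nrow = 3, ncol = 3)

m[, 2]      # 2nd column: 4 5 6
m[, 2:3]    # 2nd and 3rd columns, returned as a 3 x 2 matrix
m[1, ]      # for comparison, the 1st row: 1 4 7
```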
Types of Data Structure in R : Arrays
Arrays - Similar to matrices; these can have more than two dimensions.
a <- matrix(c(1,1,1,1) , 2, 2)
b <- matrix(c(2,2,2,2) , 2, 2)
x <- array(c(a,b), c(2,2,2))
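Repeating the definitions of a, b and x from the lines above, this sketch checks the dimensions of the resulting array and pulls out one layer and one element.

```r
a <- matrix(c(1, 1, 1, 1), 2, 2)
b <- matrix(c(2, 2, 2, 2), 2, 2)
x <- array(c(a, b), c(2, 2, 2))

dim(x)       # 2 2 2
x[, , 1]     # first layer, equal to a
x[, , 2]     # second layer, equal to b
x[1, 2, 2]   # row 1, column 2 of the second layer: 2
```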
Types of Data Structure in R : Data frames
Data frames - These are the most commonly used data structures in R.
A data frame is similar to a matrix, but its columns can hold different
modes of data, such as numeric and character.
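A minimal sketch of a data frame with mixed column types; the name df, the column names and the values are assumptions for illustration.

```r
# A small data frame with one numeric and one character column
df <- data.frame(
  id   = c(1, 2, 3),
  name = c("A", "B", "C"),
  stringsAsFactors = FALSE  # keep 'name' as character in every R version
)

df
sapply(df, class)  # id is "numeric", name is "character"
df$name            # access a column by name
df[2, ]            # access the second row
```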
Lists - These are the most complex data structures. A list may contain a
combination of vectors, matrices, data frames, and even other lists.
Example:
vec <- c(1,2,3,4)
mat <- matrix(vec,2,2)
x <- list (vec, mat)
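Repeating the list built in the example above, elements can be pulled out with double square brackets, as this sketch shows.

```r
vec <- c(1, 2, 3, 4)
mat <- matrix(vec, 2, 2)
x <- list(vec, mat)

x[[1]]         # first element: the vector
x[[2]]         # second element: the matrix
x[[2]][1, 2]   # row 1, column 2 of the matrix: 3
length(x)      # number of top-level elements: 2
```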
Data Frame Access
• dim()
The dim() function returns the dimensions of a data frame (rows, then columns).
• nrow()
The nrow() function returns the number of rows in a data frame.
• ncol()
The ncol() function returns the number of columns in a data frame.
• str()
The str() function compactly displays the internal structure of an R object.
• summary()
The summary() function returns summary statistics for each column of the
dataset.
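The functions listed above can be tried on mtcars, a data set that ships with R; it is used here only as a convenient stand-in for any data frame.

```r
# mtcars is a data set bundled with R (package 'datasets')
dim(mtcars)      # 32 11
nrow(mtcars)     # 32
ncol(mtcars)     # 11
str(mtcars)      # one line per column: name, type, first few values
summary(mtcars)  # min, quartiles, mean and max for each numeric column
```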
Few R functions for understanding data in data frames
• head()
The head() function returns the first n observations, where n is 6 by
default.
• tail()
The tail() function returns the last n observations, where n is 6 by
default.
• edit()
The edit() function invokes an editor on the R object and returns the edited
copy; the original object changes only if the result is assigned back to it.
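A short sketch of head() and tail() on the built-in mtcars data set (an assumption for illustration; any data frame works). edit() needs an interactive session, so it appears only as a comment.

```r
head(mtcars)      # first 6 rows
head(mtcars, 3)   # first 3 rows
tail(mtcars)      # last 6 rows
tail(mtcars, 2)   # last 2 rows

# edit() opens an interactive editor, so it is shown here but not run
# mtcars_edited <- edit(mtcars)
```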
Import and export of data in R
Reading Spreadsheets
read.xlsx("filename", ...)
where the filename argument gives the path of the file to be read and the
dots "..." stand for the other optional arguments.
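read.xlsx() is not part of base R. As a hedged sketch, assuming the openxlsx package (one of several add-on packages that provide a function of this name) is installed, a small data frame can be written to a spreadsheet and read back. The data frame scores and the temporary file path are made up for illustration.

```r
# Hypothetical example data
scores <- data.frame(id = c(1, 2, 3), name = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

# Run only if the openxlsx package is available (an assumption of this sketch)
if (requireNamespace("openxlsx", quietly = TRUE)) {
  path <- tempfile(fileext = ".xlsx")           # temporary file, illustrative only
  openxlsx::write.xlsx(scores, path)            # export to a spreadsheet
  back <- openxlsx::read.xlsx(path, sheet = 1)  # import the first sheet
  print(back)
}
```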
Working with directory
getwd()
getwd() command returns the absolute filepath of the current working
directory.
setwd()
setwd() command changes the current working directory to the path
supplied by the user.
dir()
This function returns a character vector of the names of files or
directories in the named directory.
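The three directory functions can be combined as in this sketch, which switches to R's temporary directory and back. The file name example.txt is hypothetical, and tempdir() is used only so that the example does not touch real folders.

```r
old <- getwd()              # remember the current working directory
old

setwd(tempdir())            # switch to R's per-session temporary directory
file.create("example.txt")  # create an empty file (hypothetical name)
dir()                       # list files in the new working directory

setwd(old)                  # restore the original working directory
```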