UNIT 1 R
UNIT 1
The Language: Introduction
1 Definition of R: -
5 Uses of R: -
6 Applications of R: -
8 Disadvantages of R: -
Feature | R | Python
Introduction | R is a language and environment for statistical programming, which includes statistical computing and graphics. | Python is a general-purpose programming language for data analysis and scientific computing.
10 Basics of R Programming: -
1 Syntax of R program: -
A program in R is made up of three things: variables, comments, and
keywords. Variables store data, comments improve code readability, and
keywords are reserved words that hold a specific meaning for the R
interpreter.
# My first program in R Programming
myString <- "Hello, World!"
print(myString)
In the above code, the first statement defines a string variable myString and
assigns it the string "Hello, World!". The next statement, print(), prints
the value stored in the variable myString.
print() is a function that prints values to the output screen. It also
accepts optional arguments that we can use if needed. For example, in the
above program the output is printed with quotes by default; the quotes can
be removed if needed.
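As a small sketch of the optional arguments mentioned above, the quote argument of print() controls whether the surrounding quotes are shown. The names with_quotes and without_quotes are illustrative, and capture.output() is used only so the printed text can be inspected.

```r
myString <- "Hello, World!"

# Default: character values are printed with quotes
print(myString)

# quote = FALSE drops the surrounding quotes
print(myString, quote = FALSE)

# Capture the printed lines so the two forms can be compared
with_quotes <- capture.output(print(myString))
without_quotes <- capture.output(print(myString, quote = FALSE))
```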
1.1 R Script File
An R script file is another way to write programs: the code is saved in a
text file and then executed at the command prompt with the help of the R
interpreter front end known as Rscript. We create a text file, write the
code in it, and save it with the .R extension, for example: Demo.R
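A minimal sketch of such a script is shown here. The file name Demo.R comes from the text above; the contents are an illustrative assumption that reuses the Hello World program. Once saved, the script would typically be run from the command prompt with Rscript Demo.R.

```r
# Demo.R (illustrative contents)
# Run from the command prompt with:  Rscript Demo.R
myString <- "Hello, World!"
print(myString)
```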
1.2 Comments
In R programming, comments are programmer-readable explanations in the
source code of an R program. The purpose of adding comments is to make
the source code easier to understand. Comments are ignored by the R
interpreter.
A comment starts with a #. When executing code, R ignores everything from
the # to the end of the line.
R has only single-line comments; it does not support multi-line comments.
To get the effect of a multi-line comment, we can place the text inside an
if (FALSE) block.
1.2.1 Types of Comments: -
Comments are generally classified into 3 types, of which R directly supports only the first:
1. Single-line comments
2. Multi-line comments
3. Documentation comments
1) Single-line Comments: -
These are comments that require only one line. They are usually written to
explain what a single line of code does or what it is supposed to produce,
so that they help anyone reading the source code.
2) Multi-line Comments: -
R does not support multi-line comments, but the same effect can be achieved
with the following trick:
if(FALSE) {
"This is a demo for multi-line comments and it should be put
inside either a
single OR double quote"
}
The block above is parsed by the R interpreter, but because the condition is
FALSE its body is never run, so it does not interfere with your actual program.
The comment text must be put inside either single or double quotes so that it
parses as a valid string.
Several single-line comments can also be written one after another:
# This is a comment
# written in
# more than just one line
"Hello World!"
3) Documentation Comments: -
These are comments written for quick documentation look-up.
Operators
An operator is a symbol that tells the interpreter to perform specific mathematical or logical
manipulations. The R language is rich in built-in operators and provides the following types of
operators.
Types of Operators
We have the following types of operators in R programming:
Arithmetic Operators
Relational Operators
Logical Operators
Assignment Operators
Miscellaneous Operators
Arithmetic Operators
Following table shows the arithmetic operators supported by R language. The operators
act on each element of the vector.
Operator: +
Description: Adds two vectors.
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v + t)
It produces the following result:
[1] 10.0  8.5 10.0

Operator: -
Description: Subtracts the second vector from the first.
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v - t)
It produces the following result:
[1] -6.0  2.5  2.0

Operator: *
Description: Multiplies both vectors.
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v * t)
It produces the following result:
[1] 16.0 16.5 24.0

Operator: /
Description: Divides the first vector by the second.
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v / t)
It produces the following result:
[1] 0.250000 1.833333 1.500000

Operator: ^
Description: The first vector raised to the exponent of the second vector.
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v ^ t)
It produces the following result:
[1]  256.000  166.375 1296.000
Relational Operators
Following table shows the relational operators supported by R language. Each element of
the first vector is compared with the corresponding element of the second vector. The
result of comparison is a Boolean value.
Operator: >=
Description: Checks if each element of the first vector is greater than or
equal to the corresponding element of the second vector.
Example:
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)
print(v >= t)
It produces the following result:
[1] FALSE  TRUE FALSE  TRUE
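As a brief sketch, the six relational operators in R (>, <, ==, <=, >=, !=) can all be applied element by element to the same two vectors. The result variable names below are illustrative.

```r
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)

gt <- v > t    # greater than
lt <- v < t    # less than
eq <- v == t   # equal to
le <- v <= t   # less than or equal to
ge <- v >= t   # greater than or equal to
ne <- v != t   # not equal to

print(gt)
print(eq)
print(ne)
```

Each result is a logical vector with one value per compared pair of elements.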
Logical Operators
Operator: |
Description: It is called the element-wise logical OR operator. It combines
each element of the first vector with the corresponding element of the second
vector and gives an output TRUE if one of the elements is TRUE.
Example:
v <- c(3, 0, TRUE, 2+2i)
t <- c(4, 0, FALSE, 2+3i)
print(v | t)
It produces the following result:
[1]  TRUE FALSE  TRUE  TRUE

Operator: !
Description: It is called the logical NOT operator. It takes each element of
the vector and gives the opposite logical value.
Example:
v <- c(3, 0, TRUE, 2+2i)
print(!v)
It produces the following result:
[1] FALSE  TRUE FALSE FALSE
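The logical operators && and || work on single values only: each operand must be a vector of
length one, and the result is a single TRUE or FALSE. In older versions of R they considered
only the first element of longer vectors; since R 4.3.0, passing a vector with more than one
element to && or || is an error.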
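A minimal sketch of && and || with length-one operands follows. The variable names are illustrative; the last assignment shows that the right-hand side is not evaluated when the left-hand side already decides the result.

```r
a <- TRUE
b <- FALSE

and_result <- a && b   # TRUE only if both operands are TRUE
or_result  <- a || b   # TRUE if at least one operand is TRUE

# The right-hand side is skipped when the left-hand side already
# decides the answer, so stop() is never called here
short_circuit <- FALSE && stop("not evaluated")

print(and_result)
print(or_result)
```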
Assignment Operators
These operators are used to assign values to vectors.
Operator: <- or = or <<-
Description: Called Left Assignment
Example:
v1 <- c(3, 1, TRUE, 2+3i)
v2 <<- c(3, 1, TRUE, 2+3i)
v3 = c(3, 1, TRUE, 2+3i)
print(v1)
print(v2)
print(v3)
It produces the following result:
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
Miscellaneous Operators
These operators are used for specific purposes rather than general mathematical or logical
computation.
Operator: :
Description: Colon operator. It creates the series of numbers in sequence
for a vector.
Example:
v <- 2:8
print(v)
It produces the following result:
[1] 2 3 4 5 6 7 8

Operator: %in%
Description: This operator is used to identify if an element belongs to a
vector.
Example:
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
It produces the following result:
[1] TRUE
[1] FALSE
R KEYWORDS
Keywords are reserved words in R, each of which has a specific meaning to
the interpreter. Because they are reserved, they cannot be used as variable
or function names. In R, one can view these keywords by using either
help(reserved) or ?reserved. Here is the list of keywords in R:
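if else repeat
while function for
next break TRUE
FALSE NULL Inf
NaN NA NA_integer_
NA_real_ NA_complex_ NA_character_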
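Following are some of the most important keywords along with their examples:
if: It is a decision-making statement that runs a block of code only when
the given condition is TRUE.
Example:
# Sample value (assumed for illustration)
a <- 5
# condition
if( a > 0 )
{
print("Positive Number") # Statement
}
Output:
[1] "Positive Number"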
Example:
x <- 5
Output:
[1] "5 is less than 10"
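A minimal sketch of an if-else statement consistent with the output above. The comparison against 10 and the use of paste() are assumptions chosen to reproduce that output; msg is an illustrative name.

```r
x <- 5

# Build a message depending on how x compares with 10
if (x > 10) {
  msg <- paste(x, "is greater than 10")
} else {
  msg <- paste(x, "is less than 10")
}
print(msg)
```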
while: It is a type of control statement which runs a statement or a set of
statements repeatedly as long as the given condition is true. It is an
entry-controlled loop: the test condition is checked first and only then is
the body of the loop executed, so the loop body is not executed at all if
the test condition is false at the start.
Example:
# R program to demonstrate the use of while loop
val = 1
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
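A minimal sketch of a while loop that would print the values 1 to 5 as above. The loop bound and the helper vector printed are assumptions added for illustration.

```r
val <- 1
printed <- c()   # records each value that is printed

# The condition is tested before every pass through the body
while (val <= 5) {
  print(val)
  printed <- c(printed, val)
  val <- val + 1
}
```

When the loop ends, val is 6, the first value for which the condition is false.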
repeat: It is a simple loop that runs the same statement or group of
statements repeatedly until a stop condition is encountered. A repeat loop
has no condition of its own to terminate the loop, so the programmer must
place a condition within the loop's body and use a break statement to
terminate it. If no such condition is present in the body of the repeat
loop, it will iterate infinitely.
Example:
# R program to demonstrate the use of repeat loop
val = 1
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
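A minimal sketch of a repeat loop with an explicit break, producing the values 1 to 5 as above. The stop condition and the helper vector seen are assumptions added for illustration.

```r
val <- 1
seen <- c()   # records each value that is printed

repeat {
  print(val)
  seen <- c(seen, val)
  val <- val + 1
  # Explicit stop condition: without break the loop would never end
  if (val > 5) {
    break
  }
}
```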
function: Functions are useful when you want to perform a certain task
multiple times. In R, functions are created using the function keyword.
Example:
# A simple R function to check
# Whether x is even or odd
evenOdd = function(x)
{
if (x %% 2 == 0)
return("even")
else
return("odd")
}
print(evenOdd(4))
print(evenOdd(3))
Output:
[1] "even"
[1] "odd"
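next: The next keyword skips the rest of the current iteration of a loop and
moves on to the following iteration.
Example:
# Sample values (assumed, chosen to match the output below)
val <- 6:11
# Loop
for (i in val)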
{
if (i == 8)
{
# test expression
next
}
print(i)
}
Output:
[1] 6
[1] 7
[1] 9
[1] 10
[1] 11
break: The break keyword is a jump statement that is used to terminate the
loop at a particular iteration.
Example:
# R Break Statement Example
a<-1
while (a < 10)
{
print(a)
if(a == 5)
break
a=a+1
}
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
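TRUE/FALSE: The TRUE and FALSE keywords represent the two Boolean (logical)
values in R.
# A simple R program
# to illustrate TRUE / FALSE
# Sample values
x = 4
y = 3
# Comparing the two values (assumed comparisons, chosen to match the output)
z = x > y
p = x < y
print(z)
print(p)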
Output:
[1] TRUE
[1] FALSE
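NULL: The NULL keyword represents the null object in R, which stands for an
empty or undefined value.
v = as.null(c(1, 2, 3, 4))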
print(v)
Output: NULL
Inf and NaN: The Inf and -Inf keywords mean positive and negative infinity,
whereas the NaN keyword means 'Not a Number'. In R, is.finite() and
is.infinite() return a logical vector of the same length as x, where x is
the R object to be tested, indicating which elements are finite (not
infinite and not missing) or infinite. Similarly, is.nan() indicates which
elements are NaN.
# A simple R program
# to illustrate Inf and NaN
# To check Inf
x = c(Inf, 2, 3)
print(is.finite(x))
# To check NaN
y = c(1, NaN, 3)
print(is.nan(y))
Output:
[1] FALSE TRUE TRUE
[1] FALSE TRUE FALSE
NA: NA stands for "Not Available" and is used to represent missing values.
There are also the constants NA_integer_, NA_real_, NA_complex_ and
NA_character_ for the other atomic vector types which support missing
values, and all of these are reserved words in the R language.
# A simple R program
# to illustrate NA
# To check NA
x = c(1, NA, 2, 3)
print(is.na(x))
Output:
[1] FALSE TRUE FALSE FALSE
Data type | Example | Description
Logical | TRUE, FALSE | It is a special data type for data with only two possible values, which can be construed as true/false.
Integer | 3L, 66L, 2346L | Here, L tells R to store the value as an integer.
Complex | z = 1+2i, t = 7+3i | A complex value in R has an imaginary part, written with the suffix i.
x <- 10.5
y <- 55
e = as.integer(3)
class(e)
Output: [1] "integer"
Another way of creating an integer variable is by using
the L suffix as follows
x = 5L
class(x)
Output: [1] "integer"
c) Complex Data type: -
R supports the complex data type, which covers the set of all complex numbers.
The complex data type stores numbers with an imaginary component.
sqrt(-1)
Output:
[1] NaN
Warning message:
In sqrt(-1) : NaNs produced
To avoid this warning, we coerce the value -1 into a complex value; the
result is the imaginary unit, which R writes as 0+1i.
sqrt(as.complex(-1))
Output:
[1] 0+1i
d) Character Data type: -
R supports the character data type, which covers all the alphabets and
special characters. It stores character values or strings. Strings in R can
contain alphabets, numbers, and symbols. The easiest way to denote that
a value is of character type in R is to wrap the value inside single or
double quotes.
str1 = "Sam"
class(str1)
Output: [1] "character"
We can also use the as.character() function to
convert objects into character values.
For example:
x = as.character(55.7)
print(x)
Output: [1] "55.7"
class(x)
Output: [1] "character"
e) Logical Data type: -
R has a logical data type that takes a value of either TRUE or FALSE. A
logical value is often created via a comparison between variables. Boolean
values, which have two possible values, are represented by this R data
type: FALSE or TRUE.
x=3
y=5
a=x>y
a
Output:
[1] FALSE
Three standard logical operations, AND (&), OR (|), and NOT (!), yield a
variable of the logical data type.
For example: -
x= TRUE; y = FALSE
x&y
Output:
[1] FALSE
x|y
Output:
[1] TRUE
!x
Output:
[1] FALSE
f) Raw data type: -
To save and work with data at the byte level in R, use the raw data type.
It holds a sequence of raw bytes, which enables low-level operations on
binary data. Here is an example of R's raw data type:
x <- as.raw(c(1, 2, 3, 4, 5))  # assumed values, chosen to match the output
print(x)
Output:
[1] 01 02 03 04 05
B. length() of an object: -
To get the length of the vector.
Syntax: length(object)
C. typeof() function: -
The typeof() function in R is used to return the type of the data passed
as its argument.
Syntax: typeof(x)
Parameter: x: the specified data
D. attributes() of an object: -
The attributes() function in R is used to get all the attributes of data.
This function is also used to set new attributes on data.
Syntax: attributes(x)
Parameter: x: the object whose attributes are to be accessed.
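The three functions above can be sketched together on one small vector. The vector marks and its element names are made up for illustration.

```r
# A named numeric vector (the names are stored as an attribute)
marks <- c(maths = 90, physics = 85, chemistry = 78)

n <- length(marks)          # number of elements
kind <- typeof(marks)       # internal storage type
attrs <- attributes(marks)  # list of attributes, here only names

print(n)
print(kind)
print(attrs)

# attributes() can also set attributes; this replaces the names
attributes(marks) <- list(names = c("m", "p", "c"))
print(marks)
```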
b) Lists: -
"A list is a special type of vector in which each element can be a different
type."
A list is a generic object consisting of an ordered collection of objects.
Lists are heterogeneous data structures.
These are also one-dimensional data structures.
A list can be a list of vectors, list of matrices, a list of characters and a list
of functions and so on.
# List of strings
thislist <- list("apple", "banana", "cherry")
# Print the list
thislist
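To illustrate that list elements can be of different types, here is a small sketch. The list record and its field names are hypothetical.

```r
# A list whose elements have different types
record <- list(name = "apple", price = 2.5, in_stock = TRUE,
               sizes = c(1, 2, 3))

# Class of each element, returned as a named character vector
element_types <- sapply(record, class)
print(element_types)

# Elements are accessed by position with [[ ]] or by name with $
first <- record[[1]]
sizes <- record$sizes
```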
c) Data Frames: -
A data frame is a two-dimensional, table-like data structure with the
following characteristics:
1) A data frame must have column names, and every row should have a
unique name.
2) Each column must have the identical number of items
3) Each item in a single column must be of the same data type.
4) Different columns may have different data types.
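The characteristics listed above can be checked on a small data frame. This is a sketch with made-up data; students and its column names are hypothetical, and stringsAsFactors = FALSE is set explicitly so the name column stays a character column in every R version.

```r
students <- data.frame(
  name = c("Asha", "Ravi", "Meena"),
  age  = c(20L, 21L, 19L),
  pass = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

print(students)

col_types <- sapply(students, class)  # one data type per column
dims <- dim(students)                 # number of rows, number of columns
row_ids <- rownames(students)         # unique row names, "1" "2" "3" by default
```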
d) Matrix: -
Matrix is a rectangular arrangement of numbers in rows and columns.
In a matrix, as we know rows are the ones that run horizontally and
columns are the ones that run vertically.
In R programming, matrices are two-dimensional, homogeneous data
structures.
To create a matrix in R you need to use the function called matrix().
The arguments to matrix() are the vector of elements together with the
number of rows and the number of columns you want to have in your
matrix.
Create a matrix: -
data: -The first argument of the matrix function is data. It is the input
vector which holds the data elements of the matrix.
nrow: -The second argument is the number of rows which we want to
create in the matrix.
ncol: -The third argument is the number of columns which we want to
create in the matrix.
byrow: -The byrow parameter is a logical value. If its value is TRUE, then
the input vector elements are arranged by row.
dimnames: -The dimnames parameter gives the names assigned to the rows
and columns.
matrix1<-matrix(c(11, 13, 15, 12, 14, 16),nrow =2, ncol =3, byrow = TRUE)
matrix1
Output
     [,1] [,2] [,3]
[1,]   11   13   15
[2,]   12   14   16
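As a complementary sketch, the same values can be arranged column by column (the default when byrow is not set), and rows and columns can be named through dimnames. The row and column names used here are illustrative.

```r
values <- c(11, 13, 15, 12, 14, 16)

# byrow = FALSE (the default) fills the matrix column by column
by_col <- matrix(values, nrow = 2, ncol = 3)

# dimnames takes a list: row names first, then column names
named <- matrix(values, nrow = 2, ncol = 3, byrow = TRUE,
                dimnames = list(c("r1", "r2"), c("a", "b", "c")))

print(by_col)
print(named)

# Named elements can be looked up by row and column name
middle <- named["r2", "b"]
```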
e) Arrays:-
"An array is a collection of a similar data type with contiguous memory
allocation."
In R, arrays are the data objects which allow us to store data in more
than two dimensions.
In R, an array is created with the help of the array() function.
The array() function takes a vector as input and uses the values in the
dim parameter to create the array.
R Array Syntax
There is the following syntax of R arrays:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
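A minimal sketch of the array() syntax, using made-up data: 24 values arranged as two matrices of 3 rows and 4 columns each.

```r
# 24 values arranged as 2 matrices of 3 rows and 4 columns
arr <- array(1:24, dim = c(3, 4, 2))

d <- dim(arr)            # dimensions: rows, columns, matrices
element <- arr[2, 3, 1]  # row 2, column 3 of the first matrix
corner <- arr[1, 1, 2]   # first element of the second matrix

print(d)
print(element)
print(corner)
```

Elements are filled column by column within each matrix, so the second matrix starts at the 13th value.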
f) Factors: -
A factor is a data structure used for fields which take only a predefined,
finite number of values.
In other words, it is a variable which takes a limited number of different
values.
Factors are data objects which are used to categorize the data and to
store it as levels.
They can store both integer and string values, and are useful in columns
that have a limited number of unique values.
Arguments of factor() in R Language
x: It is the vector that needs to be converted into a factor.
levels: It is the set of distinct values which the input vector x may
take.
labels: It is a character vector of labels for the levels.
exclude: It lists the values you want to exclude when forming the levels.
ordered: This logical argument decides whether the levels are
ordered.
nmax: It sets the upper limit for the maximum number of
levels.
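# Create a factor
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz",
                        "Rock", "Jazz"))
# Print the factor
print(music_genre)
Result:
[1] Jazz    Rock    Classic Classic Pop     Jazz    Rock    Jazz   
Levels: Classic Jazz Pop Rock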
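To inspect a factor such as the one above, the levels, the count per level, and the underlying integer codes can be extracted as in this sketch. The names lv, counts and codes are illustrative.

```r
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop",
                        "Jazz", "Rock", "Jazz"))

lv <- levels(music_genre)         # distinct values, sorted alphabetically
counts <- table(music_genre)      # how often each level occurs
codes <- as.integer(music_genre)  # position of each value in the levels

print(lv)
print(counts)
print(codes)
```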
12 Variables in R programming: -
A variable is a named block of memory allocated for the storage of specific
data; the name associated with the variable is used to work with this
reserved block.
12.1 Declaring and Initializing Variables in R Language: -
R supports three ways of variable assignment:
1. Using the (=) equal operator- the value on the right is assigned to the
variable on the left.
2. Using the (<-) leftward operator- data is copied from right to left.
3. Using the (->) rightward operator- data is copied from left to right.
12.2 R Variables Syntax
Types of Variable Creation in R:
Using equal to operator
variable_name = value
Using leftward operator
variable_name <- value
Using rightward operator
value -> variable_name
# R program to illustrate
# Initialization of variables
Output
[1] "hello"
[1] "hello"
[1] "hello"
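A minimal sketch of the three assignment styles that would produce the three lines of output above. The variable names var1, var2 and var3 are assumptions, chosen to match the names used later in this section.

```r
# using equal to operator
var1 = "hello"

# using leftward operator
var2 <- "hello"

# using rightward operator
"hello" -> var3

print(var1)
print(var2)
print(var3)
```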
12.4 Nomenclature of R Variables
The following rules need to be kept in mind while naming an R variable:
- A valid variable name consists of a combination of alphabets, numbers, dot (.), and underscore (_) characters. Example: var.1_ is valid.
- Apart from the dot and underscore characters, no other special character is allowed. Example: var$1 and var#1 are both invalid.
- Variables can start with alphabets or dot characters. Example: .var or var is valid.
- The variable should not start with numbers or underscore. Example: 2var or _var is invalid.
- If a variable starts with a dot, the next thing after the dot cannot be a number. Example: .3var is invalid.
- The variable name should not be a reserved keyword in R. Example: TRUE, FALSE, etc.
12.5.1 class() function
This built-in function is used to determine the data type of the variable
provided to it. The R variable to be checked is passed to this as an
argument and it returns the data type.
Syntax
class(variable)
Example
var1 = "hello"
print(class(var1))
Output
[1] "character"
12.5.2 ls() function
This built-in function is used to list all the variables present in the
workspace. This is generally helpful when dealing with a large number
of variables at once and helps prevent overwriting any of them.
Syntax
ls()
Example
print(ls())
Output:
[1] "var1" "var2" "var3"
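12.5.3 rm() function
This built-in function is used to delete an unwanted variable from the
workspace.
Example
# Removing variable
rm(var3)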
print(var3)
Output
Error in print(var3) : object 'var3' not found
Execution halted
# R program to illustrate
# usage of global variables
# global variable
global = 5
Output:
[1] 5
[1] 10
Time Complexity: O(1)
Auxiliary Space: O(1)
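A minimal sketch of a program that would print 5 and then 10 as in the output above. The function display() and the result names are assumptions added for illustration.

```r
# global variable
global <- 5

# A function can read a global variable directly
display <- function() {
  global
}
first_value <- display()

# Changing the global variable changes what the function sees
global <- 10
second_value <- display()

print(first_value)
print(second_value)
```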
12.6.3 Local Variables:
Local variables are those variables that exist only within a certain part of
a program like a function and are released when the function call ends.
Variables defined within a function or block are said to be local to those
functions.
Local variables do not exist outside the block in which they are declared,
i.e. they cannot be accessed or used outside that block.
Declaring local variables: Local variables are declared inside a block.
Example:
# R program to illustrate
# usage of local variables
func = function(){
# this variable is local to the
# function func() and cannot be
# accessed outside this function
age = 18
}
print(age)
Time Complexity: O(1)
Auxiliary Space: O(1)
Output:
Error in print(age) : object 'age' not found
The above program displays an error saying "object 'age' not found".
The variable age was declared within the function func(), so it is local to
that function and not visible to the portion of the program outside this
function. To correct the above error, we have to display the value of the
variable age from within the function func() only.
Example:
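# R program to illustrate
# usage of local variables
func = function(){
# this variable is local to the
# function func() and cannot be
# accessed outside this function
age = 18
print(age)
}
# Print a label, then call the function (assumed calls, matching the output)
cat("Age is:\n")
func()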
Output:
Age is:
[1] 18