UNIT 1 R PROGRAMMING 2ND SEM BCA -SEP


HENNUR-BAGALUR MAIN ROAD, KANNUR POST, BENGALURU-562149

Faculty of Computer Applications

DS2T2- Statistical Computing and R Programming


BCA II Semester
SEP

Prepared By,

Assistant Professor Sandhya M

Faculty of Computer Applications


Statistical Computing & R programming

UNIT 1
The Language: Introduction

 Advantages of R over Other Programming Languages


 R Studio: R script file
 Handling Packages in R: Installing R Package
 Syntax
 Comments
 Operators
 R Keywords
 R Data Types - numeric, Integer, logical, complex, character and raw, Variables,
Input and Output statement
 Data Structures – Strings, Vectors, Matrices, Arrays, Non-numeric Values, Lists and
Data Frames, Special Values, Classes, and Coercion
 Reading and Writing Files
Introduction to Language:
 R is widely known as a "language of data science".
 It is one of the most popular programming languages among researchers,
data analysts, statisticians, and marketers for retrieving, cleaning,
analyzing, visualizing, and presenting data.
 R was created by Ross Ihaka and Robert Gentleman; the name "R" comes from
the first letter of both authors' first names.
 R is an open-source programming language used for statistical computing,
graphical representation, and reporting of data.
 Statistical computing is the meeting point of statistics and computer science: it applies
computation to the collection, organization, analysis, interpretation, and presentation
of data.
 R supports branching, looping, and modular programming using functions.
 Modular programming means dividing the code into independent pieces (modules), where the
output of one module can serve as the input of another.
 R can be integrated with other languages such as C, C++, .NET, and Python to
improve efficiency.
 R has a wide range of packages, and the functions they provide can be used
in our own programs.

1 Definition of R: -

"R is an interpreted computer programming language which was created
by Ross Ihaka and Robert Gentleman." It is also a software environment
for statistical analysis, graphical representation, reporting, and data
modelling. R is open source and is widely used as a statistical software
and data analysis tool. R generally comes with a command-line interface
and is available on widely used platforms such as Windows, Linux, and macOS.
2 History of R: -

R was designed by Ross Ihaka and Robert Gentleman at the University of
Auckland, New Zealand, and is currently developed by the R Development
Core Team. R is an implementation of the S (Statistical) programming
language, combined with lexical scoping semantics inspired by Scheme.
The project was conceived in 1992, with an initial version released in
1995 and a stable beta version in 2000.

3 Why R programming language?

 R is used as a leading tool for machine learning, statistics, and data
analysis. Objects, functions, and packages can easily be created in R.
 It is a platform-independent language: it can be used on all operating
systems.
 It is an open-source, free language: anyone can install it in any
organization without purchasing a license.
 R is not only a statistics package; it can also be integrated with other
languages (C, C++). Thus, you can easily interact with many data sources
and statistical packages.
4 Features of R programming: -
1) It is a simple and effective programming language which has been well
developed.
2) It is data analysis software.
3) It is a well-designed, easy, and effective language which supports
user-defined functions, loops, conditionals, and various I/O facilities.
4) It has a consistent and integrated set of tools for data analysis.
5) It contains a suite of operators for calculations on arrays, lists,
and vectors.
6) It provides effective data handling and storage facilities.
7) It is open-source, powerful, and highly extensible software.
8) It provides highly extensible graphical techniques.
9) It allows us to perform multiple calculations at once using vectors.
10) R is an interpreted language.

5 Uses of R: -

1) Weather services use R to predict severe flooding.
2) Social networking companies use R to monitor their user experience.
3) Newspaper companies use R to create infographics and interactive
data journalism applications.

6 Applications of R: -

a) We use R for data science. It gives us a broad variety of libraries related
to statistics, and it provides an environment for statistical computing
and design.
b) R is used by many quantitative analysts as their programming tool; it
helps in data importing and cleaning.
c) R is one of the most prevalent languages among data analysts and research
programmers. It is used as a fundamental tool in finance, and companies
such as Google, Facebook, Bing, Twitter, Accenture, and Wipro use R
nowadays.
7 Advantages of R: -
1) R is a comprehensive statistical analysis package; new technology and
concepts often appear first in R.
2) R is open source, so you can run it anywhere and at any time without
purchasing a license.
3) R runs on GNU/Linux and Windows operating systems.
4) More generally, R is cross-platform and runs on all major operating systems.
5) In R, everyone is welcome to provide new packages, bug fixes, and code
enhancements.

8 Disadvantages of R: -

a. The quality of some R packages is less than perfect.
b. R commands give little thought to memory management, so R may consume
all available memory.
c. There is no official support channel: there is basically nobody to
complain to if something doesn't work.
d. R can be much slower than other programming languages such as Python
and MATLAB.

9 Difference between R & Python Programming?


 R and Python are both used extensively for data science. Both are very
useful, open-source languages.

Feature: Introduction
R: R is a language and environment for statistical programming, which includes statistical computing and graphics.
Python: Python is a general-purpose programming language for data analysis and scientific computing.

Feature: Objective
R: It has many features which are useful for statistical analysis and representation.
Python: It can be used to develop GUI applications and web applications as well as with embedded systems.

Feature: Workability
R: It has many easy-to-use packages for performing tasks.
Python: It can easily perform matrix computation as well as optimization.

Feature: Integrated development environment
R: Various popular R IDEs are RStudio, RKWard, R Commander, etc.
Python: Various popular Python IDEs are Spyder, Eclipse+PyDev, Atom, etc.

Feature: Libraries and packages
R: There are many packages and libraries like ggplot2, caret, etc.
Python: Some essential packages and libraries are Pandas, NumPy, SciPy, etc.

Feature: Scope
R: It is mainly used for complex data analysis in data science.
Python: It takes a more streamlined approach for data science projects.

10 Basics of R Programming: -

1 Syntax of R program: -
 A program in R is made up of three things: variables, comments, and
keywords. Variables are used to store data, comments are used to
improve code readability, and keywords are reserved words that hold a
specific meaning to the interpreter.

# My first program in R Programming
myString <- "Hello, World!"
print(myString)

 In the above code, the first statement defines a string variable myString, to
which we assign the string "Hello, World!". The next statement, print(),
prints the value stored in myString.
 print() is a function that prints values to the output screen. It also
accepts optional arguments. For example, the above program prints the
string with quotes by default; the quotes can be removed if needed.
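As a small illustration of the point about optional arguments, the sketch below prints the same string with and without quotes. It relies on the quote argument of print() and on cat(), both part of base R; the variable name is the one used in the program above.

```r
myString <- "Hello, World!"

print(myString)                  # [1] "Hello, World!"
print(myString, quote = FALSE)   # [1] Hello, World!

# cat() writes the text itself, without the [1] index or quotes
cat(myString, "\n", sep = "")
```

print() is usually the better choice for inspecting objects, while cat() is convenient for plain messages.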
1.1 R Script File

An R script file is another way to write our programs: we save the code in a
text file and execute it at the command prompt with the help of the R
interpreter front end called Rscript. We create a text file, write the R code
in it, and save it with the .R extension, for example Demo.R.
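A minimal sketch of what such a script could contain is shown below. The contents are an assumption for illustration (they reuse the Hello World program from the syntax section); the command for running the script is given as a comment because it is typed in a terminal, not in R.

```r
# Demo.R (illustrative contents)
myString <- "Hello, World!"
print(myString)

# To run the script, open a terminal in the folder containing Demo.R and type:
#   Rscript Demo.R
# The console then shows:
#   [1] "Hello, World!"
```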

1.2 Comments
 In R programming, comments are programmer-readable explanations in
the source code of a program. The purpose of adding comments
is to make the source code easier to understand. Comments are
ignored by the R interpreter.
 A comment starts with #. When executing code, R ignores everything
from # to the end of the line.
 R has only single-line comments; it does not support multi-line
comments. To comment out several lines at once, we can wrap them in an
if(FALSE) block.
1.2.1 Types of Comments: -
 In general, programming languages have 3 types of comments; R directly supports only the first:
1. Single-line comments
2. Multi-line comments
3. Documentation comments

1) Single-line Comments: -
 These are comments that require only one line. They are usually written to
explain what a single line of code does or what it is supposed to produce,
so that readers of the source code can follow it.

#My First program in R programming
string <- "Hello World!"
print(string)
2) Multi-line Comments: -

R does not support multi-line comments, but you can use the following trick:

if(FALSE) {
"This is a demo for multi-line comments and it should be put
inside either a
single OR double quote"
}

myString <- "Hello, World!"
print(myString)

Output:
[1] "Hello, World!"

Although the block above is parsed by the R interpreter, it is never evaluated
because the condition is FALSE, so it does not interfere with your actual
program. Put such comments inside either single or double quotes so that the
text parses as a valid string.

 R also allows several single-line comments in a row:

# This is a comment
# written in
# more than just one line
"Hello World!"

3) Documentation Comments: -
 These are comments written for quick documentation look-up.

Operators
An operator is a symbol that tells the interpreter to perform a specific mathematical or logical
manipulation. R is rich in built-in operators and provides the following types of
operators.

Types of Operators
We have the following types of operators in R programming:
 Arithmetic Operators
 Relational Operators
 Logical Operators
 Assignment Operators
 Miscellaneous Operators
Arithmetic Operators
The following table shows the arithmetic operators supported by R. The operators
act on each element of the vector. All examples use these two vectors:

v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

Operator: +
Description: Adds two vectors
Example: print(v + t)
Result: [1] 10.0 8.5 10.0

Operator: -
Description: Subtracts the second vector from the first
Example: print(v - t)
Result: [1] -6.0 2.5 2.0

Operator: *
Description: Multiplies both vectors
Example: print(v * t)
Result: [1] 16.0 16.5 24.0

Operator: /
Description: Divides the first vector by the second
Example: print(v / t)
Result: [1] 0.250000 1.833333 1.500000

Operator: %%
Description: Gives the remainder of dividing the first vector by the second
Example: print(v %% t)
Result: [1] 2.0 2.5 2.0

Operator: %/%
Description: Gives the quotient (integer division) of the first vector by the second
Example: print(v %/% t)
Result: [1] 0 1 1

Operator: ^
Description: Raises the first vector to the exponent given by the second vector
Example: print(v ^ t)
Result: [1] 256.000 166.375 1296.000

Relational Operators
The following table shows the relational operators supported by R. Each element of
the first vector is compared with the corresponding element of the second vector. The
result of the comparison is a Boolean value. All examples use these two vectors:

v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)

Operator: >
Description: Checks if each element of the first vector is greater than the corresponding element of the second vector.
Example: print(v > t)
Result: [1] FALSE TRUE FALSE FALSE

Operator: <
Description: Checks if each element of the first vector is less than the corresponding element of the second vector.
Example: print(v < t)
Result: [1] TRUE FALSE TRUE FALSE

Operator: ==
Description: Checks if each element of the first vector is equal to the corresponding element of the second vector.
Example: print(v == t)
Result: [1] FALSE FALSE FALSE TRUE

Operator: <=
Description: Checks if each element of the first vector is less than or equal to the corresponding element of the second vector.
Example: print(v <= t)
Result: [1] TRUE FALSE TRUE TRUE

Operator: >=
Description: Checks if each element of the first vector is greater than or equal to the corresponding element of the second vector.
Example: print(v >= t)
Result: [1] FALSE TRUE FALSE TRUE

Operator: !=
Description: Checks if each element of the first vector is unequal to the corresponding element of the second vector.
Example: print(v != t)
Result: [1] TRUE TRUE TRUE FALSE
Logical Operators
The following table shows the logical operators supported by R. They apply only
to vectors of type logical, numeric, or complex. Zero is treated as FALSE and every
non-zero number as TRUE.
Each element of the first vector is combined with the corresponding element of the
second vector. The result is a Boolean value.

Operator: &
Description: Element-wise logical AND. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if both elements are TRUE.
Example:
v <- c(3,1,TRUE,2+3i)
t <- c(4,1,FALSE,2+3i)
print(v & t)
Result: [1] TRUE TRUE FALSE TRUE

Operator: |
Description: Element-wise logical OR. It combines each element of the first vector with the corresponding element of the second vector and gives TRUE if at least one of the elements is TRUE.
Example:
v <- c(3,0,TRUE,2+2i)
t <- c(4,0,FALSE,2+3i)
print(v | t)
Result: [1] TRUE FALSE TRUE TRUE

Operator: !
Description: Logical NOT. It takes each element of the vector and gives the opposite logical value.
Example:
v <- c(3,0,TRUE,2+2i)
print(!v)
Result: [1] FALSE TRUE FALSE FALSE

The logical operators && and || work on single values and return a single logical value.
Older versions of R silently used only the first element of longer vectors; since R 4.3.0,
passing a vector of length greater than one is an error, so the examples below select the
first element explicitly.

Operator: &&
Description: Logical AND. It takes the first element of both vectors and gives TRUE only if both are TRUE.
Example:
v <- c(3,0,TRUE,2+2i)
t <- c(1,3,TRUE,2+3i)
print(v[1] && t[1])
Result: [1] TRUE

Operator: ||
Description: Logical OR. It takes the first element of both vectors and gives TRUE if at least one of them is TRUE.
Example:
v <- c(0,0,TRUE,2+2i)
t <- c(0,3,TRUE,2+3i)
print(v[1] || t[1])
Result: [1] FALSE
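The difference between the element-wise operators (&, |) and the single-value operators (&&, ||) matters most inside if conditions. The sketch below uses an illustrative vector x and the base R helpers all() and any() to reduce a vector to one logical value before applying && or ||.

```r
x <- c(5, 12, 8)

# & is element-wise: one result per element
in_range <- x > 4 & x < 10                  # TRUE FALSE TRUE

# && expects single values, so reduce each comparison with all() or any() first
all_in_range <- all(x > 4) && all(x < 10)   # FALSE, because 12 is not below 10
any_extreme <- any(x > 10) || any(x < 0)    # TRUE, because 12 is above 10

# && also short-circuits: x[1] > 4 is only checked when the vector is non-empty
if (length(x) > 0 && x[1] > 4) {
  msg <- "first element is greater than 4"
}
```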

Assignment Operators
These operators are used to assign values to vectors.

Operator: <- or = or <<-
Description: Called left assignment
Example:
v1 <- c(3,1,TRUE,2+3i)
v2 <<- c(3,1,TRUE,2+3i)
v3 = c(3,1,TRUE,2+3i)
print(v1)
print(v2)
print(v3)
Result:
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i

Operator: -> or ->>
Description: Called right assignment
Example:
c(3,1,TRUE,2+3i) -> v1
c(3,1,TRUE,2+3i) ->> v2
print(v1)
print(v2)
Result:
[1] 3+0i 1+0i 1+0i 2+3i
[1] 3+0i 1+0i 1+0i 2+3i
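At the top level, <- and <<- give the same result, as the table shows. They differ inside a function. The following sketch, with illustrative function names, shows that <- creates a local variable while <<- modifies a variable in the enclosing environment.

```r
counter <- 0

bump_local <- function() {
  counter <- counter + 1    # creates a new local variable; the outer one is untouched
  counter
}

bump_outer <- function() {
  counter <<- counter + 1   # assigns to 'counter' in the enclosing environment
  counter
}

local_result <- bump_local()   # 1
after_local <- counter         # still 0
outer_result <- bump_outer()   # 1
after_outer <- counter         # now 1
```

Because <<- changes state outside the function, it is generally used sparingly; returning a value and assigning it with <- is easier to follow.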

Miscellaneous Operators
These operators are used for specific purposes rather than general mathematical or logical
computation.

Operator: :
Description: Colon operator. It creates a series of numbers in sequence for a vector.
Example:
v <- 2:8
print(v)
Result: [1] 2 3 4 5 6 7 8

Operator: %in%
Description: This operator is used to identify whether an element belongs to a vector.
Example:
v1 <- 8
v2 <- 12
t <- 1:10
print(v1 %in% t)
print(v2 %in% t)
Result:
[1] TRUE
[1] FALSE

Operator: %*%
Description: This operator performs matrix multiplication. The example multiplies a matrix by its transpose.
Example:
M = matrix( c(2,6,5,1,10,4), nrow = 2,ncol = 3,byrow = TRUE)
t = M %*% t(M)
print(t)
Result:
     [,1] [,2]
[1,]   65   82
[2,]   82  117
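Since * also works on matrices, it is worth contrasting it with %*%. The sketch below uses two small illustrative matrices: * multiplies matching positions, while %*% computes the row-by-column matrix product.

```r
A <- matrix(1:4, nrow = 2)   # filled by column: rows are (1, 3) and (2, 4)
B <- matrix(5:8, nrow = 2)   # filled by column: rows are (5, 7) and (6, 8)

elementwise <- A * B     # rows (5, 21) and (12, 32)
matprod <- A %*% B       # rows (23, 31) and (34, 46)
```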

R KEYWORDS

Keywords are reserved words in R, each of which has a specific meaning; they
cannot be used as names for variables or functions. In R, one can view these
keywords by using either help(reserved) or ?reserved. Here is the list of
keywords in R:

if, else, while, repeat, for
function, in, next, break
TRUE, FALSE, NULL, Inf, NaN
NA, NA_integer_, NA_real_, NA_complex_, NA_character_

Following are some most important keywords along with their examples:

 if: The if statement is one of the decision-making statements in R, and
one of the simplest. It decides whether a statement or block of
statements is executed: if the condition is true, the block is executed;
otherwise it is skipped.
Example:
# R program to illustrate if statement

# assigning value to variable a
a <- 5

# condition
if (a > 0)
{
  print("Positive Number") # Statement
}

Output:
[1] "Positive Number"

 else: It is used together with the if statement: when the test expression in the
if condition fails, the statements in the else block are executed.

Example:

x <- 5

# Check value is less than or greater than 10
# Note: at the top level, else must be on the same line as the closing brace of the if block
if (x > 10) {
  print(paste(x, "is greater than 10"))
} else {
  print(paste(x, "is less than 10"))
}

Output:
[1] "5 is less than 10"
 while: It is a type of control statement which runs a statement or a set of
statements repeatedly until the given condition becomes false. It is an
entry-controlled loop: the test condition is checked first, and only then is the
body of the loop executed. If the test condition is false at the start, the loop
body is not executed at all.
Example:
# R program to demonstrate the use of while loop

val = 1

# using while loop


while (val <= 5 )
{
# statements
print(val)
val = val + 1
}

Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

 repeat: It is a simple loop that runs the same statement or group of
statements repeatedly until a stop condition is encountered.
A repeat loop has no built-in condition to terminate it, so the
programmer must place a condition inside the loop's body and
use a break statement to terminate the loop. If the body contains no
such condition, the loop will iterate infinitely.

Example:
# R program to demonstrate the use of repeat loop

val = 1

# using repeat loop


repeat
{
# statements
print(val)
val = val + 1

# checking stop condition


if(val > 5)
{
# using break statement
# to terminate the loop
break
}
}

Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

 for: It is a type of control statement that makes it easy to construct a
loop that runs a statement or a set of statements multiple times. A for
loop is commonly used to iterate over the items of a sequence: the body is
executed once for each item, and it is not executed at all if the sequence
is empty.

Example: # R program to demonstrate the use of for loop

# using for loop


for (val in 1:5)
{
# statement
print(val)
}

Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

 function: Functions are useful when you want to perform a certain task
multiple times. In R, functions are created using the function
keyword.
Example:
# A simple R function to check
# Whether x is even or odd

evenOdd = function(x)
{
if (x %% 2 == 0)
return("even")
else
return("odd")
}
print(evenOdd(4))
print(evenOdd(3))

Output:
[1] "even"
[1] "odd"

 next: The next statement in R is used to skip any remaining statements in the
current iteration of a loop and continue with the next iteration. In other
words, it skips the current iteration without terminating the loop.
Example:

# R program to illustrate next in for loop

val <- 6:11

# Loop
for (i in val)
{
if (i == 8)
{
# test expression
next
}
print(i)
}

Output:
[1] 6
[1] 7
[1] 9
[1] 10
[1] 11

 break: The break keyword is a jump statement that is used to terminate the
loop at a particular iteration.
Example:
# R Break Statement Example
a<-1
while (a < 10)
{
print(a)
if(a == 5)
break
a=a+1
}

Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5

 TRUE/FALSE: The TRUE and FALSE keywords represent the Boolean values
true and false. If the given condition holds, the interpreter returns TRUE;
otherwise it returns FALSE.
Example:

# A simple R program
# to illustrate TRUE / FALSE

# Sample values
x=4
y=3

# Comparing two values


z=x>y
p=x<y

# Print the logical value


print(z)
print(p)

Output:
[1] TRUE
[1] FALSE

 NULL: In R, NULL represents the null object. It is used for an empty or
undefined object, that is, the absence of any value. It is different from NA,
which marks a missing value inside a vector.
Example:
# A simple R program
# to illustrate NULL

v = as.null(c(1, 2, 3, 4))
print(v)
Output: NULL
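
To make the behaviour of NULL concrete, the short sketch below (with illustrative variable names) compares it with NA when both are placed inside a vector, and uses is.null() and length() from base R.

```r
v <- c(1, NA, 3)      # NA keeps its slot: the vector still has 3 elements
w <- c(1, NULL, 3)    # NULL vanishes: the vector has only 2 elements

len_v <- length(v)              # 3
len_w <- length(w)              # 2

null_check <- is.null(NULL)     # TRUE
null_length <- length(NULL)     # 0
```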

 Inf and NaN: The keywords Inf and -Inf mean positive and negative infinity, whereas
NaN means 'Not a Number'. The functions is.finite(x) and is.infinite(x) return a logical
vector of the same length as x, indicating which elements are finite (not infinite
and not missing) or infinite. The function is.nan(x) tests for NaN in the same way.

# A simple R program
# to illustrate Inf and NaN

# To check Inf
x = c(Inf, 2, 3)
print(is.finite(x))

# To check NaN
y = c(1, NaN, 3)
print(is.nan(y))

Output:
[1] FALSE TRUE TRUE
[1] FALSE TRUE FALSE
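
Inf and NaN usually appear as results of arithmetic rather than being typed in directly. The following sketch, using illustrative variable names, shows a few operations that produce them and how the test functions respond.

```r
pos_inf <- 1 / 0         # Inf
neg_inf <- -1 / 0        # -Inf
not_num <- 0 / 0         # NaN
also_nan <- Inf - Inf    # NaN

checks <- c(is.infinite(pos_inf), is.finite(pos_inf), is.nan(not_num))   # TRUE FALSE TRUE

# NaN counts as missing for is.na(), but NA is not NaN
na_vs_nan <- c(is.na(NaN), is.nan(NA))   # TRUE FALSE
```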

 NA: NA stands for “Not Available” and is used to represent missing values.
There are also constants NA_integer_, NA_real_, NA_complex_ and
NA_character_ of the other atomic vector types which support missing
values and all of these are reserved words in the R language.
# A simple R program
# to illustrate NA

# To check NA
x = c(1, NA, 2, 3)
print(is.na(x))

Output:
[1] FALSE TRUE FALSE FALSE
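
Missing values propagate through most calculations, so they usually have to be handled explicitly. The sketch below uses an illustrative vector of scores and the na.rm argument of mean(), which is part of base R.

```r
scores <- c(80, NA, 90, 70)

raw_mean <- mean(scores)                   # NA, because one value is missing
clean_mean <- mean(scores, na.rm = TRUE)   # 80
n_missing <- sum(is.na(scores))            # 1
complete <- scores[!is.na(scores)]         # 80 90 70
```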

Data types in R programming: -

 Data types are used in computer programming to specify the kind of
data that can be stored in a variable.
 In programming languages, we use variables to store different kinds of
information. Variables are reserved memory locations that store
values. When we create a variable in our program, some space is reserved in
memory.
 In R, there are several data types such as integer, character, etc. Memory
is allocated based on the data type of the value, which also decides what
can be stored in the reserved memory.

Data type: Logical
Example: TRUE, FALSE
Description: It is a special data type for data with only two possible values, which can be construed as true/false.

Data type: Numeric
Example: 12, 32, 112, 5432
Description: A decimal value is called numeric in R, and it is the default computational data type.

Data type: Integer
Example: 3L, 66L, 2346L
Description: Here, L tells R to store the value as an integer.

Data type: Complex
Example: z = 1+2i, t = 7+3i
Description: A complex value in R has a real part and an imaginary part; the imaginary part is written with the imaginary unit i.

Data type: Character
Example: 'a', "good", "TRUE", '35.4'
Description: In R programming, a character is used to represent string values. We convert objects into character values with the help of the as.character() function.

Data type: Raw
Example: as.raw()
Description: A raw data type is used to hold raw bytes.
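
The table can be checked directly in R. The sketch below stores one illustrative value of each type in a list and asks for its class() and typeof(); note that typeof() reports numeric values as "double".

```r
values <- list(
  logical   = TRUE,
  numeric   = 12.5,
  integer   = 3L,
  complex   = 1 + 2i,
  character = "good",
  raw       = as.raw(255)
)

classes <- sapply(values, class)    # logical, numeric, integer, complex, character, raw
types <- sapply(values, typeof)     # same, except numeric is reported as "double"
```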

a) Numeric Data type: -


 In R, if we assign any decimal value to a variable, it becomes a variable of
numeric data type. Whole numbers written without the L suffix, such as 55,
are also stored as numeric.
For example, the statements below assign numeric values to the
variables x and y.

x <- 10.5
y <- 55

# Print values of x and y
x
y

b) Integer data type: -


 R supports the integer data type for whole numbers. You can
create or convert a value to the integer type using the as.integer()
function. You can also use the capital 'L' notation as a suffix to denote that
a particular value is of the integer data type.

e = as.integer(3)
class(e)
Output: [1] "integer"
Another way of creating an integer variable is by using
the L suffix as follows

x = 5L
class(x)
Output: [1] "integer"
c) Complex Data type: -
 R supports the complex data type, which covers all complex numbers.
It is used to store numbers with an imaginary component.

sqrt(-1)
Output:
[1] NaN
Warning message:
In sqrt(-1) : NaNs produced
To overcome this, we coerce the value (-1) into a complex value; the
result is then expressed using the imaginary unit 'i'.
sqrt(as.complex(-1))
Output:
[1] 0+1i

d) Character Data type: -

 R supports the character data type, which stores character values or
strings. Strings in R can contain letters, numbers, and symbols. The
easiest way to denote that a value is of character type is to wrap the
value inside single or double quotes.

str1 = "Sam"
class(str1)
Output: [1] "character"
We can also use the as.character() function to
convert objects into character values.
For example:
x = as.character(55.7)
print(x)
class(x)
Output:
[1] "55.7"
[1] "character"
e) Logical Data type: -

 R has a logical data type that takes the value TRUE or FALSE. A logical
value is often created via a comparison between variables.

x = 3
y = 5
a = x > y
a
Output:
[1] FALSE
The three standard logical operations, AND (&), OR (|), and NOT (!), yield a
value of the logical data type.
For example: -
x = TRUE; y = FALSE
x & y
Output:
[1] FALSE
x | y
Output:
[1] TRUE
!x
Output:
[1] FALSE
f) Raw data type: -
 To save and work with data at the byte level in R, use the raw data type.
It represents a series of unprocessed bytes and so enables low-level operations
on binary data. The example below creates a raw vector:

# Create a raw vector

x <- as.raw(c(0x1, 0x2, 0x3, 0x4, 0x5))

print(x)

Output:

[1] 01 02 03 04 05
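
Raw vectors are also what R uses when text is viewed as bytes. The sketch below uses the base R functions charToRaw() and rawToChar() on an illustrative two-character string.

```r
bytes <- charToRaw("R!")           # 52 21 (hexadecimal codes of "R" and "!")
as_numbers <- as.integer(bytes)    # 82 33
back <- rawToChar(bytes)           # "R!"
```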

11 Data Structure in R Programming: -


 A data structure is a particular way of organizing data in a computer so
that it can be used effectively.
 The idea is to reduce the space and time complexities of different tasks.
Data structures in R programming are tools for holding multiple values.
 R's base data structures are often organized by their dimensionality (1D,
2D, or nD) and by whether they are homogeneous (all elements must be of the
identical type) or heterogeneous (the elements can be of various types).
 This gives rise to the six data structures which are most frequently used in
data analysis.
 The most essential data structures used in R include:
1) Vectors
2) Lists
3) Data frames
4) Matrices
5) Arrays
6) Factors
a) Vectors: -
 "A vector is a collection of elements which is most commonly of mode
character, integer, logical or numeric." A vector can be one of the
following two types:
1) Atomic vector
2) Lists
 An atomic vector is an ordered collection of values of a basic data type with a
given length. The key point is that all the elements of an atomic vector must be
of the identical data type, i.e., it is a homogeneous data structure. Vectors are
one-dimensional data structures.
 The elements contained in a vector are known as the components of the
vector. We can check the type of a vector with the typeof()
function.
 Length is an important property of a vector: it is the number of elements
in the vector, and it is obtained with the length() function.
 Vectors have three common properties: type (typeof()), length (length()),
and attributes (attributes()).
A. class() function: -
 To find the class (data type) of an object, we use the class() function.
 We pass the object as an argument to class().
Syntax: class(object)

# R program to illustrate Vector
# Vectors (ordered collection of same data type)
x = c(1, 3, 5, 7, 8)
# Printing those elements in console
print(x)
class(x)
length(x)
Output:
[1] 1 3 5 7 8
[1] "numeric"
[1] 5
B. length() function: -
 Returns the length of (number of elements in) a vector.
Syntax: length(object)
C. typeof() function: -
 The typeof() function in R returns the type of the data passed as its
argument.
Syntax: typeof(x)
Parameter: x: the data whose type is to be returned.
D. attributes() function: -
 The attributes() function in R is used to get all the attributes of data.
 This function is also used to set new attributes on data.
 Syntax: attributes(x)
 Parameter: x: the object whose attributes are to be accessed.
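
The four functions above can be tried together on one vector. The sketch below uses an illustrative named vector, so that attributes() has something to report, and also shows how mixing types in c() coerces every element to a common type.

```r
x <- c(a = 1, b = 3, c = 5)   # a named numeric vector

x_class <- class(x)        # "numeric"
x_type <- typeof(x)        # "double"
x_len <- length(x)         # 3
x_attr <- attributes(x)    # a list whose names element is "a" "b" "c"

# Mixing types coerces all elements to one common type
mixed <- c(1, "two", TRUE)
mixed_type <- typeof(mixed)   # "character"
```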

b) Lists: -

 "A list is a special type of vector in which each element can be of a different
type."
 A list is a generic object consisting of an ordered collection of objects.
 Lists are heterogeneous data structures.
 They are also one-dimensional data structures.
 A list can hold vectors, matrices, characters, functions, and
so on.

# List of strings
thislist <- list("apple", "banana", "cherry")
# Print the list
thislist
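
To show the heterogeneous nature of lists, the sketch below builds a list with a character value, a numeric vector, and a logical value. The element names and data are made up for illustration.

```r
student <- list(name = "Asha", marks = c(82, 91, 77), passed = TRUE)

student_name <- student$name            # "Asha"
second_mark <- student[["marks"]][2]    # 91
sub_list <- student["name"]             # single brackets return a list of length 1
element_classes <- sapply(student, class)   # character, numeric, logical
```

Double brackets (or $) extract the element itself, while single brackets return a smaller list.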

c) Data Frames: -

 A data frame is a two-dimensional, array-like structure or table in which
each column contains the values of one variable, and each row contains one set
of values from each column.
 A data frame is used to store a data table. The vectors that make up a
data frame are stored as a list and are of equal length.
 In a simple way, it is a list of equal-length vectors.
 A matrix can contain only one type of data, but a data frame can contain
different data types such as numeric, character, factor, etc.
Data frames have the following constraints: -

1) A data frame must have column names, and every row should have a
unique name.
2) Each column must have the identical number of items.
3) Each item in a single column must be of the same data type.
4) Different columns may have different data types.

empid <- c(1:4)


empname <- c("Sam","Rob","Max","John")
empdept <- c("Sales","Marketing","HR","R & D")
emp.data <- data.frame(empid, empname, empdept)
print(emp.data)
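
Once a data frame exists, its shape and columns can be inspected and filtered. The sketch below rebuilds the employee data frame from the example above and uses only base R functions; the helper variable names are illustrative.

```r
empid <- 1:4
empname <- c("Sam", "Rob", "Max", "John")
empdept <- c("Sales", "Marketing", "HR", "R & D")
emp.data <- data.frame(empid, empname, empdept)

dims <- dim(emp.data)                    # 4 rows, 3 columns
col_classes <- sapply(emp.data, class)   # empid is integer; text columns are character in R 4.0 and later

# Select the rows where the department is HR, then read the name column
hr_row <- emp.data[emp.data$empdept == "HR", ]
hr_name <- hr_row$empname
```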

d) Matrix: -
 A matrix is a rectangular arrangement of numbers in rows and columns.
 In a matrix, rows run horizontally and columns run vertically.
 In R programming, matrices are two-dimensional, homogeneous data
structures.
 To create a matrix in R, use the matrix() function.
 Its main argument is the vector of elements to place in the matrix.
 We also pass the number of rows and the number of columns we want
the matrix to have.

Create a matrix: -
 data: The first argument of the matrix function. It is the input vector
which holds the data elements of the matrix.
 nrow: The second argument is the number of rows which we want to
create in the matrix.
 ncol: The third argument is the number of columns which we want to
create in the matrix.
 byrow: A logical value. If it is TRUE, the input vector elements are
arranged by row; by default they are arranged by column.
 dimnames: The names assigned to the rows and columns.

matrix1 <- matrix(c(11, 13, 15, 12, 14, 16), nrow = 2, ncol = 3, byrow = TRUE)
matrix1
Output
     [,1] [,2] [,3]
[1,]   11   13   15
[2,]   12   14   16
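
The effect of byrow and dimnames is easiest to see side by side. The sketch below reuses the same six values; the row and column names are made up for illustration.

```r
m <- matrix(c(11, 13, 15, 12, 14, 16), nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# Without byrow = TRUE the same values are filled column by column
by_col <- matrix(c(11, 13, 15, 12, 14, 16), nrow = 2, ncol = 3)

m_dim <- dim(m)                  # 2 3
one_cell <- m["r2", "c3"]        # 16
first_col_by_col <- by_col[, 1]  # 11 13
```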

e) Arrays:-
 "An array is a collection of a similar data type with contiguous memory
allocation."
 In R, arrays are the data objects which allow us to store data in more
than two dimensions.
 In R, an array is created with the help of the array() function.
 The array() function takes a vector as input and uses the values given
in the dim parameter to set the size of each dimension.
 R Array Syntax
 The syntax for creating an array in R is:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)

1) data: The data is the first argument in the array() function. It is the
input vector which is given to the array.
2) matrices: The number of two-dimensional matrices that make up the
array.
3) row_size: This parameter defines the number of rows in each matrix
of the array.
4) column_size: This parameter defines the number of columns in each
matrix of the array.
5) dim_names: This parameter is a list used to change the default names
of rows and columns.
# A vector (one dimension) with values ranging from 1 to 24
thisarray <- c(1:24)
thisarray

# An array with more than one dimension:
# 4 rows, 3 columns and 2 matrices
multiarray <- array(thisarray, dim = c(4, 3, 2))
multiarray
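
As a rough sketch of how the elements of such an array are addressed, the example below indexes an array as [row, column, matrix] and also adds dimension names. The names "r1", "c1", "m1", etc. are made up for illustration.

```r
# 24 values arranged as 4 rows, 3 columns and 2 matrices
multiarray <- array(1:24, dim = c(4, 3, 2))
print(dim(multiarray))

# Index order is [row, column, matrix]; values are filled
# column by column, one matrix after the other
first_pick <- multiarray[2, 3, 1]
second_pick <- multiarray[1, 1, 2]
print(first_pick)
print(second_pick)

# Leaving an index empty selects everything along that dimension:
# here, the whole second matrix
print(multiarray[, , 2])

# dimnames is a list with one vector of names per dimension
named_array <- array(1:24, dim = c(4, 3, 2),
                     dimnames = list(paste0("r", 1:4),
                                     paste0("c", 1:3),
                                     c("m1", "m2")))
last_value <- named_array["r4", "c3", "m2"]
print(last_value)
```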
f) Factors: -
 A factor is a data structure used for fields which take only a
predefined, finite number of values.
 Such variables take a limited number of different values.
 Factors are data objects used to categorize data and to store it as
levels.
 They can store both integer and string values, and are useful in columns
that have a limited number of unique values.
Arguments of the factor() Function in R Language
 x: The vector that needs to be converted into a factor.
 levels: The set of distinct values that the input vector x may take.
 labels: An optional character vector of labels for the levels.
 exclude: The values to be excluded when the levels are formed.
 ordered: A logical value that decides whether the levels are
ordered.
 nmax: An upper limit for the maximum number of levels.
# Create a factor
music_genre <- factor(c("Jazz", "Rock", "Classic", "Classic", "Pop", "Jazz",
"Rock", "Jazz"))

# Print the factor
music_genre

Result:
[1] Jazz    Rock    Classic Classic Pop     Jazz    Rock    Jazz
Levels: Classic Jazz Pop Rock
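
The levels and ordered arguments described above can be tried out in a small sketch. The size values used here are made up for illustration.

```r
sizes <- c("Medium", "Small", "Large", "Small", "Medium")

# levels fixes the order of the categories; ordered = TRUE makes
# comparisons such as < and > meaningful
size_factor <- factor(sizes,
                      levels = c("Small", "Medium", "Large"),
                      ordered = TRUE)

# The distinct categories, in the order given
print(levels(size_factor))

# How many times each category occurs
size_counts <- table(size_factor)
print(size_counts)

# Internally each value is stored as the position of its level
size_codes <- as.integer(size_factor)
print(size_codes)

# "Medium" comes after "Small" in the level order
is_bigger <- size_factor[1] > size_factor[2]
print(is_bigger)
```

Without the levels argument, factor() sorts the levels alphabetically, which is why the music_genre example prints Classic, Jazz, Pop, Rock.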

12 Variables in R programming: -
 A variable is a named memory location reserved for storing specific
data; the name associated with the variable is used to refer to this
reserved block.
12.1 Declaring and Initializing Variables in R Language: -
 R supports three ways of variable assignment:
1. Using the (=) equal operator: the value on the right is assigned to
the variable on the left.
2. Using the (<-) leftward operator: data is copied from right to left.
3. Using the (->) rightward operator: data is copied from left to right.
12.2 R Variables Syntax
Types of Variable Creation in R:
 Using equal to operator
variable_name = value

 Using leftward operator
variable_name <- value

 Using rightward operator
value -> variable_name
12.3 Creating Variables in R

# R program to illustrate
# Initialization of variables

# using equal to operator
var1 = "hello"
print(var1)

# using leftward operator
var2 <- "hello"
print(var2)

# using rightward operator
"hello" -> var3
print(var3)

Output
[1] "hello"
[1] "hello"
[1] "hello"
12.4 Nomenclature of R Variables
The following rules need to be kept in mind while naming an R variable:
 A valid variable name consists of a combination of letters,
numbers, dot (.), and underscore (_) characters. Example: var.1_ is
valid
 Apart from the dot and underscore characters, no other special
character is allowed. Example: var$1 and var#1 are both invalid
 Variables can start with a letter or a dot. Example: .var and
var are valid
 The variable should not start with a number or an underscore. Example:
2var and _var are invalid.
 If a variable starts with a dot, the next character after the dot cannot
be a number. Example: .3var is invalid
 The variable name should not be a reserved keyword in R. Example:
TRUE, FALSE, etc.
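
One possible way to test these rules is with the base R function make.names(), which rewrites a string into a syntactically valid name. The sketch below assumes that a name counts as valid when make.names() leaves it unchanged. The vector names candidates and valid are only illustrative.

```r
# Names taken from the examples in the rules above
candidates <- c("var.1_", ".var", "var", "var$1", "var#1",
                "2var", "_var", ".3var", "TRUE")

# make.names() returns a valid name for each input; if the result
# equals the input, the input was already a valid variable name
valid <- make.names(candidates) == candidates
names(valid) <- candidates
print(valid)
```

The first three names are reported as valid and the remaining six as invalid, which matches the rules listed above.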

12.5 Important Methods for R Variables

 R provides some useful functions to perform operations on variables.
These functions are used to determine the data type of a variable,
list the existing variables, delete a variable, etc.
 Following are some of the functions used to work on variables

12.5.1 class() function

 This built-in function is used to determine the data type of the variable
provided to it. The R variable to be checked is passed to it as an
argument and it returns the data type.
Syntax
class(variable)
Example

var1 = "hello"
print(class(var1))
Output
[1] "character"
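
As a further illustration, class() can be applied to one value of each basic R data type. The variable names below are made up for this sketch.

```r
num_value  <- 42           # numeric
int_value  <- 42L          # integer (note the L suffix)
lgl_value  <- TRUE         # logical
cplx_value <- 3 + 2i       # complex
chr_value  <- "hello"      # character
raw_value  <- charToRaw("A")  # raw

print(class(num_value))
print(class(int_value))
print(class(lgl_value))
print(class(cplx_value))
print(class(chr_value))
print(class(raw_value))
```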
12.5.2 ls() function

 This built-in function is used to list all the variables present in the
workspace. This is generally helpful when dealing with a large number
of variables at once and helps prevent overwriting any of them.
Syntax
ls()
Example

# using equal to operator
var1 = "hello"

# using leftward operator
var2 <- "hello"

# using rightward operator
"hello" -> var3

print(ls())

Output:
[1] "var1" "var2" "var3"

12.5.3 rm() function

 This is again a built-in function, used to delete an unwanted variable
from your workspace. This helps clear the memory space allocated to
variables that are no longer in use, thereby creating more space for
others. The name of the variable to be deleted is passed as an argument
to it.
Syntax
rm(variable)
Example:

# using equal to operator
var1 = "hello"

# using leftward operator
var2 <- "hello"

# using rightward operator
"hello" -> var3

# Removing variable
rm(var3)
print(var3)

Output
Error in print(var3) : object 'var3' not found
Execution halted
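
To avoid the error shown above, one option is to check whether a variable still exists before using it. The sketch below uses the base R function exists() for this. The variable names are made up for illustration.

```r
temp_value <- "hello"

# exists() takes the variable name as a string
before_removal <- exists("temp_value")
print(before_removal)

rm(temp_value)

after_removal <- exists("temp_value")
print(after_removal)

# Several variables can be removed at once by passing
# their names as a character vector to the list argument
x_tmp <- 1
y_tmp <- 2
rm(list = c("x_tmp", "y_tmp"))
both_removed <- !exists("x_tmp") && !exists("y_tmp")
print(both_removed)
```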

12.6 Scope of Variables in R programming: -

 The scope of a variable is the part of a program where the variable can
be found and accessed.
 There are mainly two types of variable scopes:
1) Global variable
2) Local variable
12.6.1 Naming convention for Variables
 A variable name in R may contain alphanumeric characters together with
the special characters underscore ('_') and period ('.').
 A variable name must start with a letter, or with a period that is not
followed by a number.
 Other special characters like ('!', '@', '#', '$') are not allowed in
variable names.
12.6.2Global Variables:
 Global variables are those variables that exist throughout the
execution of a program. They can be accessed and changed from any part
of the program.
1) They are available throughout the lifetime of a program.
2) Declaring global variables: Global variables are declared outside
all of the functions and blocks.

# R program to illustrate
# usage of global variables

# global variable
global = 5

# global variable accessed from
# within a function
display = function(){
  print(global)
}
display()

# changing value of global variable
global = 10
display()

Output:
[1] 5
[1] 10
Time Complexity: O(1)
Auxiliary Space: O(1)
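
The example above changes the global variable from outside any function. Changing it from inside a function needs a different operator. The sketch below, which is not part of the original notes, uses the super-assignment operator (<<-). An ordinary assignment inside a function only creates a local copy. The names counter, increment_local and increment_global are made up for illustration.

```r
counter <- 0

# Ordinary assignment: creates a local variable named counter,
# the global one is left untouched
increment_local <- function() {
  counter <- counter + 1
  invisible(NULL)
}

# Super-assignment: modifies the variable in the enclosing
# (here: global) environment
increment_global <- function() {
  counter <<- counter + 1
  invisible(NULL)
}

increment_local()
after_local <- counter
print(after_local)

increment_global()
after_global <- counter
print(after_global)
```

After the first call the global counter is still 0, and after the second call it is 1.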

12.6.3Local Variables:
 Local variables are those variables that exist only within a certain part of
a program, such as a function, and are released when the function call ends.
 Variables defined within a function are said to be local to that
function.
 Local variables do not exist outside the function in which they are
declared, i.e. they cannot be accessed or used outside that function.
 Declaring local variables: Local variables are declared inside a function.
Example:

# R program to illustrate
# usage of local variables

func = function(){
  # this variable is local to the
  # function func() and cannot be
  # accessed outside this function
  age = 18
}

print(age)

Output:
Error in print(age) : object 'age' not found
Time Complexity: O(1)
Auxiliary Space: O(1)

 The above program displays an error saying "object 'age' not found".
The variable age was declared within the function func(), so it is local to
that function and not visible to the part of the program outside it.
To correct the error, the value of the variable age has to be displayed
from inside the function func().
 Example:

# R program to illustrate
# usage of local variables

func = function(){
  # this variable is local to the
  # function func() and cannot be
  # accessed outside this function
  age = 18
  print(age)
}

cat("Age is:\n")
func()

Output:
Age is:
[1] 18
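
Another way to use a local value outside its function is to return it from the function and store the result. The sketch below assumes a made-up function get_age() and a made-up local variable local_age.

```r
get_age <- function() {
  # local_age exists only while get_age() is running
  local_age <- 18
  # the value of the last expression is returned to the caller
  local_age
}

# The returned value is stored in a variable outside the function
returned_age <- get_age()
print(returned_age)

# The local variable itself is still not visible here
local_visible <- exists("local_age")
print(local_visible)
```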
