R Viva Ques

The document provides an overview of R programming, its advantages, and essential functionalities for data analysis, including package installation, data structures, and control flows. It covers descriptive statistics, predictive analytics, and practical coding examples in R for statistical computations and visualizations. Additionally, it explains key concepts like regression analysis, correlation, and the use of various statistical functions in R.

Uploaded by

Eshika Upadhyay

UNIT 3: Getting Started with R (6 Hours)

Introduction to R, Advantages, Installing Packages, Importing Data, Commands & Syntax, Packages & Libraries, Data Structures, Control Flows, Loops, Functions, Apply family.

1. What is R and why is it used?

Answer:
R is a free, open-source programming language and software environment used for statistical
computing, data analysis, and visualization. It is widely used in data science, machine learning,
and academia.

2. What are the main advantages of R?

Answer:

● Open source and free to use

● Rich set of packages for data analysis

● Great visualization capabilities (e.g., ggplot2)

● Large and active community

● Widely used in academia and industry

3. How do you install and load a package in R?

Answer:

    install.packages("ggplot2")  # To install (once)
    library(ggplot2)             # To load (in each session)

4. How can you import data from a CSV file in R?

Answer:
    data <- read.csv("filename.csv")
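
As a minimal sketch of a full import round trip, the example below writes a small data frame to a temporary CSV file and reads it back. The data frame `scores`, its column names, and the temporary path are all hypothetical, chosen only so the example runs without an existing file.

```r
# Hypothetical example data; the column names are illustrative only.
scores <- data.frame(name = c("Asha", "Ravi", "Meera"),
                     marks = c(82, 91, 77))

# Write to a temporary file so the example does not depend on a real path.
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# header = TRUE treats the first line as column names (the read.csv default).
# stringsAsFactors = FALSE keeps text columns as character vectors.
imported <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)

head(imported)   # first rows
str(imported)    # column names and types
```

After importing, head() and str() are a quick way to confirm that the columns arrived with the expected types.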

5. What is the difference between a vector and a list in R?

Answer:

● Vector: Homogeneous data type (e.g., all numeric)

● List: Heterogeneous, can contain different types (e.g., numbers, strings, vectors)
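
The difference can be seen in a short sketch: c() coerces mixed inputs to one common type, while list() keeps each element as it is. The object names `v` and `l` are illustrative.

```r
# c() builds an atomic vector, so mixed inputs are coerced to one common type.
v <- c(1, "a", TRUE)
class(v)          # "character": every element became a string

# list() keeps each element's own type.
l <- list(1, "a", TRUE)
sapply(l, class)  # "numeric" "character" "logical"

# Single brackets return a sub-list; double brackets return the element itself.
class(l[1])       # "list"
class(l[[1]])     # "numeric"
```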

6. What are data frames and how are they different from matrices?
Answer:

● Data Frame: Table-like structure with columns of different types

● Matrix: 2D structure with elements of the same type
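
A small sketch of the same contrast, using made-up `id` and `grade` columns: the matrix is forced into a single type, while the data frame keeps one type per column.

```r
# A matrix holds one type only; mixing numbers and text coerces everything to character.
m <- cbind(id = 1:3, grade = c("A", "B", "A"))
typeof(m)              # "character"

# A data frame keeps a separate type for each column.
df <- data.frame(id = 1:3, grade = c("A", "B", "A"), stringsAsFactors = FALSE)
sapply(df, class)      # id: "integer", grade: "character"

# Both are two-dimensional.
dim(m)                 # 3 2
dim(df)                # 3 2
```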

7. How do you create a vector in R?

Answer:

    vec <- c(1, 2, 3, 4)

8. How do you create a matrix in R?

Answer:

    mat <- matrix(1:6, nrow=2, ncol=3)

9. How do you create a list in R?

Answer:

    lst <- list(name="Alex", age=25, scores=c(90, 80, 85))

10. What are factors in R? Why are they used?

Answer:
Factors represent categorical data. Internally they store integer codes together with the set of levels (the distinct categories).

    factor_var <- factor(c("low", "medium", "high"))
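
The sketch below, using illustrative names `f` and `f_ord`, shows how levels and integer codes are stored and how an ordered factor records a ranking. It assumes the intended order is low < medium < high.

```r
# By default, levels are sorted alphabetically: "high" "low" "medium".
f <- factor(c("low", "medium", "high", "low"))
levels(f)
as.integer(f)     # underlying integer codes: 2 3 1 2

# Supplying levels (and ordered = TRUE) records the intended ranking.
f_ord <- factor(c("low", "medium", "high", "low"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)
f_ord < "high"    # TRUE TRUE FALSE TRUE
table(f_ord)      # counts per level, in level order
```

Setting the levels explicitly matters for plots and models, which otherwise use the alphabetical order.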

11. What are conditionals in R? Give an example.

Answer:

Conditionals run a block of code only when a logical condition is TRUE.

    x <- 5
    if (x > 0) {
      print("Positive")
    } else {
      print("Non-positive")
    }

12. What are loops in R? Give an example of a for loop.

Answer:

Loops repeat a block of code. A for loop runs its body once for each element of a vector.

    for (i in 1:5) {
      print(i^2)
    }

13. How do you define a function in R?

Answer:

    square <- function(x) {
      return(x^2)
    }
    square(4)

14. What is the purpose of the apply() function?

Answer:
apply() is used to apply a function over the rows or columns of a matrix or data frame.

    apply(matrix(1:9, nrow=3), 1, sum) # Row-wise sum

15. What is the difference between lapply() and sapply()?

Answer:

● lapply() always returns a list

● sapply() tries to simplify the result into a vector or matrix

    x <- list(a=1:3, b=4:6)
    lapply(x, sum) # List of sums
    sapply(x, sum) # Vector of sums

16. What is the tapply() function used for?

Answer:
tapply() applies a function over subsets of a vector, defined by a factor or grouping variable.

    ages <- c(21, 25, 19, 23)
    gender <- factor(c("M", "F", "M", "F"))
    tapply(ages, gender, mean) # Mean age by gender

17. What is a library in R?

Answer:
A library in R is the directory on disk where installed packages are stored (see .libPaths()). The library() function loads an installed package from that location into the current session.

18. How do you check the structure of a data object?

Answer:

    str(data)

19. How do you check the summary statistics of a data frame?

Answer:

    summary(data)

20. What are the common data types in R?

Answer:

● Numeric

● Integer

● Character

● Logical

● Complex

● Factor (strictly a class built on integer codes rather than a basic type; used for categorical data)

UNIT 4: Descriptive Statistics Using R (6 Hours)

Topics: Data Import, Data Visualization, Measures of Central Tendency, Measures of Dispersion, Covariance, Correlation, Coefficient of Determination

🔸 Section A: Theoretical Questions

1. What is descriptive statistics?
Answer:
Descriptive statistics refers to methods for summarizing and organizing data. It includes:

● Measures of central tendency (mean, median, mode)

● Measures of dispersion (range, variance, standard deviation)

● Charts and graphs for data visualization

2. What are measures of central tendency? Explain each with an example.

Answer:
These describe the center of a data set:

● Mean: Arithmetic average

● Median: Middle value when data is sorted

● Mode: Most frequent value

Example: For data = {2, 3, 4, 5, 5}

● Mean = (2+3+4+5+5)/5 = 3.8

● Median = 4

● Mode = 5

3. What are measures of dispersion? Why are they important?

Answer:
They measure how spread out the data is. Common ones include:

● Range: Difference between max and min

● Variance: Average squared deviation from the mean (R's var() computes the sample variance, dividing by n − 1)

● Standard Deviation: Square root of variance

● IQR: Interquartile range (Q3 - Q1)

They help understand data variability and consistency.
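
As a worked check, the measures above can be computed for the data set {2, 3, 4, 5, 5} from the earlier example. The variable names are illustrative; the point of the sketch is that var() divides by n − 1, not n.

```r
x <- c(2, 3, 4, 5, 5)

diff(range(x))                 # range as a single number: 5 - 2 = 3

# var() is the sample variance: squared deviations divided by (n - 1).
n <- length(x)
manual_var <- sum((x - mean(x))^2) / (n - 1)
manual_var                     # 1.7
var(x)                         # same value

sqrt(manual_var)               # equals sd(x)
IQR(x)                         # Q3 - Q1 = 5 - 3 = 2
```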

4. What is covariance? What does its sign indicate?

Answer:
Covariance measures the directional relationship between two variables:

● Positive covariance = variables move in same direction

● Negative covariance = move in opposite direction

● Near zero = weak or no relationship

5. What is correlation? How is it different from covariance?

Answer:
Correlation standardizes covariance to a [-1, 1] scale:

● +1 = perfect positive linear relationship

● 0 = no linear relationship

● -1 = perfect negative linear relationship

Unlike covariance, correlation is unit-free.
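
The sketch below illustrates both points with hypothetical height and weight values (the numbers are made up for the example): correlation is covariance divided by the product of the standard deviations, and a change of units alters the covariance but not the correlation.

```r
# Hypothetical measurements: heights in metres and weights in kilograms.
height_m  <- c(1.50, 1.60, 1.70, 1.80, 1.90)
weight_kg <- c(52, 58, 69, 75, 88)

# Correlation is covariance divided by the product of the standard deviations.
r_manual <- cov(height_m, weight_kg) / (sd(height_m) * sd(weight_kg))
all.equal(r_manual, cor(height_m, weight_kg))       # TRUE

# Changing units (metres -> centimetres) rescales the covariance by 100...
height_cm <- height_m * 100
cov(height_cm, weight_kg) / cov(height_m, weight_kg)   # 100, up to rounding

# ...but leaves the correlation unchanged.
all.equal(cor(height_cm, weight_kg), cor(height_m, weight_kg))   # TRUE
```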

6. What is the coefficient of determination (R²)?

Answer:
R² tells us how much of the variation in the dependent variable is explained by the independent variable(s).

● For a linear model with an intercept, the value ranges from 0 to 1

● Closer to 1 = better model fit
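
A minimal sketch of the definition, R² = 1 − SS_res / SS_tot, using the mtcars data set that ships with R. The choice of mpg as response and wt as predictor is only an illustration.

```r
# mtcars ships with R; mpg ~ wt is used here purely as an illustration.
model <- lm(mpg ~ wt, data = mtcars)

ss_res <- sum(residuals(model)^2)                     # unexplained variation
ss_tot <- sum((mtcars$mpg - mean(mtcars$mpg))^2)      # total variation
r2_manual <- 1 - ss_res / ss_tot

all.equal(r2_manual, summary(model)$r.squared)        # TRUE

# With a single predictor, R-squared equals the squared Pearson correlation.
all.equal(r2_manual, cor(mtcars$mpg, mtcars$wt)^2)    # TRUE
```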

🔸 Section B: Practical Questions Using R

7. How do you import data from a CSV file in R?

    data <- read.csv("datafile.csv")

8. How do you calculate mean, median, and mode in R?

    mean(data$column)
    median(data$column)

    # Base R has no built-in statistical mode; mode() returns an object's storage mode.
    getmode <- function(x) {
      ux <- unique(x)                          # distinct values
      ux[which.max(tabulate(match(x, ux)))]    # value with the highest count
    }
    getmode(data$column)

9. How do you calculate variance and standard deviation in R?

    var(data$column) # Variance
    sd(data$column) # Standard deviation

10. How do you calculate the range and IQR in R?

    range(data$column)   # returns c(min, max)
    IQR(data$column)

11. How do you calculate covariance and correlation in R?

    cov(data$x, data$y)
    cor(data$x, data$y)

12. How do you calculate the coefficient of determination in R?

    model <- lm(y ~ x, data=data)
    summary(model)$r.squared

13. How do you create a histogram in R?

    hist(data$column, main="Histogram", col="skyblue")

14. How do you create a bar chart in R?

    barplot(table(data$category), main="Bar Chart", col="lightgreen")

15. How do you create a boxplot in R? What does it represent?

    boxplot(data$column, main="Boxplot", col="orange")

Explanation: Shows median, quartiles, outliers, and spread of the data.

16. How do you create a scatter plot in R?

    plot(data$x, data$y, main="Scatter Plot", xlab="X", ylab="Y")

17. How do you create a line graph in R?

    plot(data$x, type="l", main="Line Graph", col="blue")

18. What is the use of the summary() function in R?

    summary(data)

Explanation: For each numeric column, gives the min, 1st quartile, median, mean, 3rd quartile, and max. Non-numeric columns are summarized differently (for example, counts per level for factors).

19. How do you create a pairwise correlation matrix in R?

    cor(data[, c("var1", "var2", "var3")])

20. How do you visualize correlation in R (advanced)?

    # corrplot is a contributed package; install it first with install.packages("corrplot")
    library(corrplot)
    corrplot(cor(data[, c("x", "y", "z")]), method="circle")

UNIT 5: Predictive Analytics Using R – Sample Viva Questions & Answers (Excluding Textual Analytics)

🔸 SECTION A: Theoretical Questions

1. What is Predictive Analytics?
Answer:
Predictive analytics uses statistical techniques (like regression models) to forecast future
outcomes based on historical data.

2. What is regression analysis?

Answer:
Regression analysis is a statistical method used to examine the relationship between a
dependent variable and one or more independent variables.

3. What is the difference between simple and multiple linear regression?

● Simple Linear Regression: One independent variable → one dependent variable

● Multiple Linear Regression: Two or more independent variables → one dependent variable

4. What are the assumptions of a linear regression model?

1. Linearity

2. Independence of errors

3. Homoscedasticity (constant variance of errors)

4. Normality of errors

5. No multicollinearity (in multiple regression)

5. What does the R-squared value indicate?
Answer:
R² (Coefficient of determination) tells us the proportion of variance in the dependent
variable that is predictable from the independent variables.

6. What is the adjusted R-squared?

Answer:
Adjusted R² adjusts the R² value for the number of predictors, and is a better indicator when
comparing models with different numbers of variables.
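
The adjustment can be written as Adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of predictors. The sketch below checks this against R's own value, using the built-in mtcars data with an illustrative choice of predictors.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)   # built-in data, illustrative predictors
s <- summary(model)

n <- nrow(mtcars)                 # number of observations
p <- length(coef(model)) - 1      # number of predictors (intercept excluded)

adj_manual <- 1 - (1 - s$r.squared) * (n - 1) / (n - p - 1)
all.equal(adj_manual, s$adj.r.squared)   # TRUE
s$adj.r.squared <= s$r.squared           # TRUE: the penalty never raises R-squared
```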

7. What is heteroscedasticity?
Answer:
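It occurs when the variance of the residuals is not constant across all levels of the independent variable(s). It violates a key regression assumption: the OLS coefficient estimates remain unbiased but are no longer efficient, and the usual standard errors (and therefore t-tests and confidence intervals) become unreliable.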

8. What is multicollinearity?
Answer:
Multicollinearity occurs when independent variables are highly correlated with each other,
making it difficult to isolate their individual effects.

9. How can you detect multicollinearity?

● Correlation matrix of predictors

● Variance Inflation Factor (VIF): as a common rule of thumb, VIF > 10 indicates strong multicollinearity (some analysts use a stricter cutoff of 5)
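
Both checks can be sketched in base R. The VIF of a predictor is 1 / (1 − R²), where R² comes from regressing that predictor on the other predictors. The three mtcars columns below are an illustrative choice, and `vif_manual` is a hypothetical name; for a model with these numeric predictors, vif() from the car package should report the same values.

```r
# Illustrative predictors from the built-in mtcars data.
predictors <- mtcars[, c("wt", "hp", "disp")]

# Pairwise correlations between predictors.
round(cor(predictors), 2)

# VIF for predictor j: regress it on the remaining predictors, then 1 / (1 - R^2).
vif_manual <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  aux <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(aux)$r.squared)
})
vif_manual
```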

10. What is the difference between confidence intervals and prediction intervals?

● Confidence interval: Range in which the mean value of the dependent variable is expected to fall

● Prediction interval: Range in which an individual prediction is expected to fall (wider)

🔸 SECTION B: R Coding/Output-Based Questions

11. How do you run a simple linear regression in R?

    model <- lm(y ~ x, data = dataset)
    summary(model)

12. How do you run a multiple linear regression in R?

    model <- lm(y ~ x1 + x2 + x3, data = dataset)
    summary(model)

13. How do you extract the R-squared and Adjusted R-squared values in R?

    summary(model)$r.squared
    summary(model)$adj.r.squared

14. How do you interpret regression coefficients?

Answer:
Each coefficient shows the change in the dependent variable for a 1-unit change in the
corresponding independent variable, keeping other variables constant.
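
The "keeping other variables constant" reading can be checked directly. In the sketch below (built-in mtcars data, illustrative predictors, hypothetical cars `car_a` and `car_b`), two observations differ by one unit of wt and share the same hp, so their predicted values differ by exactly the wt coefficient.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)   # built-in data, illustrative predictors
b <- coef(model)
b

# Two hypothetical cars that differ by one unit of wt and share the same hp.
car_a <- data.frame(wt = 3, hp = 110)
car_b <- data.frame(wt = 4, hp = 110)

# The difference in predicted mpg equals the wt coefficient.
diff_pred <- predict(model, car_b) - predict(model, car_a)
all.equal(unname(diff_pred), unname(b["wt"]))   # TRUE
```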

15. How do you check for heteroscedasticity in R?

    plot(model$fitted.values, model$residuals)   # look for a funnel shape
    abline(h = 0, col = "red")

Or use:

    library(lmtest)
    bptest(model) # Breusch-Pagan Test

16. How do you detect multicollinearity in R?

    library(car)
    vif(model)

17. How do you calculate prediction and confidence intervals in R?

    # Confidence interval
    predict(model, newdata = data.frame(x = 50), interval = "confidence")

    # Prediction interval
    predict(model, newdata = data.frame(x = 50), interval = "prediction")

18. How do you visualize a regression line?

    plot(dataset$x, dataset$y, main="Regression Line")
    abline(model, col = "blue")

19. How do you check residual normality in R?

    qqnorm(model$residuals)
    qqline(model$residuals)

20. What is the summary() function used for in regression?

Answer:
It gives the regression output:

● Coefficients

● R-squared values

● F-statistic

● p-values for significance testing

21. How do you interpret the p-value in regression output?

Answer:
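If the p-value of a coefficient is below the chosen significance level (commonly 0.05), the coefficient is statistically significant: the data provide evidence that the variable is associated with the dependent variable. Significance alone does not establish a causal effect or a large effect size.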
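
A short sketch of how the p-values can be pulled out of the regression output programmatically, again using the built-in mtcars data with illustrative predictors. The 0.05 threshold is the conventional choice, not a fixed rule.

```r
model <- lm(mpg ~ wt + hp, data = mtcars)   # built-in data, illustrative predictors

# The coefficient table is a matrix; the last column holds the p-values.
coef_table <- coef(summary(model))
colnames(coef_table)    # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

p_values <- coef_table[, "Pr(>|t|)"]
p_values < 0.05         # which terms are significant at the 5% level
```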

UNIT 3: Getting Started with R

1. What is R and why is it used in data analytics?

Answer:
R is a programming language and software environment for statistical computing and
graphics.
It is widely used in data science and analytics for:

● Data manipulation (via dplyr, data.table)

● Statistical modeling (regression, hypothesis testing)

● Data visualization (ggplot2, base R plots)

● Machine learning (caret, randomForest)

2. What are the advantages of R over other languages like Python or Excel?
Answer:

● Open source and free

● Rich ecosystem of packages for specialized tasks

● Great data visualization capabilities

● Powerful for statistical modeling

● Active community support

● Integrates well with RStudio and R Markdown for reproducible reports

3. What are packages and libraries in R?

Answer:

● A package is a collection of R functions, data, and documentation bundled together.

● A library is where these packages are stored once installed.

● You install a package using install.packages("package_name")

● You load it into the current session using library(package_name)

4. What are the primary data structures in R and where are they used?

Structure  | Description       | Use Case
-----------|-------------------|-----------------------------
Vector     | 1D, homogeneous   | Store numeric/character data
Matrix     | 2D, homogeneous   | Mathematical operations
Array      | n-D, homogeneous  | Multidimensional data
List       | 1D, heterogeneous | Complex data (mix of types)
Data Frame | 2D, heterogeneous | Tabular data (like Excel)
Factor     | Categorical data  | Used in statistical models

5. What is the difference between a vector and a list?

Answer:

● A vector holds elements of the same type (numeric, character, etc.)

● A list can contain different types (e.g., numeric, character, vectors, even data frames)

6. What are conditionals and control flows in R?

Answer:
They are used to control execution based on conditions:

● if, else if, else → decision making

● switch → for selecting among alternatives

● Loops: for, while, repeat → for iteration
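
The constructs above can be sketched in a few lines. The function names `grade` and `centre` and the thresholds used are hypothetical, chosen only for illustration.

```r
# if / else if / else: choose one branch.
grade <- function(marks) {
  if (marks >= 75) {
    "distinction"
  } else if (marks >= 40) {
    "pass"
  } else {
    "fail"
  }
}
grade(82)   # "distinction"

# switch: pick a result by name; the unnamed last argument is the default.
centre <- function(x, type) {
  switch(type,
         mean   = mean(x),
         median = median(x),
         stop("unknown type: ", type))
}
centre(c(1, 2, 10), "median")   # 2

# while: repeat as long as the condition holds.
i <- 1
while (i^2 < 50) i <- i + 1
i   # 8, the first integer whose square is at least 50

# repeat: loop until an explicit break.
total <- 0
k <- 0
repeat {
  k <- k + 1
  total <- total + k
  if (total > 20) break
}
k   # 6, because 1 + 2 + ... + 6 = 21
```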

7. What is the Apply family in R and why is it useful?
Answer:
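The apply() family applies a function over the elements of a data structure without writing an explicit loop:

● apply() – applies a function to the rows or columns of a matrix

● lapply() – applies a function to each element of a list and returns a list

● sapply() – like lapply(), but simplifies the result to a vector or matrix when possible

● tapply() – applies a function over subsets of a vector defined by a grouping factor

● Advantage: Usually more concise and readable than loops, though not necessarily faster than a well-written loop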
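
For comparison, the sketch below computes row totals of a small matrix first with an explicit loop and then with apply(). It also shows vapply(), a stricter relative of sapply() that is not listed above and is included here as an aside.

```r
m <- matrix(1:6, nrow = 2)          # 2 rows, 3 columns

# Explicit loop: pre-allocate the result, then fill it.
row_totals <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  row_totals[i] <- sum(m[i, ])
}
row_totals                          # 9 12

# Same result with apply(); MARGIN = 1 means rows, 2 means columns.
apply(m, 1, sum)                    # 9 12
apply(m, 2, sum)                    # 3 7 11

# vapply() works like sapply() but checks the type and length of each result.
vapply(list(a = 1:3, b = 4:6), mean, FUN.VALUE = numeric(1))   # a = 2, b = 5
```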

UNIT 4: Descriptive Statistics Using R

1. What is descriptive statistics?
Answer:
Descriptive statistics summarizes the basic features of data through numerical measures and
visualizations. It includes:

● Measures of central tendency (mean, median, mode)

● Measures of dispersion (range, variance, standard deviation)

● Shape of distribution (skewness, kurtosis)

2. What is the difference between mean, median, and mode?

● Mean: Average

● Median: Middle value

● Mode: Most frequent value

When to use which:

● Use the median for skewed data or data with outliers

● Use the mean for symmetric, roughly normal data
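
A small sketch of why the median is preferred when outliers are present. The income figures are hypothetical, with one deliberately extreme value.

```r
# Hypothetical monthly incomes (in thousands); the last value is an outlier.
income <- c(30, 32, 35, 38, 40, 400)

mean(income)     # 95.83..., pulled up by the single large value
median(income)   # 36.5, close to the typical value

# Dropping the outlier barely moves the median but changes the mean a lot.
mean(income[-6])     # 35
median(income[-6])   # 35
```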

3. What are measures of dispersion and why are they important?

Answer:
They measure variability or spread in data.

● Range: Difference between max and min

● Variance: Average of squared differences from the mean

● Standard Deviation: Square root of variance

These help understand data reliability and consistency.

4. Explain covariance and correlation.

Answer:

● Covariance: Measures direction of linear relationship

○ Positive → move together

○ Negative → move opposite

● Correlation: Measures strength and direction, scaled from -1 to +1

○ Pearson’s correlation is most common

○ Correlation = Covariance / (SD₁ × SD₂)

5. What is coefficient of determination (R²)?

Answer:
It indicates how much of the variance in the dependent variable is explained by the
independent variable(s).

● R² ranges from 0 to 1

● A higher R² indicates a better model fit

6. What are common charts used for visualization and what do they show?

Chart        | Description
-------------|---------------------------------------
Histogram    | Distribution of numeric data
Bar Chart    | Frequency of categorical data
Box Plot     | Distribution with median, IQR, outliers
Line Graph   | Trend over time
Scatter Plot | Relationship between two variables

UNIT 5: Predictive Analytics (Excluding Textual Analysis)

1. What is simple linear regression?
Answer:
A statistical method to model a linear relationship between one independent and one
dependent variable.
Equation:
Y = β₀ + β₁X + ε

● β₀: intercept

● β₁: slope

● ε: error term
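
To make the equation concrete, the sketch below simulates data from Y = β₀ + β₁X + ε with assumed true values β₀ = 2 and β₁ = 0.5 (chosen arbitrarily for illustration) and then estimates them with lm().

```r
set.seed(1)                       # reproducible simulated data
x <- runif(200, min = 0, max = 10)
eps <- rnorm(200, mean = 0, sd = 1)
y <- 2 + 0.5 * x + eps            # assumed true values: intercept 2, slope 0.5

fit <- lm(y ~ x)
coef(fit)                         # estimates should land near 2 and 0.5

# Fitted values plus residuals reproduce y exactly.
all.equal(unname(fitted(fit) + residuals(fit)), y)   # TRUE
```

The estimates differ slightly from the assumed values because of the random error term; with more observations they tend to move closer.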

2. What is multiple linear regression?

Answer:
It models the relationship between a dependent variable and two or more independent
variables.
Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ε

3. How do you interpret regression coefficients?
Answer:
Each β coefficient represents the expected change in Y for a one-unit increase in the
respective X, holding all other Xs constant.

4. What are confidence intervals and prediction intervals?

● Confidence Interval: Range for the mean value of the dependent variable at a specific X

● Prediction Interval: Range for an individual predicted value

At the same confidence level, prediction intervals are always wider, because they also include the variability of the error term ε.
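
The width difference can be seen by requesting both intervals for the same point. The sketch uses the built-in mtcars data, and `new_car` is a hypothetical new observation.

```r
model <- lm(mpg ~ wt, data = mtcars)      # built-in data, illustrative model
new_car <- data.frame(wt = 3)             # hypothetical new observation

conf_int <- predict(model, newdata = new_car, interval = "confidence", level = 0.95)
pred_int <- predict(model, newdata = new_car, interval = "prediction", level = 0.95)

conf_int   # columns: fit, lwr, upr
pred_int   # same fit, wider limits

# Interval widths: the prediction interval is the wider of the two.
conf_width <- unname(conf_int[, "upr"] - conf_int[, "lwr"])
pred_width <- unname(pred_int[, "upr"] - pred_int[, "lwr"])
pred_width > conf_width   # TRUE
```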

5. What is heteroscedasticity? How does it affect regression?

Answer:

● Residuals have non-constant variance, e.g., their spread increases with X

● Violates an OLS assumption → coefficient estimates stay unbiased but become inefficient, and the usual standard errors are biased

● Detected using the Breusch-Pagan test or residual plots
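
The idea behind the Breusch-Pagan test can be sketched in base R on simulated data whose error spread grows with x by construction. The statistic below is the studentized (Koenker) form, n × R² from regressing the squared residuals on the predictors. It is believed to match what bptest() in lmtest reports by default, but treat that correspondence as an assumption.

```r
set.seed(42)                                  # reproducible simulated data
x <- seq(1, 10, length.out = 200)
y <- 3 + 2 * x + rnorm(200, sd = 0.5 * x)     # error spread grows with x by construction

fit <- lm(y ~ x)

# Auxiliary regression: do the squared residuals depend on x?
aux <- lm(residuals(fit)^2 ~ x)

# Studentized Breusch-Pagan statistic: n * R^2 of the auxiliary regression,
# compared with a chi-squared distribution (df = number of predictors, here 1).
bp_stat <- length(x) * summary(aux)$r.squared
p_value <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
c(statistic = bp_stat, p.value = p_value)
```

A small p-value is evidence against constant variance, which is the expected result here given how the data were generated.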

6. What is multicollinearity and why is it a problem?

Answer:
When independent variables are highly correlated with each other:

● It becomes difficult to assess their individual effects

● Standard errors of the coefficients are inflated

● It can be detected using the VIF (Variance Inflation Factor)

7. What is the role of R² and Adjusted R² in regression?

● R²: Tells how much variance in Y is explained by Xs

● Adjusted R²: Penalizes for adding irrelevant predictors

● Use Adjusted R² for comparing models with different numbers of predictors

8. What are some common pitfalls in regression modeling?

● Overfitting: Too many predictors, so the model fits noise in the sample

● Underfitting: Too few predictors, so the model misses the trend

● Ignoring assumptions: Leads to incorrect inference

● Not checking residuals: Assumption violations go unnoticed, making the model unreliable

PACKAGES:

Unit 3: Getting Started with R

Focus: Data structures, import, control flows, basic programming

Purpose                          | Package / Function | Why It’s Needed
---------------------------------|--------------------|-----------------------------------------------------
Importing Excel/CSV files        | readr, readxl      | For reading .csv and .xlsx files
Data wrangling                   | dplyr, tibble      | For data manipulation and better data frame handling
Viewing data types and structure | str(), summary()   | Built-in base functions (no need to install)

✅ Installation code:

    install.packages(c("readr", "readxl", "dplyr", "tibble"))

✅ Unit 4: Descriptive Statistics Using R

Focus: Data description, central tendency, dispersion, visualizations

Purpose                    | Package      | Why It’s Needed
---------------------------|--------------|---------------------------------------------
Data visualization         | ggplot2      | For histograms, bar charts, box plots, etc.
Statistical summaries      | psych        | Provides functions like describe()
Correlation and covariance | stats (base) | Contains cor(), cov()
Basic boxplots and plots   | Base R       | Functions like hist(), boxplot(), plot()

✅ Installation code:

    install.packages(c("ggplot2", "psych"))

✅ Unit 5: Predictive Analytics (Regression)

Focus: Linear and multiple regression, diagnostics

Purpose                             | Package         | Why It’s Needed
------------------------------------|-----------------|----------------------------------------------------------
Simple & multiple linear regression | stats (base R)  | lm() function is built-in
Regression diagnostics              | car             | For VIF (variance inflation factor) and multicollinearity
Plotting regression and diagnostics | ggplot2, GGally | For enhanced regression visuals
Summary statistics & tests          | lmtest          | For Breusch-Pagan test (heteroscedasticity)

✅ Installation code:

    install.packages(c("car", "GGally", "lmtest"))

📦 Full Master Install Command

You can install everything at once using this combined command:

    install.packages(c("readr", "readxl", "dplyr", "tibble", "ggplot2",
                       "psych", "car", "GGally", "lmtest"))
