RM Lab Kartik Bba (B&i)

The document provides a practical file on research methodology lab using MS Excel and R. It contains an index listing various functions and tools in Excel like count, sum, average, if functions and conditional formatting, charts, pivot tables. It also introduces basics of R like importing and analyzing data, descriptive statistics, hypothesis testing using t-test, ANOVA, correlations. The file serves as a guide to learn analyzing and visualizing data using Excel and R functions.

Uploaded by

Chelsea Sanders


Research Methodology Lab

(Using MS Excel and R)

PRACTICAL FILE

Submitted for partial fulfillment for the award of the

Degree of

BACHELOR OF BUSINESS ADMINISTRATION

(BBA B&I 2017 - 2020)

Under the supervision of

CA. Ruchi Kansil

Submitted by
NAME : KARTIK RATERIA
ENROLLMENT NO. : 01417701817

SCHOOL OF BUSINESS STUDIES

VIVEKANANDA INSTITUTE OF
PROFESSIONAL STUDIES
(Affiliated to Guru Gobind Singh Indraprastha University)

INDEX
Topic Page No.
Functions
Count 06
CountA 07
Count Blank 08
Sum 08
Max 09
Min 09
Average 10
CountIf 11
Average If 12
Sum If 13
Concatenate 14
VlookUp 15
HLookUp 16
Vlookup+ Dropdown 17

Other tools
Transpose table 20
Text to Column 21
Conditional Formatting – Highlight Cell rules (greater than, less than, between,
equal to, text that contains, a date occurring, duplicate values) 23
Conditional Formatting – Top/ Bottom rules 28
Conditional Formatting – Color Scales 29
Conditional Formatting – Data Bars 30
Format as Tables 30
Format Cells – Number, Alignment, Font, Border, Fill 31
Cell Styles 34
Data validation – settings (any value, number, custom) 35
Data validation – input message 38
Data validation – error alert 42
Customization – quick access toolbar 45
Customization- ribbon 46
backstage view 47
save as adobe pdf 47
Data Visualization and Analysis
Frequency 51
Relative frequency 51
Percentage frequency 52
Bar Graph 52
Histogram using Graph tab 53
Pivot Table and its tools 54
Pivot Chart and its tools 57

Histogram frequency distribution 58
Histogram – Chart output 60
Histogram – Pareto (sorted diagram) 61
Histogram – Cumulative percentage 61
Descriptive statistics 62
Descriptive statistics for various scales 63
Correlation 66
Hypothesis Testing
One sample t test using dummy (one-tailed) 69
One sample t test using dummy (two-tailed) 70
One sample t test using test average (one- tailed) 71
One sample t test using test average (two- tailed) 73
t test using function (all combinations) 74
Two sample - Independent sample t test 75
Two sample - Paired Sample t test 78
One sample z test 79
Two sample z test 80
ANOVA – Single Factor 82
ANOVA – Two Factor without replication 84
ANOVA – Two Factor with replication 88
F test 89
Chi square test 94

Introduction to R

Downloading R 99
Four Panes in R 102
Import of Data Sheet in Excel 103
Descriptive Statistics 106
Correlation 108
Hypothesis Testing 109
One Sample T Test 110
Two Sample- Independent Sample T Test 110
Two Sample- Paired Sample T Test 114
One Way ANOVA 119
F Test 122
Chi Square Test 124

FUNCTIONS
Application of basic functions in MS Excel.

Considering this Data

S.NO. NAMES HOUSE SCORES

1 Chhavi White 2.9
2 Anil Green 2.4
3 Uma Saffron 3.4
4 Aryan White
5 Kartik Green 3.9
6 Sneh Saffron 4.5
7 Promila White 3.7
8 Hardik Green 4.7
9 Avirup Das Gupta Saffron 4.3
10 Satyam White 1.8
11 Lovish Green 4.6
12 Raghav Saffron
13 Pulkit White 5
14 Shashwat Green 4.7
15 Pallavi Saffron 5
16 Prishita White
17 Kanchan Green 3.8

1. Count- To count the number of students who have a score.

The COUNT function counts the number of cells that contain numbers, and counts numbers
within the list of arguments. For example, you can enter the following formula to count the
numbers in the range D2:D18: =COUNT(D2:D18). In this example, 14 of the 17 cells in the
range contain numbers (3 are blank), so the result is 14.

Syntax

COUNT(value1, [value2], ...)

The COUNT function syntax has the following arguments:

• value1 Required. The first item, cell reference, or range within which you want to
count numbers.

• value2, ... Optional. Up to 255 additional items, cell references, or ranges within
which you want to count numbers.

Hence, the number of cells that contain numbers is 14.

2. CountA- To count the number of cells in a range that are not empty.
Syntax
COUNTA(value1, [value2], ...)

Therefore, the answer of this function is also 14.

The difference between the Count function and the CountA function is that
Count counts only the cells that contain numbers, whereas CountA counts
all the cells that are not empty.
So, in the score column, if the marks in one of the cells had been written
in words instead of digits, Count would return 13 whereas CountA would
still return 14.

3. Count Blank- To count the number of cells in a range that are blank.
Syntax
COUNTBLANK(range)

So, the answer of this function would be 3 as there are 3 cells blank in the score column.
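
The three counting functions above can be mirrored in a few lines of Python. This is only an illustrative sketch, not how Excel works internally: the score column is typed in by hand, a blank cell is represented by None (an assumption of this sketch), and the helper names count, counta and countblank are made up for the example.

```python
# Score column from the data above; None stands for a blank cell.
scores = [2.9, 2.4, 3.4, None, 3.9, 4.5, 3.7, 4.7, 4.3,
          1.8, 4.6, None, 5, 4.7, 5, None, 3.8]


def count(cells):
    """Like COUNT: number of cells that hold a number."""
    return sum(isinstance(c, (int, float)) and not isinstance(c, bool)
               for c in cells)


def counta(cells):
    """Like COUNTA: number of cells that are not empty."""
    return sum(c is not None for c in cells)


def countblank(cells):
    """Like COUNTBLANK: number of cells that are empty."""
    return sum(c is None for c in cells)


print(count(scores), counta(scores), countblank(scores))  # 14 14 3

# Write one score in words: COUNT drops by one, COUNTA stays the same.
with_text = ["three point nine" if c == 3.9 else c for c in scores]
print(count(with_text), counta(with_text))  # 13 14
```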

4. Sum- To find the sum of selected data.

Syntax
SUM(number1, [number2], [number3], ...)

So, answer of this function is 54.7.

5. Maximum Function- Gives the maximum value from the range.
Syntax
MAX(number1, [number2], ...)

So, maximum value of the score column is 5.

6. Minimum Function- Gives the minimum value from the range.
Syntax
MIN(number1, [number2], ...)

Hence, the answer is 1.8 as it is the smallest value.

7. Average- Gives the average value of the range.
Syntax

AVERAGE(number1, [number2], ...)

The AVERAGE function syntax has the following arguments:

• number1 Required. The first number, cell reference, or range for which you want
the average.

• number2, ... Optional. Additional numbers, cell references or ranges for which you
want the average, up to a maximum of 255.

Hence, the average of all the marks is 3.907.
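
As a cross-check of the Sum, Max, Min and Average answers, the same score column can be summarised in Python. This is a sketch under the assumption that blank cells are simply skipped, which is how these Excel functions treat empty cells inside a range.

```python
# Score column from the data above; None stands for a blank cell.
scores = [2.9, 2.4, 3.4, None, 3.9, 4.5, 3.7, 4.7, 4.3,
          1.8, 4.6, None, 5, 4.7, 5, None, 3.8]

# Keep only the numeric cells, as SUM, MAX, MIN and AVERAGE do for a range.
numbers = [s for s in scores if s is not None]

total = sum(numbers)
average = total / len(numbers)

print(round(total, 1))    # 54.7
print(max(numbers))       # 5
print(min(numbers))       # 1.8
print(round(average, 3))  # 3.907
```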

8. CountIf Function- To count the number of students in “Saffron” house.

Criteria                            Formula Example          Description
Count if greater than or equal to   =COUNTIF(C2:C8,">=5")    Count cells where value is greater than or equal to 5.
Count if less than or equal to      =COUNTIF(C2:C8,"<=5")    Count cells where value is less than or equal to 5.

So, according to this function, there are 5 students who belong to
Saffron house.
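
The idea behind COUNTIF (count only the cells that satisfy a condition) can be sketched in Python as follows. The helper name countif is made up for this example, and the condition is passed as a Python function rather than as an Excel criteria string such as ">=5".

```python
# House and score columns from the data above; None stands for a blank cell.
houses = ["White", "Green", "Saffron", "White", "Green", "Saffron",
          "White", "Green", "Saffron", "White", "Green", "Saffron",
          "White", "Green", "Saffron", "White", "Green"]
scores = [2.9, 2.4, 3.4, None, 3.9, 4.5, 3.7, 4.7, 4.3,
          1.8, 4.6, None, 5, 4.7, 5, None, 3.8]


def countif(cells, condition):
    """Count the non-blank cells for which condition(cell) is true."""
    return sum(1 for c in cells if c is not None and condition(c))


# Text criterion; Excel compares text without regard to case.
print(countif(houses, lambda h: h.lower() == "saffron"))  # 5

# Numeric criteria, in the spirit of ">=5" and "<=5".
print(countif(scores, lambda s: s >= 5))  # 2
print(countif(scores, lambda s: s <= 5))  # 14
```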

9. Average If Function- The Microsoft Excel AVERAGEIF function returns the average
(arithmetic mean) of all numbers in a range of cells, based on a given criterion.

Syntax
AVERAGEIF(range, criteria, [average_range])

For this, we add one more score column to use as a reference.

So, answer according to this function is 3.1.

10. Sum If Function- SUMIF is a function to sum cells that meet a single criterion. SUMIF can be
used to sum cells based on dates, numbers, and text that match specific criteria. SUMIF
supports logical operators (>,<,<>,=) and wildcards (*,?) for partial matching.

Syntax
SUMIF(range, criteria, [sum_range])

For this function, we first add the two scores using the Sum function.
For this, we insert another column using the Insert option in the Home
tab.

So the answer according to this function is again 3.1.
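
To illustrate how AVERAGEIF and SUMIF pair a criteria range with a second range, here is a small Python sketch. The extra score column used for the 3.1 answers above is not reproduced in the text, so this sketch assumes a different question: the sum and average of the scores of Saffron house, using the single score column. The helper names sumif and averageif are hypothetical.

```python
# House and score columns from the data above; None stands for a blank cell.
houses = ["White", "Green", "Saffron", "White", "Green", "Saffron",
          "White", "Green", "Saffron", "White", "Green", "Saffron",
          "White", "Green", "Saffron", "White", "Green"]
scores = [2.9, 2.4, 3.4, None, 3.9, 4.5, 3.7, 4.7, 4.3,
          1.8, 4.6, None, 5, 4.7, 5, None, 3.8]


def matching_values(criteria_range, criterion, value_range):
    """Values whose partner cell equals the criterion; blanks are skipped."""
    return [v for k, v in zip(criteria_range, value_range)
            if k == criterion and v is not None]


def sumif(criteria_range, criterion, sum_range):
    return sum(matching_values(criteria_range, criterion, sum_range))


def averageif(criteria_range, criterion, average_range):
    values = matching_values(criteria_range, criterion, average_range)
    # With no matching numbers this raises ZeroDivisionError,
    # comparable to the #DIV/0! error Excel shows in that case.
    return sum(values) / len(values)


print(round(sumif(houses, "Saffron", scores), 1))      # 17.2
print(round(averageif(houses, "Saffron", scores), 1))  # 4.3
```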

11. Concatenate Function- This function is used to merge data of two cells.
Syntax
CONCATENATE(text1, [text2], ...)
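
A rough Python equivalent of joining two cells is shown below. The helper name concatenate is made up for this sketch; like the Excel function, it adds no separator of its own, so a space has to be passed in explicitly.

```python
def concatenate(*parts):
    """Join the text of several cells, with no separator added."""
    return "".join(str(p) for p in parts)


# Name and house of the first student in the data above.
print(concatenate("Chhavi", " ", "White"))  # Chhavi White
print(concatenate("Chhavi", "White"))       # ChhaviWhite
```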

12. VLookUp Function- VLOOKUP (short for 'vertical' lookup) is a built-in
Excel function that is designed to work with data that is organised into columns.
For a specified value, the function finds (or 'looks up') the value in one column
of data, and returns the corresponding value from another column.
Syntax
=VLOOKUP(value, table, col_index, [range_lookup])
Definitions-

• Value- The value to look for in the first column of a table.
• Table- The table from which to retrieve a value.
• Column Index- The column in the table from which to retrieve a value.
• Range Lookup- [optional] TRUE = approximate match. FALSE = exact match.
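
The exact-match case (Range Lookup = FALSE) can be pictured as a scan down the first column. The Python sketch below uses four rows of the student data; the function name vlookup_exact is hypothetical, and returning the text "#N/A" stands in for the Excel error value.

```python
# A few rows of the student data: S.No., Name, House, Score.
table = [
    (1, "Chhavi", "White", 2.9),
    (2, "Anil", "Green", 2.4),
    (3, "Uma", "Saffron", 3.4),
    (5, "Kartik", "Green", 3.9),
]


def vlookup_exact(value, table, col_index):
    """Exact-match lookup; col_index is 1-based, as in Excel."""
    for row in table:
        if row[0] == value:            # search the first column only
            return row[col_index - 1]  # return the cell from the same row
    return "#N/A"                      # no exact match found


print(vlookup_exact(3, table, 2))  # Uma
print(vlookup_exact(5, table, 4))  # 3.9
print(vlookup_exact(4, table, 2))  # #N/A
```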

13. HLookUp Function- HLOOKUP is an Excel function to look up and retrieve data from a
specific row in a table. The "H" in HLOOKUP stands for "horizontal", where lookup values appear in
the first row of the table, moving horizontally to the right. HLOOKUP supports approximate and
exact matching, and wildcards (* ?) for finding partial matches.
Syntax
=HLOOKUP(value, table, row_index, [range_lookup])

The HLOOKUP function syntax has the following arguments:

• Lookup_value Required. The value to be found in the first row of the table. Lookup_value
can be a value, a reference, or a text string.

• Table_array Required. A table of information in which data is looked up. Use a reference
to a range or a range name.

o The values in the first row of table_array can be text, numbers, or logical values.

o If range_lookup is TRUE, the values in the first row of table_array must be placed in
ascending order: ...-2, -1, 0, 1, 2,... , A-Z, FALSE, TRUE; otherwise, HLOOKUP may not give the
correct value. If range_lookup is FALSE, table_array does not need to be sorted.

o Uppercase and lowercase text are equivalent.

o Sort the values in ascending order, left to right.

• Row_index_num Required. The row number in table_array from which the matching value
will be returned. A row_index_num of 1 returns the first row value in table_array, a row_index_num
of 2 returns the second row value in table_array, and so on. If row_index_num is less than 1,
HLOOKUP returns the #VALUE! error value; if row_index_num is greater than the number of rows
in table_array, HLOOKUP returns the #REF! error value.

• Range_lookup Optional. A logical value that specifies whether you want HLOOKUP to
find an exact match or an approximate match. If TRUE or omitted, an approximate match is returned.
In other words, if an exact match is not found, the next largest value that is less than lookup_value is
returned. If FALSE, HLOOKUP will find an exact match. If one is not found, the error value #N/A is
returned.
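
The argument rules above (sorted first row, approximate versus exact match, and the three error values) can be sketched in Python. This is a simplified illustration, not Excel's implementation: the grade table is a hypothetical example, the function name hlookup is made up, wildcards are not handled, and error values are returned as plain text.

```python
from bisect import bisect_right


def hlookup(value, table, row_index, range_lookup=True):
    """Look up value in the first row; return the cell from row_index (1-based)."""
    if row_index < 1:
        return "#VALUE!"
    if row_index > len(table):
        return "#REF!"
    first_row = table[0]
    if range_lookup:
        # Approximate match: largest first-row value that is <= value.
        # This relies on the first row being sorted in ascending order.
        position = bisect_right(first_row, value) - 1
        if position < 0:
            return "#N/A"
    else:
        # Exact match only.
        if value not in first_row:
            return "#N/A"
        position = first_row.index(value)
    return table[row_index - 1][position]


# Hypothetical grade table: first row holds the lower limit of each grade.
grades = [
    [0, 40, 60, 80],
    ["Fail", "Pass", "Good", "Excellent"],
]

print(hlookup(75, grades, 2))         # Good
print(hlookup(60, grades, 2))         # Good
print(hlookup(75, grades, 2, False))  # #N/A
print(hlookup(75, grades, 3))         # #REF!
```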

14. VLOOKUP + DROPDOWN LIST:-

The VLOOKUP function in Excel can become interactive and more powerful when a Data
Validation drop-down list supplies the Lookup_Value. As you change your selection in the
drop-down list, the VLOOKUP result also changes.

Now, we can add a drop-down list through Data Validation.

This approach extracts the data of a specific person or thing from a large data set with a single
click. It also saves writing the formula again and again.

Importance of these functions-

• Saves time in procuring the data
• It is also cost effective
• Helps to collect and store huge amounts of data
• Makes business operations effective and efficient
• Helps to analyse the collected data quickly and easily
• Based on the analysis, the business can take corrective measures to remove any
malfunctioning factor
• Also, hypotheses can be made with more accuracy
• Makes the data entry process easier and more effective

OTHER TOOLS

• TRANSPOSE TABLE
If you have a worksheet with data in columns that you need to rotate so that it is arranged in rows, use
the Transpose feature. With it, you can quickly switch data from columns to rows, or vice versa.

For example, suppose your data has Sales Regions in the column headings and Quarters along the
left side. Going to Paste Special and selecting Transpose rotates the table so that the Quarters
appear in the column headings and the Sales Regions along the left side.

Here’s how to do it:

1. Select the range of data you want to rearrange, including any row or column labels,
and press Ctrl+C.
2. Choose a new location in the worksheet where you want to paste the transposed table,
ensuring that there is plenty of room to paste your data. The new table that you paste there
will entirely overwrite any data / formatting that’s already there.
3. Right-click the top-left cell of where you want to paste the transposed table, then
choose Transpose.
4. After rotating the data successfully, you can delete the original table and the data in
the new table will remain intact.
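
What Transpose does to a table can be shown in Python, where the table is a list of rows. The region names and figures below are hypothetical placeholders, since the original worksheet is not reproduced in the text.

```python
# Hypothetical table: Sales Regions in the column headings, Quarters down the side.
table = [
    ["",   "North", "South"],
    ["Q1", 10,      20],
    ["Q2", 30,      40],
]

# zip(*table) regroups the cells column by column, so every column becomes a row.
transposed = [list(column) for column in zip(*table)]

for row in transposed:
    print(row)
# ['', 'Q1', 'Q2']
# ['North', 10, 30]
# ['South', 20, 40]
```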

• TEXT TO COLUMN:-
How to Use Text-to-Columns in Excel
1. Open Excel and start a new Blank workbook.
2. Add entries to the first column and select them all.
3. Choose the Data tab atop the ribbon.
4. Select Text to Columns.
5. Ensure Delimited is selected and click Next.
6. Clear each box in the Delimiters section and instead choose Comma and Space.
7. Click Finish.
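
The effect of splitting on the Comma and Space delimiters can be sketched in Python. The function name text_to_columns and the sample entry are made up for this example; the merge_consecutive flag is an assumption that imitates the wizard's "Treat consecutive delimiters as one" option.

```python
import re


def text_to_columns(cell, delimiters=", ", merge_consecutive=True):
    """Split one cell's text into columns at any of the delimiter characters."""
    pattern = "[" + re.escape(delimiters) + "]"
    if merge_consecutive:
        pattern += "+"  # a run of delimiters counts as a single split point
    return re.split(pattern, cell)


print(text_to_columns("Avirup Das Gupta,Saffron"))
# ['Avirup', 'Das', 'Gupta', 'Saffron']

# Without merging, "comma + space" leaves an empty column in between.
print(text_to_columns("Gupta, Saffron", merge_consecutive=False))
# ['Gupta', '', 'Saffron']
```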

• Conditional Formatting:-
Highlight Cell rules (greater than, less than, between, equal to, text that contains, a date occurring,
duplicate values)

DATA USED FOR CONDITIONAL FORMATTING

Name Accounting Communication Computer Law Maths Total Average Viva-Voce Overall
Sonia 16 19 17 15 16 83 16.6 77 160
Kriti 17 19 18 17 15 86 17.2 78 164
Charu 20 15 17 19 16 87 17.4 65 152
Monika 17 17 20 18 15 87 17.4 81 168
Pooja 20 17 15 20 16 88 17.6 75 163
Poonam 17 16 18 15 19 85 17 85 170
Priya 16 19 18 19 18 90 18 60 150
Garima 15 16 15 17 17 80 16 74 154
Charu 19 18 16 20 19 92 18.4 74 166
Sakshi Kakkar 17 20 15 16 17 85 17 84 169
Garima Batra 19 19 17 19 18 92 18.4 61 153
Deepika Jain 19 19 19 20 20 97 19.4 61 158
Soniya 15 15 20 20 19 89 17.8 78 167
Shalini 16 18 16 17 16 83 16.6 73 156
Mona 16 16 20 20 19 91 18.2 82 173
Pooja 18 19 18 16 17 88 17.6 78 166
Anita 19 19 19 18 19 94 18.8 59 153
Divya Gandhi 16 19 15 20 16 86 17.2 78 164
Seema 17 18 19 17 19 90 18 73 163
Taruna Gosain 20 19 20 20 15 94 18.8 75 169
Taruna 15 16 16 16 16 79 15.8 60 139
Sheetal 19 20 18 18 17 92 18.4 73 165
Mona 20 19 17 16 18 90 18 73 163
Megha Gupta 16 16 20 16 16 84 16.8 85 169
Kamna 20 18 18 18 16 90 18 61 151
Payal Ahuja 18 18 15 15 20 86 17.2 74 160
Pooja 19 15 20 17 20 91 18.2 73 164
Kriti Khera 19 18 19 19 19 94 18.8 84 178
Anju 19 19 20 18 17 93 18.6 86 179
Bhawna 19 15 19 15 19 87 17.4 73 160
Monika 16 18 20 15 18 87 17.4 65 152
Sunita 16 18 20 20 18 92 18.4 82 174
Khushboo 19 19 19 20 20 97 19.4 73 170
Heena 15 16 20 20 15 86 17.2 58 144
Charu 18 16 16 19 17 86 17.2 89 175
Sonal 15 19 17 19 19 89 17.8 85 174
Sapna Kharab 18 20 16 17 20 91 18.2 58 149
Deepika 19 20 17 17 20 93 18.6 56 149
Himani Hans 16 17 17 16 19 85 17 59 144
Savina 18 16 16 17 17 84 16.8 92 176

1. Highlight the cells of students who have got overall marks more than
165 (Greater than).

Less than

Between

Equal to

Duplicate

2. Highlight cells with green colour where the total number of marks
is more than or equal to 90.

3. Highlight cells with blue where the names of students start with S.
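
A conditional formatting rule is a test applied to every cell, with a format applied wherever the test is true. The three rules above can be sketched in Python on a handful of rows taken from the data table (Name, Total, Overall). The function name highlight and the labels it returns are made up for this illustration.

```python
# (Name, Total, Overall) for a few students from the table above.
students = [
    ("Sonia", 83, 160),
    ("Kriti", 86, 164),
    ("Priya", 90, 150),
    ("Sakshi Kakkar", 85, 169),
    ("Deepika Jain", 97, 158),
    ("Anju", 93, 179),
]


def highlight(rows):
    """Return, for each student, the list of rules that would format the row."""
    result = {}
    for name, total, overall in rows:
        applied = []
        if overall > 165:                  # rule 1: Greater Than 165
            applied.append("overall > 165")
        if total >= 90:                    # rule 2: green fill
            applied.append("green")
        if name.upper().startswith("S"):   # rule 3: blue fill
            applied.append("blue")
        result[name] = applied
    return result


for name, applied in highlight(students).items():
    print(name, applied)
```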

4. Top/Bottom Rules

5. Color Scales

6. Data Bars

• FORMAT AS TABLES
To format data as a table:
1. Select the cells you want to format as a table.
2. From the Home tab, click the Format as Table command in the Styles group.
3. Select a table style from the drop-down menu.
4. A dialog box will appear, confirming the selected cell range for the table.

• FORMAT CELLS
To apply number formatting:
1. Select the cell(s) you want to modify.
2. Click the drop-down arrow next to the Number Format command on the Home tab.
3. The Number formatting drop-down menu will appear.
4. Select the desired formatting option.
5. The selected cells will change to the new formatting style.
The Format Cells options covered here are:
1. Number
2. Alignment
3. Font
4. Border
5. Fill

• NUMBER

• ALIGNMENT

• FONT

• BORDER

• FILL

• CELL STYLES
To apply a cell style:
• Select the cell(s) you want to modify.
• Click the Cell Styles command on the Home tab, and then choose the
desired style from the drop-down menu. In our example, we'll choose Accent 1.
• The selected cell style will appear.

• DATA VALIDATION

1. Select the cell(s) you want to create a rule for.

2. Select Data >Data Validation.

3. On the Settings tab, under Allow, select an option:

o Whole Number - to restrict the cell to accept only whole numbers.
o Decimal - to restrict the cell to accept only decimal numbers.
o List - to pick data from the drop-down list.
o Date - to restrict the cell to accept only date.
o Time - to restrict the cell to accept only time.
o Text Length - to restrict the length of the text.
o Custom – for custom formula.
4. Under Data, select a condition:
o between
o not between
o equal to
o not equal to
o greater than
o less than
o greater than or equal to
o less than or equal to

5. Set the other required values, based on what you chose for Allow and Data. For
example, if you select between, then select the Minimum: and Maximum: values for the
cell(s).
6. Select the Ignore blank checkbox if you want to ignore blank spaces.
7. If you want to add a Title and message for your rule, select the Input Message tab,
and then type a title and input message.
8. Select the Show input message when cell is selected checkbox to display the
message when the user selects or hovers over the selected cell(s).
9. Select OK.

Now, if the user tries to enter a value that is not valid, a pop-up appears with the message,
“This value doesn’t match the data validation restrictions for this cell.”
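
The logic of a "Whole Number, between Minimum and Maximum" rule can be sketched in Python. The limits 1 and 100 and the function names are assumptions chosen for the example; the error text is the default message quoted above.

```python
MESSAGE = ("This value doesn’t match the data validation "
           "restrictions for this cell.")


def is_valid_whole_number(value, minimum, maximum):
    """Allow: Whole Number, Data: between minimum and maximum (inclusive)."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False  # decimals and text are rejected
    return minimum <= value <= maximum


def enter(value, minimum=1, maximum=100):
    """Accept the entry, or refuse it the way a Stop-style alert does."""
    if not is_valid_whole_number(value, minimum, maximum):
        raise ValueError(MESSAGE)
    return value


print(enter(50))  # 50
for bad in (150, 2.5, "ten"):
    try:
        enter(bad)
    except ValueError as error:
        print(repr(bad), "rejected:", error)
```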

Each worksheet is listed below, along with the kind of Data Validation you'll find:

Worksheet            Data Validation Type

Whole number         Limit entries to whole numbers
Decimal              Limit entries to decimal (percentage) values
Departments          Limit selections to list choices
Cost centers table   Table for Cost center list source
Cost center budget   Limit selections to Cost center list choices
Date                 Limit entries to dates within a range
Time                 Limit entries between a time frame
Text length          Limit entries to a certain number of characters
HR Budget            Limit entries to a certain maximum amount
Products             Require entries to meet certain text guidelines
Age verification     Limit entries below a certain age
Custom values        Limit entries to unique values only (no repeated entries)
E-Mail               Require entries to contain the @ symbol

SETTINGS- ANY NUMBER

CUSTOM

INPUT MESSAGE
Data Validation Messages
With the options available in data validation, you can display messages to give instructions to
the people who use your spreadsheet. There are two types of data validation messages:

1. An Input Message can be displayed when a cell is selected.

2. An Error Alert can be displayed if invalid data is entered in a cell

Create an Input Message

To help people know what data should be entered in a cell, you can set up an Input
Message that is displayed when the cell is selected.

Follow these steps to show a short message when a cell is selected.

1. Select the cells in which you want to apply data validation
2. On the Ribbon, click the Data tab, and click Data Validation
3. (optional) On the Settings tab, choose the data validation settings
4. Click on the Input Message tab, and add a check mark to Show input message when
cell is selected

5. Type your message heading text in the Title box. This text will appear in bold print at
the top of the message.
6. Type a short message in the Input message box. Press the Enter key, to create line
breaks, if you want them.
NOTE: The limit is 255 characters

7. Click OK or follow the steps below to add an Error Alert.

8. Now, when you click on the cell, the Input Message will appear.

Input Message Size
Although there are 255 characters allowed in the Input Message box, the box has a maximum
height and width, and all the characters might not fit.
NOTE: The size of the message box cannot be changed -- it is automatically set by Excel.
For example, in the message box below, there are 254 "i" characters, with an "X" at the end.

However, in the message box below, there are 254 "W" characters, with an "X" at the end.
Only 126 of the characters appear in full, and the remaining characters are cut off, or not
visible.

Input Message Position

In most cases, the input message pops below the cell, with the left edge of the message at the
middle point of the cell's width.

If the cell is close to the right side of the Excel window, the right border of the input message
will start at the Excel window border.

If there is not enough room below the cell, the input message appears at the right side of the
cell, if there is enough room there.

If there is not enough room below the cell, or to the right, the input message appears at the
left side of the cell.

If there is a comment in the cell, the input message appears below the cell, with the right edge
of the message at the middle point of the cell's width. This can cause problems in column A,
where there is no room at the left, and the data validation message is cut off.

Move an Input Message

When an input message appears, you can temporarily drag it to a different location on the
worksheet.

• The location is only temporary -- the message box will return to its original position,
when you close and reopen the workbook.
• ALL input messages on that worksheet will appear in that location, until the
workbook is closed and reopened.

Create an Error Alert
When you add data validation to a cell, the Error Alert feature is automatically turned on. It
blocks the users from entering invalid data in the cell.

You can turn Error Alert off, to allow people to enter invalid data. Or, change the type of
Error Alert, by following the instructions below.

1. Select the cells in which you want to apply data validation

2. On the Ribbon, click the Data tab, and click Data Validation
3. On the Settings tab, choose the data validation settings
4. Click on the Error Alert tab, and add a check mark to Show error alert after invalid
data is entered.

5. Choose an Error Alert Style from the dropdown list.

o Stop: Prevents the entry of invalid data.
If the Retry button is clicked, the invalid entry is highlighted, and can be
overtyped.
If the Cancel button is clicked, the invalid entry is deleted, and the cell's
original content is restored.
The user cannot leave the invalid entry in the cell.

o Warning: Discourages the entry of invalid data.

If the Yes button is clicked, the invalid entry is accepted, and the next cell is
selected.
If the No button is clicked, the invalid entry is highlighted, and can be
overtyped.
If the Cancel button is clicked, the invalid entry is deleted, and the cell's
original content is restored.
The user can choose to leave the invalid entry in the cell.

o Information: Announces the entry of invalid data.

If the OK button is clicked, the invalid entry is accepted, and the next cell is
selected.
If the Cancel button is clicked, the invalid entry is deleted, and the cell's
original content is restored.
The user can choose to leave the invalid entry in the cell.

6. Type your message heading text in the Title box. This text will appear in bold print at
the top of the message.
7. Type a short message in the Error message box. The limit is 225 characters
8. Click OK

Importance of Data Validation in Excel-

• Data validation makes sure the right types of data get into rows and columns.
• Use data validation to restrict the type of data or the values that users can enter into a cell.
• One of the most common problems with a large amount of data in Excel is
checking whether the data in a column was entered in the expected type.
• For example, the required type may be a date, but some rows contain text or numbers. If
we import an Excel file with inconsistent types of data, it may cause errors on the other end.
Hence, data validation plays an important role in preventing these types of error.
Data Validation in Excel lets you control the data that can be entered in a cell. You can
restrict the user to entering only a specified range of numbers, text or dates.
You can also use data validation to create an Excel drop-down list (one of the most
powerful features in Excel).

• CUSTOMIZATION- QUICK ACCESS TOOLBAR

Add a command to the Quick Access Toolbar

• On the ribbon, click the appropriate tab or group to display the command that you
want to add to the Quick Access Toolbar.

• Right-click the command, and then click Add to Quick Access Toolbar on the
shortcut menu.

CUSTOMIZATION- RIBBON
• Click the Office Button;
• Click the Excel Options button at the bottom to open the Excel Options
window;
• Click the Popular button at the left;
• Under Top options for working with Excel, check the Show Developer tab in the
Ribbon option;
• Click the OK button to finish editing.

BACKSTAGE VIEW
Backstage View allows you to manipulate aspects of a file. Backstage View is accessible by
clicking on the "File" tab near the top of the application window. The backstage view gives access to
saving, opening, info about the open file (Permissions, Sharing, and Versions), creating a new file,
printing, and recently opened files.

SAVING FILE As PDF

-Click the Office Button, point to the Save As command, and click the PDF or XPS option on the
continuation menu.

-The Publish as PDF or XPS dialog box appears.

-Edit the filename and/or folder location (if necessary) and click the Publish button.
-Excel saves the workbook in a PDF file and automatically opens it in Adobe Reader.

DATA VISUALISATION AND ANALYSIS

DATA TO BE CONSIDERED

Student Score
Rhea Madsen 69
Jennifer Mendez 81
Brett Broyles 69
Shirley Smith 81
John Brown 100
Michael G. Welch 81
Donald Tse 100
Madeline Stevens 82
Howard Porter 81
Helen Craven 81
Lillie Schultz 69
Emily Li 78
Michael Long 69
Chris Herrman 88
Marshall Sherman 100
William Grindle 82
Pauline Haun 69
Lydia J. Evans 81
James Weaver 28

1. FREQUENCY
The Microsoft Excel FREQUENCY function returns how often values occur within a set of data. It
returns a vertical array of numbers. The FREQUENCY function is a built-in function in Excel that is
categorized as a Statistical Function. It can be used as a worksheet function (WS) in Excel.

Syntax
=FREQUENCY(data_array, bins_array)

2. RELATIVE FREQUENCY
Relative frequency is the share of the total number of observations that falls in a given
class, that is, the class frequency divided by the total frequency.

3. PERCENTAGE FREQUENCY
A percentage frequency distribution is a display of data that specifies the percentage of observations that exist for
each data point or grouping of data points. It is a particularly useful method of expressing the
relative frequency of survey responses and other data.
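
The three measures can be worked through on the 19 student scores listed under DATA TO BE CONSIDERED. In the sketch below the bin limits 60, 70, 80, 90 and 100 are an assumption made for illustration. The function follows the FREQUENCY convention that each bin counts the values up to and including its limit (and above the previous limit), with one extra count at the end for values above the last bin.

```python
from bisect import bisect_left

# The 19 student scores from the data above.
scores = [69, 81, 69, 81, 100, 81, 100, 82, 81, 81,
          69, 78, 69, 88, 100, 82, 69, 81, 28]
bins = [60, 70, 80, 90, 100]  # hypothetical upper limits, ascending


def frequency(data, bins):
    """Counts per bin, plus one final count for values above the last bin."""
    counts = [0] * (len(bins) + 1)
    for x in data:
        # bisect_left finds the first bin whose upper limit is >= x.
        counts[bisect_left(bins, x)] += 1
    return counts


counts = frequency(scores, bins)
total = sum(counts)
relative = [c / total for c in counts]            # relative frequency
percentage = [round(100 * r, 1) for r in relative]  # percentage frequency

print(counts)  # [1, 5, 1, 9, 3, 0]
print(percentage)
```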

4. BAR GRAPH
• Open Excel. Locate and open the spreadsheet from which you want to make a bar chart.
• Select all the data that you want included in the bar chart.
• Be sure to include the column and row headers, which will become the labels in the bar chart. If you want
different labels, type them in the appropriate header cells.
• Click on the Insert tab and then on the Insert Column or Bar Chart button in the Charts group. You'll see many
options when you select this button, such as 2-D columns and 3-D columns, as well as 2-D and 3-D bars. For
these purposes, we're selecting 2-D columns.
• The chart will appear. You'll also see horizontal bars giving the names of your headers at the bottom of your
graph.
• Next, give your chart a name. Click on the Chart Title section at the top of the graph and the section
becomes editable.
• Decide where to place the bar chart. It can be placed on a separate sheet or it can be embedded in the
spreadsheet. Then save it.
• If you want to delete the chart and start all over again, place your cursor on the edge of the chart (you'll get a
pop-up that says "chart area") and press your Delete key.

5. Histogram using Graph tab
A histogram is a specific use of a column chart where each column represents the frequency
of elements in a certain range. In other words, a histogram graphically displays the number of
elements within the consecutive non-overlapping intervals, or bins.

6. PIVOT TABLE
A pivot table is a tool that allows you to reorganize and summarize selected
columns and rows of data in a spreadsheet or database table to obtain a desired report. ... For
example, a store owner might list monthly sales totals for a large number of merchandise
items in an Excel spreadsheet.

Insert a Pivot Table

To insert a pivot table, execute the following steps.
1. Click any single cell inside the data set.
2. On the Insert tab, in the Tables group, click PivotTable.

The following dialog box appears. Excel automatically selects the data for you. The default
location for a new pivot table is New Worksheet.
3. Click OK.
Consider the data given for Sales report
Date Salesperson Company Product Sales Value
1/31/2010 JJJ North Rental 29,546.00
3/10/2010 BBB North Flexelease 20,132.00
9/6/2010 GGG South Operating Lease 42,214.00
1/10/2010 EEE South Operating Lease 30,123.00
6/10/2010 BBB North Contract Hire 42,939.00
3/1/2010 DDD South Flexelease 68,804.00
1/9/2010 KKK North Contract Hire 41,979.00
6/2/2010 AAA North Contract Hire 41,485.00
2/10/2010 EEE South Capital Lease 63,237.00
7/10/2010 AAA North Operating Lease 66,944.00
1/10/2010 DDD South Rental 32,445.00
10/10/2010 FFF South Flexelease 41,345.00
1/10/2010 EEE South Rental 62,493.00
10/6/2010 GGG South Flexelease 27,628.00
5/1/2010 GGG South Capital Lease 55,421.00
10/10/2010 FFF South Contract Hire 40,622.00
4/10/2010 CCC North Contract Hire 36,208.00
8/10/2010 CCC North Flexelease 33,299.00
12/10/2010 DDD South Capital Lease 36,286.00
4/10/2010 JJJ North Rental 30,289.00
8/10/2010 HHH South Rental 20,805.00
3/10/2010 FFF South Contract Hire 60,837.00
8/10/2010 KKK North Capital Lease 47,350.00
11/10/2010 KKK North Operating Lease 49,368.00
9/10/2010 AAA North Operating Lease 39,292.00
8/8/2010 JJJ North Flexelease 38,261.00
4/10/2010 BBB North Flexelease 72,022.00
9/10/2010 EEE South Capital Lease 59,960.00
3/10/2010 AAA North Rental 71,212.00
8/10/2010 DDD South Contract Hire 58,338.00
1/10/2010 CCC North Flexelease 37,862.00
2/28/2010 AAA North Flexelease 52,639.00
9/10/2010 JJJ North Rental 61,021.00
6/10/2010 EEE South Capital Lease 64,552.00
1/10/2010 DDD South Capital Lease 51,404.00
12/10/2010 CCC North Rental 68,183.00
3/7/2010 JJJ North Operating Lease 74,061.00
12/10/2010 GGG South Capital Lease 65,538.00
5/10/2010 AAA North Rental 52,173.00
3/10/2010 KKK North Operating Lease 40,175.00
6/10/2010 JJJ North Capital Lease 54,463.00
6/10/2010 CCC North Contract Hire 42,500.00
7/10/2010 HHH South Rental 35,866.00
9/10/2010 GGG South Capital Lease 72,784.00
6/8/2010 CCC North Contract Hire 64,475.00
12/10/2010 AAA North Capital Lease 22,924.00
2/10/2010 KKK North Contract Hire 24,145.00
11/10/2010 HHH South Contract Hire 54,353.00
2/10/2010 BBB North Contract Hire 31,127.00

Now drag the “Product” title from Row Labels to Column Labels.
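
What a pivot table computes can be sketched in Python as a grouped sum. The sketch uses only the first six rows of the sales report and assumes one particular layout: the Company column (North / South) as row labels, Product as column labels, and the sum of Sales Value in the cells. The actual field layout used in the worksheet is not shown in the text.

```python
from collections import defaultdict

# First six rows of the sales report: (Company, Product, Sales Value).
rows = [
    ("North", "Rental", 29546.00),
    ("North", "Flexelease", 20132.00),
    ("South", "Operating Lease", 42214.00),
    ("South", "Operating Lease", 30123.00),
    ("North", "Contract Hire", 42939.00),
    ("South", "Flexelease", 68804.00),
]

# pivot[company][product] accumulates the sales value for that combination.
pivot = defaultdict(lambda: defaultdict(float))
for company, product, value in rows:
    pivot[company][product] += value

# Print the grid: one row per company, one column per product.
products = sorted({product for _, product, _ in rows})
print("Company".ljust(8), *(p.ljust(16) for p in products))
for company in sorted(pivot):
    cells = (f"{pivot[company].get(p, 0.0):,.2f}".ljust(16) for p in products)
    print(company.ljust(8), *cells)
```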

7. Pivot Chart
A pivot chart is especially useful when dealing with large amounts of data. For
example, suppose a society with a large number of employees records the working hours of
each employee in Excel, so that at the end of each month the employee with the highest
number of working hours receives a bonus for sincerity and devotion to the
society. Working through the complete list of employees by hand would be very time
consuming and error prone, whereas a pivot table or a pivot chart
allows quickly reorganizing and visualizing the data in an understandable manner and
facilitates the entire process.

To insert a pivot chart, execute the following steps.

1. Click any cell inside the pivot table.
2. On the Analyze tab, in the Tools group, click PivotChart.
The Insert Chart dialog box appears.
3. Click OK.

58
8.Histogram using frequency distribution
1. On the Data tab, in the Analysis group, click the Data Analysis button.

2. In the Data Analysis dialog, select Histogram and click OK.

3. In the Histogram dialog window, do the following:

o Specify the Input range and the Bin range.

To do this, you can place the cursor in the box, and then simply select the corresponding
range on your worksheet using the mouse. Alternatively, you can click the Collapse
Dialog button, select the range on the sheet, and then click the Collapse
Dialog button again to return to the Histogram dialog box.

Tip. If you included column headers when selecting the input data and bin range, select
the Labels check box.

o Select the Output options.

To place the histogram on the same sheet, click Output Range, and then enter the upper-
left cell of the output table.

To paste the output table and histogram in a new sheet or a new workbook, select New
Worksheet Ply or New Workbook, respectively.

Finally, choose any of the additional options:

- To present data in the output table in descending order of frequency, select the Pareto (sorted histogram) box.
- To include a cumulative percentage line in your Excel histogram chart, select the Cumulative Percentage box.
- To create an embedded histogram chart, select the Chart Output box.

ORDER NO. DELIVERY (DAYS)

101 4
102 20
103 13
104 6
105 25
106 5
107 27
108 9
109 5
110 17
111 14
112 2
113 13
114 5
115 7
116 17
117 28
118 23
119 7
120 26
121 40
122 3
123 12
124 7
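
The frequency table that the Histogram tool builds from this data can also be sketched in R. This is only a minimal sketch: the bin upper limits (10, 20, 30, 40) are an assumption, because the Bin range used in the worksheet is not shown, and the names days, bins and freq are illustrative.

```r
# Delivery times (days) for orders 101-124, copied from the table above.
days <- c(4, 20, 13, 6, 25, 5, 27, 9, 5, 17, 14, 2,
          13, 5, 7, 17, 28, 23, 7, 26, 40, 3, 12, 7)

# Assumed bin upper limits (hypothetical; the worksheet's Bin range is not shown).
bins <- c(10, 20, 30, 40)

# Count the values that are <= each bin limit and greater than the previous limit.
groups <- cut(days, breaks = c(-Inf, bins), labels = paste("<=", bins), right = TRUE)
freq <- table(groups)

# Cumulative percentage column (the "Cumulative Percentage" option).
cum_pct <- round(100 * cumsum(freq) / sum(freq), 2)

# Pareto view: the same frequencies sorted in descending order.
pareto <- sort(freq, decreasing = TRUE)

print(data.frame(bin = names(freq),
                 frequency = as.vector(freq),
                 cumulative_pct = as.vector(cum_pct)))
```

With different bin limits only the bins vector needs to change; the Pareto ordering and the cumulative percentages follow from the same frequency table.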

CHART OUTPUT AND CUMULATIVE FREQUENCY-

PARETO

DESCRIPTIVE STATISTICS

Descriptive statistics are one of the fundamental “must knows” with any set of data. They give you a general idea
of trends in your data, including:

-The mean, mode, median and range.

-Variance and standard deviation.
-Skewness.
-Count, maximum and minimum.

Consider the following data:-

By using if function, we assign codes to gender

Using nested if function, assign qualification code

Using vlookup, find gender code

To run a descriptive analysis, go to Data > Data Analysis > Descriptive Statistics

Then, click ok

Repeat the same with qualification and work experience

8.CORRELATION
We usually use the correlation coefficient (a value between -1 and 1) to show how strongly two variables are related to each other. In
Excel, we can use the CORREL function to find the correlation coefficient between two variables.

HYPOTHESIS TESTING

ONE SAMPLE T TEST USING DUMMY (ONE TAILED)

The one sample t test is a way to tell if the difference between the sample mean and a hypothesised
value is significant or if it could have happened by chance.
PROBLEM STATEMENT:- To establish that the mean work experience is greater than 20

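STEPS :
1) Consider the data given of work experiences of 24 people.
2) Create a second column under the name dummy and fill in values as 0.
3) Go to Data > Data Analysis > t-Test: Two-Sample Assuming Unequal Variances.
4) Select work experiences as input range 1 and dummy column as input range 2.
5) Enter Hypothesised mean difference as ‘20’ and Alpha as ‘0.05’. Select the output range and click
‘OK’.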

For one tailed testing, highlight the one tailed p value and strikethrough the two tailed p value.
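
Why the dummy column works can be checked in R. The sketch below assumes the 24 work experience values are the ones stored in the vector a in the R part of this file; the names dummy, two_sample and one_sample are illustrative. Because a column of zeros has zero variance, the unequal-variance two sample test with a hypothesised difference of 20 reduces to an ordinary one sample test against 20.

```r
# Work experience of 24 people (the vector `a` from the R part of this file).
a <- c(5, 6, 24, 16, 17, 10, 23, 11, 17, 3, 21, 18,
       18, 12, 12, 17, 10, 3, 7, 13, 23, 9, 22, 8)

# Dummy column of zeros, one per observation.
dummy <- rep(0, length(a))

# Unequal-variance (Welch) two sample test with hypothesised difference 20.
two_sample <- t.test(a, dummy, mu = 20, var.equal = FALSE)

# Ordinary one sample test of the mean against 20.
one_sample <- t.test(a, mu = 20)

# Both tests give the same t statistic and the same degrees of freedom (n - 1).
print(c(t_dummy = unname(two_sample$statistic), t_one = unname(one_sample$statistic)))
print(c(df_dummy = unname(two_sample$parameter), df_one = unname(one_sample$parameter)))
```

The sign of the t statistic matters for a one tailed test: a negative value here means the sample mean lies below 20.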
HYPOTHESIS –
H0 : μ ≤ 20
H1 : μ > 20

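DECISION RULE:
1) If t stat is greater than t critical (one tail), reject Null.
2) If P value < alpha, reject Null. Excel’s one tailed p value refers to the tail on the side of the
observed t stat, so it supports H1 : μ > 20 only when t stat is positive.

INFERENCE: The absolute value of t stat must not be used in a one tailed test. For these 24
observations the sample mean is 13.54 (see the R output for the same data later in this file), which is
below 20, so t stat is negative and cannot exceed t critical. Therefore, we do not reject null: the data
give no evidence that μ > 20.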

ONE SAMPLE T TEST USING DUMMY (TWO TAILED)

PROBLEM STATEMENT: - To know whether the average work experience of workers differs from 20 or
not, using a dummy variable

STEPS :
1) Consider the data given of work experiences of 24 people.
2) Create a second column under the name dummy and fill in values as 0
3) Go to data analysis >t test two sample assuming unequal variances.
4) Select work experiences as input range 1 and dummy column as input range 2.
5) Enter Hypothesised mean difference as ‘20’ and Alpha as ‘0.05’. Select the output range and click
‘OK’.

For two tailed testing, highlight the two tailed p value and strikethrough the one tailed p value.
HYPOTHESIS:-
H0 : µ = 20
H1 : µ ≠ 20

DECISION RULE:
1) If absolute value of t stat is greater than t critical (two tail), reject Null.
2) If P value (two tail) < alpha, reject Null.

INFERENCE:-
Since p value is smaller than alpha, reject null, i.e. H1 : µ ≠ 20: the mean experience differs from
20.

ONE SAMPLE T – TEST using Test average (ONE TAILED)
PROBLEM STATEMENT:- To know if the average work experience of workers is more than 20,
taking the test average as 20

STEPS : 1) Consider the data given of work experiences of 24 people.

2) Add a column of Test average and enter “20” in entire column corresponding to the observations.
3) Data > Data Analysis > “t - Test : Two sample assuming unequal variances”. Click OK.
4) Select work experiences as input range 1 and Test average column as input range 2.
5) Enter Hypothesised mean difference as ‘0’ (the Test average column already holds the value 20) and
Alpha as ‘0.05’. Select the output range and click ‘OK’.

HYPOTHESIS -
H0 : µ ≤ 20
H1 : µ > 20

DECISION RULE:
1) If t stat is greater than t critical, reject Null.
2) If P value < alpha, reject Null.
INFERENCE: P value is greater than alpha, so we’ll accept Null.

ONE SAMPLE T – TEST using Test average (TWO TAILED)
PROBLEM STATEMENT:- To know if the average work experience of workers is less than 20 or
more than 20, taking test average as 20.

STEPS : 1) Consider the data given of work experiences of 24 people.

DECISION RULE:
1) If absolute value of t stat is greater than t critical (two tail), reject Null.
2) If P value < alpha, reject Null.
INFERENCE:
P value is greater than alpha, so we’ll accept Null.

t-Test: Two-Sample Assuming unequal Variances

PROBLEM STATEMENT:- To analyse whether the time spent by full time students in studying statistics is
less than that of part time students or not.

STEPS:-
1. Go to Data > Data Analysis > t-Test: Two-Sample Assuming Unequal Variances

2. Select the input and output range, take alpha as 0.05 and hypothesised mean difference as
0.
3. Hit OK.

HYPOTHESIS.

H0= time spent by full time students in studying statistics is not less than part time students
H1= time spent by full time students in studying statistics is less than part time students
DECISION RULE:- If p value is less than alpha, reject null, and vice versa.

INFERENCE:- since p value is more than alpha, so accept null. That is, time spent by full
time students in studying statistics is not less than part time students.

t-Test: Two-Sample Assuming Equal Variances
Independent samples
PROBLEM STATEMENT:-
To determine whether the mean marks in different subjects differ.
HYPOTHESIS:-
H0 : There is no difference between the mean marks in the two subjects.
H1 : There is a significant difference between the mean marks in the two subjects.
Consider the following data

ECONOMICS SCIENCE HISTORY
42 69 35
53 54 40
49 58 53
53 64 42
43 64 50
44 55 39
45 56 55
52 - 39
54 - 40

STEPS:-
1. Go to Data > Data Analysis > t-Test: Two-Sample Assuming Equal Variances
2. Select input range and output range. Select hypothesised mean difference as 0 and alpha
as 0.05
3. Click ok.

Repeat the steps for other combinations as well.

DECISION RULE: if p value is less than alpha, reject null.

INFERENCE:- 1. Reject null in case of economics-science and science-history since the p value is
less than alpha.
2. Accept null in case of economics-history since p value is greater than alpha.
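
The three pairwise tests can be sketched in R as well. This is a hedged sketch, not the worksheet procedure: it assumes the two unpaired values in the last two rows of the data belong to Economics and History (as the Group1 and Group3 vectors in the R part of this file suggest), and the names subject_pairs, p_values and decision are illustrative.

```r
economics <- c(42, 53, 49, 53, 43, 44, 45, 52, 54)
science   <- c(69, 54, 58, 64, 64, 55, 56)
history   <- c(35, 40, 53, 42, 50, 39, 55, 39, 40)

alpha <- 0.05
subject_pairs <- list(
  "economics-science" = list(economics, science),
  "science-history"   = list(science, history),
  "economics-history" = list(economics, history)
)

# var.equal = TRUE requests the pooled (equal variances) t test, two tailed by default.
p_values <- sapply(subject_pairs,
                   function(p) t.test(p[[1]], p[[2]], var.equal = TRUE)$p.value)
decision <- ifelse(p_values < alpha, "reject null", "do not reject null")

print(data.frame(p_value = round(p_values, 4), decision = decision))
```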

TWO SAMPLE : PAIRED SAMPLE T – TEST

RESEARCH PROBLEM : Determine whether the weight loss diet was effective or not, given the
weights before and after the diet.

BEFORE AFTER
162 168
170 136
184 147
164 159
172 143
176 161
159 143
170 145

Hypothesis :
H0 : µbefore - µafter ≤ 0
H1 : µbefore - µafter > 0 (weight before the diet is higher than after, i.e. the diet was effective)
Alpha : 0.05
Hypothesized mean difference : 0

t – Test : Paired Two sample for Means

Therefore, as per the rule, we here reject H0, i.e. the diet was effective.
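
The paired test can be reproduced in R from the weights listed above. The sketch assumes the alternative that the mean weight before the diet exceeds the mean weight after it; the name res is illustrative.

```r
before <- c(162, 170, 184, 164, 172, 176, 159, 170)
after  <- c(168, 136, 147, 159, 143, 161, 143, 145)

# Paired test on the differences (before - after); "greater" means
# the alternative is that the mean difference is above 0.
res <- t.test(before, after, paired = TRUE, alternative = "greater")

print(res$estimate)   # mean of the differences (before - after)
print(res$statistic)  # t statistic
print(res$p.value)    # one tailed p value
```

A one tailed p value below 0.05 supports the conclusion that the diet was effective.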

Z-TEST
A Z-test is a hypothesis test based on the Z-statistic, which follows the standard normal distribution
under the null hypothesis.

z-Test: one Sample for Means

PROBLEM STATEMENT :- The following is the age in years of 35 employees. Determine whether
or not the population mean age differs significantly from 23. Assume population standard deviation to be 5
and alpha 10 %.

Null: mean age is 23

Alternate: mean age is not 23

INFERENCE:- Since P value is more than alpha, accept null.
There is no significant evidence that the population mean age differs from 23.
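
Base R has no built-in z test, so the calculation can be sketched by hand. The function name z_test_one_sample is hypothetical, and the ages below are made-up values for illustration only, not the 35 ages from the worksheet.

```r
# One sample z test with a known population standard deviation.
z_test_one_sample <- function(x, mu0, sigma) {
  z <- (mean(x) - mu0) / (sigma / sqrt(length(x)))
  p_two_tail <- 2 * pnorm(-abs(z))   # two tailed p value
  list(z = z, p_two_tail = p_two_tail)
}

# Hypothetical ages, for illustration only.
ages <- c(21, 25, 23, 27, 24)
alpha <- 0.10

res <- z_test_one_sample(ages, mu0 = 23, sigma = 5)
print(res)
print(if (res$p_two_tail < alpha) "reject null" else "do not reject null")
```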

z-Test: Two Sample for Means

Alpha = 0.05
Sol:
The parameter to be tested is the difference between two means µ1 - µ2.
The hypothesis to be tested is that the mean annual net return from directly purchased mutual funds
(µ1) is larger than the mean of broker purchased funds (µ2). Hence the hypotheses are
H0 : µ1 - µ2 ≤ 0
H1 : µ1 - µ2 > 0
If z stat is greater than z critical (one tail), reject null.
If z stat is less than z critical (one tail), accept null.

Null: directly purchased mutual funds do not outperform broker purchased mutual funds.
Alternate: directly purchased mutual funds outperform broker purchased mutual funds.

Inference: The value of the test statistic is 2.29 and the one tail p value is 0.0110.
Since the p value is less than alpha, the test statistic falls into the rejection region and we reject null.
As a result we conclude that there is sufficient evidence to infer that, on average, directly purchased
mutual funds outperform broker purchased mutual funds.
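
The one tail p value quoted above follows directly from the standard normal distribution. A minimal check in R, using only the reported test statistic (the variable names are illustrative):

```r
z_stat <- 2.29   # test statistic reported above

# Upper tail area P(Z > 2.29) of the standard normal distribution.
p_one_tail <- pnorm(z_stat, lower.tail = FALSE)
print(round(p_one_tail, 4))

# One tail critical values for two common significance levels.
z_crit_5pct <- qnorm(0.95)   # about 1.645
z_crit_1pct <- qnorm(0.99)   # about 2.326
print(c(z_crit_5pct = z_crit_5pct, z_crit_1pct = z_crit_1pct))
```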

ANOVA TEST
Analysis of Variance (ANOVA) is a statistical method used to test differences between two or more
means.
ANOVA is used to test general rather than specific differences among means.

ANOVA SINGLE FACTOR

PROBLEM STATEMENT:
Here you can find the marks of students in economics, science and history. Determine whether
the mean marks of the three subjects are equal or not.

Consider the following data

STUDENT ECONOMICS SCIENCE HISTORY
A 42 69 35
B 53 54 40
C 49 58 53
D 53 64 42
E 43 64 50

Hypothesis Testing:
H1:At least one of the means is different.
H0 :μ1 = μ2 = μ3

STEPS:-
1. Go to data analysis > anova single factor
2. Put table as Input.
3. Keep alpha as 0.05. Click ok.

DECISION RULE:
1) If f stat is greater than f critical, reject Null.
2) If P value < alpha, reject Null.

INFERENCE:
F > F crit , So we will reject Null. This implies that mean marks of all subjects are not equal.
However, this does not tell us the subjects in which the mean marks are different, so for this we will
conduct 3 pairs of t-test assuming equal variances between each pair of subject so as to know the
subjects in which mean marks are different.
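
The same single factor ANOVA can be sketched in R from the five-student table above. The names marks, long, fit and tab are illustrative; stack() reshapes the three columns into one column of values and one column naming the subject.

```r
marks <- data.frame(
  economics = c(42, 53, 49, 53, 43),
  science   = c(69, 54, 58, 64, 64),
  history   = c(35, 40, 53, 42, 50)
)

# Long format: column `values` holds the marks, column `ind` the subject.
long <- stack(marks)

fit <- aov(values ~ ind, data = long)
tab <- summary(fit)[[1]]

f_stat <- tab[["F value"]][1]
p_val  <- tab[["Pr(>F)"]][1]
f_crit <- qf(0.95, df1 = 2, df2 = 12)   # F critical at alpha = 0.05

print(tab)
print(c(F = f_stat, F_crit = f_crit, p = p_val))
```

Here the F statistic exceeds the critical value, which agrees with the decision to reject null stated above.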

ANOVA: two factor without replication

Problem statement:
To test whether or not marks of students differ with respect to student and subject both.

Students Economics Science History
A 42 69 35
B 53 54 40
C 49 58 53
D 53 64 42
E 43 64 50

Hypothesis Testing:
H0 - Row wise: There is no significant difference in marks of students.
H0 - Column wise: There is no significant difference in marks for three subjects- Economics, Science and
History.
H1 - Row wise: There is significant difference in marks of students.
H1 - Column wise: There is significant difference in marks for three subjects- Economics, Science and
History.

STEPS:-
1. Go to data analysis> anova: Two factor without replication
2. Put table as Input.
3. Keep alpha as 0.05. click ok.

Result:

Anova: Two-Factor Without Replication

SUMMARY Count Sum Average Variance
A 3 146 48.66667 322.3333
B 3 147 49 61
C 3 160 53.33333 20.33333
D 3 159 53 121
E 3 157 52.33333 114.3333
Economics 5 240 48 28
Science 5 309 61.8 34.2
History 5 220 44 54.5

ANOVA
Source of Variation SS df MS F P-value F crit
Rows 60.93333 4 15.23333 0.300263 0.869889 3.837853
Columns 872.1333 2 436.0667 8.595269 0.010172 4.45897
Error 405.8667 8 50.73333
Total 1338.933 14

DECISION RULE:

1) If f stat is greater than f critical, reject Null.

2) If P value < alpha, reject Null.

Inference:
Row wise:
Here, F Stat is 0.30 and F critical is 3.84, so Null hypothesis is accepted.
Here, P value is 0.87 which is greater than alpha (5%). Therefore, Null hypothesis is accepted.
Column wise:
Here, F Stat is 8.595 and F critical is 4.459, so Null hypothesis is rejected.
Here, P value is 0.010 which is less than alpha (5%). Therefore, Null hypothesis is rejected.

Conclusion:
Row wise: There is not enough evidence that marks differ significantly between students.
Column wise: There is enough evidence that marks for the three subjects (Economics, Science
and History) differ.
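
The two factor table can be reproduced in R with an additive model, which is a sketch of the same "without replication" analysis. The data frame name scores and its column names are illustrative.

```r
scores <- data.frame(
  student = factor(rep(c("A", "B", "C", "D", "E"), times = 3)),
  subject = factor(rep(c("Economics", "Science", "History"), each = 5)),
  marks   = c(42, 53, 49, 53, 43,   # Economics
              69, 54, 58, 64, 64,   # Science
              35, 40, 53, 42, 50)   # History
)

# Additive model: student corresponds to Rows, subject to Columns.
fit <- aov(marks ~ student + subject, data = scores)
tab <- summary(fit)[[1]]
print(tab)

f_rows    <- tab[["F value"]][1]
f_columns <- tab[["F value"]][2]
p_rows    <- tab[["Pr(>F)"]][1]
p_columns <- tab[["Pr(>F)"]][2]
```

The F and p values in tab should agree with the Rows and Columns lines of the Excel output above.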

ANOVA: two factor with replication
Problem statement:-
To check if there is a significant relation between area and tests by using two factor anova.

STEPS:-
1. Go to data analysis > anova Two factor with replication
2. Put table as Input.
3. In rows, write total rows per sample.
4. Keep alpha as 0.05. click ok.

HYPOTHESIS:-
H0 : there is no significant relation between area and tests.

H1 : there is a significant relation between area and tests.

Rules:-
1. If p value is less than alpha, reject null.
2. If F is greater than F critical, reject null.

Inference:-
Since p value is more than alpha, so accept null. That is, there is no significant relation
between area and tests.

F TEST
The objective of the test is to determine how likely the observed ratio of sample variances would
be if the null hypothesis were true. An F test is a statistical test that compares the variances of
two samples so as to test the hypothesis that the samples have been taken from populations with
different variances. Its basic purpose is to check for differences between variances.

Any statistical test in which the test statistic has an F distribution under the
null hypothesis is called an F test. The F distribution is named after
R.A. Fisher, the famous statistician.

Data

Direct Broker
9.33 3.24
6.94 -6.76
16.17 12.8
16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.28 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.36
-2.97 -11.07
10.37 9.24
-0.63 -2.67
-0.15 8.97
0.27 1.87
4.59 -1.53
6.38 5.23
-0.24 6.87
10.32 -1.69
10.29 9.43
4.39 8.31
-2.06 -3.99
7.66 -4.44
10.83 8.63
14.48 7.06
4.8 1.57
13.12 -8.44
-6.54 -5.72
-1.06 6.95

Two sample - testing of Variance - F test - One tailed
PROBLEM STATEMENT : Can we conclude at 5% level that variance of returns of directly
purchased mutual funds is higher than mutual funds bought through brokers?

H0 : Variance of directly purchased mutual funds is less than or equal to that of mutual funds bought through brokers.
H1 : Variance of directly purchased mutual funds is higher than that of mutual funds bought through brokers.

Rules:-
1. If p value is less than alpha, reject null.
2. If F is greater than F critical, reject null.

P value is higher than Alpha, so we will accept Null, ie :
Variance of directly purchased mutual funds is (less than) equal to mutual funds bought through brokers.

Two sample - testing of Variance - F test - Two tailed

PROBLEM STATEMENT : Can we conclude at 5% level that variance of returns of
directly purchased mutual funds and mutual funds bought through brokers is same?

H0 : There is No difference in variances.

H1 : There is Difference in variances.

STEPS:-
1. Go to Data > Data analysis > F test; Two sample for variances
2. Select variable 1 range as direct and variable 2 range as broker
3. select alpha as 0.05
4. Click OK

The value of test statistic is F = 0.86499. Excel outputs one-tail p value.
Because we are conducting a two tail test, we will double the p value one tail.
So : 2*0.30684438 = 0.61368876
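
The doubling step can be checked with the F distribution function in R. The sketch uses only the reported F value and assumes 49 degrees of freedom for each sample (50 observations per column); the variable names are illustrative.

```r
f_stat <- 0.86499   # ratio of sample variances reported above
df1 <- 49           # Direct: 50 observations - 1
df2 <- 49           # Broker: 50 observations - 1

# Because F is below 1, the one tail p value is the lower tail area.
p_one_tail <- pf(f_stat, df1, df2)

# Two tail p value: double the smaller of the two tail areas.
p_two_tail <- 2 * min(p_one_tail, 1 - p_one_tail)

print(round(c(one_tail = p_one_tail, two_tail = p_two_tail), 4))
```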

INFERENCE:- P value is higher than Alpha, so we will accept Null, ie : There is no difference in
variances.
FINAL ANSWER-

CHI SQUARE TEST

The Chi-square test is intended to test how likely it is that an observed distribution
is due to chance. It is also called a "goodness of fit" statistic, because it measures
how well the observed distribution of data fits the distribution that is expected
if the variables are independent.

Objective :- To Determine whether brand preference is independent of age group.

Consider the following given data showing brand preferences with respect to age.
Problem statement : To find whether or not there is association between age groups and brand
preference
Data:-

Age/Brand Brand 1 Brand 2 Brand 3 Row Total
15-25 65 76 72 213
26-35 60 40 64 164
36-45 45 52 50 147
46-55 55 65 60 180
Column total 225 233 246 704

STEPS : 1) Calculate the row total and column totals. Also find the grand total of all the totals.
2) Calculated expected for each observation through the following formula :
expected =(row total*column total)/table total
3) Calculate Observed – Expected

4) Calculate (O-E)^2/E.
5) Calculate the sum of (O-E)^2/E.

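6) Calculate p value using the formula : “=chitest(observed range, expected range)” and then press ENTER.
Select each range in its original 4 x 3 table layout, so that Excel uses (4 - 1) x (3 - 1) = 6 degrees of
freedom. If both ranges are stacked into single columns, as in “=chitest(A15:A26, B15:B26)”, Excel
uses 12 - 1 = 11 degrees of freedom and the p value comes out too large for this test.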
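
The six steps can be sketched in R to check the arithmetic. The names observed, expected, chi_sq and dof are illustrative. The last line is an assumption about the worksheet: it shows the p value that results when the 12 cells are treated as a single column (11 degrees of freedom) instead of a 4 x 3 table.

```r
observed <- rbind(c(65, 76, 72),
                  c(60, 40, 64),
                  c(45, 52, 50),
                  c(55, 65, 60))
dimnames(observed) <- list(age = c("15-25", "26-35", "36-45", "46-55"),
                           brand = c("Brand 1", "Brand 2", "Brand 3"))

# Steps 1-2: expected = (row total * column total) / table total.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Steps 3-5: sum of (O - E)^2 / E.
chi_sq <- sum((observed - expected)^2 / expected)

# Step 6: p value with (rows - 1) * (columns - 1) degrees of freedom.
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, dof, lower.tail = FALSE)

# For comparison: p value if the 12 cells are treated as one column (df = 11).
p_single_column <- pchisq(chi_sq, length(observed) - 1, lower.tail = FALSE)

print(round(c(chi_sq = chi_sq, p_value = p_value,
              p_single_column = p_single_column), 4))
```

Either p value is greater than 0.05, so the decision does not change, but only the 6 degrees of freedom version matches a test of independence on a 4 x 3 table.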

DECISION RULE :

- If chi square statistic is greater than the tabulated value, reject null.
- If p value is less than alpha, reject Null.

HYPOTHESIS

Null : There is no association between brand preference and age.
Alternate : There is association between brand preference and age.

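p value:- 0.768154

INFERENCE: p value is greater than alpha, so accept null. That is,
there is no association between brand preference and age.
(The value 0.768154 appears to result from passing single-column ranges to chitest, which uses 11
degrees of freedom. With 6 degrees of freedom the p value is 0.2878, as reported by chisq.test for the
same table in the R part of this file; it is still greater than alpha, so the inference is unchanged.)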

INTRODUCTION TO R

WHAT IS R ?
R is one of the most popular data analytics tools: it is open-source, flexible, offers a large number of
packages and has a huge community. Apart from being used for analytics, R is also a programming language.

DOWNLOADING R
To download R, go to-
https://cran.r-project.org/bin/windows/base/

To Download R studio desktop for windows, go to-
www.rstudio.com

OR:-

In order to install R Studio, execute the following steps:

1.Go to link https://www.rstudio.com/products/rstudio/download/
2.Click on R Studio Desktop (Open Source License).

3. Click on the Windows 10 option in Installers.

4. Open the downloaded .exe file and install RStudio.
FOUR PANES IN R

When we open RStudio, we see the four panes. We can change the order of the windows
under RStudio preferences. We can also change their shape by either clicking the minimize
or maximize buttons on the top right of each panel, or by clicking and dragging the middle
of the borders of the windows.

The RStudio interface consists of four main panes, or windows.

Source :-

Top left: text editor or script window. This is where you can save and edit collections of
commands.

Console :-

Bottom left: console or command window. Here you can type any valid R command after
the > prompt followed by Enter and R will execute that command.

Environment / History :-

Top right: environment & history window. The environment window contains the objects (data, values,
functions) R currently has stored in its memory. The history window shows all commands that were
executed in the console.

Files / Plots / Packages :-

Bottom right: files, plots, packages, help, & viewer pane. Here you can open files, view
plots, install and load packages, read help pages, and view markdown and other documents in
the viewer tab.

IMPORTING DATA INTO R:-

1.In R Studio, go to File tab on the left hand top corner of the displayed window.

2.In File tab, go to Import Dataset and select “From Excel” option.

3.A pop up window for Import Excel Data will appear.

4.Browse through various files and open the required excel file.

5.Click on Import after previewing the data.

6.The file will be imported to RStudio.
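
The import can also be scripted. RStudio's "From Excel" dialog relies on the readxl package, which would have to be installed; to stay within base R, the sketch below assumes the sheet has been saved as a CSV file instead. The file name rm_lab.csv and its three values are hypothetical and are written to a temporary folder only so that the example runs on its own.

```r
# Hypothetical CSV export of the worksheet, created here so the sketch is self-contained.
csv_path <- file.path(tempdir(), "rm_lab.csv")
write.csv(data.frame(`Group A` = c(76, 87, 98), check.names = FALSE),
          csv_path, row.names = FALSE)

# check.names = FALSE keeps the header "Group A" unchanged, which is why such a
# column has to be written with backticks: rm_lab$`Group A`.
rm_lab <- read.csv(csv_path, check.names = FALSE)

print(names(rm_lab))
print(mean(rm_lab$`Group A`))
```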

TO ADD VARIABLE IN R:-
a<-c(5,6,24,16,17,10,23,11,17,3,21,18,18,12,12,17,10,3,7,13,23,9,22,8)

Here, “a” is the name of the variable, and the values inside the brackets are the individual scores
that make up the data set.

DESCRIPTIVE STATISTICS
Data:
Group A
76
87
98
45
66
78
76
88
78

87
54
65
76
89
65
78
54
87
45

Using Excel Function:

Group A

Mean 73.26315789
Standard Error 3.530023363
Median 76
Mode 76
Standard Deviation 15.38701511
Sample Variance 236.7602339
Kurtosis -0.589566068
Skewness -0.521029258
Range 53
Minimum 45
Maximum 98
Sum 1392
Count 19

Using R Studio:
Mean:
> mean(rm_lab$`Group A`)
[1] 73.26316
Median:
> median(rm_lab$`Group A`)
[1] 76
Standard Deviation:
> sd(rm_lab$`Group A`)
[1] 15.38702

Minimum:
> min(rm_lab$`Group A`)
[1] 45
Maximum:
> max(rm_lab$`Group A`)
[1] 98
Sum:
> sum(rm_lab$`Group A`)
[1] 1392
Range:
> range(rm_lab$`Group A`)
[1] 45 98
Sample Variance:
> var(rm_lab$`Group A`)
[1] 236.7602
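
Some rows of the Excel summary have no single base R function: range() returns the minimum and maximum rather than their difference, and standard error, skewness and kurtosis need a formula. The sketch below assumes the sample skewness and excess kurtosis formulas documented for Excel's SKEW and KURT functions; the vector x repeats the Group A data and the other names are illustrative.

```r
x <- c(76, 87, 98, 45, 66, 78, 76, 88, 78, 87,
       54, 65, 76, 89, 65, 78, 54, 87, 45)
n <- length(x)
z <- (x - mean(x)) / sd(x)   # standardised values

range_width <- diff(range(x))    # Excel "Range" (max - min)
std_error   <- sd(x) / sqrt(n)   # Excel "Standard Error"

# Sample skewness and excess kurtosis with the small-sample corrections.
skewness <- n / ((n - 1) * (n - 2)) * sum(z^3)
kurtosis <- n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(z^4) -
  3 * (n - 1)^2 / ((n - 2) * (n - 3))

print(c(count = n, range = range_width, std_error = std_error,
        skewness = skewness, kurtosis = kurtosis))
```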

CORRELATION
To find the correlation between returns of mutual funds purchased directly and mutual funds purchased through brokers

SYNTAX:-
cor.test(direct_broker$Direct,direct_broker$Broker)

Type the command in the console

Hit enter.

Following is the output received:-

Pearson's product-moment correlation

data: direct_broker$Direct and direct_broker$Broker

t = 1.2335, df = 48, p-value = 0.2234
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.1083480 0.4325304
sample estimates:
cor
0.1752861

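Correlation comes out to be 0.1752861. The 95 percent confidence interval includes 0, so the
correlation is not significantly different from 0 at the 5 % level.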
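
The t value and p value in this output follow from the correlation itself. A short check, using only the numbers printed above (variable names are illustrative):

```r
r   <- 0.1752861   # sample correlation from the output above
dof <- 48          # degrees of freedom = number of pairs - 2

# t statistic for testing that the true correlation is 0.
t_stat <- r * sqrt(dof) / sqrt(1 - r^2)

# Two sided p value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), dof)

print(round(c(t = t_stat, p_value = p_value), 4))
```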

T TEST- TWO INDEPENDENT SAMPLE

Problem statement- To analyse whether the time spent by full time students in studying statistics
differs from the time spent by part time students, at 5 % significance level.
HYPOTHESIS
H0 : Mean time spent by full time students is equal to mean time spent by part time students.
H1 : Mean time spent by full time students is not equal to mean time spent by part time students.

SYNTAX
> t.test(rm_lab$`FULL TIME`,rm_lab$`PART TIME`)

Rules:-
1. If p value is less than alpha, reject null.
2. If absolute value of t stat is greater than t critical, reject null.

Following is the output received:-

Welch Two Sample t-test

data: rm_lab$`FULL TIME` and rm_lab$`PART TIME`

t = 0.42248, df = 31.769, p-value = 0.6755
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.107013 1.686179
sample estimates:
mean of x mean of y
3.583333 3.293750

INFERENCE
P value (0.6755) is greater than alpha so accept null, i.e. mean time spent by full time
students does not differ significantly from that of part time students.

T TEST- TWO INDEPENDENT SAMPLE-ONE TAILED

HYPOTHESIS
H0 : Mean time spent by full time students is less than or equal to that of part time students.
H1 : Mean time spent by full time students is greater than that of part time students.

SYNTAX
t.test(rm_lab$`FULL TIME`,rm_lab$`PART TIME`,alternative = "greater")

Rules:-
1. If p value is less than alpha, reject null.
2. If t stat is greater than t critical, reject null.

Welch Two Sample t-test

data: rm_lab$`FULL TIME` and rm_lab$`PART TIME`

t = 0.42248, df = 31.769, p-value = 0.3378
alternative hypothesis: true difference in means is greater than 0
95 percent confidence interval:
-0.8717293 Inf
sample estimates:
mean of x mean of y
3.583333 3.293750

INFERENCE:- Since p value (0.3378) is more than alpha so accept null, i.e. there is no evidence that
the mean time spent by full time students is greater than that of part time students.

Alternative:-

T TEST TWO INDEPENDENT SAMPLE- TWO TAILED
Problem statement- The net annual returns (the returns on investment after deducting all relevant
fees) in percentages are given.
Can investors do better by buying mutual funds directly from banks or other financial institution than
by purchasing mutual funds through brokers.
Can we conclude at 5 % significance level that directly purchased mutual funds outperform mutual
funds bought through brokers?
HYPOTHESIS
H0 : Mean net return of directly purchased mutual funds is equal to that of mutual funds bought through brokers.
H1 : Mean net return of directly purchased mutual funds is not equal to that of mutual funds bought through brokers.

Rules:-
1. If p value is less than alpha, reject null.
2. If absolute value of t stat is greater than t critical, reject null.

INFERENCE:-
Since p value is greater than alpha so accept null, i.e. mean net return of directly purchased mutual
funds does not differ significantly from that of mutual funds bought through brokers.

Independent Paired t-test
Problem: To analyse whether the time spent by full time students in studying statistics is different
from the time spent by part time students.
HYPOTHESIS.

Result:- p value is greater than alpha

INFERENCE:- since p value is more than alpha, so accept null. That is, time spent by full
time students in studying statistics is not significantly different from that of part time students.

One sample t test (two tailed)

PROBLEM STATEMENT:- To test whether the mean work experience differs from 20
HYPOTHESIS

H0 : µ is equal to 20
H1 : µ is not equal to 20

SYNTAX:

> t.test(a,mu=20)

DECISION RULE:-
if p value is less than alpha, reject null. If p value is greater than alpha, accept null.

The following output is obtained:

One Sample t-test

data: a
t = -4.8471, df = 23, p-value = 6.817e-05
alternative hypothesis: true mean is not equal to 20
95 percent confidence interval:
10.78539 16.29794
sample estimates:
mean of x
13.54167

INFERENCE: - p value (6.817e-05) is less than alpha so reject null. Therefore the mean is not equal to 20; the sample mean is 13.54.

One sample t test (one tailed)
PROBLEM STATEMENT:- To establish that the mean work experience is greater than 20
HYPOTHESIS

H0 : µ is less than or equal to 20
H1 : µ is greater than 20

SYNTAX:-

> t.test(a,mu=20,alternative = "greater")

The following data is obtained:-

One Sample t-test

data: a
t = -4.8471, df = 23, p-value = 1
alternative hypothesis: true mean is greater than 20
95 percent confidence interval:
11.25811 Inf
sample estimates:
mean of x
13.54167

DECISION RULE:-
if p value is less than alpha, reject null. If p value is greater than alpha, accept null.
INFERENCE:-
P value (1) is greater than alpha, so accept null: there is no evidence that the mean work experience is greater than 20.
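
The numbers in the two one sample outputs can be recomputed by hand, which makes the difference between the two sided and the "greater" alternative explicit. This is a sketch using the vector a defined earlier; the other names are illustrative.

```r
a <- c(5, 6, 24, 16, 17, 10, 23, 11, 17, 3, 21, 18,
       18, 12, 12, 17, 10, 3, 7, 13, 23, 9, 22, 8)
mu0 <- 20
n   <- length(a)
se  <- sd(a) / sqrt(n)            # standard error of the mean

t_stat <- (mean(a) - mu0) / se    # same t for both alternatives

# Two sided alternative (mean != 20): both tails count.
p_two_sided <- 2 * pt(-abs(t_stat), df = n - 1)

# One sided alternative (mean > 20): only the upper tail counts.
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# 95 percent confidence interval for the mean (two sided).
ci_95 <- mean(a) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(c(t = t_stat, p_two_sided = p_two_sided, p_greater = p_greater))
print(ci_95)
```

Because the t statistic is negative, the upper tail area is close to 1, which is why the "greater" output prints a p value of 1.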

By default conf.level = 0.95 (95%).
To change it, use the syntax:

> t.test(a,mu=20,conf.level=0.99)

ONE WAY ANOVA

Problem:
Determine whether the Means of Population are equal or not.
Data:
Economics Medicine History
42 69 35
53 54 40
49 58 53
53 64 42
43 64 50
44 55 39
45 56 55
52 - 39
54 - 40

Adding Data:-
SYNTAX:-
> Group1=c(42,53,49,53,43,44,45,52,54)
> Group2=c(69,54,58,64,64,55,56,0,0)
> Group3=c(35,40,53,42,50,39,55,39,40)
> combinedgroup=data.frame(cbind(Group1,Group2,Group3))
> summary(combinedgroup)

Then Stack group, using the syntax:-

>stack(combinedgroup)

To run one way anova test, type the following command
> stackedgroup=stack(combinedgroup)

> anovaresults=aov(values~ind,data=stackedgroup)

> summary(anovaresults)

Hypothesis Testing:
H1:At least one of the means is different.
H0:μ1 = μ2 = μ3
Following output is received:-

Df Sum Sq Mean Sq F value Pr(>F)

ind 2 101 50.33 0.189 0.829
Residuals 24 6386 266.08

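Inference:
Since F stat (0.189) is less than F-critical (3.402), accept Null hypothesis.
Since P (0.829) is greater than alpha (0.05), accept Null hypothesis.

Conclusion:
For the data as entered, the means of the populations are not significantly different. Note, however,
that Group2 (Medicine) has only seven marks and was padded with two zeros; R treats those zeros as
real marks, which lowers the Group2 mean from 60 to 46.67, so this result describes the padded data
rather than the observed marks.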
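
A sketch of the same analysis with the two missing Medicine marks coded as NA instead of 0 is given below. It is not part of the original run: the object names follow the commands above, and aov() drops rows containing NA under R's default na.action setting.

```r
Group1 <- c(42, 53, 49, 53, 43, 44, 45, 52, 54)
Group2 <- c(69, 54, 58, 64, 64, 55, 56, NA, NA)   # NA marks the two missing values
Group3 <- c(35, 40, 53, 42, 50, 39, 55, 39, 40)

stackedgroup <- stack(data.frame(Group1, Group2, Group3))

# Rows with NA are omitted, so Group2 contributes only its seven observed marks.
anovaresults <- aov(values ~ ind, data = stackedgroup)
tab <- summary(anovaresults)[[1]]
print(tab)

f_stat <- tab[["F value"]][1]
p_val  <- tab[["Pr(>F)"]][1]
```

With the padding removed, F is roughly 15.2 on 2 and 22 degrees of freedom and the p value falls well below 0.05, so the null hypothesis of equal means would be rejected.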

F TEST (Two tailed)
PROBLEM STATEMENT:- Determine whether or not there is a significant difference
between variances of two data sets.

Syntax
var.test(file_name$variable1,file_name$variable2)

Following is the syntax used here:-

var.test(direct_broker$Direct,direct_broker$Broker)

Following is the output received:-

F test to compare two variances

data: direct_broker$Direct and direct_broker$Broker

F = 0.86499, num df = 49, denom df = 49, p-value = 0.6137
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.490863 1.524281
sample estimates:
ratio of variances
0.8649931

HYPOTHESIS:-
H0- True ratio of variances is equal to 1
H1- True ratio of variances is not equal to 1
INFERENCE:-
P value (0.6137) is greater than alpha, so accept null, i.e.

Variance of directly purchased mutual funds does not differ significantly from that of mutual funds bought through brokers.

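F TEST (One tailed)

PROBLEM STATEMENT : Can we conclude at 5% level that variance of returns of directly
purchased mutual funds is higher than mutual funds bought through brokers?

HYPOTHESIS TESTING
The call below uses alternative = "less", so it tests the lower tail:
H0- True ratio of variances (Direct / Broker) is greater than or equal to 1
H1- True ratio of variances (Direct / Broker) is less than 1
To test the upper tail ("higher") directly, alternative = "greater" would be used instead.

DECISION RULE
If p value is less than alpha, reject null, and vice versa
SYNTAX
var.test(direct_broker$Direct,direct_broker$Broker,alternative = "less")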

Following is the OUTPUT received:-

F test to compare two variances

data: direct_broker$Direct and direct_broker$Broker

F = 0.86499, num df = 49, denom df = 49, p-value = 0.3068
alternative hypothesis: true ratio of variances is less than 1
95 percent confidence interval:
0.000000 1.390294
sample estimates:
ratio of variances
0.8649931

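INFERENCE:-
P value (0.3068) is greater than alpha, so accept null, i.e. there is no evidence that the variance of
directly purchased mutual funds is lower than that of mutual funds bought through brokers. Since the
sample ratio of variances (0.865) is below 1, there is also no evidence that it is higher.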

PEARSON’S CHI SQUARE TEST

The Chi-square test is intended to test how likely it is that an observed distribution
is due to chance. It is also called a "goodness of fit" statistic, because it measures
how well the observed distribution of data fits the distribution that is expected
if the variables are independent.
PROBLEM STATEMENT:- to find association between age and brand preferences
Take alpha as 0.05

SYNTAX
> age=rbind(c(65,76,72),c(60,40,64),c(45,52,50),c(55,65,60))
> dimnames(age)<-list(agegroup=c("a","b","c","d"),brand=c("1","2","3"))
> age

To run chi square test, enter the following command:

> chisq.test(age)

Hypothesis Testing:
H0: Not Associated
H1: Associated
Decision Rule:
If chi value is greater than table value reject null.
If p value is less than α then reject null.

Following output is received:

Pearson's Chi-squared test

data: age
X-squared = 7.3726, df = 6, p-value = 0.2878

Since p value is greater than alpha, so accept null. That is, there is no association between age group
and brand preference.

Excel
24 pages
Excel Guide
No ratings yet
Excel Guide
21 pages
Ch-15 Spreadsheet Analysis Using MS Excel-Final Version 2018
No ratings yet
Ch-15 Spreadsheet Analysis Using MS Excel-Final Version 2018
83 pages
LabList PartA DA
No ratings yet
LabList PartA DA
40 pages
Lab Record
No ratings yet
Lab Record
18 pages
Lab Manual for Spread Sheet for Engineers for First 6 Experiments
No ratings yet
Lab Manual for Spread Sheet for Engineers for First 6 Experiments
21 pages
Excel Formulas and Functions
No ratings yet
Excel Formulas and Functions
6 pages
Lesson 4 Formatting
No ratings yet
Lesson 4 Formatting
74 pages
Bcom 4 Data Analytics
No ratings yet
Bcom 4 Data Analytics
23 pages
C Functions
No ratings yet
C Functions
14 pages
Advanced Spreadsheets Skills
No ratings yet
Advanced Spreadsheets Skills
17 pages
Introduction to Coding in Hours With Python Level 1: A Guide to Programming for Students With No Prior Experience (Learn Coding Basics With Python)
From Everand
Introduction to Coding in Hours With Python Level 1: A Guide to Programming for Students With No Prior Experience (Learn Coding Basics With Python)
Jack C. Stanely
No ratings yet
Quant Developers' Tools and Techniques: Quant Books, #1
From Everand
Quant Developers' Tools and Techniques: Quant Books, #1
Manfred Hindering
No ratings yet
T Test - Practice
No ratings yet
T Test - Practice
3 pages
"Financial Modelling - Bcom 308": Project Report
No ratings yet
"Financial Modelling - Bcom 308": Project Report
7 pages
"Financial Modelling - Bcom 308": Project Report
No ratings yet
"Financial Modelling - Bcom 308": Project Report
47 pages
RM File
No ratings yet
RM File
28 pages
Powersarj Catalogue..
No ratings yet
Powersarj Catalogue..
18 pages
San Andreas Cheat
No ratings yet
San Andreas Cheat
27 pages
Polaris Indiaxculture
No ratings yet
Polaris Indiaxculture
21 pages
Newsletter(MGN) (1)
No ratings yet
Newsletter(MGN) (1)
32 pages
0 0 91120125712191FeasibilityReport
No ratings yet
0 0 91120125712191FeasibilityReport
304 pages
GEA HRT
No ratings yet
GEA HRT
25 pages
Scoreboarding or SVA?: in A UVM Class-Based Environment
No ratings yet
Scoreboarding or SVA?: in A UVM Class-Based Environment
3 pages
Which is More Historically Reliable, The Bible or the Quran (Debate)
No ratings yet
Which is More Historically Reliable, The Bible or the Quran (Debate)
140 pages
Andhra Pradesh Process Fees Rules
No ratings yet
Andhra Pradesh Process Fees Rules
5 pages
MIdway Report On Working of ATC
No ratings yet
MIdway Report On Working of ATC
21 pages
History Intro - 1
No ratings yet
History Intro - 1
35 pages
jQuery Succinctly 1st Edition by Cody Lindley - Download the entire ebook instantly and explore every detail
No ratings yet
jQuery Succinctly 1st Edition by Cody Lindley - Download the entire ebook instantly and explore every detail
54 pages
5th Semester Transcript2
No ratings yet
5th Semester Transcript2
1 page
Rates of Reaction - Demos
No ratings yet
Rates of Reaction - Demos
20 pages
Sample Schedule of Works
100% (1)
Sample Schedule of Works
23 pages
Chapter 1-Basic Economic Ideas and Resource Allocation
No ratings yet
Chapter 1-Basic Economic Ideas and Resource Allocation
13 pages
VBA - Week 2 Guide
No ratings yet
VBA - Week 2 Guide
5 pages
Frobenius Method and Bessel Function: ODE: Assignment-7
No ratings yet
Frobenius Method and Bessel Function: ODE: Assignment-7
13 pages
Reviewer Board
No ratings yet
Reviewer Board
17 pages
5.1 Sustainable Development
No ratings yet
5.1 Sustainable Development
8 pages
Vapour Liquid Equilibria & Dew Point Calculations: Thermodynamics Ii Project CHE - 03
No ratings yet
Vapour Liquid Equilibria & Dew Point Calculations: Thermodynamics Ii Project CHE - 03
12 pages
Patron Manual
No ratings yet
Patron Manual
39 pages
Investing In Dividends For Dummies, 2nd Edition Lawrence Carrel - The ebook with rich content is ready for you to download
100% (2)
Investing In Dividends For Dummies, 2nd Edition Lawrence Carrel - The ebook with rich content is ready for you to download
32 pages
Foundation and Flooring System
No ratings yet
Foundation and Flooring System
2 pages
Curriculum Vitae: Detya Indrawan
No ratings yet
Curriculum Vitae: Detya Indrawan
1 page
Khaman Dhokla Recipe
No ratings yet
Khaman Dhokla Recipe
5 pages
October 2018 RFBT New Topis MCQ
No ratings yet
October 2018 RFBT New Topis MCQ
43 pages
Accounting For Business Combination - Quiz 2
No ratings yet
Accounting For Business Combination - Quiz 2
1 page