An Introduction To SAS: SAS Environment and Concepts of Libraries

Uploaded by

shekharku

SAS Environment and Concepts of Libraries

An Introduction to SAS

by
Prof: Prashobhan Palakkeel
Statistical Analysis System (SAS)

Is a set of solutions for enterprise-wide business users for performing:

 Data Entry, Retrieval and Management

 Report writing and graphics

 Statistical and Mathematical Analysis

 Business planning, Forecasting and Decision support

 Operations research and Project management

 Quality improvement

 Applications development
The core of the SAS system is base SAS software, which consists of:

 SAS Language

 SAS Procedures

 SAS Macros

 Data Step debugger

 ODS

 Windowing Environment
The basic components of SAS language are:

 SAS Files

 Data Step

 Procedure Step

 SAS Informats

 SAS Formats

 Variables

 Functions

 Statements

 Miscellaneous(SAS Programs, Outputs, Log and Errors )

SAS

 SAS Programming Environment Contains 6 Main Windows:

1. Project Designer: Shows the Process Flow of a Project in Flow charts

2. Project Explorer: Shows the Process Flow of a Project as Drop Down

3. Code Editor: Used to write and Edit codes

4. Server List: Shows the physical storage locations of data

5. Log Window: Shows information about the execution of a program and lists any errors that occur during execution

6. Output Window: Displays the output of execution of a program

SAS Programs

 SAS programs can be used to access, manage, analyze, or present your data

Layout for SAS Programs

SAS statements are in free format. This means that:

 they can begin and end anywhere on a line

 one statement can continue over several lines

 several statements can be on a line.

 SAS statements can be specified in uppercase or lowercase

 In most situations, text that is enclosed in quotation marks is case sensitive
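
The free-format rules above can be illustrated with a minimal sketch. The data set name work.demo and the variables x and y are made up for illustration only.

```sas
/* Several statements on one line */
data work.demo; x = 1; y = 2; run;

/* The same step, one statement per line, keywords in uppercase */
DATA work.demo;
   x = 1;
   y = 2;
RUN;

/* Quoted text is case sensitive: these two literals are different values */
data _null_;
   if 'Jones' = 'JONES' then put 'equal';
   else put 'not equal';
run;
```

Both DATA steps create the same data set. The last step writes "not equal" to the log, because only the text inside the quotation marks is compared case-sensitively.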

SAS Libraries

 Every SAS file is stored in a SAS library

 SAS Library is a collection of SAS files

 A SAS data library is the highest level of

organization for information within SAS

 In the Windows and UNIX environments, a

library is typically a group of SAS files in the
same folder or directory.
Storing Files Temporarily or Permanently:

There are two types of libraries in SAS

 Temporary library

 Permanent library

Depending on the library name that is used when a file is created, SAS files can be stored
temporarily or permanently
Temporary Library:

 It is a temporary storage location for data files

 They last only for the current SAS session

 Work is the temporary library in SAS

 When the session ends, the data files in the

temporary library are deleted

 A file is stored in Work when:

 No library name is specified when the file is created

 The library name is specified as Work

Permanent Library:

 It is the permanent storage location of data files

 Permanent SAS libraries are available in subsequent SAS sessions

 Files in permanent SAS data libraries are stored until you delete them

 To store files permanently in a SAS data library:

 Specify a library name Other than the

default library name Work

 Three Permanent Libraries provided by SAS are:

 Local
 SASuser
 SAShelp
Creating a Permanent Library:

 To create a permanent library, use the LIBNAME statement

 It creates a reference (a libref) to the path where the SAS files are stored

 The LIBNAME statement is global, which means that librefs remain in effect until you modify
them, cancel them, or end your SAS session

 The LIBNAME statement assigns the libref for the current SAS session only

 Assign a libref to each permanent SAS data library each time a SAS session starts

 SAS no longer has access to the files in the library once the libref is cancelled or the SAS session
ends

 The contents of a permanent library remain in the specified path even after the libref is cancelled

Syntax :

libname <libref> '<path>' ;

where,

 libref is the name of the library to be created

 It can be 1 to 8 characters long

 Begins with a letter or underscore
 Contains only letters, numbers, or underscores

 path is the physical location (folder or directory) on disk where the SAS files are stored

Example:

libname Taxes 'C:\Documents and Settings\admin\Desktop\Training' ;

Here,

 Taxes - A library reference name (libref)

 libname - This keyword assigns the libref Taxes to the folder called
Training in the path:

'C:\Documents and Settings\admin\Desktop\Training'

 The path must be enclosed in quotation marks (single or double)
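
As a hedged sketch of the libref life cycle described above, the statements below assign the libref, list its attributes in the log, and then cancel it. The path is the one from the example and must exist on your system for the first statement to succeed.

```sas
/* Assign the libref for the current session */
libname taxes 'C:\Documents and Settings\admin\Desktop\Training';

/* Write the attributes of the library to the log */
libname taxes list;

/* Cancel the libref; the files in the folder are not deleted */
libname taxes clear;
```

After the CLEAR form runs, two-level names that start with taxes can no longer be resolved until the libref is assigned again.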

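/* Assumes the libref lib1 has already been assigned with a LIBNAME statement */
Data lib1.emp;
Length name $ 12;
/* The INFORMAT statement is placed before INPUT so that it applies when the values are read */
Informat doj mmddyy8. sal dollar7.;
Input id name $ doj sal;
/* The FORMAT statement controls how the values are displayed */
Format doj date9. sal dollar7.;
Label id = "Employee Id" name = "Employee Name" doj = "Date of Joining"
sal = "Salary";
Cards;
1076 abcasdayut 12/23/05 $10,000
1983 aaaertgr 07/12/98 $40,000
1723 xyzasdsf 04/15/98 $25,000
;
Run;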
SAS Data Sets

 SAS Data Set is a SAS file which holds Data

 Data must be in the form of a SAS data set to be processed

 Many of the data processing tasks access data in the form of a SAS data set and analyze,
manage, or present the data

 A SAS data set can also have one or more indexes, which enable SAS to locate records in the
data set more efficiently

Rules for SAS Data Set Names:

SAS data set names :

 can be 1 to 32 characters long

 must begin with a letter (A–Z, either uppercase or lowercase) or an underscore (_)
 can continue with any combination of numbers, letters, or underscores.

These are examples of valid data set names:

 Payroll
 LABDATA1995_1997
 _EstimatedTaxPayments3
SAS data set consists of two parts:

 Descriptor portion

 Data portion
Descriptor Portion:

The descriptor portion of a SAS data set contains information about the data set, including:

 The name of the data set

 The date and time that the data set was created
 The number of observations
 The number of variables.

Example: Descriptor portion of the data set Clinic.Insure

Data Set Name: CLINIC.INSURE

Member Type: DATA
Engine: V8
Created: 10:05 Tuesday, March 30, 1999
Observations: 21
Variables: 7
Indexes: 0
Observation Length: 64
Data Portion:

 Collection of data values that are arranged in a rectangular table

Example:

Here,

Jones is a data value, the weight 158.3 is a data value, and so on

Observations:

 Rows are called observations in SAS

 An observation is a collection of data values that usually relate to a single object

 The values Jones, M, 48, and 128.6 constitute a single observation in the data set shown
below
Variables:

 Columns are called variables in SAS

 It is a collection of values that describe a particular characteristic

 The values Jones, Laverne, Jaffe and Wilson constitute the variable Name in the data set
shown below
Missing Values:

 If a data value is unknown for a particular observation, a missing value is recorded

 “.” (called period) indicates missing value of a numeric variable

 “ “ (blank) indicates missing value of a character variable
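
A minimal sketch of how missing values appear in practice. The data set work.vitals and its values are made up for illustration; in list input, a single period stands for a missing value for both numeric and character variables.

```sas
data work.vitals;
   input name $ sex $ age weight;
   datalines;
Jones M 48 128.6
Laverne . 58 158.3
Jaffe F . 115.5
;
run;

/* Sex prints as blank for Laverne; Age prints as a period for Jaffe */
proc print data=work.vitals;
run;
```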

Variable Attributes:

 In addition to general information about the data set, the descriptor portion contains information
about the attributes of each variable in the data set

 The attribute information includes the variable's:

 Name
 Type
 Length
 Format
 Informat
 Label

Example: Listing of the attribute information in the descriptor portion of the SAS data set
Clinic.Insure

Variable   Type   Length   Format      Informat   Label

Policy     Num    8                               Policy Number
Total      Num    8        DOLLAR8.2   COMMA10.   Total Balance
Name       Char   20                              Patient Name

Name:

 Each variable has a name that conforms to SAS naming conventions

 Variable names follow exactly the same rules as SAS data set names

 Like data set names, variable names:

 Can be 1 to 32 characters long

 Must begin with a letter (A–Z, either uppercase or lowercase) or an underscore (_)
 Can continue with any combination of numbers, letters, or underscores.
Type:

 A variable's type is either character or numeric

 Character variables, such as Name (shown below), can contain any values

 Numeric variables, such as Policy and Total (shown below), can contain only numeric values
(the digits 0 through 9, +, -, ., and E for scientific notation)
Length:

 A variable's length (the number of bytes used to store it) is related to its type

 Character variables can be up to 32,767 bytes long

 In the example below, Name has a length of 20 characters and uses 20 bytes of storage.

 All numeric variables have a default length of 8

 Numeric values (no matter how many digits they contain) are stored as floating-point numbers
in 8 bytes of storage, unless you specify a different length.
Format:

 A format is an instruction that SAS uses to write data values

 Formats control the written appearance of data values or, in some cases, group data values
together for analysis

 SAS software offers a variety of character, numeric, and date and time formats

 User-defined formats can also be created and stored

 A format can be assigned permanently to a variable in a SAS data set, or specified temporarily
in a PROC step to determine the way the data values appear in the output
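
The difference between a permanent and a temporary format assignment can be sketched as follows. The data set work.sales and its values are hypothetical.

```sas
data work.sales;
   input item $ amount;
   /* Permanent: the format is stored in the descriptor portion */
   format amount dollar10.2;
   datalines;
pen 1200.5
ink 87
;
run;

/* Temporary: this format overrides DOLLAR10.2 for this PROC step only */
proc print data=work.sales;
   format amount comma10.2;
run;
```

The stored values of amount are the same in both cases; only their printed appearance changes.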
Informat:

 An informat is an instruction that SAS uses to read data values in certain forms into standard
SAS values

 It determines how data values are read into a SAS data set

 Informats are needed, for example, to read numeric values that contain letters or other special
characters
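
A short sketch of an informat at work, assuming made-up policy data. The COMMA10. informat strips dollar signs and commas so that the values can be stored as ordinary numbers; the colon modifier tells SAS to read up to the next blank.

```sas
data work.balances;
   input policy total : comma10.;
   datalines;
1001 1,250.75
1002 $3,400
;
run;

proc print data=work.balances;
run;
```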
Label:

 A variable can have a label consisting of descriptive text up to 256 characters long

 By default, many reports identify variables by their names

 To display more descriptive information about a variable, assign a label to that variable

Example:
Label Policy as Policy Number, Total as Total Balance, and Name as Patient Name to
display these labels in reports

Referencing Permanent SAS Files

Two-Level Names:

 A two-level name is used to reference a permanent SAS file in SAS programs

There are two parts in a Two-Level Name:

1. Libref name
2. Filename

Libref – Is the name of the SAS data library that contains the file

Filename – Is the name of the file itself

 A period separates the libref and filename

Example:

 Clinic.Admit is the two-level name for the SAS data set Admit

 The data set Admit is stored in the library named Clinic

Referencing Temporary SAS Files

 To reference temporary SAS files specify the default libref Work, a period, and the filename

Example:

Here,

The two-level name Work.Test references the SAS data set named Test that is stored in the temporary
SAS library Work
One-Level name

 One-level name (the filename only) can be used to reference a file in a temporary SAS library

 When a one-level name is used, the default libref Work is assumed

Example:

Here,

The one-level name Test also references the SAS data set named Test that is stored in the
temporary SAS library Work.
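
A minimal sketch of the two naming styles, assuming the default setup in which one-level names resolve to Work. The variable x is only a placeholder.

```sas
/* Two-level name: libref Work, filename Test */
data work.test;
   x = 1;
run;

/* One-level name: Work is assumed, so this prints the same data set */
proc print data=test;
run;
```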
Components of SAS Programs

 SAS programs consist of two kinds of steps:

 Data Step
 Proc Step

 A SAS Program may contain:

 A DATA step
 A PROC step
 Combination of DATA and PROC step
Data Step:

 DATA steps typically create or modify SAS data sets; they can also be used to produce
custom-designed reports

DATA steps are used to:

 Put data into a SAS data set

 Compute values

 Check for and correct errors in data

 Produce new SAS data sets by subsetting, merging, and updating existing data sets
Proc Step:

 PROC steps are pre-written routines that enable us to analyze and process the data in a SAS
data set and to present the data in the form of a report

 PROC steps sometimes create new SAS data sets that contain the results of the procedure

 PROC steps can list, sort, and summarize data

PROC steps are used to:

 Create a report that lists the data

 Produce descriptive statistics

 Create a summary report

 Produce plots and charts
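
A small program that combines a DATA step and a PROC step might look like the sketch below. It assumes the sample data set Sashelp.Class that ships with SAS (height in inches, weight in pounds); the output name work.class_bmi and the variable bmi are made up for illustration.

```sas
/* DATA step: read an existing data set and compute a new value */
data work.class_bmi;
   set sashelp.class;
   bmi = (weight / height**2) * 703;
run;

/* PROC step: produce descriptive statistics for the new variable */
proc means data=work.class_bmi n mean min max;
   var bmi;
run;
```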

Importing Data for creating SAS Datasets

SAS Data step concepts:

 DATA steps typically create or modify SAS data sets

 Can also be used to produce custom-designed reports.

 SAS DATA steps can be used to:

 put data into a SAS data set

 compute values

 check for and correct errors in your data

 produce new SAS data sets by subsetting, merging, and updating existing data sets.
 A SAS data set can be created by:

 Entering data as input

 Reading existing raw data
 Accessing external files (files that were created by other software)

 The fig below shows how to design and write a DATA step program to create a SAS data set
from raw data that is stored in an external file
Data step:

Data & Set Statements:

 DATA and SET statements are used together to create a data set from an existing data set

Syntax:

DATA <dataset1> ;
SET <dataset2> ;

Where,
dataset1 is the Destination Data Set

dataset2 is the Source Data Set

Reading Instream Data using Cards and Datalines

 Data can be entered into SAS data set directly through SAS program

 Reading instream data is useful when you want to create data and test programming
statements on a few observations

 To read instream data use:

 DATALINES statement as the last statement in the DATA step (except for the RUN
statement) and immediately preceding the data lines

 a null statement (a single semicolon) to indicate the end of the input data

 Only one DATALINES statement can be used in a DATA step

 Use separate DATA steps to enter multiple sets of data

 If the data contains semicolons, use the DATALINES4 statement plus a null statement that
consists of four semicolons (;;;;) to indicate the end of the input data
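
The DATALINES4 case can be sketched as follows. The data set work.notes and its values are hypothetical; the & modifier lets the character value contain single blanks.

```sas
data work.notes;
   input id note & $40.;
   datalines4;
1 start; stop
2 read; write; close
;;;;
run;

proc print data=work.notes;
run;
```

With an ordinary DATALINES statement, the first semicolon inside the data would end the input, so the four-semicolon terminator is what makes these lines readable.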
Syntax:

DATA <datasetname>;
INPUT <variablename1>[$] <variablename2>[$] … ;
DATALINES;
.
.
data lines go here
.
.
;
run ;

 After the DATALINES statement specify the data values

 After typing in the values give a semicolon to indicate the end of the data values.

 Can also use Cards instead of datalines

Example:

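Data emp_details ;
/* name & $20. allows single blanks inside the value (e.g. Murray, W); two blanks end the value */
Input id name & $20. age ;
Datalines ;
2458 Murray, W  42
2462 Almers, C  38
2501 Bonaventure, T  48
2523 Johnson, R  39
2539 LaMance, K  45
2544 Jones, M  49
;
run ;
Here,
 A dataset called emp_details is created with variables id, name & age, and having 6
observations

 Name is a character variable, which is indicated by the $ sign in its informat ($20.)

 Because each name contains a blank, the & modifier is used and each name is followed by two
blanks in the data lines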

Importing Different File types

 SAS GUI can be used to import different file types data such as:

 Excel File

 Comma separated Files (CSV)

Importing Files using PROC IMPORT

 Proc import procedure step can be used to import an external file of different file types

Syntax:

proc import datafile = "External file path" out = <dataset name> dbms = <file type> replace ;
delimiter = "special character" ; getnames = <yes/no> ; datarow = n ;
run ;

Where,
'External file path' is the path of the external file to import
'out=' specifies the dataset to be created from the imported file
'dbms=' specifies the type of file to be imported (for example csv or tab), or 'dlm' for files that use another delimiter
'replace' overwrites the output dataset if it already exists
'getnames=yes' tells SAS to read the variable names from the first line of the data file
'delimiter=' specifies the delimiter in the external file. It is specified only when 'dbms=dlm' is specified
'datarow=n' specifies the row of the external file from which SAS starts reading data, where n is a
number
Importing a comma separated file (.csv) :

Example 1:

 A comma separated file is an external file with the file extension .csv (comma-separated
values)

proc import datafile="comma.csv" out= mydata dbms=csv replace;

getnames=no;
run;
Here,

A comma separated file called ‘comma.csv’ is imported

A new dataset called ‘mydata’ is created

‘getnames=no’ indicates that the first row in the file does not contain variable names

‘replace’ tells SAS to replace the existing dataset mydata

Example 2:

 Another way of reading a comma delimited file is to consider a comma as an ordinary delimiter

 Here is a program that shows how to use the dbms=dlm and delimiter=","

proc import datafile="comma1.txt" out=mydata dbms=dlm replace;

Delimiter = "," ;
Getnames = yes ;
Datarow = 5 ;
Run ;

Here,
'comma1.txt' is a text file whose variable values are separated by
commas

'dbms=dlm' indicates that comma1.txt is a delimited file

'delimiter=","' specifies the comma as the delimiter

'Datarow=5' tells SAS to start reading data from the 5th row

Import from Tab-Delimited files (TXT File):

Example:

proc import datafile ="tab.txt" out=mydata dbms=tab replace ;

getnames=no ;
Run ;

Here,

‘tab.txt’ is a tab separated text file

‘dbms=tab’ indicates tab.txt as tab separated file

Data Understanding

Proc Contents Step:

 The CONTENTS procedure is used to create SAS output that describes either of the following:

 The contents of a library

 The descriptor information for an individual SAS data set

 Describes the structure of the data set rather than the data values

 Displays valuable information at the...

 Data set level

 Name
 Engine
 Creation date
 Number of observations
 Number of variables
 File size (bytes)
 Variable level
 Name
 Type
 Length
 Formats
 Position
 Label
Syntax:

Proc Contents Data = libref._ALL_ NODETAILS;

Run;

Where,

 libref is the libref that has been assigned to the SAS library.

 _ALL_ requests a listing of all files in the library

 A period (.) is used to append _ALL_ to the libref

 NODETAILS (NODS) suppresses the printing of detailed information about each file
when _ALL_ is specified.

 Specify NODS only when you specify _ALL_

Example:

 To view the contents of the Mylib library, submit the following PROC CONTENTS step:

proc contents data = mylib._all_ nods ;

run ;

 The output from this step lists only the names, types, sizes, and modification dates for the SAS
files in the Mylib library

 To view the descriptor information for the Mylib.Admit data set, submit the following PROC
CONTENTS step:

proc contents data = mylib.admit ;

run ;

 The output from this step lists information for Mylib.Admit data set, including an alphabetic list of
the variables in the data set
Proc Print:

 Prints a listing of the values of some or all of the variables in a SAS data set

Syntax:
proc print data = libref.Datasetname [ (firstobs = n obs = n drop = Variable list keep = Variable list) ]
[ split = 'Special Character' double label n noobs ] ;

[ Id Variable list ;
Var Variable list ;
By Variable list ;
Sum Variable list ; ]
Run ;

Where,

 '[ ]' indicates optional items

 Libref is the library that contains Datasetname, the dataset whose values are to be printed
 Firstobs= indicates the number of the first observation to be printed

 Obs= indicates the number of the last observation to be printed

 Drop= indicates the variables to be dropped

 Keep= indicates the variables to be kept

 Split ='split character' - splits labels as column headings across multiple lines where split
character appears

 Double - double spaces the printed output

 Label - uses variable labels as column headings (variable name is default heading)

 N - Lists no: of observations in the specified datasets

 Noobs - suppresses the observation number in the output.

 Id -Identify observations by the formatted values of the variables which can be listed instead of
observation numbers

 Var -Select variables that appear in the report and determine their order

 By - Produce a separate section of the report for each BY group

 Sum - Total values of numeric variables

Example:

/* The BY statement requires candy_products to be sorted by Category */
proc print data = candy_products (firstobs=1 obs=16 ) n noobs double label ;

id Prodid ;
var Prodid Product Category Retail_price ;
by Category ;
Sum Retail_price ;
Run ;

Here,

 Candy_products is the dataset, which is present in the work library

 The first observation to the 16th observation are printed (firstobs=1 and obs=16)
 N gives the number of observations
 Double - Double spacing between the observations printed (affects listing output only)
 Label - Prints the label of each variable instead of variable names
 Id - Prodid becomes the row identifier instead of observation no:
 Var - Only the variables indicated here are printed
 By - The outputs are grouped by category
 Sum - Sum of the Retail_price
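
Because a BY statement expects its input to be sorted by the BY variable, a PROC SORT step usually comes first. The sketch below assumes the candy_products data set from the example above; the output name work.candy_sorted is made up.

```sas
/* Sort a copy of the data by the BY variable */
proc sort data=candy_products out=work.candy_sorted;
   by Category;
run;

/* Print one section per Category, with a subtotal for each group */
proc print data=work.candy_sorted noobs label;
   by Category;
   sum Retail_price;
run;
```

Writing the sorted rows to a new data set with OUT= leaves the original order of candy_products untouched.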
Data Transfer from one library to another

 We can create a new data set from an existing SAS data set

 To create the new data set, read a data set using the DATA step and use the programming features
of the DATA step to manipulate data

 Store the manipulated data to new data set or the same which will overwrite the existing data

Syntax 1:

Data SAS-data-set;
Set SAS-data-set;
Run;
where ,

 SAS-data-set in the DATA statement is the name (libref.filename) of the SAS data set to be
created (Destination Data Set)

 SAS-data-set in the SET statement is the name (libref.filename) of the SAS data set to be
read (Source Data Set)
Example:

libname lab23 'c:\drug\allergy\labtests' ;

libname research 'c:\drug\allergy' ;
data lab23.drug1h ;
set research.cltrials ;
Run ;

Where

 Lab23 and research are two librefs that are assigned to two different locations

 The DATA statement creates the permanent SAS data set Drug1H

 Drug1H will be stored in a SAS data library to which the libref Lab23 has been
assigned

 The SET statement reads the permanent SAS data set Research.CLTrials.
Syntax 2: Data Transfer from one library to another using Proc Copy

proc copy in = libref1 out = libref2 ;

[ select Ds1 Ds2 . . . ; ]
run ;

Where,

 Libref1 is the library from which the data sets are to be copied

 Libref2 is the library to which the data sets are to be copied

 Select is an optional statement that selects the data sets Ds1, Ds2, etc. to be copied from libref1 to libref2

 If Select is not used, all the data sets from libref1 are copied to libref2
Example:

proc copy in = clinic out = work ;

select admit ;
run ;

Here,

 Data Set admit is copied from clinic libref to temporary library work
Manipulating data during data transfers

 Some of the options for manipulating data are:

 Firstobs
 Obs
 Label
 Rename
 Delete
 Drop
 Keep
 by group
 point= option
 Output
 END= option
Firstobs & Obs Data Set Options:

 Firstobs and Obs options are used to select a range of observations from a data set

 They can be used in both a Data step and a proc step

 When used in a Data step, only the selected observations are read and written to the new data set

 When used in a proc print step, the output displays only the selected observations

 Firstobs specifies the starting no: of the observations to be selected

 Obs specifies the ending no: of the observations to be selected

 Firstobs and Obs can be used together to select a range of observations

 If only Firstobs is specified, observations from that position to the end of file are selected

 If only Obs is specified, observations from first to the specified no: are selected
Syntax:

data SAS-Data-Set;
Set SAS-Data-Set (firstobs = n obs = n);
run;

Where,

 SAS-Data-Set in the DATA statement is the destination data set

 SAS-Data-Set in the SET statement is the source data set

 n is a positive integer

 Firstobs specifies the observation to start with

 Obs specifies the last observation

 Firstobs and Obs apply only to data sets that are read (for example, in the SET statement or in
DATA= of a proc step); they cannot be applied to the data set named in the DATA statement
Example:

data candy_products;
set local.candy_products (firstobs=10 obs=100);
run;

Here,

 91 observations are copied from candy_products in local library to candy_products in

work library
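
The same options work on the DATA= data set of a PROC step. A minimal sketch, assuming the sample data set Sashelp.Class (19 observations) that ships with SAS:

```sas
/* Prints observations 5 through 10, that is, 10 - 5 + 1 = 6 observations */
proc print data=sashelp.class (firstobs=5 obs=10);
run;
```

Note that obs= names the last observation to process, not the number of observations, which is why the count is obs - firstobs + 1.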
Label & Rename Statements:

 Label is a descriptive text given to a variable

 It can be up to 256 characters long

 A label can be assigned temporarily in a proc step or permanently using a data step

 A label assigned in a data step is stored in the descriptor portion of the data set and is shown
when the data set is printed using a proc print step with the Label option

 The Rename statement is used to rename a variable in the data set

 A Rename statement in a data step permanently renames the variable in the output data set
Syntax:

Data libref.dataset-name ;

Set libref.dataset-name ;
Label Variable-Name = <'Variable Label'> ;
Rename Variable-Name = <New Variable Name> ;
Run;
or

proc print data = libref.dataset-name Label;

Label Variable-Name = <'Variable Label'> ;
Run;
Where,
 'Variable Label' is assigned to the variable specified by Variable-Name in the Label
statement
 'New Variable Name' is assigned to the variable specified by Variable-Name in the
Rename statement
 A Label statement in a Data step stores the new label with the variable, and the label is
displayed when the data set is printed
 A Label statement in a proc step applies only while that proc step is being executed
 The Label option should be specified in proc print when using a Label statement in the proc step
Example:

Data demo.class;
Set demo.class ;
Label sizehh = 'Size of household';
Rename sizehh = sizehouse;
Run;

proc print data = demo1.class1 Label;

label sizehh = 'Size of Household';
run;

Here,

 ‘Size of household’ is assigned as label for the variable Sizehh in Data step

 Sizehh variable is renamed as ‘Sizehouse’ in Data step

 ‘Size of household’ label is assigned for the variable Sizehh temporarily using proc step
which is effective only when that block of code is executed

 Rename Statement can be used only in Data step as it is data modification

DROP= and KEEP= Data Set Options:

 Drop= and Keep= options in data step can be used to drop and keep variables in that data set

 Drop=, omits all variables specified after it

 Keep=, keeps all variables specified after it

 Use the KEEP= option instead of the DROP= option if more variables are dropped than kept

 Specify drop and keep options in parentheses after a SAS data set name

Syntax:
(DROP = variable(s))
(KEEP = variable(s))

where ,
 the DROP= or KEEP= option, in parentheses, follows the name of the data set that contains
the variables to be dropped or kept

 variable(s) identifies the variables to drop or keep

Example:

1. Timemin and Timesec are dropped from the data set clinic.stress

data clinic.stress (drop= timemin timesec);

Set clinic.stress;
Run;

2. Timemin and Timesec are Kept in the data set clinic.stress

data clinic.stress (Keep= timemin timesec);

Set clinic.stress;
Run;
Drop and Keep Statements:

 Another way to exclude variables from data set is to use the DROP statement or the KEEP
statement

 Like the DROP= and KEEP= data set options, these statements drop or keep variables

 The DROP statement differs from the DROP= data set option in the following ways:

 You cannot use the DROP statement in SAS procedure steps

 The DROP statement applies to all output data sets that are named in the DATA
statement.
 To exclude variables from some data sets but not from others, place the appropriate
DROP= data set option next to each data set name that is specified in the DATA
statement.

 The KEEP statement is similar to the DROP statement, except that the KEEP statement specifies
a list of variables to write to output data sets

 Use the KEEP statement instead of the DROP statement if the number of variables to keep is
significantly smaller than the number to drop
Syntax:

DROP variable(s);
KEEP variable(s);

Where,
 variable(s) identifies the variables to drop or keep

Example:

data clinic.stress;
Set clinic.stress;
drop timemin timesec;
Run;

Here,
 Drop statement omits variables timemin and timesec
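
The difference between the DROP statement and the DROP= data set option shows up when a DATA step writes several data sets. The sketch below assumes the sample data set Sashelp.Class; the output names work.boys and work.girls are made up.

```sas
/* DROP= is applied per output data set, so each one can omit different variables */
data work.boys (drop=sex) work.girls (drop=sex height);
   set sashelp.class;
   if sex = 'M' then output work.boys;
   else output work.girls;
run;
```

A DROP statement inside this step would instead remove the listed variables from both work.boys and work.girls.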
Data Modifications using conditional statements

Conditional Statement:- Where:

 The WHERE statement can be used to select observations in both proc steps and data steps

 Only one WHERE statement is in effect in a step; a later WHERE statement replaces an earlier one

Syntax:

Where where-expression;

Where,
 where-expression specifies a condition for selecting observations

 The where-expression can be any valid SAS expression

 The WHERE statement works for both character and numeric variables

 WHERE statement is observation level

 To specify a condition based on the value of a character variable:
 enclose the value in quotation marks

 write the value with lowercase and uppercase letters exactly as it appears in the data set

 Following comparison operators can be used to express a condition in the WHERE statement:

Symbol        Meaning                     Example

= or eq       equal to                    where name='Jones, C.';
^= or ne      not equal to                where temp ne 212;
> or gt       greater than                where income>20000;
< or lt       less than                   where partno lt "BG05";
>= or ge      greater than or equal to    where id>='1543';
<= or le      less than or equal to       where pulse le 85;

Contains operator in Where:

 The CONTAINS operator selects observations that include the specified substring.

 The symbol equivalent of the CONTAINS operator is ‘?’

Example:

where firstname CONTAINS 'Jon';

where firstname ? 'Jon';

Here,

 ‘Firstname’ is the variable name and ‘Jon’ is the value

Compound WHERE Expressions:

 WHERE statements can be used to select observations that meet multiple conditions

 To link a sequence of expressions into compound expressions, use logical operators, including
the following:

Operator      Meaning

AND or &      and, both. If both expressions are true, then the compound expression is true.
OR or |       or, either. If either expression is true, then the compound expression is true.
Example:

1. Where with proc step

proc print data = clinic.admit;
var age height weight fee;
where age > 30;
run;

2. Where with data step

data clinic.admit;
set clinic.admit;
where age >30 and pulse >55;
run;

3. Some examples using logical operators:

where ID>1050 and state='NC';

where actlevel = 'LOW' or actlevel = 'MOD';
where actlevel in ('LOW','MOD');
where fee in (124.80,178.20);
where (age<=55 and pulse>75) or area='A';
Conditional Statement:- IF Then Else:

 The IF-THEN statement executes a SAS statement when the condition in the IF clause is true
 comparison and Logical operators can be used in IF conditional expression
 Any numeric value other than 0 or missing is true, and a value of 0 or missing is false

Syntax:

IF expression THEN statement;

[
else IF expression THEN statement;
.
.
else statement;
]

Where,

 expression is any valid SAS expression

 statement is any executable SAS statement
Example:

Data clinic.stress;
Set clinic.stress;
/* Without this, the first assignment ('Long') would set the length to 4 and truncate 'Normal' and 'Short' */
length TestLength $ 6;
if totaltime > 800 then TestLength = 'Long';
else if 750 <= totaltime <= 800 then TestLength = 'Normal';
else if totaltime < 750 then TestLength = 'Short';
Run;

Here,

 'Long' is assigned to the variable TestLength if totaltime is greater than 800

 If the first IF expression is not true, SAS checks the next expression. If it is true, 'Normal' is
assigned and the remaining ELSE statement is skipped

 If the first and second expressions are not true, SAS evaluates the third expression and
assigns 'Short' to TestLength if totaltime is less than 750
Deleting Unwanted Observations: Delete statement

 The IF-THEN statement together with the DELETE statement can be used to select observations
in a data set and delete them

Syntax:

IF expression THEN DELETE;

If the expression is:

 true, the DELETE statement executes, and control returns to the top of the DATA step
(the observation is deleted).

 false, the DELETE statement does not execute, and processing continues with the next
statement in the DATA step
Example:

Data clinic.stress;
Set clinic.stress;
if resthr < 70 then delete;
Run;

Here,

 The IF-THEN and DELETE statements above omit any observations whose values for
RestHR are lower than 70
Assigning Values Conditionally Using SELECT Groups:

Use IF-THEN/ELSE statements or SELECT groups based on the following criteria:

 When there is a long series of mutually exclusive conditions and the comparison is numeric,
using a SELECT group is more efficient than using a series of IF-THEN or IF-THEN/ELSE
statements because CPU time is reduced

 SELECT groups also make the program easier to read and debug.

 For programs with few conditions, use IF-THEN/ELSE statements

Syntax:

SELECT <(select-expression)>;
WHEN-1 (when-expression-1 <..., when-expression-n>) statement;
WHEN-n (when-expression-1 <..., when-expression-n>) statement;
<OTHERWISE statement;>
END;

Where,

 SELECT begins a SELECT group

 The optional select-expression specifies any SAS expression that evaluates to a single value.

 WHEN identifies SAS statements that are executed when a particular condition is true.

 When-expression specifies any SAS expression, including a compound expression

 Must specify at least one when-expression

 Statement is any executable SAS statement.

 The optional OTHERWISE statement specifies a statement to be executed if no WHEN condition is
met.

 END ends a SELECT group

Example:

data emps (keep=salary group);

set sasuser.payrollmaster;
length Group $ 20;
select (jobcode);
when ("FA1") group="Flight Attendant I";
when ("FA2") group="Flight Attendant II";
when ("FA3") group="Flight Attendant III";
when ("ME1") group="Mechanic I";
when ("ME2") group="Mechanic II";
when ("ME3") group="Mechanic III";
when ("NA1") group="Navigator I";
when ("NA2") group="Navigator II";
when ("NA3") group="Navigator III";
when ("TA1","TA2","TA3") group="Ticket Agents";
otherwise group="Other";
end;
run;

 The SELECT group assigns values to the variable Group based on values of the variable JobCode
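The select-expression is optional. When it is omitted, each when-expression is evaluated as a true/false condition and the first true one is executed. A minimal sketch of that form, reusing the totaltime ranges from the IF-THEN/ELSE example; the output data set name work.stress_sel is an assumption:

```sas
data work.stress_sel;                 /* hypothetical output data set */
   length TestLength $ 6;             /* avoids truncating 'Normal' */
   set clinic.stress;
   select;                            /* no select-expression */
      when (totaltime > 800)  TestLength = 'Long';
      when (totaltime >= 750) TestLength = 'Normal';
      otherwise               TestLength = 'Short';
   end;
run;
```

If no WHEN condition is met and OTHERWISE is omitted, SAS reports an error, so an OTHERWISE statement (even an empty `otherwise;`) is usually worth including.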
Appending Data Sets

 Appending (concatenating) combines two existing data sets into one.

 The observations from each data set are stacked in the order in which the data sets are listed,
forming a new data set

 All observations from the first data set are followed by all observations from the second data set
Syntax:

DATA output-SAS-data-set;
SET SAS-data-set-1 SAS-data-set-2;
RUN;

Where,

 output-SAS-data-set names the data set to be created

 SAS-data-set-1 and SAS-data-set-2 specify the data sets to be read

 The observations of SAS-data-set-1, followed by those of SAS-data-set-2, are written to
output-SAS-data-set

Example:

Data combined;
Set A C;
Run;
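A self-contained sketch of the same concatenation, with small made-up versions of A and C so the result can be inspected. The variables Num, VarA and VarC and their values are assumptions for illustration only:

```sas
data A;                               /* made-up sample data */
   input Num VarA $;
   datalines;
1 A1
2 A2
3 A3
;
run;

data C;                               /* made-up sample data */
   input Num VarC $;
   datalines;
1 C1
2 C2
;
run;

data combined;                        /* rows of A first, then rows of C */
   set A C;
run;

proc print data=combined;
run;
```

The combined data set holds the three observations of A followed by the two of C. A variable that exists in only one input (VarA or VarC here) is missing in the rows that come from the other input.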
Appending Data Sets Using Proc Step

 The APPEND procedure adds observations to an existing data set

 The observations from the DATA= file are added to the end of the BASE= file

 No new data set is created

 Works without options only if the base file contains all the variables in the data file; otherwise use
the FORCE option
Syntax:

Proc Append base = <SAS-data-set-1> data = <SAS-data-set-2> [force];

Run;

Where,

 SAS-data-set-1 (BASE=) names the data set to which observations are added, and
SAS-data-set-2 (DATA=) names the data set to be read

 The observations of SAS-data-set-2 are appended to SAS-data-set-1

 FORCE is an optional keyword that forces SAS to append when the data file contains
variables that are missing from the base file; those extra variables are dropped
Example:

Proc Append base = A data = C;

Run;
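A small sketch of the FORCE case, using made-up data sets whose names (work.base_ds, work.extra_ds) and variables are assumptions for illustration. The data file carries a variable VarB that the base file lacks:

```sas
data work.base_ds;                    /* made-up base file */
   input Num VarA $;
   datalines;
1 A1
2 A2
;
run;

data work.extra_ds;                   /* made-up data file with an extra variable */
   input Num VarA $ VarB $;
   datalines;
3 A3 B3
;
run;

/* Without FORCE this step fails because VarB is not in the base file. */
/* With FORCE the row is appended and VarB is dropped.                  */
proc append base=work.base_ds data=work.extra_ds force;
run;
```

The structure of the base file never changes, so work.base_ds still has only Num and VarA after the step.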
Merging

 A merge combines observations from two or more SAS data sets based on the values of
specified common variables (one or more)

 It creates a new data set (the merged data set)

 Merging is done in a data step with the statements

 MERGE : to name the input data sets

 BY : to name the common variable(s) to be used for matching

 Prerequisites for a match-merge

 input data sets must have a common variable

 input data sets must be sorted by the common variable(s)

 It is also called "match-merge."

Syntax:

DATA output-SAS-data-set;
MERGE SAS-data-set-1 SAS-data-set-2;
BY <DESCENDING> variable(s);
RUN;
Where,

 output-SAS-data-set names the data set to be created

 SAS-data-set-1 and SAS-data-set-2 specify the data sets to be read

 variable(s) in the BY statement specifies one or more variables whose values are
used to match observations

 DESCENDING indicates that the input data sets are sorted in descending order by the
variable that is specified

 If there is more than one variable in the BY statement, DESCENDING applies only to
the variable that immediately follows it

 Each input data set in the MERGE statement must be sorted in order of the values of
the BY variable(s)

 Each BY variable must have the same type in all data sets to be merged
Sorting of Data Set:

 The SORT procedure can be used to sort a data set in either ascending or descending order

Syntax:

Proc Sort Data = Data-Set-1 [out = Data-Set-2];
By [Descending] Variable1 [Variable2 …];
Run;

Here,

 Data-Set-1 will be sorted in ascending order by default, or in descending order where
Descending is specified

 If the ‘OUT=’ option is specified, the sorted observations are written to Data-Set-2 and
the original data set (Data-Set-1) remains unsorted. Without ‘OUT=’, Data-Set-1 itself is
replaced by the sorted version

 The ‘By’ statement names the variables by which the data set is sorted

 The ‘Descending’ option sorts the data set in descending order by the variable that immediately
follows it
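A short sketch combining OUT= and DESCENDING, using the Clinic.Visit sample data set from this section. The output name work.visit_sorted is an assumption:

```sas
/* Sort by ID ascending, and within each ID by Visit descending. */
/* Clinic.Visit itself is left unchanged because OUT= is given.  */
proc sort data=clinic.visit out=work.visit_sorted;   /* hypothetical output name */
   by id descending visit;
run;
```

DESCENDING appears before Visit only, so ID is still sorted in ascending order.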
Example:

 During match-merging SAS sequentially checks each observation of each data set to see
whether the BY values match, then writes the combined observation to the new data set

data merged;
merge a b;
by num;
run;
Example: Sample Data Sets:

1. Clinic.Demog

proc sort data=clinic.demog;
by id;
run;
proc print data=clinic.demog;
run;

Obs  ID    Age  Sex  Date

1    A001  21   m    05/22/75
2    A002  32   m    06/15/63
3    A003  24   f    08/17/72
4    A004  .         03/27/69
5    A005  44   f    02/24/52
6    A007  39   m    11/11/57
2. Clinic.Visit

proc sort data=clinic.visit;

by id;
run;
proc print data=clinic.visit;
run;

Obs ID Visit SysBP DiasBP Weight Date

1 A001 1 140 85 195 11/05/98
2 A001 2 138 90 198 10/13/98
3 A001 3 145 95 200 07/04/98
4 A002 1 121 75 168 04/14/98
5 A003 1 118 68 125 08/12/98
6 A003 2 112 65 123 08/21/98
7 A004 1 143 86 204 03/30/98
8 A005 1 132 76 174 02/27/98
9 A005 2 132 78 175 07/11/98
10 A005 3 134 78 176 04/16/98
11 A008 1 126 80 182 05/22/98
Example: Merging

data clinic.merged;
merge clinic.demog clinic.visit;
by id;
run;
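Output:

Obs  ID    Age  Sex  Date      Visit  SysBP  DiasBP  Weight

1    A001  21   m    11/05/98  1      140    85      195
2    A001  21   m    10/13/98  2      138    90      198
3    A001  21   m    07/04/98  3      145    95      200
4    A002  32   m    04/14/98  1      121    75      168
5    A003  24   f    08/12/98  1      118    68      125
6    A003  24   f    08/21/98  2      112    65      123
7    A004  .         03/30/98  1      143    86      204
8    A005  44   f    02/27/98  1      132    76      174
9    A005  44   f    07/11/98  2      132    78      175
10   A005  44   f    04/16/98  3      134    78      176
11   A007  39   m    11/11/57  .      .      .       .
12   A008  .         05/22/98  1      126    80      182

 Both input data sets contain a variable named Date. In each merged observation the value read from
Clinic.Visit (listed last in the MERGE statement) replaces the value from Clinic.Demog, except for
A007, which has no match in Clinic.Visit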
Excluding Unmatched Observations:

 By default, DATA step match-merging combines all observations in all input data sets

 To exclude unmatched observations from output data set, use the IN= data set option and the
subsetting IF statement in DATA step.

In this case, use

 the IN= data set option to create and name a variable that indicates whether the data set
contributed data to the current observation

 the subsetting IF statement to check the IN= values and to write to the merged data set only
those observations that appear in the data sets for which IN= is specified
Syntax:

(IN=variable)

Where,

 the IN= option, in parentheses, follows the data set name

 variable names the variable to be created

 Within the DATA step, the value of the variable is 1 if the data set contributed data to
the current observation. Otherwise, its value is 0.
Example:

To Match-merge the data sets Clinic.Demog and Clinic.Visit and select only observations that
appear in both data sets :

 Use IN= to create two temporary variables, indemog and invisit

 The first IN= creates the temporary variable indemog, which is set to 1 when an observation from
Clinic.Demog contributes to the current observation; otherwise, it is set to 0

 Likewise, the value of invisit depends on whether Clinic.Visit contributes to an observation or not

 IF statement is used to select only observations that appear in both Clinic.Demog and Clinic.Visit

 If the condition is met, the new observation is written to Clinic.Merged. Otherwise, the observation
is deleted

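data clinic.merged;
/* RENAME= keeps both Date variables, matching the BirthDate and VisitDate columns in the output */
merge clinic.demog (in=indemog rename=(date=BirthDate))
      clinic.visit (in=invisit rename=(date=VisitDate));
by id;
if indemog=1 and invisit=1;
run;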
proc print data=clinic.merged;
run;
Output:

Obs  ID    Age  Sex  BirthDate  Visit  SysBP  DiasBP  Weight  VisitDate

1    A001  21   m    05/22/75   1      140    85      195     11/05/98
2    A001  21   m    05/22/75   2      138    90      198     10/13/98
3    A001  21   m    05/22/75   3      145    95      200     07/04/98
4    A002  32   m    06/15/63   1      121    75      168     04/14/98
5    A003  24   f    08/17/72   1      118    68      125     08/12/98
6    A003  24   f    08/17/72   2      112    65      123     08/21/98
7    A004  .         03/27/69   1      143    86      204     03/30/98
8    A005  44   f    02/24/52   1      132    76      174     02/27/98
9    A005  44   f    02/24/52   2      132    78      175     07/11/98
10   A005  44   f    02/24/52   3      134    78      176     04/16/98
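Different Types Of Merge

Here X and Y are the IN= variables of the left (first) and right (second) data sets in the MERGE statement.

Join              Condition            Description

Inner Join        No condition         Includes all the observations from both datasets
Right Inner Join  If Y = 1             Includes all the observations from right dataset
Left Inner Join   If X = 1             Includes all the observations from left dataset
Exact Join        If X = 1 and Y = 1   Includes all the matching observations from both datasets
Outer Join        If X = 0 or Y = 0    Includes all the non matching observations from both datasets
Right Outer Join  If X = 0 and Y = 1   Includes all the non matching observations from right dataset
Left Outer Join   If X = 1 and Y = 0   Includes all the non matching observations from left dataset

 These names do not follow SQL usage: with no condition the merge roughly corresponds to a SQL full
outer join, ‘Exact Join’ to a SQL inner join, and ‘Right Inner Join’ and ‘Left Inner Join’ to SQL
right and left outer joins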
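The conditions in the table can be applied in a single DATA step by routing each observation with OUTPUT. A minimal sketch using the Clinic.Demog and Clinic.Visit sample data sets, both assumed to be already sorted by ID; the three output data set names are assumptions for illustration:

```sas
data work.both work.left_only work.right_only;    /* hypothetical output names */
   merge clinic.demog (in=x) clinic.visit (in=y);
   by id;
   if x and y then output work.both;              /* X = 1 and Y = 1 */
   else if x then output work.left_only;          /* X = 1 and Y = 0 */
   else output work.right_only;                   /* X = 0 and Y = 1 */
run;
```

With the sample data, the ten matched observations go to work.both, A007 goes to work.left_only, and A008 goes to work.right_only. The IN= variables are temporary and are not written to any output data set.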
