Multiple-Choice Questions

Uploaded by

ipmessage1

SAS Advanced Exam Preparation

63 Questions + New Questions

13Jan2018 edited by MZX


Contents
Question 1
Question 2
Question 3
QUESTION 4
QUESTION 5
QUESTION 6
QUESTION 7
QUESTION 8
QUESTION 9
QUESTION 10
QUESTION 11
QUESTION 12
QUESTION 13
QUESTION 14
QUESTION 15
QUESTION 16
QUESTION 17
QUESTION 18
QUESTION 19
QUESTION 20
QUESTION 21
QUESTION 22
QUESTION 23
QUESTION 24
QUESTION 25
QUESTION 26
QUESTION 27
QUESTION 28
QUESTION 29
QUESTION 30
QUESTION 31
QUESTION 32
QUESTION 33
QUESTION 34
QUESTION 35
QUESTION 36
QUESTION 37
QUESTION 38
QUESTION 39
QUESTION 40
QUESTION 41
QUESTION 42
QUESTION 43
QUESTION 44
QUESTION 45
QUESTION 46
QUESTION 47
QUESTION 48
QUESTION 49
QUESTION 50
QUESTION 51
QUESTION 52
QUESTION 53
QUESTION 54
QUESTION 55
QUESTION 56
QUESTION 57
QUESTION 58
QUESTION 59
QUESTION 60
QUESTION 61
QUESTION 62
QUESTION 63
New Question 1
New Question 2
New Question 3
New Question 4
New Question 5
New Question 6
New Question 7
New Question 8
New Question 9
New Question 10
New Question 11
New Question 12
New Question 13
New Question 14
New Question 15
New Question 16
New Question 17
New Question 17
New Question 18
New Question 19
New Question 20

Question 1
When attempting to minimize memory usage, the most efficient way to do group processing when using the MEANS procedure is to use:

A. The BY statement.
B. GROUPBY with the NOTSORTED specification.
C. The CLASS statement.
D. Multiple WHERE statements.

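Answer: A. [Explanation] BY subsets the data itself (SUBSET DATA), so PROC MEANS works on one BY group at a time. CLASS classifies observations by variable values (CLASSIFY VARIABLE) within a single pass over the whole data set.

Point of the question: using the BY statement reduces the memory that PROC MEANS needs. The trade-off is that BY requires the data to be sorted (or indexed) first, so the time spent sorting has to be taken into account.

General form:
BY <DESCENDING> VARLIST <NOTSORTED>;

Without a BY statement, PROC MEANS processes the whole data set at once.

With a BY statement, PROC MEANS processes each BY group separately, which keeps memory usage low.

The BY statement uses an index when:
- the data set is not already physically sorted by the BY variables;
- neither the DESCENDING nor the NOTSORTED option is specified;
- the index was created without the NOMISS option.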
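The contrast between BY and CLASS can be tried out with a minimal sketch like the one below. It assumes the SASHELP.CLASS sample table is available; WORK.CLASS_SORTED is a hypothetical name chosen only for this illustration.

```sas
/* Sketch only: SASHELP.CLASS is a sample table shipped with SAS;
   WORK.CLASS_SORTED is a hypothetical name used for this example. */

/* BY-group processing needs the data sorted (or indexed) by the BY variable. */
proc sort data=sashelp.class out=work.class_sorted;
  by Sex;
run;

/* BY: statistics are computed for one BY group at a time. */
proc means data=work.class_sorted mean;
  by Sex;
  var Height;
run;

/* CLASS: no sort is needed, but all class levels are handled in a single pass. */
proc means data=sashelp.class mean;
  class Sex;
  var Height;
run;
```

Both steps report the mean Height per Sex; the difference lies in the sort cost paid up front versus the memory held during the PROC MEANS step.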

Question 2
The SAS data set WORK.CHECK has a variable named Id_Code in it.
Which SQL statement would create an index on this variable?

A. create index Id_Code on WORK.CHECK;


B. create index(Id_Code) on WORK.CHECK;
C. make index=Id_Code from WORK.CHECK;
D. define index(Id_Code) in WORK.CHECK;

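Answer: A. [Explanation] See the Prep Guide, chapter 6.

Point of the question: eliminate the distractors and pick the valid PROC SQL statement. Because this is a simple index, the index name must match the column name, so the parenthesized column list can be omitted.

Syntax of the PROC SQL CREATE INDEX statement:

PROC SQL;
CREATE <UNIQUE> INDEX index-name
ON table-name (column-name-1<,…column-name-n>);
DROP INDEX index-name FROM table-name;
QUIT;

- Simple index (one column): the index name must be the same as the column name.
- Composite index (several columns): the index name must not match any existing column or index name.

For example: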
proc sql;
create unique index EmpID
on work.payrollmaster(empid);
quit;

proc sql;
create unique index daily
on work.marchflights(flightnumber, date);
quit;

proc sql;
create table indexlib.prodindx as
select * from indexlib.prodsale;
create unique index seqnum on indexlib.prodindx;
quit;
proc sql;
create table indexlib.prodcomp as
select * from indexlib.prodsale;
create index country_state on indexlib.prodcomp(country, state);
quit;

Creating indexes with PROC DATASETS:

PROC DATASETS LIBRARY=libref <NOLIST>;
MODIFY SAS-data-set-name;
INDEX DELETE index-name;
INDEX CREATE index-specification / UNIQUE NOMISS UPDATECENTILES=ALWAYS|0|NEVER|101|integer;
QUIT;

- NOLIST option: suppresses the listing of the SAS files in the library in the SAS log and in ODS output.
- index-specification: for a simple index, give the name of the key variable; for a composite index, use the form index-name=(variable-1 … variable-n).

For example:
proc datasets library=indexlib;
modify prodindx;
index create seqnum /nomiss unique updatecentiles=10;
index create state / nomiss;
index create county;
index create year /nomiss;
index create country_state=(country state) / nomiss;
index create year_and_quarter=(year quarter) / updatecentiles = 10;
index create state_product=(state product);
run;
quit;

Creating indexes with the INDEX= data set option:

Simple index

Basic form:
DATA data-set-name(INDEX=(variable-name </UNIQUE> </NOMISS>));

On the output data set of a PROC step:
… OUT=data-set-name(INDEX=(variable-name </UNIQUE> </NOMISS>));

Composite index

Basic form:
DATA data-set-name(INDEX=(index-name=(variable-name1 variable-name2 …) </UNIQUE> </NOMISS>));

On the output data set of a PROC step:
… OUT=data-set-name(INDEX=(index-name=(variable-name1 variable-name2 …) </UNIQUE> </NOMISS>));

Simple index examples:
data indexlib.prodindx(index=(state));
set indexlib.prodsale;
run;

data indexlib.prodindx(index=(seqnum /unique /nomiss
state / nomiss
county
year / nomiss));
set indexlib.prodsale;
run;

Composite index example:
data indexlib.prodcomp(index=(country_state=(country state)));
set indexlib.prodsale;
run;

Question 3
Given the SAS data sets:

WORK.EMPLOYEE          WORK.NEWEMPLOYEE

Name      Dept         Names     Salary
---------------        ------------------
Alan      Sales        Michelle  50000
Michelle  Sales        Paresh    60000

A SAS program is submitted and the following is written to the SAS log:

101 proc sql;


102 select dept, name
103 from WORK.EMPLOYEE
104 where name=(select names
from newemployee
where salary > 40000)
ERROR: Subquery evaluated to more than one row.
105 ;
106 quit;

What would allow the program to successfully execute without errors?

A.Replace the where clause with:


where EMPLOYEE.Name=(select Names delimited with ','
from WORK.NEWEMPLOYEE
where Salary > 40000);

B.Replace line 104 with:


where EMPLOYEE.Name =ANY(select Names separated with ','
from WORK.NEWEMPLOYEE
where Salary > 40000);

C.Replace the equal sign with the IN operator.


D.Qualify the column names with the table names.

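Answer: C. [Note] If "separated with ','" is deleted from option B, the query runs without error and returns a result.

Option A result: fails, because "select Names delimited with ','" is not valid syntax.
Option B result: a syntax error is reported at "separated with"; "separated by" is not accepted here either.

A single-value subquery returns only one row and usually needs an aggregate function such as AVG().
A multiple-value subquery is normally used with the IN, ANY, ALL, or EXISTS operators.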
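The fix named in option C can be checked with a short sketch like the following. It rebuilds the two tables from the question and replaces the equal sign with IN; the INPUT layout is an assumption made for this illustration.

```sas
/* Rebuild the two tables from the question (list input is an assumption). */
data work.employee;
  input Name $ Dept $;
  datalines;
Alan Sales
Michelle Sales
;
run;

data work.newemployee;
  input Names $ Salary;
  datalines;
Michelle 50000
Paresh 60000
;
run;

/* IN accepts a subquery that returns several rows, so no error is raised. */
proc sql;
  select Dept, Name
    from work.employee
    where Name in (select Names
                     from work.newemployee
                     where Salary > 40000);
quit;
```

With these rows the subquery returns Michelle and Paresh, and only the Michelle row of WORK.EMPLOYEE satisfies the IN condition.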

QUESTION 4
This question uses a data set similar to the one in QUESTION 16.
Given the SAS data set SASUSER.HIGHWAY:

Steering  Seatbelt  Speed  Status   Count
-----------------------------------------
absent    No        0-29   serious     31
absent    No        0-29   not       1419
absent    No        30-49  serious    191
absent    no        30-49  not       2004
absent    no        50+    serious    216

The following SAS program is submitted:

proc sql noprint;


select distinct
Speed [INSERT SQL CLAUSE]
from SASUSER.HIGHWAY;
quit;
title1 "Speed values represented are: &GROUPS";
proc print data=SASUSER.HIGHWAY;
run;

Which SQL clause stores the text 0-29,30-49,50+ in the macro variable GROUPS?

A. into &GROUPS
B. into :GROUPS
C. into :GROUPS separated by ','
D. into &GROUPS separated by ','
E. into GROUPS separated by ','
F. into Groups delimited by ','
G. into :Groups delimited by ','

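Answer: C

Syntax:
PROC SQL NOPRINT;
SELECT column1
INTO :macro-variable-1 SEPARATED BY 'delimiter1'
FROM table-1 | view-1 <WHERE expression> <other clauses>;

Pay particular attention to the colon (:) in front of the macro variable name and to the SEPARATED BY keyword. Many of the distractors play on these two points, so read them carefully.

Option results:
A: an error is reported for "&GROUPS".
B: runs and produces a result, but only the first value, "0-29", is assigned to the macro variable GROUPS; the other values are not stored.
C: "0-29,30-49,50+" is assigned to the macro variable GROUPS.
D: "Syntax error, statement will be ignored."

Several macro variables can be listed as:
:macro-variable-1, :macro-variable-2, …, :macro-variable-n

or as a numbered range:
:macro-variable-1 - :macro-variable-n

For example: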
proc sql noprint;
select distinct location into: sites separated by ' '
from sasuser.schedule;
quit;
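Option C can be reproduced end to end with a minimal sketch along these lines. WORK.HIGHWAY stands in for SASUSER.HIGHWAY and is an assumption of this example, as is the list-input layout.

```sas
/* WORK.HIGHWAY is a stand-in for SASUSER.HIGHWAY (assumption for this sketch). */
data work.highway;
  input Steering $ Seatbelt $ Speed $ Status $ Count;
  datalines;
absent No 0-29 serious 31
absent No 0-29 not 1419
absent No 30-49 serious 191
absent no 30-49 not 2004
absent no 50+ serious 216
;
run;

/* The colon marks GROUPS as a macro variable; SEPARATED BY concatenates
   all distinct values instead of keeping only the first one. */
proc sql noprint;
  select distinct Speed
    into :GROUPS separated by ','
    from work.highway;
quit;

/* Write the resulting macro variable to the log. */
%put NOTE: GROUPS=&GROUPS;
```

The log line should show the three distinct Speed values joined by commas, which is the text the question asks for.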

QUESTION 5

The SAS data set WORK.CHECK has an index on the variable Code and the following SAS program
is submitted.

proc sort data=WORK.CHECK;


by Code;
run;

Which describes the result of submitting the SAS program?

A. The index on Code is deleted.


B. The index on Code is updated.
C. The index on Code is unaffected.
D. The sort does not execute.

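Answer: D
If a data set is indexed, PROC SORT with default options cannot sort it in place and replace it (that is, when the OUT= option is not specified).

By default, PROC SORT also checks the data set's sort indicator so that it does not sort data that is already sorted.

Adding the FORCE option makes PROC SORT ignore both the index and the sort indicator and perform the sort anyway.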
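The three behaviours described above can be compared with a sketch like this one. The rows in WORK.CHECK and the name WORK.CHECK_SORTED are hypothetical and serve only to make the example self-contained.

```sas
/* WORK.CHECK is rebuilt here with a simple index on Code (hypothetical rows). */
data work.check(index=(Code));
  input Code $ Amount;
  datalines;
B 2
A 1
C 3
;
run;

/* Default behaviour: SAS refuses to sort the indexed data set in place,
   so this step is expected to end with an error in the log. */
proc sort data=work.check;
  by Code;
run;

/* OUT= writes the sorted copy elsewhere; WORK.CHECK and its index stay as they are. */
proc sort data=work.check out=work.check_sorted;
  by Code;
run;

/* FORCE sorts in place; the in-place sort is expected to drop the index. */
proc sort data=work.check force;
  by Code;
run;
```

Running PROC CONTENTS on WORK.CHECK before and after the last step is one way to confirm what happened to the index.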

QUESTION 6

The table WORK.PILOTS contains the following data:

WORK.PILOTS
Id Name Jobcode Salary
-----------------------------
001 Albert PT1 50000
002 Brenda PT1 70000
003 Carl PT1 60000
004 Donna PT2 80000
005 Edward PT2 90000
006 Flora PT3 100000

The data set was summarized to include average salary based on jobcode:

Jobcode  Salary  Avg
-----------------------
PT1       50000   60000
PT1       70000   60000
PT1       60000   60000
PT2       80000   85000
PT2       90000   85000
PT3      100000  100000

Which SQL statement could NOT generate this result?


A.
select Jobcode, Salary, avg(Salary) label='Avg'
from WORK.PILOTS
group by Jobcode
order by Id;

B.
select Jobcode, Salary,
(select avg(Salary)
from WORK.PILOTS as P1
where P1.Jobcode=P2.Jobcode) as Avg
from WORK.PILOTS as P2
order by Id;

C.
select Jobcode, Salary,
(select avg(Salary)
from WORK.PILOTS
group by Jobcode) as Avg
from WORK.PILOTS
order by Id;

D.
select Jobcode, Salary, Avg
from WORK.PILOTS,
(
select Jobcode as Jc, avg(Salary) as Avg
from WORK.PILOTS
group by 1
)
where Jobcode=Jc
order by Id
;

The answer is C.
Create the data set:
data pilots;
infile datalines;
input Id $ Name $ Jobcode $ Salary;
datalines;
001 Albert PT1 50000
002 Brenda PT1 70000
003 Carl PT1 60000
004 Donna PT2 80000
005 Edward PT2 90000
006 Flora PT3 100000
;
run;

Test program for option A:
options nolabel;
proc sql;
select Jobcode, Salary, avg(Salary) label='Avg'
from WORK.PILOTS
group by Jobcode
order by Id;
quit;
options label;
By default the output shows the column label rather than the variable name.

Test program for option B:
proc sql;
select Jobcode, Salary,
(select avg(Salary)
from WORK.PILOTS as P1
where P1.Jobcode=P2.Jobcode) as Avg
from WORK.PILOTS as P2
order by Id;
quit;
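Here,
(select avg(Salary) from WORK.PILOTS as P1 where P1.Jobcode=P2.Jobcode)
is a correlated subquery that returns a single value: its Jobcode value is supplied by the outer query (P2), and the AVG() aggregate function then returns one scalar. Conceptually, P1 and P2 form a Cartesian product, and each P2.Jobcode is paired with the average Salary of the P1 rows that have the same Jobcode.

Test program for option C: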
proc sql;
select Jobcode, Salary,
(select avg(Salary)
from WORK.PILOTS
group by Jobcode) as Avg
from WORK.PILOTS
order by Id;
quit;

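SAS reports an error in the log.

As can be seen,
(select avg(Salary) from WORK.PILOTS group by Jobcode) returns more than one value, so it cannot be used where a single value is required.

Test program for option D: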
proc sql;
select Jobcode, Salary, Avg
from WORK.PILOTS,
(
select Jobcode as Jc, avg(Salary) as Avg
from WORK.PILOTS
group by 1
)
where Jobcode=Jc
order by Id;
quit;
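The inner query returns the following table:

Jc   Avg
PT1   60000
PT2   85000
PT3  100000

The outer query then performs an equijoin and picks up the matching Avg value (here Avg is a column name, unlike in option A, where it is a label).

The WHERE clause is what makes this an equijoin: conceptually the Cartesian product is formed first, and then only the rows that satisfy the condition are kept.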

QUESTION 7

A quick rule of thumb for the space required to run PROC SORT is:

A. Two times the size of the SAS data set being sorted.
B. Three times the size of the SAS data set being sorted.
C. Four times the size of the SAS data set being sorted.
D. Five times the size of the SAS data set being sorted.

Answer: A
New: for SAS 9, the multiplier is 2 or less.
Basis for the older answer: http://support.sas.com/resources/papers/proceedings10/140-2010.pdf
"If you want the sort to complete entirely in memory, a simple rule of thumb is four times the size of the data set.
And I'm assuming the data set is not compressed or being subset with a DROP=/KEEP= data set option or a WHERE statement.
A better estimate would be to use this formula to predict the amount of memory:
((length of observation+sum of lengths of BY variables)*number of observations)* 1.10"

QUESTION 8

Multi-threaded processing for PROC SORT will affect which of these system resources?

A. CPU time will decrease, wall clock time will decrease.


B. CPU time will increase, wall clock time will decrease.
C. CPU time will decrease, wall clock time will increase.
D. CPU time will increase, wall clock time will increase.

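Answer: B. Multi-threading spreads the sort across several CPUs, so elapsed (wall clock) time goes down while total CPU time goes up.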

QUESTION 9

Given the SAS data set WORK.TRANSACT:

Rep    Cost  Ship
-----------------------
SMITH   200    50
SMITH   400    20
JONES   100    10
SMITH   600   100
JONES   100     5

The following output is desired:

Rep
------------
JONES 105
SMITH 250

Which SQL statement was used?

A.
select rep, min(Cost+Ship)
from WORK.TRANSACT
order by Rep;

B.
select Rep, min(Cost,Ship) as Min
from WORK.TRANSACT
summary by Rep order by Rep;

C.
select Rep, min(Cost,Ship)
from WORK.TRANSACT
group by Rep
order by Rep ;

D.
select Rep, min(Cost+Ship)
from WORK.TRANSACT
group by Rep
order by Rep;

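Answer: D
The GROUP BY clause is normally used together with aggregate (summary) functions and the HAVING clause.

proc sql;
select jobcode, avg(salary) as AvgSalary format = dollar11.2
from sasuser.payrollmaster
group by jobcode
having avg(salary) > 56000;
quit;

- If a view contains a GROUP BY clause, it is a summary view and cannot be updated.
- If a query contains a GROUP BY clause, every column in the SELECT clause that is not inside an aggregate function should be listed in the GROUP BY clause; otherwise the results may not be what you expect.
- If there is a HAVING clause but no GROUP BY clause, the query treats the whole table as one group.
- If there is a GROUP BY clause but the query contains no aggregate function, the GROUP BY clause is treated as an ORDER BY clause.

[Explanation]
Option A result: no grouping is done, so MIN is computed over the whole table. Five rows are returned, and the MIN value is 105 in every row.
Option B result: syntax error; no result is produced.
Option C result: MIN with two arguments is evaluated row by row, so each row shows the smaller of Cost and Ship for that Rep.
Option D result: the correct answer. The second column of the desired output has no name, which matches the unaliased expression.

Variant: the output becomes

Rep
------------
JONES  105
JONES  105
SMITH  105
SMITH  105
SMITH  105

That is, without GROUP BY the whole table becomes a single group.

The corresponding program is: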

data work.transact;
infile datalines;
input Rep $ Cost Ship;
datalines;
SMITH 200 50
SMITH 400 20
JONES 100 10
SMITH 600 100
JONES 100 5
;
run;
proc sql;
select Rep, min(Cost+Ship)
from work.transact
order by Rep;
quit;

QUESTION 10

The following SAS program is submitted:

%let Value=9;
%let Add=5;
%let Newval=%eval(&Value/&Add);
%put &Newval;

What is the value of the macro variable newval when the %PUT statement executes?

A. 0.555
B. 2
C. 1.8
D. 1

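Answer: D. %EVAL performs integer arithmetic only; to get the exact result, use %SYSEVALF in place of %EVAL.

[Note] With %SYSEVALF instead of %EVAL, the result is 1.8.

Variant:
%let Value=11;
%let Add=5;
%let Newval=%eval(&Value/&Add);
%put &Newval;
Output: 2

What the %EVAL function does:
- converts integer strings and hexadecimal strings to integer values;
- converts tokens that represent arithmetic, comparison, and logical operators into macro-level operators;
- performs the arithmetic or logical operation.

For arithmetic expressions, if the result is not an integer, %EVAL truncates everything after the decimal point and returns the integer part.
If a non-integer operand appears in an arithmetic expression, %EVAL issues an error and returns a null value.

For logical expressions, %EVAL returns 1 for true and 0 for false.

%EVAL does not convert the following to numeric values:
- numeric strings that contain a decimal point or E-notation (in other words, the expression must not contain a decimal point);
- SAS date and time constants.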

Statements and what they write to the log:

%put value=%eval(10 lt 2);
Log: value=0

%put value=10+2;
Log: value=10+2

%put value=%eval(10+2);
Log: value=12

%let counter=2;
%let counter=%eval(&counter+1);
%put counter=&counter;
Log: counter=3

%let numer=2;
%let denom=8;
%put value=%eval(&numer/&denom);
Log: value=0

%let numer=2;
%let denom=8;
%put value=%eval(&numer/&denom*&denom);
%put value=%eval(&denom*&numer/&denom);
Log: value=0
     value=2

%let real=2.4;
%let int=8;
%put value=%eval(&real+&int);
Log: ERROR: A character operand was found in the %EVAL function or %IF condition where a numeric operand is required. The condition was: 2.4+8
     value=
Because %EVAL does not convert a value that contains a period to a number, the operand is treated as character.

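%SYSEVALF() evaluates arithmetic or logical expressions using floating-point arithmetic.

%SYSEVALF(expression<, conversion-type>)
where conversion-type can be BOOLEAN, CEIL, FLOOR, or INTEGER.

After floating-point evaluation, %SYSEVALF() always returns a character value, formatted with BEST16.
%SYSEVALF is the only macro function that can evaluate logical expressions containing floating-point values, date, time, or datetime values, or missing values.
Specifying a conversion-type avoids inconsistent return types.

Any macro function or statement that requires a numeric or logical expression as an argument calls %EVAL automatically, for example %SCAN, %SUBSTR, and the %IF-%THEN statement.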
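The difference between integer and floating-point evaluation, and the effect of each conversion-type, can be seen with a short sketch that reuses the values from the question. The expected values in the comments follow from 9/5 = 1.8.

```sas
%let Value=9;
%let Add=5;

/* %EVAL: integer arithmetic, the fractional part is dropped. */
%put eval=%eval(&Value/&Add);                        /* eval=1       */

/* %SYSEVALF: floating-point arithmetic. */
%put sysevalf=%sysevalf(&Value/&Add);                /* sysevalf=1.8 */

/* Conversion types applied to the floating-point result 1.8. */
%put ceil=%sysevalf(&Value/&Add, ceil);              /* ceil=2       */
%put floor=%sysevalf(&Value/&Add, floor);            /* floor=1      */
%put integer=%sysevalf(&Value/&Add, integer);        /* integer=1    */
%put boolean=%sysevalf(&Value/&Add, boolean);        /* boolean=1    */
```

BOOLEAN returns 1 here because the result is neither zero nor missing.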

QUESTION 11

The following SAS code is submitted:

data WORK.TEMP WORK.ERRORS / view=WORK.TEMP;


infile RAWDATA;
input Xa Xb Xc;
if Xa=. then output WORK.ERRORS;
else output WORK.TEMP;
run;

Which of the following is true of the WORK.ERRORS data set?

A. The data set is created when the DATA step is submitted.


B. The data set is created when the view TEMP is used in another SAS step. (or)
When TEMP is used in another SAS step, the data set is created.
C. The data set is not created because the DATA statement contains a syntax error.
D. The descriptor portion of WORK.ERRORS is created when the DATA step is submitted.

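Answer: B
A DATA step view contains a partially compiled DATA step and can be created only by a DATA step. A DATA step view is referenced in the same way as an ordinary table.
General form:
DATA SAS-data-view <SAS-data-file-1 … SAS-data-file-n> / VIEW=SAS-data-view;
<SAS statements>
RUN;

A DATA step view reads the current data in the underlying files, avoids storing a copy of a large data file, and can combine data from several sources.

A DATA step view can read from many kinds of sources, including raw data files, SAS data files, PROC SQL views, SAS/ACCESS views, and DB2, ORACLE, or other DBMS data.

Creating a DATA step view: the DATA step is only partially compiled, and the intermediate code is stored in the specified SAS library with member type VIEW.

Referencing a DATA step view: the compiler resolves the intermediate code, generates executable code for the host environment, and the generated code is then executed.

In practice: after the step is submitted (F3), only a view named TEMP exists in WORK, and it holds no observations. Only when the view is opened (for example, by double-clicking it) are observations produced and the data set ERRORS created. That is why the answer is B.

A DATA step view cannot contain:
- global statements;
- host-specific data set options;
- most host-specific FILE and INFILE statement options.
A DATA step view also cannot be indexed or compressed.

With the VIEW= option, SAS compiles the source program without executing it and stores the compiled code in the DATA step view.

Note: if other data files are named in the DATA statement, SAS creates them only when a later DATA or PROC step uses the view. To use those data files, you must therefore reference the DATA step view first.

The source statements of a DATA step view are saved and can be displayed with the DESCRIBE statement. Do not confuse this with a statement that references the view.

For example: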
data view=company.newdata;
describe;
run;
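The timing described for answer B can be observed with a sketch like the following. The temporary fileref RAWDATA and its two records are assumptions made so that the example is self-contained; EXIST() reports 1 when the data set exists and 0 otherwise.

```sas
/* Hypothetical raw file: a temporary fileref with two records. */
filename rawdata temp;
data _null_;
  file rawdata;
  put '1 2 3';
  put '. 5 6';
run;

/* Defining the view only stores the compiled step; nothing is read yet. */
data work.temp work.errors / view=work.temp;
  infile rawdata;
  input Xa Xb Xc;
  if Xa=. then output work.errors;
  else output work.temp;
run;

%put NOTE: WORK.ERRORS exists before the view is used: %sysfunc(exist(work.errors));

/* Using the view executes the stored step, which also writes WORK.ERRORS. */
proc print data=work.temp;
run;

%put NOTE: WORK.ERRORS exists after the view is used: %sysfunc(exist(work.errors));
```

The first log note is expected to report 0 and the second 1, in line with the explanation above.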

QUESTION 12
Which title statement would always display the current date?
A. title "Today is: &sysdate.";
B. title "Today is: &sysdate9.";
C. title "Today is: &today.";
D. title "Today is: %sysfunc(today(),worddate.)";
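Answer: D

[Note] If the computer's clock is changed by hand, D shows the current system date while A still shows the date on which the session started, because SYSDATE returns the "date that a SAS job or session began executing".
A: &sysdate shows the date the SAS session started, for example "Today is: 03FEB16".
B: &sysdate9 shows the date the SAS session started, for example "Today is: 03FEB2016".
C: &today is not an automatic macro variable, so the reference cannot be resolved and the title reads "Today is: &today.".

Automatic macro variables whose values are fixed for the session:
Reference     Description
&SYSDATE.     Date the SAS session started (DATE7.), e.g. "01DEC15"
&SYSDATE9.    Date the SAS session started (DATE9.), e.g. "01DEC2015"
&SYSDAY.      Day of the week on which the SAS session started, e.g. "Tuesday"
&SYSTIME.     Time the SAS session started, e.g. "19:38"
&SYSENV.      FORE (interactive mode) or BACK (noninteractive or batch mode)
&SYSSCP.      Abbreviation for the operating system, e.g. WIN or LINUX
&SYSVER.      Release of SAS in use
&SYSJOBID.    Identifier of the current SAS session or batch job (user name or job name on mainframe systems, process ID on other systems)

Automatic macro variables whose values change with the statements the user submits:

Reference     Description
&SYSLAST.     Name of the most recently created SAS data set, in the form LIBREF.NAME, always in uppercase. If no data set has been created, the value is _NULL_.
&SYSPARM.     Text specified when SAS was invoked
&SYSERR.      Return code from the DATA step or certain SAS procedures, indicating whether the step or procedure ran successfully

General form of the %SYSFUNC function:
%SYSFUNC(function(argument(s)) <,format>)

%QSYSFUNC works the same way but masks (macro-quotes) special characters in the returned value, so that the result can safely be passed on as an argument.
For example:
title "Report Produced on %sysfunc(left(%qsysfunc(today(),worddate.)))";
Typical output: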
Report Produced on November 4, 2011
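The difference between a value fixed at session start and one evaluated at run time can be shown with a small sketch. It assumes the SASHELP.CLASS sample table is available for the PROC PRINT step.

```sas
/* &SYSDATE9 is fixed when the session starts; TODAY() is evaluated
   each time the statement runs. */
%put Session started: &sysdate9;
%put Current date:    %sysfunc(today(), date9.);

/* Option D from the question, used in a real title. */
title "Today is: %sysfunc(today(), worddate.)";
proc print data=sashelp.class(obs=1);
run;
title;
```

In a session that stays open past midnight, the two %PUT lines would show different dates.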

QUESTION 13

Given the SAS data sets:


WORK.ONE           WORK.TWO
Id   Name          Id   Salary
-----------        ---------------
112  Smith         243  150000
243  Wei           355   45000
457  Jones         523   75000

The following SAS program is submitted:

data WORK.COMBINE;
merge WORK.ONE WORK.TWO;
by Id;
run;

Which SQL procedure statement produces the same results?


A.
create table WORK.COMBINE as
select Id, Name, Salary
from WORK.ONE full join WORK.TWO on ONE.Id=TWO.Id;
B.
create table WORK.COMBINE as
select coalesce(ONE.Id, TWO.Id) as Id, Name, Salary
from WORK.ONE, WORK.TWO
where ONE.Id=TWO.Id;
C.
create table WORK.COMBINE as
select coalesce(ONE.Id, TWO.Id) as Id, Name, Salary
from WORK.ONE full join WORK.TWO on ONE.Id=TWO.Id
order by Id;
D.
create table WORK.COMBINE as
select coalesce(ONE.Id, TWO.Id) as Id, Name, Salary
from WORK.ONE, WORK.TWO
where ONE.Id=TWO.Id
order by ONE.Id;

Answer: C

The DATA step performs a match-merge: rows with no match in the left or right table are filled with missing values, and the BY variable appears only once.

A: The reference to Id is ambiguous because both tables have an Id column; "select Id…" would have to be written as "select ONE.Id…" or "select TWO.Id…". Even with that fix, the result still does not match the MERGE.

B: COALESCE() overlays the two Id columns, but the statement is an inner (equi-) join filtered by the WHERE clause, so only the one matching row is returned and the unmatched rows of both tables are lost. There is also no ORDER BY clause, so the row order of the PROC SQL result is not guaranteed.

C: The FULL JOIN is carried out on the ON condition first. Without COALESCE the result would carry two Id columns, one from ONE and one from TWO, and for rows without a match the selected Id would be missing. COALESCE overlays the two Id columns and returns the first nonmissing value, so the result matches the MERGE.

D: The ORDER BY clause gives the required order, but this is still an inner join rather than a FULL JOIN, so only the one matching row is returned (the same problem as B).

[Note] On unambiguous column references, compare with Question 26.

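Using the DATA step:
Advantages:
- there is no limit on the number of input data sets;
- DATA step processing (DO loops, arrays, and so on) can be combined with MERGE to implement complex business logic;
- several BY variables allow lookups on more than one variable.
Disadvantages:
- the data sets must be sorted or indexed by the BY variables;
- the BY variables must be present in every data set, with identical names;
- key values must match exactly, or the key value must be found.

Using PROC SQL:
Advantages:
- data sets need not be sorted or indexed beforehand, although an index can improve performance;
- several data sets can be joined in a single step, and they do not all have to share the same variables;
- the combined data can be used to create data sets (tables), views, and reports.
Disadvantages:
- at most 256 tables can be joined at once;
- complex business logic is hard to apply during the join;
- for simple joins, PROC SQL may need more resources than the DATA step.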
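The equivalence claimed for option C can be checked with a sketch that builds both results and compares them. The table names WORK.COMBINE_MERGE and WORK.COMBINE_SQL are hypothetical, chosen so that neither result overwrites the other.

```sas
data work.one;
  input Id Name $;
  datalines;
112 Smith
243 Wei
457 Jones
;
run;

data work.two;
  input Id Salary;
  datalines;
243 150000
355 45000
523 75000
;
run;

/* DATA step match-merge (both inputs are already in Id order). */
data work.combine_merge;
  merge work.one work.two;
  by Id;
run;

/* Option C: full join, Id overlaid with COALESCE, rows ordered by Id. */
proc sql;
  create table work.combine_sql as
    select coalesce(one.Id, two.Id) as Id, Name, Salary
      from work.one full join work.two
        on one.Id = two.Id
      order by Id;
quit;

/* PROC COMPARE reports any differences between the two results. */
proc compare base=work.combine_merge compare=work.combine_sql;
run;
```

Both tables should contain five rows (Id 112, 243, 355, 457, 523), with missing Name or Salary where one side has no match.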

QUESTION 14

The following SAS program is submitted:

proc contents data=TESTDATA.ONE;


run;

Which SQL procedure step produces similar information about the column attributes of TESTDATA.ONE?

A.
proc sql;
contents from TESTDATA.ONE;
quit;
B.
proc sql;
describe from TESTDATA.ONE;
quit;
C.
proc sql;
contents table TESTDATA.ONE;
quit;
D.
proc sql;
describe table TESTDATA.ONE;
quit;

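Answer: D
Syntax of the DESCRIBE TABLE statement:

DESCRIBE TABLE table-name-1<, …table-name-n>;

where table-name can be a one-level name, a two-level name (Libref.Table), or a physical path name (enclosed in single quotation marks).

DESCRIBE TABLE writes its output to the SAS log.

DESCRIBE TABLE can describe tables created by a DATA step, and it also displays index information.

Index information can also be displayed with PROC CONTENTS, with PROC DATASETS, or by querying Dictionary.Indexes.

Querying Dictionary.Indexes (note that the library and member names are in uppercase):
proc sql;
select * from dictionary.indexes
where libname='SASUSER' and memname='SALE2000';
quit;

The structure of the Dictionary.Indexes table can be displayed by running:
proc sql;
describe table dictionary.indexes;
quit;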

QUESTION 15

Given the SAS data set WORK.ONE:

Rep    Cost
----------------
SMITH   200
SMITH   400
JONES   100
SMITH   600
JONES   100

The following SAS program is submitted:


proc sql;
select
Rep,
avg(Cost)
from WORK.ONE
order by Rep
;
quit;

Which result set would be generated?

A.
JONES 280
JONES 280
SMITH 280
SMITH 280
SMITH 280
B.
JONES 600
SMITH 100
C.
JONES 280
SMITH 280
D.
JONES 100
JONES 100
SMITH 600
SMITH 600
SMITH 600

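Answer: A
Variant of the question:
proc sql;
select Rep, avg(Cost)
from WORK.ONE
group by Rep;
quit;

Output:
Rep
----- ---
JONES 100
SMITH 400

The program in the original question has no GROUP BY clause, so the aggregate function is applied to the whole column and the overall mean (280) is repeated on every row.

Test programs: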
data one;
infile datalines;
input rep $ cost;
datalines;
SMITH 200
SMITH 400
JONES 100
SMITH 600
JONES 100
;
run;

Program for A:
proc sql;
select rep, avg(cost)
from work.one
order by rep;
quit;

Program for B:
proc sql;
select rep, max(cost)
from work.one
group by rep;
quit;

Program for C:
proc sql;
select rep, avg((select avg(cost) from work.one))
from work.one
group by rep;
quit;

One possible program for D:
proc sql;
select o.rep, (select max(cost) from work.one as i where i.rep=o.rep)
from work.one as o
group by rep;
quit;
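Note the message in the SAS log at this point: a GROUP BY clause is automatically converted to an ORDER BY clause when no aggregate function appears in the SELECT clause or the HAVING clause. The program above is therefore equivalent to: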

proc sql;
select o.rep, (select max(cost) from work.one as i where i.rep=o.rep)
from work.one as o
order by rep;
quit;

Or, using an in-line view:
proc sql;
select o.rep, i.m
from work.one as o, (select rep, max(cost) as m from work.one group by rep) as i
where o.rep=i.rep
order by o.rep;
quit;
This produces result set D.

QUESTION 16

Given the SAS data sets:

WORK.MATH1A        WORK.MATH1B

Name     Fi        Name    Fi
-----------        -----------
Lauren   L         Smith   M
Patel    A         Lauren  L
Chang    Z         Patel   A
Hillier  R

The following SAS program is submitted:

proc sql;
select *
from WORK.MATH1A
[INSERT SET OPERATOR]
select *
from WORK.MATH1B
;
quit;

The following output is desired:

Name Fi
------------
Lauren L
Patel A
Chang Z
Hillier R
Smith M
Lauren L
Patel A

Which SQL set operator completes the program and generates the desired output?
A. append corr
B. union corr
C. outer union corr
D. intersect corr

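Answer: C
APPEND: there is no APPEND set operator; APPEND exists only as PROC APPEND.
Basic syntax:
proc append base=work.acities
data=work.airports force;
run;
Cases:
- The BASE= data set has more variables than the DATA= data set: FORCE is not needed, and the extra BASE= variables are set to missing for the appended rows.
- The DATA= data set has more variables than the BASE= data set: FORCE is required, and the extra DATA= variables are dropped.
- A variable in the DATA= data set is longer than the same variable in the BASE= data set: FORCE is required, and the DATA= values may be truncated.
- A variable has different types in the DATA= and BASE= data sets: the DATA= values of the mismatched variable are set to missing.

For simply concatenating two tables, PROC APPEND is the fastest method, because the BASE= table is not read in full; only the DATA= table has to be read completely.

SQL set operators generally need more computing resources, but they are more convenient and flexible.
A DATA step can process a practically unlimited number of tables at once, whereas an SQL set operator works on only two tables at a time.

The following three programs produce the same report: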
data three;
set one two;
run;
proc print data=three noobs;
run;

proc sql;
select * from one
outer union corr
select * from two;
quit;

proc append base=one data=two;


run;
proc print data=one noobs;
run;

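By default, EXCEPT, INTERSECT, and UNION display only unique rows, and PROC SQL makes two passes through the data:
- first pass: PROC SQL removes duplicate rows from the tables;
- second pass: PROC SQL selects the rows that meet the condition and, if requested, overlays the columns.

When overlaying columns, PROC SQL by default uses the column name from the first table; if the column in the first table has no name, it falls back to the column name in the second table.

Set operator keywords: ALL and CORR
ALL: makes only one pass through the data and does not remove duplicate rows.
- Use it when duplicates do not affect the result or cannot occur.
- ALL cannot be used with OUTER UNION.

CORR: compares and overlays columns by name rather than by position, which is the default.
- Suitable when the two tables share some or all columns but in a different order.
- With EXCEPT, INTERSECT, and UNION, it removes every column whose name does not appear in both tables.
- If column aliases are specified in the SELECT clause, the aliases are used for comparing and overlaying.

By default, EXCEPT, INTERSECT, and UNION overlay columns by position, whereas OUTER UNION does not overlay columns.

Test program for the variants of this question: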
data math1a;
infile datalines;
input Name $ Fi $;
datalines;
Lauren L
Patel A
Chang Z
Hillier R
;
run;
data math1b;
infile datalines;
input Name $ Fi $;
datalines;
Smith M
Lauren L
Patel A
;
run;
title 'UNION CORR';
proc sql;
select *
from WORK.MATH1A
union corr
select *
from WORK.MATH1B
;
quit;

title 'OUTER UNION CORR';


proc sql;
select *
from WORK.MATH1A
outer union corr
select *
from WORK.MATH1B
;
quit;

title 'INTERSECT CORR';


proc sql;
select *
from WORK.MATH1A
intersect corr
select *
from WORK.MATH1B
;
quit;

QUESTION 17

Which of the following is an advantage of SAS views?

A. SAS views can access the most current data in files that are frequently updated.
B. SAS views can avoid storing a SAS copy of a large data file.
C. SAS views can decrease programming time.
D. both A and B are true

Answer: D

QUESTION 18

In what order does SAS search for format definitions by default?

A. 1. WORK.FORMATS 2. LIBRARY.FORMATS
B. 1. LIBRARY.FORMATS 2. WORK.FORMATS
C. There is no default order, it must be defined by the user.
D. All user defined libraries that have a catalog named FORMATS, in alphabetic order.

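Answer: A
- When a format is permanently assigned to a variable in a data set, make sure the format itself is stored in a permanent location.
To use a previously created format in a later SAS session:
1. Assign the libref for the target library in the SAS session that runs the PROC FORMAT step.
2. Specify LIB=LIBRARY in the PROC FORMAT step that creates the format.
3. In later programs, include a LIBNAME statement that assigns the libref to the library containing the catalog where the formats are stored.

When searching for a format, SAS by default always looks in Work.Formats first and then in Library.Formats.

The libref Library is searched automatically, so it is the recommended libref for storing formats.

To search catalogs outside the default path, use the FMTSEARCH= system option:
OPTIONS FMTSEARCH=(catalog-1 catalog-2 … catalog-n);

When naming a catalog, you can give only the libref; SAS then searches the catalog Libref.Formats automatically.

For example:
options fmtsearch=(rpt prod.newfmt);

The search order is then Work.Formats -> Library.Formats -> Rpt.Formats -> Prod.Newfmt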
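The search path can be exercised with a minimal sketch like the one below. The libref RPT, the format AGEGRP, and the choice to point RPT at the WORK directory are all assumptions made for illustration; SASHELP.CLASS is a sample table shipped with SAS.

```sas
/* Hypothetical libref: RPT simply points at the WORK directory here. */
libname rpt "%sysfunc(pathname(work))";

/* Store a user-defined format in the catalog RPT.FORMATS. */
proc format lib=rpt;
  value agegrp
    low-12  = 'Child'
    13-high = 'Teen';
run;

/* Add RPT to the search path: Work.Formats, Library.Formats, then Rpt.Formats. */
options fmtsearch=(rpt);

/* The format is now found even though it is not stored in Work.Formats. */
proc print data=sashelp.class;
  format Age agegrp.;
run;
```

Without the OPTIONS statement, the reference to AGEGRP. would not be resolved, because RPT is not part of the default search path.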

QUESTION 19

Given the dataset WORK.STUDENTS:


Name Age
-------------
Mary 15
Philip 16
Robert 12
Ronald 15

The following SAS program is submitted:


%let Value=Philip;
proc print data=WORK.STUDENTS;
[INSERT WHERE STATEMENT]
run;
Which WHERE statement successfully completes the program and produces a report?

A. where upcase(Name)=upcase(&Value);
B. where upcase(Name)=%upcase(&Value);
C. where upcase(Name)="upcase(&Value)";
D. where upcase(Name)="%upcase(&Value)";

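Answer: D
To manipulate the text of a macro variable, use a macro function.

%UPCASE() only converts the macro variable's value to uppercase; the WHERE clause still needs the value enclosed in quotation marks for the string comparison.


If the macro variable's value contains special characters, mnemonic operators, or macro triggers, use the %QUPCASE function instead.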

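Explanation: this question is really about quoting strings. In ordinary SAS code a character value must be enclosed in quotation marks; a macro variable definition is the exception. Macro functions must be prefixed with "%". The %UPCASE macro function uppercases the macro variable's text, and because the right-hand side of the comparison is a character value, it must be quoted.
A: result: ERROR, variable Philip is not on file WORK.STUDENTS.
B: result: ERROR, variable PHILIP is not on file WORK.STUDENTS.
C: result: no observations were selected from data set WORK.STUDENTS.

For option A, the test program is:
option msglevel=i;
%let value=Philip;
data _null_;
%put where upcase(Name)=upcase(&Value);
run;
The log shows that the statement actually becomes:
where upcase(Name)=upcase(Philip);
SAS treats Philip as a variable name, cannot resolve it, and reports an error.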

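For option B, the test program is:
%let value=Philip;
data _null_;
%put where upcase(Name)=%upcase(&Value);
run;
The log shows that the statement actually becomes:
where upcase(Name)=PHILIP;
The character constant is not quoted, so SAS looks for a variable named PHILIP and reports an error.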

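For option C, the test program is:
%let value=Philip;
data _null_;
%put where upcase(Name)="upcase(&Value)";
run;
The log shows that the statement actually becomes:
where upcase(Name)="upcase(Philip)";
Because UPCASE() sits inside the double quotation marks, it is part of a character constant and is never evaluated as a function. The comparison is made against the literal string upcase(Philip), which matches no value of Name, so no observations are selected.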

QUESTION 20

The following SAS program is submitted:

data WORK.TEMP;
length A B 3 X;
infile RAWDATA;
input A B X;
run;

What is the length of variable A?


A. 3
B. 8
C. WORK.TEMP is not created – X has an invalid length.
D. Unknown.

Answer: A
Test program:
data WORK.TEMP;
length A B 3 X;
infile RAWFILE;
input A B X;
run;

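The program reports an error while it runs: the DATA step stops and reads no data.

X should be followed by an integer from 3 to 8, so an error is raised, but X still receives the default length of 8.

The DESCRIBE TABLE statement does not show the actual length of numeric variables:
proc sql;
describe table work.temp;
quit;

PROC CONTENTS does show it:


proc contents data=work.temp;
run;

The data set is created, and the variable lengths are 3, 3, and 8.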

QUESTION 21

The following SAS program is submitted:

data WORK.NEW;
do i=1, 2, 3;
Next=cats('March' || i);
infile XYZ filevar=Next end=Eof;
do until (Eof);
input Dept $ Sales;
end;
end;
run;

The purpose of the FILEVAR= option on the INFILE statement is to name the variable Next, whose value:

A. Points to a new input file.


B. Is output to the SAS data set WORK.NEW.
C. Is an input SAS data set reference.
D. Points to an aggregate storage location.

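Answer: A. When the value of the FILEVAR= variable changes, the INFILE statement closes the current file and opens the new one.


The commonly used CAT family of string functions:
- CAT(): concatenates strings without removing leading or trailing blanks.
- CATS(): concatenates strings and removes leading and trailing blanks.
- CATT(): concatenates strings and removes trailing blanks only.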
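
The difference between the three functions can be seen in a small sketch. The variables a, b, and r1-r3 are made up for this illustration; the comments show the value each function returns before it is padded to the length of the receiving variable.

```sas
data _null_;
   a = '  x ';                 /* two leading blanks, one trailing blank */
   b = ' y  ';                 /* one leading blank, two trailing blanks */
   length r1-r3 $ 20;
   r1 = cat(a, b);             /* '  x  y  ' : nothing is removed                      */
   r2 = cats(a, b);            /* 'xy'       : leading and trailing blanks are removed */
   r3 = catt(a, b);            /* '  x y'    : only trailing blanks are removed        */
   /* $CHAR preserves leading blanks; the brackets make them visible in the log */
   put '[' r1 $char20. ']';
   put '[' r2 $char20. ']';
   put '[' r3 $char20. ']';
run;
```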

To check the value of the variable Next, the test program is:
data WORK.NEW;
do i=1, 2, 3;
Next=cats('March' || i);
output;
end;
run;
proc contents data=work.new;
run;

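Points to note from the SAS log:
- i is a numeric variable; the || operator converts it to character automatically, and SAS writes a NOTE to the log. The conversion uses the BEST12. format, so the converted value carries leading blanks that end up inside the concatenated string, where CATS cannot remove them.
- When the result of CATS is assigned to a variable whose length has not been declared, the variable gets a length of 200, so longer strings may be truncated.

There are three ways to combine data vertically:
1. Use a FILENAME statement
2. Use the FILEVAR= option
3. Append SAS data sets

First, the FILENAME statement form: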

filename qtr1 ('month1.dat' 'month2.dat' 'month3.dat');


data work.firstqtr;
infile qtr1;
input Flight $ Origin $ Dest $ Date : date9. RevCargo : comma15.;
run;

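The log shows that SAS reads the files in order.

Second, the FILEVAR= option form:

INFILE file-specification FILEVAR=variable END=variable2;

Here file-specification is only a placeholder, and variable behaves like an automatic variable: it is not written to the data set.
An example program: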
data work.quarter (drop=monthnum midmon lastmon);
thisday=today();
monthnum=month(thisday);
midmon=month(intnx('month',thisday,-1));
lastmon=month(intnx('month',thisday,-2));
do Month = monthnum, midmon, lastmon;
nextfile="c:\sasuser\month"||compress(put(Month,2.)||".dat",' ');
do until (lastobs);
infile temp filevar=nextfile end=lastobs;
input Flight $ Origin $ Dest $ Date : date9.
RevCargo : comma15.;
output;
end;
end;
stop;
run;

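The COMPRESS() function removes the leading blank from the month number. The STOP statement prevents the DATA step from looping indefinitely.

Because a DATA step ends automatically when it tries to read past the end of a file, reading several files requires the END= option, which names a variable that signals when the last record of the current file has been read (before the step reads past the end of the file).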

QUESTION 22

Given the following partial SAS log:

NOTE: SQL table SASHELP.CLASS was created like:


create table SASHELP.CLASS( bufsize=4096 )
(
Name char(8),
Sex char(1),
Age num,
Height num,
Weight num
);

Which SQL procedure statement generated this output?


A. CONTENTS FROM SASHELP.CLASS;
B. CREATE FROM SASHELP.CLASS INTO LOG;
C. DESCRIBE TABLE SASHELP.CLASS;
D. DESCRIBE TABLE=SASHELP.CLASS; (distractor)
E. VALIDATE SELECT * FROM SASHELP.CLASS;

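Answer: C
The basic form of PROC CONTENTS is:
PROC CONTENTS DATA=<libref.>SAS-data-set-name;
RUN;

The basic form of PROC DATASETS is:
PROC DATASETS <LIBRARY=libref> <NOLIST>;
CONTENTS DATA=<libref.>SAS-data-set-name;
QUIT;
The NOLIST option suppresses the listing of the SAS files in the library in the SAS log and in ODS output.

PROC CONTENTS and PROC DATASETS with a CONTENTS statement display essentially the same results. Options that can be specified include:
- NODS: can only be used with DATA=libref._ALL_; suppresses the contents of the individual data sets.
- VARNUM: lists the variable names in logical (creation) order.

The VALIDATE keyword precedes a SELECT statement and checks the query's syntax without running it, much like the NOEXEC option of PROC SQL.

If the syntax is correct, SAS reports no error and only writes a note:
VALIDATE writes: "NOTE: PROC SQL statement has valid syntax."
NOEXEC writes: "NOTE: Statement not executed due to NOEXEC option."

Regarding option B, to create a new table modeled on an existing table, the syntax is similar to: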
proc sql;
create table work.flightdelays3
(drop=delaycategory destinationtype)
like
sasuser.flightdelays;
quit;

To create a new table from a query result, the syntax is similar to:
proc sql;
create table work.supervisors2
as
select * from sasuser.supervisors;
quit;

QUESTION 23
Uses a data set similar to the one in Question 3.
Given the SAS data set SASUSER.HIGHWAY:

Steering Seatbelt Speed Status Count


-----------------------------------------
absent No 0-29 serious 31
absent No 0-29 not 1419
absent No 30-49 serious 191
absent no 30-49 not 2004
absent no 50+ serious 216

The following SAS program is submitted:

%macro SPLIT;
proc sort data=SASUSER.HIGHWAY out=WORK.UNIQUES(keep=Status) nodupkey;
by Status;
run;

data _null_;
set uniques end=Lastobs;
call symputx('Status'||left(_N_),Status);
if Lastobs then call symputx('Count', _N_);
run;

%local i;
data
%do i=1 %to &count;
[INSERT REFERENCE]
%end;
;
set SASUSER.HIGHWAY;
select(Status);
%do i=1 %to &Count;
when("[INSERT REFERENCE]") output [INSERT REFERENCE];
%end;
otherwise;
end;
run;
%mend;
%SPLIT

What macro variable reference completes the program to create the WORK.NOT and WORK.SERIOUS
data sets?
A. &Status&i
B. &&Status&i
C. &Status&Count
D. &&Status&Count

Answer: B
proc sort data=SASUSER.HIGHWAY out=WORK.UNIQUES(keep=Status) nodupkey;
by Status;
run;

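This step does the following:
- Sorts SASUSER.HIGHWAY by Status in ascending order
- Writes the result to WORK.UNIQUES
- Keeps only the variable Status

- NODUPKEY: checks for and eliminates observations with duplicate BY values.
- PROC SORT compares the BY values of each observation with those of the previous observation written to the output data set; if they match exactly, the current observation is not written.
- With NODUPKEY you can also specify the DUPOUT= option to write the eliminated duplicates to a named data set.
- Note the EQUALS | NOEQUALS option. The default is EQUALS: observations with identical BY values keep their original relative order.
- If the data set has an index and no OUT= option is given, the FORCE option is required; it overrides the index and the existing sort indicator and forces the sort.
- The NOUNIQUEKEY option, used together with UNIQUEOUT=, eliminates observations whose BY value is unique: a BY group that contains only one observation is removed from the output.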

Test program:
data sasuser.highway;
infile datalines;
input Steering $ Seatbelt $ Speed $ Status $ Count;
datalines;
absent No 0-29 serious 31
absent No 0-29 not 1419
absent No 30-49 serious 191
absent no 30-49 not 2004
absent no 50+ serious 216
;
run;
proc sort data=sasuser.highway out=work.uniques(keep=Status) nodupkey;
by Status;
run;
proc print data=work.uniques;
run;
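The DATA step creates SASUSER.HIGHWAY with the five observations listed in the question.

After sorting, WORK.UNIQUES holds one observation per distinct Status value: not and serious.

CALL SYMPUTX is very similar to CALL SYMPUT: besides creating a macro variable and assigning it a value, SYMPUTX also removes all leading and trailing blanks from both arguments.

The CALL SYMPUTX routine is invoked as:

CALL SYMPUTX(macro-variable, expression);

macro-variable is assigned the character value of expression; leading and trailing blanks are removed from both macro-variable and expression.
If the macro variable already exists, its previous value is overwritten.

macro-variable and expression can each be specified as:
- a constant, enclosed in quotation marks
- a DATA step variable
- a DATA step expression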

data _null_;
set uniques end=Lastobs;
call symputx('Status'||left(_N_), Status);
if Lastobs then call symputx('Count', _N_);
run;
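This step does the following:
It reads through WORK.UNIQUES; the variable Lastobs indicates whether the last observation has been reached (it is set to 1 at the last observation).

For each observation it creates a global macro variable.

Note that a numeric value is converted to character automatically with the BEST12. format (right-aligned), so concatenating _N_ directly would put leading blanks in front of the number; LEFT() is therefore used to left-align it.

After running this step, submitting:
%put _global_;
or: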
proc sql;
select * from dictionary.macros;
quit;

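shows that three global macro variables have been created:

GLOBAL STATUS1 not
GLOBAL STATUS2 serious
GLOBAL COUNT 2

The code:
%local i;
data
%do i=1 %to &count;
&&Status&i
%end;
;

is in fact the first statement of the DATA step, of the form:

%local i;
data %do i=1 %to &count; &&Status&i %end; ;

The statement it should generate is:
data not serious;

This requires resolving the values of the macro variables Status1 and Status2.

On the first scan && resolves to a single "&" and &i resolves to its value, giving &Status1 or &Status2; SAS then rescans and resolves that reference.

%do i=1 %to &Count;
when("&&Status&i") output &&Status&i;
%end;

This generates two executable statements:
WHEN("not") output not;
WHEN("serious") output serious;
So the DATA step finally generated is: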
data not serious;
set SASUSER.HIGHWAY;
select(Status);
when("not") output not;
when("serious") output serious;
otherwise;
end;
run;
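
The two-pass resolution of an indirect reference can be tried in open code. This sketch reuses the macro variable names from the question and sets them by hand with %LET rather than CALL SYMPUTX, which is an assumption made only to keep the example short.

```sas
%let Status1=not;
%let Status2=serious;
%let i=2;

/* Pass 1: && -> &, the text Status is kept, &i -> 2, giving &Status2 */
/* Pass 2: &Status2 -> serious                                        */
%put &&Status&i;

/* With a single ampersand SAS tries to resolve &Status first.       */
/* No macro variable named Status exists here, so the log shows a    */
/* warning about an unresolved reference.                            */
%put &Status&i;
```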

QUESTION 24
See Question 7.

The following SAS program is submitted:

%let Num1=7;
%let Num2=3;
%let Result=%eval(&Num1/&Num2);
%put &Result;

What is the value of the macro variable Result when the %PUT statement executes?

A. 2.3
B. 2
C. . (missing value)
D. 2.33333333333333

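Answer: B
%EVAL() performs integer arithmetic and truncates any fractional part of the result; an operand that contains a decimal point causes an error.

%SYSEVALF() evaluates arithmetic and logical expressions using floating-point arithmetic:

%SYSEVALF(expression<, conversion-type>)

conversion-type can be BOOLEAN, CEIL, FLOOR, or INTEGER.

%SYSEVALF() always returns a character value after the floating-point calculation; the value is formatted with BEST16.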
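
A short sketch of the difference, using the 7/3 expression from the question. It can be submitted in open code; each result is written to the log.

```sas
%put %eval(7/3);                 /* 2 : integer arithmetic, fraction dropped      */
%put %sysevalf(7/3);             /* floating-point result, returned as text      */
%put %sysevalf(7/3, ceil);       /* 3 : smallest integer >= the result            */
%put %sysevalf(7/3, floor);      /* 2 : largest integer <= the result             */
%put %sysevalf(7/3, integer);    /* 2 : integer part of the result                */
%put %sysevalf(7/3, boolean);    /* 1 : a nonzero, nonmissing result counts as true */
```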

QUESTION 25

Given the SAS data set SASUSER.HIGHWAY:

Steering Seatbelt Speed Status Count


-------------------------------------
absent No 0-29 serious 31
absent No 0-29 not 1419
absent No 30-49 serious 191
absent no 30-49 not 2004
absent no 50+ serious 216

The following SAS program is submitted:

%macro HIGHWAY(Belt=no);
proc print data=SASUSER.HIGHWAY;
where Seatbelt="&Belt" ;
run;
%mend;
%HIGHWAY(Belt=No)

How many observations appear in the generated report?


A. 0
B. 2
C. 3
D. 5

Answer: C
Variant: the case of the argument is changed. The answer differs because the string comparison here is case sensitive:

Invocation            Observations printed
%HIGHWAY(Belt=No)     3
%HIGHWAY(Belt=no)     2

Ways for a macro to receive parameters:
Positional parameters:
%MACRO macro-name(parameter-1<, …parameter-n>);
text
%MEND <macro-name>;

Keyword parameters:
%MACRO macro-name(keyword-1=<value-1><, …, keyword-n=<value-n>>);
text
%MEND <macro-name>;

Mixed parameter list (positional parameters must come first):
%MACRO macro-name(parameter-1<, …, parameter-n>,
keyword-1=<value-1><, …, keyword-n=<value-n>>);
text
%MEND;

PARMBUFF option:
%MACRO macro-name/PARMBUFF;
text
%MEND;

Example:
%macro printz/parmbuff;
%put Syspbuff contains: &syspbuff;
%local num;
%do num=1 %to %sysfunc(countw(%superq(syspbuff)));
%let dsname=%scan(&syspbuff,&num);
proc print data=sasuser.&dsname;
run;
%end;
%mend printz;

%printz(courses, schedule)

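With the SYMBOLGEN system option on, the log shows each macro variable as it resolves.

For the program in the question, the log shows &Belt resolving to No, so the step that runs is:
proc print data=SASUSER.HIGHWAY;
where Seatbelt="No" ;
run;

The report contains the three observations whose Seatbelt value is exactly No.

Note that enclosing the text in single quotation marks prevents the macro variable from resolving; a macro variable reference must be inside double quotation marks to resolve.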

QUESTION 26

Given the following SAS data sets:

WORK.VISIT1 WORK.VISIT2

Id Expense Id Cost
-------------- ------------
001 500 001 300
001 400 002 600
003 350

The following result set was summarized and consolidated using the SQL procedure:

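Id Cost
------------
001 300
001 900
002 600
003 350

First, look at the result set: it contains a row (001, 900) that does not appear in the raw data. Adding the second components of (001, 400) and (001, 500) in the first data set gives 900.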

Which of the following SQL statements was most likely used to generate this result?

A.
select
Id,
sum(Expense) label='COST'
from WORK.VISIT1
group by 1
union all
select
Id,
sum(Cost)
from WORK.VISIT2
group by 1
order by 1,2
;

B.
select
id,
sum(expense) as COST
from
WORK.VISIT1(rename=(Expense=Cost)),
WORK.VISIT2
where VISIT1.Id=VISIT2.Id
group by Id
order by
Id,
Cost
;

C.
select
VISIT1.Id,
sum(Cost) as Cost
from
WORK.VISIT1(rename=(Expense=Cost)),
WORK.VISIT2
where VISIT1.Id=VISIT2.Id
group by Id
order by Id, Cost
;

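D.
select
Id,
sum(Expense) as Cost
from WORK.VISIT1
group by Id
outer union corr
select
Id,
sum(Cost)
from WORK.VISIT2
group by Id
order by 1,2
;

Answer: A
B and C make the same mistake: the GROUP BY and ORDER BY clauses do not say which table Id (or Cost) comes from. B and C join the tables first, and ORDER BY then sorts the joined result. D produces three columns: Id, Cost, and an unnamed column.
B result: ERROR: Ambiguous reference, column id is in more than one table.
C result: ERROR: Ambiguous reference, column Cost is in more than one table.
D result:
Id   Cost  (unnamed)
001  .     300
001  900   .
002  .     600
003  350   .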
Data preparation:
data visit1;
infile datalines;
input Id $ Expense;
datalines;
001 500
001 400
003 350
;
run;
data visit2;
infile datalines;
input Id $ Cost;
datalines;
001 300
002 600
;
run;

For option A:
proc sql;
select Id, sum(Expense) label='COST'
from WORK.VISIT1
group by 1
;
quit;
returns:
001 900
003 350

proc sql;
select Id, sum(Cost)
from WORK.VISIT2
group by 1
;
quit;
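returns:
001 300
002 600

UNION: concatenates and sorts the two query results (columns are overlaid by position), then removes duplicate rows.
ALL: keeps all rows, including duplicates.
CORR: overlays columns by name rather than by position.

Option B: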
select id, sum(expense) as COST
from WORK.VISIT1(rename=(Expense=Cost)), WORK.VISIT2
where VISIT1.Id=VISIT2.Id
group by Id
order by Id, Cost;

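This is an equijoin. Its result would not match the required output, and the statement also contains errors:

1. The Expense variable of VISIT1 has been renamed to Cost, but the summary function SUM() still references Expense, so SAS reports that the column is not found.

2. The column references in GROUP BY and ORDER BY are also wrong: Id and Cost exist in both tables, so SAS reports an ambiguous reference.

Option C has the same ambiguous references in its GROUP BY and ORDER BY clauses.

Option D: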
proc sql;
select Id, sum(Expense) as Cost
from WORK.VISIT1
group by Id
outer union corr
select Id, sum(Cost)

from WORK.VISIT2
group by Id
order by 1,2
;
quit;

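The two queries return:
001 900
003 350
and
001 300
002 600

OUTER UNION CORR overlays only Id, the one column the two queries share by name. The output, sorted by Id and then Cost, therefore has three columns (Id, Cost, and an unnamed column) instead of the two required.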

QUESTION 27

Given the SAS data sets:

WORK.FIRST WORK.SECOND
Common X Common Y
------------ ---------------
A 10 A 1
A 13 A 3
A 14 B 4
B 9 B 2

The following SAS program is submitted:

data WORK.COMBINE;
set WORK.FIRST;
set WORK.SECOND;
run;

What data values are stored in dataset WORK.COMBINE?


A.
Common X Y
----------------
A 10 1
A 13 3
B 14 4
B 9 2

B.
Common X Y
---------------

A 10 1
A 13 3
A 14 3
B 9 4
B 9 2

C.
Common X Y
----------------
A 10 1
A 13 3
A 14 .
B 9 4
B . 2

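D.
Common X Y
---------------
A 10 1
A 13 1
A 14 1
A 10 3
A 13 3
A 14 3
B 9 4
B 9 2

Answer: A
Explanation: this is really a Base SAS question about combining data sets. The first SET statement places Common and X in the PDV and fills them from WORK.FIRST. The second SET statement adds Y to the PDV and, on every iteration, overwrites Common with the value from WORK.SECOND; X is not touched because WORK.SECOND has no X.
With multiple SET statements the DATA step ends as soon as any SET statement reads past the end of its data set, so the output has as many observations as the smallest input. Here both inputs have four observations, and the step stops when the first SET statement finds no fifth observation. During reading, a same-named variable from a later data set overwrites the value already in the PDV.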
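
The behavior can be checked with a small test program. The data values are those given in the question; only the PROC PRINT step is added for inspection.

```sas
data WORK.FIRST;
   input Common $ X;
   datalines;
A 10
A 13
A 14
B 9
;
run;

data WORK.SECOND;
   input Common $ Y;
   datalines;
A 1
A 3
B 4
B 2
;
run;

/* Each iteration reads one observation from FIRST, then one from SECOND. */
/* Common from SECOND overwrites Common from FIRST in the PDV.            */
data WORK.COMBINE;
   set WORK.FIRST;
   set WORK.SECOND;
run;

proc print data=WORK.COMBINE noobs;
run;
```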

QUESTION 28

Which of the following ARRAY statements is similar to the statement

array Yr{1974:2007} Yr1974-Yr2007;

and will compile without errors?

A. array Yr{34} Yr1974-Yr2007;


B. array Yr{74:07} Yr1974-Yr2007;
C. array Yr{74-07} Yr1974-Yr2007;
D. array Yr{1974-2007} Yr1974-Yr2007;

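Answer: A
A SAS array can be defined either by giving the size of each dimension or by giving the range of subscripts.

For example, the following two statements are equivalent (each defines a two-dimensional array with 5 rows and 3 columns):
array x{5,3} score1-score15;
array x{1:5,1:3} score1-score15;
The two styles can also be mixed, as in a later question:
array x{1:5,3} score1-score15;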
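
A minimal sketch of how the bounds of the array in the question can be inspected with the DIM, LBOUND, and HBOUND functions. The variables n, lo, and hi are introduced only for this illustration.

```sas
data _null_;
   array Yr{1974:2007} Yr1974-Yr2007;
   n  = dim(Yr);        /* number of elements */
   lo = lbound(Yr);     /* lower subscript    */
   hi = hbound(Yr);     /* upper subscript    */
   put n= lo= hi=;      /* writes n=34 lo=1974 hi=2007 */
   Yr{1974} = 1;        /* refers to the variable Yr1974 */
run;
```

With the array Yr{34} form from answer A, the same 34 variables are addressed with subscripts 1 to 34 instead of 1974 to 2007.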

QUESTION 29

The following program is submitted to check the variables Xa, Xb, and Xc in the SASUSER.LOOK
data set:

data _null_ WORK.BAD_DATA / view=WORK.BAD_DATA;


set SASUSER.LOOK(keep=Xa Xb Xc);
length _Check_ $ 10 ;
if Xa=. then _check_=trim(_Check_)!!" Xa" ;
if Xb=. then _check_=trim(_Check_)!!" Xb" ;
if Xc=. then _check_=trim(_Check_)!!" Xc" ;
put Xa= Xb= Xc= _check_= ;
run ;

When is the PUT statement executed?

A. When the code is submitted.


B. Only when the WORK.BAD_DATA view is used.
C. Both when the code is submitted and the view is used
D. Never, the use of _null_ in a view is a syntax error

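Answer: B
Explanation: a view runs its code only when it is referenced; submitting the step merely stores the compiled view.

A DATA step view is partially compiled when it is created; executable code is generated and run only when the view is used.
See Question 11.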

QUESTION 30

The following SAS program is submitted:

%let product=merchandise;
[INSERT %PUT STATEMENT]

and the following message is written to the SAS log:

the value is "merchandise"

Which macro statement wrote this message?

A. %put the value is '"'&product.'"';


B. %put the value is %quote(&product.);
C. %put the value is "&product.";
D. %put the value is ""&product."";

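Answer: C
Option A writes:
the value is '"'merchandise'"'

Option B writes:
the value is merchandise

Option C writes:
the value is "merchandise"

Option D writes:
the value is ""merchandise""

The %QUOTE and %NRQUOTE functions mask the following operators:

+ − * / < > = ¬ ^ ~ ; , # blank
AND OR NOT EQ NE LE LT GE GT IN
They also mask ' (single quotation mark) and " (double quotation mark) when they occur in matched pairs; an unmatched quotation mark must be marked with a preceding % (percent sign).

%NRQUOTE additionally masks & and %, which makes it especially useful when the argument may contain a macro variable reference or macro invocation that should not be resolved.
Option B can be corrected to either of:
%put the value is %quote("&product.");
%put the value is %sysfunc(quote(&product.));

QUOTE() adds double quotation marks around a string without removing blanks; by default it returns a value 200 bytes long.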

QUESTION 31

Given the SAS data sets:

WORK.ONE WORK.TWO

X Y SumY
--------- -----
A 10 36
A 3
A 14
B 9

The following SAS DATA step is submitted:

data WORK.COMBINE;
if _N_=1 then set WORK.TWO;
set WORK.ONE;
run;

What data values are stored in data set WORK.COMBINE?

A. An ERROR message is written to the SAS log and the data set WORK.COMBINE is not created.

B.
SumY X Y
---------------
36 A 10

C.
SumY X Y
---------------
36 A 10
. A 3
. A 14
. B 9

D.
SumY X Y
---------------
36 A 10
36 A 3
36 A 14
36 B 9

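Answer: D
Variables read with a SET statement are automatically retained. SumY is read on the first iteration of the DATA step and keeps its value, so every observation contains SumY=36.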
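
This pattern is commonly used to attach a summary value to every detail row. The sketch below rebuilds the data from the question; computing WORK.TWO with PROC SQL and the extra variable Pct are assumptions added for illustration.

```sas
data WORK.ONE;
   input X $ Y;
   datalines;
A 10
A 3
A 14
B 9
;
run;

/* One-row summary table: SumY = 10 + 3 + 14 + 9 = 36 */
proc sql;
   create table WORK.TWO as
   select sum(Y) as SumY
   from WORK.ONE;
quit;

data WORK.COMBINE;
   if _N_=1 then set WORK.TWO;   /* executes once; SumY is then retained */
   set WORK.ONE;                 /* executes on every iteration          */
   Pct = Y / SumY;               /* hypothetical use of the retained value */
run;
```

Without the `if _N_=1` condition, the step would stop after one observation, because the second read of WORK.TWO would hit the end of that data set.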

QUESTION 32

The following SAS program is submitted:

data WORK.NEW(BUFNO=4);
set WORK.OLD(BUFNO=3);
run;

Why are the BUFNO options used?

A. to reduce memory usage


B. to reduce CPU time usage
C. to reduce the amount of data read
D. to reduce the number of I/O operations

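Answer: D
A SAS buffer can be thought of as a container in memory that is just large enough to hold one page of data.

A page is the unit of data transfer between storage and memory; it is fixed once a data set has been created (the user can set it when the data set is created).

Increasing the page size (BUFSIZE=) reduces the number of I/O operations and therefore the execution time, at the cost of higher memory use.

PROC CONTENTS shows the page size of a data set.

In a 64-bit environment, an uncompressed data file has 40 bytes of overhead.
In a 32-bit environment, the overhead is 24 bytes.
Each observation has 1 bit of overhead (rounded up to the nearest byte), used to mark whether the observation is deleted.
The BUFSIZE= option controls the page size of a SAS data set, in bytes.
BUFSIZE=MIN|MAX|n;

MIN may cause unexpected results and should generally be avoided.
BUFSIZE=0 resets the page size to the operating environment default.
4K = 4096 bytes (powers of 2).

In some cases a larger page size reduces speed, especially for random access. To improve direct (random) access performance, consider changing the BUFSIZE= value.
The BUFSIZE= data set option overrides the BUFSIZE= system option.

The BUFNO= option controls the number of buffers available while reading or writing a SAS data set; changing the number of buffers controls how many pages are loaded into memory per I/O operation.

BUFNO=MIN|MAX|n;

The default is MIN, which lets the operating environment decide. MAX is the largest 4-byte signed integer, 2^31-1, about 2 billion.

The recommended maximum is 10. The BUFNO= data set option overrides the system option.

When reading a large data set, increasing the number of buffers will most likely not improve performance. By default, Windows and UNIX read one buffer at a time. In the Windows operating environment, the default can be changed by specifying the SGIO system option when SAS starts.

When reading a small data set, you can load the whole data set at once by allocating as many buffers as there are pages; this is most effective when the same observations are read repeatedly.

The BUFSIZE= and BUFNO= options have only a slight effect on CPU usage.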

QUESTION 33

Given the following program and desired results:

%let Thing1=gift;
%let Thing2=surprise;
%let Gift1=book;
%let Gift2=jewelry;
%let Surprise1=dinner;
%let Surprise2=movie;
%let Pick=2;
%let Choice=surprise;

Desired %PUT Results in LOG:


My favorite surprise is a movie

What is the correct %PUT statement that generates the desired results?

A. %put My favorite &Thing&Pick is a &&Choice&Pick;


B. %put My favorite &&Thing&Pick is a &&&Choice&Pick;
C. %put My favorite &Choice&Pick is a &&Thing&Pick;
D. %put My favorite &&Choice&pick is a &&&Thing&Pick;

Answer: B
Test program:

options symbolgen;
%let Thing1=gift;
%let Thing2=surprise;
%let Gift1=book;
%let Gift2=jewelry;
%let Surprise1=dinner;
%let Surprise2=movie;
%let Pick=2;
%let Choice=surprise;

%put My favorite &Thing&Pick is a &&Choice&Pick;


%put My favorite &&Thing&Pick is a &&&Choice&Pick;
%put My favorite &Choice&Pick is a &&Thing&Pick;
%put My favorite &&Choice&pick is a &&&Thing&Pick;

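Only the second %PUT statement writes the desired text: &&Thing&Pick resolves first to &Thing2 and then to surprise, while &&&Choice&Pick resolves first to &surprise2 and then to movie.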

QUESTION 34

Given the SAS dataset WORK.ONE


Name Salary

----------------
Hans 200
Maria 205
Jose 310
Ariel 523

The following SAS program is submitted:


proc sql;
[INSERT SELECT CLAUSE]
from WORK.ONE
;
quit;

The following output is desired:


Salary Bonus
---------------------
200 20
205 20.5
310 31
523 52.3

Which SQL procedure clause completes the program and generates the desired output?
A. select Salary Bonus as Salary*.10 as Bonus
B. select Salary Bonus=Salary*.10 'Bonus'
C. select Salary, Salary*.10 label='Bonus'
D. select Salary, Salary*.10 column="Bonus"

Answer: C
Option A can be corrected to:
select Salary, Salary*.10 as Bonus

QUESTION 35

The following SAS program is submitted:

options reuse=YES;
data SASUSER.REALESTATE(compress=CHAR);
set SASUSER.HOUSES;
run;

What is the effect of the reuse=YES SAS system option?


A. It allows updates in place.
B. It tracks and recycles free space.
C. It allows a permanently stored SAS data set to be replaced.
D. It allows users to access the same SAS data set concurrently.

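Answer: B
By default SAS always appends new observations to the end of a data set. If observations are deleted, the freed space is not reclaimed.

This option applies only to compressed data sets: SAS can detect and reuse the fragmented space left in the data set when observations are added and deleted.

When REUSE=YES is specified, SAS tracks and reuses that space.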

QUESTION 36

Which statement is true for Data step HASH objects?

A. The key component must be numeric.


B. The data component may consist of numeric and character values.
C. The HASH object is created in one step and referenced in another.
D. The HASH object must be smaller than 2 to the 8th power bytes.

Answer: B
Starting with SAS 9, the hash object is available in the DATA step.

The hash object provides an efficient, convenient mechanism for quickly storing and retrieving data.

Unlike an array, which uses consecutive integers to locate its elements, a hash object can use any combination of numeric and character values as its address. An array returns only one value at a time, whereas a hash object can return several values at once.

A hash object does not require the data to be sorted or indexed for lookup.

A hash object is a DATA step component object with attributes and methods.

Explanation: advantages and disadvantages of the hash object.
Advantages:
- Use of character and numeric keys (the basis for answer B)
- Use of composite keys
- Ability for faster lookup (generally the main purpose of a hash object)
- Ability to be loaded from a SAS data set
- Fine level of control and flexibility
- Ability to do chained lookups
Disadvantages (worth memorizing):
- Unique keys required
- DATA step only

Related: HASH provides an efficient, convenient way to store and retrieve data quickly by lookup key. HITER retrieves data from the hash object in forward or reverse key order.

A hash function can be pictured as value = hash(key), where each key should map to exactly one value.

data work.difference (drop= goalamount);
if _N_ = 1 then do;
declare hash goal( );
goal.definekey("QtrNum");
goal.definedata("GoalAmount");
goal.definedone( );
call missing(goalamount);
goal.add(key:'qtr1', data:10 );
goal.add(key:'qtr2', data:15 );
goal.add(key:'qtr3', data: 5 );
goal.add(key:'qtr4', data:15 );
end;
set sasuser.contrib;
goal.find();
Diff = amount - goalamount;
run;

DECLARE object object-name <(<argument_tag-1 : value-1 <, …argument_tag-n : value-n>>)>;

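Available objects:
- HASH: creates a hash object
- HITER: creates a hash iterator object, used to retrieve data from a hash object in ascending or descending key order.

DECLARE is an executable statement.

Keys are defined by passing the names of the key variables to the DEFINEKEY() method.

A key can consist of any number of numeric or character DATA step variables.
When all keys and data have been defined, call the DEFINEDONE() method.

CALL MISSING routine:
GoalAmount does not exist in the input data set and never appears on the left side of an assignment statement, so SAS would write a note that the variable is uninitialized.
Calling CALL MISSING with the key and data variables as arguments sets them to missing, which initializes the variables.

After the hash object has been created and its key and data variables initialized, the ADD() method fills it.

The FIND method retrieves data from the hash object. It returns a numeric code: 0 means success, any other value means failure.

If the key is in the hash object, FIND sets the data variables to the stored values.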

A hash object can also be loaded from an existing data set:


data work.report;
if _N_=1 then do;
if 0 then set sasuser.acities(keep=Code City Name);
declare hash airports (dataset: "sasuser.acities");
airports.definekey ("Code");
airports.definedata ("City", "Name");
airports.definedone();
end;
set sasuser.revenue;
rc=airports.find(key:origin);
if rc=0 then do;
OriginCity=city;
OriginAirport=name;
end;
else do;
OriginCity='';
OriginAirport='';
end;
rc=airports.find(key:dest);
if rc=0 then do;
DestCity=city;
DestAirport=name;
end;
else do;
DestCity='';
DestAirport='';
end;
run;

Non-executing SET statement:


if 0 then set sasuser.acities(keep=Code City Name);

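This initializes the hash object's variables with the attributes of the variables in the existing data set.

The SET statement above is compiled, so the PDV gets the corresponding variables, but it never executes. CALL MISSING is not needed in this case.

All variables in the data set can be defined as data variables: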
hashobject.DEFINEDATA (ALL: "YES");

QUESTION 37

Given the SAS data sets:

WORK.CLASS1 WORK.CLASS2
Name Course Name Class
-------------- ---------------
Lauren MATH1 Smith MATH2
Patel MATH1 Farmer MATH2
Chang MATH1 Patel MATH2
Chang MATH3 Hillier MATH2

The following SAS program is submitted:


proc sql;
select Name
from WORK.CLASS1
[INSERT SET OPERATOR]
select Name
from WORK.CLASS2
;
quit;

The following output is desired:


Name
-----
Chang
Chang
Lauren

Which SQL set operator completes the program and generates the desired output?
A. intersect corr
B. except all
C. intersect all
D. left except

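Answer: B
Note that the PROC SQL query selects only the Name column. EXCEPT returns the rows of the first query that are not in the second; with ALL, duplicate rows are not removed, so Chang appears twice.

Variant: EXCEPT ALL is already filled in and the question asks for the output: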


proc sql;
select Name
from WORK.CLASS1
except all
select Name
from WORK.CLASS2
;
quit;

Output:
Name
-----
Chang
Chang
Lauren
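
To compare EXCEPT with EXCEPT ALL directly, the two tables from the question can be rebuilt and both queries run side by side. The titles are added only to label the output.

```sas
data WORK.CLASS1;
   input Name $ Course $;
   datalines;
Lauren MATH1
Patel MATH1
Chang MATH1
Chang MATH3
;
run;

data WORK.CLASS2;
   input Name $ Class $;
   datalines;
Smith MATH2
Farmer MATH2
Patel MATH2
Hillier MATH2
;
run;

proc sql;
   title 'EXCEPT: duplicates removed (Chang, Lauren)';
   select Name from WORK.CLASS1
   except
   select Name from WORK.CLASS2;

   title 'EXCEPT ALL: duplicates kept (Chang, Chang, Lauren)';
   select Name from WORK.CLASS1
   except all
   select Name from WORK.CLASS2;
quit;
title;
```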

QUESTION 38

The following SAS program is submitted:

%macro CHECK(Num=4);
%let Result=%eval(&Num gt 5);
%put Result is &result;
%mend;
%check(Num=10)

What is written to the SAS log?


A. Result is 0
B. Result is 1
C. Result is 10 gt 5
D. Result is true

Answer: B
See Question 7. In %EVAL(), a logical expression that is true returns the character value 1.
Variant:
%macro CHECK(Num=10);
%let Result=%eval(&Num gt 5);
%put Result is &result;
%mend;
%check(Num=4)

The answer is then:
Result is 0

QUESTION 39

The following SAS program is submitted:

%let Mv=shoes;
%macro PRODUCT(Mv=bicycles);
%let Mv=clothes;
%mend;
%PRODUCT(Mv=tents)
%put Mv is &Mv;

What is written to the SAS log?


A. Mv is bicycles
B. Mv is clothes
C. Mv is shoes
D. Mv is tents

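Answer: C
The parameter Mv is local to the macro PRODUCT, so the %LET inside the macro changes only the local copy; the global Mv keeps the value shoes.
Variant: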
The following SAS program is submitted:
%let Mv=bicycles;
%macro PRODUCT(Mv=shoes);
%let Mv=clothes;
%mend;
%PRODUCT(Mv=tents)
%put Mv is &Mv;

What is written to the SAS log?


Mv is bicycles
QUESTION 40

Which of the following SAS System options can aid in benchmarking?

A. BUFSIZE= and BUFNO=


B. FULLSTIMER
C. IOBLOCKSIZE=
D. SYSTIMER

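Answer: B
Option A specifies the size and number of buffers.

Option B:
STIMER | NOSTIMER
  UNIX/Windows: STIMER is the default. Records CPU time and real time and writes them to the SAS log. In effect for the whole SAS session. Can be specified at startup or in an OPTIONS statement.
  z/OS: STIMER is the default. Records CPU time. In effect for the whole SAS session.
MEMRPT | NOMEMRPT
  UNIX/Windows: Cannot be set on its own; its function is part of the FULLSTIMER option.
  z/OS: MEMRPT is the default. Records memory usage. In effect for the whole SAS session. Can be specified at startup or in an OPTIONS statement.
FULLSTIMER | NOFULLSTIMER
  UNIX/Windows: Records all available resource statistics and writes them to the SAS log. In effect for the whole SAS session. Can be specified at startup or in an OPTIONS statement. Under Windows, some statistics are computed correctly only if FULLSTIMER is specified when SAS starts.
  z/OS: Records all available resource statistics and writes them to the SAS log. In effect for the whole SAS session. Can be specified at startup or in an OPTIONS statement. Under z/OS, FULLSTIMER is an alias for FULLSTATS. It must be used together with STIMER or MEMRPT; specified alone, it has no effect.
STATS | NOSTATS
  UNIX/Windows: Not available.
  z/OS: STATS is the default. Writes the report of the specified performance statistics to the SAS log. Can be specified at startup or in an OPTIONS statement.

Option C, IOBLOCKSIZE=, is a LIBNAME option or data set option. It determines the number of bytes transferred in one I/O operation; the larger IOBLOCKSIZE= is, the fewer I/O operations are needed.

Option D is a distractor.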

QUESTION 41

Given the following macro program:


%macro MAKEPGM(NEWNAME, SETNAME, PRINT);
data &NEWNAME;
set &SETNAME;
run;
%if &PRINT=YES %then %do;
proc print data=&NEWNAME.(obs=10);
run;
%end;
%mend;

Which option would provide feedback in the log about the parameter values passed into this macro when invoked?

A. MPRINT
B. MDEBUG
C. MLOGIC
D. MPARAM

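Answer: C
MPRINT:
As a macro executes, the text it sends to the compiler is written to the SAS log.

MLOGIC writes messages to the log that show:
- the beginning of macro execution
- the values of the parameters supplied at invocation
- the execution of each macro program statement
- whether each %IF condition is true or false
- the end of macro execution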
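
A sketch of how the two options might be switched on around a call to the macro from the question. The argument values WORK.CLASSCOPY and SASHELP.CLASS are chosen for illustration and are not part of the original question.

```sas
%macro MAKEPGM(NEWNAME, SETNAME, PRINT);
   data &NEWNAME;
      set &SETNAME;
   run;
   %if &PRINT=YES %then %do;
      proc print data=&NEWNAME.(obs=10);
      run;
   %end;
%mend;

/* MLOGIC traces macro execution, including parameter values and    */
/* the outcome of %IF; MPRINT shows the SAS code the macro generates */
options mlogic mprint;

%MAKEPGM(WORK.CLASSCOPY, SASHELP.CLASS, YES)

/* Switch the tracing off again to keep later logs short */
options nomlogic nomprint;
```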

QUESTION 42

The NOTSORTED option on the BY statement cannot be used with which other statement or option?

A. SET
B. MERGE
C. IF FIRST.by-variable
D. BY GROUPFORMAT by-variable

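Answer: B
Explanation: MERGE requires the data to be sorted by the BY variables first, so the NOTSORTED option cannot be used with MERGE.


The NOTSORTED option of the BY statement cannot be used with the MERGE or UPDATE statement.

NOTSORTED can appear anywhere in the BY statement.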

QUESTION 43

Given the SAS data set WORK.ONE:

Rep Cost
----- ----
SMITH 200
SMITH 400
JONES 100
SMITH 600
JONES 100

The following SAS program is submitted:

proc sql;
select Rep,
avg(Cost) as Average from WORK.ONE
[EITHER INSERT SQL WHERE CLAUSE]
group by Rep
[OR INSERT SQL HAVING CLAUSE]
;
quit;

The following output is desired:

Rep Average
----- -------
SMITH 400

Which SQL clause completes the program and generates the desired output?

A. where calculated Average > (select avg(Cost) from WORK.ONE)


B. having Average > (select avg(Cost) from WORK.ONE)
C. having avg(Cost) < (select avg(Cost) from WORK.ONE)
D. where avg(Cost) > (select avg(Cost) from WORK.ONE)

Answer: B
AVG(Cost)=(200+400+100+600+100)/5=1400/5=280
So the overall mean of the Cost column is 280.
The mean for SMITH is 400 and the mean for JONES is 100.

Variant
Output:
Rep Average
----- -------
JONES 100
The clause to choose is:
having avg(Cost) < (select avg(Cost) from WORK.ONE)
or
having Average < (select avg(Cost) from WORK.ONE)

QUESTION 44

Which dictionary table provides information on each occurrence of the variable named LastName?

A. DICTIONARY.TABLES
B. DICTIONARY.COLUMNS
C. DICTIONARY.MEMBERS
D. DICTIONARY.VARIABLES

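Answer: B
DICTIONARY tables are often used to monitor and manage a SAS session, because the information is easier to work with as data.

DICTIONARY tables are special read-only SAS tables; their information is updated automatically and is generated each time a table is referenced.


They contain information about:
- SAS libraries
- SAS macros
- external data files currently in use
- external data files available in the current SAS session
- SAS system options
- SAS titles and footnotes currently in effect

DICTIONARY table   SASHELP view(s)                   Information
Catalogs           Vcatalg                           Catalog entries
Columns            Vcolumn                           Details of variables and their attributes (name, type, length, format)
Extfiles           Vextfl                            Currently assigned filerefs
Indexes            Vindex                            Indexes defined for data files
Macros             Vmacro                            User-defined and system-defined macro information
Members            Vmember, Vsacces, Vscatlg,        General information about data library members
                   Vslib, Vstable, Vstabvw, Vsview
Options            Voption                           Current SAS system options
Tables             Vtable                            Details of data sets
Titles             Vtitle                            Text of the assigned titles and footnotes
Views              Vview                             General information about data views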

proc sql;
describe table dictionary.tables;
quit;

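Note in particular that library names (LIBNAME) in these tables are stored entirely in uppercase.

When a WHERE clause subsets these tables or views, a value being compared must match the case stored in the DICTIONARY table and must be enclosed in quotation marks.

The definition of DICTIONARY.TABLES is as follows (for a view, NOBS has no data):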


create table DICTIONARY.TABLES
(
libname char(8) label='Library Name',
memname char(32) label='Member Name',

memtype char(8) label='Member Type',
dbms_memtype char(32) label='DBMS Member Type',
memlabel char(256) label='Data Set Label',
typemem char(8) label='Data Set Type',
crdate num format=DATETIME informat=DATETIME label='Date Created',
modate num format=DATETIME informat=DATETIME label='Date Modified',
nobs num label='Number of Physical Observations',
obslen num label='Observation Length',
nvar num label='Number of Variables',
protect char(3) label='Type of Password Protection',
compress char(8) label='Compression Routine',
encrypt char(8) label='Encryption',
npage num label='Number of Pages',
filesize num label='Size of File',
pcompress num label='Percent Compression',
reuse char(3) label='Reuse Space',
bufsize num label='Bufsize',
delobs num label='Number of Deleted Observations',
nlobs num label='Number of Logical Observations',
maxvar num label='Longest variable name',
maxlabel num label='Longest label',
maxgen num label='Maximum number of generations',
gen num label='Generation number',
attr char(3) label='Data Set Attributes',
indxtype char(9) label='Type of Indexes',
datarep char(32) label='Data Representation',
sortname char(8) label='Name of Collating Sequence',
sorttype char(4) label='Sorting Type',
sortchar char(8) label='Charset Sorted By',
reqvector char(24) format=$HEX48 informat=$HEX48 label='Requirements Vector',
datarepname char(170) label='Data Representation Name',
encoding char(256) label='Data Encoding',
audit char(3) label='Audit Trail Active?',
audit_before char(3) label='Audit Before Image?',
audit_admin char(3) label='Audit Admin Image?',
audit_error char(3) label='Audit Error Image?',
audit_data char(3) label='Audit Data Image?',
num_character num label='Number of Character Variables',
num_numeric num label='Number of Numeric Variables'
);

The definition of DICTIONARY.COLUMNS is:
create table DICTIONARY.COLUMNS
(
libname char(8) label='Library Name',
memname char(32) label='Member Name',
memtype char(8) label='Member Type',
name char(32) label='Column Name',
type char(4) label='Column Type',
length num label='Column Length',
npos num label='Column Position',
varnum num label='Column Number in Table',
label char(256) label='Column Label',
format char(49) label='Column Format',
informat char(49) label='Column Informat',
idxusage char(9) label='Column Index Type',
sortedby num label='Order in Key Sequence',
xtype char(12) label='Extended Type',
notnull char(3) label='Not NULL?',
precision num label='Precision',
scale num label='Scale',
transcode char(3) label='Transcoded?'
);
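
Using the columns listed in the definition above, every occurrence of the variable LastName in the assigned libraries could be found with a query along these lines. UPCASE is applied because variable names may be stored in mixed case.

```sas
proc sql;
   select libname, memname, name, type, length
   from dictionary.columns
   where upcase(name) = 'LASTNAME';
quit;
```

Adding a condition such as `libname = 'SASUSER'` (in uppercase, as stored) restricts the search to one library and usually makes the query faster.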
QUESTION 45

To create a list of unique Customer_Id values from the customer UNSORTED data set, which of
the following techniques can be used?

technique 1: proc SORT with NODUPKEY and OUT=


technique 2: data step with IF FIRST.Customer_Id=1
technique 3: proc SQL with the SELECT DISTINCT statement

A. only technique 1
B. techniques 1 and 2
C. techniques 1 and 3
D. techniques 1, 2, or 3

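Answer: C
Explanation: technique 2 requires the data to be sorted first, because FIRST. variables depend on BY-group processing.


The options may appear in a different order:
technique 1: proc SORT with NODUPKEY and OUT=
technique 2: proc SQL with the SELECT DISTINCT statement
technique 3: data step with IF FIRST.Customer_Id=1

Variant: an unsorted data set is given, unique values are wanted, and the options are similar.

Test program: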
data customer;
infile datalines;
input Customer_ID $ Amount;
datalines;
A112 105
A112 352
A005 930
B152 530
C592 952
T152 742
H125 656
A005 110
C592 150
B152 220
T152 110
;
run;
title 'Customer Raw';
proc print data=customer;
run;

proc sort data=customer nodupkey out=customer_1(keep=customer_id);


by Customer_ID;
run;
title 'Customer 1';
proc print data=customer_1;
run;

data customer_2;
set work.customer(keep=customer_id);
by customer_id;
if first.customer_id then output;
run;
title 'Customer 2';
proc print data=customer_2;
run;

proc sql;
title 'Customer 3';
select distinct customer_id
from work.customer;
quit;

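Results:

PROC SORT: produces the list of unique Customer_ID values.

FIRST.Customer_ID: the data must be sorted by the BY variable or have a suitable index. The data set is not sorted, so the step fails.

PROC SQL: produces the list of unique Customer_ID values.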

QUESTION 46
Related to Question 37.
Given the SAS data sets:

WORK.CLASS1 WORK.CLASS2
Name Course Name Class
-------------- ---------------
Lauren MATH1 Smith MATH2
Patel MATH1 Farmer MATH2
Chang MATH1 Patel MATH2
Hillier MATH2

The following SAS program is submitted:


proc sql;
select Name
from WORK.CLASS1
[INSERT SET OPERATOR]
select Name
from WORK.CLASS2
;
quit;

The following output is desired:


Name
-----
Chang
Lauren

Which SQL set operator completes the program and generates the desired output?
A. intersect corr
B. except

C. intersect
D. left except

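Answer: B
Note that the PROC SQL query selects only the Name column, so adding CORR makes no difference to the result.
The difference from Question 37 is that WORK.CLASS1 has one fewer duplicate row.

Variant:
The following output is desired:
Name
-----
Patel

The answer is then intersect, intersect corr, intersect all, or intersect all corr.


Any form of INTERSECT works.
Code: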
data class1;
infile datalines;
input name $ course $;
datalines;
Lauren MATH1
Patel MATH1
Chang MATH1
;
run;
data class2;
infile datalines;
input name $ course $;
datalines;
Smith MATH2
Farmer MATH2
Patel MATH2
Hillier MATH2
;
run;

proc sql;
select name
from class1
intersect
select name
from class2
;
quit;
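
The test code above covers only the INTERSECT variant. As a small sketch of the original question, the same two data sets can be queried with EXCEPT, which keeps the names in CLASS1 that do not appear in CLASS2; with the data above that should list Chang and Lauren.

```sas
/* Assumes CLASS1 and CLASS2 were created by the DATA steps above. */
proc sql;
   select name
      from class1
   except              /* rows of the first query not found in the second */
   select name
      from class2
   ;
quit;
```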

QUESTION 47

The following SAS program is submitted:

%macro execute;
[INSERT STATEMENT HERE]
proc print data=SASUSER.HOUSES;
run;
%end;
%mend execute;
%execute

Which statement completes the program so that the PROC PRINT step executes on Thursday?

A. if &sysday = Thursday then %do;


B. %if &sysday = Thursday %then %do;
C. %if "&sysday" = Thursday %then %do;
D. %if &sysday = "Thursday" %then %do;

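Answer: B
Macro statements need the % prefix, so A is wrong.
Macro variable values are plain text and are compared without quotation marks. In C and D the quotation marks become part of the compared text, so both are wrong.
&SYSDAY holds the day of the week on which the current SAS session started.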
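
A sketch of the completed program with option B in place. It assumes SASUSER.HOUSES exists, as in the question; the extra %PUT line is only an illustration added here to show the value being compared.

```sas
%macro execute;
   /* Text comparison: both sides are unquoted macro text */
   %if &sysday = Thursday %then %do;
      proc print data=SASUSER.HOUSES;
      run;
   %end;
%mend execute;

%execute

/* Illustration only: show the value that the %IF condition tested */
%put Session started on: &sysday;
```

Because &SYSDAY is set when the session starts, a session opened on Wednesday and left running overnight would still not print on Thursday.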

QUESTION 48

Given the following program and data:

data WORK.BDAYINFO;
infile datalines;
input Name $ Birthday : mmddyy10.;
datalines;
Alan 11/15/1950
Barb 08/23/1966
Carl 09/01/1963
;
run;
%let Want=23AUG1966;
proc print data=WORK.BDAYINFO;
[INSERT STATEMENT]
run;

What is the WHERE statement that successfully completes the PROC PRINT and selects the observation for Barb?

A. where Birthday=&Want;
B. where Birthday="&Want";
C. where Birthday="&Want"d;
D. where Birthday='&Want'd;

Answer: C
Variant: the macro variable name Want is replaced with a different word.
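
A sketch of why the date-constant form works, assuming WORK.BDAYINFO from the program above. Double quotation marks let &Want resolve, and the trailing d turns the text into a SAS date constant. Without quotation marks the statement becomes where Birthday=23AUG1966; which is not valid syntax; with quotation marks but no d, a numeric variable is compared with a character string; with single quotation marks the macro reference is never resolved. The FORMAT statement is an addition for readability and is not part of the question.

```sas
%let Want=23AUG1966;

proc print data=WORK.BDAYINFO;
   where Birthday="&Want"d;   /* resolves to "23AUG1966"d, a date constant */
   format Birthday date9.;    /* added for readability only */
run;
```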

QUESTION 49

Which macro statement would remove the macro variable Mv_Info from the symbol table?

A. %mdelete &Mv_Info;
B. %symerase Mv_Info;
C. %symdel &Mv_Info;

D. %symdel Mv_Info;

Answer: D
%symdel takes the macro variable name itself, so Mv_Info must not be resolved with an ampersand.
[Explanation] Syntax: %SYMDEL macro-variable(s) </option>;
macro-variable(s) is the name of one or more macro variables or a text expression that generates one or more macro variable names. You cannot use a SAS variable list or a macro expression that generates a SAS variable list in a %SYMDEL statement.
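
A minimal sketch of the deletion, using %SYMEXIST to check the symbol table before and after. The value demo is a made-up placeholder. Note that %symdel &Mv_Info; would instead try to delete a macro variable named by the value of Mv_Info (here, one called demo).

```sas
%let Mv_Info=demo;                          /* demo is a placeholder value */

%put Exists before: %symexist(Mv_Info);     /* writes 1 */
%symdel Mv_Info;                            /* name only, no ampersand */
%put Exists after: %symexist(Mv_Info);      /* writes 0 */
```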

QUESTION 50

The table WORK.PILOTS contains the following data:

WORK.PILOTS
Id Name Jobcode Salary
-----------------------------
001 Albert PT1 50000
002 Brenda PT1 70000
003 Carl PT1 60000
004 Donna PT2 80000
005 Edward PT2 90000
006 Flora PT3 100000

A query was constructed to display the pilot salary means at each level of Jobcode and the difference to the overall mean salary:

Jobcode Average Difference


-------------------------------
PT1 60000 -15000
PT2 85000 10000
PT3 100000 25000

Which select statement could NOT have produced this output?

A.
select
Jobcode,
avg(Salary) as Average,
calculated Average - Overall as difference
from
WORK.PILOTS,
(select avg(Salary) as Overall from WORK.PILOTS)
group by jobcode
;

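B.
select
Jobcode,
avg(Salary) as Average,
(select avg(Salary) from WORK.PILOTS) as Overall,
calculated Average - Overall as Difference
from WORK.PILOTS
group by 1
;

C.
select
Jobcode,
Average,
Average - Overall as Difference
from
(
select Jobcode, avg(Salary) as Average
from WORK.PILOTS
group by 1
),
(select avg(Salary) as Overall from WORK.PILOTS)
;

D.
select
Jobcode,
avg(Salary) as Average,
calculated Average-(select avg(Salary) from WORK.PILOTS) as Difference
from WORK.PILOTS
group by 1
;

Answer: B
This question tests the CALCULATED keyword. A column computed in the SELECT list must be prefixed with CALCULATED when it is referenced later in the same query; otherwise SAS reports that the column was not found. In B, Overall is computed in the SELECT list but referenced without CALCULATED.

On whether to use CALCULATED: every column that comes from the FROM clause, whether it was computed in an in-line view or already existed in the source data, is treated as an ordinary column and can be referenced directly in the SELECT clause. A column that is computed in the SELECT clause itself needs CALCULATED.

[Related] If option A is changed to the following:
proc sql;
select Jobcode,
avg(Salary) as Average,
Average - Overall as difference
from
WORK.PILOTS,
(select avg(Salary) as Overall from WORK.PILOTS)
group by jobcode ;
quit;
it also fails, mainly because the "Average" column referenced in the SELECT clause cannot be resolved.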
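
A runnable sketch of option D, one of the forms that works, with the table from the question typed in as a DATA step. Reading Id as character is an assumption made here to keep the leading zeros. Average is computed in the SELECT list, so the later reference needs CALCULATED; the overall mean comes from a scalar subquery and needs no alias.

```sas
/* Data copied from the question; Id read as character (assumption) */
data WORK.PILOTS;
   input Id $ Name $ Jobcode $ Salary;
   datalines;
001 Albert PT1 50000
002 Brenda PT1 70000
003 Carl PT1 60000
004 Donna PT2 80000
005 Edward PT2 90000
006 Flora PT3 100000
;
run;

proc sql;
   select Jobcode,
          avg(Salary) as Average,
          /* CALCULATED is required: Average is defined in this SELECT list */
          calculated Average - (select avg(Salary) from WORK.PILOTS) as Difference
      from WORK.PILOTS
      group by Jobcode
   ;
quit;
```

The overall mean is 450000/6 = 75000, which gives the differences -15000, 10000, and 25000 shown in the question.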

QUESTION 51

The SAS data set WORK.TEMP is indexed on the variable Id:

Id Amount
---------
P 52
P 45
A 13
A 56
R 34
R 12
R 78

The following SAS program is submitted:


proc print data=WORK.TEMP;
[INSERT BY STATEMENT]
run;

Which BY statement completes the program, creates a listing report that is grouped by Id, and completes without errors?
A. by Id;
B. by Id grouped;
C. by Id descending;
D. by descending Id;

Answer: A

Variant:
The following SAS program is submitted:

proc print data=WORK.TEMP;
by Id;
run;

A. Stops executing because the data set is not in ascending order.
B. Stops executing because the data set is not in descending order.
C. Executes without problems and generates output.
D. Executes only if the index=USE option is on.

Answer: C

Test program:
data temp;
infile datalines;
input id $ amount;
datalines;
P 52
P 45
A 13
A 56
R 34
R 12
R 78
;
run;
proc sql;
create index id on work.temp(id);
quit;

proc print data=work.temp;


by descending id;
run;

proc print data=work.temp;


by id;
run;
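For option D, BY DESCENDING ID produces an error and the output is wrong, as shown here:

Correct output for option A:

Note that if the data set is neither sorted nor indexed on Id, both by Id; and by descending Id; fail.

The output is as follows: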

QUESTION 52

To create a dataset with unique values of a given variable using a data step and the FIRST.VARIABLES and LAST.VARIABLES, it is assumed that the input dataset is:

A. sorted on that variable.


B. indexed by that variable.
C. naturally in order.
D. any of the above A, B, or C

Answer: D
See SAS 9.4 Statements Reference, 5th ed. All three cases were tested in code, and each produces the desired data set.

Test code:
* Raw Data File;
data bytest_raw;
infile datalines;
input id$ amount;
datalines;
B 52
C 12
A 13
B 34
C 45
A 56
C 78
;
run;
title 'Raw Data';
proc print data=bytest_raw;
run;

* Sorted on that variable;


proc sort data=bytest_raw out=bytest_sorted;
by id;
run;
data bytest_sorted_output;
set bytest_sorted;
by id;
if first.id then output;
if last.id then output;
run;
title 'Sorted on that variable';
proc print data=bytest_sorted_output;
run;

* Indexed by that variable;


data bytest_index (index=(id));
set bytest_raw;
run;

data bytest_index_output;
set bytest_index;
by id;
if first.id then output;
if last.id then output;
run;
title 'Indexed by that variable';
proc print data=bytest_index_output;
run;

* Naturally in order;
data bytest_naturalorder;
infile datalines;
input id$ amount;
datalines;
A 13
A 56
B 52
B 34
C 12
C 45
C 78
;
run;
data bytest_naturalorder_output;
set bytest_naturalorder;
by id;
if first.id then output;
if last.id then output;
run;
title 'Naturally in order';
proc print data=bytest_naturalorder_output;
run;

title '';

QUESTION 53

The SASFILE statement requests that a SAS data set be opened and loaded into memory:

A. One page at a time.


B. One variable at a time.
C. One observation at a time.
D. In its entirety, if possible.

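Answer: D
Variant:
The SASFILE statement:
Advantages: reduces I/O and CPU usage
Disadvantage: increases memory usage

General form of the SASFILE statement:
SASFILE SAS-data-file <(password-option(s))> OPEN | LOAD | CLOSE;

The SASFILE statement opens the SAS data file, allocates enough buffers, and then reads the entire file into memory.

Once the file has been read into memory, it stays there until:
- a SASFILE CLOSE statement is encountered, which frees the buffers and closes the file
- the SAS session ends, which automatically frees the buffers and closes the file

A file that is held in memory by SASFILE cannot be replaced, and its variables cannot be renamed.
For example: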
sasfile company.sales load;
proc print data=company.sales;
var Customer_Age_Group;
run;
proc tabulate data=company.sales;
class Customer_Age_Group;
var Customer_BirthDate;
table Customer_Age_Group,Customer_BirthDate*(mean median);
run;
sasfile company.sales close;

QUESTION 54

The following SAS program is submitted:


%let Name1=Shoes;
%let Name2=Clothes;
%let Root=name;
%let Suffix=2;
%put &&&Root&Suffix;

What is written to the SAS log?

A. &Name2
B. Clothes
C. &&&Root&Suffix
D. WARNING: Apparent symbolic reference ROOT2 not resolved.

Answer: B
Variant:
%let Name1=MATH1;
%let Name2=MATH3;
%let Root=name;
%let Suffix=2;
%put &&&Root&Suffix;
Output:
MATH3

Diagram of the two kinds of indirect macro variable reference resolution:
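
The resolution can be traced by hand. A sketch using the values from the question: on the first scan, && becomes &, &Root becomes name, and &Suffix becomes 2, which leaves &name2; the second scan resolves &name2 to Clothes. With only two ampersands (&&Root&Suffix), the first scan leaves &Root2, which does not exist, and that is where the warning in option D would come from. SYMBOLGEN is switched on here only to make each step visible in the log.

```sas
%let Name1=Shoes;
%let Name2=Clothes;
%let Root=name;
%let Suffix=2;

/* Scan 1: && -> &, &Root -> name, &Suffix -> 2, leaving &name2 */
/* Scan 2: &name2 -> Clothes                                    */
%put Three ampersands: &&&Root&Suffix;

/* Show every resolution step in the log */
options symbolgen;
%put &&&Root&Suffix;
options nosymbolgen;
```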

QUESTION 55

Given the SAS data sets:

WORK.ONE                WORK.TWO
Year  Qtr  Budget       Year  Qtr  Sales
----  ---  ------       ----  ---  -----
2001  3    500          2001  4    300
2001  4    400          2002  1    600
2003  1    350

The following SAS program is submitted:

proc sql;
select TWO.*, budget
from
WORK.ONE
[INSERT JOIN OPERATOR]
WORK.TWO
on ONE.Year=TWO.Year
;
quit;

The following output is desired:

Year Qtr Sales Budget


---- --- ----- ------
2001 4 300 500
2001 4 300 400
2002 1 600 .
. . . 350

Which join operator completes the program and generates the desired output?

A. left join
B. right join
C. full join
D. outer join

Answer: C
Pay particular attention to TWO.* in the SELECT clause.
Result of running option D:
ERROR 73-322: Expecting an UNION.
ERROR 76-322: Syntax error, statement will be ignored.

Variant:
LEFT JOIN:
Year Qtr Sales Budget
---- --- ----- ------
2001 4 300 500
2001 4 300 400
. . . 350

Test program:
data work.one;
infile datalines;
input Year Qtr Budget;
datalines;
2001 3 500
2001 4 400
2003 1 350
;
run;

data work.two;
infile datalines;
input Year Qtr Sales;
datalines;
2001 4 300
2002 1 600
;
run;

title 'Left Join';


proc sql;
select TWO.*, budget
from WORK.ONE
LEFT JOIN
WORK.TWO
on ONE.Year=TWO.Year
;
quit;

title 'Right Join';


proc sql;
select TWO.*, budget
from WORK.ONE
Right JOIN
WORK.TWO
on ONE.Year=TWO.Year
;
quit;

title 'Full Join';


proc sql;
select TWO.*, budget
from WORK.ONE
Full JOIN
WORK.TWO
on ONE.Year=TWO.Year
;
quit;

title ' ';

proc sql;
select TWO.*, budget
from
WORK.ONE [INSERT JOIN OPERATOR] WORK.TWO on ONE.Year=TWO.Year;
quit;
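The results are as follows:

LEFT JOIN: columns from the right table are set to missing where there is no match.
RIGHT JOIN: columns from the left table are set to missing where there is no match.
FULL JOIN: columns from both tables can contain missing values.
Note in particular that this question lists the right table's columns on the left side of the output.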

QUESTION 56

data _null_;
set WORK.ADDRESSES;
[INSERT STATEMENT]
put "filename mail email '" Email_Address "'; ";
put "data _null_;";
put " file mail;";
put " put 'Thank you for your continued';";
put " put 'support of The XYZ Corporation.';";
put " put 'We appreciate your patronage.';";
put " put 'Sincerely,';" ;
put " put 'The XYZ Corporation';";
put "run;" ;
run;

Which statement completes the program and creates a SAS program file?
A. infile 'c:\email.sas';
B. output 'c:\email.sas';
C. file 'c:\email.sas';
D. None of the above.

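Answer: C
INFILE specifies an external file to read from, and OUTPUT writes observations to a data set.
The FILE statement specifies the external file that subsequent PUT statements write to.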

QUESTION 57

Which of the following is true about the COMPRESS=YES data set option?

A. It uses the Ross Data Compression method to compress numeric data.


B. It is most effective with character data that contains repeated characters.
C. It is most effective with numeric data that represents large numeric values.
D. It is most effective with character data that contains patterns, rather than simple repetitions.

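Answer: B
YES | CHAR
Uses RLE (Run Length Encoding), which reduces consecutive repeated characters (including blanks) to two- or three-byte representations. RLE is very effective on data with runs of repeated characters, and also on numeric data in which most values are 0.

BINARY
Uses RDC (Ross Data Compression), which combines run-length encoding with sliding-window compression. It is effective on data that contains large blocks of binary data (numeric variables).
BINARY is more effective for observations that are several hundred bytes or longer.
BINARY is more effective than YES | CHAR for character data that contains patterns rather than simple repetitions.

Files compressed with BINARY take more CPU time than files compressed with YES | CHAR.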
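
A small sketch of how the two settings are requested as data set options. SASHELP.CLASS is used only because it ships with SAS, and CLASS_CHAR and CLASS_BIN are hypothetical names. On a table this small the log may report that compression was disabled or that the file grew, so treat the log notes as an illustration of where to look rather than a benchmark.

```sas
/* RLE compression: COMPRESS=YES and COMPRESS=CHAR are the same setting */
data work.class_char(compress=yes);
   set sashelp.class;
run;

/* RDC compression */
data work.class_bin(compress=binary);
   set sashelp.class;
run;

/* PROC CONTENTS shows the Compressed attribute of each data set */
proc contents data=work.class_char;
run;
proc contents data=work.class_bin;
run;
```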

QUESTION 58

Given the SAS dataset WORK.ONE:

Salary
------
200
205
.
523

The following SAS program is submitted:

proc sql;
select *
from WORK.ONE
[INSERT WHERE CLAUSE]
;
quit;

The following output is desired:

Salary
------
200
205
523

Which WHERE expression completes the program and generates the desired output?

A. where Salary is not .


B. where Salary ne missing
C. where Salary ne null
D. where Salary is not missing

Answer: D
IS MISSING and IS NULL are equivalent tests for missing values:
WHERE AnyColumn IS MISSING
WHERE AnyColumn IS NULL
WHERE NumColumn = .
WHERE CharColumn = ' '

To select observations without missing values:
WHERE AnyColumn IS NOT MISSING
WHERE AnyColumn IS NOT NULL
WHERE NumColumn ne .
WHERE NumColumn ^= .
WHERE CharColumn ne ' '
WHERE CharColumn ^= ' '
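
A runnable sketch of the answer, with the four Salary values from the question entered as a DATA step. The query should return 200, 205, and 523.

```sas
/* Data copied from the question */
data WORK.ONE;
   input Salary;
   datalines;
200
205
.
523
;
run;

proc sql;
   select *
      from WORK.ONE
      where Salary is not missing   /* IS NOT NULL is equivalent */
   ;
quit;
```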

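PROC SQL comparison operators supported by SAS:

Operator  Mnemonic  Description
=         EQ        equal to
^=        NE        not equal to
<         LT        less than
<=        LE        less than or equal to
>         GT        greater than
>=        GE        greater than or equal to

SAS macro operators:

Priority  Operator  Mnemonic  Description
1         **                  exponentiation
2         +                   positive prefix
2         -                   negative prefix
3         ^         NOT       logical not
4         *                   multiplication
4         /                   division
5         +                   addition
5         -                   subtraction
6         <         LT        less than
6         <=        LE        less than or equal to
6         =         EQ        equal to
6         #         IN        equal to one of a list
6         ^=        NE        not equal to
6         >=        GE        greater than or equal to
6         >         GT        greater than
7         &         AND       logical and
8         |         OR        logical or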

QUESTION 59

The SAS data set WORK.TEST has an index on the variable Id and the following SAS program is
submitted.

data WORK.TEST;
set WORK.TEST(keep=Id Var_1 Var_2 rename=(Id=Id_Code));
Total=sum(Var_1, Var_2);
run;

Which describes the result of submitting the SAS program?

A. The index on Id is deleted.


B. The index on Id is updated as an index on Id_Code.
C. The index on Id is deleted and an index on Id_Code is created.
D. The index on Id is recreated as an index on Id_Code.
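
Answer: A

One way to keep the index:
PROC DATASETS LIB=WORK NOLIST;
MODIFY TEMP;
RENAME ID=ID_CODE;
QUIT;

A further simple experiment: create a new data set whose SET statement reads the indexed data set and does nothing else, then check the result with PROC CONTENTS. The new data set still has no index. This shows that losing the index has nothing to do with the RENAME= option: reading a data set with a SET statement in a DATA step never copies its indexes to the new data set. The behavior described in the documentation applies only to PROC DATASETS, where changing a variable name with the RENAME statement automatically updates the related index.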
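
A sketch of the PROC DATASETS route on a small stand-in for WORK.TEST. The two data rows are made up for illustration. After the rename, the PROC CONTENTS listing should still show an index, now on Id_Code.

```sas
/* Stand-in for WORK.TEST with a simple index on Id (values are made up) */
data work.test(index=(Id));
   input Id Var_1 Var_2;
   datalines;
1 10 20
2 30 40
;
run;

/* Rename in place: the data set is not rebuilt, so the index survives */
proc datasets lib=work nolist;
   modify test;
   rename Id=Id_Code;
quit;

/* Check the Indexes section of the output */
proc contents data=work.test;
run;
```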

QUESTION 60

Given the data set SASHELP.CLASS:

Name Age
-----------
Mary 15
Philip 16
Robert 12
Ronald 15

The following SAS program is submitted:

%macro MP_ONE(pname=means);
proc &pname data=SASHELP.CLASS;
run;
%mend;

%MP_ONE(print)
%MP_ONE()

Which PROC steps execute successfully?

A. PROC MEANS only


B. PROC PRINT only

C. PROC MEANS and PROC PRINT
D. No PROC steps execute successfully

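Answer: A
The log from running %MP_ONE(print) is as follows:

Because the macro is defined with a KEYWORD parameter, a value can only be passed in the form keyword=value, unlike in many other languages.

The program can be changed as follows so that both PROC PRINT and PROC MEANS run correctly: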
%macro MP_ONE(pname=means);
proc &pname data=SASHELP.CLASS;
run;
%mend;
%MP_ONE(pname=print)
%MP_ONE()

QUESTION 61

In a data step merge, the BY variables in all data sets must have the same:

A. name.
B. name and type.
C. name and length.
D. name, type, and length.

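Answer: B
Prep Guide, p. 520
A BY variable can be renamed in the DATA step with the RENAME= data set option.
BY variables must have the same name and type, but not necessarily the same length. If the lengths differ, the length is taken from the first data set listed.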
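
A sketch of a merge where the key has a different name and length in the second data set. The data set names, values, and lengths are made up for illustration. RENAME= aligns the name, the type is character in both, and the merged Id should take its length (4) from WORK.A because it is listed first; SAS is expected to warn in the log about the differing BY variable lengths.

```sas
/* Hypothetical inputs, already in Id order */
data work.a;
   length Id $ 4;
   input Id $ X;
   datalines;
A1 1
B2 2
;
run;

data work.b;
   length Code $ 8;
   input Code $ Y;
   datalines;
A1 10
B2 20
;
run;

data work.ab;
   merge work.a
         work.b(rename=(Code=Id));   /* same name and type as in WORK.A */
   by Id;
run;

proc contents data=work.ab;
run;
```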

QUESTION 62

Given the following macro program and invocation:

%macro MAKEPGM(NEWNAME, SETNAME);


data &NEWNAME;
set &SETNAME;
run;
%put ->(!!) inside macro &NEWNAME &SETNAME;
%mend;
%MAKEPGM(WORK.NEW, SASHELP.CLASS)
%put ->(!!) outside invocation &NEWNAME &SETNAME;

Which of these choices shows the correct %PUT statement output if the program is submitted at the beginning of a new SAS session?
Note that other lines may be written to the SAS log by the program but only the %PUT output is shown here.
A.
->(!!) inside macro WORK.NEW SASHELP.CLASS
->(!!) outside invocation WORK.NEW SASHELP.CLASS
B.
->(!!) inside macro WORK.NEW SASHELP.CLASS
->(!!) outside invocation &NEWNAME &SETNAME
C.
->(!!) inside macro &NEWNAME &SETNAME
->(!!) outside invocation WORK.NEW SASHELP.CLASS
D.
->(!!) inside macro &NEWNAME &SETNAME
->(!!) outside invocation &NEWNAME &SETNAME

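Answer: B
NEWNAME and SETNAME are macro parameters and are therefore local to MAKEPGM. They resolve inside the macro but no longer exist once it has finished, so the outer %PUT writes the references unresolved.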

QUESTION 63

The following SAS program is submitted:

%macro COLS1;
Name Age;
%mend;

%macro COLS2;
Height Weight;
%mend;

proc print data=SASHELP.CLASS;


[INSERT VAR STATEMENT HERE]
run;

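Which VAR statement successfully completes the program to produce a report containing four variables?
A. var %COLS1 %COLS2;
B. var %COLS1-%COLS2;
C. var %COLS1 Weight Height;
D. var Weight Height %COLS1;

Answer: D
Note the semicolons inside the macro definitions: each macro writes a semicolon when it is invoked. The trailing semicolon in option D is therefore not actually needed, and the statement can be written as:
var Weight Height %COLS1

[Related] If the macros are changed to:
%macro COLS1;
Name Age
%mend;
%macro COLS2;
Height Weight
%mend;
the answer is A or C.
If the macros are changed to:
%macro COLS1;
Name Age;
%mend;
%macro COLS2;
Height Weight
%mend;
the answer can be D.
The reason is simple: the macro resolves to text that includes a semicolon, and a semicolon terminates a SAS statement, so the statement ends at that point.

Variant:
The code in the question becomes:
%macro COLS1;
Name Age;
%mend;
%macro COLS2;
Height Weight;
%mend;
proc print data=SASHELP.CLASS;
var Weight Height %COLS1;
run;
Question: which variables are output?
Choose: Weight Height Name Age
Pay close attention to the order of the variables; Weight and Height must be in the right positions.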

=========================================================================================

Updating a table through KEY=:
data indexlib.prodfile;
set indexlib.tranfile;
modify indexlib.prodfile key=seqnum;
select (_iorc_);
when(%sysrc(_sok)) do; /* A match was found, update master */
actual = newactual;
predict = newpredict;
replace;
end;
when (%sysrc(_dsenom)) do; /* No match was found */
_error_ = 0;
end;
otherwise do;
length errormessage $200.;
errormessage = iorcmsg();
put "ATTENTION: unknown error condition: "
errormessage;
end;
end;
run;

New Questions
New Question 1

%let this_year=%substr(&sysdate9, 6);


%let next_year=&this_year+1;
%let check_year=%eval(&next_year<2016);
%put two years after this year is &next_year+1;
%put check_year is &check_year;

Assume system time is 01Jan2013, what is the output?

two years after this year is 2013+1+1


check_year is 1

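Note that &check_year is evaluated from &next_year<2016, that is, 2013+1<2016.

The "two years after" text has another +1 appended when it is printed, so the value that is printed and the value used in the comparison are not the same year.

a) A program that simulates 01Jan2013: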
%let this_year=2013;
%let next_year=&this_year+1;
%let check_year=%eval(&next_year<2016);
%put two years after this year is &next_year+1;
%put check_year is &check_year;
Output:
two years after this year is 2013+1+1
check_year is 1

b) A program that simulates 01Jan2014:
%let this_year=2014;
%let next_year=&this_year+1;
%let check_year=%eval(&next_year<2016);
%put two years after this year is &next_year+1;
%put check_year is &check_year;
Output:
two years after this year is 2014+1+1
check_year is 1

c) A program that simulates 01Jan2015:
%let this_year=2015;
%let next_year=&this_year+1;
%let check_year=%eval(&next_year<2016);
%put two years after this year is &next_year+1;
%put check_year is &check_year;
Output:
two years after this year is 2015+1+1
check_year is 0
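
The behavior above follows from %LET storing text: 2013+1 is kept as the characters 2013+1 and is only evaluated where %EVAL (or an implicit evaluation such as %IF) sees it. A short sketch contrasting the two; next_text and next_num are hypothetical names used only here.

```sas
%let this_year=2013;

%let next_text=&this_year+1;          /* stored as the text 2013+1 */
%let next_num=%eval(&this_year+1);    /* evaluated and stored as 2014 */

%put next_text=&next_text next_num=&next_num;
/* Log: next_text=2013+1 next_num=2014 */
```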

New Question 2

%let a=1;
%let b=2;
%macro test;
%let c=4;
%do i=1 %to 3;
%let d&i=123&i;
%end;
%put ______;
%mend;
%let c=3;
%test;

Output:
GLOBAL a 1
GLOBAL b 2
GLOBAL c 4

Asked: how to write the global macro variables to the log.
Fill in the blank: _GLOBAL_
Answer: %put _GLOBAL_;

The SAS log shows three global macro variables, so we should use %put _global_;

New Question 3

Given two datasets, variables:


Data set WORK.ONE
State_ID State
-------- --------------
NC North Carolina

Data set WORK.TWO


State_ID City
-------- -----

Select the state of North Carolina.

%let selection=North Carolina;


proc sql step;

where s.state= "&selection";
quit;

New Question 4

To output the title "XXX A&M XXX", which statement should be used?

A. title %sysfunc("XXX A&M XXX");


B. title %str("XXX A&M XXX");
C. title %nrstr("XXX A&M XXX");
D. title %bquote("XXX A&M XXX);

New Question 5

The first part of the code gives a key/value pair definition; the variables are somekey and someAlpha. We need to fill in the hash object definition. HASH object
Fill in the blank: HashAlpha
Some.definedata("someAlpha");

New Question 6

A local data set needs to be read repeatedly. What effect does the SASFILE global statement have?

Answer: reduce CPU, reduce I/O, increase memory

New Question 7

Which option instructs SAS to use a specific index for where statements?
IDXNAME=index-name: a data set option that names the index to use
IDXNAME=… (instruct SAS to use a specific index for where processing)
Which option instructs SAS to use indexes or not?
IDXWHERE= YES | NO: a data set option that specifies whether an index is used
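
A sketch of both data set options in use. It assumes a data set WORK.TEMP with a simple index named id, such as the one built in the test program for Question 51. MSGLEVEL=I is set only so that the log reports whether an index was used for WHERE processing.

```sas
/* Assumes WORK.TEMP exists and has a simple index named id */
options msglevel=i;   /* write index-usage INFO messages to the log */

/* Ask SAS to use the index named id for this WHERE clause */
proc print data=work.temp(idxname=id);
   where id='A';
run;

/* Tell SAS not to use any index; the data set is read sequentially */
proc print data=work.temp(idxwhere=no);
   where id='A';
run;

options msglevel=n;   /* restore the default message level */
```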

New Question 8

proc fcmp outlib=sasuser.funcs.trial;



endsub;

options cmplib =sasuser.funcs;


data _null_;

run;

Which option should be used?


Options:

A. UTLLOC
B. LIBREF
C. FMTSEARCH
D. CMPLIB

New Question 9

A data set has 2000 million observations and 300 character variables.
What is the correct way to compress?

compress= YES | CHAR

New Question 10

A compressed data set has 200,000 observations, 300 variables. We need 20% of character observations.
What method can minimize computer resource usage?

A. If-then/else clause
B. Case
C. Where
D. Drop

New Question 11

A data set has 300,000 observations, 20 character variables, 50 numeric variables. We need
5 character variables and 7 numeric variables, which one is the most efficient:

A. DROP= option in data step.


B. KEEP= option in data step.
C. KEEP= option in set statement.
D. KEEP statement.

New Question 12

data multi(keep=i j output);


array multi{1:2, 2}(1,2);
do i=1 to 2;
do j=1 to 2;
output=multi{i,j};
output;
end;
end;
run;

What are the corresponding values of i, j, and output?

data multi(keep=i j output);
array multi{1:2, 2}(1,2);
do i=1 to 2;
do j=1 to 2;
output=multi{i,j};
output;
end;
end;
run;
proc print data=multi noobs;
run;

The array holds the following values (row i, column j):
1 2
. .

The array elements are automatically named multi1-multi4. To change the default names, use:
array multi{1:2, 2} mul5-mul8 (1,2);

New Question 13
Data company.newdata/view=company.newdata;
Infile<fileref>;
<Data step statements>;
run;

Submit the above code and create a data step view, then we need to use this view in the PROC
MEANS procedure, which one to use:

A. Proc Means view=company.newdata;


B. Proc Means data=company.newdata/view=company.newdata;
C. Proc means data company.newdata/view;
D. Proc means data=company.newdata;

New Question 14

Given two formats with the same name, $Gender, one stored in Mylib and the other in Library.

options fmtsearch= ;
proc print data=xxx.xxx;
run;

The format $Gender. is used. From the desired output, we can tell that the format in Mylib is used.

Which statement should be filled in here?

A. no fmtsearch needed
B. fmtsearch=(mylib, library)
C. fmtsearch=(library, mylib)
D. fmtsearch=(mylib)

Note: D is wrong.
Without the FMTSEARCH= option, the default search order is (1. work.formats 2. library.formats); Mylib is not searched.
If specified as D, the search order is (1. work.formats 2. library.formats 3. mylib).
If specified as B, the search order is (1. work.formats 2. mylib 3. library.formats).
The default is Work.Formats -> Library.Formats. Other libraries named in FMTSEARCH= are searched after these defaults, unless WORK or LIBRARY is itself listed in FMTSEARCH=.

New Question 15

Car column variables: year, model, color, name etc.

Grouped by Model.

Model: Sonata, Elantra, etc.

To select the unique values of model:

if first.model=1 then output unique_model;

New Question 16
The following SAS program is submitted:

%macro COLS1;
Name Age;
%mend;
%macro COLS2;
Height Weight;
%mend;
proc print data=SASHELP.CLASS;
var Weight Height %COLS1;
run;

Which variables are in the output in order?


Answer: Weight Height Name Age

Note, no semicolon after Age! The system ignores the extra semicolon.

New Question 17

Check the PageSize information of a data set using proc contents procedure.

A. Proc contents
B. Proc print
C. Proc report
D. Proc catalog

New Question 17

How to check the PageSize information of a data set using Proc SQL?

A. describe table(table-name);
B. describe table:table-name;
C. describe table table-name;
D. describe table=table-name;

New Question 18

Given the data sets and program code, calculate the average value returned using the subquery.
Avg(Num)=avg(6,8)=7

Data SET WORK.ONE:


Name Year
------ ----
Joyce 9
John 4
John 2
Jane 6
Thomas 8
DATA SET: WORK.TWO
Name Age
------ ----
Joyce 35
John 40
Robert 55
Jeff 34

The following SAS program is submitted:


proc sql;
select Name, Avg(year) as average
from work.one
where name in
(
select * from work.one
except corr
select * from work.two
);
quit;

proc sql;
select Name, Avg(year) as average
from work.one
where not exists
(
select * from work.two
where one.name = two.name
);
quit;

The output average is 7.

data work.one;
infile datalines;
input Name $ Year;
datalines;
Joyce 9
John 4
John 2
Jane 6
Thomas 8
;
run;

data work.two;
infile datalines;
input Name $ Age;
datalines;
Joyce 29
John 33
Robert 22
Jeff 34
;
run;

proc sql;
select Name, Avg(year) as average
from work.one
where not exists

(
select * from work.two
where one.name = two.name
);
quit;

proc sql;
select Name, Avg(year) as average
from work.one
where name not in
(
select name from work.two
);
quit;

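A note from a test taker: the Advanced exam has a recalled question that gives two data sets, A and B, whose variables are name and years (years of service) and name and age respectively, and performs A except B. What I did not understand is that all columns are selected from both data sets, yet A and B share only the name column. I do not remember the names exactly, but say A contains John, John, Adam and B contains John, Adam. The correct answer is that John, John, and Adam are all removed from A, and avg(years)=(6+8)/2=7. I never worked out the reason, and I am grateful to whoever wrote up the recalled questions.
The reason is the CORR keyword in the program above: EXCEPT CORR compares only the columns that the two queries have in common by name, which here is Name alone.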

New Question 19

Nested query and inner join

Given two data sets and SQL code, the question asks for the output.
Choose the answer with Thomas, Jones, Smith, but no Adam. In addition, the code uses the descending option, so Sales must be in descending order.

New Question 20

LEFT JOIN, IN-LINE VIEW

Product
Product_id Product
---------- -------
1 1001
2 1002
3 1003

Sales
Product_id Sales
---------- -------
3 100
1 200
5 100
1 200
3 100
1 100

The following program was submitted:


proc sql;

select p.product, s.totalsales
from product as p
left join
(
select product_id, sum(sales) as totalsales
from sales
group by product_id
) as s
on p.product_id=s.product_id;
quit;

What is the output?

Product Totalsales
------- ----------
1001 500
1002 .
1003 200

Test program:
data product;
infile datalines;
input product_id product;
datalines;
1 1001
2 1002
3 1003
;
run;
title 'Product';
proc print data=product noobs;
run;

data sales;
infile datalines;
input product_id sales;
datalines;
3 100
1 200
5 100
1 200
3 100
1 100
;
run;
title 'Sales';
proc print data=sales noobs;
run;

title 'Intermediate Results from Inline-View';


proc sql;
select product_id, sum(sales) as totalsales
from sales
group by product_id
;
quit;

title 'LEFT JOIN + IN-LINE VIEW';


proc sql;
select p.product, s.totalsales
from product as p
left join

(select product_id, sum(sales) as totalsales
from sales
group by product_id) as s
on p.product_id=s.product_id;
quit;

New Question 21

Q: in-line view
A piece of code is given, and the question states that multiple observations satisfy the condition in the in-line view; it asks for the result of running the program.
The answer is that the program fails and produces no result, because the in-line view returns multiple results.

Q: Horizontal join set operator

(i) right join
Two data sets
Work.One
year sales
2001 800
2001 500
2003 700
Work.Two
year profit
2001 100
2002 200

proc sql;
select sum(profit)
from one right join two
on one.year=two.year;
quit;

What is the output?
A. 100
B. 300
C. 400
D. 500

ANS: C
There are two observations with year 2001 in the left set (Work.One), so the joined data set has three observations for the variable profit: 100, 100, 200.
Sum(profit)=400
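
A test program sketch for the right join, in the style of the other test programs in these notes, with the two tables from the question entered as DATA steps. The query should return 400, because the 2001 row of Work.Two matches two rows of Work.One and the unmatched 2002 row is still kept by the right join.

```sas
/* Data copied from the question */
data work.one;
   input year sales;
   datalines;
2001 800
2001 500
2003 700
;
run;

data work.two;
   input year profit;
   datalines;
2001 100
2002 200
;
run;

proc sql;
   select sum(profit)           /* 100 + 100 + 200 */
      from one right join two
      on one.year=two.year
   ;
quit;
```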