Which statement is true when creating two SAS data sets with a DATA step?
Answer : A
To create two SAS data sets in one DATA step, name both data sets in the DATA statement. SAS then creates two separate data sets from the same step. Neither an OUT= option in a WHERE statement nor a PUT statement creates a data set: the WHERE statement has no OUT= option, and PUT writes text to the log or to an external file. A SET statement reads data into the DATA step; it does not create output data sets.
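As a minimal sketch of the point above, the step below names two output data sets in the DATA statement and routes each observation with an OUTPUT statement. The data set names work.males and work.females are hypothetical; sashelp.class and its Sex variable come from the SASHELP sample library.

```sas
/* Sketch: one DATA step, two output data sets (names are hypothetical). */
data work.males work.females;
    set sashelp.class;
    /* OUTPUT with a data set name writes the observation only there. */
    if Sex = 'M' then output work.males;
    else output work.females;
run;
```

Without explicit OUTPUT statements, every observation would be written to both data sets.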
SAS documentation on the DATA step for creating multiple data sets.
Given the SAS data set WORK.PRODUCTS:
How many variables does the WORK.REVENUE data set contain?
Answer : D
The resulting WORK.REVENUE data set contains 3 variables. In the program, the KEEP= data set option on the SET statement reads only three variables from WORK.PRODUCTS: ProdId, Price, and Sales. The DROP= data set option on the DATA statement then drops Sales and Returns from the output data set. Returns was never read in because it is not in the KEEP= list, so dropping it has no effect on the result. Finally, a new variable, Revenue, is calculated and written to the output. The final data set therefore contains ProdId, Price, and Revenue.
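The original program is not reproduced here, so the following is only a sketch of the KEEP=/DROP= interaction described above. The data set names, the sample rows, and the Revenue formula (Price * Sales) are assumptions made for illustration.

```sas
/* Hypothetical stand-in for WORK.PRODUCTS; the rows are made up. */
data work.products_demo;
    input ProdId $ Price Sales Returns;
    datalines;
A100 10 5 1
B200 20 3 0
;
run;

/* KEEP= on SET limits what is read; DROP= on DATA limits what is written. */
data work.revenue_demo(drop=Sales Returns);
    set work.products_demo(keep=ProdId Price Sales);
    Revenue = Price * Sales;   /* assumed formula, for illustration only */
run;

/* Lists the variables in the output: ProdId, Price, Revenue. */
proc contents data=work.revenue_demo;
run;
```

Because Returns is never read, SAS typically writes a warning to the log that the variable in the DROP list was never referenced; the step still runs.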
SAS documentation on keep= and drop= data set options.
Which statements read the input data set SASHELP.SHOES and create the output data set WORK.TOTAL?
Answer : C
Option C is correct. The SET statement is used within a DATA step to read an existing SAS data set. Here, set sashelp.shoes; reads the SASHELP.SHOES data set, and data work.total; creates a new data set named TOTAL in the WORK library. Options A, B, and D are incorrect because they either use the wrong syntax or do not use the proper statements for reading an input data set and creating an output data set.
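Put together, the two statements quoted above form a complete step once a RUN statement is added. This is a sketch; the PROC PRINT check afterwards is an addition for illustration, not part of the question.

```sas
/* DATA names the output data set; SET names the input data set. */
data work.total;
    set sashelp.shoes;
run;

/* Optional check: print the first five observations of the new data set. */
proc print data=work.total(obs=5);
run;
```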
SAS 9.4 documentation on the DATA step.
Which PROC PRINT option displays variable labels in the report?
Answer : D
In the PROC PRINT statement, the LABEL option is used to display variable labels in the report. If the variables in the dataset have labels associated with them, the LABEL option will ensure that these labels are used in the output instead of the variable names.
Here's how the LABEL option is used, taking sashelp.class as an example data set:
proc print data=sashelp.class label; run;
The LABEL keyword in the PROC PRINT statement tells the procedure to use variable labels as column headings in the report. It is an option of the PROC PRINT statement, so it must appear before that statement's semicolon.
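For a fuller sketch of the LABEL option in context, the step below also assigns labels with a LABEL statement so the effect is visible in the report. The label text is made up for illustration.

```sas
/* The LABEL option on PROC PRINT turns on label headings;            */
/* the LABEL statement assigns labels for the duration of this step. */
proc print data=sashelp.class label;
    var Name Age Height;
    label Name   = 'Student Name'
          Height = 'Height (inches)';
run;
```

Age has no label assigned here, so its column heading falls back to the variable name.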
The other options provided are incorrect for the following reasons:
SHOWLABELS is not a valid PROC PRINT option.
COLS is not a PROC PRINT option.
LABELS= is not valid PROC PRINT syntax; the option is the bare keyword LABEL.
SAS 9.4 documentation for the PROC PRINT statement with LABEL option: SAS Help Center: PROC PRINT
Which program generates the PROC MEANS report below?
Answer : D
The PROC MEANS report shown in the image displays statistics for a single variable Age with no decimal places. The correct SAS code to generate such a report is option D, which uses the PROC MEANS procedure with the maxdec=0 option to control the number of decimal places displayed (in this case, zero) and specifies Age as the analysis variable using the var statement.
PROC MEANS computes descriptive statistics for numeric variables. The maxdec=0 option sets the maximum number of decimal places displayed to zero, which matches the output shown, where the mean, standard deviation, minimum, and maximum all appear as whole numbers.
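A program of the shape described above might look like the following sketch. The input data set in the question is not shown, so sashelp.class is used here as a stand-in.

```sas
/* MAXDEC=0 displays statistics with no decimal places;  */
/* VAR restricts the analysis to the Age variable.       */
proc means data=sashelp.class maxdec=0;
    var Age;
run;
```

By default PROC MEANS reports N, mean, standard deviation, minimum, and maximum for each analysis variable.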
The other options are incorrect for the following reasons:
A uses the class statement, which is not appropriate here because Age is not a classification variable but an analysis variable.
B uses the group statement, which is not a valid statement in the PROC MEANS procedure.
C uses the by statement, which requires the data to be sorted by the BY variable and does not fit the output provided since it would produce separate statistics for each value of Age.
SAS 9.4 documentation for the PROC MEANS statement: SAS Help Center: PROC MEANS
Given the data sets AMERICAN and NATIONAL, and the resulting data set BASEBALL shown below:
Which DATA step correctly creates the BASEBALL data set?
Answer : B
The correct answer is B. A SET statement that lists several data sets concatenates them. The AMERICAN and NATIONAL data sets appear to have the same structure and variables, so a SET statement without any options combines them into one data set called BASEBALL. The order of the data sets in the SET statement determines the order of the observations in the output: all observations from the first data set come first, followed by those from the second. Because the variable names already match, no RENAME= option is needed, which makes option B correct.
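The concatenation described above can be sketched as follows. The input tables in the question are not reproduced here, so the data set names, variables, and rows below are hypothetical.

```sas
/* Hypothetical stand-ins for AMERICAN and NATIONAL; rows are made up. */
data work.american_demo;
    input Team $ Wins;
    datalines;
TeamA 90
TeamB 85
;
run;

data work.national_demo;
    input Team $ Wins;
    datalines;
TeamC 88
TeamD 79
;
run;

/* Listing both data sets on SET stacks them in the order given. */
data work.baseball_demo;
    set work.american_demo work.national_demo;
run;
```

The output holds four observations: the two from work.american_demo followed by the two from work.national_demo.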
SAS documentation on the set statement.
Given the input data sets EMPLOYEES and DONATIONS, and the output data set NODONATIONS below:
Answer : C
The correct answer is C. NODONATIONS should include only the employees who did not make any donations, so the step merges the two data sets and uses a subsetting IF statement to filter the result. The statement if inE=1 and inD=0; keeps only the observations from EMPLOYEES that have no matching observation in DONATIONS. The IN= data set option creates the temporary variables inE and inD, each of which is 1 when its data set contributes to the current observation and 0 otherwise. These variables are not written to the output data set.
In the other options:
A is incorrect because inE=0 and inD=0 will never be true after a merge since at least one of the datasets must contribute to the merged observation.
B lacks the necessary conditional logic to filter out employees with donations.
D is incorrect because if inE=1 and inD=1; would actually select employees who made donations, the opposite of what we want.
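The filtering logic above can be sketched as a match-merge. The input tables in the question are not reproduced here, so the BY variable EmpID, the other variable names, and the rows are assumptions for illustration.

```sas
/* Hypothetical stand-ins for EMPLOYEES and DONATIONS, sorted by EmpID. */
data work.employees_demo;
    input EmpID Name $;
    datalines;
101 Ana
102 Ben
103 Cho
;
run;

data work.donations_demo;
    input EmpID Amount;
    datalines;
101 25
103 40
;
run;

/* Keep employees that appear in EMPLOYEES but not in DONATIONS. */
data work.nodonations_demo;
    merge work.employees_demo(in=inE) work.donations_demo(in=inD);
    by EmpID;
    if inE=1 and inD=0;   /* subsetting IF: only non-donors pass */
    drop Amount;          /* always missing for non-donors */
run;
```

With these rows, only employee 102 is written to work.nodonations_demo. A match-merge requires both inputs to be sorted by the BY variable; the sample data is entered in EmpID order, so no PROC SORT is needed here.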
SAS documentation on merging data sets using the DATA step.