Assignment 3 (Base SAS)
To compare the sepal length data for the 3 types of iris (data given in Assignment 2), the most appropriate analysis would be a one-factor analysis of variance. Using SAS to do this analysis will be discussed in Project 8. Here, two-sample t-tests are done instead to make pairwise comparisons of the 3 types.
Recall that the two-sample t-test is used to test the equality of means when one has 2 independent samples from 2 normal populations. In the standard scenario, these normal populations are assumed to have equal, but unknown, variances. Approximations can be made to handle the case in which the variances are unknown but not assumed to be equal.
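For reference, the two test statistics described above can be written out as follows. The symbols are notation assumed here, not taken from the assignment: \bar{x}_i, s_i^2 and n_i denote the sample mean, sample variance and sample size of group i. The first statistic is the standard equal-variance (pooled) form; the second is the usual approximation for unequal variances, with Satterthwaite's approximate degrees of freedom.

```latex
% Equal variances assumed (pooled), with n_1 + n_2 - 2 degrees of freedom
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}

% Variances not assumed equal, with approximate degrees of freedom \nu
t' = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu = \frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^2}
           {\dfrac{(s_1^2 / n_1)^2}{n_1 - 1} + \dfrac{(s_2^2 / n_2)^2}{n_2 - 1}}
```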
In order to use proc ttest, it is necessary to have a variable in the data set that can be used in the class statement.
So the task is to create a new data set that contains such a variable.
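As a rough sketch of what such a call might look like, assuming a data set named pair12 that holds a two-valued variable species and the sepal length in m1 (the data set name and the choice of m1 as sepal length are assumptions made for illustration):

```sas
/* Hypothetical names: pair12, species and m1 are assumed for illustration. */
proc ttest data=pair12;
   class species;   /* grouping variable; must take exactly 2 values */
   var m1;          /* measurement whose group means are compared    */
run;
```

The output of proc ttest reports both the pooled (equal variance) result and the Satterthwaite (unequal variance) result, along with a test for equality of the two variances.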
ASSIGNMENT:
1. Use the techniques in Defining New Variables and Do Loops to create a SAS data set containing 5 variables: 1) species 2) m1 3) m2 4) m3 5) m4. These variables contain an indicator of the species of each plant and the 4 corresponding measurements. This data set should contain 150 observations.
2. Use the set statement and the where statement to construct 3 SAS data sets from the single data set of step 1. Each of these data sets should contain the sepal length measurements for 2 of the 3 species and a variable (taking only 2 values) which distinguishes the species from which the corresponding sepal length measurement came. This new variable will be used in the class statement in proc ttest. You don't need to print out a copy of the data sets but do turn in a copy of your program.
4. How could you use the where= data set option to avoid doing problem 2 above?
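The set and where statements named in step 2 can be combined along the following lines. This is only a sketch under stated assumptions: the data set from step 1 is called iris, species is coded numerically as 1, 2, 3, and m1 holds the sepal length. None of these names or codings is fixed by the assignment.

```sas
/* Hypothetical names and codings: iris, pair12, species in (1,2,3), m1. */
data pair12;
   set iris;                   /* read the 150-observation data set   */
   where species in (1, 2);    /* keep only 2 of the 3 species        */
   keep species m1;            /* class variable plus sepal length    */
run;
```

After the subsetting, species takes only 2 values in pair12, so it can serve directly as the class variable. The other two pairs would follow the same pattern with a different list in the where statement.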
