5. Factors

How do we make a distinction between nominal, ordinal, interval, and ratio scale data? How do we do this in R? R has more data types than the single numeric data type. Factors are the main way to represent a nominal scale variable. Nominal variables store values whose possibilities (categories) have no relationship to one another. For example, say you created a variable called eyes. The possible values of blue, green, hazel, etc. have no order, rank, or true zero point.

Let's say that I have 3 teams of 4 students working together on a group project. How can I keep track of students and their groups? I might want to have a variable that keeps track of the student groups. Let's create a vector called groups.

    groups <- c(1,1,1,1,2,2,2,2,3,3,3,3)

Next, we want to convert groups to a factor variable. This is better than having groups as a numeric variable. Numeric variables are best reserved for values on which you'd want to perform calculations. In this case, the numbers are just labels for the groups. We'd never perform a calculation such as:

    groups + 2
    ## [1] 3 3 3 3 4 4 4 4 5 5 5 5

Let's convert groups to a factor variable so we don't accidentally use the data in groups to perform a calculation. We can use the as.factor( ) function to convert our variable.

    groups <- as.factor(groups)
    groups
    ## [1] 1 1 1 1 2 2 2 2 3 3 3 3
    ## Levels: 1 2 3

Notice the levels. Levels are the categories of data that are stored in our factor variable. You can see that 1, 2, and 3 are the only values stored in our vector. Now try to add the number 2 to the groups variable.

    groups + 2
    ## Warning in Ops.factor(groups, 2): '+' not meaningful for factors
    ## [1] NA NA NA NA NA NA NA NA NA NA NA NA

You'll notice that R gives a warning message and returns NA for every element. We cannot use the + operator as a mathematical operator with factors. Factors are similar to character data types in that you cannot perform calculations on them.

Based on this text, when would you want to use a factor variable?
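As a minimal sketch of the passage's point, the R snippet below repeats the groups example and contrasts an operation that makes sense for categories (counting members per group) with one that does not (arithmetic). The names groups_f, group_levels, group_sizes, and shifted are illustrative and do not appear in the original text; suppressWarnings( ) is used only to keep the output quiet.

```r
# Group labels from the passage: 3 teams of 4 students.
groups <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)

# Convert the numeric codes to a factor so R treats them as categories.
# (groups_f is an illustrative name, not from the original text.)
groups_f <- as.factor(groups)

# The levels are the distinct categories stored in the factor.
group_levels <- levels(groups_f)

# Counting how many students fall in each category is meaningful.
group_sizes <- table(groups_f)

# Arithmetic is not meaningful for a factor: R warns and returns NA
# for every element. The warning is suppressed here on purpose.
shifted <- suppressWarnings(groups_f + 2)

print(group_levels)
print(group_sizes)
print(all(is.na(shifted)))
```

The sketch suggests the rule of thumb the passage describes: a factor fits when the values only name categories, so grouping and counting are useful but sums and differences are not.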