I am having a difficult time solving the following questions. If anyone is able to help, I would greatly appreciate it.
1. In a database of customer electronics purchases, 10,000 transactions are analyzed, and the data show:
6,000 of the customer transactions included computer games (game),
7,500 of them included videos (video),
4,000 of them included both computer games and videos.
The generated association rule is: video -> game [support = ?%, confidence = ?%]
What is the confidence of the rule? (Please keep 2 digits after the decimal point, for example, 0.25.)
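A minimal sketch of the arithmetic, assuming the usual definitions: support is the fraction of all transactions containing both items, and confidence is the fraction of video transactions that also contain a game. The variable names are illustrative only.

```python
# Counts taken from the question text.
n_transactions = 10_000
n_video = 7_500
n_both = 4_000  # transactions containing both video and game

# Assumed standard definitions:
#   support(video -> game)    = P(video and game)
#   confidence(video -> game) = P(game | video)
support = n_both / n_transactions
confidence = n_both / n_video

print(f"support    = {support:.2%}")
print(f"confidence = {confidence:.2%}")
```

Note that the count of game transactions (6,000) is not needed for this rule; it would be needed for the reverse rule game -> video.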
2. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
For a given support count = 2, which one of the following statements is incorrect?
- The rule B, C -> E and the rule C, E -> B have the same support value
- The rule B, C -> E and the rule B, E -> C have the same support value
- The rule B, C -> E and the rule B, E -> C have the same confidence value
- The rule B, C -> E and the rule C, E -> B have the same confidence value
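The statements can be checked by counting directly in the four transactions. This is a sketch, assuming support of a rule is the support count of the union of both sides and confidence is that count divided by the support count of the left-hand side; `support_count` and `confidence` are helper names made up for this example.

```python
# Transactions copied from the table above (TID -> items).
transactions = {
    100: {"A", "C", "D"},
    200: {"B", "C", "E"},
    300: {"A", "B", "C", "E"},
    400: {"B", "E"},
}

def support_count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in transactions.values() if set(itemset) <= items)

def confidence(antecedent, consequent):
    """support_count(antecedent and consequent together) / support_count(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support_count(both) / support_count(antecedent)

# The three rules mentioned in the answer choices.
for lhs, rhs in [("BC", "E"), ("CE", "B"), ("BE", "C")]:
    sup = support_count(set(lhs) | set(rhs))
    print(f"{lhs} -> {rhs}: support count = {sup}, confidence = {confidence(lhs, rhs):.2f}")
```

All three rules come from the same item-set {B, C, E}, so they share one support value; only the confidence depends on which items sit on the left.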
3. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
Which one of the following statements is correct?
- There are 5 non-empty item-sets that can be generated from the set of items {A, B, C, D, E}
- There are 32 non-empty item-sets that can be generated from the set of items {A, B, C, D, E}
- There are 31 non-empty item-sets that can be generated from the set of items {A, B, C, D, E}
- There are 5 item-sets that can be generated from the set of items {A, B, C, D, E}
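One way to check the count is to enumerate every non-empty subset of the five items. The sketch below does that with `itertools.combinations` and compares against the closed form 2^n - 1.

```python
from itertools import combinations

items = ["A", "B", "C", "D", "E"]

# Every subset of size 1 up to size 5.
itemsets = [
    set(combo)
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
]

print(len(itemsets))        # number of non-empty item-sets by enumeration
print(2 ** len(items) - 1)  # closed form: all subsets minus the empty set
```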
4. Which of the following statement(s) is (are) correct for the multiple linear regression method in XLMiner?
- The categorical data values must be transformed into binary data
- The numerical data values must be transformed into categorical values
- Either a or b
- Neither a nor b
5. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
For a given support count = 2 and the item-set {B, C, E}, how many valid rules can be generated from the item-set {B, C, E}?
- 3
- 6
- 12
- 2
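The candidate rules can be listed by splitting {B, C, E} into every non-empty left-hand side and its complement. This is only a sketch: the question gives no minimum confidence, so which rules count as "valid" beyond the support threshold is an assumption the reader has to settle; the code simply prints each candidate with its confidence.

```python
from itertools import combinations

# Transactions copied from the table above.
transactions = [
    {"A", "C", "D"},
    {"B", "C", "E"},
    {"A", "B", "C", "E"},
    {"B", "E"},
]

def support_count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

itemset = frozenset("BCE")
rules = []
# A rule needs a non-empty left side and a non-empty right side.
for size in range(1, len(itemset)):
    for combo in combinations(sorted(itemset), size):
        lhs = frozenset(combo)
        rhs = itemset - lhs
        conf = support_count(itemset) / support_count(lhs)
        rules.append((lhs, rhs, conf))

for lhs, rhs, conf in rules:
    print(f"{','.join(sorted(lhs))} -> {','.join(sorted(rhs))}: confidence = {conf:.2f}")
print(len(rules), "candidate rules, support count", support_count(itemset))
```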
6. Given a database table containing weather data as follows:
| Outlook | Temperature | Humidity | Windy | Class: Play |
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |
Where Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response or outcome).
For the given sample, X = (Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False'),
please compute the conditional probability P(X|PLAY='No').
Please keep 3 digits after the decimal point, for example, 0.521.
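A sketch of the calculation, assuming the question intends the naive Bayes factorisation, that is P(X|class) is the product of the per-attribute conditional frequencies, with no smoothing. The helper name `likelihood` is made up for this example.

```python
from fractions import Fraction

# Rows copied from the table: (Outlook, Temperature, Humidity, Windy, Play).
rows = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
x = ("Sunny", "Mild", "High", "False")

def likelihood(sample, label):
    """Product of P(attribute value | label) over the four attributes."""
    in_class = [r for r in rows if r[-1] == label]
    p = Fraction(1)
    for i, value in enumerate(sample):
        matches = sum(1 for r in in_class if r[i] == value)
        p *= Fraction(matches, len(in_class))
    return p

p_no = likelihood(x, "No")
print(p_no, f"{float(p_no):.3f}")
```

Under this assumption the factors for 'No' are 3/5, 2/5, 4/5 and 2/5. If the course instead expects the exact joint frequency of X within the 'No' rows, the result would differ.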
7. A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?
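A sketch of the expected count, assuming each value is missing independently with probability 0.05. A record survives only if all of its values are present, so the chance of removal is 1 - 0.95^k for k variables. The function name is illustrative.

```python
def expected_removed(n_records, n_variables, missing_rate):
    """Expected number of records with at least one missing value,
    assuming values go missing independently of each other."""
    p_complete = (1 - missing_rate) ** n_variables
    return n_records * (1 - p_complete)

print(f"{expected_removed(1000, 50, 0.05):.1f}")
```

The same function applies to the later variants of this question by changing `n_variables` to 2 or 1.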
8. Given a database table containing weather data as follows:
| Outlook | Temperature | Humidity | Windy | Class: Play |
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |
Where Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response or outcome).
For the given sample, X = (Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False'),
please compute P(X|PLAY='No') * P(PLAY='No'), where * denotes multiplication.
Please keep 3 digits after the decimal point, for example, 0.521.
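A sketch of the class score P(X|class) * P(class), again assuming the naive Bayes factorisation without smoothing and taking the prior as the class frequency in the 14 rows. It computes the score for both classes so they can be compared; `score` is a name invented for this example.

```python
from fractions import Fraction

# Rows copied from the table: (Outlook, Temperature, Humidity, Windy, Play).
rows = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
x = ("Sunny", "Mild", "High", "False")

def score(sample, label):
    """Prior P(label) times the product of P(attribute value | label)."""
    in_class = [r for r in rows if r[-1] == label]
    result = Fraction(len(in_class), len(rows))  # prior from class frequency
    for i, value in enumerate(sample):
        matches = sum(1 for r in in_class if r[i] == value)
        result *= Fraction(matches, len(in_class))
    return result

scores = {label: score(x, label) for label in ("No", "Yes")}
for label, value in scores.items():
    print(label, value, f"{float(value):.3f}")
```

The larger of the two scores is the class a naive Bayes classifier would predict for X.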
9. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
For a given support count = 2, which one of the following statements is incorrect?
- The rule B, C -> E and the rule E -> B, C have the same confidence value
- The rule B -> C, E and the rule E -> B, C have the same confidence value
- The rule E -> B, C and the rule C -> B, E have the same confidence value
- The rule B, E -> C and the rule C -> B, E have the same confidence value
10. A dataset has 1000 records and 2 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?
11. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
For a given support count = 2, which one of the following statements is incorrect?
- The item-set in rule C -> A is a frequent item-set
- The rules A -> C and C -> A have the same support value
- The item-set in rule A -> C is a frequent item-set
- The rules A -> C and C -> A have the same confidence value
12. Given a transaction database for mining association rule as follows:
| TID | Items |
| 100 | A C D |
| 200 | B C E |
| 300 | A B C E |
| 400 | B E |
The number of item-sets generated from the set of items {A, B, C, D, E} that can be used to formulate association rules is
- 5
- 12
- 6
- 26
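A sketch of the count, assuming that an association rule needs a non-empty left side and a non-empty right side, so only item-sets with at least two items can be used to form rules. Single-item sets are therefore excluded.

```python
from math import comb

n_items = 5  # items A, B, C, D, E

# Number of item-sets of each size k = 1..5.
by_size = {k: comb(n_items, k) for k in range(1, n_items + 1)}

# Assumption: a rule splits an item-set into two non-empty parts,
# which requires at least 2 items.
usable = sum(count for k, count in by_size.items() if k >= 2)

print(by_size)
print(usable)
```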
13. Given a database table containing weather data as follows:
| Outlook | Temperature | Humidity | Windy | Class: Play |
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |
Where Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response or outcome).
For the given sample, X = (Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False'),
please compute P(X|PLAY='Yes') * P(PLAY='Yes'), where * denotes multiplication.
Please keep 3 digits after the decimal point, for example, 0.521.
14. Given a database table containing weather data as follows:
| Outlook | Temperature | Humidity | Windy | Class: Play |
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |
Where Outlook, Temperature, Humidity, and Windy are the input variables (predictors), and Play is the output variable (response or outcome).
For the given sample, X = (Outlook = 'Sunny', Temperature = 'Mild', Humidity = 'High', Windy = 'False'),
please compute the conditional probability P(X|PLAY='Yes').
Please keep 3 digits after the decimal point, for example, 0.521.
15. A dataset has 1000 records and one variable with 5% of the values missing, spread randomly throughout the records in the variable column. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?
