Question: 1 . Define unsupervised versus supervised data mining Identify which type of data mining you would use to run the following four projects 2 .
Define unsupervised versus supervised data mining
Identify which type of data mining you would use to run the following four projects
Autonomous car driving dataset with crash occurrences identified in the data, to determine what are the most critical factors to monior to avoid a crash.
Population and employment data for Ohio, with a desire to understand where people like to work and live in the state.
Monthly and daily records of product sales across various states, cities and stores with a desire to understand why some stores are selling more products than others.
Tire failure dataset on a certain brand of tires in the US with car operational data, tire failurebrecords and weather data, with a desire to understand the causes for failure.
Model training and testing datasets
Should we strive for the highest possible accuracy with the training set? Why or why not?
Explain why we sometimes need to balance model training data.
For the next two questions and for modeling training purposes, we are running a fraud classification model and have initially identified a training data set of records, of which we know or are fraudulent.
To better balance the training dataset, how many fraudulent records need to be added if we want the proportion of fraudulent records to be
How many total records would then be in our training dataset?
Final Question
When should a test data set be balanced?
Step by Step Solution
There are 3 Steps involved in it
1 Expert Approved Answer
Step: 1 Unlock
Question Has Been Solved by an Expert!
Get step-by-step solutions from verified subject matter experts
Step: 2 Unlock
Step: 3 Unlock
