Question:

The dataset includes missing values, invalid values, outliers, and extreme values. The dataset is also imbalanced.
Import the data into the IBM Modeler.
1- Exploring the data: Using the Modeler Data Audit node, visualize the features in the data. Include figures of the visualized features. Also, report:
a) How many input features are in the dataset? How many records are in the dataset?
b) Which feature is suitable as the target? (Propose at least one candidate target.)
c) How many valid records are in each feature?
d) How many outliers and extreme values are in the data?
e) For each input feature, answer whichever of the following two questions matches its type (numerical or categorical):
   a. Numerical feature: What type of variable is this feature? What are its mean, median, min, max, standard deviation, and distribution? Does the distribution (or any other statistical measure) look acceptable to you? Why or why not?
   b. Categorical feature: What type of variable is this feature? What are its frequencies (counts) and distribution? Does the distribution (or any other statistical measure) look acceptable to you? Why or why not?
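
The quantities requested in part 1 can be cross-checked outside Modeler. The following is only a rough pandas sketch, not the Data Audit node itself: the function name `audit` is hypothetical, and the thresholds of 3 and 5 standard deviations from the mean for outliers and extremes are an assumption about the node's default settings that should be confirmed in the node's Quality tab.

```python
import pandas as pd

def audit(df: pd.DataFrame, outlier_sd: float = 3.0, extreme_sd: float = 5.0) -> pd.DataFrame:
    """Per-feature summary loosely mirroring a data-audit report (assumed thresholds)."""
    rows = []
    for name in df.columns:
        col = df[name]
        row = {"feature": name, "valid": int(col.notna().sum())}
        if pd.api.types.is_numeric_dtype(col):
            mean, sd = col.mean(), col.std()
            # Distance from the mean in standard deviations; missing cells stay NaN.
            z = (col - mean).abs() / sd if sd else col * 0
            row.update(
                kind="numerical",
                mean=mean, median=col.median(), min=col.min(), max=col.max(), std=sd,
                # Outliers lie beyond outlier_sd but not beyond extreme_sd.
                outliers=int(((z > outlier_sd) & (z <= extreme_sd)).sum()),
                extremes=int((z > extreme_sd).sum()),
            )
        else:
            # Frequency of each category, ignoring missing cells.
            row.update(kind="categorical", counts=col.value_counts().to_dict())
        rows.append(row)
    return pd.DataFrame(rows).set_index("feature")
```

Calling `audit(df)` on the imported data gives one row per feature, and `len(df)` and `df.shape[1]` give the record and column counts for part a).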
2- Missing values: The data includes multiple missing values. Do not remove a record if it has only one missing value. Instead, use the IBM Modeler to fill in the missing value with an algorithm of your choice. If a record has more than one missing value, you may either remove that record or use the IBM Modeler to fill in the missing values.
Explain how you treated the missing values in the data.
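
The rule in part 2 (impute when a record has one missing value, optionally drop it when it has more) can be sketched in pandas as a sanity check on the Modeler result. This is one possible policy, not the required one: the function name `treat_missing` is hypothetical, and the choice of median for numerical features and mode for categorical features is an assumption standing in for "an algorithm of your choice".

```python
import pandas as pd

def treat_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Drop records with more than one missing value, then impute the rest."""
    # Keep only records with at most one missing value.
    kept = df[df.isna().sum(axis=1) <= 1].copy()
    for name in kept.columns:
        col = kept[name]
        # Skip columns with nothing to fill or nothing to learn a fill value from.
        if not col.isna().any() or col.notna().sum() == 0:
            continue
        if pd.api.types.is_numeric_dtype(col):
            fill = col.median()        # less sensitive to outliers than the mean
        else:
            fill = col.mode().iloc[0]  # most frequent category
        kept[name] = col.fillna(fill)
    return kept
```

Comparing `len(df)` with `len(treat_missing(df))` shows how many records were removed, which is worth stating when explaining the treatment.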
3- Invalid values: The data includes multiple invalid values. Do not remove a record if it has only one invalid value. Instead, use the IBM Modeler to replace the invalid value with an algorithm of your choice. If a record has more than one invalid value, you may either remove that record or use the IBM Modeler to replace the invalid values.
Explain how you treated the invalid values in the data.
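
One common way to handle part 3 is to blank out invalid values first, so that they can then be counted and imputed exactly like missing values. The sketch below assumes this approach. The feature names `age` and `gender` and their validity rules are hypothetical placeholders, since the question does not list the dataset's features, and `blank_invalid` is an illustrative name.

```python
import pandas as pd

# Hypothetical validity rules; the real ones depend on the dataset's features.
RULES = {
    "age": lambda s: s.between(0, 120),
    "gender": lambda s: s.isin(["F", "M"]),
}

def blank_invalid(df: pd.DataFrame, rules=RULES) -> pd.DataFrame:
    """Replace values that break a rule with NaN so they can be imputed later."""
    out = df.copy()
    for name, is_valid in rules.items():
        if name in out.columns:
            # Cells that are already missing are left alone; only
            # present-but-invalid cells are blanked.
            bad = out[name].notna() & ~is_valid(out[name])
            out[name] = out[name].mask(bad)
    return out
```

After this step, `blank_invalid(df).isna().sum(axis=1)` gives the number of missing or invalid cells per record, which determines whether a record is imputed or may be removed.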
