solve this, Data Mining import pandas as pd import numpy as np data = { 'Student ID':
Question:
Transcribed Image Text:
import pandas as pd
import numpy as np

data = {
    'Student ID': [1, 2, 3, 4, 5],
    'Name': ['Norah', 'Mohammed', 'Faisal', 'Ali', 'Lama'],
    'Age': [19, 20, 'unknown', 'unknown', 21],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Score': [85, 92, 78, 88, 'unknown']
}

# (Print the head after each step.)

# Task 1: Load the dataset into a Pandas DataFrame

# Task 2: Drop columns that aren't useful

# Task 3: Handle missing values
# Replace 'unknown' with NaN
# Count missing values in each column and print the result
missing_values = df.isnull().sum()
print(missing_values)
# Fill missing values in 'Age' and 'Score' with the mean (you can use the built-in method mean())

# Task 4: Convert categorical values to numeric for the Gender column

# Task 5: Apply feature scaling/normalization to the Age and Score columns using MinMaxScaler (search about it)
Expert Answer:
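The page does not include the original answer text, so what follows is only a minimal sketch of one way to work through the five tasks. It rests on a few assumptions that are not in the question: that 'Student ID' and 'Name' are the columns that "aren't useful" (they are identifiers), that 'Female' maps to 0 and 'Male' to 1 (the choice is arbitrary), and that MinMaxScaler refers to the scikit-learn class `sklearn.preprocessing.MinMaxScaler`. The `num_cols` helper list is introduced here for convenience.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

data = {
    'Student ID': [1, 2, 3, 4, 5],
    'Name': ['Norah', 'Mohammed', 'Faisal', 'Ali', 'Lama'],
    'Age': [19, 20, 'unknown', 'unknown', 21],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Score': [85, 92, 78, 88, 'unknown'],
}

# Task 1: load the dataset into a DataFrame
df = pd.DataFrame(data)
print(df.head())

# Task 2: drop columns that are not useful
# (assumption: the identifier columns carry no signal for analysis)
df = df.drop(columns=['Student ID', 'Name'])
print(df.head())

# Task 3: handle missing values
num_cols = ['Age', 'Score']  # helper name introduced for this sketch
for col in num_cols:
    # errors='coerce' turns non-numeric entries such as 'unknown' into NaN
    df[col] = pd.to_numeric(df[col], errors='coerce')
print(df.head())

# Count missing values in each column
missing_values = df.isnull().sum()
print(missing_values)

# Fill the gaps with each column's mean (NaN is skipped when averaging)
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())
print(df.head())

# Task 4: convert the categorical Gender column to numbers
# (assumption: 0/1 coding; which label gets which number is arbitrary)
df['Gender'] = df['Gender'].map({'Female': 0, 'Male': 1})
print(df.head())

# Task 5: rescale Age and Score to the [0, 1] range
scaler = MinMaxScaler()
df[num_cols] = scaler.fit_transform(df[num_cols])
print(df.head())
```

Because the means are computed before scaling, the filled-in 'Age' values (mean 20) land exactly halfway between the minimum 19 and maximum 21 after Task 5. If the course expects a literal `df.replace('unknown', np.nan)` call for the replacement step, that can be used instead, followed by a conversion of the two columns to a float type.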