Question: Please answer the following questions in python jupyter, solve 2-G and 2-H but in order to solve them I guess you need the solve the

Please answer the following questions in python jupyter, solve 2-G and 2-H but in order to solve them I guess you need the solve the previous ones too. Import these packages before starting, import numpy as np import pandas as pd import seaborn as sns import math from sklearn import preprocessing from sklearn import datasets import sklearn from scipy import stats import matplotlib import matplotlib.pyplot as plt %matplotlib inline matplotlib.style.use('ggplot') np.random.seed(1) Q2 (70 points) Working with Data

In [2]:X = datasets.load_wine(as_frame=True)c

data = pd.DataFrame(X.data, columns=X.feature_names)

data['class'] = pd.Series(X.target)

data = data.drop(list(data.columns[5:-1]),axis=1) #Keep only the first five columns and the class label

print(" classes ",data['class'].unique()) #The different class labels in the data .. We have three class labels, 0, 1, 2

print(" class distribution ",data['class'].value_counts()) #Shows the number of rows for each class

data.info()

data.head()

Q2-A (10 points) Construct a scatter plot between the 'ash' and 'malic_acid' columns

Use the plt.scatter to plot these two variables

Use the data['class'] to color the points

Q2-B (10 points) Construct a scatter matrix between all the attributes, except the last attribute, Class

Use the sns.pairplot function to plot the scatter matrix .. use the 'class' attribute as the hue

Q2-C (5 points) Normalize the data such that each attribute has a minimum of 0 and a maximum of 1

Don't change the content of the original dataframe. The final result will be stored in data_scaled

Q2-D (5 points) Standarize the data such that each attribute has a mean 0 and a standard deviation of 1 (unit variance)

Hint: use preprocessing.StandardScaler

Don't change the content of the original dataframe. The final result will be stored in data_scaled

Q2-E Equal-Width Binning (5 points)

Convert the values in each attribute to discrete values and use 5 bins.

Use the pandas cut method, pd.cut

Q2-F Equal Frequency Binning (5 points)

Convert the values in each attribute to discrete values and use 5 bins.

Use the pandas qcut method, pd.qcut

Q2-G Sampling (15 points)

Construct three samples from the original datasets

data_sample1: select 30 random rows

data_sample2: select 10 random rows from each class, for a total of 30 rows

data_sample3: select 17% random rows from each class. Hint: use frac=0.17

In [ ]:#data_sample1 Sample 30 rows from the data

#data_sample2 Sample 10 rows from each class

#data_sample3 Sample 17% for each class

#uncomment the following three lines to check your results

#print("Sample1 Size ", len(data_sample1)," ", data_sample1.head(30))

#print(" Sample2 Size ", len(data_sample2)," ", data_sample2.head(30))

#print(" Sample3 Size ", len(data_sample3)," ",data_sample3.head(30))

Q2-H (15 points)

Write Python code to answer the following questions with respect to the wine data set. You can use Pandas DataFrame:

What is the correlation coefficient between 'magnesium' and 'ash' for rows with class label 2?

What is the average of the 'ash' columns for rows with class label 1?

What are the averages for all the columns for rows with class label 0? -- use mean in dataframe

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Yego Domestic: a bond is issued in the US Global: a bond is issued in the US and foreign markets Eurobonds: a bond denominated in USD is issued in the foreign market. You need to estimate the...

D O NO T Ta KE IF YOU CANNOT ANSWEAR ALL THE QUESTION AS SUPPOSED OR I WILL RATE UNHELPFUL AND REPORT FOR PLAG Peru Domestic: a bond is issued in the US Global: a bond is issued in the US and foreign...

1 Exercise 3: Lift and Airfoils The first part of this week's assignment is to choose and research a reciprocating engine powered (i.e. propeller type) aircraft. You will further use your selected...

After Reading the Case. Clearly answer the following questions in a single post. 1) In your opinion , How did ChassisCo contribute to the cause of crisis at Athens in 2004? How did Toyota contribute...

I need an updated answer for these questions so please don't copy from the previous answer of these questions, i hope you can help me thanks. 1 Design, using an IPO chart, a flow chart, pseudocode,...

I have python homework. Please help me to solve it and can you share also code text please.(only File IO,string formatting,numpy and numpy functions, engineering applications of numpy, basic matplot...

Planning is one of the most important management functions in any business. A front office managers first step in planning should involve determine the departments goals. Planning also includes...

(c) Suppose there is no operation cost of the insurance rm and the loading factor is zero o = 0, show that agents will have zero savings. (d) Under what conditions about o, would agents choose to...

Please answer within 10hrs and I'll give u thumbs up. First pic is the question and the other pic is previous question and answer for your reference. thank you. Q4. From the previous questions, we...

Please read the question Question: On pages 5 and 6 the authors write, "Narrative provides not only meaning but also a mental framework for imbuing future experiences and information with meaning, in...

Following the investor model presented in this chapter, what returns do noise traders make on their investments in the long term: negative returns, returns around zero, or positive returns? What...

How can flexible benefits motivate employees?

Two annual coupons bonds both mature in 7 years. Bond A has a coupon rate of 4 . 0 0 % and a yield to maturity of 9 . 0 0 % . Bond B has a coupon rate of 9 . 0 0 % and a yield to maturity of 4 . 0 0...

Compared with half a century ago, adoption has become _ _ _ _ _ _ _ _ _ common, but it is more open and acceptabl e , so we probably discuss it _ _ _ _ _ _ _ . fill in the blanks more or much less or...

Question What is the difference between HSAs and Archer Medical Savings Accounts?

Question How are partnership contributions to a partners HSA account (and S corporation contributions to a more than 2% shareholder employees account) treated for tax purposes?

Question May a participant in an HRA roll over unused amounts from an HRA to a Health Savings Account (HSA)?