Question:

Tips: To work on this lab, you need software packages such as numpy and sklearn installed on your computer.

In a plain Python environment (non-Anaconda), here are the installation steps (run from an SSH client):

cp ~nyu/get-pip.py ~

python get-pip.py --user

pip install numpy --user

In Anaconda:

Open the Anaconda Prompt first (search bar -> anaconda). Then type python to start the Python prompt, and type import sklearn to check for errors (use exit() to quit). If you see an error, quit the Python prompt and install the packages from the Anaconda Prompt (its prompt starts with (base)):

conda install pip

pip install scikit-learn

pip install mglearn

Please run the Python program my_python_package_test.py posted on Blackboard to verify your installation.
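Before running the full test script, a quick check of which packages Python can find may save time. The following is a minimal sketch, not part of the lab handout; the package list and the helper name missing_packages are assumptions chosen for illustration.

```python
import importlib.util

# Packages imported by the lab's test script (assumed list; adjust as needed).
required = ["numpy", "scipy", "matplotlib", "pandas", "IPython", "sklearn", "mglearn"]

def missing_packages(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Any name reported as missing can then be installed with pip or conda as described above.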

My_Python_Package:

import numpy as np
#%matplotlib inline
import matplotlib.pyplot as plt
from scipy import sparse
import mglearn
from IPython.display import display

import sys
print("Python version:", sys.version)

import pandas as pd
print("pandas version:", pd.__version__)

import matplotlib
print("matplotlib version:", matplotlib.__version__)

print("NumPy version:", np.__version__)

import scipy as sp
print("SciPy version:", sp.__version__)

import IPython
print("IPython version:", IPython.__version__)

import sklearn
print("scikit-learn version:", sklearn.__version__)

x = np.array([[1, 2, 3], [4, 5, 6]])

print("x: {}".format(x))

# Create a 2D NumPy array with a diagonal of ones, and zeros everywhere else
eye = np.eye(4)
print("NumPy array: ", eye)

# Convert the NumPy array to a SciPy sparse matrix in CSR format
# Only the nonzero entries are stored
sparse_matrix = sparse.csr_matrix(eye)
print(" SciPy sparse CSR matrix: ", sparse_matrix)

# Build the same matrix in COO format from explicit values and indices
data = np.ones(4)
row_indices = np.arange(4)
col_indices = np.arange(4)
eye_coo = sparse.coo_matrix((data, (row_indices, col_indices)))
print("COO representation: ", eye_coo)

# Generate a sequence of numbers from -10 to 10 with 100 steps in between
x = np.linspace(-10, 10, 100)
# Create a second array using sine
y = np.sin(x)
# The plot function makes a line chart of one array against another
plt.plot(x, y, marker="^")

# Create a simple dataset of people
data = {'Name': ["John", "Anna", "Peter", "Linda"],
        'Location': ["New York", "Paris", "Berlin", "London"],
        'Age': [24, 13, 53, 33]}

data_pandas = pd.DataFrame(data)

# IPython.display allows "pretty printing" of dataframes
# in the Jupyter notebook
display(data_pandas)

# Select all rows that have an age column greater than 30
display(data_pandas[data_pandas.Age > 30])

plt.show()

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

1a) Create a Python file named mystat.py containing the following imports and input data:

import numpy as np

from sklearn import preprocessing

input_data = np.array([[5, -2, 3], [-1, 7, -6],[3, 0, 2],[7, -9, -4]])

print(input_data)

1b) Use threshold=2.2 to convert input_data into binary (0/1) values. Add the following lines to the same Python file. Run the file and show the printout below.

data_binarized = preprocessing.Binarizer(threshold=2.2).transform(input_data)

print(" Binarized data: ", data_binarized)
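To see what the binarizer does, the same result can be sketched with a plain NumPy comparison. This is an illustrative equivalent under the assumption that values strictly greater than the threshold map to 1 and all others map to 0; it is not the scikit-learn implementation itself.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])
threshold = 2.2

# Entries strictly above the threshold become 1; everything else becomes 0.
manual_binarized = (input_data > threshold).astype(int)
print(manual_binarized)
```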

1c) Mean and variance. Add the following lines to the same Python file. Run the file and show the printout below.

print("axis=0")

print("Mean =", input_data.mean(axis=0))

print("variance =", input_data.var(axis=0))

print("Std deviation =", input_data.std(axis=0))

print("axis=1")

print("Mean =", input_data.mean(axis=1))

print("variance =", input_data.var(axis=1))

print("Std deviation =", input_data.std(axis=1))

1d) What do axis=0 and axis=1 mean, respectively? Write your answer below.

1e) A data set can be standardized so that each feature has mean = 0 and std = 1. Add the following lines to the same Python file. Run the file and show the printout below.
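As a hint for this question, a small 2x3 array (a hypothetical example, separate from input_data) makes the direction of each axis easy to see: reducing along axis=0 collapses the rows and leaves one value per column, while axis=1 collapses the columns and leaves one value per row.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

# axis=0: combine values down the rows, one result per column (3 values).
col_means = small.mean(axis=0)
# axis=1: combine values across the columns, one result per row (2 values).
row_means = small.mean(axis=1)

print("axis=0 ->", col_means)
print("axis=1 ->", row_means)
```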

data_scaled = preprocessing.scale(input_data)

print(" AFTER:")

print("Mean =", data_scaled.mean(axis=0))

print("variance =", data_scaled.var(axis=0))

print("Std deviation =", data_scaled.std(axis=0))
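The scaling step can also be sketched by hand: subtract each column's mean and divide by its standard deviation. This sketch assumes the population standard deviation (NumPy's default, ddof=0) and is meant only to illustrate the idea.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Standardize each column: (value - column mean) / column std.
col_mean = input_data.mean(axis=0)
col_std = input_data.std(axis=0)
manual_scaled = (input_data - col_mean) / col_std

print("Mean =", manual_scaled.mean(axis=0))
print("Std deviation =", manual_scaled.std(axis=0))
```

The printed means may show tiny values such as 1e-17 instead of exactly 0 because of floating-point rounding.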

1f) A min-max scaler rescales each feature of the data set to the range [0, 1]. Add the following lines to the same Python file. Run the file and show the printout below.

minmax_scaler = preprocessing.MinMaxScaler(feature_range=(0, 1))

data_scaled_minmax = minmax_scaler.fit_transform(input_data)

print(" Min-max scaled data: ", data_scaled_minmax)

2a) Recall what we learned in class and explain how the min-max scaler works.

2b) The L1 norm and the L2 norm are both commonly used in deep learning. Add the following lines to the same Python file. Run the file and show the printout below.
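As a starting point for the explanation, here is a minimal NumPy sketch of the usual min-max formula, (x - min) / (max - min), applied per column. It assumes the default feature range of (0, 1) and that no column is constant.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# Per-column minimum and maximum.
col_min = input_data.min(axis=0)
col_max = input_data.max(axis=0)

# Map each column so its minimum becomes 0 and its maximum becomes 1.
manual_minmax = (input_data - col_min) / (col_max - col_min)
print(manual_minmax)
```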

data_normalized_l1 = preprocessing.normalize(input_data, norm='l1')

data_normalized_l2 = preprocessing.normalize(input_data, norm='l2')

print(" L1 normalized data: ", data_normalized_l1)

print(" L2 normalized data: ", data_normalized_l2)

2c) Recall what we learned in class and explain the principles of the L1 and L2 norms.

2d) Encoding the labels. Create a new Python file named mylabel.py and add the following lines. Run the file and show the printout below.
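To make the two norms concrete, the sketch below normalizes each row by hand, assuming row-wise normalization: the L1 version divides a row by the sum of its absolute values, and the L2 version divides it by the square root of the sum of its squares.

```python
import numpy as np

input_data = np.array([[5, -2, 3], [-1, 7, -6], [3, 0, 2], [7, -9, -4]])

# L1 norm of each row: sum of absolute values.
l1_norms = np.abs(input_data).sum(axis=1, keepdims=True)
manual_l1 = input_data / l1_norms

# L2 norm of each row: square root of the sum of squares.
l2_norms = np.sqrt((input_data ** 2).sum(axis=1, keepdims=True))
manual_l2 = input_data / l2_norms

print(manual_l1)
print(manual_l2)
```

After normalization, the absolute values in each row of manual_l1 sum to 1, and the squares in each row of manual_l2 sum to 1.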

import numpy as np

from sklearn import preprocessing

input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Create label encoder and fit the labels

encoder = preprocessing.LabelEncoder()

encoder.fit(input_labels)

print(" Label mapping:")

for i, item in enumerate(encoder.classes_):
    print(item, '-->', i)

2e) Add the following lines to the same file. Run the file and show the printout below.

test_labels = ['green', 'red', 'black']

encoded_values = encoder.transform(test_labels)

print(" Labels =", test_labels)

print("Encoded values =", list(encoded_values))

2f) Add the following lines to the same file. Run the file and show the printout below.

encoded_values = [3, 0, 4, 1]

decoded_list = encoder.inverse_transform(encoded_values)

print(" Encoded values =", encoded_values)

print("Decoded labels =", list(decoded_list))
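The encode and decode steps can be mimicked with NumPy alone, which may help when checking the printouts. This sketch assumes the encoder assigns integers to the unique labels in sorted order; the variable names are illustrative only.

```python
import numpy as np

input_labels = ['red', 'black', 'red', 'green', 'black', 'yellow', 'white']

# Sorted unique labels play the role of the encoder's class list.
classes = np.unique(input_labels)

# Encode: look up each label's position in the sorted class list.
test_labels = ['green', 'red', 'black']
manual_encoded = np.searchsorted(classes, test_labels)

# Decode: index back into the class list.
manual_decoded = classes[[3, 0, 4, 1]]

print(list(classes))
print(list(manual_encoded))
print(list(manual_decoded))
```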
