Question:

Q1: Homework Consolidation (200 pts)

Write a program to consolidate ALL your homeworks done to date (leave out the LinkedIn one). In essence, you should have 10 homeworks. The user should be able to enter the homework number, and the homework runs as it is expected to. Once the homework run is complete, we should come back to the original menu, where the user can select another homework (or type 0 to exit).

Note: Do error handling in menu inputs, e.g. not allowing any input other than 0-10.

Sample Menu

1: BMI
2: While Question String
3: FindRemainder Function
4: Lists with 0
5: First Second File to Third
6: Dictionary List
7: Pandas Basics
8: Stats & Pandas
9: Data Visualization
10: Address Class

Please Enter Homework Number (and 0 to exit) _
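One possible shape for the Q1 menu loop is sketched below. It is only a sketch under stated assumptions: the function names (`make_placeholder`, `parse_choice`, `show_menu`, `main`) are invented for illustration, and every homework is represented by a placeholder that just prints its title, since the actual homework code is not part of the question. The titles come from the sample menu.

```python
def make_placeholder(title):
    """Return a stand-in for a homework that has not been wired in yet."""
    def run():
        print(f"[{title}] would run here.")
    return run


TITLES = [
    "BMI", "While Question String", "FindRemainder Function", "Lists with 0",
    "First Second File to Third", "Dictionary List", "Pandas Basics",
    "Stats & Pandas", "Data Visualization", "Address Class",
]

# Menu number -> (title, callable). Swap each placeholder for the real homework function.
HOMEWORKS = {n: (t, make_placeholder(t)) for n, t in enumerate(TITLES, start=1)}


def parse_choice(raw, low=0, high=10):
    """Return the choice as an int, or None if it is not a whole number in [low, high]."""
    raw = raw.strip()
    if not (raw.isascii() and raw.isdigit()):
        return None
    choice = int(raw)
    return choice if low <= choice <= high else None


def show_menu():
    print("\nHomework Menu")
    for number, (title, _) in HOMEWORKS.items():
        print(f"{number:>2}: {title}")


def main(read=input):
    """Show the menu until the user enters 0. `read` is injectable for testing."""
    while True:
        show_menu()
        choice = parse_choice(read("Please Enter Homework Number (and 0 to exit): "))
        if choice is None:
            print("Invalid input. Please enter a whole number from 0 to 10.")
        elif choice == 0:
            print("Goodbye!")
            break
        else:
            title, run = HOMEWORKS[choice]
            print(f"\n--- Running homework {choice}: {title} ---")
            run()
```

Calling `main()` starts the loop. Keeping the validation in `parse_choice` separate from the loop means the 0-10 rule can be checked without typing at a prompt, and the dictionary lookup avoids a long `if`/`elif` chain.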

Q2: Train Dataset (250 pts)

You have to devise an object-oriented solution for this problem, i.e. you MUST create a CLASS and solve the problem by calling the class methods.

The train dataset contains data regarding a train crash. It has ~891 records. However, not all records are complete, i.e. some have missing data. In this data set, our outcome of interest is the survived column, which indicates whether an individual survived (1) or died (0) after the train crash.
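As a rough illustration of the class-based approach, the sketch below wraps a pandas DataFrame in a class and exposes cleaning as a method. The class name `TrainData`, its method names, and the `age` and `station` columns are assumptions made for this example; only the `survived` column is named in the assignment. The sample rows are made up, and the real records would be loaded from the dataset file instead.

```python
import pandas as pd


class TrainData:
    """Hypothetical wrapper around the train-crash records."""

    def __init__(self, frame):
        self.frame = frame.copy()  # work on a copy so the caller's data is untouched

    def clean(self, required=("survived",)):
        """Drop duplicate rows and rows with a missing value in any required column."""
        self.frame = (
            self.frame.drop_duplicates()
            .dropna(subset=list(required))
            .reset_index(drop=True)
        )
        return self  # returning self allows chained calls

    def record_count(self):
        return len(self.frame)

    def busiest_station(self, column="station"):
        """Most frequent value in the (assumed) boarding-station column."""
        return self.frame[column].mode().iloc[0]


# Made-up sample rows; the column names other than `survived` are assumptions.
sample = pd.DataFrame({
    "survived": [1, 0, None, 1, 1],
    "age": [22.0, 38.0, 26.0, None, 22.0],
    "station": ["S", "C", "S", "S", "S"],
})

data = TrainData(sample).clean()
print("Records after cleaning:", data.record_count())
print("Station with most passengers:", data.busiest_station())
```

Which columns count as required is a judgment call: passing `required=("survived", "age")` would also drop rows with a missing age, at the cost of fewer records.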

1. Clean the data and remove any unwanted records. How many records do you have now? (20 points)
2. The train picked most passengers from which station? (20 points)
3. Do some basic data exploration (i.e. using commands such as head(), info(), describe(), nunique(), etc.). Which variables will you NOT select? (20 points)
4. Are there any outliers in the data? If yes, treat them. (15 points)
5. Partition the data into a training set (with 70% of the observations) and a testing set (with 30% of the observations), using the random state of 12345 for cross validation. (15 points)
6. On the partitioned data, build the best KNN model. Show the accuracy numbers. (Hint: What is the best value of k? How do you decide the best k?) (30 points)
7. On the partitioned data, build the best logistic regression model. Show the accuracy numbers. (30 points)
8. On the partitioned data, build the decision tree. Show the accuracy numbers. What tree depth did you choose, i.e. which one is ideal and why? (30 points)
9. Based on the results of k-nearest neighbor and logistic regression, what is the best model to classify the data? Provide explanation to support your argument. (20 points)
10. Show some interesting graphs of the data, i.e., that can describe the original data. (50 points)
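Items 5 to 8 above could be approached along the lines of the sketch below, which uses scikit-learn. Several things here are assumptions rather than part of the assignment: the class name `TrainModels`, the `age` and `fare` feature columns, the synthetic data standing in for the real records, and the choice to pick k by 5-fold cross-validation on the training split. The 70/30 split and the random state of 12345 come from the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


class TrainModels:
    """Hypothetical helper: 70/30 split plus accuracy checks for three models."""

    def __init__(self, frame, target="survived", seed=12345):
        self.seed = seed
        features = frame.drop(columns=[target])  # assumes cleaned, numeric features
        self.X_train, self.X_test, self.y_train, self.y_test = train_test_split(
            features, frame[target], test_size=0.30, random_state=seed
        )

    def best_knn(self, k_values=range(1, 16, 2)):
        """Choose k by 5-fold CV on the training split, then score once on the test split."""
        cv_means = {}
        for k in k_values:
            pipe = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
            cv_means[k] = cross_val_score(pipe, self.X_train, self.y_train, cv=5).mean()
        best_k = max(cv_means, key=cv_means.get)
        final = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=best_k))
        final.fit(self.X_train, self.y_train)
        return best_k, final.score(self.X_test, self.y_test)

    def logistic_accuracy(self):
        """Test accuracy of a scaled logistic regression."""
        pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        return pipe.fit(self.X_train, self.y_train).score(self.X_test, self.y_test)

    def tree_accuracy_by_depth(self, depths=range(1, 8)):
        """Test accuracy for each max_depth, to compare shallow and deep trees."""
        scores = {}
        for depth in depths:
            tree = DecisionTreeClassifier(max_depth=depth, random_state=self.seed)
            scores[depth] = tree.fit(self.X_train, self.y_train).score(self.X_test, self.y_test)
        return scores


# Synthetic stand-in data; the real features would come from the cleaned dataset.
rng = np.random.default_rng(0)
demo = pd.DataFrame({"age": rng.normal(30, 10, 200), "fare": rng.normal(50, 20, 200)})
demo["survived"] = (demo["fare"] + rng.normal(0, 10, 200) > 50).astype(int)

models = TrainModels(demo)
best_k, knn_acc = models.best_knn()
print(f"KNN: best k = {best_k}, test accuracy = {knn_acc:.3f}")
print(f"Logistic regression test accuracy = {models.logistic_accuracy():.3f}")
print("Decision tree accuracy by depth:", models.tree_accuracy_by_depth())
```

KNN and logistic regression are sensitive to feature scale, so both are wrapped in a pipeline with `StandardScaler`; the decision tree is not. Comparing tree depths on the test split, as done here for brevity, gives a somewhat optimistic picture, so cross-validating the depth in the same way as k may be the safer choice.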

Requirements

Make all the "best programming practice" decisions, e.g. how to show the output, what prompts to display, how to ask for input etc. You are the developer/engineer, it is YOUR decision... YOUR job. If the program is not presented nicely, you will lose points.

First, all that is being asked should be done. Second, the displays should be intuitive, self-explanatory and nicely put.

Do NOT assume that the user knows ANYTHING.

Write the program such that any new person who sits in front of the terminal can start playing with it (i.e., the commands, displays etc. are adequate).

Submission and Demo

MULTIPLE file submissions are allowed. Anything you submit is kept.

You will submit the following:

.ipynb files for all questions, clearly labeled. (1_.ipynb and Q2_Train.ipynb)

You are asked to demo the project IN-class or Zoom. Details in D2/.

The Demo files (i.e. files you submit) should NOT be commented.

Python code for this with the detail
