Question: For a given dataset in which class labels may be absent, choose an appropriate target variable and discretize it for classification. The dataset is available here: https://drive.google.com/file/d/1gu1ooPQzikhsIQNnTLPu_AMGcLUcP1o-/view
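One way to approach the discretization step is quantile binning of a continuous column. The sketch below is illustrative only: the real dataset's columns are not shown here, so `feature_a`, `feature_b`, `target_value`, and `target_class` are hypothetical names on synthetic data, and three equal-frequency bins are an assumed choice, not a requirement of the assignment.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the assignment dataset (real column names are unknown).
rng = np.random.default_rng(seed=0)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.uniform(0, 10, size=n),
})

# Hypothetical continuous column chosen as the target variable.
df["target_value"] = (
    3.0 * df["feature_a"] + 0.5 * df["feature_b"] + rng.normal(scale=0.5, size=n)
)

# Quantile binning: each class receives roughly the same number of rows,
# which avoids creating an imbalanced classification problem by construction.
df["target_class"] = pd.qcut(
    df["target_value"], q=3, labels=["low", "medium", "high"]
)

class_counts = df["target_class"].value_counts()
print(class_counts)
```

Equal-width bins (`pd.cut`) are an alternative when the bin edges should carry a domain meaning; they can, however, produce very uneven class sizes on skewed data.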

PART A: (5 marks)

Research: Select a research paper of your choice. Attach the chosen paper along with the assignment submission.

Write a synopsis that addresses the following points:

3. Paper Contribution

4. Data Pre-processing

5. Machine Learning Activity

6. Result analysis with metrics used from paper

7. Exploratory Data Analysis / Visualization

PART B: (15 marks) Dataset-based Implementation

Refer to the dataset mapped against your group. Use Python-based APIs and perform the following three classes of activities.

EDA

1. Perform Exploratory Data Analysis to gather insights from the dataset. Write your inferences from the visualizations (minimum 3). [3]
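A minimal sketch of the three required visualizations could look like the following. It assumes synthetic data with hypothetical column names (`feature_a`, `feature_b`, `target_value`, `target_class`), since the actual dataset is not reproduced here; the choice of a histogram, a scatter plot, and a box plot is one reasonable combination, not the only valid one.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the assignment dataset (column names are hypothetical).
rng = np.random.default_rng(seed=1)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.uniform(0, 10, size=n),
})
df["target_value"] = (
    3.0 * df["feature_a"] + 0.5 * df["feature_b"] + rng.normal(scale=0.5, size=n)
)
df["target_class"] = pd.qcut(
    df["target_value"], q=3, labels=["low", "medium", "high"]
)

# Numeric summaries that back up the written inferences.
summary = df.describe()
corr = df[["feature_a", "feature_b", "target_value"]].corr()

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# 1. Distribution of the candidate target: shows skew and outliers.
axes[0].hist(df["target_value"], bins=30)
axes[0].set_title("Distribution of target_value")

# 2. Feature versus target: shows the strength and shape of the relationship.
axes[1].scatter(df["feature_a"], df["target_value"], s=10)
axes[1].set_xlabel("feature_a")
axes[1].set_ylabel("target_value")
axes[1].set_title("feature_a vs target_value")

# 3. Feature spread per discretized class: shows how well a feature separates classes.
categories = list(df["target_class"].cat.categories)
groups = [
    df.loc[df["target_class"] == c, "feature_b"].to_numpy() for c in categories
]
axes[2].boxplot(groups)
axes[2].set_xticklabels(categories)
axes[2].set_title("feature_b by target_class")

fig.tight_layout()
print(corr.round(2))
```

Each plot should be followed in the notebook by a short written inference, for example which features correlate most strongly with the target and whether any classes overlap.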

Classification

Use any one of Logistic Regression / SVM / Decision Tree / Naive Bayes / KNN / ANN. Justify your design choices at each step: write the justification as a markdown cell in the Jupyter notebook at the beginning of each subsection.

1. Perform and explain necessary pre-processing / feature engineering on this dataset [0.5]

2. Perform the Machine Learning activity. Explain the choice of target attribute, classification type, model selected with reason [1.5]

3. Quantify and explain the quality of your ML model. Explain the choice of evaluation metric [1.5]

4. Your observation about the results (Hint: comment on the problem statement and conclude the effectiveness of the machine learning activity) [0.5]
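The four classification steps above can be sketched as follows. This is an illustration on synthetic data with hypothetical column names; logistic regression, a stratified 75/25 split, and macro-averaged F1 are assumed choices that would each need their own justification for the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the assignment dataset (column names are hypothetical).
rng = np.random.default_rng(seed=2)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.uniform(0, 10, size=n),
})
df["target_value"] = (
    3.0 * df["feature_a"] + 0.5 * df["feature_b"] + rng.normal(scale=0.5, size=n)
)
labels = ["low", "medium", "high"]
df["target_class"] = pd.qcut(df["target_value"], q=3, labels=labels).astype(str)

X = df[["feature_a", "feature_b"]]
y = df["target_class"]

# Step 1: pre-processing. Stratify so each split keeps the class proportions;
# scaling happens inside the pipeline so it is fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 2: multi-class classification with logistic regression.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Step 3: evaluation. Macro F1 weights every class equally, which matters
# when the discretization yields classes of different sizes.
accuracy = accuracy_score(y_test, y_pred)
macro_f1 = f1_score(y_test, y_pred, average="macro")
cm = confusion_matrix(y_test, y_pred, labels=labels)

print(f"accuracy={accuracy:.3f}, macro F1={macro_f1:.3f}")
print(cm)
```

The confusion matrix is useful for step 4: with ordered classes such as these, most errors are expected between adjacent bins, which is worth commenting on in the observations.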

Regression

Use any one of Linear Regression (Gradient / Stochastic / Mini-Batch), linear basis models, KNN, locally weighted regression, or any of the regularization techniques. Justify your design choices at each step: write the justification as a markdown cell in the Jupyter notebook at the beginning of each subsection.

1. Perform and explain necessary pre-processing / feature engineering on this dataset [0.5]

2. Perform the Machine Learning activity. Explain Attributes of interest, Regularization type with reason, model selected with reason [1.5]

3. Quantify and explain the quality of your ML model. Explain the choice of evaluation metric [1.5]

4. Your observation about the results (Hint: comment on the problem statement and conclude the effectiveness of the machine learning activity) [0.5]
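For the regression task, a comparable sketch predicts the continuous target directly instead of its discretized version. Ridge (L2) regularization with `alpha=1.0`, RMSE, and R² are assumptions made for illustration, and the data and column names are again synthetic and hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the assignment dataset (column names are hypothetical).
rng = np.random.default_rng(seed=3)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.uniform(0, 10, size=n),
})
df["target_value"] = (
    3.0 * df["feature_a"] + 0.5 * df["feature_b"] + rng.normal(scale=0.5, size=n)
)

X = df[["feature_a", "feature_b"]]
y = df["target_value"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scaling matters for Ridge: the L2 penalty acts on coefficient size,
# so features on larger scales would otherwise be penalized unevenly.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# RMSE is in the units of the target; R^2 is the share of variance explained.
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
r2 = r2_score(y_test, y_pred)

print(f"RMSE={rmse:.3f}, R^2={r2:.3f}")
```

Lasso (L1) would be the usual alternative when some attributes are expected to be irrelevant, since it can shrink their coefficients to exactly zero.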

Ensemble ML

Justify your design choices at each step: write the justification as a markdown cell in the Jupyter notebook at the beginning of each subsection.

1. Perform and explain necessary pre-processing / feature engineering on this dataset [0.5]

2. Perform the Machine Learning activity. Explain Attributes of interest, base classifier chosen with reason, model selected with reason [1.5]

3. Quantify and explain the quality of your ML model. Explain the choice of evaluation metric [1.5]

4. Your observation about the results (Hint: comment on the problem statement and conclude the effectiveness of the machine learning activity) [0.5]
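One possible shape for the ensemble task is bagging with decision trees as the base classifier, compared against a single tree. This is a sketch under assumptions: the data is synthetic, the column names are hypothetical, and bagging with 50 trees is only one of several ensemble options (boosting or a random forest would also fit the brief).

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the assignment dataset (column names are hypothetical).
rng = np.random.default_rng(seed=4)
n = 300
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.uniform(0, 10, size=n),
})
df["target_value"] = (
    3.0 * df["feature_a"] + 0.5 * df["feature_b"] + rng.normal(scale=0.5, size=n)
)
df["target_class"] = pd.qcut(
    df["target_value"], q=3, labels=["low", "medium", "high"]
).astype(str)

X = df[["feature_a", "feature_b"]]
y = df["target_class"]

# Base classifier: a fully grown decision tree has low bias but high variance,
# which is the kind of learner that bagging is designed to stabilize.
single_tree = DecisionTreeClassifier(random_state=0)

# The base estimator is passed positionally because its keyword name
# differs between scikit-learn versions.
bagged_trees = BaggingClassifier(
    DecisionTreeClassifier(random_state=0), n_estimators=50, random_state=0
)

# 5-fold cross-validated accuracy gives a mean and a spread for each model.
tree_scores = cross_val_score(single_tree, X, y, cv=5)
bagging_scores = cross_val_score(bagged_trees, X, y, cv=5)

print(f"single tree:  {tree_scores.mean():.3f} +/- {tree_scores.std():.3f}")
print(f"bagged trees: {bagging_scores.mean():.3f} +/- {bagging_scores.std():.3f}")
```

Reporting the baseline next to the ensemble makes the observation in step 4 concrete: the ensemble is only worth its extra cost if it improves on, or is more stable than, the single base classifier.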

