Question:

You are working as a data analyst in a burgeoning mobile company that aims to compete with giants like Apple and Samsung. A key challenge in this competitive market is pricing mobile phones appropriately. To tackle this, you have been given a dataset containing sales data for mobile phones from various companies. Analyze the phone specifications, such as battery power, RAM, and internal memory, to determine the price range. Your goal is not to predict the exact selling price but to classify the phones into price ranges that indicate how high the price is.

Drawing on the knowledge you have accumulated, analyze the data provided and write a report that is both descriptive and predictive (classification). Start with a concise descriptive analysis of the data. Its purpose is to understand the data provided and to achieve greater accuracy in the predictive analysis phase. Next, clean the data using the techniques we have covered.

Use your creativity and knowledge to identify the most accurate classification model for predicting the target variable. Use the decision tree (DT), k-nearest neighbors (KNN), and random forest (RF) algorithms in your prediction and compare the results. You also have several techniques at your disposal to improve the model's accuracy, including feature selection, dummy variables, and data normalization.

You should also explore some additional questions. Can you use the concepts you learned about piecewise regression and model trees to improve classification accuracy? Among KNN, DT, and RF, which model responds better to piecewise classification and model trees? Explain the results in the paper. What is the most important feature in predicting the target variable? Explain that result as well.

When setting up your classification model, ensure that you shuffle the data first, then divide it into training data (80%) and testing data (20%). Train your classification model with the training data and subsequently evaluate it using the testing data. It is essential to assess your classification model with the evaluation metric discussed in class. Additionally, work to prevent overfitting in your analysis by comparing the prediction accuracy on both the training and testing data sets.
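
The setup described above (shuffle, 80/20 split, three classifiers, train versus test comparison) could be sketched roughly as follows. This is only an illustration under stated assumptions: the dataset is not shown in the question, so the code generates synthetic data, and the column names `battery_power`, `ram`, and `int_memory` as well as the four-level target are hypothetical. Accuracy stands in for "the evaluation metric discussed in class", which the question does not name. The hyperparameters are arbitrary starting points, not tuned values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real dataset (assumption: the actual file and
# its column names are not given in the question).
rng = np.random.default_rng(0)
n = 1000
feature_names = ["battery_power", "ram", "int_memory"]  # hypothetical names
X = np.column_stack([
    rng.uniform(500, 2000, n),   # battery_power
    rng.uniform(256, 4000, n),   # ram
    rng.uniform(2, 64, n),       # int_memory
])
# Hypothetical target: four price ranges (0..3) driven mostly by RAM.
score = X[:, 1] + 0.2 * X[:, 0] + rng.normal(0, 150, n)
y = np.digitize(score, np.quantile(score, [0.25, 0.5, 0.75]))

# Shuffle, then split 80% training / 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y, random_state=42
)

models = {
    "DT": DecisionTreeClassifier(max_depth=5, random_state=42),
    # KNN is distance based, so the features are normalized first.
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=7)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=42),
}

# Fit each model and compare training and testing accuracy;
# a large gap between the two suggests overfitting.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, model.predict(X_test))
    results[name] = (train_acc, test_acc)
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f} "
          f"gap={train_acc - test_acc:.3f}")

# Impurity-based feature importances from the random forest.
importances = dict(zip(feature_names, models["RF"].feature_importances_))
top_feature = max(importances, key=importances.get)
print("Most important feature (RF):", top_feature)
```

On real data, `X` and `y` would come from the provided file instead of the generator. The importance ranking reported here reflects only how the synthetic target was constructed, so it says nothing about the actual dataset.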
