Please use R Programming for the following question:

Question 1:

a. Visualize the distributions of the variables in this data. You can choose bar graphs, histograms, and density plots. Make appropriate choices given each type of variable, and be careful when selecting parameters like the number of bins for the histograms. Note there are some numerical variables and some categorical ones. The ones labeled 'bool' are Boolean variables, meaning they are only true or false and are thus a special type of categorical variable. Checking all the distributions with visualization and summary statistics is a typical step when beginning to work with new data.

b. Now apply normalization to some of these numerical distributions. Specifically, choose to apply z-score to one, min-max to another, and decimal scaling to a third. Explain your choices of which normalization applies to which variable in terms of what the variable means, what distribution it starts with, and how the normalization will affect it.

c. Visualize the new distributions for the variables that have been normalized. What has changed from the previous visualization in step a?

d. For a variable already created, create a new variable called cont1_bins that is a binned version of that variable. This cont1_bins will have a new set of values like low, medium, high. Low ranges from -Inf to 25, Medium ranges from 25 to 40, and High ranges from 40 to Inf. Show this binned version cont1_bins along with the other data from the dataset. Assign numerical values to the bins using the bin mean and show the result.

e. Building on (d), use cont1_bins to create a smoothed version of cont1 and display the new distribution. How is this new distribution different from the previous distribution for cont1?

Question 2:

a. There are some variables we will not use, so first remove films, vehicles, starships and name. Also remove rows with missing values.

b. Several variables are categorical. We will use dummy variables to make it possible for SVM to use these. Show the resulting head of the dummy variables including the target column gender.

c. Use SVM to predict gender and report the accuracy. First create the dataset for 56-96 training and 34-96 testing and a seed of 514 for the random partitioning.

d. Given that we have so many variables, it makes sense to consider using PCA. Run PCA on the data and determine an appropriate number of components to use from the graph. Create a reduced version of the data with that number of principal components by first finding and removing near zero variance predictors using the following code: ... nearZeroVar(numeric train) filtered ...
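For Question 1 parts b, d, and e, the three normalizations and the bin-mean smoothing can be sketched in base R as below. This is only an illustrative sketch, not the expert answer: the data file is not shown here, so the values in cont1 are made-up placeholders, and the names zscore, minmax, decimal, j, and cont1_smooth are hypothetical. The sketch also applies all three normalizations to the same vector for brevity, whereas the question asks for a different variable for each.

```r
# Made-up placeholder values standing in for the numeric column cont1;
# the real data set is not available here.
cont1 <- c(12, 18, 24, 27, 33, 39, 41, 48, 55, 60)

# z-score: subtract the mean, divide by the sample standard deviation.
zscore <- (cont1 - mean(cont1)) / sd(cont1)

# min-max: rescale linearly so the minimum maps to 0 and the maximum to 1.
minmax <- (cont1 - min(cont1)) / (max(cont1) - min(cont1))

# Decimal scaling: divide by 10^j, where j is the smallest integer
# that makes max(|x|) / 10^j strictly less than 1 (assumes max(|x|) > 0).
j <- floor(log10(max(abs(cont1)))) + 1
decimal <- cont1 / 10^j

# Binning with the cut points from part d. cut() uses right-closed
# intervals by default: (-Inf, 25], (25, 40], (40, Inf].
cont1_bins <- cut(cont1,
                  breaks = c(-Inf, 25, 40, Inf),
                  labels = c("low", "medium", "high"))

# Smoothing by bin mean: replace every value with the mean of its bin.
cont1_smooth <- ave(cont1, cont1_bins, FUN = mean)

# Show the original, binned, and smoothed values side by side.
print(data.frame(cont1, cont1_bins, cont1_smooth))
```

Plotting hist(cont1) next to hist(cont1_smooth) is one way to see the effect asked about in part e: the smoothed variable takes only one distinct value per bin.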

