Please use the CSV file below and answer the questions using RStudio.

https://github.com/wampeh1/Ecog314_Spring2017/blob/master/Lecture5/lending_club_loans.csv

Part a.
# Right now the term variable is a character taking on the values " 36 months" and " 60 months".
# Using the mutate(), substr(), and as.numeric() functions, please change this variable so that it
# takes on the values 36 or 60. Be sure to reassign the results to lending_club_small.
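One possible approach to Part a is sketched below. It assumes the dplyr package is installed and uses a small made-up data frame in place of the real CSV, so only the `term` column name comes from the question. It also assumes each value keeps its leading space, which puts the digits in character positions 2 and 3.

```r
library(dplyr)

# Toy stand-in for lending_club_small (assumption: the real data is read from the CSV).
lending_club_small <- data.frame(
  term = c(" 36 months", " 60 months", " 36 months"),
  stringsAsFactors = FALSE
)

# Each value starts with a space, so the two digits sit at positions 2 and 3.
lending_club_small <- lending_club_small %>%
  mutate(term = as.numeric(substr(term, 2, 3)))

print(lending_club_small$term)
```

If the import step trimmed the leading space, the digits would sit at positions 1 and 2 instead, so it is worth checking `unique(lending_club_small$term)` before choosing the substr() bounds.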

lending_club_small <-

#######
# Part b
# The int_rate variable is a character variable and contains an annoying "%". In order to learn more about this variable
# we need to convert it into a numeric. A useful function in this case is gsub().
# The gsub() function replaces all matches of a pattern in a larger string. For example, if my_string = "my_name_is_shifrah"
# then gsub("_", " ", my_string) will return "my name is shifrah". Use this function to turn the int_rate variable from its current format
# ("10.65%") into a decimal (0.1065).
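A minimal sketch of the gsub() idea from Part b, again assuming dplyr is installed and using made-up values in place of the real file. The "%" is stripped, the remainder is converted to a number, and the result is divided by 100 to get a decimal.

```r
library(dplyr)

# Toy stand-in for lending_club_small (assumption: values look like "10.65%").
lending_club_small <- data.frame(
  int_rate = c("10.65%", "15.27%", "7.90%"),
  stringsAsFactors = FALSE
)

# Remove the literal "%" sign, convert to numeric, then rescale to a decimal.
lending_club_small <- lending_club_small %>%
  mutate(int_rate = as.numeric(gsub("%", "", int_rate, fixed = TRUE)) / 100)

print(lending_club_small$int_rate)
```

Passing `fixed = TRUE` treats "%" as a plain string and not as a regular expression. It is not strictly needed here, but it avoids surprises with characters that do have a special meaning in patterns.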

lending_club_small <-

#########
# Part c
# In order to recode the emp_length variable we will first need to find the unique values in the
# dataset and then decide how we want to encode them.

# i. Find the unique values of the emp_length variable

# ii. Create a new variable called emp_term in your lending_club_small data frame which has numeric values for the
# number of years worked. Treat "< 1 year" as 0 and "10+ years" as 10. Assign your results to lending_club_small.
# HINT: you will need to mutate the variable emp_term multiple times using the ifelse() function.
# You can mutate the same variable multiple times within a single mutate() call.
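Parts c.i and c.ii might be approached as in the sketch below, assuming dplyr is installed. The toy values are an assumption about what emp_length contains. Only "< 1 year" and "10+ years" are named in the question, and the "n/a" entry is included purely to show how an unparseable value ends up as NA.

```r
library(dplyr)

# Toy stand-in for lending_club_small (assumed emp_length values).
lending_club_small <- data.frame(
  emp_length = c("< 1 year", "1 year", "3 years", "10+ years", "n/a"),
  stringsAsFactors = FALSE
)

# i. Inspect the distinct values before deciding how to encode them.
print(unique(lending_club_small$emp_length))

# ii. Recode step by step, mutating emp_term several times in one mutate() call.
lending_club_small <- lending_club_small %>%
  mutate(
    emp_term = ifelse(emp_length == "< 1 year", "0", emp_length),
    emp_term = ifelse(emp_term == "10+ years", "10", emp_term),
    emp_term = gsub(" years?", "", emp_term),          # drop " year" / " years"
    emp_term = suppressWarnings(as.numeric(emp_term))  # anything non-numeric becomes NA
  )

print(lending_club_small$emp_term)
```

The two special cases are handled first with ifelse(), as the hint suggests. The remaining values only differ by the trailing " year" or " years", so a single gsub() call covers them.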

##########
# Part d
# i. Check the data dictionary for a description of the total_acc variable.
# What does the variable show?

## ANSWER HERE

# ii. Do you think it makes sense to leave it as an integer value,
# or would it be more useful to bin the data into character-based buckets
# such as small, medium, large?
# Use the hist() command to plot a basic histogram of the total_acc variable.

# ANSWER HERE

# iii. Use the cut() function to create a new column, crdt_lines. You will want to look up the syntax for
# this function, but generally cut() takes three arguments: a vector that you want to cut, a vector of the values to cut on,
# and a vector of labels for each of the cuts.
# In this case the values should be "small" if total_acc is in [0, 10], "medium" if in (10, 25], and
# "large" if in (25, 100]. Be sure to assign your data frame to lending_club_small.
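The binning in Part d.iii could be sketched as follows, assuming dplyr is installed and using made-up total_acc values chosen to sit on the bucket boundaries. By default cut() builds right-closed intervals such as (10, 25], and `include.lowest = TRUE` closes the first interval on the left so that 0 falls into "small".

```r
library(dplyr)

# Toy stand-in for lending_club_small (values picked to test the boundaries).
lending_club_small <- data.frame(total_acc = c(0, 10, 11, 25, 26, 100))

# Breaks 0, 10, 25, 100 give [0, 10], (10, 25], (25, 100].
lending_club_small <- lending_club_small %>%
  mutate(
    crdt_lines = cut(
      total_acc,
      breaks = c(0, 10, 25, 100),
      labels = c("small", "medium", "large"),
      include.lowest = TRUE
    )
  )

print(lending_club_small)
```

Any total_acc value above 100 would come back as NA with these breaks, so it may be worth checking `max(lending_club_small$total_acc)` on the real data first.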

############
# Part e
# We have the loan amount, the interest rate, and the term.
# Use those to create a column, ttl_val, showing the total amount paid, A,
# with the formula A = P(1 + r/t)^(n*t). Assume monthly payments and that interest compounds monthly.
# P = principal, r = rate, t = number of periods per year, n = number of years
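A hedged sketch of the Part e calculation, assuming dplyr is installed. It assumes that term has already been converted to months (Part a) and int_rate to a decimal (Part b). The column name `loan_amnt` is an assumption, since the question does not name the loan amount column, and `n_years` and `periods_per_year` are helper names introduced here for illustration.

```r
library(dplyr)

# Toy stand-in for lending_club_small after Parts a and b.
lending_club_small <- data.frame(
  loan_amnt = c(5000, 10000),  # assumed column name for the principal P
  int_rate  = c(0.1065, 0.12), # r, as a decimal
  term      = c(36, 60)        # loan length in months
)

periods_per_year <- 12  # t: monthly compounding

# A = P * (1 + r/t)^(n*t), with n = term / 12 years.
lending_club_small <- lending_club_small %>%
  mutate(
    n_years = term / 12,
    ttl_val = loan_amnt * (1 + int_rate / periods_per_year)^(n_years * periods_per_year)
  )

print(lending_club_small)
```

Because n = term / 12 and t = 12, the exponent n*t is simply the term in months. The sketch follows the formula exactly as stated in the question, which treats the loan as a lump sum growing with compound interest.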
