Question:


You are applying for a position on the data science team of the USDA, and you are given data associated with determining the appropriate parasite treatment of canines. The suggested treatment options are determined based on a logistic regression model that predicts whether the canine is infected with a parasite. The data is available at https://data.world/ehales/grls-parasitestudy/workspace/file?filename=CBC_data.csv. Log in using your University Google account to access the data and the description, which includes a paper on the study (you don't necessarily need to read the paper to solve this problem). Your target variable y is the column titled parasite_status.

Question 3 Training - Loss function (5 points) Write the expression of the loss as a function of w that makes sense for you to use in this problem. \( L_{CE} = \) NOTE: The loss will include this function: \( \sigma(a) = \frac{1}{1+e^{-a}} \)

Question 4 Training - Gradient (5 points) Write the expression of the gradient of the loss with respect to the parameters - show all your work. \( \nabla_{w} L_{CE} = \)

Question 5 - Imbalanced dataset (10 points) You are now told that in the dataset \( p(y=0) \gg p(y=1) \). Can you comment on whether the accuracy of Logistic Regression will be affected by such imbalance?
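One way to sanity-check answers to Questions 3 to 5 is a small numerical sketch. It assumes the usual mean binary cross-entropy, \( L_{CE}(w) = -\frac{1}{N}\sum_{i}\left[y_i \log \sigma(w^{T}x_i) + (1-y_i)\log\left(1-\sigma(w^{T}x_i)\right)\right] \), whose gradient is \( \nabla_{w} L_{CE} = \frac{1}{N}X^{T}\left(\sigma(Xw)-y\right) \), with the bias absorbed into w through a column of ones. The data below is synthetic and purely illustrative: it is not the CBC data, and the names `X`, `w_true`, and `ce_loss` are hypothetical.

```python
import numpy as np

def sigmoid(a):
    # sigma(a) = 1 / (1 + exp(-a)), written via tanh to avoid overflow.
    return 0.5 * (1.0 + np.tanh(0.5 * a))

def ce_loss(w, X, y):
    # Mean binary cross-entropy. Uses -log(sigma(z)) = logaddexp(0, -z)
    # and -log(1 - sigma(z)) = logaddexp(0, z) for numerical stability.
    z = X @ w
    return np.mean(y * np.logaddexp(0.0, -z) + (1.0 - y) * np.logaddexp(0.0, z))

def ce_grad(w, X, y):
    # Analytic gradient: (1/N) * X^T (sigma(Xw) - y).
    return X.T @ (sigmoid(X @ w) - y) / X.shape[0]

# Synthetic, deliberately imbalanced data (assumption: NOT the real dataset).
rng = np.random.default_rng(0)
N = 2000
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # bias column + one feature
w_true = np.array([-3.0, 1.0])                          # negative bias makes y=1 rare
y = (rng.random(N) < sigmoid(X @ w_true)).astype(float)

# Check the analytic gradient against central finite differences.
w = rng.normal(size=2)
eps = 1e-6
grad_numeric = np.array([
    (ce_loss(w + eps * e, X, y) - ce_loss(w - eps * e, X, y)) / (2 * eps)
    for e in np.eye(2)
])
grad_analytic = ce_grad(w, X, y)
print("max gradient error:", np.max(np.abs(grad_analytic - grad_numeric)))

# Imbalance: a classifier that always predicts y=0 already scores high accuracy
# while detecting no positives, so accuracy alone is a misleading metric here.
always_zero = np.zeros(N)
majority_acc = np.mean(always_zero == y)
positives_found = np.sum((always_zero == 1) & (y == 1))
print("always-0 accuracy:", majority_acc, "positives found:", positives_found)
```

The last two lines illustrate the point behind Question 5: when \( p(y=0) \gg p(y=1) \), accuracy can stay high even if the minority class is never predicted, so metrics such as recall or precision on the positive class are worth reporting alongside it.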
