Question: We have so far been concerned with predicting numerical quantities, like salaries. Now suppose we want to predict which dining option on campus a UCSD

a UCSD student will select for lunch (e.g., Burger King, Tahini, FanFan,

We have so far been concerned with predicting numerical quantities, like salaries. Now suppose we want to predict which dining option on campus a UCSD student will select for lunch (e.g., Burger King, Tahini, FanFan, Dirty Birds, Taco Villa, etc.). Predicting a discrete category (as opposed to a number) is an important machine learning task called classification. We can use empirical risk minimization to make classifications, too. Suppose we have gathered a data set of n lunch preferences of students over the past week: Burger King Dirty Birds Burger King Tahini Taco Villa Burger King Dirty Birds Taco Villa The first step is to choose a number which will uniquely represent each option. For instance: Dirty Birds 1 Burger King 2 Tahini 3 Fan-Fan 4 Taco Villa 5 We then map each instance of a dining option to its corresponding number, giving us a new data set of numbers. For instance, the data above becomes: We then map each instance of a dining option to its corresponding number, giving us a new data set of numbers. For instance, the data above becomes: 21235215 a) 6. Now that we have converted the data to a list of numbers, we can make a prediction by minimizing the mean absolute loss. Explain why this is not a good idea. b) 66 Let's use the zero-one loss, which is defined as follows: L01(h,y)={0,1,h=yh=y As usual, define the risk to be the average loss: R01(h)=n1i=1nL01(h,y) Notice that R01(h) can be interpreted as the misclassification rate. That is, if R01(h)=.7, then predicting h would result in the wrong answer for 70% of the data points. Given the data set {4,2,4,1,3,4,4,3,2,5}, plot the empirical risk R01(h) for h[0,5]. Hint: the function should have point discontinuities. c) 6. Is gradient descent useful for minimizing the risk with zero-one loss? Why or why not? Make reference to your plot of the risk in your answer. Hint: the risk is indeed non-convex, but gradient descent can still be useful for minimizing non-convex functions. Is there some other reason

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

b) Let's use the zero-one loss, which is defined as follows: Lo1(h, y) = { 0, h = y, 1, h = y. As usual, define the risk to be the average loss: n 1 R01 (h) = Loi (h, y) n i=1 2 = .7, then Notice...

These two questions is what I will present, and please read through the case first then answer these two questions. Thank you! 1. Suppose you expected the dollar to appreciate in value by 10% against...

Describing Data Once we have collected data from surveys or experiments, we need to summarize and present the data in a way that will be meaningful to the reader. We will begin with graphical...

Confirming Pages C H A P T E R 19 Analyzing Information and Writing Reports Chapter Outline Using Your Time Efficiently Analyzing Data and Information for Reports Identifying the Source of the Data...

Set Student Name: 1. Describe the relationship between two variables that have a correlation coefficient value: a. Near -1 b. Near 0 c. Near 1 2. Data was collected where a weightlifter was asked to...

based on this case - i need to answer build a financial analysis to forciase what a new location should make. i already know it shkukd be regionally close but i am unclear how to build out the...

MBA ASSIGNMENT! PLEASE ANSWER THIS I REALLY NEED THIS! ITS ONE OF MY LAST ASSIGNMENTS! ALSO IF YOU USE YOUR OWN WORDS I WILL UPVOTE!!! Reading (sorry but its 29 pages and I couldn't link a pdf)...

Instuctor's Annotated Edition TENTH EDITION Understandable Statistics Concepts and Methods Charles Henry Brase Regis University Corrinne Pellillo Brase Arapahoe Community College Australia Brazil...

3 COLLEGE ALGEBRA - TRIGONOMETRY Business and Finance (MAT115) This course will start with a review of basic algebra (factoring, solving linear equations, and equalities, etc.) and proceed to a study...

Supply Chain Management Introduction Outline What is supply chain management? Significance of supply chain management. Push vs. Pull processes utdallas.edu/~metin 1 A Generic Supply Chain Sources:...

Poller Corporation (a fictional company) operates a chain of discount retail stores. The follo ing is information taken from a recent Potter annual report. Note 5: Mortgages on Property, Plant, and...

A chain of mass m = 0.80 kg and length l = 1.5 m rests on a rough-surfaced table so that one of its ends hangs over the edge. The chain starts sliding off the table all by itself provided the over-...

QUESTION 3 An investor sells a futures contract when the futures price is $ 1 , 5 0 0 . Each contract is on 1 0 0 units of an asset. The contract is closed out when the futures price is $ 1 , 5 4 0 ....

According to home and Wachowicz the Chief Financial officer should report to the Financial managerAccording to home and Wachowicz the Chief Financial officer should report to the Financial manager...

(Appendices) UNCOLLECTIBLE ACCOUNT EXPENSE: CREDIT SALES METHOD. The Glass House, a glass and china store, sells nearly half its merchandise on credit. During the past 4 years, the following data...

(Appendices) PURCHASES, TRANSPORTATION-IN, AND PURCHASE RETURNS. Alpharack Company sells a line of tennis equipment to retailers. Alpharack uses periodic inventory accounting. Alpharack engaged in...

(Appendices) Economics Select a country with an economic system that is considered communist or socialist. Conduct online research about quality of life, life expectancy rates, tax rates, income and...