Question: linear regression please help quick!!

Question 1

The following is the simple linear regression model:

$$y = \beta_0 + \beta_1 x$$

For a given set of $(x_i, y_i)$, $i = 1, \dots, k$, the following best-fit equations can be used to calculate the $\beta_0$ and $\beta_1$ values:

$$\beta_1 = \frac{\sum_{i=1}^{k} x_i y_i - k\,\bar{x}\,\bar{y}}{\sum_{i=1}^{k} x_i^2 - k\,\bar{x}^2}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x}$$

where $(x_i, y_i)$ are the observed values, $\bar{x}$ is the mean of the $x_i$, and $\bar{y}$ is the mean of the $y_i$. The corresponding line is called the line of best fit.

(a) For the 5 given points {(-2,-2), (0,0), (2,2), (-1,1), (2,-1)}, draw the points in the two-dimensional coordinate system. Assuming there is a linear relation between y and x, make a guess of the line of best fit $L_g$ in the form $y = kx + b$ for the five points. Draw the guessed best-fit line on the coordinate system as well.

(b) Manually calculate $\beta_0$ and $\beta_1$ for the linear regression model using the formula given above. Provide the line of best fit $L_c$ based on the calculated $\beta_0$ and $\beta_1$.

(c) The least-squares error is defined as:

$$E = \sum_{i=1}^{k} \varepsilon_i^2 = \sum_{i=1}^{k} (y_i - \hat{y}_i)^2$$

where $\hat{y}_i$ is the predicted value (through the best-fit line) for a given $x_i$, and $\varepsilon_i = (y_i - \hat{y}_i)$. Compute the least-squares errors for both $L_g$ and $L_c$. Compare which line has the smaller least-squares error.

(d) Obtain the linear regression model result through R to verify your calculation, showing the R code.

Question 2

Consider the training examples shown in the following table for a binary classification problem.

CID | Gender | Type | Size | Class
1 | M | Family | Small | C0
2 | M | Sports | Medium | C0
3 | M | Sports | Medium | C0
4 | M | Sports | Large | C0
5 | M | Sports | E Large | C0
6 | M | Sports | E Large | C0
7 | F | Sports | Small | C0
8 | F | Sports | Small | C0
9 | F | Sports | Medium | C0
10 | F | Luxury | Large | C0
11 | M | Family | Large | C1
12 | M | Family | E Large | C1
13 | M | Family | Medium | C1
14 | M | Luxury | E Large | C1
15 | F | Luxury | Small | C1
16 | F | Luxury | Small | C1
17 | F | Luxury | Medium | C1
18 | F | Luxury | Medium | C1
19 | F | Luxury | Medium | C1
20 | F | Luxury | Large | C1

(a) What is the overall entropy of this collection of training examples with respect to the two classes?

(b) What is the information gain if the split is based on "Gender"?

(c) What is the information gain if the split is based on "Type"?

(d) What is the information gain if the split is based on "Size"?

(e) Which attribute provides the best split if information gain is the splitting criterion?

(f) Based on the given training data set, calculate the probabilities required for a Naive Bayes classifier. Use Laplace smoothing with k=1 to smooth the following probability estimates:

P(C=C0) = ?    P(C=C1) = ?
P(G=M | C=C0) = ?    P(G=M | C=C1) = ?
P(G=F | C=C0) = ?    P(G=F | C=C1) = ?
P(T=F | C=C0) = ?    P(T=F | C=C1) = ?
P(T=S | C=C0) = ?    P(T=S | C=C1) = ?
P(T=L | C=C0) = ?    P(T=L | C=C1) = ?
P(S=S | C=C0) = ?    P(S=S | C=C1) = ?
P(S=M | C=C0) = ?    P(S=M | C=C1) = ?
P(S=L | C=C0) = ?    P(S=L | C=C1) = ?
P(S=E | C=C0) = ?    P(S=E | C=C1) = ?

(g) Use your Naive Bayes model with Laplace smoothing (k=1) to classify the following three records.

Gender | Type | Size
M | Luxury | Medium
F | Luxury | Large
F | Luxury | E Large

Question 3

Consider the data set shown below:

TID | Items bought
0001 | {P, S, T}
0003 | {P, Q, R, T}
0010 | {P, Q, S, T}
0013 | {P, R, S, T}
0022 | {Q, R, T}
0028 | {Q, S, T}
0029 | {R, S}
0030 | {P, Q, R}
0037 | {P, S, T}
0051 | {P, Q, T}

(a) Compute the support for itemsets {P}, {Q, S}, and {P, Q, S} by considering each TID as a market basket.

(b) Use the results in (a) to compute the confidence for the association rules {Q, S} → {P} and {P} → {Q, S}. Is confidence a symmetric measure?

(c) Assuming minsup=0.4, use the Apriori algorithm to generate all the frequent itemsets.

(d) Generate all the association rules for the frequent 2-itemsets obtained in (c). Calculate the confidence for the rules.

(e) For the rules obtained in (d), calculate the lift for the rules with the highest three confidences.

Question 4

Consider the following 10 points in the three-dimensional coordinate system:

Pid | x | y | z
P01 | 8 | 8 | 1
P02 | 6 | 4 | 9
P03 | 3 | 1 | 5
P04 | 6 | 3 | 2
P05 | 8 | 6 | 6
P06 | 7 | 4 | 8
P07 | 3 | 9 | 4
P08 | 5 | 7 | 9
P09 | 7 | 1 | 3
P10 | 6 | 9 | 4

Assume the points can be grouped into two clusters, and the initial clusters are (P01, P02, P03, P05) and (P04, P06, P07, P08, P09, P10). Use the K-means algorithm to perform clustering on the 10 points based on:

(a) Euclidean distance;
(b) Manhattan distance.

For two points P1: (x1, y1, z1) and P2: (x2, y2, z2), the Manhattan distance is defined as

$$d(P1, P2) = |x1 - x2| + |y1 - y2| + |z1 - z2|$$

Find the final clusters for the data points and the corresponding centroids. Show your calculations and list the clusters and centroids for each iteration.
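For Question 1(b) and (c), the closed-form estimates and the least-squares error can be checked numerically. The following is a minimal Python sketch, not the R code that part (d) asks for. The function names `fit_line` and `squared_error` are hypothetical, and the guessed line y = x used for comparison is only an illustrative assumption, since the question leaves the guess to the reader.

```python
# The five points from Question 1(a).
POINTS = [(-2, -2), (0, 0), (2, 2), (-1, 1), (2, -1)]


def fit_line(points):
    """Return (beta0, beta1) from the closed-form best-fit equations."""
    k = len(points)
    x_bar = sum(x for x, _ in points) / k
    y_bar = sum(y for _, y in points) / k
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    beta1 = (sum_xy - k * x_bar * y_bar) / (sum_xx - k * x_bar ** 2)
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


def squared_error(points, intercept, slope):
    """Sum of squared residuals (y_i - y_hat_i)^2 for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)


beta0, beta1 = fit_line(POINTS)
error_calculated = squared_error(POINTS, beta0, beta1)
# Assumed guess for illustration only: y = 1 * x + 0.
error_guessed = squared_error(POINTS, 0.0, 1.0)

print("beta0 =", beta0, "beta1 =", beta1)
print("error (calculated line) =", error_calculated)
print("error (guessed line y = x) =", error_guessed)
```

Any other guessed line can be compared the same way by passing its intercept and slope to `squared_error`.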
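For Question 2(a) to (e), entropy and information gain can be computed directly from class counts. The sketch below is a minimal Python illustration. The `ROWS` list is a transcription of the training table (an assumption wherever the source table is hard to read), and the names `entropy` and `information_gain` are hypothetical helpers, not part of the question.

```python
from collections import Counter
from math import log2

# (Gender, Type, Size, Class) for CID 1..20, transcribed from the table.
ROWS = [
    ("M", "Family", "Small", "C0"), ("M", "Sports", "Medium", "C0"),
    ("M", "Sports", "Medium", "C0"), ("M", "Sports", "Large", "C0"),
    ("M", "Sports", "E Large", "C0"), ("M", "Sports", "E Large", "C0"),
    ("F", "Sports", "Small", "C0"), ("F", "Sports", "Small", "C0"),
    ("F", "Sports", "Medium", "C0"), ("F", "Luxury", "Large", "C0"),
    ("M", "Family", "Large", "C1"), ("M", "Family", "E Large", "C1"),
    ("M", "Family", "Medium", "C1"), ("M", "Luxury", "E Large", "C1"),
    ("F", "Luxury", "Small", "C1"), ("F", "Luxury", "Small", "C1"),
    ("F", "Luxury", "Medium", "C1"), ("F", "Luxury", "Medium", "C1"),
    ("F", "Luxury", "Medium", "C1"), ("F", "Luxury", "Large", "C1"),
]
COLUMNS = {"Gender": 0, "Type": 1, "Size": 2}


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, column):
    """Parent entropy minus the size-weighted entropy of the split on `column`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    weighted = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - weighted


overall = entropy([row[-1] for row in ROWS])
gains = {name: information_gain(ROWS, col) for name, col in COLUMNS.items()}
best = max(gains, key=gains.get)
print("overall entropy =", overall)
print("gains =", gains, "best split =", best)
```

The attribute with the largest gain is the one part (e) asks for.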
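For Question 3(a), (b), and (e), support, confidence, and lift follow from simple basket counts. The sketch below is a minimal Python illustration using the ten baskets from the table. The helper names `support`, `confidence`, and `lift` are hypothetical, and the code does not implement the Apriori candidate generation asked for in part (c).

```python
# TID -> items bought, transcribed from the Question 3 table.
BASKETS = {
    "0001": {"P", "S", "T"},
    "0003": {"P", "Q", "R", "T"},
    "0010": {"P", "Q", "S", "T"},
    "0013": {"P", "R", "S", "T"},
    "0022": {"Q", "R", "T"},
    "0028": {"Q", "S", "T"},
    "0029": {"R", "S"},
    "0030": {"P", "Q", "R"},
    "0037": {"P", "S", "T"},
    "0051": {"P", "Q", "T"},
}


def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in BASKETS.values() if itemset <= items)
    return hits / len(BASKETS)


def confidence(lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs ∪ rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs)) / support(lhs)


def lift(lhs, rhs):
    """Lift of the rule lhs -> rhs: confidence(lhs -> rhs) / support(rhs)."""
    return confidence(lhs, rhs) / support(rhs)


print("s({P}) =", support({"P"}))
print("s({Q,S}) =", support({"Q", "S"}))
print("s({P,Q,S}) =", support({"P", "Q", "S"}))
print("c({Q,S} -> {P}) =", confidence({"Q", "S"}, {"P"}))
print("c({P} -> {Q,S}) =", confidence({"P"}, {"Q", "S"}))
```

Comparing the two confidence values printed at the end answers whether confidence is symmetric for this pair of rules.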
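For Question 4, the iterations can be traced with a short script. The sketch below is a minimal Python illustration under stated assumptions: centroids are always coordinate-wise means (including in the Manhattan case, which the question does not spell out), ties go to the first cluster, and iteration stops when assignments no longer change or after a hypothetical `max_iter` cap. The names `kmeans`, `centroid`, `euclidean`, and `manhattan` are illustrative.

```python
# Point id -> (x, y, z), transcribed from the Question 4 table.
POINTS = {
    "P01": (8, 8, 1), "P02": (6, 4, 9), "P03": (3, 1, 5), "P04": (6, 3, 2),
    "P05": (8, 6, 6), "P06": (7, 4, 8), "P07": (3, 9, 4), "P08": (5, 7, 9),
    "P09": (7, 1, 3), "P10": (6, 9, 4),
}
INITIAL = [
    ["P01", "P02", "P03", "P05"],
    ["P04", "P06", "P07", "P08", "P09", "P10"],
]


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def centroid(ids):
    """Coordinate-wise mean of the points in a (non-empty) cluster."""
    n = len(ids)
    return tuple(sum(POINTS[i][d] for i in ids) / n for d in range(3))


def kmeans(initial, dist, max_iter=100):
    """Return a list of (centroids, clusters) pairs, one per iteration."""
    clusters = [list(c) for c in initial]
    history = []
    for _ in range(max_iter):
        centroids = [centroid(c) for c in clusters]
        new_clusters = [[] for _ in centroids]
        for pid, point in POINTS.items():
            # Assign each point to the nearest centroid (first one wins ties).
            nearest = min(range(len(centroids)),
                          key=lambda j: dist(point, centroids[j]))
            new_clusters[nearest].append(pid)
        history.append((centroids, new_clusters))
        if new_clusters == clusters:
            break  # assignments are stable, so the centroids are final
        clusters = new_clusters
    return history


for name, dist in (("Euclidean", euclidean), ("Manhattan", manhattan)):
    for step, (cents, groups) in enumerate(kmeans(INITIAL, dist), start=1):
        print(name, "iteration", step, cents, groups)
```

Each printed line lists the centroids used in that iteration and the clusters they produce, which matches the per-iteration listing the question asks for.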
