Question: Problem 5: Calculating r-Squared
We will continue the analysis started in Problem 4 by calculating the r-squared score for our predictions. The
first step in this process is to calculate the mean of the observed values.
Complete the following steps in a single code cell:
1. Use the map transformation along with a lambda function to select the first element of each tuple
   in the pairs RDD. Call the mean method of the resulting RDD, storing the result in a variable
   named mean.
2. Print mean.
Note that this calculation might take a couple of minutes to complete.
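The steps above can be sketched roughly as follows. This is only an illustration, assuming pairs holds tuples whose first element is the observed value. LocalRDD is a hypothetical in-memory stand-in (not a Spark class) with made-up sample data, used so the snippet runs without a cluster; on a real RDD the same map and mean calls apply.

```python
class LocalRDD:
    """Hypothetical in-memory stand-in for a PySpark RDD (not a Spark class)."""

    def __init__(self, items):
        self.items = list(items)

    def map(self, func):
        # Mirrors RDD.map: apply func to every element
        return LocalRDD(func(item) for item in self.items)

    def mean(self):
        # Mirrors RDD.mean: arithmetic mean of the elements
        return sum(self.items) / len(self.items)


# Made-up sample data; in the notebook, pairs comes from the prior code
pairs = LocalRDD([(1.0, 1.5), (2.0, 1.5), (3.0, 3.5), (6.0, 5.5)])

# Select the first element (the observed value) of each tuple, then average
mean = pairs.map(lambda x: x[0]).mean()
print(mean)
```

In the notebook itself, only the last two statements would be needed, since pairs already exists as a Spark RDD.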
We will now calculate the sum of the squared deviations between each observed value and the mean of the
observed values. This quantity is sometimes referred to as SST, or the "total sum of squared deviations".
Complete the following steps in a single code cell:
1. Use the map transformation along with a lambda function to calculate the square of the difference
   between each observed value in pairs and mean. Call the sum method of the resulting RDD,
   storing the result in a variable named SST.
2. Print SST.
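A possible sketch of the SST step is shown here: each observed value has the mean subtracted, the difference is squared, and the squares are summed. It assumes the first element of each tuple is the observed value and that mean was computed beforehand. LocalRDD is a hypothetical stand-in (not a Spark class), and the sample data and mean value are made up for illustration.

```python
class LocalRDD:
    """Hypothetical in-memory stand-in for a PySpark RDD (not a Spark class)."""

    def __init__(self, items):
        self.items = list(items)

    def map(self, func):
        # Mirrors RDD.map: apply func to every element
        return LocalRDD(func(item) for item in self.items)

    def sum(self):
        # Mirrors RDD.sum: add up all elements (uses the built-in sum)
        return sum(self.items)


# Made-up sample data; in the notebook, pairs comes from the prior code
pairs = LocalRDD([(1.0, 1.5), (2.0, 1.5), (3.0, 3.5), (6.0, 5.5)])
mean = 3.0  # mean of the observed values in the sample data above

# Squared deviation of each observed value from the mean, then the total
SST = pairs.map(lambda x: (x[0] - mean) ** 2).sum()
print(SST)
```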
We will now calculate the r-squared score for the predictions. The formula for this value is given as follows:
$$r^2 = 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}$$
Complete the following steps in a single code cell:
1. Use SSE and SST to calculate r-squared, storing the result in a variable.
2. Print the result.
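As a minimal sketch of this last step, r-squared can be computed directly from the two sums as 1 - SSE / SST. The helper name r_squared and the variable name r2 are hypothetical (the variable name is missing from the source), and the SSE and SST values below are made-up numbers rather than results from the actual data set.

```python
def r_squared(sse, sst):
    """Return the r-squared score, 1 - SSE / SST."""
    if sst == 0:
        # All observed values are identical, so the score is undefined
        raise ValueError("SST is zero; r-squared is undefined")
    return 1 - sse / sst


# Made-up values for illustration; in the notebook these come from earlier cells
SSE = 1.0
SST = 14.0

r2 = r_squared(SSE, SST)  # r2 is a hypothetical variable name
print(r2)
```

A value close to 1 means the predictions account for most of the variation in the observed values. In the notebook, the single expression 1 - SSE / SST would be enough.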
Prior code (Problem 4):

# Read the data file into an RDD
pairsraw = spark.sparkContext.textFile('/FileStore/tables/pairsdata.txt')

# Count the number of elements
numelements = pairsraw.count()
print(f'Number of elements: {numelements}')

# Number of elements to display. The count is missing from the source;
# 5 is an assumed placeholder.
N = 5

# Display the first N elements as strings
print('First elements:')
for element in pairsraw.take(N):
    print(element)

# Function to process each line
def processline(row):
    # Split the line at whitespace and convert the tokens to floats
    return tuple(map(float, row.split()))

# Apply the processline function and store the result in the pairs RDD
pairs = pairsraw.map(processline)

# Display the first N elements as tuples
print('\nFirst elements after processing:')
for element in pairs.take(N):
    print(element)

# Calculate SSE using a lambda function and sum
# (x[0] is the observed value; x[1] is assumed to be the predicted value)
SSE = pairs.map(lambda x: (x[0] - x[1]) ** 2).sum()
print(f'\nSum of Squared Errors (SSE): {SSE}')
