Question: I wrote a Simple Linear Regression program in Python for my project, and I need a summary of each function in the code.

I have written some of the summaries already, but I need your help with the rest.

This is my Simple Linear Regression code.

# Simple Linear Regression

import csv
from random import seed
from random import randrange
from csv import reader
from math import sqrt

# Load a CSV file
def load_csv(filename):
    dataset = list()
    with open(filename, 'r') as file:
        # Get the first row of the CSV file to avoid runtime error
        first_line = file.readline()
        # Read the numeric values
        csv_reader = reader(file)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset

# Convert string column to float
def str_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())

# Split a dataset into a train and a test set
def train_test_split(dataset, split):
    train = list()
    train_size = split * len(dataset)
    dataset_copy = list(dataset)
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    return train, dataset_copy

# Calculate root mean squared error
def rmse_metric(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += (prediction_error ** 2)
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)

# Evaluate regression algorithm using a train/test split
def evaluate_algorithm(dataset, algorithm, split, *args):
    train, test = train_test_split(dataset, split)
    # Copy the test rows and hide the known y value from the algorithm
    test_set = list()
    for row in test:
        row_copy = list(row)
        row_copy[-1] = None
        test_set.append(row_copy)
    # Fit on the training rows, predict the test rows
    predicted = algorithm(train, test_set, *args)
    actual = [row[-1] for row in test]
    rmse = rmse_metric(actual, predicted)
    return rmse

# Calculate the mean value of a list of numbers
def mean(values):
    return sum(values) / float(len(values))

# Calculate the sum of squared deviations from the mean
# (the variance without the division by n)
def variance(values, mean):
    return sum([(x - mean) ** 2 for x in values])

# Calculate the sum of products of the x and y deviations
# (the covariance between x & y without the division by n)
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar

# Calculate coefficients
def coefficients(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    mean_x, mean_y = mean(x), mean(y)
    a = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)
    b = mean_y - a * mean_x
    return [b, a]

# Simple linear regression algorithm
def simple_linear_regression(train, test):
    predictions = list()
    b, a = coefficients(train)
    for row in test:
        y_hat = b + a * row[0]
        predictions.append(y_hat)
    print('a= %.3f b = %.3f' % (a, b))
    return predictions

# Simple linear regression on insurance dataset
seed(1)

# Create list to hold the data that will be written to CSV file
final_data = list()

# Loop 5 times, once for each data file provided
count = 5
for i in range(count):
    # Load and prepare data
    filename = input('Enter name of CSV file: ')
    dataset = load_csv(filename)
    # Convert every column from string to float
    for col in range(len(dataset[0])):
        str_column_to_float(dataset, col)
    # Evaluate algorithm (y_i holds the RMSE returned by evaluate_algorithm)
    split = 0.6
    y_i = evaluate_algorithm(dataset, simple_linear_regression, split)
    print('y_i: %.3f' % (y_i))
    print()
    # Get the values of a, b needed for the final CSV file
    b, a = coefficients(dataset)
    # Update list
    final_data.append([filename, a, b])

# Write data to CSV file (newline='' avoids blank rows on Windows)
output_file = input('Enter CSV file name to write contents to: ')
with open(output_file, 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Data File', 'a', 'b'])
    for line in final_data:
        csv_writer.writerow(line)

# Display message that CSV was written
print('Contents were written to:', output_file)
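The behavior of train_test_split can be checked in isolation before summarizing it. The following is a minimal sketch, assuming a made-up list of ten rows (the `rows` data is hypothetical and not from the insurance files); the function body mirrors the one in the code above.

```python
from random import seed, randrange

def train_test_split(dataset, split):
    """Randomly move rows into a training list until it holds split * len(dataset) rows."""
    train = []
    train_size = split * len(dataset)
    dataset_copy = list(dataset)  # shallow copy, so the caller's list is not modified
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    # Whatever was not picked for training is returned as the test set
    return train, dataset_copy

seed(1)  # fixed seed so the random selection is repeatable
rows = [[float(i), float(2 * i)] for i in range(10)]  # hypothetical (x, y) pairs
train, test = train_test_split(rows, 0.6)
print(len(train), len(test))  # prints "6 4"
```

With split = 0.6 and ten rows, six rows end up in the training set and the remaining four form the test set; every row appears in exactly one of the two lists.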

This is my summary of the code so far. Please help me with the parts I have not summarized yet.

Acknowledge online code here

Error partition into

Include error (RMSE from the code; just print it out)

Summary of the various methods used

1. load_csv

-accepts the name of a CSV file provided by the user in the main program

-attempts to open the file

-because the CSV files provided start with a header row of text, that row is read first to prevent a runtime error when the numeric values are converted

-creates a list to store the (x,y) pairs provided

-appends each non-empty row to the list for as long as there are rows to read from the CSV file

2. str_column_to_float

-receives a list populated with the data from the CSV file, plus the index of one column

-converts the values in that column from strings to floats

3. train_test_split

-takes the list and randomly selects a percentage of its rows (given by split) to train the model

-returns the training rows together with the remaining rows, which are used to test the predictions

4. rmse_metric

5. evaluate_algorithm

6. mean

-sums up the values in the list and divides that sum by the number of entries

7. variance

8. covariance

9. coefficients

10. simple_linear_regression
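For the functions in the list that have no summary yet (rmse_metric, variance, covariance, coefficients, simple_linear_regression), a small worked example may make their roles easier to describe. The sketch below is not the project script itself: it restates the same formulas in compact form and runs them on a made-up five-point dataset (the `data` values are an assumption, chosen so the arithmetic is easy to follow by hand).

```python
from math import sqrt

def mean(values):
    # Arithmetic mean of a list of numbers
    return sum(values) / float(len(values))

def variance(values, mean_value):
    # Sum of squared deviations from the mean (not divided by n)
    return sum((x - mean_value) ** 2 for x in values)

def covariance(x, mean_x, y, mean_y):
    # Sum of products of paired x and y deviations (not divided by n)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

def coefficients(dataset):
    # Slope a = covariance / variance; intercept b = mean_y - a * mean_x
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    mean_x, mean_y = mean(x), mean(y)
    a = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)
    b = mean_y - a * mean_x
    return [b, a]

def rmse_metric(actual, predicted):
    # Square root of the average squared prediction error
    squared = [(p - t) ** 2 for t, p in zip(actual, predicted)]
    return sqrt(sum(squared) / float(len(actual)))

data = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]  # made-up (x, y) pairs
b, a = coefficients(data)
predicted = [b + a * row[0] for row in data]  # y_hat = b + a * x
actual = [row[1] for row in data]
rmse = rmse_metric(actual, predicted)
print('a = %.3f, b = %.3f, RMSE = %.3f' % (a, b, rmse))
# prints: a = 0.800, b = 0.400, RMSE = 0.693
```

Here the mean of x is 3 and the mean of y is 2.8, the sum of squared x deviations is 10, and the sum of deviation products is 8, so a = 8 / 10 = 0.8 and b = 2.8 - 0.8 * 3 = 0.4. Because variance and covariance both omit the division by n, that factor cancels in the ratio and the slope is unaffected.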

