Question: I wrote a Simple Linear Regression program for my project and need a summary of each function in the Python code.

Hello, I wrote a Simple Linear Regression program for my project, and I would like a summary of each function in this Python code.

I have written some of the summaries myself, but I need help with the rest.

This is my Simple Linear Regression Code.

# Simple Linear Regression
import csv
from csv import reader
from math import sqrt
from random import randrange, seed

# Load a CSV file
def load_csv(filename):
    dataset = list()
    with open(filename, 'r') as file:
        # Get the first row of the CSV file to avoid runtime error
        first_line = file.readline()
        # Read the numeric values
        csv_reader = reader(file)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset

# Convert string column to float
def str_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())

# Split a dataset into a train and a test set
def train_test_split(dataset, split):
    train = list()
    train_size = split * len(dataset)
    dataset_copy = list(dataset)
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    return train, dataset_copy

# Calculate root mean squared error
def rmse_metric(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += (prediction_error ** 2)
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)

# Evaluate regression algorithm using a train/test split
def evaluate_algorithm(dataset, algorithm, split, *args):
    train, test = train_test_split(dataset, split)
    # Copy the test rows and hide the target value before predicting
    test_set = list()
    for row in test:
        row_copy = list(row)
        row_copy[-1] = None
        test_set.append(row_copy)
    # Fit on the training rows and predict the held-out test rows
    predicted = algorithm(train, test_set, *args)
    actual = [row[-1] for row in test]
    rmse = rmse_metric(actual, predicted)
    return rmse

# Calculate the mean value of a list of numbers
def mean(values):
    return sum(values) / float(len(values))

# Calculate the sum of squared deviations from the mean
# (the variance without the 1/n factor)
def variance(values, mean):
    return sum([(x - mean) ** 2 for x in values])

# Calculate the sum of products of the deviations of x and y
# (the covariance without the 1/n factor)
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar

# Calculate coefficients of the line y = b + a*x
# (a is the slope, b is the intercept)
def coefficients(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    mean_x, mean_y = mean(x), mean(y)
    a = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)
    b = mean_y - a * mean_x
    return [b, a]

# Simple linear regression algorithm
def simple_linear_regression(train, test):
    predictions = list()
    b, a = coefficients(train)
    for row in test:
        y_hat = b + a * row[0]
        predictions.append(y_hat)
    print('a= %.3f b = %.3f' % (a, b))
    return predictions

# Simple linear regression on insurance dataset
seed(1)

# Create list to hold the data that will be written to CSV file
final_data = list()

# Loop 5 times to get the data provided
count = 5
for _ in range(count):
    # Load and prepare data
    filename = input('Enter name of CSV file: ')
    dataset = load_csv(filename)

    # Convert every column from string to float
    for column in range(len(dataset[0])):
        str_column_to_float(dataset, column)

    # Evaluate algorithm (y_i holds the RMSE returned by evaluate_algorithm)
    split = 0.6
    y_i = evaluate_algorithm(dataset, simple_linear_regression, split)
    print('y_i: %.3f' % (y_i))
    print()

    # Get the values of a,b needed for the final CSV file
    b, a = coefficients(dataset)

    # Update list
    final_data.append([filename, a, b])

# Write data to CSV file (newline='' avoids blank rows on some platforms)
output_file = input('Enter CSV file name to write contents to: ')
with open(output_file, 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Data File', 'a', 'b'])
    for line in final_data:
        csv_writer.writerow(line)

# Display message that CSV was written
print('Contents were written to:', output_file)

This is my summary of the code so far. Please help me with the parts I have not summarized yet.

Acknowledge online code here

Error partition into

Include error (RMSE from the code; just print it out)

Summary of the various methods used

1. load_csv

-accepts the name of a CSV file, which the user provides in the main script

-opens the file for reading

-reads the first row separately, because the provided CSV files start with a header row of letters that would cause a runtime error when the values are later converted to numbers

-creates a list to store the (x, y) pairs

-appends each non-empty row from the CSV file to the list, then returns the list
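
The header-skipping step described above can be illustrated with a small sketch. It assumes a hypothetical two-column file whose header row is `X,Y`; `io.StringIO` stands in for the opened file so the example runs without a file on disk, and `read_rows` is an illustrative helper that mirrors the body of load_csv, not a function from the original script.

```python
import io
from csv import reader

def read_rows(file):
    """Mirror load_csv's body for an already-open, file-like object."""
    file.readline()              # consume the header row of letters
    rows = []
    for row in reader(file):
        if not row:              # the csv reader yields [] for a blank line
            continue
        rows.append(row)
    return rows

# Hypothetical file contents: a header, two data rows, and one blank line.
sample = io.StringIO("X,Y\n1,3\n\n2,5\n")
rows = read_rows(sample)
print(rows)  # [['1', '3'], ['2', '5']]
```

Note that the values are still strings at this point; converting them to numbers is left to str_column_to_float.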

2. str_column_to_float

-receives the list populated with the data from the CSV file, together with a column index

-converts every value in that column from a string to a float, modifying the list in place
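
As a small illustration of the in-place conversion, the sketch below applies str_column_to_float to every column of a made-up dataset, the same way the main loop does. The sample values (including the stray spaces) are assumptions chosen to show what `strip()` handles.

```python
def str_column_to_float(dataset, column):
    # Overwrite each string in the given column with its float value.
    for row in dataset:
        row[column] = float(row[column].strip())

# Made-up rows as they would come out of the CSV reader (all strings).
dataset = [[' 1', '3.5 '], ['2', '5']]
for column in range(len(dataset[0])):
    str_column_to_float(dataset, column)
print(dataset)  # [[1.0, 3.5], [2.0, 5.0]]
```

Because the function returns nothing and edits the rows directly, calling it a second time on the same column would fail: a float has no `strip()` method.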

3. train_test_split

-takes the list and a split fraction (0.6 in the main script), and randomly moves that fraction of the rows into a training set

-returns the training set together with the remaining rows, which serve as the test set for the model's predictions
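
To make the split sizes concrete, here is a minimal sketch that runs train_test_split on a made-up dataset of 10 rows with the same 0.6 fraction used in the main script. The data and the `seed(1)` call are assumptions for the example only.

```python
from random import randrange, seed

def train_test_split(dataset, split):
    train = list()
    train_size = split * len(dataset)
    dataset_copy = list(dataset)      # leave the caller's list untouched
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    return train, dataset_copy        # the leftover rows form the test set

seed(1)
dataset = [[x, 2 * x] for x in range(10)]   # made-up (x, y) rows
train, test = train_test_split(dataset, 0.6)
print(len(train), len(test))  # 6 4
```

Every row ends up in exactly one of the two lists. When `split * len(dataset)` is not a whole number, the `while` condition effectively rounds the training size up (for example, 7 rows at 0.6 give 5 training rows).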

4. rmse_metric
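
One possible reading of this function, offered as a sketch rather than a finished summary: it squares each prediction error, averages the squares, and returns the square root of that average, i.e. RMSE = sqrt((1/n) * sum((predicted_i - actual_i)^2)). The sample values below are made up so that the arithmetic is easy to follow.

```python
from math import sqrt

def rmse_metric(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += prediction_error ** 2
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)

# Every prediction is off by 2, so each squared error is 4,
# the mean squared error is 4, and the RMSE is 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [3.0, 4.0, 5.0, 6.0]
rmse = rmse_metric(actual, predicted)
print(rmse)  # 2.0
```

The result is in the same units as y, which is why it is convenient to print as the model's error.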

5. evaluate_algorithm

6. mean

-sums the values in the list and divides the total by the number of entries

7. variance

8. covariance

9. coefficients

10. simple_linear_regression
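
As a starting point for the remaining items, the sketch below traces items 6 to 10 on made-up data that lies exactly on the line y = 1 + 2x. It assumes the same function bodies as the script, except that the `print` call inside simple_linear_regression is omitted. Note that variance and covariance here are sums without the 1/n factor; the factor would cancel in the ratio that gives the slope a.

```python
def mean(values):
    return sum(values) / float(len(values))

def variance(values, mean):
    # Sum of squared deviations from the mean (no division by n).
    return sum((x - mean) ** 2 for x in values)

def covariance(x, mean_x, y, mean_y):
    # Sum of products of the deviations of x and y (no division by n).
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar

def coefficients(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    mean_x, mean_y = mean(x), mean(y)
    a = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)  # slope
    b = mean_y - a * mean_x                                     # intercept
    return [b, a]

def simple_linear_regression(train, test):
    b, a = coefficients(train)
    return [b + a * row[0] for row in test]

# Made-up training rows on y = 1 + 2x: means are 2.5 and 6.0,
# covariance sum is 10.0, variance sum is 5.0, so a = 2.0 and b = 1.0.
train = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0]]
b, a = coefficients(train)
predictions = simple_linear_regression(train, [[5.0, None]])
print(a, b, predictions)  # 2.0 1.0 [11.0]
```

The test row carries `None` as its target, matching how evaluate_algorithm hides the true y value before asking the model for a prediction.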
