Question: Hello, I wrote a simple linear regression program for my project, and I need a summary of each function in this Python code.
I have written some of the summaries, but I need your help with the rest. Please help me.
This is my simple linear regression code.
# Simple Linear Regression
import csv
from csv import reader
from math import sqrt
from random import randrange, seed
# Load a CSV file
def load_csv(filename):
    dataset = list()
    with open(filename, 'r') as file:
        # Get the first row of the CSV file to avoid runtime error
        first_line = file.readline()
        # Read the numeric values
        csv_reader = reader(file)
        for row in csv_reader:
            if not row:
                continue
            dataset.append(row)
    return dataset
# Convert string column to float
def str_column_to_float(dataset, column):
    for row in dataset:
        row[column] = float(row[column].strip())
# Split a dataset into a train and a test set
def train_test_split(dataset, split):
    train = list()
    train_size = split * len(dataset)
    dataset_copy = list(dataset)
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    return train, dataset_copy
# Calculate root mean squared error
def rmse_metric(actual, predicted):
    sum_error = 0.0
    for i in range(len(actual)):
        prediction_error = predicted[i] - actual[i]
        sum_error += (prediction_error ** 2)
    mean_error = sum_error / float(len(actual))
    return sqrt(mean_error)
# Evaluate regression algorithm using a train/test split
def evaluate_algorithm(dataset, algorithm, split, *args):
    train, test = train_test_split(dataset, split)
    # Copy the test rows with the target value hidden from the algorithm
    test_set = list()
    for row in test:
        row_copy = list(row)
        row_copy[-1] = None
        test_set.append(row_copy)
    # Fit on the training rows and predict the held-out test rows
    predicted = algorithm(train, test_set, *args)
    actual = [row[-1] for row in test]
    rmse = rmse_metric(actual, predicted)
    return rmse
# Calculate the mean value of a list of numbers
def mean(values):
    return sum(values) / float(len(values))
# Calculate the sum of squared deviations from the mean
# (the numerator of the variance; it is not divided by the number of values)
def variance(values, mean):
    return sum([(x - mean) ** 2 for x in values])
# Calculate the covariance between x & y
def covariance(x, mean_x, y, mean_y):
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i] - mean_x) * (y[i] - mean_y)
    return covar
# Calculate coefficients
def coefficients(dataset):
    x = [row[0] for row in dataset]
    y = [row[1] for row in dataset]
    mean_x, mean_y = mean(x), mean(y)
    a = covariance(x, mean_x, y, mean_y) / variance(x, mean_x)
    b = mean_y - a * mean_x
    return [b, a]
# Simple linear regression algorithm
def simple_linear_regression(train, test):
    predictions = list()
    b, a = coefficients(train)
    for row in test:
        y_hat = b + a * row[0]
        predictions.append(y_hat)
    print('a= %.3f b = %.3f' % (a, b))
    return predictions
# Simple linear regression on insurance dataset
seed(1)
# Create list to hold the data that will be written to CSV file
final_data = list()
# Loop 5 times to get the data provided
count = 5
for _ in range(count):
    # Load and prepare data
    filename = input('Enter name of CSV file: ')
    dataset = load_csv(filename)
    # Convert every column from string to float
    for col in range(len(dataset[0])):
        str_column_to_float(dataset, col)
    # Evaluate algorithm (y_i holds the RMSE returned by evaluate_algorithm)
    split = 0.6
    y_i = evaluate_algorithm(dataset, simple_linear_regression, split)
    print('y_i: %.3f' % (y_i))
    print()
    # Get the values of a,b needed for the final CSV file
    b, a = coefficients(dataset)
    # Update list
    final_data.append([filename, a, b])
# Write data to CSV file
output_file = input('Enter CSV file name to write contents to: ')
# newline='' keeps the csv module from writing blank lines between rows on Windows
with open(output_file, 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Data File', 'a', 'b'])
    for line in final_data:
        csv_writer.writerow(line)
# Display message that CSV was written
print('Contents were written to:', output_file)
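To see what the 0.6 split does before it reaches the regression, the split step can be run on its own. This is only a minimal sketch: `toy_rows` is a made-up 10-row dataset (an assumption, not one of the project CSV files), and `train_test_split` is copied from the code above.

```python
from random import randrange, seed


def train_test_split(dataset, split):
    """Randomly move split * len(dataset) rows into train; the rest is test."""
    train = list()
    train_size = split * len(dataset)
    dataset_copy = list(dataset)  # shallow copy, so the caller's list is not emptied
    while len(train) < train_size:
        index = randrange(len(dataset_copy))
        train.append(dataset_copy.pop(index))
    return train, dataset_copy


seed(1)
# Hypothetical data: 10 rows of (x, y) with y = 2x + 1
toy_rows = [[float(i), 2.0 * i + 1.0] for i in range(10)]
train, test = train_test_split(toy_rows, 0.6)
print(len(train), len(test))  # 6 4
```

Every row ends up in exactly one of the two lists, and the original list keeps all of its rows because only the copy is popped from.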
This is my summary of the code so far. Please help me with the parts I have not summarized yet.
Acknowledge online code here
Error partition into
Include error (RMSE from the code; just print it out)
Summary of the various methods used
1. load_csv
-accepts the name of a CSV file provided by the user in the main section
-attempts to open the file
-because the provided CSV files start with a row of letters (a header), that row is read first and discarded to prevent a runtime error when the numeric values are processed
-creates a list to store the (x,y) pairs provided
-appends each non-empty row read from the CSV file to the list
2. str_column_to_float
-receives the list populated with the data from the CSV file and a column index
-converts the values in that column from strings to floats
3. train_test_split
-takes the list and randomly selects a fraction of the rows (the split value) to train the model
-returns the training rows together with the remaining rows, which are used to test the predictions
4. rmse_metric
5. evaluate_algorithm
6. mean
-sums up the values in the list and divides that sum by the number of entries
7. variance
8. covariance
9. coefficients
10. simple_linear_regression
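For the entries that are still empty (rmse_metric, variance, covariance, coefficients, simple_linear_regression), a small worked example may help when writing the summaries. This is a minimal sketch, assuming a made-up five-point dataset (the `x` and `y` values are hypothetical, not from the insurance data). The arithmetic mirrors the functions in the code above, including the fact that `variance` returns a sum of squared deviations without dividing by the number of values.

```python
from math import sqrt

# Hypothetical data, chosen only to keep the arithmetic easy to follow
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 3.0, 5.0]

# mean: sum of the values divided by how many there are
mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# variance (as written in the script): sum of squared deviations of x
sum_sq_dev = sum((xi - mean_x) ** 2 for xi in x)

# covariance (as written in the script): sum of products of deviations
covar = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# coefficients: slope a and intercept b of the line y = b + a * x
a = covar / sum_sq_dev
b = mean_y - a * mean_x

# simple_linear_regression: predict y for each x using the fitted line
predicted = [b + a * xi for xi in x]

# rmse_metric: square root of the mean squared prediction error
rmse = sqrt(sum((p - t) ** 2 for p, t in zip(predicted, y)) / len(y))

print('a = %.3f, b = %.3f, RMSE = %.3f' % (a, b, rmse))
# a = 0.800, b = 0.400, RMSE = 0.693
```

Because `variance` and `covariance` both skip the division by the number of values, the missing factor cancels in the ratio, so the slope `a` is still the usual least-squares slope.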
