Question: How can I fix this error?
How can I fix this error?
Here are the instructions:
4.10 LAB: Prediction with Logistic Regression
The file Invistico_Airline_LR.csv contains information from an airline using the alias Invistico Airline on customer satisfaction, as well as details on each customer. The columns of interest are Gender, Age, Class, Arrival_Delay_in_Minutes, and satisfaction.
Read the file Invistico_Airline_LR.csv into a data frame.
Obtain user defined values female, age, economy, and delay.
Re-code the categorical variables Gender, Class, and satisfaction into dummy variables.
Create a new data frame X from the predictor variables Gender_female, Age, Class_Eco, and Arrival_Delay_in_Minutes, in that order.
Create a response variable Y from the dummy variable satisfaction_satisfied.
Perform logistic regression on X and Y.
Use the user defined values to predict the probability that a customer with those values is satisfied.
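To make the last step concrete: a fitted logistic regression turns the intercept and coefficients into a probability through the logistic function. Below is a minimal sketch in plain Python. The coefficient values are made up for illustration (they are not fitted from Invistico_Airline_LR.csv), and `predict_probability` is a hypothetical helper name, not something the lab provides.

```python
import math

def predict_probability(params, values):
    """Return P(satisfied) for one customer.

    params: [intercept, b_female, b_age, b_economy, b_delay]
    values: [female, age, economy, delay]
    """
    # Linear predictor: intercept plus the sum of coefficient * value.
    z = params[0] + sum(b * x for b, x in zip(params[1:], values))
    # The logistic function maps the linear predictor into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT estimated from the lab's data.
example_params = [0.0, 0.5, 0.01, -1.0, -0.005]
p = predict_probability(example_params, [1, 34, 0, 10])
print(round(p, 4))
```

In the lab itself the coefficients would come from the fitted model rather than being typed in by hand; the sketch only shows what the prediction step computes.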
Ex: If the input is 1 34 0 10 the output is:
Here is the code thus far:
# import the necessary modules
import numpy as np
import statsmodels.api as sm
import pandas as pd

female = int(input())
age = float(input())
economy = int(input())
delay = float(input())

flights = pd.read_csv("Invistico_Airline_LR.csv")  # read in the file Invistico_Airline_LR.csv

# remove missing data
flights.dropna(axis=0, inplace=True)

# recode the categorical variables Gender, Class, and satisfaction as dummy variables
flights = pd.get_dummies(flights, columns=['satisfaction', 'Gender', 'Class'])

# create a new data frame from the variables Gender_Female, Age, Class_Eco, and Arrival_Delay_in_Minutes, in that order.
X = flights[['Gender_Female', 'Age', 'Class_Eco', 'Departure_Delay_in_Minutes']]
X = sm.add_constant(X)
Y = flights[['satisfaction_satisfied']]  # set Y as the response variable satisfaction_satisfied

model = sm.Logit(Y, X)  # perform logistic regression on X and Y
model = model.fit()

# create an array with 1 for the intercept, and the user input values female, age, economy, and delay
ex = np.array([1, female, age, economy, delay])

# find the predicted probability that a customer with the user input values is satisfied
prediction = model.predict(ex)
print(prediction)
Program errors displayed here
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    female = int(input())
ValueError: invalid literal for int() with base 10: ' 1 34 10
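The traceback points at the input step. The quoted string in the ValueError contains several space-separated numbers, which suggests the grader supplies all values on one line, so the first `input()` call receives the whole line and `int()` cannot parse it. Below is a minimal sketch of one way to handle that, assuming the values arrive whitespace-separated in the order female, age, economy, delay. `parse_inputs` is a hypothetical helper, not part of the lab template.

```python
def parse_inputs(line):
    """Split one line such as '1 34 0 10' into the four typed values."""
    parts = line.split()  # split() with no arguments ignores extra spaces
    if len(parts) != 4:
        raise ValueError(f"expected 4 values, got {len(parts)}: {line!r}")
    female = int(parts[0])
    age = float(parts[1])
    economy = int(parts[2])
    delay = float(parts[3])
    return female, age, economy, delay

# In the lab script this would replace the four separate input() calls:
#     female, age, economy, delay = parse_inputs(input())
print(parse_inputs("1 34 0 10"))
```

Separately, the instructions name Arrival_Delay_in_Minutes as the fourth predictor, while the posted code selects Departure_Delay_in_Minutes. That mismatch would not raise this error, but it would likely change the fitted model. The instructions also write Gender_female where the code uses Gender_Female; which spelling exists depends on the category labels in the CSV, so it is worth checking flights.columns after get_dummies.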
