Question: Hello, I need help. When I try to call my pipeline I get an error:
import pandas as pd
from pyspark.ml.feature import VectorAssembler, OneHotEncoder, StringIndexer
from pyspark.ml.evaluation import MulticlassClassificationEvaluator
from pyspark.ml.classification import LogisticRegression, DecisionTreeClassifier
from pyspark.ml import Pipeline
# create lists of feature names
num_features = ["age", "hypertension", "heart_disease", "avg_glucose_level", "bmi"]
cat_features = [col for col in stroke_df.columns if col not in num_features + ["stroke"]]
# create lists of integer-encoded and one-hot encoded features
ix_features = [col + "_ix" for col in cat_features]
vec_features = [col + "_vec" for col in cat_features]
# create StringIndexer, OneHotEncoder, and VectorAssembler objects
indexer = StringIndexer(inputCols=cat_features, outputCols=ix_features, handleInvalid="skip")
encoder = OneHotEncoder(inputCols=ix_features, outputCols=vec_features)
assembler = VectorAssembler(inputCols=num_features + vec_features, outputCol="features")
# create pipeline
pipeline = Pipeline(stages=[indexer, encoder, assembler])
# fit pipeline to data
train = pipeline.fit(stroke_df).transform(stroke_df)
# persist train DataFrame
train.persist()
# display first rows of features and stroke columns of train
train.select("features", "stroke").show(truncate=False)
train = train.withColumn("label", train["stroke"])

Applying the Model to New Data:

data = {
    "gender": ["Female", "Female", "Male", "Male"],
    "age": [...],
    "hypertension": [...],
    "heart_disease": [...],
    "ever_married": ["No", "Yes", "Yes", "No"],
    "work_type": ["Private", "Self-employed", "Private", "Govt_job"],
    "Residence_type": ["Urban", "Rural", "Rural", "Urban"],
    "avg_glucose_level": [...],
    "bmi": [...],
    "smoking_status": ["smokes", "formerly smoked", "unknown", "never smoked"],
}
new_data = pd.DataFrame(data)
new_data
processed_new_data = pipeline.fit(new_data).transform(new_data)

AttributeError: 'DataFrame' object has no attribute '_jdf'

ipykernel command in <module>
----> processed_new_data = pipeline.fit(new_data).transform(new_data)

/databricks/python/lib/python/site-packages/pandas/core/generic.py in __getattr__(self, name)
            and name not in self._accessors
            and self._info_axis._can_hold_identifiers_and_holds_name(name)
        ):
            return self[name]
        return object.__getattribute__(self, name)
