Question:
I am attempting to answer the following:
Utilize your Python environment to derive structure from unstructured data. You will utilize the data set "Airline Sentiment" from Kaggle, located at Kaggle's website. From Welkin10, the dataset "Airline Sentiment". /welkin10/airline-sentinment
Using this data set, you will create a text analytics Python application that extracts themes from each comment using term frequency-inverse document frequency (TF-IDF) or simple word counts. For the deliverable, provide your Python file and a .csv with your results added as a column to the original data set.
I have the following code:
import pandas as pd
import numpy as np

#Using sklearn library to calculate tfidf
from sklearn.feature_extraction.text import TfidfVectorizer

#Download dataset
demo_document = pd.read_csv("C:/Users/Documents/Tweets.csv")
dataset = []
for demo in demo_document[:100]:
    dataset.append(" ".join(demo))

#Print first 15 documents of the dataset
print("DEMO DATASET")
for i in range(15):
    print(i, dataset[i])

#fit_transform will calculate the tf-idf scores
model = TfidfVectorizer(use_idf=True)
tfIdf = model.fit_transform(dataset)

#Print tf-idf of first 15 documents of the dataset
print("TF-IDF VALUES:")
df = pd.DataFrame(tfIdf[:15].todense(), columns=model.get_feature_names())
print(df)

#Export dataframe to .csv
export_df = pd.DataFrame(tfIdf.todense(), columns=model.get_feature_names())
export_df.to_csv("C:/Users/Documents/Tweets_data.csv")
I am receiving the following error:
ValueError: empty vocabulary; perhaps the documents only contain stop words.
How do I go about resolving this error?