Question: Code This section needs to be completed using Python 3.6+. You will also require following packages: pandas numpy NLTK or SpaCy
Code This section needs to be completed using Python 3.6+. You will also require following packages: • pandas • numpy • NLTK or SpaCy • scikit-learn
use of libraries like scikit-learn is prohibited
![Q2. Logistic Regression: Code [25] In this task, you will learn to](https://dsd5zvtm8ll6.cloudfront.net/si.experts.images/questions/2024/02/65cb261a31308_37865cb261a0aba8.jpg)
Q2. Logistic Regression: Code [25] In this task, you will learn to build a Logistic Regression Classifier for the same "Financial Phrasebank" dataset. Bag of Words model will be used for this task. 1. Use 60% of the data selected randomly for training, 20% selected randomly for testing and the remaining 20% for validation set. Use classes 'positive' and 'negative' only. Perform the same cleaning tasks on the text data and build a vocabulary of the words. Link to the dataset. Link to the scikit-learn documentation for metrics. 2 2. Using CountVectorizer3, fit the cleaned train data. This will create the bag-of-words model for the train data. Transform test and validation sets using same CountVectorizer. 3. To implement the logistic regression using following equations, z = W.x = o(z) we need the weight vector W. Create an array of dimension equal to those of each x from the CountVectorizer. 4. Apply above equations over whole training dataset and calculate y and cross-entropy loss LCE which can be calculated as LCE = -y log y + (1 - y) log(1-) 5. Now, update the weights as follows: W+1 = W - (y - yi).xi Here, (y - y).x is the gradient of sigmoid function and a = 0.01 is the learning rate. 6. Repeat step 4 and step 5 for 500 iterations or epochs. For each iteration, calculate the cross-entropy loss on validation set. 7. Calculate the accuracy and macro-average precision, recall, and F1 score and provide the confusion matrix on the test set.
Step by Step Solution
There are 3 Steps involved in it
Get step-by-step solutions from verified subject matter experts
