Question: Initially, remove the transcriptions whose category labels occur fewer
than 50 times in the corpus, as in [2]. Apply data preprocessing techniques, also following the steps in
the cited reference. For feature extraction, apply Bag-of-Words (CountVectorizer) and TF-IDF
(TfidfVectorizer) separately. Implement Multinomial Naive Bayes, Random Forest, XGBoost and LightGBM
as the traditional machine learning algorithms for the medical text classification process. Then, apply
at least one complex deep neural network architecture (ensemble learning) using 1D CNN, LSTM and GRU.
Show the confusion matrix, accuracy, precision, recall and F-score for each category (class) of the
implemented solutions.
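
The first phase could be sketched roughly as follows, assuming scikit-learn is available. The toy corpus, the helper names `drop_rare_categories` and `evaluate`, the 75/25 split and the lowered threshold of 5 are illustrative assumptions, not part of the assignment; the real corpus would use the threshold of 50. Only Multinomial Naive Bayes is shown; another scikit-learn classifier such as `RandomForestClassifier` could be swapped in at the marked line.

```python
from collections import Counter

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB


def drop_rare_categories(texts, labels, min_count=50):
    """Keep only samples whose category occurs at least `min_count` times."""
    counts = Counter(labels)
    kept = [(t, y) for t, y in zip(texts, labels) if counts[y] >= min_count]
    return [t for t, _ in kept], [y for _, y in kept]


def evaluate(texts, labels, vectorizer, seed=0):
    """Train Multinomial Naive Bayes on one feature type and report metrics."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed)
    clf = MultinomialNB()  # swap in another classifier here
    # Fit the vocabulary on the training split only, then reuse it for the test split.
    clf.fit(vectorizer.fit_transform(x_train), y_train)
    pred = clf.predict(vectorizer.transform(x_test))
    classes = sorted(set(labels))
    precision, recall, fscore, _ = precision_recall_fscore_support(
        y_test, pred, labels=classes, zero_division=0)
    return {
        "classes": classes,
        "confusion_matrix": confusion_matrix(y_test, pred, labels=classes),
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision,  # one value per class, in `classes` order
        "recall": recall,
        "fscore": fscore,
    }


# Hypothetical toy corpus standing in for the medical transcriptions.
cardio = ["chest pain and shortness of breath",
          "irregular heartbeat with chest pain",
          "shortness of breath and palpitations",
          "heartbeat palpitations chest tightness"] * 2
ortho = ["knee fracture after fall",
         "shoulder dislocation and fracture",
         "knee swelling after injury",
         "shoulder injury with swelling"] * 2
texts = cardio + ortho + ["brief unrelated note"] * 2
labels = ["Cardiology"] * 8 + ["Orthopedic"] * 8 + ["Other"] * 2

# The assignment uses a threshold of 50; 5 is used here only because the toy data is tiny.
texts, labels = drop_rare_categories(texts, labels, min_count=5)

results = {
    "bow": evaluate(texts, labels, CountVectorizer()),
    "tfidf": evaluate(texts, labels, TfidfVectorizer()),
}
for name, res in results.items():
    print(name, res["accuracy"])
    print(res["confusion_matrix"])
```

Running both vectorizers through the same `evaluate` function keeps the Bag-of-Words and TF-IDF results directly comparable, since the split and the classifier are identical.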
In the next phase, use the NER code previously implemented in the first part of the project. Use the
labeled named entities and their category labels as the input, then follow the same training and
evaluation steps.
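
One possible way to feed named entities into the same pipeline is to turn each transcription's entities into a string of label-prefixed tokens. The sketch below assumes the earlier NER code yields `(entity_text, entity_label)` pairs per transcription; that output format, the label names and the helper `entities_to_document` are hypothetical, since the NER code itself is not shown in the question.

```python
def entities_to_document(entities):
    """Encode (entity_text, entity_label) pairs as one whitespace-separated string.

    Each entity becomes a single token such as "PROBLEM__chest_pain", so a
    word-level vectorizer treats the label and the entity text as one feature.
    """
    tokens = []
    for text, label in entities:
        normalized = "_".join(text.lower().split())  # multi-word entity -> one token
        tokens.append(f"{label}__{normalized}")
    return " ".join(tokens)


# Hypothetical NER output for a single transcription.
doc = entities_to_document([("chest pain", "PROBLEM"), ("Aspirin", "TREATMENT")])
print(doc)
```

The resulting strings can replace the raw transcription text as input to the vectorizers, so the training and evaluation steps stay unchanged.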
Finally, apply the SMOTE oversampling method to the models with the best accuracy values in the previous
two phases and compare accuracy, precision, recall and F-score with and without oversampling. Write a
report that explains and illustrates the results step by step.
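
To illustrate what SMOTE does, here is a simplified NumPy sketch of its core idea: each synthetic sample is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, its parameters and the toy data are assumptions made for illustration; in practice the `SMOTE` class from the imbalanced-learn package would normally be used instead. Oversampling should be applied to the training split only, so that the test metrics still reflect the original class distribution.

```python
import numpy as np


def smote_oversample(X, y, minority_label, n_new, k=5, seed=0):
    """Append `n_new` synthetic minority samples built by SMOTE-style interpolation."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority = X[y == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, len(minority) - 1)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, len(minority), size=n_new)        # random starting samples
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one random neighbour each
    gap = rng.random((n_new, 1))                              # interpolation factor in [0, 1)
    synthetic = minority[base] + gap * (minority[picked] - minority[base])

    X_res = np.vstack([X, synthetic])
    y_res = np.concatenate([y, np.full(n_new, minority_label)])
    return X_res, y_res


# Hypothetical imbalanced training split: six samples of class 0, three of class 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [0.1, 0.1], [0.2, 0.0], [0.0, 0.2],
                    [1.0, 1.0], [1.2, 1.0], [1.0, 1.3]])
y_train = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])

X_res, y_res = smote_oversample(X_train, y_train, minority_label=1, n_new=3, k=2)
print(X_res.shape, np.bincount(y_res))
```

Training the best model once on `(X_train, y_train)` and once on `(X_res, y_res)`, then scoring both on the same untouched test split, gives the with/without comparison the question asks for.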
