Question: Please help with 2i and 2j! The language is Python.

2i) Classification #2: Using Only Message

Now, rather than using only the subject line, let's try to classify the email conversation using the message field.

Using the function train_DT, train a decision tree on message_train_x and category_train_Y and save the model as message_clf.

In [ ]:
    # YOUR CODE HERE
    raise NotImplementedError()

In [ ]:
    assert isinstance(message_clf, DecisionTreeClassifier)
    assert hasattr(message_clf, "predict")

We'll now check performance on our training data for this classification model:

In [ ]:
    # You should observe an accuracy of around 100%.
    message_predicted_train_Y = message_clf.predict(message_train_x)
    print(classification_report(category_train_Y, message_predicted_train_Y))

And again, we'll check performance on our test data for this model:

In [ ]:
    # You should observe an accuracy of around 57%.
    message_predicted_test_Y = message_clf.predict(message_test_x)
    print(classification_report(category_test_Y, message_predicted_test_Y))

In [ ]:
    assert message_predicted_train_Y.shape == (14984,)
    assert message_predicted_test_Y.shape == (3746,)
    precision, recall, _, _ = precision_recall_fscore_support(category_train_Y, message_predicted_train_Y)
    assert np.isclose(precision[0], 1.0, 0.02)
    assert np.isclose(precision[1], 1.00, 0.02)

2j) Classification #3: Combine subject and message

Create a new column called combined in news_df which stores a combination of the subject and message fields. For example, if the subject for a data point is "Fire in California" and the message is "Fire started because of a gender reveal party", then your combined column should contain "Fire in California Fire started because of a gender reveal party".

Note: The subject and the message should be separated by a single space.

In [ ]:
    # YOUR CODE HERE
    raise NotImplementedError()

In [ ]:
    assert news_df.shape == (18730, 8)
    assert list(news_df.columns)[-1] == 'combined'
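A minimal sketch of how the two "YOUR CODE HERE" cells might be filled in. The notebook's own train_DT is not shown in the question, so the version below is a hypothetical stand-in that simply fits a scikit-learn DecisionTreeClassifier. The tiny arrays and the one-row DataFrame are toy placeholders for the real message_train_x, category_train_Y, and news_df, and the column names "subject" and "message" are assumed from the problem text.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier


def train_DT(X, y):
    """Hypothetical stand-in for the assignment's train_DT helper."""
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X, y)
    return clf


# Toy placeholders for the notebook's variables (assumptions, not the real data).
message_train_x = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
category_train_Y = np.array([0, 1, 1, 0])

# 2i) Train a decision tree on the message features and keep the fitted model.
message_clf = train_DT(message_train_x, category_train_Y)

# 2j) Join subject and message with a single space; assigning a new column
# appends it as the last column of the DataFrame.
news_df = pd.DataFrame({
    "subject": ["Fire in California"],
    "message": ["Fire started because of a gender reveal party"],
})
news_df["combined"] = news_df["subject"] + " " + news_df["message"]
```

In the actual notebook, only the line `message_clf = train_DT(message_train_x, category_train_Y)` would go in the 2i cell and only the `news_df["combined"] = ...` line in the 2j cell, since train_DT and the data are already defined there. If either text column contains missing values, the concatenation produces NaN for that row, so filling them first (for example with an empty string) may be needed.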
