Q3. Add dialogue context data and features

Adjust create_character_document_from_dataframe and the other functions appropriately so the data incorporates the context of the line spoken by the characters in terms of the lines spoken by other characters in the same scene (immediately before and after). You can also use scene information from the other columns (but NOT the gender and character names directly).
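One way to read this requirement is sketched below. It is only an illustration under stated assumptions, not the reference solution: the original create_character_document_from_dataframe is not shown, so the helper name create_character_documents_with_context, the column names Episode, Scene, Character_name and Line, and the PREV_/NEXT_ token prefixes are all hypothetical. The sketch works on a list of row dicts in spoken order (for a pandas DataFrame, df.to_dict("records") produces that shape). The neighbouring speaker's name and gender are never added as features.

```python
from collections import defaultdict


def _tag(line, prefix):
    """Prefix every whitespace token so context words stay distinct from own words."""
    return " ".join(prefix + tok for tok in line.split())


def create_character_documents_with_context(rows, max_line_count=None):
    """Build one document per character, adding same-scene neighbouring lines.

    rows: list of dicts in spoken order with the ASSUMED keys
          'Episode', 'Scene', 'Character_name', 'Line'.
    """
    docs = defaultdict(list)
    line_counts = defaultdict(int)
    for i, row in enumerate(rows):
        name = row["Character_name"]
        if max_line_count is not None and line_counts[name] >= max_line_count:
            continue
        line_counts[name] += 1
        scene = (row["Episode"], row["Scene"])
        parts = []
        # Line immediately before: same scene, different speaker.
        if i > 0:
            prev = rows[i - 1]
            if (prev["Episode"], prev["Scene"]) == scene and prev["Character_name"] != name:
                parts.append(_tag(prev["Line"], "PREV_"))
        # The character's own line, closed with an end-of-line marker.
        parts.append(row["Line"] + " _EOL_")
        # Line immediately after: same scene, different speaker.
        if i + 1 < len(rows):
            nxt = rows[i + 1]
            if (nxt["Episode"], nxt["Scene"]) == scene and nxt["Character_name"] != name:
                parts.append(_tag(nxt["Line"], "NEXT_"))
        docs[name].append(" ".join(parts))
    return {name: " ".join(lines) for name, lines in docs.items()}
```

Prefixing the context tokens, rather than simply concatenating the neighbouring lines, lets the later feature-extraction step treat "what this character says" and "what is said to this character" as separate features.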

The code to be changed is:

from sklearn.feature_extraction import DictVectorizer

corpusVectorizer = DictVectorizer()  # corpusVectorizer which will just produce sparse vectors from feature dicts
# Any matrix transformers (e.g. tf-idf transformers) should be initialized here

def create_document_matrix_from_corpus(corpus, fitting=False):
    """Method which fits different vectorizers
    on data and returns a matrix.

    Currently just does simple conversion to matrix by vectorizing the dictionary. Improve this for Q3.
    ::corpus:: a list of (class_label, document) pairs.
    ::fitting:: a boolean indicating whether to fit/train the vectorizers (should be true on training data)
    """
    # uses the global variable of the corpus Vectorizer to improve things
    if fitting:
        corpusVectorizer.fit([to_feature_vector_dictionary(doc) for name, doc in corpus])
    doc_feature_matrix = corpusVectorizer.transform([to_feature_vector_dictionary(doc) for name, doc in corpus])
    #training_feature_matrix[0].toarray()
    return doc_feature_matrix

training_feature_matrix = create_document_matrix_from_corpus(training_corpus, fitting=True)
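The code above calls to_feature_vector_dictionary, whose body is not shown in the question. A minimal sketch of how that function might treat context tokens follows. It assumes each document arrives as a list of tokens and that context tokens carry a prefix such as PREV_ or NEXT_; both the prefix convention and the context_weight parameter are hypothetical choices for illustration, not part of the original assignment code.

```python
from collections import Counter

# ASSUMED markers for tokens that come from neighbouring lines in the scene.
CONTEXT_PREFIXES = ("PREV_", "NEXT_")


def to_feature_vector_dictionary(tokens, context_weight=0.5):
    """Turn a token list into a {feature: value} dict for DictVectorizer.

    Own tokens are counted at full weight; context tokens are down-weighted
    by context_weight so they inform the classifier without dominating it.
    """
    counts = Counter(tokens)
    features = {}
    for tok, n in counts.items():
        weight = context_weight if tok.startswith(CONTEXT_PREFIXES) else 1.0
        features[tok] = n * weight
    return features
```

The resulting dictionaries can be passed to corpusVectorizer unchanged. A matrix transformer such as scikit-learn's TfidfTransformer could additionally be initialized where the comment in the code suggests, fitted only when fitting=True, and applied to the output of corpusVectorizer.transform.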
