
HW_1 Lexicon-based Text Mining
With Whom Do People Spend Happy Moments?
Data
HappyDB is a corpus of 100,000+ crowd-sourced descriptions of happy moments. HappyDB is
available on GitHub [1]. The cleaned data set is located at [2].
Task (10 points in total; four subtasks, 25% each)
(1) HappyDB provides a people dictionary [3], which is a lexicon of common social
relationships. Use this lexicon to find the three social relationships mentioned most
often in happy moments, e.g. spouse, parents, children, friends, or someone else.
(2) Use your world knowledge to assess the strengths and weaknesses of the people
dictionary in terms of answering the question "With whom do people spend happy
moments?" You can define "people", "strength", and "weakness" in your own way here.
If you think this dictionary is already perfect, you can articulate your argument and skip
task #3 instead.
(3) Modify the people dictionary to fix the weaknesses that you have identified. Use the
revised lexicon to redo task #1.
(4) Extract the five context words before and after each mention of the most mentioned
people, sort them by frequency, and explore patterns in the 100 most frequent context
words. For example, do any words indicate activities that people do in happy moments?
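The counting in task #1 and the context-window extraction in task #4 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the expected solution: the tiny `people_lexicon` set and the three `moments` strings are made-up stand-ins for the people dictionary [3] and the cleaned data set [2], and the helper names `tokenize`, `count_mentions`, and `context_words` are hypothetical. The sketch counts each relationship term once per moment, which is one possible choice; counting every occurrence would be equally defensible.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the real inputs: in the assignment, the lexicon
# would be loaded from the people dictionary [3] and the moments from the
# cleaned HappyDB data [2].
people_lexicon = {"wife", "husband", "friend", "friends", "son", "daughter", "mother"}
moments = [
    "I went to dinner with my wife and my friends.",
    "My son won his first soccer game.",
    "My friend and I watched a movie with my wife.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately simple)."""
    return re.findall(r"[a-z']+", text.lower())


def count_mentions(texts, lexicon):
    """Count how many moments mention each lexicon term at least once."""
    counts = Counter()
    for text in texts:
        # The set intersection keeps each term at most once per moment.
        counts.update(set(tokenize(text)) & lexicon)
    return counts


def context_words(texts, target, window=5):
    """Count tokens within `window` positions before and after `target`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == target:
                counts.update(tokens[max(0, i - window):i])   # words before
                counts.update(tokens[i + 1:i + 1 + window])   # words after
    return counts


mention_counts = count_mentions(moments, people_lexicon)
top_term = mention_counts.most_common(1)[0][0]
context_counts = context_words(moments, top_term)

print(mention_counts.most_common(3))   # task #1: most mentioned relationships
print(context_counts.most_common(10))  # task #4: frequent context words
```

On real data, function words such as "my" and "with" will probably dominate the context counts, so filtering a stopword list before looking for activity words may be worth considering. Note also that exact token matching treats "friend" and "friends" as separate terms and misses multi-word entries, which is the kind of weakness task #2 asks about.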
Submission
Write a Python script in Jupyter Notebook and submit your code file named
HW1_firstname_lastname.ipynb. The file should include your code and your explanations in
comments. Make sure to include your name in both the filename and a comment at the top of
your code.
Grading criteria:
- Accuracy and reproducibility in analysis
- Clarity in explaining the analytical process and result
