Question: The question requires you to use SQL to solve a problem in natural language processing (NLP). Do note that solving this problem does not require any previous knowledge of NLP. In other words, in combination with your knowledge of SQL, the question itself provides all information necessary to solve the given problem.
The review table contains the text content of the film review and its star rating on a scale from 1 to 5. A review is negative if its rating is <3, neutral if its rating is =3, and positive if its rating is >3.
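The rating rule above can be sketched with a CASE expression. This is a minimal illustration run through Python's sqlite3 module, not the expected answer: the key column review_id and the three sample rows are assumptions, since the text only describes the rating attribute.

```python
import sqlite3

# Hypothetical minimal schema: only the rating attribute is described in the
# question; review_id is an assumed key name and the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review (review_id INTEGER PRIMARY KEY, rating INTEGER)")
conn.executemany("INSERT INTO review VALUES (?, ?)", [(1, 1), (2, 3), (3, 5)])

# Map the 1-5 star rating to the three classes defined in the question.
LABEL_SQL = """
SELECT review_id,
       CASE
           WHEN rating < 3 THEN 'negative'
           WHEN rating = 3 THEN 'neutral'
           ELSE 'positive'
       END AS label
FROM review
ORDER BY review_id
"""

labels = conn.execute(LABEL_SQL).fetchall()
print(labels)
```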
The token table represents a tokenized review. Tokenization is the first step in natural language processing (NLP). It splits text into a sequence of words and punctuation (i.e. tokens). Each token has a position in this sequence, which is stored in the attribute token_id. Each token has a part of speech (POS), which is stored in the pos attribute. In grammar, a POS is a category of words that have similar grammatical properties, e.g. nouns, verbs, adjectives, etc. Each POS is described in the partofspeech table. Each token also has a lemma and a stem, which represent its normalised forms.
It can be checked whether a lemma has any sentiment associated with it. This information is stored in the sentiment table. The score attribute stores a real number between -5 and 5, which indicates the strength of the sentiment. Negative numbers indicate a negative sentiment, and, conversely, positive numbers indicate a positive sentiment. If a lemma is not present in the sentiment table, it is assumed that it has a neutral sentiment, i.e. the sentiment score of zero.
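One way to apply the "missing lemma means zero" rule is a LEFT JOIN combined with COALESCE. The sketch below assumes that token carries a review_id column and that sentiment is keyed by a lemma column; the lemmas and scores are made up for illustration only.

```python
import sqlite3

# Assumed columns: token(review_id, token_id, lemma), sentiment(lemma, score).
# The question names token_id, lemma and score; the rest is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE token (review_id INTEGER, token_id INTEGER, lemma TEXT);
CREATE TABLE sentiment (lemma TEXT PRIMARY KEY, score REAL);
INSERT INTO token VALUES (1, 1, 'great'), (1, 2, 'movie'), (1, 3, 'awful');
INSERT INTO sentiment VALUES ('great', 3.0), ('awful', -3.0);  -- made-up scores
""")

# LEFT JOIN keeps tokens without a lexicon entry; COALESCE turns the
# resulting NULL score into the neutral score 0.
SCORE_SQL = """
SELECT t.token_id, t.lemma, COALESCE(s.score, 0.0) AS score
FROM token AS t
LEFT JOIN sentiment AS s ON s.lemma = t.lemma
ORDER BY t.token_id
"""

scores = conn.execute(SCORE_SQL).fetchall()
print(scores)
```

An inner join would silently drop 'movie' here, which matters once scores are aggregated per review.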
Stop words are commonly used words such as 'a', 'the', 'is', 'are', etc. Stop words are sometimes used in NLP to ignore words that carry very little useful information. For example, search engines normally reduce a query such as 'What is a stop word?' to just 'stop word' by ignoring the stop words 'what', 'is' and 'a'. The stopword table contains a list of lemmatised stop words. To check whether a token from the token table is a stop word or not, one simply needs to check whether its lemma is present in the stopword table.
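The stop word check described above can be sketched as a NOT EXISTS filter on the lemma. The column name stopword.lemma and the token rows are assumptions; the rows mimic the lemmatised query 'What is a stop word?' from the example.

```python
import sqlite3

# Assumed columns: token(review_id, token_id, lemma), stopword(lemma).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE token (review_id INTEGER, token_id INTEGER, lemma TEXT);
CREATE TABLE stopword (lemma TEXT PRIMARY KEY);
-- 'is' appears as its lemma 'be', since the stop word list is lemmatised.
INSERT INTO token VALUES
    (1, 1, 'what'), (1, 2, 'be'), (1, 3, 'a'), (1, 4, 'stop'), (1, 5, 'word');
INSERT INTO stopword VALUES ('what'), ('be'), ('a');
""")

# Keep only tokens whose lemma does not appear in the stopword table.
CONTENT_SQL = """
SELECT t.lemma
FROM token AS t
WHERE NOT EXISTS (SELECT 1 FROM stopword AS sw WHERE sw.lemma = t.lemma)
ORDER BY t.token_id
"""

content_lemmas = [row[0] for row in conn.execute(CONTENT_SQL)]
print(content_lemmas)
```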
Assuming that the sentiment score of a text document is based on the sentiment definition described below, find the film starring an actor with the first name JAYNE whose reviews have the most negative sentiment score on average. Note: Ties are broken by sorting the titles of the tied films in alphabetical order.
The following example illustrates the way in which the sentiment score is calculated.
document = The movie was absolutely fantastic , but the ending was disappointing .
For the sole purpose of this example, let us assume that we have the following sentiment lexicon:
The tokens that are not present in the lexicon are assumed to have neutral sentiment, i.e. their sentiment score is zero.
The sentiment of a document is then calculated as the sentiment of the token with the most extreme sentiment score. Consider, for example, the following two documents:
document 1
sentiment(document 1
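Putting the pieces together, a possible shape of the final query is sketched below. It is a sketch under stated assumptions rather than a verified answer: the tables film(film_id, title), actor(actor_id, first_name) and film_actor(actor_id, film_id), the columns review.film_id and token.review_id, and all sample rows and scores are hypothetical. "Most extreme" is read as the score with the largest absolute value, a tie between a positive and a negative score of equal magnitude is resolved towards the positive one (the visible text does not say), and stop words are not filtered out because the visible text does not state whether they should be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and made-up data, for illustration only.
CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
CREATE TABLE review (review_id INTEGER PRIMARY KEY, film_id INTEGER);
CREATE TABLE token (review_id INTEGER, token_id INTEGER, lemma TEXT);
CREATE TABLE sentiment (lemma TEXT PRIMARY KEY, score REAL);

INSERT INTO film VALUES (1, 'ALPHA FILM'), (2, 'BETA FILM'), (3, 'GAMMA FILM');
INSERT INTO actor VALUES (1, 'JAYNE'), (2, 'PENELOPE');
INSERT INTO film_actor VALUES (1, 1), (1, 2), (2, 3);
INSERT INTO review VALUES (1, 1), (2, 1), (3, 2), (4, 3);
INSERT INTO token VALUES
    (1, 1, 'fantastic'), (1, 2, 'disappointing'),
    (2, 1, 'awful'), (2, 2, 'movie'),
    (3, 1, 'disappointing'), (3, 2, 'good'),
    (4, 1, 'awful');
INSERT INTO sentiment VALUES
    ('fantastic', 4.0), ('disappointing', -2.0), ('awful', -3.0), ('good', 1.0);
""")

FILM_SQL = """
WITH doc AS (
    -- Per review: the token score with the largest absolute value.
    -- It is always either the minimum or the maximum score of the review.
    SELECT t.review_id,
           CASE
               WHEN ABS(MIN(COALESCE(s.score, 0.0))) > ABS(MAX(COALESCE(s.score, 0.0)))
               THEN MIN(COALESCE(s.score, 0.0))
               ELSE MAX(COALESCE(s.score, 0.0))
           END AS doc_score
    FROM token AS t
    LEFT JOIN sentiment AS s ON s.lemma = t.lemma
    GROUP BY t.review_id
)
SELECT f.title, AVG(d.doc_score) AS avg_score
FROM film AS f
JOIN review AS r ON r.film_id = f.film_id
JOIN doc AS d ON d.review_id = r.review_id
WHERE EXISTS (
    -- Restrict to films with at least one actor whose first name is JAYNE.
    SELECT 1
    FROM film_actor AS fa
    JOIN actor AS a ON a.actor_id = fa.actor_id
    WHERE fa.film_id = f.film_id AND a.first_name = 'JAYNE'
)
GROUP BY f.film_id, f.title
ORDER BY avg_score ASC, f.title ASC   -- most negative first, ties by title
LIMIT 1
"""

result = conn.execute(FILM_SQL).fetchone()
print(result)
```

With the made-up rows, GAMMA FILM has the lowest average but no actor named JAYNE, so the query returns BETA FILM. Using EXISTS rather than a direct join to actor avoids counting a review several times when a film has more than one matching actor.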