Approach
( See file negation_conversion.py )
Step 1:
We find each negative scope, that is, its start word and its end word, with the find_portee(doc) method. We first search for the negative words in the sentence, then find their head words in the dependency tree. For each head word found, we check that it is not itself a negative word: its label must not be ‘neg’ and it must not belong to our list of negative words, so that it does not receive a NOT_ prefix later. Finally, we return the start word and the end word of the scope.
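The search described in Step 1 might be sketched as follows. This is only an illustration under stated assumptions, not the actual find_portee from negation_conversion.py: the `Token` dataclass is a hypothetical stand-in for a spaCy token (text, dependency label, index of its head), `NEGATIVE_WORDS` is a simplified lowercase version of the list given in Step 2, and the scope is assumed to be the span covered by the subtree of the head word.

```python
from dataclasses import dataclass

# Simplified, lowercase stand-in for the negative word list (assumption).
NEGATIVE_WORDS = {"no", "not", "without", "never", "none",
                  "nothing", "hardly", "barely", "n't"}


@dataclass
class Token:
    """Hypothetical stand-in for a parsed token."""
    text: str
    dep: str    # dependency label, e.g. "neg"
    head: int   # index of the head token (the root points to itself)


def is_negative(tok):
    """A token is negative if labelled 'neg' or found in the word list."""
    return tok.dep == "neg" or tok.text.lower() in NEGATIVE_WORDS


def subtree_span(tokens, root):
    """Return (first, last) indexes of the tokens dominated by `root`."""
    members = []
    for i in range(len(tokens)):
        j, seen = i, set()
        while True:
            if j == root:
                members.append(i)
                break
            if tokens[j].head == j or j in seen:
                break  # reached the sentence root (or a cycle) without meeting `root`
            seen.add(j)
            j = tokens[j].head
    return min(members), max(members)


def find_scopes(tokens):
    """For each negative word, return the span of its (non-negative) head."""
    scopes = []
    for tok in tokens:
        if not is_negative(tok):
            continue
        if is_negative(tokens[tok.head]):
            continue  # a negative head must not be treated as a scope
        scopes.append(subtree_span(tokens, tok.head))
    return scopes


# "I do not like it ." with "like" (index 3) as the root.
sentence = [
    Token("I", "nsubj", 3), Token("do", "aux", 3), Token("not", "neg", 3),
    Token("like", "ROOT", 3), Token("it", "dobj", 3), Token(".", "punct", 3),
]
scopes = find_scopes(sentence)
```

With a real spaCy `Doc`, the same information would come from the token attributes rather than from this hand-built list.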
Step 2:
Going through all the words in a sentence, we check whether each word has a negative meaning, either using POS tagging in spaCy[1] or by looking it up in the list of negative words that we built to capture the negative scopes that spaCy does not detect.
List of negative words:
word_negatifs = [ "no", "No", "not", "Not", "without", "never do", "Never", "Without", "none", "None", "nothing", "Nothing", "hardly", "hardly be", "barely", "n't" ]
In addition, we built a list of words and punctuation marks that mark the end of a negative scope, which lets us avoid certain errors in the dependency analysis.
List of words and punctuation marks:
word_portee = [ "but", "as", "liked", "even", "which", "and", "while", "why", "." ]
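As a rough illustration of how the two lists can work together, the sketch below finds each negative word and the first scope-ending word that follows it. It covers only the list-based check, not the spaCy-based one, and `negative_spans` is a hypothetical helper name rather than a function from negation_conversion.py.

```python
word_negatifs = ["no", "No", "not", "Not", "without", "never do", "Never",
                 "Without", "none", "None", "nothing", "Nothing", "hardly",
                 "hardly be", "barely", "n't"]
word_portee = ["but", "as", "liked", "even", "which", "and", "while", "why", "."]


def negative_spans(words):
    """Return (cue_index, end_index) pairs for a list of word strings.

    end_index is the index of the first scope-ending word after the cue,
    or len(words) if none is found. Multi-word entries such as "never do"
    cannot match a single word in this simplified version.
    """
    spans = []
    for i, word in enumerate(words):
        if word not in word_negatifs:
            continue
        end = next(
            (j for j in range(i + 1, len(words)) if words[j] in word_portee),
            len(words),
        )
        spans.append((i, end))
    return spans


words = "I am barely able to stay awake while reading it .".split()
spans = negative_spans(words)  # the cue "barely" is at 2, "while" ends the scope at 7
```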
Step 3:
We convert the sentence by checking each of its words.
- We do nothing if:
  - The word is in the list we built of words ending a negative scope
  - The word is in the list of words ending a negative scope found in the tree
- We apply the NOT_ prefix to the word if:
  - The word is in the list of words of the negative scope found in the tree
- Otherwise we apply the NOT_ prefix to the word
Once all the words have been processed, we reconstruct the sentence by adding a space in front of each word in the list, and return it.
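The conversion and reconstruction can be sketched as below. This is a minimal illustration, assuming the scopes are already known as (cue index, end index) pairs; `apply_not_prefix` and its parameters are hypothetical names, not taken from negation_conversion.py. The sentence is rebuilt with `" ".join(...)` instead of prepending a space to each word, which is one way to avoid a leading space in the result.

```python
def apply_not_prefix(words, spans, skip=frozenset()):
    """Prefix with NOT_ every word strictly between a cue and its scope end.

    words: list of word strings
    spans: list of (cue_index, end_index) pairs; end_index is exclusive
    skip:  words that must never be prefixed (scope-ending or negative words)
    """
    out = list(words)
    for cue, end in spans:
        for i in range(cue + 1, end):
            if words[i] in skip:
                continue
            if not out[i].startswith("NOT_"):
                out[i] = "NOT_" + words[i]
    # Joining with a single space leaves no space before the first word.
    return " ".join(out)


words = "I can cook all these things without a book .".split()
converted = apply_not_prefix(words, [(6, 9)], skip={"."})
```

Here the cue "without" is at index 6 and the scope ends at the final period (index 9), so only "a" and "book" receive the prefix.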
( See APPENDIX 1 – Results of task 1 )
Using displaCy we were able to display the dependency tree for each of the sentences in order to study them[2].
This sentence has two negative scopes: one begins with the word “no” and ends at a comma, and the other begins with “not” and ends at a period.
This negative scope begins at the beginning of the sequence and ends at a comma.
Build the negative scope using the dependency tree rather than a list of negative words.
Handle the various special cases that led to incorrect results.
We fixed this:
• A negative scope that is not recognized in the dependency tree:
Conversion: I hardly recommend to you Abitazione Pigneto .
Real phrase: I hardly NOT_recommend NOT_to NOT_you NOT_Abitazione NOT_Pigneto .
• Small details like punctuation marks at the end of a sentence:
Conversion: Should n't NOT_it NOT_be NOT_minor NOT_league NOT_?
Real phrase: Should n't it be NOT_minor NOT_league ?
We chose (due to lack of time) not to resolve this:
• In some cases the NOT_ prefix should not be applied immediately after the negative word:
Conversion: A bigger difference could hardly NOT_be NOT_imagined .
Real phrase: A bigger difference could hardly be NOT_imagined .
• spaCy tokenization errors:
Conversion: I have tried for two weeks without NOT_a NOT_returned NOT_e NOT_- NOT_mail and none NOT_have NOT_come NOT_back
Real phrase: I have tried for two weeks without NOT_a NOT_returned NOT_e-mail and none NOT_have NOT_come NOT_back
• A poorly detected end of scope:
Conversion: I am barely NOT_able NOT_to NOT_stay awake while reading it .
Real phrase: I am barely NOT_able NOT_to NOT_stay NOT_awake while reading it .
We could not solve this:
• An extra space added at the beginning of the sentence:
Conversion: I can cook all these things without NOT_a NOT_book .
Real phrase: I can cook all these things without NOT_a NOT_book .