Named Entities Recognition (Rasa-like syntax) Annotator for the vim editor.
This vim plugin helps to annotate named entities using simple entity annotation syntax, using inline-text mark-up tags, following the format used in RASA YAML files.
The final goal is to possibly demonstrate how fast is annotate with a text editor (specifically vim) a text file of intents + entities examples of a training set.
What the named entity [entity_value](entity_label)
syntax format is?
[entity_value](entity_label)
^ ^
| |
| entity name (label)
|
value (sequence of characters/words) for entity referenced with `entity_label`
Where:
-
entity_value
- is any sequence of characters or words,
- delimited by characters
[
and]
-
entity_label
- is the entity name (label)
is a string of kind "variable name" in a programming language style,
by example the label is made by alphabet letters and the character
_
- the label is delimited by characters
(
e)
- is the entity name (label)
is a string of kind "variable name" in a programming language style,
by example the label is made by alphabet letters and the character
By example, given the sentence
my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
You want to annotate three entities (entity_label = entity_value):
person
=Giorgio Robino
city
=Genova
address
=corso Magenta 35/4
Using above described syntax, the annotated sentence is:
my name is [Giorgio Robino](person) and I live in [Genova](city), [corso Magenta 35/4](address)
What the plugin does?
With the vim plugin command :NeraSet
,
you can map up to 12 function keys (<F1>
,...,<F12>
) to a syntax substitution/decoration "macro"
that add a entity label
- to a visual selected text,
- or to the current word and a configurable number of adjacent words, setting the cursor to the start of the word (entity) you want to tag.
In vim command mode (:
) these commands are available:
command | description |
---|---|
:NeraSet functionKey label [contiguousWords] |
maps the specified functionKey to a substitution macro with argument label, and optional argument contiguousWords. functionKey valid values are number 1 ... 12 or strings F1 ... F12 , or <F1> ... <F12> key pressing.label is the entity name (single word in camelCase or snake_case). contiguousWords is a number of contiguous words to be selected, this is an optional argument (default value is 1). |
Utilities:
command | description |
---|---|
:NeraMapping |
shows function keys mapping |
:NeraLoad command_script_file |
load and execute a script file containing Nera commands or any other vim : commands |
:NeraLabels label ... [label] |
Set a list of labels, to be used afterward with NeraSet |
:NeraLabelsClear |
Clear the list of preset labels, to be used afterward with NeraSet |
Given the sentence (line):
my name is Giorgio Robino and I live in Genova, corso Magenta 35/4
To assign to function key <F1>
a substitution for visual mode and single word selection:
-
assign a new "macro" substitution to
<F1>
:NeraSet f1 person_name
-
put the cursor at the begin of the word you want to annotate:
my name is Giorgio Robino nd I live in Genova, corso Magenta 35/4 ^ | set the vim cursor here
-
press
<F1>
. The line is updated with the entity notation syntax decoration:my name is [Giorgio](person_name) Robino and I live in Genova, corso Magenta 35/4
Maybe the example above is not what you exactly want, because a full person name is usually composed
by two consecutive words (Giorgio Robino),
so you maybe want to preset (another or the same) function key <F1>
to automatically substitute the current and the successive word.
In this case, set the mapping with argument contiguousWords
set to 2
:
-
assign a new "macro" substitution to
<F2>
:NeraSet F2 person_name 2
-
again put the cursor at the begin of the word you want to annotate:
my name is Giorgio Robino and I live in Genova, corso Magenta 35/4 ^ | set the vim cursor here
-
press
<F2>
. The line is updated and in this casemy name is [Giorgio Robino](person_name) and I live in Genova, corso Magenta 35/4
Anyway, even if you do not specify the words number
argument,
you can proceed withe visual selection mode. So:
-
assign a new "macro" substitution to
<F3>
:NeraSet <F3> address
-
go in vim visual mode (pressing
v
) and select the span you want to annotate:my name is Giorgio Robino and I live in Genova, corso Magenta 35/4 ^ ^ | | start visual selection end visual selection
-
press
esc
and<F3>
. The line is updated and in this casemy name is [Giorgio Robino](person_name) and I live in [Genova, corso Magenta 35/4](address)
You want to prepare a precise (short) list of labels
you will use afterward to annotate with NeraSet
:
:NeraLabels name surname address city age gender
This list act as the reference list, to validate NeraSet
label argument.
By example,
NeraSet <F4> job
generates a warning message, because you are setting a label not previously declared:
warning: label 'job' is not one of the configured labels: name surname address city age gender
functionKey: <F4>, label: job, contiguous words: 1
Press ENTER or type command to continue
Execute all Nera commands previously saved in specified script file.
-
you create your script file
examples/my_project_configs.vim
containing Nera or other vim commands, by example:" " my_project_configs.vim " " F1 - F4 NeraSet <F1> name 1 NeraSet <F2> address 1 NeraSet <F3> company 1 NeraSet <F4> location 1 " F5 - F8 NeraSet <F5> email NeraSet <F6> name 2 NeraSet <F7> name 3 NeraSet <F8> address 1 " F9 - F12 NeraSet <F9> gender NeraSet <F10> address 3 NeraSet <F12> company 2
-
Afterward you run the script from command mode:
:NeraLoad examples/my_project_configs.vim
Suppose you run commands:
:NeraSet <F1> name 1
:NeraSet <F2> address 1
:NeraSet <F3> company 1
:NeraSet <F4> location 1
:NeraSet <F5> email
Afterward, you want to show the key mappings:
:NeraMapping
<F1> c1w[<C-R><C-O>"](name)<Esc>
<F2> c1w[<C-R><C-O>"](address)<Esc>
<F3> c1w[<C-R><C-O>"](company)<Esc>
<F4> c1w[<C-R><C-O>"](location)<Esc>
<F5> c1w[<C-R><C-O>"](email)<Esc>
<F6>
<F6>
<F7>
<F8>
<F9>
<F10>
<F11>
<F12>
Press ENTER or type command to continue
-
Commands arguments auto completion
When using command
NeraSet
you can use arguments auto completion (function key, labels, etc.). When using commandNeraLoad
you can exploit file name argument auto completion -
Undo labeling
If you are unhappy with your
NeraSet
labeling, just undo in vim as usual, pressingu
in normal mode! -
Visual mode is always on
Any time you assign a key with
NeraSet
, you set the word mode for a specified number of contiguous words, but you also enable the visual mode! You can optionally- select set the cursor at the start of word and press afterward the function key
- select in visual mode a span of words and press afterward the function key
Using vim-plug, in your .vimrc
file:
Plug 'solyarisoftware/nera.vim'
Some files available in examples directory of this repo. Here a live demo of this plugin commands usage to annotate entities:
This project is a work-in-progress proof-of-concept.
I'm not a vimscript expert, so any coding contribute is welcome.
For any proposal and issue, please submit here on github issues for bugs, suggestions, etc. You can also contact me via email (giorgio.robino@gmail.com).
I'm especially interested in any markup-based entity syntax formats alternative/different from RASA. Please let me know. Do not esitate to open a 'change request' issue.
IF YOU LIKE THE PROJECT, PLEASE ⭐️STAR THIS REPOSITORY TO SHOW YOUR SUPPORT! 🙏
- add a help / online tutorial command
- improve arguments validation
- extend syntax, managing not only RASA-like style syntax annotation, but also other different systems' syntax:
-
v. 0.5.0
- new commands
NeraLabels
andNeraLabelsClear
NeraSet
arguments auto-completion
- new commands
-
v. 0.4.1
NeraLoad
new command to load script of commandsNeraMapping
has now a cleaner list of key mappingsNeraSet
now accept the function key argument just pressing the corresponding function key!
- I made another plugin possibly complementary: Highlight.vim to colorize pattern of texts, with a random or specified background colors. A Possible usage is to highlight entity names and entity labels as show here: highlight entities having RASA-YAML entity annotation syntax
MIT (c) Giorgio Robino