QUESTION: I am using the Large Knowledge Base (LKB) gazetteer to obtain lookup annotations in the GATE framework. I was wondering how I can get a lookup annotation for the percentage character (%). If I add "2%" to the RDF dictionary, I get lookup annotations for "2" and "2%", but when I add just "%", I don't get any lookup annotations.


The LKB gazetteer is designed to deal with regular names of objects and therefore it treats punctuation separately. It considers punctuation at the end of a name optional.

For example, if the dictionary contains the entry "Oracle Corp." (without the quotation marks), then both Oracle Corp and Oracle Corp. will be annotated wherever they occur in the text. The "." (dot) in the original name is not trimmed, however, so Oracle Corp% will not be annotated, because "%" (percent sign) is a different character from "." (dot).
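The rule can be illustrated with a small toy model in Python. This is only a sketch of the behaviour described here, not the LKB gazetteer's actual implementation; the function names are made up for illustration, and Python's string.punctuation is assumed as the definition of "punctuation". Under this model, an entry matches either as written or with its trailing punctuation dropped, and an entry that is nothing but punctuation has nothing left to match.

```python
import string


def surface_forms(entry):
    """Toy model (hypothetical helper): forms of a dictionary entry that match."""
    # Drop trailing punctuation to get the "core" name.
    core = entry.rstrip(string.punctuation)
    if not core:
        # Pure punctuation such as "%": no core name, so no lookup at all.
        return set()
    # The entry as written, plus the variant without trailing punctuation.
    return {entry, core}


def is_annotated(candidate, dictionary):
    """True if the candidate string equals a surface form of any entry."""
    return any(candidate in surface_forms(entry) for entry in dictionary)


dictionary = ["Oracle Corp.", "2%", "%"]
for text in ["Oracle Corp", "Oracle Corp.", "Oracle Corp%", "2", "2%", "%"]:
    print(repr(text), is_annotated(text, dictionary))
```

In this model "2%" yields the forms "2" and "2%", which mirrors the two lookups reported in the question, while a bare "%" yields none.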

As a side effect, however, strings that consist purely of punctuation cannot be annotated. We recommend that you create a regular gazetteer that runs alongside the LKB Gazetteer and let it handle strings that contain no alphanumeric characters.
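A minimal sketch of such a regular gazetteer, assuming GATE's standard list-based gazetteer (for example the ANNIE Gazetteer) with its usual index file format of listFile:majorType:minorType. The file names punctuation.lst and lists.def, and the type labels symbol and percent, are placeholders chosen for illustration.

The index file (lists.def here) would contain one line pointing at the list:

```
punctuation.lst:symbol:percent
```

The list file (punctuation.lst here) would contain one entry per line:

```
%
```

With this gazetteer added to the same pipeline as the LKB Gazetteer, "%" should receive a Lookup annotation with majorType symbol and minorType percent. Whether a "%" directly attached to a number (as in "2%") is matched may depend on the gazetteer's wholeWordsOnly setting, so it is worth testing on your own text.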

