To browse the corpus and see which meaning representations are assigned to texts, go to the explorer. It shows one document (a sentence or a short text) at a time, in at least two languages, one of which is always English. To see a translation, click on German, Dutch or Italian. To find out more about the semantic analysis of a text, select one of the following five tabs:
the document in its raw format;
the result of segmentation, with the text split into word and sentence tokens;
sentence alignment for languages other than English;
syntactic and semantic analysis for each sentence;
the meaning representation for the entire text.
Each document set has a unique identifier, consisting of a two-digit part number (00 to 99) and a four-digit document number (0000 to 9999). Three further tabs give extra information about the analysed text:
bits of wisdom;
individual (manual) annotation that corrected machine output;
all warnings issued by the semantic technology pipeline that produced the initial analysis.
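The identifier scheme (a two-digit part plus a four-digit document number) can be sketched in code as follows. This is only an illustration: the `NN/NNNN` textual form with a slash separator, and the names `DocumentId` and `parse_document_id`, are assumptions made for this example and are not taken from the explorer itself.

```python
import re
from typing import NamedTuple


class DocumentId(NamedTuple):
    """Hypothetical container for a document identifier."""
    part: int      # two-digit part, 00..99
    document: int  # four-digit document number, 0000..9999

    def __str__(self) -> str:
        # Zero-pad both components to their fixed widths.
        return f"{self.part:02d}/{self.document:04d}"


# Assumed textual form: two digits, a slash, four digits.
_ID_PATTERN = re.compile(r"([0-9]{2})/([0-9]{4})")


def parse_document_id(text: str) -> DocumentId:
    """Parse a string such as '00/0004' into its two components."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid document identifier: {text!r}")
    return DocumentId(int(match.group(1)), int(match.group(2)))


doc_id = parse_document_id("00/0004")
print(doc_id.part, doc_id.document)  # 0 4
print(DocumentId(7, 42))             # 07/0042
```

The fixed widths mean there are at most 100 parts of 10,000 documents each, so the zero-padded string form also sorts correctly as plain text.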
Documents are sorted by size. You can select a different document by clicking on the icons at the top (previous, random, or next). Documents are reprocessed whenever the models or annotations are updated. It is also possible to force reprocessing by clicking the circular icon at the top of the screen.
The sentences environment is where it all happens. This is the most exciting part, where the semantic analysis of the words (lexical semantics) and the sentences (compositional semantics) is shown. You can select the layers of analysis that you want to see for a sentence:
the semantic tag (part-of-speech tagging for semanticists);
the non-logical symbol (basically: lemmatisation and normalisation);
the WordNet synset of which the word is a member;
the VerbNet roles selected for a word with a functional category;
information about the antecedent of a referring expression;
the supertag, a.k.a. the lexical category in Combinatory Categorial Grammar;
the lexical semantics in the format of a discourse representation structure.
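One way to picture the seven layers is as a per-token record from which a viewer picks the selected fields. The sketch below is a hypothetical data model, not the explorer's own: the class `TokenAnnotation`, its field names, and the function `visible_layers` are invented for illustration, and the example values are made up rather than copied from the corpus.

```python
from dataclasses import dataclass, fields
from typing import Dict, Iterable


@dataclass(frozen=True)
class TokenAnnotation:
    """Hypothetical record holding one value per annotation layer."""
    token: str
    semantic_tag: str  # semantic tag
    symbol: str        # non-logical symbol (lemma, normalised)
    synset: str        # WordNet synset
    roles: str         # VerbNet roles
    antecedent: str    # antecedent of a referring expression, if any
    supertag: str      # CCG lexical category
    lexical_drs: str   # lexical semantics as a DRS, here just a string


# The seven selectable layers: every field except the token itself.
LAYERS = tuple(f.name for f in fields(TokenAnnotation) if f.name != "token")


def visible_layers(annotation: TokenAnnotation,
                   selected: Iterable[str]) -> Dict[str, str]:
    """Return only the layers the user has chosen to display."""
    chosen = set(selected)
    unknown = chosen - set(LAYERS)
    if unknown:
        raise ValueError(f"unknown layers: {sorted(unknown)}")
    return {name: getattr(annotation, name)
            for name in LAYERS if name in chosen}


# Illustrative values only.
example = TokenAnnotation(
    token="sleeps",
    semantic_tag="ENS",
    symbol="sleep",
    synset="sleep.v.01",
    roles="[Agent]",
    antecedent="",
    supertag=r"S[dcl]\NP",
    lexical_drs="<DRS for 'sleeps'>",
)

print(visible_layers(example, ["symbol", "supertag"]))
```

Keeping the layers as named fields makes each one independently selectable, which mirrors how the sentence view lets you switch individual layers on and off.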