Manual
To browse through the corpus and see which meaning representations are assigned to texts, go to the explorer. The explorer is a tool for browsing the parallel corpus. It shows one document (a sentence or a short text) at a time, in one or more languages. To see a translation, click on the corresponding language code, i.e., DE, NL or IT. To find out more about the semantic analysis of a text, select one of the following tabs:
- raw: the document in its raw format;
- tokens: the result of segmentation, the text being split into word and sentence tokens;
- syntax: syntactic and semantic analysis for each sentence;
- semantics: the meaning representation for the entire text in different formats.
The document set has a unique identifier, consisting of a two-digit part (ranging from 00 to 99) and a four-digit document number (ranging from 0000 to 9999). There are three tabs with extra information about the analysed text:
- bits of wisdom: individual (manual) annotations that corrected machine output;
- warnings: all warnings produced by the semantic technology pipeline that produced the initial analysis;
- metadata: source of the document, language, terms of use.
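The identifier scheme described above (a two-digit part followed by a four-digit document number) can be sketched in a few lines of Python. This is only an illustration: the `/` separator, the `DocumentId` type and the `parse_document_id` function are assumptions made for this example and are not taken from the explorer itself.

```python
import re
from typing import NamedTuple


class DocumentId(NamedTuple):
    """Hypothetical container for a document-set identifier."""

    part: int      # two-digit part, 00 to 99
    document: int  # four-digit document number, 0000 to 9999

    def __str__(self) -> str:
        # Zero-pad both components; the "/" separator is an assumption.
        return f"{self.part:02d}/{self.document:04d}"


# Exactly two ASCII digits, the assumed separator, exactly four ASCII digits.
_ID_PATTERN = re.compile(r"([0-9]{2})/([0-9]{4})")


def parse_document_id(text: str) -> DocumentId:
    """Parse a string such as '00/0004' into its two numeric components."""
    match = _ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid document identifier: {text!r}")
    return DocumentId(int(match.group(1)), int(match.group(2)))
```

Under these assumptions, `parse_document_id("00/0004")` yields part 0 and document 4, and converting the result back with `str()` restores the zero-padded form.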
Documents are sorted by size. You can select a different document by clicking on the icons at the top (previous, random, or next). Documents are reprocessed whenever the models or annotations are updated. It is also possible to force reprocessing by clicking the circular icon at the top of the screen.
The syntax environment is where it all happens. This is the most exciting part, where the semantic analysis of the words (lexical semantics) and the sentences (compositional semantics) is shown. You can select the layers of analysis that you want to see for a sentence:
- sem: the semantic tag (part-of-speech tagging for semanticists);
- sym: the non-logical symbol (basically: lemmatisation and normalisation);
- sns: the WordNet synset of which the word is a member;
- rol: the semantic roles selected for a word with a functional category;
- ref: information about the antecedent of a referring expression;
- scp: information about the scope of an operator;
- cat: the supertag, a.k.a. lexical category in Combinatory Categorial Grammar;
- drs: the lexical semantics in the format of a lambda-DRS.
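To make the idea of selectable layers concrete, here is a minimal sketch of how a token's annotations could be filtered by layer name. The dictionary representation and the `select_layers` function are hypothetical and only mirror the layer names listed above; they do not describe the explorer's actual data format.

```python
from typing import Dict, Iterable

# The eight layer names, in the order they are listed above.
LAYERS = ("sem", "sym", "sns", "rol", "ref", "scp", "cat", "drs")


def select_layers(token: Dict[str, str], selected: Iterable[str]) -> Dict[str, str]:
    """Return only the requested annotation layers of a token.

    `token` maps layer names to annotation values (a hypothetical
    representation). Layers are returned in the canonical order above,
    and layers the token does not carry are skipped.
    """
    wanted = set(selected)
    unknown = sorted(wanted.difference(LAYERS))
    if unknown:
        raise ValueError(f"unknown layer(s): {', '.join(unknown)}")
    return {name: token[name] for name in LAYERS if name in wanted and name in token}
```

For example, calling `select_layers(token, ["sem", "cat"])` on a token that carries placeholder values for `sem`, `sym` and `cat` keeps only the `sem` and `cat` entries.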