Data

The PMB is continuously being improved. Stable versions of the PMB are provided in the releases below. We provide two types of semantically annotated data: Meaning Representations and Semantic Tags. The data is divided into gold (completely corrected/verified by humans), silver (partially corrected) and bronze (machine-generated). The numbers shown in the tables refer to gold instances only.


Meaning Representations

version   # en     # de    # it    # nl    release date   full     sample
5.1.0     11,987   3,179   1,958   1,557   25-04-2024     1.0 GB   -
5.0.1     11,446   3,065   1,884   1,522   24-04-2024     1.0 GB   -
4.0.0     10,715   2,844   1,686   1,467   22-10-2021     2.8 GB   1 MB
3.0.0     8,403    1,979   1,062   1,012   12-02-2020     1.9 GB   1 MB
2.2.0     5,929    1,419   724     633     20-12-2018     1.4 GB   -
2.1.0     4,555    1,175   635     586     07-06-2018     314 MB   -
2.0.0     3,925    1,048   568     527     25-04-2018     282 MB   -
1.0.0     2,049    641     387     394     22-12-2017     11 MB    -
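
The counts in the table use comma thousands separators and the release dates are written day-month-year, so both need some care when read by a program. The following is a minimal sketch, not an official PMB tool, of one way to parse a few rows copied from the table above. The names ROWS, parse_row and releases are hypothetical and exist only for this illustration.

```python
from datetime import datetime

# A few rows copied from the table above:
# version, # en, # de, # it, # nl, release date (DD-MM-YYYY).
ROWS = """\
5.1.0 11,987 3,179 1,958 1,557 25-04-2024
5.0.1 11,446 3,065 1,884 1,522 24-04-2024
4.0.0 10,715 2,844 1,686 1,467 22-10-2021
"""

LANGS = ("en", "de", "it", "nl")


def parse_row(line):
    """Turn one whitespace-separated table row into a dictionary."""
    fields = line.split()
    version, counts, released = fields[0], fields[1:5], fields[5]
    return {
        "version": version,
        # Strip the thousands separator before converting to int.
        "gold": {lang: int(n.replace(",", "")) for lang, n in zip(LANGS, counts)},
        # Dates are day-month-year, so the format must be given explicitly.
        "released": datetime.strptime(released, "%d-%m-%Y").date(),
    }


releases = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Pick the most recent release by date and total its gold instances.
latest = max(releases, key=lambda r: r["released"])
print(latest["version"], sum(latest["gold"].values()))
```

Comparing parsed dates rather than the raw strings matters here: sorted as text, "25-04-2018" would come after "22-10-2021".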

If you refer to the PMB data, please cite the following paper:

Lasha Abzianidze, Johannes Bjerva, Kilian Evang, Hessel Haagsma, Rik van Noord, Pierre Ludmann, Duc-Duy Nguyen, Johan Bos (2017): The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp 242–247, Valencia, Spain. [PDF] [BibTeX]


Universal Semantic Tags

version   # en     # de    # it    # nl    silver inc.   release date   download
0.2.0     14,129   2,924   1,741   1,354   yes           16-01-2024     36 MB ZIP file
0.1.0     5,438    0       0       0       yes           01-05-2018     19 MB ZIP file

If you refer to the PMB semantic tags, please cite the following paper:

Lasha Abzianidze, Johan Bos (2017): Towards Universal Semantic Tagging. Proceedings of the 12th International Conference on Computational Semantics (IWCS 2017) -- Short Papers, pp 1–6, Montpellier, France. [PDF] [BibTeX]