The Morphologically Annotated Part of BulTreeBank

This distribution contains only the morphological information encoded in BulTreeBank, an HPSG-based treebank of Bulgarian. It contains about 214,000 tokens and was used to train the TreeTagger for Bulgarian.

The sentences come from Bulgarian grammar textbooks, newspapers, literature, and other text sources.

Full documentation of the treebank (Style Book, tagset description) can be found at:
