This distribution contains only the morphological annotation from BulTreeBank, the HPSG-based Treebank of Bulgarian. It comprises about 214,000 tokens and was used to train the TreeTagger for Bulgarian.
The sentences are drawn from Bulgarian grammar textbooks, newspapers, literature, and other text sources.
Full documentation of the Treebank (Style Book, Tagset description) is available at: http://www.bultreebank.org/TechRep.html
Extended metadata
dc:type
corpus
dc:title
The Morphologically Annotated Part of BulTreeBank
dc:identifier
oai:clarino.uib.no:bul-treebank
dc:description
This distribution contains only the morphological annotation from BulTreeBank, the HPSG-based Treebank of Bulgarian. It comprises about 214,000 tokens and was used to train the TreeTagger for Bulgarian.
The sentences are drawn from Bulgarian grammar textbooks, newspapers, literature, and other text sources.
Full documentation of the Treebank (Style Book, Tagset description) is available at: http://www.bultreebank.org/TechRep.html