Randomized extraction of the New Norwegian corpus
Extended metadata
dc:type | |
dc:title | Randomized extraction of the New Norwegian corpus |
dc:identifier | oai:repo.clarino.uib.no:11509/140 |
dc:description | Randomized extraction of the New Norwegian Corpus (Nynorskkorpuset). Contains sentences in New Norwegian (Nynorsk) from the year 2000 and later. The data is tab-separated with one word per line, lemmatized and morphologically tagged, with year and domain information. Annotation was done with the Oslo-Bergen Tagger. Sentences in the Bokmål standard have been removed. The corpus is intended for use in the development of language technology. Size: 3.3 million sentences, 57.5 million words. |
dc:publisher | |
dc:format | |
dc:date | |
dc:date | |
dc:rights | |
dc:rights | |
dc:rights | |
dc:rights |
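
The description above mentions a tab-separated layout with one word per line. A minimal sketch of reading such a file in Python follows. The record does not specify the column order or how sentences are delimited, so the column names (`word`, `lemma`, `tags`), the blank-line sentence separator, and the sample tokens are all assumptions made for illustration and should be checked against the actual files.

```python
import io
from typing import Iterator

# Hypothetical column order: the metadata record does not state it.
ASSUMED_COLUMNS = ("word", "lemma", "tags")


def read_sentences(stream, columns=ASSUMED_COLUMNS) -> Iterator[list[dict[str, str]]]:
    """Yield sentences as lists of token dicts from a tab-separated stream.

    Assumes one token per line and a blank line between sentences;
    both are assumptions, not documented properties of the corpus.
    """
    sentence = []
    for raw in stream:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence (assumed convention).
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        # Extra fields beyond the assumed columns are silently dropped by zip.
        sentence.append(dict(zip(columns, fields)))
    if sentence:
        yield sentence


# Made-up sample in the assumed layout; the tags are placeholders,
# not real Oslo-Bergen Tagger output.
sample = "Eg\teg\tpron\nles\tlese\tverb\n\nDet\tdet\tpron\n"
sentences = list(read_sentences(io.StringIO(sample)))
```

To read a real file, the same function could be given a handle opened with `open(path, encoding="utf-8")`, where `path` points at the downloaded data and `columns` is adjusted to match its actual layout.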