INESS list of lexical units unknown to the NorGram lexicon

In the INESS project, texts in Norwegian Bokmål and Nynorsk are parsed with the NorGram grammar and lexicon. Any parsed text will contain words that are unknown to the morphological analyzer, the lexicon, or both. INESS has therefore developed an intelligent browser-based preprocessing interface which facilitates, among other things, the efficient treatment of unknown word forms. The list of word forms that have not been automatically recognized is manually inspected.
While some of these result from OCR errors and others are simply typos, most unrecognized word forms are productive compounds, words that occur only in multiword expressions (MWEs), names, foreign words, neologisms, interjections, dialect words, and systematic or intentional misspellings. To read more about the types of lexical units registered, please refer to the documentation at http://clarino.uib.no/iness/page?page-id=Text_preprocessing.

Extended metadata

Download metadata http://hdl.handle.net/11509/90@format=cmdi

dc:type
dc:title	INESS list of lexical units unknown to the NorGram lexicon
dc:identifier	oai:repo.clarino.uib.no:11509/90
dc:description	In the INESS project, texts in Norwegian Bokmål and Nynorsk are parsed with the NorGram grammar and lexicon. Any parsed text will contain words that are unknown to the morphological analyzer, the lexicon, or both. INESS has therefore developed an intelligent browser-based preprocessing interface which facilitates, among other things, the efficient treatment of unknown word forms. The list of word forms that have not been automatically recognized is manually inspected. While some of these result from OCR errors and others are simply typos, most unrecognized word forms are productive compounds, words that occur only in multiword expressions (MWEs), names, foreign words, neologisms, interjections, dialect words, and systematic or intentional misspellings. To read more about the types of lexical units registered, please refer to the documentation at http://clarino.uib.no/iness/page?page-id=Text_preprocessing.
dc:publisher
dc:format
dc:date
dc:date
dc:rights
dc:rights
dc:rights
dc:rights
