N-gram – Norwegian Bokmål

These n-grams (n = 1 to 6) are based on the texts in the Norwegian Newspaper Corpus and the news texts in the text corpus from Nordic Language Technology AS (NST). In total, the source material comprises 1175 million words of running text.

The n-grams are provided in two orderings: sorted alphabetically and sorted by frequency. Frequency lists (unigrams) are published as a separate download. A simplified version listing the 1000 most frequent n-grams is also available for download.
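
For readers unfamiliar with how such lists are built, the following is a minimal sketch of counting n-grams for n = 1 to 6 and producing the two orderings mentioned above. It is an illustration only, not the procedure used for this resource: the whitespace tokenization, the sample sentence, and the function names `count_ngrams` and `sorted_views` are assumptions made for the example, and the actual file format of the downloads is not described here.

```python
from collections import Counter


def count_ngrams(tokens, max_n=6):
    """Count every n-gram (as a tuple of tokens) for n = 1..max_n."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for n in range(1, max_n + 1):
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts


def sorted_views(counter):
    """Return the same n-gram counts in alphabetical and in frequency order."""
    alphabetical = sorted(counter.items(), key=lambda kv: kv[0])
    # Highest count first; ties are broken alphabetically.
    by_frequency = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    return alphabetical, by_frequency


# Hypothetical sample text; whitespace splitting is an assumed tokenization.
tokens = "det er en fin dag og det er en god dag".split()
counts = count_ngrams(tokens)
alphabetical, by_frequency = sorted_views(counts[2])

for ngram, freq in by_frequency[:3]:
    print(" ".join(ngram), freq)
```

For a corpus of the size described here, counts would normally be accumulated in a streaming fashion or on disk rather than held in memory as in this sketch.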
