National Library of Norway
Språkbanken
The Norwegian Language Bank
Resource Catalogue
Text
19.01.2023
bokselskap.no 2023
bokselskap.no is a corpus of books and texts in the public domain, i.e. texts by authors who have been dead for at least 70 years, or books and texts published with the permission of the copyright holders …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
19.01.2023
Tool
01.01.2023
NB DH-LAB
NB DH-LAB is a corpus infrastructure for social sciences and humanities computing. The infrastructure provides researchers with methods for doing qualitative and quantitative analyses of the digital …
Origin:
Language Bank
Licence:
MIT license
Type:
Tool
Updated:
01.01.2023
Text
23.12.2022
Translation memories from Målfrid
This corpus derives from the Målfrid corpus, and contains translation memories based on parallel text extracted from 132 public sector internet domains. The file format is compressed tmx for each …
Language:
English, Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Norwegian Licence for Open Government Data (NLOD)
Type:
Text
Updated:
23.12.2022
Text
21.12.2022
Norwegian Newspaper Corpus Bokmål
The Norwegian Newspaper Corpus (NNC) Bokmål version is a large monitor corpus representing contemporary Norwegian language in the written variety Norwegian Bokmål. A corresponding corpus is …
Language:
Norwegian, Norwegian Bokmål
Origin:
CLARINO Bergen Centre
Licence:
Creative_Commons-BY-NC (CC-BY-NC)
Type:
Text
Updated:
21.12.2022
Text
21.12.2022
N-grams from NBdigital 2022
This resource contains n-grams (i.e. uni-, bi- and trigrams) from all books and newspapers that had been digitized at the National Library of Norway up to 15 July 2022. The n-grams have been …
Language:
Norwegian Bokmål, Norwegian Nynorsk, Northern Sami, Southern Sami, Lule Sami, Kven
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
21.12.2022
Speech, Text
15.12.2022
Norwegian Voice Control Corpus
The Norwegian Voice Control Corpus (NVCC) is a text and speech corpus consisting of written queries in Norwegian Bokmål and Nynorsk covering a number of intents, and voice recordings of these queries. …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
15.12.2022
Speech, Text
01.12.2022
The LIA Treebank
The LIA Treebank includes 7536 speech segments and 77 701 tokens from LIA Norwegian. The treebank is annotated with morphological and dependency-style syntactic analyses, which have been manually corrected. The …
Language:
Norwegian, Norwegian Nynorsk
Origin:
CLARINO Text Laboratory Centre
Licence:
Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
Type:
Speech, Text
Updated:
01.12.2022
Speech, Text, Video
01.12.2022
The NDC Treebank
The NDC Treebank includes 4637 speech segments and 66 042 tokens from the Norwegian part of the Nordic Dialect Corpus. The segments are taken from 30 transcribed interviews from 17 places in Norway. The …
Language:
Norwegian, Norwegian Bokmål
Origin:
CLARINO Text Laboratory Centre
Licence:
Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
Type:
Speech, Text, Video
Updated:
01.12.2022
Text
05.10.2022
META-NORD Sofie Danish Treebank
The Danish part of the META-NORD Sofie Parallel Treebank. This treebank is a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” (Sophie's World) by …
Language:
Danish
Origin:
CLARINO Bergen Centre
Licence:
unspecified
Type:
Text
Updated:
05.10.2022
Text
05.10.2022
Text material from Forskning.no (1998 – 2017)
Data set containing texts from the popular science website forskning.no from the period 1998 – 2017. The text material consists of articles published by Forskning.no belonging to the following …
Language:
Norwegian, Norwegian Bokmål
Origin:
CLARINO Bergen Centre
Licence:
CLARIN_RES-DEP
Type:
Text
Updated:
05.10.2022