National Library of Norway
Språkbanken
The Norwegian Language Bank
Resource Catalogue
Speech, Text
19.12.2023
NST Norwegian ASR Database (16 kHz) – Reorganized
This database was created by Nordic Language Technology for the development of automatic speech recognition and dictation in Norwegian. In this version (from 2022), the organization of the data has …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
19.12.2023
Tool
20.11.2023
Mapping between Norwegian municipalities and dialect regions
This resource provides a mapping between Norwegian municipalities and dialect regions, and can be used, e.g., to infer the dialect region of a speaker in a speech dataset based on their place of …
Origin:
Language Bank
Licence:
Creative_Commons-BY (CC-BY)
Type:
Tool
Updated:
20.11.2023
Speech, Text
15.11.2023
Stortinget Speech Corpus version 1.0
The Stortinget Speech Corpus (SSC) is a speech dataset of more than 5,000 hours for weakly supervised ASR, created from audio and aligned proceedings text from Stortinget, the Norwegian Parliament. It contains …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
15.11.2023
Text
27.10.2023
NDT 2.0 with Constituent Structure
In this version of the Norwegian Dependency Treebank 2.0, constituent structure (c-structure) similar to that found in NorGramBank has been added. This can be used to train one syntactic parser for …
Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
27.10.2023
Tool
20.10.2023
spaCy for Norwegian Nynorsk
These spaCy models are trained on the NorNE dataset in a version compatible with Universal Dependencies. spaCy is a widely used Python library for language technology applications. spaCy does not …
Origin:
Language Bank
Licence:
MIT license
Type:
Tool
Updated:
20.10.2023
Text
24.08.2023
Norwegian Dependency Treebank 2.0
This is version 2.0 of the Norwegian Dependency Treebank (NDT), developed by the National Library of Norway in 2011-2014. In version 2.0 of NDT, the grammatical annotations remain the same as in the …
Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
24.08.2023
Speech, Text
18.08.2023
Norwegian Conversation Speech Corpus
NB Samtale is a speech corpus made by the Language Bank at the National Library of Norway. The corpus contains orthographically transcribed speech from podcasts and recordings of live events at the …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
18.08.2023
Speech, Text
13.07.2023
Norwegian Parliamentary Speech Corpus 2.0
This is version 2.0 of The Norwegian Parliamentary Speech Corpus (NPSC). In version 2.0, a number of changes have been made to the transcriptions, and some identified errors in the corpus have been …
Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
13.07.2023
Text
13.06.2023
The Georgian National Corpus
International partnership project, supported by the Volkswagen Foundation within the program Between Europe and the Orient – A Focus on Research and Higher Education in/on Central Asia and the …
Language:
Georgian, Middle Georgian, Old Georgian, Mingrelian, Svan
Origin:
CLARINO Bergen Centre
Licence:
unspecified
Type:
Text
Updated:
13.06.2023
Text
11.05.2023
Norwegian UD Treebank
Universal Dependencies (UD) is a framework for annotating grammar consistently in different languages. The grammatical annotations include tokenization, part-of-speech tags (POS), morphological …
Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-BY-SA (CC-BY-SA)
Type:
Text
Updated:
11.05.2023