Resources from the Resource Bank Archive - Språkbanken

Nasjonalbiblioteket Språkbanken

NST Norwegian ASR Database (16 kHz) – Reorganized

This database was created by Nordic Language Technology for the development of automatic speech recognition and dictation in Norwegian. In this version (from 2022), the organization of the data has …

Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
19.12.2023
Mapping between Norwegian municipalities and dialect regions

This resource provides a mapping between Norwegian municipalities and dialect regions, and can be used, e.g., to infer the dialect region of a speaker in a speech dataset based on their place of …

Origin:
Language Bank
Licence:
Creative_Commons-BY (CC-BY)
Type:
Tool
Updated:
20.11.2023
Stortinget Speech Corpus version 1.0

The Stortinget Speech Corpus (SSC) is a speech dataset of more than 5,000 hours for weakly supervised ASR, created from audio and aligned proceedings text from Stortinget, the Norwegian Parliament. It contains …

Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
15.11.2023
NDT 2.0 with Constituent Structure

In this version of the Norwegian Dependency Treebank 2.0, a constituent structure (c-structure) similar to the one found in NorGramBank has been added. This can be used to train one syntactic parser for …

Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
27.10.2023
spaCy for Norwegian Nynorsk

These spaCy models are trained on the NorNE dataset in a version compatible with Universal Dependencies. spaCy is a widely used Python library for language technology applications. spaCy does not …

Origin:
Language Bank
Licence:
MIT license
Type:
Tool
Updated:
20.10.2023
Norwegian Dependency Treebank 2.0

This is version 2.0 of the Norwegian Dependency Treebank (NDT), developed by the National Library of Norway in 2011-2014. In version 2.0 of NDT, the grammatical annotations remain the same as in the …

Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Text
Updated:
24.08.2023
Norwegian Conversation Speech Corpus

NB Samtale is a speech corpus made by the Language Bank at the National Library of Norway. The corpus contains orthographically transcribed speech from podcasts and recordings of live events at the …

Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
18.08.2023
Norwegian Parliamentary Speech Corpus 2.0

This is version 2.0 of the Norwegian Parliamentary Speech Corpus (NPSC). In version 2.0, a number of changes have been made to the transcriptions, and some identified errors in the corpus have been …

Language:
Norwegian
Origin:
Language Bank
Licence:
Creative_Commons-ZERO (CC-ZERO)
Type:
Speech, Text
Updated:
13.07.2023
The Georgian National Corpus

International partnership project, supported by the Volkswagen Foundation within the program Between Europe and the Orient – A Focus on Research and Higher Education in/on Central Asia and the …

Language:
Georgian, Middle Georgian, Old Georgian, Mingrelian, Svan
Origin:
CLARINO Bergen Centre
Licence:
unspecified
Type:
Text
Updated:
13.06.2023
Norwegian UD Treebank

Universal Dependencies (UD) is a framework for annotating grammar consistently in different languages. The grammatical annotations include tokenization, part-of-speech tags (POS), morphological …

Language:
Norwegian Bokmål, Norwegian Nynorsk
Origin:
Language Bank
Licence:
Creative_Commons-BY-SA (CC-BY-SA)
Type:
Text
Updated:
11.05.2023