NST Pronunciation Lexicon for Danish
Extended metadata
- resource Common Info:
- resource Type: lexicalConceptualResource
- identification Info:
- resource Name: NST Pronunciation Lexicon for Danish
- resource Name: NST uttaleleksikon for dansk
- description: This pronunciation lexicon for Danish was originally produced by Nordic Language Technology (NST) and contains approximately 238,000 entries. The word list consists of a frequency-based 100k list and some additional material. The lexicon is available as one file in simple text format. Each entry (line) contains 51 data fields, separated by semicolons. Not all fields are equally relevant for all purposes, but given the format it is easy to extract the information you need. The lexicon contains, among other things, information about the decomposition of compounds and one or more phonetic transcriptions. All transcriptions have been done manually. Some lexical tools that can be used to handle the lexicon can be downloaded as a separate zip file. The transcription format is SAMPA (Speech Assessment Methods Phonetic Alphabet).
- description: This pronunciation lexicon for Danish was originally produced by Nordisk språkteknologi (NST) and contains approximately 238,000 entries. The word list is based on the 100,000 most frequent word forms in NST's Danish text corpus. The entire lexicon is available as a single file in plain-text format. Each entry (line) contains 51 fields, separated by semicolons. Not all fields are equally relevant for all purposes, but given the format it is easy to extract the information you need. The lexicon contains, among other things, information about the component parts of compounds and one or more phonetic transcriptions. The transcription work was done manually. Various lexical tools that can be used to handle the lexicon can be downloaded as a separate zip file. The transcription format is SAMPA (Speech Assessment Methods Phonetic Alphabet).
- url: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-26/
- PID: hdl:21.11146/26
- identifier: sbr-26
- distribution Info:
- licence Info:
- user Category: Public
- distribution Access Medium: downloadable
- download Location: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-26/
- licence:
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-ZERO (CC-ZERO)
- licence Url: https://creativecommons.org/publicdomain/zero/1.0/
- licensor:
- actor Info:
- actor Type: organization
- role: Licensor
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- distribution Rights Holder:
- actor Info:
- actor Type: organization
- role: Distribution Rights Holder
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info:
- actor Type: organization
- role: Contact
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: person
- role: Metadata Creator
- person Info:
- surname: Lindstad
- given Name: Arne Martinus
- affiliation:
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: organization
- role: Resource Creator
- organization Info:
- organization Name: Nordic Language Technology AS
- organization Name: Nordisk språkteknologi AS
- organization Short Name: NST
- organization Short Name: NST
- lexical Conceptual Resource Info Rev1:
- lexical Conceptual Resource Type: computationalLexicon
- lexical Conceptual Resource Part General Info:
- linguality Info:
- linguality Type: monolingual
- language Info:
- language Id: da
- language Name: Danish
- size Per Language:
- size Info:
- size: 237873
- size Unit: words
- size Info:
- size: 33,3
- size Unit: mb
- modality Info:
- modality Type: writtenLanguage
- modality Type Details: Frequency-based word list from news text, with added data (e.g. named entities) from other sources and corresponding phonetic transcriptions.
- size Info:
- size: 237873
- size Unit: words
- size Info:
- size: 33,3
- size Unit: mb
- domain Info:
- domain: speech technology
- creation Info:
- creation Mode: mixed
- lexical Conceptual Resource Encoding Info:
- encoding Level: phonetics
- linguistic Information: phonetics-Transcription
- conformance To Standards Best Practices: other
- theoretic Model: SAMPA
- lexical Conceptual Resource Part Info Rev1:
- media Type: text
- lexical Conceptual Resource Text Info:
- text Format Info:
- mime Type: text/csv
- size Per Text Format:
- size Info:
- size: 237873
- size Unit: words
- size Info:
- size: 33,3
- size Unit: mb
- character Encoding Info:
- character Encoding: UTF-8
dc:type | lexicalConceptualResource |
dc:title | NST Pronunciation Lexicon for Danish |
dc:identifier | oai:nb.no:sbr-26 |
dc:description | This pronunciation lexicon for Danish was originally produced by Nordic Language Technology (NST) and contains approximately 238,000 entries. The word list consists of a frequency-based 100k list and some additional material. The lexicon is available as one file in simple text format. Each entry (line) contains 51 data fields, separated by semicolons. Not all fields are equally relevant for all purposes, but given the format it is easy to extract the information you need. The lexicon contains, among other things, information about the decomposition of compounds and one or more phonetic transcriptions. All transcriptions have been done manually. Some lexical tools that can be used to handle the lexicon can be downloaded as a separate zip file. The transcription format is SAMPA (Speech Assessment Methods Phonetic Alphabet). |
dc:publisher | |
dc:format | downloadable |
dc:date | 2000-01-01 |
dc:date | 2003-02-24 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-ZERO (CC-ZERO) |
dc:rights | https://creativecommons.org/publicdomain/zero/1.0/ |
dc:creator | Nordic Language Technology AS |
dc:lang | Danish |
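
The record describes the lexicon as a single UTF-8 text file with one entry per line and 51 semicolon-separated fields per entry. The following is a minimal sketch of how such a file might be read in Python, under those stated properties only. The function names (`parse_entries`, `select_columns`, `load_lexicon`) and the file path passed to `load_lexicon` are hypothetical. The record does not say which column holds which piece of information, so the sketch selects columns by index and leaves the choice of indices to the user. It also assumes that fields contain no quoted or escaped semicolons, which the record does not confirm.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

FIELD_SEPARATOR = ";"  # stated in the record's description
EXPECTED_FIELDS = 51   # stated in the record's description


def parse_entries(
    lines: Iterable[str], expected_fields: int = EXPECTED_FIELDS
) -> Iterator[Tuple[int, List[str]]]:
    """Yield (line_number, fields) for each non-empty line.

    Assumption: a plain split on ';' is enough, i.e. no field
    contains a quoted or escaped semicolon.
    """
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(FIELD_SEPARATOR)
        if len(fields) != expected_fields:
            raise ValueError(
                f"line {line_number}: expected {expected_fields} fields, "
                f"found {len(fields)}"
            )
        yield line_number, fields


def select_columns(fields: Sequence[str], indices: Sequence[int]) -> List[str]:
    """Return only the fields at the given zero-based column indices."""
    return [fields[i] for i in indices]


def load_lexicon(path: str, indices: Sequence[int]) -> List[List[str]]:
    """Read a lexicon file (hypothetical path) and keep selected columns."""
    with open(path, encoding="utf-8") as handle:
        return [select_columns(fields, indices) for _, fields in parse_entries(handle)]
```

A call such as `load_lexicon("lexicon.txt", [0])` would keep only the first column of every entry; the file name and the column index here are placeholders, to be replaced after consulting the documentation shipped with the download.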