NST N-gram – bokmål
Extended metadata
- resource Common Info:
- resource Type: corpus
- identification Info:
- resource Name: NST N-gram – Norwegian Bokmål
- resource Name: NST N-gram – bokmål
- description: These n-grams are derived from parts of the Text Corpus from Nordic Language Technology AS (NST). The source material consists of 510 million words of running text. The n-grams are also available as an overview listing only the 1000 most frequent n-grams (n=1-6). In the full version, all the derived n-grams (n=1-6) are sorted alphabetically and by frequency, respectively. Frequency lists (unigrams) are also available separately.
- url: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-3/
- PID: hdl:21.11146/3
- identifier: sbr-3
- distribution Info:
- licence Info:
- user Category: Public
- distribution Access Medium: downloadable
- download Location: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-3/
- licence:
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-ZERO (CC-ZERO)
- licence Url: https://creativecommons.org/publicdomain/zero/1.0/
- licensor:
- actor Info:
- actor Type: organization
- role: Licensor
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- distribution Rights Holder:
- actor Info:
- actor Type: organization
- role: Distribution Rights Holder
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info:
- actor Type: organization
- role: Contact
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: person
- role: Metadata Creator
- person Info:
- surname: Birkenes
- given Name: Magnus Breder
- affiliation:
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: person
- role: Resource Creator
- person Info:
- surname: Hofland
- given Name: Knut
- affiliation:
- organization Info:
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- corpus Info:
- corpus Type: Ngram Corpus
- corpus Part Info:
- media Type: textNgram
- corpus Text Ngram Info:
- ngram Info:
- base Item: word
- order: 6
- text Format Info:
- mime Type: text/plain
- size Per Text Format:
- size Info:
- size: 510000000
- size Unit: words
- character Encoding Info:
- character Encoding: UTF-8
- corpus Part General Info:
- linguality Info:
- linguality Type: monolingual
- language Info:
- language Id: nb
- language Name: Norwegian Bokmål
- size Per Language:
- size Info:
- size: 510000000
- size Unit: words
- language Variety Info:
- language Variety Type: other
- language Variety Name: news text
- size Per Language Variety:
- size Info:
- size: 510000000
- size Unit: words
- modality Info:
- modality Type: writtenLanguage
- modality Type Details: news text
- size Per Modality:
- size Info:
- size: 510000000
- size Unit: words
- size Info:
- size: 510000000
- size Unit: words
dc:type | corpus |
dc:title | NST N-gram – bokmål |
dc:identifier | oai:nb.no:sbr-3 |
dc:description | The n-grams are derived from parts of the text corpus from Nordic Language Technology AS (NST). The source material consists of 510 million words of running text. The material is also available as an overview of the 1000 most frequent n-grams (n=1-6). In the complete version, all the n-grams are sorted alphabetically and by frequency, respectively. Frequency lists of the individual words in the material (unigrams) have also been produced. |
dc:publisher | |
dc:format | downloadable |
dc:date | 2012-01-02 |
dc:date | 2012-06-12 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-ZERO (CC-ZERO) |
dc:rights | https://creativecommons.org/publicdomain/zero/1.0/ |
dc:creator | Knut Hofland |
dc:lang | Norwegian Bokmål |