Leksikografisk bokmålskorpus
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: Leksikografisk bokmålskorpus
- resource Name: The Lexicographic Corpus for Norwegian Bokmål
- description: The corpus consists of texts collected from available literature/prose from 1985 to 2013. It comprises five genres: non-fiction prose (45 %), fiction (35 %), newspapers/magazines (10 %), TV subtitles (5 %), and non-standardized, unpublished texts (5 %), 100 million words in all. The corpus is grammatically tagged with the original version of the Oslo-Bergen Tagger.
- description: Korpuset består av tekster hentet fra tilgjengelig litteratur/prosa fra 1985 til 2013. Korpuset har tekster fra fem sjangere: sakprosa (45 %), skjønnlitteratur (35 %), aviser og periodika (10 %), TV-teksting (5 %) og upublisert materiale, småtrykk (5 %), alt i alt 100 mill. ord. Korpuset er grammatisk merket med den opprinnelige versjonen av Oslo-Bergen-taggeren.
- resource Short Name: LBK2013
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/lbk/
- PID: http://hdl.handle.net/11538/0000-000B-C022-5
- distribution Info
- licence Info
- user Category: Academic
- distribution Access Medium: accessibleThroughInterface
- execution Location: https://tekstlab.uio.no/glossa2/bokmal
- licence
- licence Family: CLARIN
- licence Name: CLARIN_ACA-NC-LOC-ND
- licence Url: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&NORED=1&ND=1
- conditions Of Use: BY
- conditions Of Use: ID
- conditions Of Use: LOC
- conditions Of Use: NC
- conditions Of Use: ND
- conditions Of Use: NORED
- non Standard Conditions Of Use: Due to agreements with third-party copyright holders, the corpus is available only through Glossa, a search and post-processing tool developed by the Text Laboratory.
- licensor:
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info
- email: r.e.v.fjeld@iln.uio.no
- url: http://www.hf.uio.no/iln/english/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- distribution Rights Holder
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/english/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- licence Info
- contact
- actor Info
- actor Type: organization
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- actor Type: person
- person Info
- surname: Fjeld
- given Name: Ruth E. Vatvedt
- affiliation:
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- actor Info
- metadata Creation Date: 07.08.2015
- metadata Last Date Updated: 12.12.2018
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Hagen
- given Name: Kristin
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email: kristin.hagen@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- version: 2013
- documentation Unstructured
- role: documentation
- document Unstructured: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/lbk/
- creation End Date: 31.12.2013
- resource Creator
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info
- email: r.e.v.fjeld@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- actor Type: organization
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- corpus Info
- corpus Type: Written Corpus
- corpus Part Info
- media Type: text
- corpus Text Info
- text Format Info
- mime Type: text/plain
- character Encoding Info
- character Encoding: latin1
- text Format Info
- corpus Part General Info
- source Work Info
- work Description: The corpus consists of texts collected from available literature/prose from 1985 to 2013. It comprises five genres: non-fiction prose (45 %), fiction (35 %), newspapers/magazines (10 %), TV subtitles (5 %), and non-standardized, unpublished texts (5 %), 100 million words in all.
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: Nb
- language Name: Norwegian Bokmål
- modality Info
- modality Type: writtenLanguage
- size Info
- size: 100 million
- size Unit: tokens
- annotation Info
- annotation Type: morphosyntacticAnnotation-posTagging
- annotation Type: lemmatization
- segmentation Level: word
- tagset: The Oslo-Bergen Tagger tagset: http://tekstlab.uio.no/obt-ny/english/index.html
- tagset Language Id: Nb
- tagset Language Name: Norwegian Bokmål
- theoretic Model: Constraint grammar
- annotation Mode: automatic
- annotation Manual Unstructured
- role: annotationManual
- document Unstructured: http://www.tekstlab.uio.no/obt-ny/english/index.html
- annotation Tool
- target Resource Name URI: The Oslo-Bergen Tagger: http://tekstlab.uio.no/obt-ny/english/index.html
- classification Info
- genre Info
- genre Type: textGenre
- genre: factual prose
- genre Info
- genre Type: textGenre
- genre: fiction and drama
- genre Info
- genre Type: textGenre
- genre: newspapers and magazines
- genre Info
- genre Type: textGenre
- genre: unstandardised
- genre Info
- time Coverage Info
- time Coverage: 1985 – 2013
- source Work Info
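
The genre shares given in the work description can be turned into approximate per-genre word counts. The following is a minimal sketch, not part of the record itself: it assumes the stated total of 100 million words is exact and that the rounded percentages are exact shares. The names `GENRE_SHARES`, `TOTAL_WORDS` and `approximate_word_counts` are hypothetical and introduced only for illustration.

```python
# Genre shares as stated in the corpus description (percent of the total).
GENRE_SHARES = {
    "non-fiction prose": 45,
    "fiction": 35,
    "newspapers/magazines": 10,
    "TV subtitles": 5,
    "non-standardized, unpublished texts": 5,
}

# Assumption: "100 mill" in the record is treated as exactly 100,000,000.
TOTAL_WORDS = 100_000_000


def approximate_word_counts(shares, total):
    """Map each genre to its approximate word count, given percent shares."""
    return {genre: total * pct // 100 for genre, pct in shares.items()}


counts = approximate_word_counts(GENRE_SHARES, TOTAL_WORDS)

# The stated shares should account for the whole corpus.
assert sum(GENRE_SHARES.values()) == 100

for genre, n in counts.items():
    print(f"{genre}: {n:,} words")
```

Because the record gives only rounded percentages, the resulting figures are estimates of scale rather than actual counts.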
dc:type | corpus |
dc:title | Leksikografisk bokmålskorpus |
dc:identifier | oai:tekstlab.uio.no:LBK2013 |
dc:description | The corpus consists of texts collected from available literature/prose from 1985 to 2013. It covers five genres: non-fiction prose (45 %), fiction (35 %), newspapers and periodicals (10 %), TV subtitles (5 %), and unpublished material and ephemera (5 %), 100 million words in all. The corpus is grammatically tagged with the original version of the Oslo-Bergen Tagger. |
dc:publisher | |
dc:format | accessibleThroughInterface |
dc:date | 2013-12-31 |
dc:rights | Academic |
dc:rights | CLARIN |
dc:rights | CLARIN_ACA-NC-LOC-ND |
dc:rights | https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&NORED=1&ND=1 |
dc:creator | University of Oslo |
dc:creator | The Text Laboratory |
dc:lang | bokmål |
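
The query string of the licence URL appears to mirror the listed conditions of use (BY, ID, LOC, NC, ND, NORED). The sketch below checks that correspondence using Python's standard `urllib.parse`. It rests on an assumption inferred from this record alone, not from any licence documentation: that a parameter set to `1` marks an enabled condition, while `AFFIL=EDU` describes the user category instead. The function name `enabled_flags` is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Licence URL exactly as given in the record.
LICENCE_URL = (
    "https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca"
    "?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&NORED=1&ND=1"
)

# Conditions of use as listed in the record.
LISTED_CONDITIONS = ["BY", "ID", "LOC", "NC", "ND", "NORED"]


def enabled_flags(url):
    """Return the sorted query parameters whose only value is "1".

    Assumption: a value of "1" means the condition applies.
    """
    query = parse_qs(urlsplit(url).query)
    return sorted(name for name, values in query.items() if values == ["1"])


flags = enabled_flags(LICENCE_URL)
print(flags)
print(flags == sorted(LISTED_CONDITIONS))
```

A check of this kind is only a consistency test between two fields of the same record; it says nothing about what the licence terms actually require.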