Norwegian-English Parallel Corpus from Public Web Sites
Extended metadata
- resource Common Info:
- resource Type: corpus
- identification Info:
- resource Name: Norwegian-English Parallel Corpus from Public Web Sites
- resource Name: Norsk-engelsk parallellkorpus frå offentlege nettstadar
- description: This is a sentence-aligned parallel corpus built from the public web sites www.nav.no, www.nyinorge.no and skatteetaten.no. These web sites provide information in both Norwegian Bokmål and Nynorsk, and parts of this material are translated into English. The material is split into two corpora, one for Norwegian Bokmål-English and one for Norwegian Nynorsk-English. Only sentences with a corresponding translation are included in the corpora. The corpora were made by Paul Meurer and Andrew Salway at the University of Bergen for the Language Bank. See the attached report for a description of how this was done. The corpora are also available in Corpuscle, the corpus management and analysis system of the Clarino Bergen Centre (https://clarino.uib.no/korpuskel/).
- description: This is a parallel corpus built from texts on the public web sites www.nav.no, www.nyinorge.no and skatteetaten.no. These web sites publish information in both Bokmål and Nynorsk, and parts of it are translated into English. The material is split into two corpora, one for Bokmål-English and one for Nynorsk-English. Only material with a corresponding English translation is included in the corpora. The corpora were made for Språkbanken by Paul Meurer and Andrew Salway at the University of Bergen. The report describing how they proceeded is attached to the corpus. The corpora are also available in the corpus management system Korpuskel at the Clarino Bergen Centre (https://clarino.uib.no/korpuskel/).
- url: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-68/
- PID: hdl:21.11146/68
- identifier: sbr-68
- distribution Info:
- licence Info:
- user Category: Public
- distribution Access Medium: downloadable
- download Location: https://www.nb.no/sprakbanken/ressurskatalog/oai-nb-no-sbr-68/
- licence:
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-BY (CC-BY)
- licence Url: https://creativecommons.org/licenses/by/4.0/
- conditions Of Use: BY
- licensor:
- actor Info:
- actor Type: organization
- role: Licensor
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- distribution Rights Holder:
- actor Info:
- actor Type: organization
- role: Distribution Rights Holder
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- communication Info:
- email: sprakbanken@nb.no
- url: https://www.nb.no/sprakbanken/
- address: P.O. Box 2674 Solli
- zip Code: 0203
- city: Oslo
- region: Oslo
- country: Norway
- actor Info:
- actor Type: organization
- role: Contact
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: person
- role: Metadata Creator
- person Info:
- surname: Lindstad
- given Name: Arne Martinus
- affiliation:
- organization Info:
- organization Name: National Library of Norway
- organization Name: Nasjonalbiblioteket
- organization Short Name: NLN
- organization Short Name: NB
- department Name: The Language Bank
- department Name: Språkbanken
- actor Info:
- actor Type: organization
- organization Info:
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UiB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- department Name: Institutt for lingvistiske, litterære og estetiske studium
- corpus Info:
- corpus Type: Written Corpus
- corpus Part Info:
- media Type: text
- corpus Text Info:
- text Format Info:
- mime Type: application/xml
- corpus Part General Info:
- linguality Info:
- linguality Type: bilingual
- multilinguality Type: parallel
- multilinguality Type Details: Parallelized text from public service web sites
- language Info:
- language Id: nn
- language Name: Norwegian Nynorsk
- size Per Language:
- size Info:
- size: 289722
- size Unit: tokens
- size Info:
- size: 21056
- size Unit: sentences
- language Variety Info:
- language Variety Type: other
- language Variety Name: Formal written language
- language Info:
- language Id: en
- language Name: English
- size Per Language:
- size Info:
- size: 353837
- size Unit: tokens
- size Info:
- size: 20998
- size Unit: sentences
- language Variety Info:
- language Variety Type: other
- language Variety Name: Formal written language, translated from Norwegian Nynorsk
- modality Info:
- modality Type: writtenLanguage
- size Info:
- size: 2
- size Unit: files
- size Info:
- size: 5.29
- size Unit: MB
- annotation Info:
- annotation Type: other
- segmentation Level: sentence
- annotation Mode: automatic
- annotation Mode Details: https://www.nb.no/sbfil/dok/20180402_report.pdf
- creation Info:
- creation Mode: automatic
- creation Mode Details: https://www.nb.no/sbfil/dok/20180402_report.pdf
- corpus Part General Info:
- linguality Info:
- linguality Type: bilingual
- multilinguality Type: parallel
- multilinguality Type Details: Parallelized text from public service web sites
- language Info:
- language Id: nb
- language Name: Norwegian Bokmål
- size Per Language:
- size Info:
- size: 359401
- size Unit: tokens
- size Info:
- size: 26771
- size Unit: sentences
- language Variety Info:
- language Variety Type: other
- language Variety Name: Formal written language
- language Info:
- language Id: en
- language Name: English
- size Per Language:
- size Info:
- size: 448717
- size Unit: tokens
- size Info:
- size: 26693
- size Unit: sentences
- language Variety Info:
- language Variety Type: other
- language Variety Name: Formal written language, translated from Norwegian Bokmål
- modality Info:
- modality Type: writtenLanguage
- size Info:
- size: 3
- size Unit: files
- size Info:
- size: 6.71
- size Unit: MB
- annotation Info:
- annotation Type: other
- segmentation Level: sentence
- annotation Mode: automatic
- annotation Mode Details: https://www.nb.no/sbfil/dok/20180402_report.pdf
- creation Info:
- creation Mode: automatic
- creation Mode Details: https://www.nb.no/sbfil/dok/20180402_report.pdf
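As a quick sanity check on the size figures above, the reported token and sentence counts can be turned into average sentence lengths and English-to-Norwegian token ratios. The sketch below is illustrative only: the dictionary layout and the function names are assumptions made for this example and are not part of the corpus distribution. The numbers are copied from the size Info entries of the two corpus parts.

```python
# Sizes copied from the "size Per Language" entries of the two corpus parts.
# The dictionary layout and all names here are illustrative assumptions.
SIZES = {
    "nn-en": {
        "nn": {"tokens": 289722, "sentences": 21056},
        "en": {"tokens": 353837, "sentences": 20998},
    },
    "nb-en": {
        "nb": {"tokens": 359401, "sentences": 26771},
        "en": {"tokens": 448717, "sentences": 26693},
    },
}


def tokens_per_sentence(size):
    """Average sentence length in tokens for one language side."""
    return size["tokens"] / size["sentences"]


def english_token_ratio(part):
    """English tokens divided by Norwegian tokens for one corpus part."""
    norwegian = next(lang for lang in part if lang != "en")
    return part["en"]["tokens"] / part[norwegian]["tokens"]


for name, part in SIZES.items():
    for lang, size in part.items():
        print(f"{name} {lang}: {tokens_per_sentence(size):.2f} tokens/sentence")
    print(f"{name} en/no token ratio: {english_token_ratio(part):.2f}")
```

In both parts the English side reports slightly fewer sentences than the Norwegian side, which may mean that some alignment units pair more than one sentence on one side; the linked report is the place to confirm how the alignment was done.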
dc:type | corpus |
dc:title | Norwegian-English Parallel Corpus from Public Web Sites |
dc:identifier | oai:nb.no:sbr-68 |
dc:description | This is a parallel corpus built from texts on the public web sites www.nav.no, www.nyinorge.no and skatteetaten.no. These web sites publish information in both Bokmål and Nynorsk, and parts of it are translated into English. The material is split into two corpora, one for Bokmål-English and one for Nynorsk-English. Only material with a corresponding English translation is included in the corpora. The corpora were made for Språkbanken by Paul Meurer and Andrew Salway at the University of Bergen. The report describing how they proceeded is attached to the corpus. The corpora are also available in the corpus management system Korpuskel at the Clarino Bergen Centre (https://clarino.uib.no/korpuskel/). |
dc:publisher | |
dc:format | downloadable |
dc:date | |
dc:date | 2018-04-02 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-BY (CC-BY) |
dc:rights | https://creativecommons.org/licenses/by/4.0/ |
dc:creator | University of Bergen |
dc:lang | Norwegian Nynorsk |
dc:lang | English |
dc:lang | Norwegian Bokmål |
dc:lang | English |
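Several Dublin Core keys in the table above repeat (dc:date, dc:rights, dc:lang) and some rows are empty, so anyone processing this record programmatically would probably want one list of non-empty values per key. The following is a minimal sketch, assuming the rows are available as plain text in the `key | value |` layout shown here. The function `parse_dc_rows` and the `SAMPLE` string are hypothetical and are not part of any Språkbanken tooling.

```python
from collections import defaultdict

# A few rows copied from the table above, used only as sample input.
SAMPLE = """\
dc:type | corpus |
dc:identifier | oai:nb.no:sbr-68 |
dc:publisher | |
dc:date | |
dc:date | 2018-04-02 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
"""


def parse_dc_rows(text):
    """Group 'key | value |' rows by key, skipping rows with empty values."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if "|" not in line:
            continue
        # Split at the first pipe only, then drop the trailing pipe.
        key, _, rest = line.partition("|")
        value = rest.strip().rstrip("|").strip()
        if value:
            fields[key.strip()].append(value)
    return dict(fields)


record = parse_dc_rows(SAMPLE)
print(record["dc:date"])
print(record["dc:rights"])
```

Keeping every key as a list, even when it has a single value, avoids special-casing the repeated fields later on.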