Proceedings of Norwegian parliamentary debates (2008-2015)
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: Proceedings of Norwegian parliamentary debates (2008-2015)
- description: The corpus "Proceedings of Norwegian parliamentary debates (2008-2015)" is a collection of transcriptions of Norwegian parliamentary debates held between 2008 and 2015, downloaded from https://data.stortinget.no/. Each sentence carries the following searchable metadata: (1) language variety: Norwegian Bokmål (nob) or Norwegian Nynorsk (nno), assigned by automatic recognition of language variety, implemented by Paul Meurer at Uni Research Computing (there are also some transcriptions of speeches in English and Danish); (2) speaker's name; (3) date and time; (4) political party to which the speaker belongs; (5) type of contribution (e.g. 'hovedinnlegg' [main contribution] or 'replikk' [reply]). For more details about the source material, please see the component Corpus info > Corpus part general info > Source work info. AVAILABILITY: the material is searchable via the corpus workbench Corpuscle. There is ongoing work to analyse text from the corpus using the treebank portal INESS.
- url: http://clarino.uib.no/korpuskel/landing-page?identifier=stortinget
- url: http://clarino.uib.no/korpuskel/landing-page?resource=stortinget&view=short
- P I D: hdl:11495/DA65-D02F-0EB0-9
- identifier: stortinget
- distribution Info
- licence Info
- user Category: Public
- execution Location: http://hdl.handle.net/11495/DA65-D02F-0EB0-9
- licence
- licence Family: DIFI
- licence Name: Norwegian Licence for Open Government Data (NLOD)
- licence Url: http://data.norge.no/nlod/no
- conditions Of Use: BY
- licence Info
- contact
- actor Info
- actor Type: organization
- organization Info
- organization Name: CLARINO Bergen Centre
- communication Info
- email: clarin@uib.no
- url: https://repo.clarino.uib.no/
- url: https://clarin.b.uib.no
- city: Bergen
- country: Norway
- actor Info
- metadata Info
- metadata Creation Date: 08.02.2016
- metadata Last Date Updated: 10.02.2016
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Lyse
- given Name: Gunn Inger
- sex: female
- position: Researcher (Ph.D)
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: clarin@uib.no
- url: https://repo.clarino.uib.no/
- url: https://clarin.b.uib.no
- city: Bergen
- country: Norway
- actor Info
- resource Documentation Info
- documentation Unstructured
- role: documentation
- document Unstructured: http://clarino.uib.no/korpuskel/overview?identifier=stortinget
- documentation Unstructured
- resource Creation Info
- creation End Date: 08.02.2016
- resource Creator
- actor Info
- actor Type: person
- person Info
- surname: Meurer
- given Name: Paul
- sex: male
- position: Senior researcher
- affiliation:
- organization Info
- organization Name: Uni Research AS
- department Name: Uni Research Computing
- communication Info
- email: paul.meurer@uni.no
- actor Info
- funding Project:
- project Info
- project Name: Common Language Resources and Technology Infrastructure Norway
- project Short Name: CLARINO
- url: http://clarin.b.uib.no/
- funding Type: nationalFunds
- funder: the Research Council of Norway
- funding Country: Norway
- project Start Date: 01.01.2012
- project End Date: 31.12.2017
- resource Relation
- related Resource
- reference Scope: thisResource
- related Resource
- reference Scope: externalResource
- resource Reference: hdl:11495/DA54-7C36-1050-2
- relation Type
- relation Name: annotates
- related Resource
- corpus Info
- corpus Type: Written Corpus
- corpus Part General Info
- source Work Info
- work Description: Data from https://data.stortinget.no/. The data were downloaded by starting from the URL https://data.stortinget.no/eksport/publikasjoner?publikasjontype=referat&sesjonid=2014-2015, and correspondingly for each earlier session back to 2008-2009; each such URL returns a list of publication IDs. The transcriptions themselves were downloaded (in XML) with URLs containing a publication ID, e.g.: https://data.stortinget.no/eksport/publikasjon?publikasjonid=s141008. From these, Paul Meurer at Uni Research Computing extracted the elements enclosed within the tags <innlegg…>…</innlegg>.
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: nb
- language Name: Norwegian Bokmål
- language Info
- language Id: nn
- language Name: Norwegian Nynorsk
- language Info
- language Id: no
- language Name: Norwegian
- modality Info
- modality Type: writtenLanguage
- size Info
- size: 29482445
- size Unit: tokens
- size Info
- size: 28533334
- size Unit: words
- annotation Info
- annotation Type: speechAnnotation-orthographicTranscription
- annotation Description: The material consists of transcriptions of public parliamentary debates in the Norwegian parliament (Stortinget).
- annotation Standoff: false
- annotation Info
- annotation Type: other
- annotation Description: Metadata connected to each sentence: lang (language code, e.g. nob or nno); name (name of speaker); party (political party); time (time of utterance, e.g. 2008-10-08, 10:03:46); type (type of utterance in the debate, e.g. hovedinnlegg [main speech] or replikk [reply/comment]).
- segmentation Level: sentence
- classification Info
- genre Info
- genre Type: textGenre
- genre: unstandardised
- unstandardised Genre: political debates (transcribed into text)
- genre Info
- time Coverage Info
- time Coverage: 2008-2015
- source Work Info
dc:type | corpus |
dc:title | Proceedings of Norwegian parliamentary debates (2008-2015) |
dc:identifier | oai:clarino.uib.no:stortinget |
dc:description | The corpus "Proceedings of Norwegian parliamentary debates (2008-2015)" is a collection of transcriptions of Norwegian parliamentary debates held between 2008 and 2015, downloaded from https://data.stortinget.no/. Each sentence carries the following searchable metadata: (1) language variety: Norwegian Bokmål (nob) or Norwegian Nynorsk (nno), assigned by automatic recognition of language variety, implemented by Paul Meurer at Uni Research Computing (there are also some transcriptions of speeches in English and Danish); (2) speaker's name; (3) date and time; (4) political party to which the speaker belongs; (5) type of contribution (e.g. 'hovedinnlegg' [main contribution] or 'replikk' [reply]). For more details about the source material, please see the component Corpus info > Corpus part general info > Source work info. AVAILABILITY: the material is searchable via the corpus workbench Corpuscle. There is ongoing work to analyse text from the corpus using the treebank portal INESS. |
dc:publisher | |
dc:format | |
dc:date | |
dc:date | 2016-02-08 |
dc:rights | Public |
dc:rights | DIFI |
dc:rights | Norwegian Licence for Open Government Data (NLOD) |
dc:rights | http://data.norge.no/nlod/no |
dc:creator | Paul Meurer |
dc:lang | Norwegian Bokmål |
dc:lang | Norwegian Nynorsk |
dc:lang | Norwegian |
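
The download procedure described under source Work Info can be sketched roughly as follows. This is an illustrative sketch only, not the script actually used to build the corpus: the two URL patterns and the element name innlegg are taken from the work description above, while the function names, the assumption that session IDs run from 2008-2009 to 2014-2015 in one-year steps, and the sample XML are hypothetical. The sketch performs no network access; it only builds the URLs and shows how the text of innlegg elements might be pulled out of a downloaded XML document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base of the export service named in the work description.
BASE = "https://data.stortinget.no/eksport"


def session_ids(first_start=2008, last_start=2014):
    """Session IDs from 2008-2009 up to 2014-2015 (assumed one per year)."""
    return [f"{year}-{year + 1}" for year in range(first_start, last_start + 1)]


def publication_list_url(session_id):
    """URL that returns the list of publication IDs for one session."""
    query = urlencode({"publikasjontype": "referat", "sesjonid": session_id})
    return f"{BASE}/publikasjoner?{query}"


def publication_url(publication_id):
    """URL of one transcription (XML), given its publication ID."""
    query = urlencode({"publikasjonid": publication_id})
    return f"{BASE}/publikasjon?{query}"


def extract_innlegg(xml_text):
    """Return the whitespace-normalised text of every innlegg element."""
    root = ET.fromstring(xml_text)
    texts = []
    for element in root.iter():
        # Drop a possible "{namespace}" prefix before comparing tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "innlegg":
            texts.append(" ".join("".join(element.itertext()).split()))
    return texts


# Hypothetical sample; the real export format contains more structure.
SAMPLE_XML = (
    "<referat><innlegg>Takk, president.</innlegg>"
    "<annet>x</annet><innlegg>Jeg er  enig.</innlegg></referat>"
)
```

For example, publication_list_url("2014-2015") reproduces the starting URL quoted in the work description, and extract_innlegg(SAMPLE_XML) keeps only the two innlegg elements of the sample. Fetching the URLs and attaching the per-sentence metadata (lang, name, party, time, type) are left out of the sketch.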