The BigBrother Corpus – downloadable transcriptions
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: The BigBrother Corpus – downloadable transcriptions
- resource Name: BigBrother-korpuset – nedlastbare transkripsjoner
- description: The BigBrother Corpus is a speech corpus consisting of the first season of the reality show BigBrother, broadcast in Norway by TVNorge in spring 2001. The participants are aged 23-36 and speak different dialects, though most of them come from Eastern Norway. The corpus contains audio and video recordings of almost all of the 100 broadcasts that were shown on television. This downloadable version contains the transcriptions, approx. 440 300 tokens, orthographically transcribed. The BigBrother Corpus is a unique speech corpus in which the participants work together, discuss, argue, quarrel, cry, laugh, shout, make love, etc. In contrast to controlled speech recordings, which are often limited to interviews and dialogue, the BigBrother material has conversations on all kinds of topics and in different genres. Strong emotions are sometimes in play, which may conceivably affect the language.
- resource Short Name: BigBrother – transcriptions
- url:
- P I D:
- distribution Info
- licence Info
- user Category: Public
- distribution Access Medium: downloadable
- download Location:
- licence
- licence Family: Creative Commons (CC)
- licence Name: Creative_Commons-BY-NC-SA (CC-BY-NC-SA)
- licence Url:
- conditions Of Use: BY
- conditions Of Use: NC
- conditions Of Use: SA
- non Standard Conditions Of Use: The corpus has audio and video recordings classified as personal data. The production company Nordic Entertainment has generously given their consent to the usage of the videos as a speech corpus, but the audio and video files are accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory. Every individual researcher is responsible for treating the participants with respect and sincerity. Furthermore, the informants in the corpora should be anonymized, e.g. by changing their names, in every published paper or other output.
- licensor:
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- distribution Rights Holder
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- licence Info
- ipr Holder
- actor Info
- actor Type: organization
- organization Info
- organization Name: Nordic Entertainment (ipr holder of the videos)
- actor Info
- contact
- actor Info
- actor Type: organization
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- metadata Info
- metadata Creation Date: 24.02.2015
- metadata Last Date Updated: 06.04.2021
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Hagen
- given Name: Kristin
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- version Info
- version: Second version
- validation Info
- validated: true
- validation Type: content
- validation Mode: manual
- validation Mode Details: The transcriptions have been proofread against the audio files.
- validation Extent: full
- validator:
- actor Info
- actor Type: organization
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- documentation Unstructured
- role: documentation
- document Unstructured:
- creation Start Date: 01.08.2007
- creation End Date: 31.12.2009
- resource Creator
- actor Info
- actor Type: organization
- organization Info
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info
- email:
- url:
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
- actor Info
- funding Project:
- project Info
- project Name: Developing and completing language resources: The Big Brother show as a modern speech corpus
- url:
- funding Type: nationalFunds
- funder: The Research Council of Norway, the KUNSTI program (Kunnskapsutvikling for norsk språkteknologi).
- funding Country: Norway
- project Start Date: 31.08.2007
- project End Date: 31.12.2007
- corpus Info
- corpus Type: Written Corpus
- corpus Part Info
- media Type: text
- corpus Text Info
- text Format Info
- mime Type: Downloadable transcriptions in txt and html format
- size Per Text Format
- size Info
- size: 440 338
- size Unit: tokens
- size Info
- character Encoding Info
- character Encoding: utf-8
- text Format Info
- corpus Part General Info
- person Source Set Info
- number Of Persons: 12
- age Of Persons: adult
- age Range Start: 23
- age Range End: 36
- sex Of Persons: mixed
- origin Of Persons: native
- dialect Accent Of Persons: Some dialects represented, all of them from Southern Norway.
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: No
- language Name: Norwegian
- language Info
- language Id: Nb
- language Name: Norwegian Bokmål
- modality Info
- modality Type: spokenLanguage
- modality Type Details: Informal language from all settings in the BigBrother house.
- annotation Info
- annotation Type: speechAnnotation-orthographicTranscription
- annotation Manual Unstructured
- role: annotationManual
- document Unstructured: Orthographic transcription, cf. Bokmålsordboka (Wangensteen 2004)
- annotation Manual Unstructured
- role: annotationManual
- document Unstructured:
- annotation Tool
- target Resource Name U R I: Transcriber ( )
- classification Info
- genre Info
- genre Type: speechGenre
- genre: informal
- unstandardised Genre: All kinds of situations in the BigBrother house. The participants prepare dinner, eat, sleep, make love, discuss, work together, etc. Lots of emotions.
- genre Info
- time Coverage Info
- time Coverage: 2001
- person Source Set Info
dc:type | corpus |
dc:title | The BigBrother Corpus – downloadable transcriptions |
dc:identifier | |
dc:description | The BigBrother Corpus is a speech corpus consisting of the first season of the reality show BigBrother, broadcast by TVNorge in spring 2001. The participants are aged 23-36 and speak different dialects. The corpus contains audio and video recordings of almost all of the 100 broadcasts that were shown on television. This downloadable version contains the transcriptions, approx. 440 300 tokens. The material is orthographically transcribed. The BigBrother Corpus is a unique speech corpus in which the participants work together, discuss, argue, quarrel, cry, laugh, shout and make love. In contrast to controlled speech recordings, which are often limited to interviews and dialogue, the BigBrother material has conversations on all kinds of topics and in different genres. Strong emotions are sometimes in play, which may conceivably affect the language. |
dc:publisher | |
dc:format | downloadable |
dc:date | 2007-08-01 |
dc:date | 2009-12-31 |
dc:rights | Public |
dc:rights | Creative Commons (CC) |
dc:rights | Creative_Commons-BY-NC-SA (CC-BY-NC-SA) |
dc:rights | |
dc:creator | The Text Laboratory |
dc:lang | Norwegian |
dc:lang | Bokmål |