COLT – The Bergen Corpus of London Teenage Language (with audio recordings)
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: COLT – The Bergen Corpus of London Teenage Language (with audio recordings)
- description: COLT is a corpus of London Teenage Language with audio recordings. It is now distributed via the search engine Corpuscle. Corpuscle allows you to pass queries to the corpus, and you may ask for concordances, collocations and distribution. The corpus results from the project COLT. The aim of the project was to create a corpus of British English spontaneous teenage talk and make it available for research, first on the internet, next as an orthographically and prosodically transcribed CD-ROM version, and finally as a CD-ROM version with both text and sound. The recordings were made by 31 volunteer ‘recruits’, boys and girls aged 13 to 17 from five socially different school boroughs, each equipped with a Sony Walkman, a lapel microphone and a log book. The entire material of roughly half a million words was orthographically transcribed by trained transcribers employed by the Longman Group for transcribing The British National Corpus (BNC). A copy of this version of COLT was incorporated in the BNC. At the Bergen end, the orthographically transcribed material was subsequently submitted to careful editing, which involved correcting misinterpreted talk, reducing the number of <unclear> passages and adding untranscribed talk. The edited version was then tagged for word classes in the same way as the BNC by a research team at Lancaster University.
- resource Short Name: COLT
- url: http://clarino.uib.no/korpuskel/landing-page?identifier=colt&view=short
- url: http://clu.uni.no/icame/colt/
- PID: hdl:11495/D9B6-13F8-41BB-1
- identifier: colt
- distribution Info
- licence Info
- user Category: Academic
- licence
- licence Family: CLARIN
- licence Name: CLARIN_ACA-NC-LOC-PRIV-ND-*
- licence Url: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
- conditions Of Use: BY
- conditions Of Use: ID
- conditions Of Use: LOC
- conditions Of Use: NC
- conditions Of Use: ND
- conditions Of Use: NORED
- conditions Of Use: PRIV
- licensor:
- actor Info
- actor Type: person
- person Info
- surname: Jørgensen
- given Name: Annette Myre
- sex: female
- position: Associate Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Foreign Languages
- department Name: Institutt for fremmedspråk (IF)
- communication Info
- email: Annette.Myre@if.uib.no
- licence Info
- ipr Holder
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Foreign Languages
- department Name: Institutt for fremmedspråk (IF)
- communication Info
- email: Annette.Myre@if.uib.no
- actor Info
- actor Info
- actor Type: person
- person Info
- surname: Jørgensen
- given Name: Annette Myre
- sex: female
- position: Associate Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Foreign Languages
- department Name: Institutt for fremmedspråk (IF)
- communication Info
- email: Annette.Myre@if.uib.no
- actor Type: organization
- organization Info
- organization Name: CLARINO Bergen Centre
- communication Info
- email: clarin@uib.no
- url: https://repo.clarino.uib.no/xmlui/
- metadata Creation Date: 27.08.2015
- metadata Last Date Updated: 04.04.2022
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Lyse
- given Name: Gunn Inger
- sex: female
- position: Researcher (Ph.D)
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: clarin@uib.no
- actor Info
- funding Project:
- project Info
- project Name: Språkkontakt og ungdomsspråk i Norden
- project Name: Nordic Teenage Language
- project Short Name: UNO
- funding Type: nationalFunds
- funder: Research Council of Norway
- funding Country: Norway
- corpus Info
- corpus Type: Multimodal Corpus
- corpus Part Info
- media Type: audio
- corpus Audio Info
- audio Size Info
- size Info
- size: 444 166
- size Unit: words
- size Info
- setting Info
- naturality: spontaneous
- conversational Type: multilogue
- audio Size Info
- corpus Text Info
- text Format Info
- mime Type: text/plain
- character Encoding Info
- character Encoding: UTF-8
- text Format Info
- corpus Part General Info
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: en
- language Name: English
- language Variety Info
- language Variety Type: jargon
- language Variety Name: teenage language
- language Variety Info
- language Variety Type: dialect
- language Variety Name: London English
- modality Info
- modality Type: spokenLanguage
- modality Type Details: Spontaneous speech among teenagers
- modality Info
- modality Type: writtenLanguage
- modality Type Details: Transcriptions of the recorded speech
- size Info
- size: 689 885
- size Unit: tokens
- size Info
- size: 444 166
- size Unit: words
- annotation Info
- annotation Type: speechAnnotation-orthographicTranscription
- segmentation Level: word
- segmentation Level: wordGroup
- annotation Mode Details: COLT has been transcribed so that it is searchable as text. The recordings were orthographically transcribed using the program Transcriber. In addition to the orthographic words, there are specific annotations for imitation and citing, incomplete words (%), unclear words (XXX), and rising vs. falling intonation for questions. The user is meant to listen to the sound file while reading the transcription; thus there is no annotation for non-linguistic sounds such as coughing or a dog's bark. In Corpuscle the user may click on the sound file to listen while reading the transcription.
- annotation Tool
- target Resource Name URI: Transcriber
- classification Info
- genre Info
- genre Type: audioGenre
- genre: informal
- unstandardised Genre: teenage language
- genre Info
- time Coverage Info
- time Coverage: 1993
- linguality Info
dc:type | corpus |
dc:title | COLT – The Bergen Corpus of London Teenage Language (with audio recordings) |
dc:identifier | oai:clarino.uib.no:colt |
dc:description | COLT is a corpus of London Teenage Language with audio recordings. It is now distributed via the search engine Corpuscle. Corpuscle allows you to pass queries to the corpus, and you may ask for concordances, collocations and distribution. The corpus results from the project COLT. The aim of the project was to create a corpus of British English spontaneous teenage talk and make it available for research, first on the internet, next as an orthographically and prosodically transcribed CD-ROM version, and finally as a CD-ROM version with both text and sound. The recordings were made by 31 volunteer ‘recruits’, boys and girls aged 13 to 17 from five socially different school boroughs, each equipped with a Sony Walkman, a lapel microphone and a log book. The entire material of roughly half a million words was orthographically transcribed by trained transcribers employed by the Longman Group for transcribing The British National Corpus (BNC). A copy of this version of COLT was incorporated in the BNC. At the Bergen end, the orthographically transcribed material was subsequently submitted to careful editing, which involved correcting misinterpreted talk, reducing the number of <unclear> passages and adding untranscribed talk. The edited version was then tagged for word classes in the same way as the BNC by a research team at Lancaster University. |
dc:publisher | |
dc:format | |
dc:date | |
dc:date | |
dc:rights | Academic |
dc:rights | CLARIN |
dc:rights | CLARIN_ACA-NC-LOC-PRIV-ND-* |
dc:rights | https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1 |
dc:lang | English |
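
The licence URL listed under dc:rights carries a set of query parameters (ID, BY, NC, LOC, PRIV, NORED, ND) that appear to mirror the conditions Of Use entries in the record. As a minimal sketch, assuming that a parameter set to 1 means the corresponding condition applies (an interpretation of the URL, not something this record states), the flags could be read back out with Python's standard library. The function name conditions_from_licence_url is hypothetical and used only for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Licence URL copied verbatim from the record.
LICENCE_URL = (
    "https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca"
    "?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1"
)


def conditions_from_licence_url(url: str) -> list[str]:
    """Return the sorted names of query parameters whose only value is '1'.

    Assumption: a parameter set to 1 flags a condition of use.
    Parameters with other values (such as AFFIL=EDU) are skipped.
    """
    query = parse_qs(urlsplit(url).query)
    return sorted(name for name, values in query.items() if values == ["1"])


conditions = conditions_from_licence_url(LICENCE_URL)
print(conditions)
```

Under that assumption, the returned names can be compared with the conditions Of Use entries in the record to check that the two stay in sync.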