TAUS – Talemålsundersøkelsen i Oslo

The material from Talemålsundersøkelsen i Oslo (TAUS, The spoken language investigation in Oslo) is based on informal interviews with people from Oslo, conducted in 1971-73. The informants are mainly from two eastern districts (Vålerenga and Kampen) and one western district (Frogner), and their social background can be considered representative with respect to education, occupation and childhood environment. The informants fall into three age groups: youth (15 – 17 years), young adults (20 – 30) and adults (34 – 75).

The interview topics are experiences and descriptions from childhood and adolescence, and there are several instances of oral storytelling. The conversations took place in the informants' homes in a relaxed and informal tone, so the linguistic style can be described as informal everyday speech.

In 2006 – 2007 the A and C series of the TAUS tapes were digitized, and all the interviews were transcribed orthographically. The transcriptions are also linked to the digitized audio files. The entire material is searchable via the search tool Glossa. It is possible to search both the original phonetic TAUS transcriptions and the orthographic ones. Note that some of the original TAUS tapes have been lost, so those interviews are missing from the searchable material. Read more about this under the Informanter (Informants) tab.

In 2014 – 2019 the B series was digitized and transcribed through the LIA project.

In 2018 TAUS was added to a new version of Glossa. Log in with Feide or Clarin, or contact the Text Laboratory.

In January 2020 TAUS v.3 was published with all available material from the A, B and C series. The corpus has 86 speakers and 387 551 tokens.

Extended metadata

resource Common Info:
resource Type: corpus
identification Info:
resource Name: TAUS – Talemålsundersøkelsen i Oslo
resource Name: TAUS – The spoken language investigation in Oslo
description: The material from TAUS (The spoken language investigation in Oslo) is based on informal interviews with people from Oslo, conducted in 1971-73. The informants are mainly from two eastern districts (Vålerenga and Kampen) and one western district (Frogner), and have a social background that can be considered representative with respect to education, occupation and childhood environment. The informants fall into three groups based on age: youth (15 – 17 years), young adults (20 – 30) and adults (34 – 75). The topics for the interviews are experiences and descriptions from childhood and adolescence. The interviews were conducted in the informants' homes in a relaxed and informal tone, so that the linguistic style can be described as informal vernacular. In 2006 – 2007 the TAUS tapes from the A and C series were digitized, and all the interviews were transcribed orthographically and linked to the digital audio files. The transcriptions are now searchable via the search tool Glossa. In 2014 – 2019 the tapes from the B series were digitized and transcribed during the LIA project (https://www.hf.uio.no/iln/english/research/projects/language-infrastructure-made-accessible/index.html). In January 2020 TAUS v.3 was published with all available material from the A, B and C series. TAUS v.3 has 86 speakers and 387 551 tokens.
resource Short Name: TAUS
url: http://www.tekstlab.uio.no/nota/taus/index.html
PID: http://hdl.handle.net/11538/0000-0005-E7C2-B
distribution Info:
licence Info:
user Category: Academic
distribution Access Medium: accessibleThroughInterface
execution Location: http://www.tekstlab.uio.no/nota/taus/index.html
execution Location: http://www.tekstlab.uio.no/nota/taus/english.html
licence:
licence Family: CLARIN
licence Name: CLARIN_ACA-NC-LOC-PRIV-ND-*
licence Url: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
conditions Of Use: *
conditions Of Use: BY
conditions Of Use: ID
conditions Of Use: LOC
conditions Of Use: NC
conditions Of Use: ND
conditions Of Use: NORED
conditions Of Use: PRIV
non Standard Conditions Of Use: The corpus has audio and video recordings classified as personal data. In agreement with NSD, the Data Protection Official in Norway, the corpus is accessible only through Glossa, a search and post-processing tool developed by the Text Laboratory. The video and audio excerpts given by the search interface cannot be shown in public unless you have an agreement with the Text Laboratory. Please note that every individual researcher is responsible for treating the participants in the corpus with respect and sincerity. Furthermore, the participants must be kept anonymous in every published paper or other output.
licensor:
actor Info:
actor Type: organization
organization Info:
organization Name: University of Oslo
organization Name: Universitetet i Oslo
organization Short Name: UiO
organization Short Name: UoO
department Name: Department of Linguistics and Scandinavian Studies
department Name: Institutt for lingvistiske og nordiske studier (ILN)
communication Info:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zip Code: 0317
city: OSLO
country: Norway
distribution Rights Holder
- actor Info:
- actor Type: organization
- organization Info:
- organization Name: University of Oslo
- organization Name: Universitetet i Oslo
- organization Short Name: UiO
- organization Short Name: UoO
- department Name: Department of Linguistics and Scandinavian Studies
- department Name: Institutt for lingvistiske og nordiske studier (ILN)
- communication Info:
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/english/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
contact
- actor Info:
- actor Type: organization
- organization Info:
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info:
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
metadata Info:
metadata Creation Date: 31.07.2015
metadata Last Date Updated: 04.05.2021
metadata Creator
- actor Info:
- actor Type: person
- person Info:
- surname: Hagen
- given Name: Kristin
- organization Info:
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info:
- email: kristin.hagen@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
version Info:
version: Third version
validation Info:
validated: true
validation Type: content
validation Mode: manual
validation Mode Details: The transcriptions are proofread against the audio files.
validation Extent: full
validator:
actor Info:
actor Type: organization
organization Info:
organization Name: The Text Laboratory
organization Short Name: Textlab
department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
communication Info:
email: tekstlab-post@iln.uio.no
url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
address: Box 1102 Blindern
zip Code: 0317
city: OSLO
country: Norway
resource Documentation Info:
documentation Unstructured:
role: documentation
document Unstructured: http://www.tekstlab.uio.no/nota/taus/index.html
documentation Structured:
role: documentation
document Info:
document Type: book
title: Oslomål. TAUS skrift nr. 6. (Hovedrapport.)
author: E. Hanssen, Th. Hoel, E. H. Jahr, O. Rekdal, G. Wiggen.
year: 1978
documentation Structured:
role: documentation
document Info:
document Type: mastersThesis
title: Sosio-syntaktisk undersøking av talemålet til utvalgte grupper Oslo-ungdom.
author: Wiggen, Geirr
year: 1974
resource Creation Info:
creation Start Date: 01.01.1970
creation End Date: 15.01.2020
resource Creator
- actor Info:
- actor Type: organization
- role: Listed as first author of the project report. TAUS was otherwise a group effort.
- person Info:
- surname: Hanssen
- given Name: Eskil
- sex: male
- organization Info:
- organization Name: Prosjektet Talemålsundersøkelsen i Oslo (1971-1976)
- department Name: Formerly the Department of Nordic Language and Literature (Institutt for nordisk språk og litteratur) at UiO.
- communication Info:
- email: eskil.hanssen@iln.uio.no
- actor Info:
- actor Type: organization
- organization Info:
- organization Name: The Text Laboratory
- organization Short Name: Textlab
- department Name: Department of Linguistics and Scandinavian Studies, University of Oslo
- communication Info:
- email: tekstlab-post@iln.uio.no
- url: http://www.hf.uio.no/iln/om/organisasjon/tekstlab/
- address: Box 1102 Blindern
- zip Code: 0317
- city: OSLO
- country: Norway
funding Project:
project Info:
project Name: Talemålsundersøkelsen i Oslo
project Short Name: TAUS
funding Type: nationalFunds
funder: NAVF, Norges almennvitenskaplige forskningsråd
funding Country: Norway
project Start Date: 01.01.1971
project End Date: 31.12.1976
funding Project:
project Info:
project Name: Digitization and retranscription of TAUS (Digitalisering og retranskribering av TAUS)
funding Type: nationalFunds
funder: Equipment funds from the Faculty of Humanities, University of Oslo
funder: Professor Didrik Arup Seips fond
funding Country: Norway
project Start Date: 01.01.2006
project End Date: 31.12.2007
funding Project:
project Info:
project Name: LIA (Language Infrastructure made Accessible)
project Short Name: LIA
project ID: 22 59 41
url: http://tekstlab.uio.no/LIA/
url: https://www.hf.uio.no/iln/english/research/projects/language-infrastructure-made-accessible/index.html
funding Type: nationalFunds
funder: The Research Council of Norway
funding Country: Norway
project Start Date: 04.01.2014
project End Date: 31.12.2019

corpus Info:
corpus Type: Multimodal Corpus
corpus Part Info:
media Type: text
corpus Text Info:
text Format Info:
mime Type: txt
size Per Text Format:
size Info:
size: 387 551
size Unit: tokens
character Encoding Info:
character Encoding: Unicode
corpus Part Info:
media Type: audio
corpus Audio Info:
audio Size Info:
size Info:
size: ca 5
size Unit: gb
audio Content Info:
speech Items: freeSpeech
setting Info:
naturality: spontaneous
conversational Type: dialogue
audience: no
interactivity: interactive
interaction: Informal interviews that sound more formal by 2015 standards
audio Format Info:
mime Type: wav and mp3
signal Encoding: linearPCM
sampling Rate: 32
quantization: 64
number Of Tracks: 1
recording Quality: low
compression Info:
compression: true
compression Name: mpeg
corpus Part General Info:
person Source Set Info:
number Of Persons: 86
age Of Persons: teenager
age Of Persons: adult
age Of Persons: elderly
age Range Start: 15
age Range End: 75
sex Of Persons: mixed
origin Of Persons: native
dialect Accent Of Persons: Oslo dialect: from Kampen, Vålerenga (Oslo east) and Frogner (Oslo west)
linguality Info:
linguality Type: monolingual
language Info:
language Id: No
language Name: Norwegian
language Info:
language Id: Nb
language Name: Norwegian Bokmål
modality Info:
modality Type: spokenLanguage
modality Type Details: Orthographic transcription. Some of the interviews in the A series also have the original phonetic TAUS transcription linked to the orthographic transcription. The B series transcriptions have phonetic transcriptions following the LIA guidelines together with orthographic transcriptions.
size Info:
size: 387 551
size Unit: tokens
annotation Info:
annotation Type: morphosyntacticAnnotation-posTagging
annotated Elements: other
segmentation Level: word
tagset: POS tagset created for the statistical NoTa-tagger – based on the tagset of the Oslo Bergen Tagger.
tagset Language Id: Nb
tagset Language Name: Norwegian Bokmål
theoretic Model: TreeTagger
annotation Mode: automatic
annotation Manual Structured:
role: annotationManual
document Info:
document Type: article
title: Tagging a Norwegian Speech Corpus
author: Anders Nøklestad and Åshild Søfteland
editor: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek, Mare Koit
year: 2007
book Title: Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007
pages: 245–248
conference: Nodalida 2007
document Language Name: English
document Language Id: en
annotation Manual Structured:
role: annotationManual
document Info:
document Type: article
title: Manuell morfologisk tagging av NoTa-materialet med støtte fra en statistisk tagger.
author: Åshild Søfteland and Anders Nøklestad
editor: Janne Bondi Johannessen and Kristin Hagen
year: 2008
publisher: Novus forlag
book Title: Språk i Oslo. Ny forskning omkring talespråk
pages: 226–234
ISBN: 978-82-7099-471-7
document Language Name: Norwegian
document Language Id: nb
annotation Manual Structured:
role: annotationManual
document Info:
document Type: manual
title: NoTa-taggeren: TAGGEVEILEDNING
author: Åshild Søfteland
year: 2007
url: http://www.tekstlab.uio.no/nota/oslo/Taggeveiledning2.pdf
document Language Name: Norwegian Bokmål
document Language Id: nb
annotation Info:
annotation Type: speechAnnotation-orthographicTranscription
annotation Type: speechAnnotation-phoneticTranscription
annotation Manual Unstructured:
role: annotationManual
document Unstructured: Orthographic transcription, cf. Bokmålsordboka (Wangensteen 2004)
annotation Manual Structured:
role: annotationManual
document Info:
document Type: manual
title: Transkripsjonsveiledning for NoTa-Oslo
author: Kristin Hagen
year: 2008
url: http://www.tekstlab.uio.no/nota/oslo/transkripsjon/NoTa-transkripsjonsveil22.pdf
annotation Manual Structured:
role: annotationManual
document Info:
document Type: manual
title: Transkripsjonsrettleiing for LIA
author: Kristin Hagen and Live Håberg and Eirik Olsen and Åshild Søfteland
year: 2018
url: http://tekstlab.uio.no/LIA/pdf/transkripsjonsrettleiing_lia.pdf
annotation Tool:
target Resource Name URI: Transcriber (http://trans.sourceforge.net/en/presentation.php)
annotation Tool:
target Resource Name URI: ELAN: https://tla.mpi.nl/tools/tla-tools/elan/ (for the B series)
annotation Tool:
target Resource Name URI: https://www.hf.uio.no/iln/english/about/organization/text-laboratory/services/oslo-transliterator/index.html
classification Info:
genre Info:
genre Type: speechGenre
genre: semi formal
unstandardised Genre: interviews
genre Info:
genre Type: speechGenre
genre: informal
unstandardised Genre: B series: Conversations between interviewer and informants. Some of them are friends, some of them are pretending to be friends as a part of the task.
time Coverage Info:
time Coverage: 1971 – 1976
time Coverage Info:
time Coverage: In 2006 – 2007 the TAUS-tapes were digitized, and all the interviews were transcribed orthographically and linked to the digital audio files.
time Coverage Info:
time Coverage: In 2014 – 2019 the tapes from the B series were digitized and transcribed. In 2020 the new TAUS v.3 corpus was published.
geographic Coverage Info:
geographic Coverage: Oslo (Vålerenga, Kampen and Frogner; in the B series there are also some other locations in Oslo)
recording Info:
recording Device Type: other
recording Environment: other
recorder Actor:
actor Info:
actor Type: organization
person Info:
surname: Hanssen
given Name: Eskil
sex: male
organization Info:
organization Name: Prosjektet Talemålsundersøkelsen i Oslo (1971-1976)
communication Info:
email: eskil.hanssen@iln.uio.no

dc:type	corpus
dc:title	TAUS – Talemålsundersøkelsen i Oslo
dc:identifier	oai:tekstlab.uio.no:taus
dc:description	The material from Talemålsundersøkelsen i Oslo (TAUS, The spoken language investigation in Oslo) is based on informal interviews with people from Oslo, conducted in 1971-73. The informants are mainly from two eastern districts (Vålerenga and Kampen) and one western district (Frogner), and their social background can be considered representative with respect to education, occupation and childhood environment. The informants fall into three age groups: youth (15 – 17 years), young adults (20 – 30) and adults (34 – 75). The interview topics are experiences and descriptions from childhood and adolescence, and there are several instances of oral storytelling. The conversations took place in the informants' homes in a relaxed and informal tone, so the linguistic style can be described as informal everyday speech. In 2006 – 2007 the A and C series of the TAUS tapes were digitized, and all the interviews were transcribed orthographically. The transcriptions are also linked to the digitized audio files. The entire material is searchable via the search tool Glossa. It is possible to search both the original phonetic TAUS transcriptions and the orthographic ones. Note that some of the original TAUS tapes have been lost, so those interviews are missing from the searchable material. Read more about this under the Informanter (Informants) tab. In 2014 – 2019 the B series was digitized and transcribed through the LIA project. In 2018 TAUS was added to a new version of Glossa. Log in with Feide or Clarin, or contact the Text Laboratory. In January 2020 TAUS v.3 was published with all available material from the A, B and C series. The corpus has 86 speakers and 387 551 tokens.
dc:publisher
dc:format	accessibleThroughInterface
dc:date	1970-01-01
dc:date	2020-01-15
dc:rights	Academic
dc:rights	CLARIN
dc:rights	CLARIN_ACA-NC-LOC-PRIV-ND-*
dc:rights	https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/ClarinEulaAca?ID=1&AFFIL=EDU&BY=1&NC=1&LOC=1&PRIV=1&NORED=1&ND=1
dc:creator	Prosjektet Talemålsundersøkelsen i Oslo (1971-1976)
dc:creator	The Text Laboratory
dc:lang	Norwegian
dc:lang	Bokmål
