META-NORD Sofie Norwegian treebank
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: META-NORD Sofie Norwegian treebank
- description: The Norwegian part of the META-NORD Sofie Parallel Treebank, a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” (Sophie's World) by Jostein Gaarder, published by Aschehoug forlag. The treebank consists of grammatical annotations of extracts from the original and was created by the INESS project for META-NORD. For more information, see the metadata description of the META-NORD Sofie Parallel Treebank.
- url: http://clarino.uib.no/iness/landing-page?resource=nob-sofie-lfg&view=short
- url: http://clarino.uib.no/iness/landing-page?resource=nob-sofie-lfg
- url: http://clarino.uib.no/iness/landing-page?resource=sofie-par&view=short
- PID: hdl:11495/D918-9A8D-3EE7-1
- identifier: nob-sofie-lfg
- distribution Info
- licence Info
- user Category: Public
- distribution Access Medium: downloadable
- distribution Access Medium: accessibleThroughInterface
- download Location: http://hdl.handle.net/11495/D918-9A8D-3EE7-1
- execution Location: http://hdl.handle.net/11495/D918-9A8D-3EE7-1
- attribution Text: The "Sofie analyses" is research material based on the novel "Sofies verden" [Sophie's world] by Jostein Gaarder, published by Aschehoug Forlag. If you use INESS in your research, please link to the INESS webpage (http://clarino.uib.no/iness) in materials included with your data. We suggest the following reference in your scientific publications: Victoria Rosén, Koenraad De Smedt, Paul Meurer, and Helge Dyvik. An open infrastructure for advanced treebanking. In Jan Hajič, Koenraad De Smedt, Marko Tadić, and António Branco (eds.) META-RESEARCH Workshop on Advanced Treebanking at LREC2012, pages 22–29, Istanbul, Turkey, May 2012.
- licence
- licence Family: none
- licence Name: unspecified
- licence Url: http://clarino.uib.no/comedi/licenses/sofie-license.txt
- conditions Of Use: BY
- conditions Of Use: LRT
- conditions Of Use: NORED
- ipr Holder
- actor Info
- actor Type: person
- person Info
- surname: Gaarder
- given Name: Jostein
- sex: male
- communication Info
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: post@lle.uib.no
- actor Info
- licence Info
- contact
- actor Info
- actor Type: person
- person Info
- surname: Rosén
- given Name: Victoria
- sex: female
- position: Associate Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: iness@uib.no
- actor Info
- metadata Info
- metadata Creation Date: 15.05.2015
- source: The present metadata are authoritative metadata. They are based on metadata from the project META-NORD (project end date 31.01.2013), published in the META-SHARE catalogue.
- original Metadata Schema: META-SHARE
- original Metadata Link: http://metashare.nb.no/repository/browse/meta-nord-sofie-norwegian-treebank/a5490ac4395711e2b66e001708556d5a221d085385924368a2f909543c41fbe1/#
- metadata Language Name: English
- metadata Language Id: en
- metadata Last Date Updated: 10.06.2016
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Losnegaard
- given Name: Gyri Smørdal
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: gyri.losnegaard@uib.no
- email: clarin@uib.no
- actor Info
- validated: true
- validation Type: content
- validation Mode: manual
- validation Mode Details: Each analysis is manually rated as gold analysis, no good analysis, or acceptable analysis. A fourth category, "no analysis", means the parser returned no result for a given sentence, due either to lack of coverage in the lexicon or grammar or to capacity problems.
- validation Extent: full
- validation Extent Details: All linguistic analyses and alignments between treebanks have been manually validated.
- validation Report Unstructured
- role: validationReport
- document Unstructured: Manual evaluation of all analyses (255 sentences/parse units). Gold analysis: 225 (88%), no good: 3 (1%), no analysis: 20 (8%), acceptable analysis: 8 (3%). Evaluation details can be provided on request. All alignments have been manually verified.
- validator:
- actor Info
- actor Type: person
- person Info
- surname: Losnegaard
- given Name: Gyri Smørdal
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- actor Type: person
- person Info
- surname: Lyse
- given Name: Gunn Inger
- sex: female
- position: Researcher (Ph.D)
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- email: iness@uib.no
- email: clarin@uib.no
- actor Type: person
- person Info
- surname: Thunes
- given Name: Martha
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- creation Start Date: 2011
- creation End Date: 2012
- funding Project:
- project Info
- project Name: Infrastructure for the Exploration of Syntax and Semantics
- project Short Name: INESS
- url: http://clarino.uib.no/iness/
- funding Type: nationalFunds
- funder: The Research Council of Norway under the Infrastruktur program
- funder: University of Bergen
- funding Country: Norway
- project Start Date: 01.04.2010
- project End Date: 31.03.2016
- project Name: META-NORD
- project ID: The META-NORD project has received funding from the European Commission through the CIP ICT PSP Prog
- url: http://meta-nord.eu
- funding Type: euFunds
- funder: European Commission through the CIP ICT PSP Programme
- project Start Date: 01.02.2011
- project End Date: 31.01.2013
- corpus Info
- corpus Type: Treebank
- corpus Part Info
- media Type: text
- corpus Part General Info
- source Work Info
- title: Sofies verden
- work Description: The novel Sofies verden (Sophie's world), ISBN: 9788203254147.
- author:
- actor Info
- actor Type: person
- person Info
- surname: Gaarder
- given Name: Jostein
- sex: male
- publisher:
- actor Info
- actor Type: organization
- organization Info
- organization Name: Aschehoug forlag
- organization Name: Aschehoug Publishing House
- communication Info
- email: Even.Rakil@aschehoug.no
- url: http://www.aschehoug.no/om/english
- city: Oslo
- country: Norway
- telephone Number: +47 22400449
- source Work Info
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: no
- language Name: Norwegian
- language Info
- language Id: nb
- language Name: Norwegian bokmål
- modality Info
- modality Type: writtenLanguage
- size Info
- size: 250
- size Unit: sentences
- size Info
- size: 3119
- size Unit: words
- annotation Info
- annotation Type: syntacticAnnotation-treebanks
- annotation Standoff: false
- segmentation Level: sentence
- annotation Format: Negra/Tiger XML
- tagset: http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_Lexical_Categories_(Preterminals); http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_Phrase_Structure_Categories; http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_F-structure_Features
- theoretic Model: Lexical Functional Grammar (LFG)
- annotation Mode: mixed
- annotation Mode Details: Automatic parsing, manual disambiguation using discriminants.
- annotation Manual Unstructured
- role: annotationManual
- document Unstructured: http://clarino.uib.no/iness/page?page-id=_NorGram_annotator_guidelines_
- annotator:
- actor Info
- actor Type: person
- person Info
- surname: Losnegaard
- given Name: Gyri Smørdal
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- actor Info
- actor Type: person
- person Info
- surname: Lyse
- given Name: Gunn Inger
- sex: female
- position: Researcher (Ph.D)
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: iness@uib.no
- email: clarin@uib.no
- actor Type: person
- person Info
- surname: Thunes
- given Name: Martha
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- annotation Type: alignment
- annotation Standoff: true
- segmentation Level: sentence
- annotation Mode: interactive
- annotator:
- actor Info
- actor Type: person
- person Info
- surname: Meurer
- given Name: Paul
- sex: male
- position: Senior researcher
- affiliation:
- organization Info
- organization Name: Uni Research AS
- department Name: Uni Research Computing
- communication Info
- email: paul.meurer@uni.no
- genre Info
- genre Type: textGenre
- genre: fiction and drama
- creation Mode: mixed
- creation Mode Details: The annotation is created through iterative parsebanking. The analyses produced by XLE with the Norwegian LFG grammar NorGram are disambiguated and stored in the parsebank. The annotation process involves disambiguating the parsing results interactively using discriminants, and mending any lack of coverage in lexicon and grammar by editing lexical entries and grammar rules, and reparsing the sentence.
- creation Tool
- target Resource Name URI: XLE, NorGram (online demonstrator: http://clarino.uib.no/iness/xle-web)
dc:type | corpus |
dc:title | META-NORD Sofie Norwegian treebank |
dc:identifier | oai:clarino.uib.no:nob-sofie-lfg |
dc:description | The Norwegian part of the META-NORD Sofie Parallel Treebank, a syntactically annotated parallel corpus based on the first chapters of the novel “Sofies verden” (Sophie's World) by Jostein Gaarder, published by Aschehoug forlag. The treebank consists of grammatical annotations of extracts from the original and was created by the INESS project for META-NORD. For more information, see the metadata description of the META-NORD Sofie Parallel Treebank. |
dc:publisher | |
dc:format | downloadable |
dc:date | 2011 |
dc:date | 2012 |
dc:rights | Public |
dc:rights | none |
dc:rights | unspecified |
dc:rights | http://clarino.uib.no/comedi/licenses/sofie-license.txt |
dc:lang | Norwegian |
dc:lang | Norwegian bokmål |