INESS Sofie Norwegian Treebank
Extended metadata
- resource Common Info
- resource Type: corpus
- identification Info
- resource Name: INESS Sofie Norwegian Treebank
- description: The INESS Sofie Norwegian Treebank. The treebank is a syntactically annotated corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The sentence analyses were produced by INESS for the META-NORD project, whose goal was to promote the accessibility of existing treebanks for the languages in the project. The corpus is automatically analyzed with the NorGram LFG grammar and all analyses are manually verified.
- resource Short Name: Norwegian Sofie (large treebank)
- url: http://clarino.uib.no/iness/landing-page?resource=nob-sofie&view=short
- url: http://clarino.uib.no/iness/landing-page?resource=nob-sofie
- url: http://clarino.uib.no/iness/landing-page?resource=NorGram
- PID: hdl:11495/D9DB-327C-6330-2
- identifier: nob-sofie
- distribution Info
- licence Info
- user Category: Academic
- distribution Access Medium: downloadable
- distribution Access Medium: accessibleThroughInterface
- download Location: http://hdl.handle.net/11495/D9DB-327C-6330-2
- execution Location: http://hdl.handle.net/11495/D9DB-327C-6330-2
- licence
- licence Family: none
- licence Name: unspecified
- licence Url: http://clarino.uib.no/comedi/licenses/sofie-license.txt
- conditions Of Use: BY
- conditions Of Use: LRT
- conditions Of Use: NORED
- ipr Holder
- actor Info
- actor Type: person
- person Info
- surname: Gaarder
- given Name: Jostein
- sex: male
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- actor Info
- actor Type: organization
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: post@lle.uib.no
- actor Info
- licence Info
- contact
- actor Info
- actor Type: person
- person Info
- surname: Rosén
- given Name: Victoria
- sex: female
- position: Associate Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: iness@uib.no
- actor Info
- metadata Info
- metadata Creation Date: 12.08.2015
- source: The present metadata are authoritative metadata. They are based on metadata from the project META-NORD (project end date 31.01.2013), published in the META-SHARE catalogue.
- original Metadata Schema: META-SHARE
- original Metadata Link: http://metashare.nb.no/repository/browse/the-iness-sofie-norwegian-treebank/5e7bd6164aa111e28b63001708556d5a63dbc6d61e774a20ae197061b84ac0e0/
- metadata Language Name: English
- metadata Language Id: en
- metadata Last Date Updated: 10.06.2016
- metadata Creator
- actor Info
- actor Type: person
- person Info
- surname: Losnegaard
- given Name: Gyri Smørdal
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- communication Info
- email: gyri.losnegaard@uib.no
- email: clarin@uib.no
- actor Info
- validated: true
- validation Type: content
- validation Mode: manual
- validation Extent: full
- validation Extent Details: Manual validation of all annotations and alignments.
- validator:
- actor Info
- actor Type: person
- person Info
- surname: Jóhannsdóttir
- given Name: Kristín M.
- sex: female
- funding Project:
- project Info
- project Name: META-NORD
- project ID: The META-NORD project has received funding from the European Commission through the CIP ICT PSP Prog
- url: http://meta-nord.eu
- funding Type: euFunds
- funder: European Commission through the CIP ICT PSP Programme
- project Start Date: 01.02.2011
- project End Date: 31.01.2013
- corpus Info
- corpus Type: Treebank
- corpus Part Info
- media Type: text
- corpus Text Info
- text Format Info
- mime Type: text
- character Encoding Info
- character Encoding: utf-8
- text Format Info
- corpus Part General Info
- source Work Info
- title: Sofies verden
- work Description: The novel Sofies verden (Sophie's World), ISBN: 9788203254147.
- author:
- actor Info
- actor Type: person
- person Info
- surname: Gaarder
- given Name: Jostein
- sex: male
- publisher:
- actor Info
- actor Type: organization
- organization Info
- organization Name: Aschehoug forlag
- organization Name: Aschehoug Publishing House
- communication Info
- email: even.rakil@aschehougagency.no
- url: http://www.aschehoug.no/om/english
- city: Oslo
- country: Norway
- telephone Number: +47 22400449
- source Work Info
- linguality Info
- linguality Type: monolingual
- language Info
- language Id: no
- language Name: Norwegian
- language Info
- language Id: nb
- language Name: Norwegian bokmål
- modality Info
- modality Type: writtenLanguage
- size Info
- size: 1151
- size Unit: sentences
- size Info
- size: 15224
- size Unit: words
- annotation Info
- annotation Type: syntacticAnnotation-treebanks
- annotated Elements: other
- annotation Standoff: false
- segmentation Level: sentence
- segmentation Level: phrase
- segmentation Level: word
- annotation Format: Negra/Tiger XML
- tagset: http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_Lexical_Categories_(Preterminals); http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_Phrase_Structure_Categories; http://prosjekt.digital.uni.no/projects/inesspublic/wiki/NorGram_F-structure_Features
- theoretic Model: Constituency, with some dependency features
- annotation Mode: mixed
- annotation Mode Details: Automatically parsed with IceParser (annotation tool) and then manually corrected and enhanced.
- annotation Manual Unstructured
- role: annotationManual
- document Unstructured: http://clarino.uib.no/iness/page?page-id=_NorGram_annotator_guidelines_
- annotation Tool
- target Resource Name URI: http://icenlp.sourceforge.net/
- annotator:
- actor Info
- actor Type: person
- person Info
- surname: Meurer
- given Name: Paul
- sex: male
- position: Senior researcher
- affiliation:
- organization Info
- organization Name: Uni Research AS
- department Name: Uni Research Computing
- actor Info
- actor Type: person
- person Info
- surname: Thunes
- given Name: Martha
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- actor Type: person
- person Info
- surname: Rosén
- given Name: Victoria
- sex: female
- position: Associate Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- actor Type: person
- person Info
- surname: Dyvik
- given Name: Helge
- sex: male
- position: Professor
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- department Name: Institutt for lingvistiske, litterære og estetiske studier (LLE)
- actor Type: person
- person Info
- surname: Lyse
- given Name: Gunn Inger
- sex: female
- position: Researcher (Ph.D)
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Name: Universitetet i Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- actor Type: person
- person Info
- surname: Losnegaard
- given Name: Gyri Smørdal
- sex: female
- affiliation:
- organization Info
- organization Name: University of Bergen
- organization Short Name: UiB
- organization Short Name: UoB
- department Name: Department of Linguistic, Literary and Aesthetic Studies
- genre Info
- genre Type: textGenre
- genre: fiction and drama
- time Coverage: 2002
- creation Mode: mixed
- creation Mode Details: The annotation is created through iterative parsebanking. The analyses produced by XLE with the Norwegian LFG grammar NorGram are disambiguated and stored in the parsebank. The annotation process involves disambiguating the parsing results interactively using discriminants, and mending any lack of coverage in lexicon and grammar by editing lexical entries and grammar rules, and reparsing the sentence.
- creation Tool
- target Resource Name URI: XLE, NorGram (online demonstrator: http://clarino.uib.no/iness/xle-web)
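
The discriminant-based disambiguation mentioned under creation Mode Details can be illustrated with a minimal sketch. This is not the INESS or XLE implementation: the property strings, the `discriminants` and `apply_choice` helpers, and the toy analyses are all hypothetical and only show the general idea that a discriminant is a property holding for some, but not all, candidate analyses, and that each annotator decision filters the candidate set.

```python
from typing import FrozenSet, List, Set

# Hypothetical representation: each candidate analysis of a sentence is
# reduced to a set of elementary properties (illustrative strings only).
Analysis = FrozenSet[str]


def discriminants(analyses: List[Analysis]) -> Set[str]:
    """Return the properties that hold for some, but not all, analyses."""
    if not analyses:
        return set()
    union = set().union(*analyses)
    common = set(analyses[0]).intersection(*analyses[1:])
    return union - common


def apply_choice(analyses: List[Analysis], prop: str, accept: bool) -> List[Analysis]:
    """Keep the analyses consistent with accepting or rejecting one discriminant."""
    return [a for a in analyses if (prop in a) == accept]


# Toy example: two analyses that differ only in a made-up attachment property.
analyses = [
    frozenset({"subj:Sofie", "obj:mannen", "pp-attach:verb"}),
    frozenset({"subj:Sofie", "obj:mannen", "pp-attach:noun"}),
]

print(sorted(discriminants(analyses)))
remaining = apply_choice(analyses, "pp-attach:noun", accept=True)
print(len(remaining), sorted(discriminants(remaining)))
```

Once a single analysis remains, no discriminants are left and the sentence counts as disambiguated. If no analysis is acceptable, the record describes the next step as editing the lexicon or grammar and reparsing, which this sketch does not model.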
dc:type | corpus |
dc:title | INESS Sofie Norwegian Treebank |
dc:identifier | oai:clarino.uib.no:nob-sofie |
dc:description | The INESS Sofie Norwegian Treebank. The treebank is a syntactically annotated corpus based on the first chapters of the novel “Sofies verden” by Jostein Gaarder, published by Aschehoug forlag. The sentence analyses were produced by INESS for the META-NORD project, whose goal was to promote the accessibility of existing treebanks for the languages in the project. The corpus is automatically analyzed with the NorGram LFG grammar and all analyses are manually verified. |
dc:publisher | |
dc:format | downloadable |
dc:date | |
dc:date | |
dc:rights | Academic |
dc:rights | none |
dc:rights | unspecified |
dc:rights | http://clarino.uib.no/comedi/licenses/sofie-license.txt |
dc:lang | Norwegian |
dc:lang | Norwegian bokmål |