The LIA Treebank

The LIA Treebank includes 7536 speech segments and 77 701 tokens from LIA Norwegian. The treebank is annotated with morphological and dependency-style syntactic analyses, which have been manually corrected. It is available in three versions: a downloadable version in CoNLL-X format, a searchable version in the Glossa search interface, and a downloadable version in CoNLL-U format. The CoNLL-U version is automatically converted to Universal Dependencies and includes 5250 speech segments and 55 410 tokens.

LIA Norwegian is a speech corpus of old recordings (1939–1996) from four Norwegian universities: NTNU, UoB, UoO and UoT.
