
The Freiburg – Brown Corpus of American English

The Freiburg – Brown Corpus of American English (Frown) contains texts from 1991.

Like the original Brown and LOB corpora, Frown contains 500 texts of around 2000 words each, distributed across 15 text categories, 9 informative and 6 imaginative.

The Freiburg update of the Brown corpus (Frown) is part of the ‘Brown family’ of corpora. Work on the compilation of Frown and its counterpart, the Freiburg-LOB corpus of British English (F-LOB), began in 1991. Both corpora were intended to match the Brown and LOB corpora as closely as possible in size and composition, the only difference being that they represent the language of the early 1990s.

The texts were not obtained by random sampling but were selected carefully to match the Brown corpus as closely as possible. The main aim was close comparability with Brown, rather than general statistical representativeness of printed output in the United States, so as to give linguists an empirical basis for studying language change in progress. There are two versions of the Frown corpus: the original version and a POS-tagged version, the latter produced jointly with Geoffrey Leech (Lancaster) and Nick Smith (then Lancaster, now Leicester).

