Swiss Text Corpus: Theoretical Foundations, Corpus Design and Query Options
Abstract
The SWISS TEXT CORPUS (CHTK) aims to document extensively the German language of the
20th century in Switzerland. In doing so, and in its parallel function as a sub-corpus of
the Corpus C4, which will consist of 20 million text words (tokens) each from Germany, Austria,
Italy/South Tyrol and Switzerland, it serves as a classical reference corpus both
for standard German in Switzerland and for the entire German-speaking area of Western
Europe. A reference corpus should comprehensively represent the core repertoire
of a language, i.e. its generally used vocabulary, which is why questions of corpus
structure and general planning (corpus design) play a decisive role (cf. Lemnitzer/Zinsmeister
2006: 106, where the reference corpus is contrasted with the special corpus). Four and a
half years after the start of the project, the SWISS TEXT CORPUS was made available to the general public
as a research instrument in April 2009. The following article briefly outlines the history of this
research project and discusses the fundamental and specific decisions that had to be made in the design
of such a reference corpus, as well as how the CHTK is compiled. Together with a concluding overview of
some retrieval and analysis options offered by the CHTK, this article also indicates the
potential of this new research instrument and supplies the background knowledge required to work with
the CHTK. For reasons of space, the working methods, the corpus-driven approaches, cannot be
discussed here (cf. Bubenhofer 2008, 2006).