1.2 The Word Classes
The determination of word classes in di-lemmata is based on the considerations outlined below.
Every attempt to classify word forms runs into the difficulty that morphological, syntactic and semantic criteria are intermingled and lead to differing results. A unified and generally accepted classification does not exist and, according to linguists, will not exist in the future. One could therefore have used the system found in the latest Grammatik-Duden as a basis for the process of lemmatisation. For di-lemmata a different approach has been selected. The classification used here (applied after the words in the texts have been lemmatised) does not attempt to render a detailed grammatical description; rather, it is subject to the main objective of the program, viz. to open up new possibilities for computer-assisted text analysis. In other words, this classification is not an attempt to propose a new grammatical model, but simply to provide a framework for new enquiries.
At this point, questions of sense and meaning (Sinn und Bedeutung, Signifikant und Signifikat, Systematik und Pragmatik etc.) are not under discussion (nor indeed are they of particular relevance to di-lemmata). There is undoubtedly some justification in asserting that the semantic content of a text lies mainly with the principal parts of speech such as noun, adjective and verb, and any analysis of a poet's vocabulary and use of words concentrates primarily on these classes.
They are differentiated from the remaining word classes (referred to here as "Restklasse" = "Remainder") by their "semantic function" and by being "open classes". Open classes continually gain new lexemes, while other lexemes fall out of use and disappear in the course of time. Forms such as articles, pronouns and conjunctions make up the "Restklasse"; they are considered "closed classes" because few, if any, new lexemes are formed. Their number is largely constant.
There is a substantial quantitative difference between the principal parts of speech and the "Restklasse", and it becomes especially noticeable as the volume of text increases. The "Restklasse" has a relatively constant number of lexemes, whereas the number of lexemes among the principal parts of speech tends to vary. The opposite is true of the frequency with which lexemes occur in the texts: forms in the "Restklasse" such as articles and conjunctions (e.g. "und") are much more frequent. The only exception to this is the verb "sein", which also occurs as an auxiliary. In Trakl's work, for example, no principal part of speech appears among the 10 most frequent lexemes, and only 2 nouns and 3 adjectives appear when the top 25 are taken into consideration: the nouns "Nacht" and "Schatten" and the adjectives "dunkel", "schwarz" and "blau".
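The frequency comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by di-lemmata: the function names, the class labels and the toy sample are all hypothetical, and the input is assumed to be a list of (lemma, word class) pairs produced by an earlier lemmatisation step.

```python
from collections import Counter

# Assumption: these labels stand for the principal parts of speech;
# the labels actually used by di-lemmata are not given in the text.
OPEN_CLASSES = {"noun", "adjective", "verb"}


def top_lexemes(lemmatised, n):
    """Return the n most frequent (lemma, word_class) pairs with their counts."""
    return Counter(lemmatised).most_common(n)


def open_class_entries(ranked):
    """Keep only the ranked entries that belong to an open class."""
    return [(lemma, cls) for (lemma, cls), _count in ranked if cls in OPEN_CLASSES]


# Invented toy sample, not real corpus data.
sample = [
    ("der", "article"), ("Nacht", "noun"), ("und", "conjunction"),
    ("der", "article"), ("dunkel", "adjective"), ("und", "conjunction"),
    ("der", "article"), ("Schatten", "noun"), ("und", "conjunction"),
    ("der", "article"),
]

ranked = top_lexemes(sample, 2)
print(ranked)                      # the two most frequent lexemes
print(open_class_entries(ranked))  # open-class lexemes among them
```

Even in this tiny sample the top ranks are taken by "Restklasse" forms, while the open-class lexemes only show up once a longer ranking is requested.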
The "Restklasse" is not of outstanding (semantic) importance for questions of literary criticism. Nonetheless, an attempt was made to divide it into sub-classes, albeit in a limited way and with some exceptions. Every unambiguous lexeme was assigned to the appropriate word class, but the finer differentiation found in formal grammars was not undertaken. All pronouns were classified together without further differentiation, as were the various kinds of conjunctions.
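A coarse classification of this kind might be modelled as a simple mapping from fine-grained grammatical labels to a single class. The sketch below is hypothetical: the fine-grained labels, the `COARSE_CLASS` table and the `classify` function are assumptions made for illustration, and the text does not say how di-lemmata actually decides that a lexeme is ambiguous.

```python
# Hypothetical fine-grained labels collapsed into coarse word classes.
COARSE_CLASS = {
    "personal pronoun": "Pronoun",
    "relative pronoun": "Pronoun",
    "demonstrative pronoun": "Pronoun",
    "coordinating conjunction": "Conjunction",
    "subordinating conjunction": "Conjunction",
}


def classify(candidate_tags):
    """Map the possible readings of a lexeme to one coarse word class.

    If all readings collapse to the same coarse class, that class is
    returned; otherwise (or if no reading is given) the lexeme is
    treated as an ambiguous entry.
    """
    coarse = {COARSE_CLASS.get(tag, tag) for tag in candidate_tags}
    if len(coarse) == 1:
        return coarse.pop()
    return "Ambiguous Entries"


print(classify(["personal pronoun", "relative pronoun"]))
print(classify(["relative pronoun", "subordinating conjunction"]))
```

Collapsing the labels first means that a form which is ambiguous only between two kinds of pronoun still counts as unambiguous at the level of detail used here.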
In keeping with the overall objective of the project, viz. meaningful support of literary analysis, proper names, foreign words and quotations are listed under C. It is understood that these 3 categories do not strictly belong among the word classes; nonetheless it was deemed helpful to treat the 3 groups in this fashion.
Finally, the scope of the lemmatisation process should be mentioned: all the poetic texts were lemmatised. Diaries, letters and other writings (not considered poetic texts) were not lemmatised. A Register of Persons has been compiled for these texts.
The Word Classes:
A. Main Parts of Speech:
B. The "Restklasse" (Remainder):
7. Pronominal Adverbs
11. Ambiguous Entries
12. Proper Names
13. Foreign Words