- Research Data: Data of The acquisition of consonant clusters and word stress by early second language learners of German: Evidence for cross-linguistic influence? (2023-10-20). This study compared word-prosodic abilities of early second language (eL2) learners and monolingual learners of German. We examined the production of word-initial and word-final clusters and the placement of stress and analyzed potential effects of cross-linguistic influence (CLI). Monolingual German-speaking children (n = 38) and eL2 learners of German (n = 26; age of onset of German: 24 to 41 months) aged between 53 and 60 months completed a pseudoword repetition task following the metrical and phonotactic constraints of German. We collected background information via parental questionnaires. The eL2 learners acquired 12 different L1s. To explore the effects of CLI, we grouped the heritage languages by the number of consonants permitted in word-initial and word-final position, the segmental make-up of clusters, and stress patterns. The production accuracy of word-initial clusters and word stress was very high, indicating a high degree of maturation and showing no effects of CLI. In contrast, the production accuracy of word-final clusters was lower and effects of CLI were found, presumably related to smaller sonority distances compared to word-initial clusters. The study contributes empirically to the under-investigated area of eL2 word-prosodic development.
- Research Data: Section-Type Constraints on the Choice of Linguistic Mechanisms in Research Articles: A Corpus-Based Approach (2023). This thesis investigates the structure of research articles in the field of Computational Linguistics with the goal of establishing that a set of distinctive linguistic features is associated with each section type. The empirical results of the study are derived from the quantitative and qualitative evaluation of research articles from the ACL Anthology Corpus. More than 20,000 articles were analyzed to retrieve the target section types and extract the predefined set of linguistic features from them. Approximately 1,100 articles were found to contain all of the following five section types: abstract, introduction, related work, discussion, and conclusion. These were chosen for comparing the frequency of occurrence of the linguistic features across the section types. Making use of frameworks for Natural Language Processing, the Stanford CoreNLP Module, and the Python library spaCy, as well as scripts created by the author, the frequency scores of the features were retrieved and analyzed with state-of-the-art statistical techniques. The results show that each section type possesses an individual profile of linguistic features which are associated with it more or less strongly. These section-feature associations are shown to be derivable from the hypothesized purpose of each section type. Overall, the findings reported in this thesis provide insights into the writing strategies that authors employ to achieve the overall goal of the research paper. The results of the thesis can be applied in new state-of-the-art applications that assist academic writing and its evaluation by providing the user with more sophisticated, empirically based feedback on the relationship between linguistic mechanisms and text type.
In addition, identifying text-type-specific linguistic characteristics (a text-feature mapping) can contribute to the development of more robust language-based models for disinformation detection.