Academic Article

The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs
Article
Document Type
Academic Journal
Source
Behavior Research Methods. August 2021, Vol. 53 Issue 4, p1799, 18 p.
Subject
Croatia
United Kingdom
Language
English
Abstract
Author(s): Anita Peti-Stantic, Maja Andel, Vedrana Gnjidic, Gordana Kerestes, Nikola Ljubesic, Irina Masnikosa, Mirjana Tonkovic, Jelena Tusek [...]
Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies that require words as stimuli. Linguistic characteristics (e.g., word length, corpus frequency) are frequently coded, but word class is seldom treated systematically, although there are indications of its significance for imageability and concreteness. This paper presents the Croatian Psycholinguistic Database (CPD; available at: https://doi.org/10.17234/megahr.2019.hpb), containing 6000 Croatian nouns, verbs, adjectives and adverbs, rated for concreteness, imageability, age of acquisition, and subjective frequency. Moreover, we present computationally obtained extrapolations of concreteness and imageability to the remainder of the Croatian lexicon (available at: https://github.com/megahr/lexicon/blob/master/predictions/hr_c_i.predictions.txt). In the two studies presented here, we explore the significance of word class for concreteness and imageability in human and computationally obtained ratings. The correlations observed in the CPD match the relationships between psycholinguistic measures expected from the literature. Nouns, verbs, adjectives and adverbs differ significantly in subjective frequency, age of acquisition, concreteness and imageability. In the computational study, which focused on concreteness and imageability, predicted concreteness correlated more strongly with human ratings than predicted imageability did; the system underpredicted the concreteness of nouns and overpredicted the concreteness of adjectives and adverbs. Overall, this suggests that word class contains schematic conceptual and distributional information. Schematic conceptual content seems to be more significant in human ratings of concreteness and less significant in computationally obtained ratings, where distributional information seems to play a more significant role. This suggests that word class differences should be theoretically explored.