Using a large social media dataset and open-vocabulary methods from computational linguistics, we explored differences in language use across gender, affiliation, and assertiveness. In Study 1, we analyzed topics (groups of semantically similar words) across 10 million messages from over 52,000 Facebook users. Most language differed little across gender. However, topics most associated with self-identified female participants included friends, family, and social life, whereas topics most associated with self-identified male participants included swearing, anger, discussion of objects instead of people, and the use of argumentative language. In Study 2, we plotted male- and female-linked language topics along two interpersonal dimensions prevalent in gender research: affiliation and assertiveness. In a sample of over 15,000 Facebook users, we found substantial gender differences in the use of affiliative language and slight differences in assertive language. Language used more by self-identified females was interpersonally warmer, more compassionate, polite, and-contrary to previous findings-slightly more assertive in their language use, whereas language used more by self-identified males was colder, more hostile, and impersonal. Computational linguistic analysis combined with methods to automatically label topics offer means for testing psychological theories unobtrusively at large scale.
Adequate normalization minimizes the effects of systematic technical variations and is a prerequisite for getting meaningful biological changes. However, there is inconsistency about miRNA normalization performances and recommendations. Thus, we investigated the impact of seven different normalization methods (reference gene index, global geometric mean, quantile, invariant selection, loess, loessM, and generalized procrustes analysis) on intra- and inter-platform performance of two distinct and commonly used miRNA profiling platforms.
We contrasted the predictive power of three measures of semantic richness-number of features (NFs), contextual dispersion (CD), and a novel measure of number of semantic neighbors (NSN)-for a large set of concrete and abstract concepts on lexical decision and naming tasks. NSN (but not NF) facilitated processing for abstract concepts, while NF (but not NSN) facilitated processing for the most concrete concepts, consistent with claims that linguistic information is more relevant for abstract concepts in early processing. Additionally, converging evidence from two datasets suggests that when NSN and CD are controlled for, the features that most facilitate processing are those associated with a concept’s physical characteristics and real-world contexts. These results suggest that rich linguistic contexts (many semantic neighbors) facilitate early activation of abstract concepts, whereas concrete concepts benefit more from rich physical contexts (many associated objects and locations).
The Food Choice Questionnaire (FCQ) assesses the importance that subjects attribute to nine factors related to food choices: health, mood, convenience, sensory appeal, natural content, price, weight control, familiarity and ethical concern. This study sought to assess the applicability of the FCQ in Brazil; it describes the translation and cultural adaptation from English into Portuguese of the FCQ via the following steps: independent translations, consensus, back-translation, evaluation by a committee of experts, semantic validation and pre-test. The pre-test was run with a randomly sampled group of 86 male and female college students from different courses with a median age of 19. Slight differences between the versions were observed and adjustments were made. After minor changes in the translation process, the committee of experts considered that the Brazilian Portuguese version was semantically and conceptually equivalent to the English original. Semantic validation showed that the questionnaire is easily understood. The instrument presented a high degree of internal consistency. The study is the first stage in the process of validating an instrument, which consists of face and content validity. Further stages, already underway, are needed before other researchers can use it.
Why do people self-report an aversion to words like “moist”? The present studies represent an initial scientific exploration into the phenomenon of word aversion by investigating its prevalence and cause. Results of five experiments indicate that about 10-20% of the population is averse to the word “moist.” This population often speculates that phonological properties of the word are the cause of their displeasure. However, data from the current studies point to semantic features of the word-namely, associations with disgusting bodily functions-as a more prominent source of peoples' unpleasant experience. “Moist,” for averse participants, was notable for its valence and personal use, rather than imagery or arousal-a finding that was confirmed by an experiment designed to induce an aversion to the word. Analyses of individual difference measures suggest that word aversion is more prevalent among younger, more educated, and more neurotic people, and is more commonly reported by females than males.
The meaning of language is represented in regions of the cerebral cortex collectively known as the ‘semantic system’. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. Here we systematically map semantic selectivity across the cortex using voxel-wise modelling of functional MRI (fMRI) data collected while subjects listened to hours of narrative stories. We show that the semantic system is organized into intricate patterns that seem to be consistent across individuals. We then use a novel generative model to create a detailed semantic atlas. Our results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas shows which domains are represented in each area. This study demonstrates that data-driven methods–commonplace in studies of human neuroanatomy and functional connectivity–provide a powerful and efficient means for mapping functional representations in the brain.
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 1 year ago
How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides indirect access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here, we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries to translate words to and from languages carefully selected to be representative of worldwide diversity. These translations reveal cases where a particular language uses a single “polysemous” word to express multiple concepts that another language represents using distinct words. We use the frequency of such polysemies linking two concepts as a measure of their semantic proximity and represent the pattern of these linkages by a weighted network. This network is highly structured: Certain concepts are far more prone to polysemy than others, and naturally interpretable clusters of closely related concepts emerge. Statistical analysis of the polysemies observed in a subset of the basic vocabulary shows that these structural properties are consistent across different language groups, and largely independent of geography, environment, and the presence or absence of a literary tradition. The methods developed here can be applied to any semantic domain to reveal the extent to which its conceptual structure is, similarly, a universal attribute of human cognition and language use.
The psychological state of love is difficult to define, and we often rely on metaphors to communicate about this state and its constituent experiences. Commonly, these metaphors liken love to a physical force-it sweeps us off our feet, causes sparks to fly, and ignites flames of passion. Even the use of “attraction” to refer to romantic interest, commonplace in both popular and scholarly discourse, implies a force propelling two objects together. The present research examined the effects of exposing participants to a physical force (magnetism) on subsequent judgments of romantic outcomes. Across two studies, participants exposed to magnets reported greater levels of satisfaction, attraction, intimacy, and commitment.
Creativity is a complex, multi-faceted concept encompassing a variety of related aspects, abilities, properties and behaviours. If we wish to study creativity scientifically, then a tractable and well-articulated model of creativity is required. Such a model would be of great value to researchers investigating the nature of creativity and in particular, those concerned with the evaluation of creative practice. This paper describes a unique approach to developing a suitable model of how creative behaviour emerges that is based on the words people use to describe the concept. Using techniques from the field of statistical natural language processing, we identify a collection of fourteen key components of creativity through an analysis of a corpus of academic papers on the topic. Words are identified which appear significantly often in connection with discussions of the concept. Using a measure of lexical similarity to help cluster these words, a number of distinct themes emerge, which collectively contribute to a comprehensive and multi-perspective model of creativity. The components provide an ontology of creativity: a set of building blocks which can be used to model creative practice in a variety of domains. The components have been employed in two case studies to evaluate the creativity of computational systems and have proven useful in articulating achievements of this work and directions for further research.
The claim that Eskimo languages have words for different types of snow is well-known among the public, but has been greatly exaggerated through popularization and is therefore viewed with skepticism by many scholars of language. Despite the prominence of this claim, to our knowledge the line of reasoning behind it has not been tested broadly across languages. Here, we note that this reasoning is a special case of the more general view that language is shaped by the need for efficient communication, and we empirically test a variant of it against multiple sources of data, including library reference works, Twitter, and large digital collections of linguistic and meteorological data. Consistent with the hypothesis of efficient communication, we find that languages that use the same linguistic form for snow and ice tend to be spoken in warmer climates, and that this association appears to be mediated by lower communicative need to talk about snow and ice. Our results confirm that variation in semantic categories across languages may be traceable in part to local communicative needs. They suggest moreover that despite its awkward history, the topic of “words for snow” may play a useful role as an accessible instance of the principle that language supports efficient communication.