Using a large social media dataset and open-vocabulary methods from computational linguistics, we explored differences in language use across gender, affiliation, and assertiveness. In Study 1, we analyzed topics (groups of semantically similar words) across 10 million messages from over 52,000 Facebook users. Most language differed little across gender. However, topics most associated with self-identified female participants included friends, family, and social life, whereas topics most associated with self-identified male participants included swearing, anger, discussion of objects instead of people, and the use of argumentative language. In Study 2, we plotted male- and female-linked language topics along two interpersonal dimensions prevalent in gender research: affiliation and assertiveness. In a sample of over 15,000 Facebook users, we found substantial gender differences in the use of affiliative language and slight differences in assertive language. Language used more by self-identified females was interpersonally warmer, more compassionate, polite, and-contrary to previous findings-slightly more assertive in their language use, whereas language used more by self-identified males was colder, more hostile, and impersonal. Computational linguistic analysis combined with methods to automatically label topics offer means for testing psychological theories unobtrusively at large scale.
Adequate normalization minimizes the effects of systematic technical variations and is a prerequisite for getting meaningful biological changes. However, there is inconsistency about miRNA normalization performances and recommendations. Thus, we investigated the impact of seven different normalization methods (reference gene index, global geometric mean, quantile, invariant selection, loess, loessM, and generalized procrustes analysis) on intra- and inter-platform performance of two distinct and commonly used miRNA profiling platforms.
We contrasted the predictive power of three measures of semantic richness-number of features (NFs), contextual dispersion (CD), and a novel measure of number of semantic neighbors (NSN)-for a large set of concrete and abstract concepts on lexical decision and naming tasks. NSN (but not NF) facilitated processing for abstract concepts, while NF (but not NSN) facilitated processing for the most concrete concepts, consistent with claims that linguistic information is more relevant for abstract concepts in early processing. Additionally, converging evidence from two datasets suggests that when NSN and CD are controlled for, the features that most facilitate processing are those associated with a concept’s physical characteristics and real-world contexts. These results suggest that rich linguistic contexts (many semantic neighbors) facilitate early activation of abstract concepts, whereas concrete concepts benefit more from rich physical contexts (many associated objects and locations).
The Food Choice Questionnaire (FCQ) assesses the importance that subjects attribute to nine factors related to food choices: health, mood, convenience, sensory appeal, natural content, price, weight control, familiarity and ethical concern. This study sought to assess the applicability of the FCQ in Brazil; it describes the translation and cultural adaptation from English into Portuguese of the FCQ via the following steps: independent translations, consensus, back-translation, evaluation by a committee of experts, semantic validation and pre-test. The pre-test was run with a randomly sampled group of 86 male and female college students from different courses with a median age of 19. Slight differences between the versions were observed and adjustments were made. After minor changes in the translation process, the committee of experts considered that the Brazilian Portuguese version was semantically and conceptually equivalent to the English original. Semantic validation showed that the questionnaire is easily understood. The instrument presented a high degree of internal consistency. The study is the first stage in the process of validating an instrument, which consists of face and content validity. Further stages, already underway, are needed before other researchers can use it.
Why do people self-report an aversion to words like “moist”? The present studies represent an initial scientific exploration into the phenomenon of word aversion by investigating its prevalence and cause. Results of five experiments indicate that about 10-20% of the population is averse to the word “moist.” This population often speculates that phonological properties of the word are the cause of their displeasure. However, data from the current studies point to semantic features of the word-namely, associations with disgusting bodily functions-as a more prominent source of peoples' unpleasant experience. “Moist,” for averse participants, was notable for its valence and personal use, rather than imagery or arousal-a finding that was confirmed by an experiment designed to induce an aversion to the word. Analyses of individual difference measures suggest that word aversion is more prevalent among younger, more educated, and more neurotic people, and is more commonly reported by females than males.
The meaning of language is represented in regions of the cerebral cortex collectively known as the ‘semantic system’. However, little of the semantic system has been mapped comprehensively, and the semantic selectivity of most regions is unknown. Here we systematically map semantic selectivity across the cortex using voxel-wise modelling of functional MRI (fMRI) data collected while subjects listened to hours of narrative stories. We show that the semantic system is organized into intricate patterns that seem to be consistent across individuals. We then use a novel generative model to create a detailed semantic atlas. Our results suggest that most areas within the semantic system represent information about specific semantic domains, or groups of related concepts, and our atlas shows which domains are represented in each area. This study demonstrates that data-driven methods–commonplace in studies of human neuroanatomy and functional connectivity–provide a powerful and efficient means for mapping functional representations in the brain.
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 2 years ago
How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides indirect access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here, we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries to translate words to and from languages carefully selected to be representative of worldwide diversity. These translations reveal cases where a particular language uses a single “polysemous” word to express multiple concepts that another language represents using distinct words. We use the frequency of such polysemies linking two concepts as a measure of their semantic proximity and represent the pattern of these linkages by a weighted network. This network is highly structured: Certain concepts are far more prone to polysemy than others, and naturally interpretable clusters of closely related concepts emerge. Statistical analysis of the polysemies observed in a subset of the basic vocabulary shows that these structural properties are consistent across different language groups, and largely independent of geography, environment, and the presence or absence of a literary tradition. The methods developed here can be applied to any semantic domain to reveal the extent to which its conceptual structure is, similarly, a universal attribute of human cognition and language use.
Causal inference is a core task of science. However, authors and editors often refrain from explicitly acknowledging the causal goal of research projects; they refer to causal effect estimates as associational estimates. This commentary argues that using the term “causal” is necessary to improve the quality of observational research. Specifically, being explicit about the causal objective of a study reduces ambiguity in the scientific question, errors in the data analysis, and excesses in the interpretation of the results. (Am J Public Health. Published online ahead of print March 22, 2018: e1-e4. doi:10.2105/AJPH.2018.304337).
The psychological state of love is difficult to define, and we often rely on metaphors to communicate about this state and its constituent experiences. Commonly, these metaphors liken love to a physical force-it sweeps us off our feet, causes sparks to fly, and ignites flames of passion. Even the use of “attraction” to refer to romantic interest, commonplace in both popular and scholarly discourse, implies a force propelling two objects together. The present research examined the effects of exposing participants to a physical force (magnetism) on subsequent judgments of romantic outcomes. Across two studies, participants exposed to magnets reported greater levels of satisfaction, attraction, intimacy, and commitment.
- Proceedings of the National Academy of Sciences of the United States of America
- Published 8 months ago
Understanding how and why language subsystems differ in their evolutionary dynamics is a fundamental question for historical and comparative linguistics. One key dynamic is the rate of language change. While it is commonly thought that the rapid rate of change hampers the reconstruction of deep language relationships beyond 6,000-10,000 y, there are suggestions that grammatical structures might retain more signal over time than other subsystems, such as basic vocabulary. In this study, we use a Dirichlet process mixture model to infer the rates of change in lexical and grammatical data from 81 Austronesian languages. We show that, on average, most grammatical features actually change faster than items of basic vocabulary. The grammatical data show less schismogenesis, higher rates of homoplasy, and more bursts of contact-induced change than the basic vocabulary data. However, there is a core of grammatical and lexical features that are highly stable. These findings suggest that different subsystems of language have differing dynamics and that careful, nuanced models of language change will be needed to extract deeper signal from the noise of parallel evolution, areal readaptation, and contact.