Concept: Similarity


Human social networks are overwhelmingly homophilous: individuals tend to befriend others who are similar to them in terms of a range of physical attributes (e.g., age, gender). Do similarities among friends reflect deeper similarities in how we perceive, interpret, and respond to the world? To test whether friendship, and more generally, social network proximity, is associated with increased similarity of real-time mental responding, we used functional magnetic resonance imaging to scan subjects' brains during free viewing of naturalistic movies. Here we show evidence for neural homophily: neural responses when viewing audiovisual movies are exceptionally similar among friends, and that similarity decreases with increasing distance in a real-world social network. These results suggest that we are exceptionally similar to our friends in how we perceive and respond to the world around us, which has implications for interpersonal influence and attraction.

Concepts: Psychology, Brain, Sociology, Magnetic resonance imaging, Similarity, Friendship, Social network


Species exposed to extreme environments often exhibit distinctive traits that help meet the demands of such habitats. Such traits could evolve independently, but under intense selective pressures of extreme environments some existing structures or behaviors might be coopted to meet specialized demands, evolving via the process of exaptation. We evaluated the potential for exaptation to have operated in the evolution of novel behaviors of the waterfall-climbing gobiid fish genus Sicyopterus. These fish use an “inching” behavior to climb waterfalls, in which an oral sucker is cyclically protruded and attached to the climbing surface. They also exhibit a distinctive feeding behavior, in which the premaxilla is cyclically protruded to scrape diatoms from the substrate. Given the similarity of these patterns, we hypothesized that one might have been coopted from the other. To evaluate this, we filmed climbing and feeding in Sicyopterus stimpsoni from Hawai'i, and measured oral kinematics for two comparisons. First, we compared feeding kinematics of S. stimpsoni with those for two suction feeding gobiids (Awaous guamensis and Lentipes concolor), assessing what novel jaw movements were required for algal grazing. Second, we quantified the similarity of oral kinematics between feeding and climbing in S. stimpsoni, evaluating the potential for either to represent an exaptation from the other. Premaxillary movements showed the greatest differences between scraping and suction feeding taxa. Between feeding and climbing, overall profiles of oral kinematics matched closely for most variables in S. stimpsoni, with only a few showing significant differences in maximum values. Although current data cannot resolve whether oral movements for climbing were coopted from feeding, or feeding movements coopted from climbing, similarities between feeding and climbing kinematics in S. stimpsoni are consistent with evidence of exaptation, with modifications, between these behaviors. Such comparisons can provide insight into the evolutionary mechanisms facilitating exploitation of extreme habitats.

Concepts: Natural selection, Evolution, Symbiosis, Difference, Behavior, Similarity, Gobiidae, Climbing


BACKGROUND: Semantic similarity measures estimate the similarity between concepts, and play an important role in many text processing tasks. Approaches to semantic similarity in the biomedical domain can be roughly divided into knowledge-based and distributional methods. Knowledge-based approaches utilize knowledge sources such as dictionaries, taxonomies, and semantic networks, and include path finding measures and intrinsic information content (IC) measures. Distributional measures utilize, in addition to a knowledge source, the distribution of concepts within a corpus to compute similarity; these include corpus IC and context vector methods. Prior evaluations of these measures in the biomedical domain showed that distributional measures outperform knowledge-based path finding methods, but more recent studies suggested that intrinsic IC based measures exceed the accuracy of distributional approaches. Limitations of previous evaluations of similarity measures in the biomedical domain include their focus on the SNOMED CT ontology, and their reliance on small benchmarks not powered to detect significant differences between measure accuracy. There have been few evaluations of the relative performance of these measures on other biomedical knowledge sources such as the UMLS, and on larger, recently developed semantic similarity benchmarks. RESULTS: We evaluated knowledge-based and corpus IC based semantic similarity measures derived from SNOMED CT, MeSH, and the UMLS on recently developed semantic similarity benchmarks. Semantic similarity measures based on the UMLS, which contains SNOMED CT and MeSH, significantly outperformed those based solely on SNOMED CT or MeSH across evaluations. Intrinsic IC based measures significantly outperformed path-based and distributional measures. We released all code required to reproduce our results and all tools developed as part of this study as open source. We also provide a publicly accessible web service to compute semantic similarity. CONCLUSIONS: Knowledge-based semantic similarity measures are more practical to compute than distributional measures, as they do not require an external corpus. Furthermore, knowledge-based measures significantly and meaningfully outperformed distributional measures on large semantic similarity benchmarks, suggesting that they are a practical alternative to distributional measures. Future evaluations of semantic similarity measures should utilize benchmarks powered to detect significant differences in measure accuracy.
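
To make the two knowledge-based families named above concrete, here is a minimal Python sketch of a path finding measure and an intrinsic IC measure. It is not the study's released code. The toy taxonomy and all function names are hypothetical, and the intrinsic IC formula (1 - log(descendants + 1) / log(N), combined with a Lin-style ratio) is one common formulation assumed here for illustration; the abstract does not say which formulas were evaluated.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not taken from SNOMED CT, MeSH, or the UMLS.
PARENT = {
    "disease": None,
    "infection": "disease",
    "cancer": "disease",
    "pneumonia": "infection",
    "influenza": "infection",
    "melanoma": "cancer",
}

def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain

def lowest_common_subsumer(a, b):
    """Return the deepest concept that subsumes both a and b."""
    seen = set(ancestors(a))
    for c in ancestors(b):
        if c in seen:
            return c
    raise ValueError("no common ancestor")

def path_similarity(a, b):
    """Path finding measure: 1 / (number of nodes on the shortest path)."""
    lcs = lowest_common_subsumer(a, b)
    edges = ancestors(a).index(lcs) + ancestors(b).index(lcs)
    return 1.0 / (edges + 1)

def descendant_count(concept):
    """Count the concepts that have `concept` as a proper ancestor."""
    return sum(1 for c in PARENT if c != concept and concept in ancestors(c))

def intrinsic_ic(concept):
    """Assumed intrinsic IC: computed from taxonomy structure alone, no corpus."""
    return 1.0 - math.log(descendant_count(concept) + 1) / math.log(len(PARENT))

def lin_similarity(a, b):
    """Lin-style similarity using intrinsic IC of the lowest common subsumer."""
    lcs = lowest_common_subsumer(a, b)
    denom = intrinsic_ic(a) + intrinsic_ic(b)
    return 2.0 * intrinsic_ic(lcs) / denom if denom > 0 else 0.0

if __name__ == "__main__":
    for pair in [("pneumonia", "influenza"), ("pneumonia", "melanoma")]:
        print(pair, path_similarity(*pair), lin_similarity(*pair))
```

Neither function needs an external corpus, which is the practical advantage the conclusions attribute to knowledge-based measures.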

Concepts: Evaluation, Difference, Similarity, Medical classification, Systematized Nomenclature of Medicine, WordNet, SNOMED CT, Semantic similarity


Humans tend to form social relationships with others who resemble them. Whether this sorting of like with like arises from historical patterns of migration, meso-level social structures in modern society, or individual-level selection of similar peers remains unsettled. Recent research has evaluated the possibility that unobserved genotypes may play an important role in the creation of homophilous relationships. We extend this work by using data from 5,500 adolescents from the National Longitudinal Study of Adolescent to Adult Health (Add Health) to examine genetic similarities among pairs of friends. Although there is some evidence that friends have correlated genotypes, both at the whole-genome level as well as at trait-associated loci (via polygenic scores), further analysis suggests that meso-level forces, such as school assignment, are a principal source of genetic similarity between friends. We also observe apparent social-genetic effects in which polygenic scores of an individual’s friends and schoolmates predict the individual’s own educational attainment. In contrast, an individual’s height is unassociated with the height genetics of peers.

Concepts: DNA, Gene, Genetics, Sociology, Adolescence, Similarity


It is assumed that synaptic strengthening and weakening balance throughout learning to avoid runaway potentiation and memory interference. However, energetic and informational considerations suggest that potentiation should occur primarily during wake, when animals learn, and depression should occur during sleep. We measured 6920 synapses in mouse motor and sensory cortices using three-dimensional electron microscopy. The axon-spine interface (ASI) decreased ~18% after sleep compared with wake. This decrease was proportional to ASI size, which is indicative of scaling. Scaling was selective, sparing synapses that were large and lacked recycling endosomes. Similar scaling occurred for spine head volume, suggesting a distinction between weaker, more plastic synapses (~80%) and stronger, more stable synapses. These results support the hypothesis that a core function of sleep is to renormalize overall synaptic strength increased by wake.

Concepts: Electron, Mass, Learning, Long-term potentiation, Chemical synapse, Similarity, Interference theory


In this paper we explore the results of a large-scale online game called ‘the Great Language Game’, in which people listen to an audio speech sample and make a forced-choice guess about the identity of the language from 2 or more alternatives. The data include 15 million guesses from 400 audio recordings of 78 languages. We investigate which languages are confused for which in the game, and if this correlates with the similarities that linguists identify between languages. This includes shared lexical items, similar sound inventories and established historical relationships. Our findings are, as expected, that players are more likely to confuse two languages that are objectively more similar. We also investigate factors that may affect players' ability to accurately select the target language, such as how many people speak the language, how often the language is mentioned in written materials and the economic power of the target language community. We see that non-linguistic factors affect players' ability to accurately identify the target. For example, languages with wider ‘global reach’ are more often identified correctly. This suggests that both linguistic and cultural knowledge influence the perception and recognition of languages and their similarity.

Concepts: Cognition, Linguistics, Language, Similarity, Semiotics, Translation, Word game


This paper reports an analysis and comparison of the use of 51 different similarity coefficients for computing the similarities between binary fingerprints for both simulated and real chemical data sets. Five pairs and a triplet of coefficients were found to yield identical similarity values, leading to the elimination of seven of the coefficients. The remaining 44 coefficients were then compared in two ways: by their theoretical characteristics using simple descriptive statistics, correlation analysis, multidimensional scaling, Hasse diagrams, and the recently described atemporal target diffusion model; and by their effectiveness for similarity-based virtual screening using MDDR, WOMBAT, and MUV data. The comparisons demonstrate the general utility of the well-known Tanimoto method but also suggest other coefficients that may be worthy of further attention.
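
As an illustration of what a similarity coefficient on binary fingerprints computes, the following Python sketch implements the Tanimoto coefficient and, for comparison, the Dice coefficient. The fingerprints are made-up bit patterns stored as Python integers, and returning 0.0 for two empty fingerprints is a convention assumed here, not something the paper specifies.

```python
def bit_count(x: int) -> int:
    """Number of bits set in a non-negative integer."""
    return bin(x).count("1")

def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient: shared on-bits / on-bits set in either fingerprint."""
    either = bit_count(fp_a | fp_b)
    if either == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return bit_count(fp_a & fp_b) / either

def dice(fp_a: int, fp_b: int) -> float:
    """Dice coefficient: 2 * shared on-bits / (on-bits in a + on-bits in b)."""
    total = bit_count(fp_a) + bit_count(fp_b)
    if total == 0:
        return 0.0  # assumed convention for two empty fingerprints
    return 2 * bit_count(fp_a & fp_b) / total

if __name__ == "__main__":
    # Hypothetical 6-bit fingerprints, each with four bits set, three of them shared.
    fp_a = 0b101101
    fp_b = 0b100111
    print(tanimoto(fp_a, fp_b))  # 0.6
    print(dice(fp_a, fp_b))      # 0.75
```

The two coefficients give different values for the same pair. Whether they also rank pairs differently is the kind of question the comparison above addresses.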

Concepts: Scientific method, Difference, Computer, Order theory, Similarity, Diagrams, Mathematical diagram, Hasse diagram


The aim of this paper was to find a possible link between the molecular and morphological similarities of 38 Hungarian white grape varieties. Three aspects of morphological and molecular similarity were assessed in the study: comparison of the ordered variety pairs, assessment of molecular and morphological mean similarity differences, and separation of varieties into similar groups by divisive cluster analysis (DIANA). Molecular similarity was calculated from binary data based on allele sizes obtained in DNA analysis. DNA fingerprints were determined at 9 SSR loci recommended by the European GrapeGen06 project. Morphological similarity was calculated on the basis of quantitative morphological descriptors. Morphological and molecular similarity values were ordered and categorized after pairwise comparison. Overall correlation was found to be weak, but case-by-case assessment of the variety pairs confirmed some coincidence of molecular and morphological similarity. The general similarity position of each variety was characterized by the Mean Similarity Index (MSI), calculated as the mean of the n-1 pairwise similarity values of the variety concerned. Varieties were ordered and compared by the difference of the index. Five varieties had low morphological and high molecular MSI, meaning that they share several SSR marker alleles with the others but seem relatively distinct in the expression of their morphological traits. Divisive cluster analysis was carried out to find similar groups. Eight- and twelve-cluster solutions proved sufficient to distinguish the varieties. Morphological and molecular similarity groups partly coincided according to the results. Several clusters reflected parent-offspring relations, but molecular clustering gave more realistic results concerning pedigree.
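
The Mean Similarity Index described above can be sketched in a few lines of Python. The variety labels and the similarity matrix below are invented for illustration and are not data from the study. The abstract also does not state which pairwise similarity coefficient was used, so the matrix is simply taken as given.

```python
def mean_similarity_index(sim, i):
    """Mean of the n-1 pairwise similarities between item i and every other item."""
    n = len(sim)
    return sum(sim[i][j] for j in range(n) if j != i) / (n - 1)

# Hypothetical symmetric similarity matrix for four made-up varieties.
LABELS = ["A", "B", "C", "D"]
SIM = [
    [1.0, 0.8, 0.4, 0.6],
    [0.8, 1.0, 0.5, 0.7],
    [0.4, 0.5, 1.0, 0.3],
    [0.6, 0.7, 0.3, 1.0],
]

if __name__ == "__main__":
    msi = {label: mean_similarity_index(SIM, i) for i, label in enumerate(LABELS)}
    # Order items from most to least similar to the rest of the set.
    for label in sorted(msi, key=msi.get, reverse=True):
        print(label, round(msi[label], 3))
```

Computing the index once from a molecular matrix and once from a morphological matrix, then taking the difference per variety, would give the kind of comparison the abstract describes.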

Concepts: DNA, Molecular biology, Difference, Vitis vinifera, Similarity, Vitis, Semantic similarity, Vitis labrusca


Opposing forces influence assortative mating so that one seeks a similar mate while at the same time avoiding inbreeding with close relatives. Thus, mate choice may be a balancing of phenotypic similarity and dissimilarity between partners. In the present study, we assessed the role of resemblance to Self’s facial traits in judgments of physical attractiveness. Participants chose the most attractive face image of their romantic partner among several variants, where the faces were morphed so as to include only 22% of another face. Participants distinctly preferred a “Self-based morph” (i.e., their partner’s face with a small amount of Self’s face blended into it) to other morphed images. The Self-based morph was also preferred to the morph of their partner’s face blended with the partner’s same-sex “prototype”, although the latter face was (“objectively”) judged more attractive by other individuals. When ranking morphs differing in level of amalgamation (i.e., 11% vs. 22% vs. 33%) of another face, the 22% was chosen consistently as the preferred morph and, in particular, when Self was blended in the partner’s face. A forced-choice signal-detection paradigm showed that the effect of self-resemblance operated at an unconscious level, since the same participants were unable to detect the presence of their own faces in the above morphs. We concluded that individuals, if given the opportunity, seek to promote “positive assortment” for Self’s phenotype, especially when the level of similarity approaches an optimal point that is similar to Self without causing a conscious acknowledgment of the similarity.

Concepts: Sexual selection, Choice, Similarity, Physical attractiveness, Assortative mating


Hashing is emerging as a powerful tool for building highly efficient indices in large-scale search systems. In this paper, we study spectral hashing (SH), which is a classical method of unsupervised hashing. In general, SH solves for the hash codes by minimizing an objective function that tries to preserve the similarity structure of the given data. Although computationally simple, SH very often performs unsatisfactorily and lags distinctly behind the state-of-the-art methods. We observe that the inferior performance of SH is mainly due to its imperfect formulation; that is, optimizing the minimization problem in SH cannot actually ensure that the similarity structure of the high-dimensional data is preserved in the low-dimensional hash code space. We therefore introduce reversed SH (ReSH), which is SH with its input and output interchanged. Unlike SH, which estimates the similarity structure from the given high-dimensional data, ReSH defines the similarities between data points according to the unknown low-dimensional hash codes. Equipped with such a reversal mechanism, ReSH can seamlessly overcome the drawback of SH. More precisely, the minimization problem in ReSH can be optimized if and only if similar data points are mapped to adjacent hash codes and, most importantly, dissimilar data points are considerably separated from each other in the code space. Finally, we solve the minimization problem in ReSH with multilayer neural networks and obtain state-of-the-art retrieval results on three benchmark data sets.
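
For readers unfamiliar with SH, the following Python sketch evaluates the kind of objective the abstract refers to: a sum over all pairs of an input-space affinity multiplied by the squared distance between the corresponding codes. The Gaussian affinity, the `eps` bandwidth, and the toy points and codes are assumptions made for illustration. The sketch only evaluates the objective for given codes; it does not solve for them and does not implement ReSH, whose formulation the abstract does not spell out.

```python
import math

def affinity(x_i, x_j, eps=1.0):
    """Assumed Gaussian affinity between two input points."""
    d2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-d2 / eps ** 2)

def sh_objective(points, codes, eps=1.0):
    """Sum over all ordered pairs of affinity * squared code distance."""
    total = 0.0
    for x_i, y_i in zip(points, codes):
        for x_j, y_j in zip(points, codes):
            code_d2 = sum((a - b) ** 2 for a, b in zip(y_i, y_j))
            total += affinity(x_i, x_j, eps) * code_d2
    return total

# Hypothetical data: two well-separated clusters of two points each.
POINTS = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
# One-bit codes in {-1, +1}: the first assignment follows the clusters, the second splits them.
CLUSTERED = [(-1,), (-1,), (1,), (1,)]
SPLIT = [(-1,), (1,), (-1,), (1,)]
CONSTANT = [(1,), (1,), (1,), (1,)]

if __name__ == "__main__":
    print(sh_objective(POINTS, CLUSTERED))
    print(sh_objective(POINTS, SPLIT))
    print(sh_objective(POINTS, CONSTANT))
```

In this toy setup the cluster-following codes score far lower than the splitting codes. The constant code scores exactly zero, which shows that such an objective needs additional constraints on the codes to be meaningful. Pairs with near-zero affinity contribute almost nothing regardless of how far apart their codes are.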

Concepts: Similarity, Input, Output, Hash table, Hash function, Cryptographic hash function