Concept: Cdx protein family


Protein-lipid interactions are increasingly recognized as important in numerous biological processes, and bioinformatics is increasingly used as a tool for studying them. In particular, recently developed approaches that recognize lipid-binding regions in proteins can be applied. In this study, one such approach, specialized in identifying lipid-binding helical regions in proteins, is expanded with features that can easily be obtained manually. Several members of the amphitropic protein family were investigated to demonstrate these additional features. The results seem to indicate interesting characteristics of amphitropic proteins and provide insight into the mechanistic functioning and overall understanding of this intriguing class of proteins. Additionally, the results demonstrate that the presented approach might be either a starting point for protein-lipid interaction studies or a good tool for selecting new focus points for more detailed experimental research on proteins with known overall protein-lipid binding abilities.

Concepts: DNA, Proteins, Protein, Bioinformatics, Molecular biology, Metabolism, Proteome, Cdx protein family


Protein domains are commonly used to assess the functional roles and evolutionary relationships of proteins and protein families. Here, we use the Pfam protein family database to examine a set of candidate partial domains. Pfam protein domains are often thought of as evolutionarily indivisible, structurally compact units from which larger functional proteins are assembled; however, almost 4% of Pfam27 PfamA domains are shorter than 50% of their family model length, suggesting that more than half of the domain is missing at those locations. To better understand the structural nature of partial domains in proteins, we examined 30,961 partial domain regions from 136 domain families contained in a representative subset of PfamA domains (RefProtDom2 or RPD2).

Concepts: Protein structure, Bioinformatics, Evolution, Biology, Phylogenetic tree, Protein domain, Cdx protein family, Protein family
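
The partial-domain criterion in the Pfam abstract above (an aligned region shorter than 50% of its family model length) can be illustrated with a minimal Python sketch. The `DomainHit` record, its field names, and the example family labels are hypothetical and exist only for illustration; they are not Pfam's data format or the authors' code.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    """Hypothetical record for one domain match (not a Pfam format)."""
    family: str        # illustrative family label
    model_start: int   # first model position covered (1-based, inclusive)
    model_end: int     # last model position covered (inclusive)
    model_length: int  # full length of the family model


def model_coverage(hit: DomainHit) -> float:
    """Fraction of the family model covered by the aligned region."""
    return (hit.model_end - hit.model_start + 1) / hit.model_length


def is_partial(hit: DomainHit, threshold: float = 0.5) -> bool:
    """True when the hit covers strictly less than `threshold` of the model."""
    return model_coverage(hit) < threshold


# Made-up example hits: full length, 41% coverage, exactly 50% coverage.
hits = [
    DomainHit("EXAMPLE_A", 1, 100, 100),
    DomainHit("EXAMPLE_A", 60, 100, 100),
    DomainHit("EXAMPLE_B", 1, 50, 100),
]
partial = [h for h in hits if is_partial(h)]
```

Only the second hit is flagged here, because the comparison is strict: a hit covering exactly half of the model is not counted as "shorter than 50%".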


From yeast to mammals, autophagy is an important mechanism for sustaining cellular homeostasis through facilitating the degradation and recycling of aged and cytotoxic components. During autophagy, cargo is captured in double-membraned vesicles, the autophagosomes, and degraded through lysosomal fusion. In yeast, autophagy initiation, cargo recognition, cargo engulfment, and vesicle closure are Atg8 dependent. In higher eukaryotes, Atg8 has evolved into the LC3/GABARAP protein family consisting of 7 family proteins [LC3A (2 splice variants), LC3B, LC3C, GABARAP, GABARAPL1, and GABARAPL2]. LC3B, the most studied family protein, is associated with autophagosome development and maturation and is used to monitor autophagic activity. Given the high homology, the other LC3/GABARAP family proteins are often presumed to fulfill similar functions. Nevertheless, substantial evidence shows that the LC3/GABARAP family proteins are unique in function and important in autophagy-independent mechanisms. In this review, we discuss the current knowledge and function(s) of the LC3/GABARAP family proteins. We focus on processing of the individual family proteins and their role in autophagy initiation, cargo recognition, vesicle closure, and trafficking, a complex and tightly regulated process that requires selective presentation and recruitment of these family proteins. In addition, functions unrelated to autophagy of the LC3/GABARAP protein family members are discussed. Schaaf, M. B. E., Keulers, T. G., Vooijs, M. A., Rouschop, K. M. A. LC3/GABARAP family proteins: autophagy-(un)related functions.

Concepts: DNA, Protein, Cell, Bioinformatics, Metabolism, Endoplasmic reticulum, Cdx protein family, Protein family


HAMAP (High-quality Automated and Manual Annotation of Proteins) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.

Concepts: DNA, Proteins, Protein, Protein structure, Bioinformatics, Curator, Cdx protein family, Protein family


Pentatricopeptide repeat (PPR) proteins are one of the major protein families in flowering plants, containing around 450 members. They participate in RNA editing and are related to plant growth, development and reproduction, as well as to responses to ABA and abiotic stresses. Their characteristics have been described in silico; however, relatively little is known about their biochemical properties. Different types of PPR proteins, with different tasks in RNA editing, have been suggested to interact in an editosome to complete RNA editing. Other non-PPR editing factors, such as the multiple organellar RNA editing factors and the organelle RNA recognition motif-containing protein family, have also been described in plants. However, while evidence on protein interactions between non-PPR RNA editing proteins is accumulating, very few PPR protein interactions have been reported, possibly due to their high instability. In this manuscript, we aimed to optimize the conditions for non-denaturing protein extraction of PPR proteins allowing in vivo protein analyses, such as interaction assays by co-immunoprecipitation. The unusually high protein degradation rate, the aggregation properties and the high pI, as well as the ATP-dependence of some PPR proteins, are key aspects to be considered when extracting PPR proteins in a non-denatured state. During extraction of PPR proteins, the use of proteasome and phosphatase inhibitors is critical. Using the ATP cofactor considerably reduces the degradation of PPR proteins. A short centrifugation step to discard cell debris is essential to avoid PPR precipitation, while in some cases the addition of a reductant is needed, probably because of the pI/pH context. This work provides an easy and rapid optimized non-denaturing total protein extraction protocol from plant tissue, suitable for polypeptides of the PPR family.

Concepts: DNA, Proteins, Protein, Cell nucleus, Cell, Amino acid, Ribosome, Cdx protein family


Active molecules among numerous chemical structures in a chemical database can be searched easily by statistical prediction of compound-protein interactions. However, constructing a simple prediction model against one protein does not aid drug design, because detecting chemical structures that act similarly against multiple proteins is necessary for preventing side effects of the potential drug. To tackle this problem, we propose a new method that visualizes chemical and protein spaces. For simultaneous visualization of both spaces, we employ a counterpropagation neural network (CPNN) and develop a new visualization method named multi-input CPNN (MICPNN). In a case study of the kinase protein family, the MICPNN model accurately predicted the complex relationships between compounds and proteins. The proposed method identified chemical structures with promising activity against kinases. Our proposed method is also applicable to other protein families, such as G-protein coupled receptors, ion channels and transporters.

Concepts: Scientific method, Protein, Bioinformatics, Signal transduction, Enzyme, Cell membrane, Receptor, Cdx protein family


The retinoblastoma protein (Rb) and the homologous pocket proteins p107 and p130 negatively regulate cell proliferation by binding and inhibiting members of the E2F transcription factor family. The structural features that distinguish Rb from other pocket proteins have been unclear but are critical for understanding their functional diversity and determining why Rb has unique tumor suppressor activities. We describe here important differences in how the Rb and p107 C-terminal domains (CTDs) associate with the coiled-coil and marked-box domains (CMs) of E2Fs. We find that although CTD-CM binding is conserved across protein families, Rb and p107 CTDs show clear preferences for different E2Fs. A crystal structure of the p107 CTD bound to E2F5 and its dimer partner DP1 reveals the molecular basis for pocket protein-E2F binding specificity and how cyclin-dependent kinases differentially regulate pocket proteins through CTD phosphorylation. Our structural and biochemical data together with phylogenetic analyses of Rb and E2F proteins support the conclusion that Rb evolved specific structural motifs that confer its unique capacity to bind with high affinity those E2Fs that are the most potent activators of the cell cycle.

Concepts: DNA, Proteins, Protein, Cell nucleus, Molecular biology, Signal transduction, Cell cycle, Cdx protein family


Nucleotide-binding domain and leucine-rich repeat domain-containing (NLR) proteins are sentinels of plant immunity that monitor host proteins for perturbations induced by pathogenic effector proteins. Here we show that the Arabidopsis ZAR1 NLR protein requires the ZRK3 kinase to recognize the Pseudomonas syringae type III effector (T3E) HopF2a. These results support the hypothesis that ZAR1 associates with an expanded ZRK protein family to broaden its effector recognition spectrum.

Concepts: Proteins, Protein structure, Signal transduction, Enzyme, Cdx protein family, Pseudomonas syringae


In functionally diverse protein families, conservation in short signature regions may outperform full-length sequence comparisons for identifying proteins that belong to a subgroup within which one specific aspect of their function is conserved. The SIMBAL workflow (Sites Inferred by Metabolic Background Assertion Labeling) is a data-mining procedure for finding such signature regions. It begins by using clues from genomic context, such as co-occurrence or conserved gene neighborhoods, to build a useful training set from a large number of uncharacterized but mutually homologous proteins. When training set construction is successful, the YES partition is enriched in proteins that share function with the user’s query sequence, while the NO partition is depleted. A selected query sequence is then mined for short signature regions whose closest matches overwhelmingly favor proteins from the YES partition. High-scoring signature regions typically contain key residues critical to functional specificity, so proteins with the highest sequence similarity across these regions tend to share the same function. The SIMBAL algorithm was described previously, but significant manual effort, expertise, and a supporting software infrastructure were required to prepare the requisite training sets. Here, we describe a new, distributable software suite that speeds up and simplifies the process for using SIMBAL, most notably by providing tools that automate training set construction. These tools have broad utility for comparative genomics, allowing for flexible collection of proteins or protein domains based on genomic context as well as homology, a capability that can greatly assist in protein family construction. Armed with this new software suite, SIMBAL can serve as a fast and powerful in silico alternative to direct experimentation for characterizing proteins and their functional interactions.

Concepts: DNA, Gene, Genetics, Bioinformatics, Evolution, Enzyme, Cdx protein family, Sequence clustering
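
The signature-region idea in the SIMBAL abstract above (slide a window along the query and ask whether its closest matches come mostly from the YES partition) can be sketched in Python as follows. This is a deliberately simplified toy and not the SIMBAL software. It assumes ungapped per-window identity as the similarity measure and a plain YES fraction among the top matches as the score; the function names, parameters, and sequences are all invented for illustration.

```python
def best_identity(window: str, seq: str) -> float:
    """Best ungapped identity of `window` against any same-length slice of `seq`."""
    w = len(window)
    best = 0
    for i in range(len(seq) - w + 1):
        matches = sum(a == b for a, b in zip(window, seq[i:i + w]))
        best = max(best, matches)
    return best / w


def window_scores(query, yes_seqs, no_seqs, width=5, top_k=3):
    """For each query window, return (start, window, YES fraction of top_k matches)."""
    labelled = [(s, True) for s in yes_seqs] + [(s, False) for s in no_seqs]
    scores = []
    for start in range(len(query) - width + 1):
        window = query[start:start + width]
        # Rank all training sequences by similarity to this window.
        ranked = sorted(
            labelled,
            key=lambda item: best_identity(window, item[0]),
            reverse=True,
        )
        top = ranked[:top_k]
        yes_fraction = sum(is_yes for _, is_yes in top) / len(top)
        scores.append((start, window, yes_fraction))
    return scores


# Made-up toy data: YES sequences share the motif "CHWDE" with the query,
# NO sequences share only the flanking residues.
query = "AAAAACHWDEGGGGG"
yes_seqs = ["TTTTTCHWDETTTTT", "SSSSSCHWDESSSSS", "PPPPPCHWDEPPPPP"]
no_seqs = ["AAAAAKLMNPGGGGG", "AAAAARQSTVGGGGG", "AAAAAFIYLVGGGGG"]

scores = window_scores(query, yes_seqs, no_seqs)
signature_windows = [(start, win) for start, win, frac in scores if frac == 1.0]
```

In this toy data the windows overlapping the shared motif are matched best by YES sequences, while the flanking windows are matched best by NO sequences. A real workflow would use a proper homology search and a statistical score rather than a raw fraction.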


Solving the structure of Pla l 1 elucidated the preserved fold of Ole e 1-like proteins while IgE cross-reactivity in this family is limited to molecules with high sequence identity. Diagnostic accuracy using source-specific Ole e 1-like molecules is essential for discriminating plantain from other pollen allergies.

Concepts: Immune system, DNA, Protein, Bioinformatics, Asthma, Allergy, Sequence alignment, Cdx protein family