Journal: Journal of applied genetics


Cornelia de Lange syndrome (CdLS) is a rare multi-system genetic disorder characterised by growth and developmental delay, distinctive facial dysmorphism, limb malformations and multiple organ defects. The disease is caused by mutations in genes responsible for the formation and regulation of the cohesin complex. About half of the cases result from mutations in the NIPBL gene encoding delangin, a protein regulating the initialisation of cohesion. To date, approximately 250 point mutations have been identified in more than 300 CdLS patients worldwide. In the present study, conducted on a group of 64 unrelated Polish CdLS patients, 25 different NIPBL sequence variants, including 22 novel point mutations, were detected. Additionally, large genomic deletions on chromosome 5p13 encompassing the NIPBL gene locus were detected in two patients with the most severe CdLS phenotype. Taken together, 42 % of patients were found to have a deleterious alteration affecting the NIPBL gene, most of them private (89 %). A review of the types of mutations found so far in Polish patients, their frequency and their correlation with the severity of the observed phenotype shows that Polish CdLS cases do not differ significantly from those in other populations.

Concepts: DNA, Gene, Genetics, Mutation, Evolution, Chromosome, Locus, Cornelia de Lange Syndrome


We applied, for the first time, next-generation sequencing (NGS) technology to Egyptian mummies. Seven NGS datasets obtained from five randomly selected Third Intermediate to Graeco-Roman Egyptian mummies (806 BC-124 AD) and two unearthed pre-contact Bolivian lowland skeletons were generated and characterised. The datasets were contrasted with three recently published NGS datasets obtained from cold-climate regions, i.e. the Saqqaq, the Denisova hominid and the Alpine Iceman. Analysis was done using one million reads of each newly generated or published dataset. Blastn and megablast results were analysed using MEGAN software. Distinct NGS results were replicated by specific and sensitive polymerase chain reaction (PCR) protocols in laboratories dedicated to ancient DNA. Here, we provide unambiguous identification of authentic DNA in Egyptian mummies. The NGS datasets showed variable contents of endogenous DNA harboured in tissues. Three of five mummies displayed a human DNA proportion comparable to the human read count of the Saqqaq permafrost-preserved specimen. Furthermore, a metagenomic signature unique to mummies was displayed. By applying a “bacterial fingerprint”, discrimination between mummies and other remains from warm areas outside Egypt was possible. Owing to the absence of adequate environmental monitoring, a bacterial bloom was identified when different biopsies taken from the same mummies 1.5 years apart were analysed. Plant kingdom representation in all mummy datasets was unique and could be partially associated with the use of plants in embalming materials. Finally, NGS data showed the presence of Plasmodium falciparum and Toxoplasma gondii DNA sequences, indicating malaria and toxoplasmosis in these mummies. We demonstrate that endogenous ancient DNA can be extracted from mummies and serve as a proper template for the NGS technique, thus opening new pathways of investigation for future genome sequencing of ancient Egyptian individuals.

Concepts: DNA, Polymerase chain reaction, Molecular biology, Apicomplexa, DNA sequencing, Toxoplasmosis, Ancient Egypt, Mummy


A number of imprinted genes have been observed in plants, animals and humans. They not only control growth and developmental traits, but may also be responsible for survival traits. Based on the Cox proportional hazards (PH) model, we constructed a general parametric model for dissecting genomic imprinting, in which a baseline hazard function is selectable for fitting the effects of imprinted quantitative trait loci (iQTL) genotypes on the survival curve. An expectation-maximisation (EM) algorithm is derived to obtain the maximum likelihood estimates of iQTL parameters. The imprinting patterns of the detected iQTL are statistically tested under a series of null hypotheses. The Bayesian information criterion (BIC) is employed to choose an optimal baseline hazard function that combines a high likelihood with parsimonious parameterisation. We applied the proposed approach to analyse published data from an F2 population of mice and concluded that, among five commonly used survival distributions, the log-logistic distribution is the optimal baseline hazard function for the survival time of hyperoxic acute lung injury (HALI). Under this optimal model, five QTL were detected, of which four are imprinted, with different imprinting patterns.

Concepts: Scientific method, Gene, Genetics, Biology, Proportional hazards models, Survival analysis, Hypothesis, Akaike information criterion
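
The BIC-based choice of a baseline distribution described in the abstract above can be illustrated with a much simpler setting. The following is a minimal sketch, not the authors' iQTL model: it assumes uncensored survival times, no covariates, and only two candidate families (exponential and log-normal) whose maximum likelihood estimates have closed forms. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def exponential_loglik(times):
    """Maximised log-likelihood of an exponential model (1 parameter)."""
    n = len(times)
    rate = n / sum(times)  # closed-form MLE of the rate
    return n * math.log(rate) - rate * sum(times)


def lognormal_loglik(times):
    """Maximised log-likelihood of a log-normal model (2 parameters)."""
    n = len(times)
    logs = [math.log(t) for t in times]
    mu = sum(logs) / n                           # MLE of the log-scale mean
    var = sum((x - mu) ** 2 for x in logs) / n   # MLE of the log-scale variance
    return -sum(logs) - 0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * n


def select_by_bic(times):
    """Return the name of the lowest-BIC candidate and all BIC scores."""
    n = len(times)
    scores = {
        "exponential": bic(exponential_loglik(times), 1, n),
        "log-normal": bic(lognormal_loglik(times), 2, n),
    }
    best = min(scores, key=scores.get)
    return best, scores


# Invented toy survival times (not data from the study).
toy_times = [3.1, 4.7, 5.2, 6.0, 6.4, 7.9, 9.3, 12.5]
best_model, all_scores = select_by_bic(toy_times)
print(best_model, all_scores)
```

The penalty term `n_params * log(n_obs)` is what rewards parsimony: a family with more parameters must improve the log-likelihood enough to offset it. Handling censoring, genotype effects, or further candidate distributions would require numerical optimisation rather than the closed forms used here.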


The genetic improvement of reproductive traits such as the number of teats is essential to the success of the pig industry. In contrast to most SNP association studies, which consider continuous phenotypes under Gaussian assumptions, this trait is a discrete variable that could potentially follow other distributions, such as the Poisson. Therefore, to handle the complexity of a count random regression that considers all SNPs simultaneously as covariates in a GWAS model, Bayesian inference tools become necessary. Another point that currently deserves to be highlighted in GWAS is the genetic dissection of complex phenotypes through candidate gene networks derived from significant SNPs. We present a full Bayesian treatment of SNP association analysis for number of teats, assuming alternatively Gaussian and Poisson distributions for this trait. Under this framework, significant SNP effects were identified by hypothesis tests using 95 % highest posterior density intervals. These SNPs were used to construct associated candidate gene networks aiming to explain the genetic mechanism behind this reproductive trait. The Bayesian model comparisons based on the posterior distribution of the deviance indicated the superiority of the Gaussian model. In general, our results suggest the presence of 19 significant SNPs, which mapped to 13 genes. In addition, we predicted gene interactions through networks that are consistent with known mammalian breast biology (e.g., development of prolactin receptor signaling, and cell proliferation), captured known regulation binding sites, and provided candidate genes for that trait (e.g., TINAGL1 and ICK).

Concepts: DNA, Scientific method, Gene, Biology, Organism, Prediction interval, Bayes' theorem, Bayesian statistics
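
The abstract above declares SNP effects significant using 95 % highest posterior density (HPD) intervals. As a minimal sketch of that idea, assuming the posterior of one SNP effect is available as a list of MCMC draws and is unimodal, an empirical HPD interval can be taken as the narrowest window covering the requested share of the sorted draws. The function names `hpd_interval` and `excludes_zero` are hypothetical, and this is not the authors' implementation.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the draws.

    Assumes a unimodal posterior; for multimodal posteriors the HPD
    region may be a union of intervals, which this sketch ignores.
    """
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of draws the window must hold
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lo, hi = interval
    return lo > 0 or hi < 0


# Small invented set of draws, using mass=0.5 to keep the example short.
draws = [30, 0, 12, 10, 14, 11]
interval = hpd_interval(draws, mass=0.5)
print(interval, excludes_zero(interval))
```

Under this decision rule, a SNP effect would be flagged when its 95 % HPD interval does not contain zero. Established libraries offer tested implementations of the same computation and are preferable for real analyses.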


Dysfunctions of RNA processing and mutations of RNA binding proteins (RBPs) play a fundamental role in the pathogenesis of many neurodegenerative diseases. To elucidate the function of RNA processing and RBP mutations in neuronal cells and to increase our understanding of the pathogenic mechanisms of neurodegeneration, I have reviewed recent advances in RNA processing-associated molecular mechanisms of neurodegenerative diseases, including RBP-mediated dysfunction of RNA processing, dysfunctional microRNA (miRNA)-based regulation of gene expression, and oxidative RNA modification. I focus on neurodegeneration induced by RBP mutations, by dysfunction of miRNA regulation, and by oxidized RNAs within neurons, and discuss how these dysfunctions contribute pathologically to neurodegenerative diseases. The advances reviewed here will be valuable to basic investigation and to the clinical application of targeted diagnostic tests and therapies.

Concepts: DNA, Gene, Genetics, Cell nucleus, Gene expression, Molecular biology, RNA, Messenger RNA


Parkinson’s disease (PD) is a common neurodegenerative disorder affecting mostly elderly people, although a group of patients develops so-called early-onset PD (EOPD). Mutations in the PARK2 gene are a common cause of autosomal recessive EOPD. PARK2 belongs to the family of extremely large human genes which are often localised in genomic common fragile sites (CFSs) and exhibit gross instability. PARK2 is located in the centre of FRA6E, the third most mutation-susceptible CFS of the human genome. The gene encompasses a region of 1.3 Mbp and, among its mutations, large rearrangements of single or multiple exons account for around 50 %. We performed an analysis of the PARK2 gene in a group of 344 PD patients with EOPD or the classical form of the disease. Copy number changes were first identified using multiplex ligation-dependent probe amplification (MLPA), with their ranges characterised by array comparative genomic hybridisation (aCGH). Exact breakpoints were mapped using direct sequencing. Rearrangements were found in eight subjects, including five deletions and three duplications. Rearrangements were mostly non-recurrent and no repetitive sequences or extended homologies were identified in the regions flanking breakpoint junctions. However, in most cases, 1-3 bp microhomologies were present, strongly suggesting that microhomology-mediated mechanisms, specifically non-homologous end joining (NHEJ) and fork stalling and template switching (FoSTeS)/microhomology-mediated break-induced replication (MMBIR), are predominantly involved in the rearrangement processes in this genomic region.

Concepts: DNA, Gene, Genetics, Copy number variation, Human genome, Human Genome Project, Genome, Genomics
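
The 1-3 bp microhomologies mentioned in the abstract above are short stretches of identical sequence shared by the two breakpoints of a rearrangement, so that the exact junction position is ambiguous by that many bases. The following is a minimal sketch of how such a length could be measured for a simple deletion, assuming the reference sequence and 0-based breakpoint coordinates are known. The function name, the coordinate convention and the toy sequences are hypothetical, and this is not the procedure used in the study.

```python
def deletion_microhomology(ref, start, end):
    """Length of microhomology at a deletion junction.

    The deleted segment is ref[start:end] (0-based, end exclusive).
    Bases are counted while the breakpoint can be shifted right or left
    without changing the resulting junction sequence.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("expected 0 <= start < end <= len(ref)")
    max_len = end - start  # never count more bases than were deleted

    # Shift right: first deleted bases match the bases just after the deletion.
    right = 0
    while (right < max_len and end + right < len(ref)
           and ref[start + right] == ref[end + right]):
        right += 1

    # Shift left: last deleted bases match the bases just before the deletion.
    left = 0
    while (left < max_len and start - 1 - left >= 0
           and ref[start - 1 - left] == ref[end - 1 - left]):
        left += 1

    return min(left + right, max_len)


# Invented toy sequences: flank "TTT", deleted "ACGGGGG", then the right flank.
with_homology = "TTT" + "ACGGGGG" + "ACGCCC"     # "ACG" repeats at the junction
without_homology = "TTT" + "ACGGGGG" + "TTACCC"
print(deletion_microhomology(with_homology, 3, 10))     # 3
print(deletion_microhomology(without_homology, 3, 10))  # 0
```

Duplications and complex rearrangements need a different junction model, and real breakpoint analysis would also have to account for inserted bases and sequencing errors.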


For the last 40 years, “Sanger sequencing” has allowed researchers to unveil crucial secrets of life. However, this method of sequencing is time-consuming and laborious, and remains expensive even today. The Human Genome Project was a huge impulse to improve sequencing technologies, and an unprecedented financial and human effort prompted the development of cheaper high-throughput technologies and strategies called next-generation sequencing (NGS) or whole genome sequencing (WGS). This review discusses applications of high-throughput methods to study bacteria in a much broader context than simply their genomes. For a microbiologist, the major goal of NGS is not merely to resolve another circular genomic sequence. NGS began with basic structural and functional genomics and has matured into molecular taxonomy, phylogenetics and advanced comparative genomics. Today, the use of NGS has expanded the capabilities of diagnostic microbiology and epidemiology. RNA sequencing techniques allow the complex regulatory processes in bacterial cells to be studied in detail. Finally, NGS is a key technique for studying the organization of bacterial life, from complex communities to single cells. The major challenge in understanding genomic and transcriptomic data today lies in combining them with other sources of global data such as the proteome and metabolome, which hopefully will lead to the reconstruction of the regulatory networks within bacterial cells that allow communication with the environment (signalome and interactome) and to virtual cell reconstruction.


Nearly all bacterial species, including pathogens, have the ability to form biofilms. Biofilms are defined as structured ecosystems in which microbes are attached to surfaces and embedded in a matrix composed of polysaccharides, eDNA, and proteins, and their development is a multistep process. Bacterial biofilms constitute a large medical problem due to their extremely high resistance to various types of therapeutics, including conventional antibiotics. Several environmental and genetic signals control every step of biofilm development and dispersal. Among the latter, quorum sensing, cyclic diguanosine-5'-monophosphate, and small RNAs are considered the main regulators. The present review describes the regulatory role of these three factors in the life cycles of biofilms built by Pseudomonas aeruginosa, Staphylococcus aureus, Salmonella enterica serovar Typhimurium, and Vibrio cholerae. The interconnections between their activities are shown. Compounds and strategies that target the activity of these regulators, mainly quorum sensing inhibitors, and their potential role in therapy are also assessed.

Concepts: Bacteria, Microbiology, Antibiotic resistance, Pseudomonas aeruginosa, Polysaccharide, Biofilm, Quorum sensing, Salmonella enterica


Leaf rust caused by Puccinia triticina is one of the most dangerous fungal diseases of wheat (Triticum aestivum L.) and causes large yield losses every year. Here we report a multiplex polymerase chain reaction (PCR) assay developed for the detection of two important wheat slow rust resistance genes, Lr34 and Lr46, using two molecular markers: csLV34 and Xwmc44, respectively. The presence of the genes was analyzed in one winter wheat variety, TX89D6435, and five spring wheat varieties: Pavon F76, Parula 'S', Rayon 89, Kern, and Mochis 88. Both Lr34 and Lr46 were identified in variety TX89D6435; Lr34 was also identified in the Parula 'S' and Kern varieties, and Lr46 in the Pavon F76 and Mochis 88 varieties. Neither of the resistance genes tested was detected in the Rayon 89 variety. The multiplex PCR method made it possible to shorten the analysis time, reduce the cost of analyses, and reduce the workload.


Agriculture will benefit from a rigorous characterization of genes for adult plant resistance (APR), since this gene class has been recognized as providing more durable protection from plant diseases. The present study reports the identification of APR loci to powdery mildew in the German winter wheat cultivars Cortez and Atlantis. Cortez was previously shown to carry the all-stage resistance gene Pm3e. To avoid interference of Pm3e in APR studies, line 6037 from the doubled-haploid (DH) population Atlantis/Cortez, which lacked Pm3e but showed field resistance, was used in two backcrosses to Atlantis to establish the DH population 6037/Atlantis//Atlantis. APR was assessed in the greenhouse 10, 15, and 20 days after inoculation (dai) from the 4-leaf stage onwards and combined with single-nucleotide polymorphism data in a genome-wide association study (GWAS) and a linkage map-based quantitative trait loci (QTL) analysis. In GWAS, two QTL were detected: one on chromosome 1BL 10 dai, the other on chromosome 2BL 20 dai. In conventional QTL analysis, both QTL were detected with all three disease ratings: the QTL on chromosome 1BL explained a maximum of 35.2% of the phenotypic variation 10 dai, whereas the QTL on chromosome 2BL explained a maximum of 43.5% of the phenotypic variation 20 dai. Compared with GWAS, linkage map-based QTL analysis made it possible to follow the dynamics of QTL action. The two large-effect QTL for APR to powdery mildew with dynamic gene action can be useful for the enhancement of wheat germplasm.