Concept: Variable number tandem repeat


Nine burials excavated from the Magdalen Hill Archaeological Research Project (MHARP) in Winchester, UK, showing skeletal signs of lepromatous leprosy (LL) have been studied using a multidisciplinary approach including osteological, geochemical and biomolecular techniques. DNA from Mycobacterium leprae was amplified from all nine skeletons but not from control skeletons devoid of indicative pathology. In several specimens we corroborated the identification of M. leprae with detection of mycolic acids specific to the cell wall of M. leprae and persistent in the skeletal samples. In five cases, the preservation of the material allowed detailed genotyping using single-nucleotide polymorphism (SNP) and multiple locus variable number tandem repeat analysis (MLVA). Three of the five cases proved to be infected with SNP type 3I-1, ancestral to contemporary M. leprae isolates found in southern states of America and likely carried by European migrants. From the remaining two burials we identified, for the first time in the British Isles, the occurrence of SNP type 2F. Stable isotope analysis conducted on tooth enamel taken from two of the type 3I-1 and one of the type 2F remains revealed that all three individuals had probably spent their formative years in the Winchester area. Previously, type 2F has been implicated as the precursor strain that migrated from the Middle East to India and South-East Asia, subsequently evolving to type 1 strains. Thus we show that type 2F had also spread westwards to Britain by the early medieval period.

Whole-genome sequencing is performed routinely as a means to identify polymorphic genetic loci such as short tandem repeat loci. We have developed a simple tool, called pSTR Finder, which is freely available as a means of identifying putative polymorphic short tandem repeat (STR) loci from data generated from genome-wide sequences. The program performs cross comparisons on the STR sequences generated using the Tandem Repeats Finder based on multiple-genome samples in a FASTA format. These comparisons generate reports listing identical, polymorphic, and different STR loci when comparing two samples.
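
The two-sample comparison described above can be illustrated with a minimal Python sketch. It is not the pSTR Finder implementation: the dictionary layout, the function name `compare_str_loci`, and the working definitions of the three report categories (identical = same motif and copy number; polymorphic = same motif, different copy number; different = motif differs) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: locus identifier -> (repeat motif, copy number).
StrCalls = Dict[str, Tuple[str, float]]


def compare_str_loci(sample_a: StrCalls, sample_b: StrCalls) -> Dict[str, List[str]]:
    """Classify loci shared by two samples into three assumed categories."""
    report: Dict[str, List[str]] = {"identical": [], "polymorphic": [], "different": []}
    # Only loci called in both samples are compared; the rest are ignored here.
    for locus in sorted(set(sample_a) & set(sample_b)):
        motif_a, copies_a = sample_a[locus]
        motif_b, copies_b = sample_b[locus]
        if motif_a != motif_b:
            report["different"].append(locus)
        elif copies_a == copies_b:
            report["identical"].append(locus)
        else:
            report["polymorphic"].append(locus)
    return report


# Invented toy data, not real genotypes.
sample_a = {"locus1": ("AC", 12.0), "locus2": ("GATA", 8.0),
            "locus3": ("CAG", 5.0), "locus4": ("AT", 7.0)}
sample_b = {"locus1": ("AC", 12.0), "locus2": ("GATA", 10.0),
            "locus3": ("CTG", 5.0)}
report = compare_str_loci(sample_a, sample_b)
print(report)
```

In practice the locus identifiers would have to be derived from something stable across genomes, such as flanking sequence; how the tool matches loci between samples is not stated in the text above.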

Brucellosis is the most common bacterial zoonosis worldwide. Bovine brucellosis caused by Brucella abortus has far-reaching animal health and economic impacts at both the local and national levels. Alongside traditional veterinary epidemiology, molecular typing has recently been applied to inform on bacterial population structure and identify epidemiologically linked cases of infection. Multi-locus variable number tandem repeat (VNTR) analysis (MLVA) was used to investigate the molecular epidemiology of a well-characterised Brucella abortus epidemic in Northern Ireland involving 387 herds between 1991 and 2012.

Nine-banded armadillos (Dasypus novemcinctus) are naturally infected with Mycobacterium leprae and have been implicated in zoonotic transmission of leprosy. Early studies found this disease mainly in Texas and Louisiana, but armadillos in the southeastern United States appeared to be free of infection. We screened 645 armadillos from 8 locations in the southeastern United States not known to harbor enzootic leprosy for M. leprae DNA and antibodies. We found M. leprae-infected armadillos at each location, and 106 (16.4%) animals had serologic/PCR evidence of infection. Using single-nucleotide polymorphism and variable number tandem repeat genotyping/genome sequencing, we detected M. leprae genotype 3I-2-v1 among 35 armadillos. Seven armadillos harbored a newly identified genotype (3I-2-v15). In comparison, 52 human patients from the same region were infected with 31 M. leprae types. However, 42.3% (22/52) of patients were infected with 1 of the 2 M. leprae genotype strains associated with armadillos. The geographic range and complexity of zoonotic leprosy are expanding.

Tandemly repeated sequences are a common feature of vertebrate mitochondrial DNA control regions. However, questions still remain about their mode of evolution and function. To better understand patterns of variation in length and to explore the existence of previously described domains, we have characterized the control region structure of the Amazonian ornamental fish Nannostomus eques and Nannostomus unifasciatus. The control region ranged from 1121 to 1142 bp in length and could be separated into three domains: the domain associated with the extended terminal associated sequences, the central conserved domain, and the conserved sequence blocks domain. In the first domain, we encountered a sequence repeated 10 times in tandem (variable number tandem repeat (VNTR)) that could adopt an “inverted repetitions” type structural conformation. The results suggest that the VNTR pattern encountered in both N. eques and N. unifasciatus is consistent with the prerequisites of the illegitimate elongation model, in which the unequal pairing of the chains near the 5'-end of the control region favors the formation of repetitions.
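
To make the notion of a motif "repeated 10 times in tandem" concrete, the following Python sketch counts the longest uninterrupted run of exact copies of a motif in a sequence. The function name, the motif, and the toy sequence are hypothetical and are not taken from the Nannostomus control region; real VNTR detection also tolerates imperfect copies, which this sketch does not.

```python
def max_tandem_copies(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back exact copies of `motif`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    k = len(motif)
    best = 0
    # Try every start position so overlapping candidate runs are not missed.
    for start in range(len(sequence) - k + 1):
        copies = 0
        pos = start
        while sequence.startswith(motif, pos):
            copies += 1
            pos += k
        best = max(best, copies)
    return best


# Invented example: flanking bases around ten copies of a two-base motif.
toy_sequence = "GGC" + "AT" * 10 + "TTG"
copies = max_tandem_copies(toy_sequence, "AT")
print(copies)
```

Scanning every start position is quadratic in the worst case but keeps the logic easy to check; for genome-scale input a dedicated repeat finder would be the more sensible choice.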

Many episodes of canine brucellosis in dog kennels have been reported, but recently an outbreak that involved pets and their owners has been described. The purpose of this study was to confirm that the outbreak had a common source and to evaluate the evolution of 4 dogs involved in this outbreak after the measures implemented, which included a survey of 41 animals from the same area. Variable number tandem repeat (VNTR) analysis indicated that the B. canis isolated from the human clustered together with the isolates collected from the canine pups. Two dogs continued with bacteremia after the first antibiotic therapy, and from one of them B. canis was also isolated from urine, showing the importance of the latter in the dissemination of infection. In an effort to protect the public, stray dogs should be controlled and educational programs about the risk of this zoonotic disease should be implemented.

Motivation: Microsatellites are among the most useful genetic markers in population biology. High-throughput sequencing of microsatellite-enriched libraries dramatically expedites the traditional process of screening recombinant libraries for microsatellite markers. However, sorting through millions of reads to distill high-quality polymorphic markers requires special algorithms tailored to tolerate sequencing errors in locus reconstruction, distinguish paralogous loci, rarify raw reads originating from the same amplicon and sort out various artificial fragments resulting from recombination or concatenation of auxiliary adapters. Existing programs warrant improvement. Results: We describe a microsatellite prediction framework named HighSSR for microsatellite genotyping based on high-throughput sequencing. We demonstrate the utility of HighSSR in comparison to Roche gsAssembler on two Roche 454 GS FLX runs. The majority of the HighSSR-assembled loci were reliably mapped against model organism reference genomes. HighSSR demultiplexes pooled libraries, assesses locus polymorphism and implements Primer3 for the design of PCR primers flanking polymorphic microsatellite loci. As sequencing costs drop and permit the analysis of all project samples on next-generation platforms, this framework can also be used for direct simple sequence repeats genotyping. Supplementary information: Supplementary data are available at Bioinformatics online.

In this work, we report a novel polymerase chain reaction (PCR)-free variable number tandem repeat (VNTR) typing method using a T-shape gold nanoparticle-DNA monoconjugate, called “watching-gene assay”. The T-shape DNA probe was synthesized by “click” chemistry and linked with the gold nanoparticle to form the gold nanoparticle-DNA monoconjugate (a VNTR probe). Through a simple annealing and ligation reaction of the VNTR probe on a synthetic DNA template mimicking the human D1S80 VNTR locus, the number of tandem repeat units could be deciphered by counting the self-assembled gold nanoparticles. The number of tandem repeat units could be identified with more than 50% yield if the repeat number was less than four. In the case of real human genomic DNA, a repeat number of 18 was successfully revealed by observing an 18-gold-nanoparticle cluster, which corresponded exactly to the number of tandem repeats in the real sample. Our “watching-gene assay” is rapid, simple, and direct for data interpretation, thereby providing an advanced PCR-free genetic polymorphism analysis platform.

Match probability calculation is deemed much more intricate for lineage genetic markers, including Y-chromosomal short tandem repeats (Y-STRs), than for autosomal markers. This is because, owing to the lack of recombination, strong interdependence between markers is likely, which implies that haplotype frequency estimates cannot simply be obtained through the multiplication of allele frequency estimates. As yet, however, the practical relevance of this problem has not been studied in much detail using real data. In fact, such scrutiny appears well warranted because the high mutation rates of Y-STRs and the possibility of backward mutation should have worked against the statistical association of Y-STRs. We examined haplotype data of 21 markers included in the PowerPlex® Y23 set (PPY23, Promega Corporation, Madison, WI) originating from six different populations (four European and two Asian). Assessing the conditional entropies of the markers, given different subsets of markers from the same panel, we demonstrate that the PowerPlex® Y23 set cannot be decomposed into smaller marker subsets that would be (conditionally) independent. Nevertheless, in all six populations, >94% of the joint entropy of the 21 markers is explained by the seven most rapidly mutating markers. Although this result might render a reduction in marker number a sensible option for practical casework, the partial haplotypes would still be almost as diverse as the full haplotypes. Therefore, match probability calculation remains difficult and calls for the improvement of currently available methods of haplotype frequency estimation.
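
The conditional entropy of a marker X given a marker subset S can be written as H(X | S) = H(X, S) - H(S). The sketch below estimates it from haplotype counts, assuming haplotypes are stored as tuples of allele values and using the simple plug-in (observed-frequency) estimator. That estimator is an assumption, is biased for small samples, and is not necessarily the one used in the study; the function names and toy haplotypes are hypothetical.

```python
from collections import Counter
from math import log2
from typing import List, Sequence, Tuple

Haplotype = Tuple[int, ...]  # one allele value per marker (assumed layout)


def entropy(haplotypes: Sequence[Haplotype], markers: Sequence[int]) -> float:
    """Plug-in joint entropy (bits) of the selected marker columns."""
    counts = Counter(tuple(h[m] for m in markers) for h in haplotypes)
    n = len(haplotypes)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def conditional_entropy(haplotypes: Sequence[Haplotype],
                        target: int, given: Sequence[int]) -> float:
    """H(target | given) = H(target, given) - H(given)."""
    return entropy(haplotypes, list(given) + [target]) - entropy(haplotypes, given)


# Invented toy data: marker 1 is fully determined by marker 0 in `linked`,
# and statistically independent of it in `unlinked`.
linked: List[Haplotype] = [(14, 30), (14, 30), (15, 31), (15, 31)]
unlinked: List[Haplotype] = [(14, 30), (14, 31), (15, 30), (15, 31)]

h_linked = conditional_entropy(linked, target=1, given=[0])
h_unlinked = conditional_entropy(unlinked, target=1, given=[0])
print(h_linked, h_unlinked)
```

A conditional entropy close to the marker's unconditional entropy indicates that the conditioning subset carries little information about it; a value near zero indicates strong dependence, which is why allele frequencies cannot simply be multiplied.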

Whole genome sequencing (WGS) technology holds great promise as a tool for the forensic epidemiology of bacterial pathogens. It is likely to be particularly useful for studying the transmission dynamics of an observed epidemic involving a largely unsampled ‘reservoir’ host, as for bovine tuberculosis (bTB) in British and Irish cattle and badgers. BTB is caused by Mycobacterium bovis, a member of the M. tuberculosis complex that also includes the aetiological agent for human TB. In this study, we identified a spatio-temporally linked group of 26 cattle and 4 badgers infected with the same Variable Number Tandem Repeat (VNTR) type of M. bovis. Single-nucleotide polymorphisms (SNPs) between sequences identified differences that were consistent with bacterial lineages being persistent on or near farms for several years, despite multiple clear whole herd tests in the interim. Comparing WGS data to mathematical models showed good correlations between genetic divergence and spatial distance, but poor correspondence to the network of cattle movements or within-herd contacts. Badger isolates showed between zero and four SNP differences from the nearest cattle isolate, providing evidence for recent transmissions between the two hosts. This is the first direct genetic evidence of M. bovis persistence on farms over multiple outbreaks with a continued, ongoing interaction with local badgers. However, despite unprecedented resolution, directionality of transmission cannot be inferred at this stage. Despite the often notoriously long timescales between time of infection and time of sampling for TB, our results suggest that WGS data alone can provide insights into TB epidemiology even where detailed contact data are not available, and that more extensive sampling and analysis will allow for quantification of the extent and direction of transmission between cattle and badgers.
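
Counting SNP differences between a badger isolate and its nearest cattle isolate amounts to a pairwise distance over variant sites. The sketch below is a minimal illustration, assuming the isolates are already represented as aligned, equal-length strings of bases at the variant sites, with no ambiguous or missing calls. The isolate names and sequences are invented, and this is not the pipeline used in the study.

```python
from typing import Dict, Tuple


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def nearest_isolate(query: str, candidates: Dict[str, str]) -> Tuple[str, int]:
    """Return the name of the closest candidate and its SNP distance."""
    name = min(candidates, key=lambda key: snp_distance(query, candidates[key]))
    return name, snp_distance(query, candidates[name])


# Invented toy data at eight variant sites.
cattle = {"cattle_1": "ACGTACGT", "cattle_2": "ACCTTCGA"}
badger = "ACGTACGA"
nearest, distance = nearest_isolate(badger, cattle)
print(nearest, distance)
```

A small distance shows only that two isolates are closely related; as the abstract notes, it does not by itself indicate the direction of transmission.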

