Concept: Short tandem repeat
Insertion-deletion polymorphisms (INDELs) are diallelic markers derived from a single mutation event. Their low mutation frequency makes them suitable for forensic and parentage testing. INDEL analysis thus combines advantages of both short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs): these polymorphisms can be examined with amplicons as small as those used for SNPs (about 100 bp), yet they can be analyzed with the techniques used for routine STR analysis. For our population study, we genotyped 55 unrelated Czech individuals. We also genotyped 11 trios to assess the suitability of the DIPplex Kit (QIAGEN, Germany) for parentage testing. The DIPplex Kit contains 30 diallelic autosomal markers. A linkage disequilibrium test of the INDELs in the DIPplex Kit showed that they can be treated as independent markers. All 30 loci are in Hardy-Weinberg equilibrium. There were several significant differences between the Czech and African populations, but no significant ones within the European population. The probability of a match in the Czech population was 1 in 6.8 × 10^12; the combined power of discrimination was 99.9999999999%. The average paternity index was 1.13-1.77 per locus; the combined paternity index reached about 27,000 for the set of 30 loci. We conclude that the DIPplex Kit is useful as an additional panel of markers in paternity cases when mutations in STR polymorphisms are present. For application to degraded or inhibited samples, further optimization of buffer and primer concentrations is needed.
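The combined statistics quoted in this abstract are products over loci, which is why the independence of the markers matters. The following is a minimal sketch, assuming Hardy-Weinberg genotype proportions at every diallelic locus and independence between loci. The function names and the allele frequencies are hypothetical illustrations, not the study's method or data.

```python
from math import prod

def genotype_freqs(p):
    """Hardy-Weinberg genotype frequencies for a diallelic locus
    whose first allele has frequency p (assumption: HWE holds)."""
    q = 1.0 - p
    return [p * p, 2 * p * q, q * q]

def match_probability(p):
    """Chance that two unrelated individuals share a genotype at one locus:
    the sum of squared genotype frequencies."""
    return sum(g * g for g in genotype_freqs(p))

def combined_match_probability(allele_freqs):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return prod(match_probability(p) for p in allele_freqs)

def combined_paternity_index(per_locus_pi):
    """Product of per-locus paternity indices (assumes independent loci)."""
    return prod(per_locus_pi)

# Hypothetical panel: 30 loci, each with allele frequency 0.5.
hypothetical_freqs = [0.5] * 30
cmp_value = combined_match_probability(hypothetical_freqs)
combined_pd = 1.0 - cmp_value  # combined power of discrimination

# Consistency check on the reported figures: a combined paternity index of
# about 27,000 over 30 loci implies a geometric-mean per-locus index of
# 27000 ** (1 / 30), which falls inside the reported 1.13-1.77 range.
geo_mean_pi = 27000 ** (1 / 30)
```

With 30 loci, even modest per-locus values multiply into very small match probabilities and large combined paternity indices, which is the effect the abstract reports.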
Whole-genome sequencing is performed routinely as a means to identify polymorphic genetic loci such as short tandem repeat loci. We have developed a simple tool, called pSTR Finder, which is freely available as a means of identifying putative polymorphic short tandem repeat (STR) loci from data generated from genome-wide sequences. The program performs cross comparisons on the STR sequences generated using the Tandem Repeats Finder based on multiple-genome samples in a FASTA format. These comparisons generate reports listing identical, polymorphic, and different STR loci when comparing two samples.
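The cross comparison described above can be illustrated with a short sketch. This is not the pSTR Finder implementation: the input structure (a dictionary mapping a locus identifier to a motif and copy number) and the three category definitions are assumptions made here for illustration, and parsing of Tandem Repeats Finder output is left out.

```python
def compare_str_loci(sample_a, sample_b):
    """Classify STR loci found in two samples.

    Each sample is a dict: locus_id -> (motif, copy_number).
    Assumed categories (hypothetical, not taken from the tool):
      identical   - same motif and same copy number in both samples
      polymorphic - same motif, different copy number
      different   - locus missing from one sample, or motifs disagree
    """
    report = {"identical": [], "polymorphic": [], "different": []}
    for locus in sorted(set(sample_a) | set(sample_b)):
        a = sample_a.get(locus)
        b = sample_b.get(locus)
        if a is None or b is None or a[0] != b[0]:
            report["different"].append(locus)
        elif a[1] == b[1]:
            report["identical"].append(locus)
        else:
            report["polymorphic"].append(locus)
    return report

# Hypothetical loci and repeat counts, for illustration only.
genome_1 = {"locus_A": ("GATA", 12), "locus_B": ("CA", 20), "locus_C": ("AAT", 9)}
genome_2 = {"locus_A": ("GATA", 14), "locus_B": ("CA", 20), "locus_D": ("TTA", 7)}
result = compare_str_loci(genome_1, genome_2)
```

Loci that land in the polymorphic list are the candidates of interest, since they share a repeat motif but differ in copy number between the two genomes.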
DNA testing is an established part of the investigation and prosecution of sexual assault. The primary purpose of DNA evidence is to identify a suspect and/or to demonstrate sexual contact. However, due to highly uneven proportions of female and male DNA in typical stains, routine autosomal analysis often fails to detect the DNA of the assailant. To evaluate the forensic efficiency of the combined application of autosomal and Y-chromosomal short tandem repeat (STR) markers, we present a large retrospective casework study of probative evidence collected in sexual-assault cases. We investigated up to 39 STR markers by testing combinations of the 16-locus NGMSElect kit with both the 23-locus PowerPlex Y23 and the 17-locus Yfiler kit. Using this dual approach we analyzed DNA extracts from 2077 biological stains collected in 287 cases over 30 months. To assess the outcome of the combined approach in comparison to stand-alone autosomal analysis we evaluated informative DNA profiles. Our investigation revealed that Y-STR analysis added up to 21% additional, highly informative (complete, single-source) profiles to the set of reportable autosomal STR profiles for typical stains collected in sexual-assault cases. Detection of multiple male contributors was approximately three times more likely with Y-chromosomal profiling than with autosomal STR profiling. In summary, 1/10 cases would have remained inconclusive (and could have been dismissed) if Y-STR analysis had been omitted from DNA profiling in sexual-assault cases.
Sharing sequencing data sets without identifiers has become a common practice in genomics. Here, we report that surnames can be recovered from personal genomes by profiling short tandem repeats on the Y chromosome (Y-STRs) and querying recreational genetic genealogy databases. We show that a combination of a surname with other types of metadata, such as age and state, can be used to triangulate the identity of the target. A key feature of this technique is that it entirely relies on free, publicly accessible Internet resources. We quantitatively analyze the probability of identification for U.S. males. We further demonstrate the feasibility of this technique by tracing back with high probability the identities of multiple participants in public sequencing projects.
Mosquitoes occur almost worldwide, and females of some species feed on blood from humans and other animals to support ovum maturation. In warm and hot seasons, such as the summer in Japan, fed mosquitoes are often observed at crime scenes. The current study attempted to estimate the time that elapsed since feeding from the degree of human DNA digestion in mosquito blood meals and also to identify the individual human sources of the DNA using genotyping in two species of mosquito: Culex pipiens pallens and Aedes albopictus. After stereomicroscopic observation, the extracted DNA samples were quantified using a human DNA quantification and quality control kit and were genotyped for 15 short tandem repeats using a commercial multiplexing kit. It took about 3 days for the complete digestion of a blood meal, and genotyping was possible until 2 days post-feeding. The relative peak heights of the 15 STRs and DNA concentrations were useful for estimating the post-feeding time to approximately half a day between 0 and 2 days. Furthermore, the quantitative ratios derived from STR peak heights and the quality control kit (Q129/Q41, Q305/Q41, and Q305/Q129) were reasonably effective for estimating the approximate post-feeding time after 2-3 days. We suggest that this study may be very useful for estimating the time since a mosquito fed from blood meal DNA, although further refinements are necessary to estimate the times more accurately.
This study focuses on the descendants of the royal Inka family. The Inkas ruled Tawantinsuyu, the largest pre-Columbian empire in South America, which extended from southern Colombia to central Chile. The origin of the royal Inkas is currently unknown. While the mummies of the Inka rulers could have been informative, most were destroyed by Spaniards and the few remaining disappeared without a trace. Moreover, no genetic studies have been conducted on present-day descendants of the Inka rulers. In the present study, we analysed uniparental DNA markers in 18 individuals predominantly from the districts of San Sebastian and San Jerónimo in Cusco (Peru), who belong to 12 families of putative patrilineal descent of Inka rulers, according to documented registries. We used single-nucleotide polymorphisms and short tandem repeat (STR) markers of the Y chromosome (Y-STRs), as well as mitochondrial DNA D-loop sequences, to investigate the paternal and maternal descent of the 18 alleged Inka descendants. Two Q-M3* Y-STR clusters descending from different male founders were identified. The first cluster, named AWKI-1, was associated with five families (eight individuals). By contrast, the second cluster, named AWKI-2, was represented by a single individual; AWKI-2 was part of the Q-Z19483 sub-lineage that was likely associated with a recent male expansion in the Andes, which probably occurred during the Late Intermediate Period (1000-1450 AD), overlapping the Inka period. Concerning the maternal descent, different mtDNA lineages associated with each family were identified, suggesting a high maternal gene flow among Andean populations, probably due to changes in the last 1000 years.
- Proceedings of the National Academy of Sciences of the United States of America
- Published over 2 years ago
Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching: the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people (one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications), we find that 90-98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99-100% when ∼30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers, including databases of forensic significance.
Major depressive disorder (MDD) is a leading contributor to global disease burden. Recent studies have shown that genetic factors play a significant role in susceptibility to this condition; however, the underlying genetic basis remains largely unknown. Short tandem repeats (STRs) have been proposed as an explanatory factor in the “missing heritability” of complex diseases or traits.
Dipterous fly larvae (maggots) are frequently collected from a corpse during a criminal investigation. Previous studies showed that DNA analysis of the gastrointestinal contents of maggots might be used to reveal the identity of a victim. However, this approach has not been used to date in legal investigations, and thus its practical usefulness is unknown. A badly burned body was discovered with its face and neck colonized by fly larvae. Given the condition of the body, identification was not possible. Short tandem repeat (STR) typing was performed using the gastrointestinal contents of maggots collected from the victim and was compared to STR profiles obtained from the alleged father. The probability of paternity was 99.685%. Thus, this comparative DNA test enabled the conclusive identification of the remains. This is the first reported case of analysis of human DNA isolated from the gastrointestinal tract of maggots used to identify a victim in a criminal case.
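A probability of paternity is conventionally obtained from a combined paternity index (CPI) and a prior probability through Bayes' rule. The sketch below assumes a prior of 0.5, which the abstract does not state, so the back-calculated index is an illustration of the arithmetic rather than a figure from the case.

```python
def probability_of_paternity(cpi, prior=0.5):
    """Posterior probability of paternity from a combined paternity index.
    The default prior of 0.5 is an assumption, not taken from the abstract."""
    return (cpi * prior) / (cpi * prior + (1.0 - prior))

def implied_cpi(posterior, prior=0.5):
    """Invert the formula: the CPI that would yield a given posterior."""
    return (posterior / (1.0 - posterior)) * ((1.0 - prior) / prior)

# Under the assumed 0.5 prior, the reported 99.685% corresponds to a CPI
# of roughly 316 (0.99685 / 0.00315).
cpi_from_report = implied_cpi(0.99685)
round_trip = probability_of_paternity(cpi_from_report)
```

A different prior would change the implied index, so the value is only meaningful together with the prior used.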
Tandemly repeated sequences are a common feature of vertebrate mitochondrial DNA control regions. However, questions remain about their mode of evolution and function. To better understand patterns of length variation and to explore the existence of previously described domains, we characterized the control region structure of the Amazonian ornamental fishes Nannostomus eques and Nannostomus unifasciatus. The control region ranged from 1121 to 1142 bp in length and could be separated into three domains: the domain associated with the extended terminal associated sequences, the central conserved domain, and the conserved sequence blocks domain. In the first domain, we encountered a sequence repeated 10 times in tandem, a variable number tandem repeat (VNTR), that could adopt an “inverted repetitions” type structural conformation. The results suggest that the VNTR pattern encountered in both N. eques and N. unifasciatus is consistent with the prerequisites of the illegitimate elongation model, in which the unequal pairing of the chains near the 5'-end of the control region favors the formation of repetitions.