
Concept: Hidden Markov model


Smartphone positioning is an enabling technology used to create new business in the navigation and mobile location-based services (LBS) industries. This paper presents a smartphone indoor positioning engine named HIPE that can be easily integrated with mobile LBS. HIPE is a hybrid solution that fuses measurements of smartphone sensors with wireless signals. The smartphone sensors are used to measure the user’s motion dynamics information (MDI), which represents the spatial correlation of various locations. Two algorithms based on hidden Markov model (HMM) problems, the grid-based filter and the Viterbi algorithm, are used in this paper as the central processor for data fusion to resolve the position estimates, and they suit different applications, e.g., real-time navigation and location tracking, respectively. HIPE is applicable to a wider range of motion scenarios than solutions proposed in previous studies because it does not use the deterministic motion models on which those studies commonly relied. The experimental results showed that HIPE can provide adequate positioning accuracy and robustness for different scenarios of MDI combinations. HIPE is a cost-efficient solution, and it can work flexibly with different smartphone platforms, which may have different types of sensors available for the measurement of MDI data. The reliability of the positioning solution was found to increase with increasing precision of the MDI data.

Concepts: Measurement, Machine learning, Hidden Markov model, Markov models, Dynamic programming, Location-based service, Viterbi algorithm, Forward-backward algorithm
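
As a rough illustration of the grid-based filter named in the abstract above, the following sketch runs the predict/update recursion of a discrete Bayes filter over a handful of cells. The three-cell layout, the transition matrix, and the likelihood values are hypothetical and are not taken from HIPE.

```python
def grid_filter_step(prior, transition, likelihood):
    """One step of a discrete Bayes (grid-based) filter.

    prior[i]         : P(cell i at time t-1 | measurements so far)
    transition[i][j] : P(cell j at time t | cell i at time t-1)
    likelihood[j]    : P(current measurement | cell j)
    """
    n = len(prior)
    # Predict: push the belief through the transition model.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight by the measurement likelihood, then normalise.
    unnorm = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("measurement has zero likelihood in every cell")
    return [u / total for u in unnorm]


# Hypothetical corridor of three cells: the user mostly stays put
# or moves to a neighbouring cell between epochs.
transition = [
    [0.6, 0.4, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.4, 0.6],
]
# Hypothetical per-epoch likelihoods, e.g. derived from wireless signals.
measurements = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
]

belief = [1 / 3, 1 / 3, 1 / 3]  # uniform initial belief
for lik in measurements:
    belief = grid_filter_step(belief, transition, lik)

# Index of the most probable cell after the last epoch.
estimate = max(range(len(belief)), key=belief.__getitem__)
```

Because each step uses only the measurements received so far, this recursion fits the real-time use case, whereas the Viterbi algorithm needs the whole sequence before it can return a path.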


The locomotion behavior of Caenorhabditis elegans has been studied extensively to understand the respective roles of neural control and biomechanics as well as the interaction between them. Constructing a mathematical model is helpful to understand the locomotion behavior in various surrounding conditions that are difficult to realize in experiments. In this study, we built three hidden Markov models (HMMs) for the crawling behavior of C. elegans in a controlled environment with no chemical treatment and in a formaldehyde-treated environment (0.1 and 0.5 ppm). The organism’s crawling activity was recorded using a digital camcorder for 20 min at a rate of 24 frames per second. All shape patterns were quantified by branch length similarity (BLS) entropy and classified into four groups using the self-organizing map (SOM). Comparison of the simulated behavior generated by HMMs and the actual crawling behavior demonstrated that the HMM coupled with the SOM was successful in characterizing the crawling behavior. In addition, we briefly discussed the possibility of using the HMM together with BLS entropy to develop bio-monitoring systems to determine water quality.

Concepts: Nervous system, Neuron, Caenorhabditis elegans, Caenorhabditis, Model organism, Rhabditidae, Hidden Markov model, Caenorhabditis briggsae
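
To make the idea of "simulated behavior generated by HMMs" more concrete, here is a minimal sketch of sampling a sequence from an HMM whose observations are four discrete pattern classes, as a SOM with four groups would produce. The two hidden states and all probabilities are invented for illustration; the study's fitted parameters are not given in the abstract.

```python
import random


def simulate_hmm(start, transition, emission, length, rng):
    """Draw (hidden_states, observations) of the given length from an HMM."""
    def draw(probs):
        # Sample one index in proportion to the given probabilities.
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]

    states, observations = [], []
    state = draw(start)
    for _ in range(length):
        states.append(state)
        observations.append(draw(emission[state]))
        state = draw(transition[state])
    return states, observations


# Hypothetical two-state model emitting one of four shape classes (0..3).
start = [0.5, 0.5]
transition = [
    [0.9, 0.1],
    [0.2, 0.8],
]
emission = [
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.3, 0.6],
]

rng = random.Random(0)  # fixed seed so the run is repeatable
states, observations = simulate_hmm(start, transition, emission, 50, rng)
```

A simulated class sequence like `observations` could then be compared with the recorded one, for example by class frequencies or run lengths.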


The use of accelerometers to objectively measure physical activity (PA) has become the method of choice in recent years. Traditionally, cutpoints are used to assign impulse counts recorded by the devices to sedentary and activity ranges. Here, hidden Markov models (HMM) are used to improve the cutpoint method to achieve a more accurate identification of the sequence of modes of PA.

Concepts: Markov chain, Hidden Markov model, Choice, Accelerometer, Topological space, Andrey Markov
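
The contrast between cutpoints and an HMM can be sketched as follows. This is only an illustration under assumed settings: two activity modes, Gaussian emissions for the counts, and made-up means, standard deviations, transition probabilities, and counts. The abstract does not state which emission model the authors used.

```python
import math


def cutpoint_labels(counts, cutpoints):
    """Label each count by how many cutpoints it reaches (0 = lowest range)."""
    return [sum(c >= cp for cp in cutpoints) for c in counts]


def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(obs, log_start, log_trans, log_emit):
    """Most likely state path; log_emit(state, x) returns log P(x | state)."""
    n = len(log_start)
    score = [log_start[s] + log_emit(s, obs[0]) for s in range(n)]
    back = []
    for x in obs[1:]:
        ptr, new = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: score[i] + log_trans[i][j])
            ptr.append(best)
            new.append(score[best] + log_trans[best][j] + log_emit(j, x))
        back.append(ptr)
        score = new
    # Trace the best path backwards from the best final state.
    path = [max(range(n), key=score.__getitem__)]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


# Hypothetical counts per epoch; state 0 = sedentary, state 1 = active.
counts = [20, 30, 110, 25, 40, 600, 650, 580, 90, 620]
means, sds = [40.0, 600.0], [40.0, 150.0]       # assumed emission parameters
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]   # assumed "sticky" transitions

cut = cutpoint_labels(counts, cutpoints=[100])
hmm = viterbi(counts, log_start, log_trans,
              lambda s, x: log_gauss(x, means[s], sds[s]))
```

In this toy series the single count of 110 crosses the cutpoint and is labelled active, while the HMM path keeps it in the sedentary mode because both the emission model and the neighbouring epochs favour that state.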


A key challenge in contemporary ecology and conservation is the accurate tracking of the spatial distribution of various human impacts, such as fishing. While coastal fisheries in national waters are closely monitored in some countries, existing maps of fishing effort elsewhere are fraught with uncertainty, especially in remote areas and the High Seas. Better understanding of the behavior of the global fishing fleets is required in order to prioritize and enforce fisheries management and conservation measures worldwide. Satellite-based Automatic Identification Systems (S-AIS) are now commonly installed on most ocean-going vessels and have been proposed as a novel tool to explore the movements of fishing fleets in near real time. Here we present approaches to identify fishing activity from S-AIS data for three dominant fishing gear types: trawl, longline and purse seine. Using a large dataset containing worldwide fishing vessel tracks from 2011-2015, we developed three methods to detect and map fishing activities. For trawlers we produced a Hidden Markov Model (HMM) using vessel speed as the observation variable. For longliners we designed a Data Mining (DM) approach using an algorithm inspired by studies on animal movement. For purse seiners a multi-layered filtering strategy based on vessel speed and operation time was implemented. Validation against expert-labeled datasets showed average detection accuracies of 83% for trawler and longliner, and 97% for purse seiner. Our study represents the first comprehensive approach to detect and identify potential fishing behavior for three major gear types operating on a global scale. We hope that this work will enable new efforts to assess the spatial and temporal distribution of global fishing effort and make global fisheries activities transparent to ocean scientists, managers and the public.

Concepts: Machine learning, Hidden Markov model, Overfishing, Fishing techniques, Ship, Fisherman, Seine fishing, Fishing vessel


Diving behaviour of short-finned pilot whales is often described by two states: deep foraging dives and shallow, non-foraging dives. However, this simple classification system ignores much of the variation that occurs during subsurface periods. We used multi-state hidden Markov models (HMM) to characterize states of diving behaviour and the transitions between states in short-finned pilot whales. We used three parameters (number of buzzes, maximum dive depth and duration) measured in 259 dives by digital acoustic recording tags (DTAGs) deployed on 20 individual whales off Cape Hatteras, North Carolina, USA. The HMM identified a four-state model as the best descriptor of diving behaviour. The state-dependent distributions for the diving parameters showed variation between states, indicative of different diving behaviours. Transition probabilities were considerably higher for state persistence than state switching, indicating that dive types occurred in bouts. Our results indicate that subsurface behaviour in short-finned pilot whales is more complex than a simple dichotomy of deep and shallow diving states, and labelling all subsurface behaviour as deep dives or shallow dives discounts a significant amount of important variation. We discuss potential drivers of these patterns, including variation in foraging success, prey availability and selection, bathymetry, physiological constraints and socially mediated behaviour.

Concepts: Psychology, Markov chain, Hidden Markov model, Whale, Markov decision process, Markov models, Andrey Markov, Pilot whale
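
The link between high persistence probabilities and bouts can be made explicit: in a Markov chain, a state with self-transition probability p has a geometrically distributed run length with mean 1 / (1 - p). The sketch below applies this to a hypothetical four-state transition matrix; the values are invented and are not the estimates from the study.

```python
def expected_bout_lengths(transition):
    """Expected number of consecutive dives spent in each state.

    With self-transition probability p, the run length is geometric
    with mean 1 / (1 - p).
    """
    lengths = []
    for i, row in enumerate(transition):
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError(f"row {i} does not sum to 1")
        p_stay = row[i]
        lengths.append(float("inf") if p_stay >= 1.0 else 1.0 / (1.0 - p_stay))
    return lengths


# Hypothetical transition matrix for four dive states (rows sum to 1).
transition = [
    [0.80, 0.10, 0.05, 0.05],
    [0.10, 0.75, 0.10, 0.05],
    [0.05, 0.15, 0.70, 0.10],
    [0.10, 0.10, 0.20, 0.60],
]

bouts = expected_bout_lengths(transition)
```

With these made-up values, a state that persists with probability 0.80 yields bouts of about five dives on average, while one with 0.60 yields about two and a half.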


Analyses of metagenome data (MG) and metatranscriptome data (MT) are often challenged by a paucity of complete reference genome sequences and the uneven/low sequencing depth of the constituent organisms in the microbial community, which respectively limit the power of reference-based alignment and de novo sequence assembly. These limitations make accurate protein family classification and abundance estimation challenging, which in turn hamper downstream analyses such as abundance profiling of metabolic pathways, identification of differentially encoded/expressed genes, and de novo reconstruction of complete gene and protein sequences from the protein family of interest. The profile hidden Markov model (HMM) framework enables the construction of very useful probabilistic models for protein families that allow for accurate modeling of position specific matches, insertions, and deletions. We present a novel homology detection algorithm that integrates a banded Viterbi algorithm for profile HMM parsing with an iterative simultaneous alignment and assembly computational framework. The algorithm searches a given profile HMM of a protein family against a database of fragmentary MG/MT sequencing data and simultaneously assembles complete or near-complete gene and protein sequences of the protein family. The resulting program, HMM-GRASPx, demonstrates superior performance in aligning and assembling homologs when benchmarked on both simulated marine MG and real human saliva MG datasets. On real supragingival plaque and stool MG datasets that were generated from healthy individuals, HMM-GRASPx accurately estimates the abundances of the antimicrobial resistance (AMR) gene families and enables accurate characterization of the resistome profiles of these microbial communities. For real human oral microbiome MT datasets, using the HMM-GRASPx-estimated transcript abundances significantly improves detection of differentially expressed (DE) genes. Finally, HMM-GRASPx was used to reconstruct comprehensive sets of complete or near-complete protein and nucleotide sequences for the query protein families. HMM-GRASPx is freely available online from

Concepts: DNA, Protein, Gene, Bioinformatics, Metabolism, Hidden Markov model, Protein family, Gene family


The HMMER website, available at, provides access to the protein homology search algorithms found in the HMMER software suite. Since the first release of the website in 2011, the search repertoire has been expanded to include the iterative search algorithm, jackhmmer. The continued growth of the target sequence databases means that traditional tabular representations of significant sequence hits can be overwhelming to the user. Consequently, additional ways of presenting homology search results have been developed, allowing them to be summarised according to taxonomic distribution or domain architecture. The taxonomy and domain architecture representations can be used in combination to filter the results according to the needs of a user. Searches can also be restricted prior to submission using a new taxonomic filter, which not only ensures that the results are specific to the requested taxonomic group, but also improves search performance. The repertoire of profile hidden Markov model libraries, which are used for annotation of query sequences with protein families and domains, has been expanded to include the libraries from CATH-Gene3D, PIRSF, Superfamily and TIGRFAMs. Finally, we discuss the relocation of the HMMER webserver to the European Bioinformatics Institute and the potential impact that this will have.

Concepts: Algorithm, Bioinformatics, European Bioinformatics Institute, World Wide Web, Hidden Markov model, Topological space, Searching, Search algorithm


Brain activity is a dynamic combination of the responses to sensory inputs and its own spontaneous processing. Consequently, such brain activity is continuously changing whether or not one is focusing on an externally imposed task. Previously, we have introduced an analysis method that allows us, using Hidden Markov Models (HMM), to model task or rest brain activity as a dynamic sequence of distinct brain networks, overcoming many of the limitations posed by sliding window approaches. Here, we present an advance that enables the HMM to handle very large amounts of data, making possible the inference of very reproducible and interpretable dynamic brain networks in a range of different datasets, including task, rest, MEG and fMRI, with potentially thousands of subjects. We anticipate that the generation of large and publicly available datasets from initiatives such as the Human Connectome Project and UK Biobank, in combination with computational methods that can work at this scale, will bring a breakthrough in our understanding of brain function in both health and disease.

Concepts: Epidemiology, Disease, Cognitive science, Electroencephalography, Unified Modeling Language, Hidden Markov model, Standard Model, Computational neuroscience


A standard method for the identification of novel RNAs or proteins is homology search via probabilistic models. One approach relies on the definition of families, which can be encoded as covariance models (CMs) or Hidden Markov Models (HMMs). Although these models are powerful tools, their complexity makes it tedious to investigate them in their (default) tabulated form. This specifically applies to the interpretation of comparisons between multiple models as in family clans. The Covariance model visualization tools (CMV) visualize CMs or HMMs to: I) Obtain an easily interpretable representation of HMMs and CMs; II) Put them in context with the structural sequence alignments they have been created from; III) Investigate results of model comparisons and highlight regions of interest.

Concepts: DNA, Protein, Gene, Bioinformatics, RNA, Ribosome, Hidden Markov model, Sequence alignment


The recent release of the gene-targeted metagenomics assembler Xander has demonstrated that using a trained Hidden Markov Model (HMM) to guide the traversal of a de Bruijn graph gives an obvious advantage over other assembly methods. As a pilot study, however, Xander leaves a lot of room for improvement. Apart from its slow speed, Xander uses only one k-mer size for graph construction, and any choice of k compromises either sensitivity or accuracy. Xander uses a Bloom-filter representation of the de Bruijn graph to achieve a lower memory footprint. Bloom filters introduce false positives, and it is not clear how these affect the quality of the assembly. Xander does not keep track of the multiplicity of k-mers, which would have been an effective way to differentiate between erroneous k-mers and correct k-mers.

Concepts: Bioinformatics, Model theory, Graph theory, Hidden Markov model, Experimental uncertainty analysis, De Bruijn graph, De Bruijn sequence, Nicolaas Govert de Bruijn
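
The two data-structure points in the abstract above can be illustrated with a toy example: a Bloom filter answers only "possibly present" or "definitely absent" for a k-mer, while an exact counter also records multiplicity. This is a generic sketch with made-up reads and sizes. It is not Xander's implementation, and the `BloomFilter` class and `kmers` helper are hypothetical names.

```python
import hashlib
from collections import Counter


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive n_hashes positions by salting one hash function.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


# Made-up reads; the third differs from the first by one base.
reads = ["ACGTACGT", "CGTACGTA", "ACGTTCGT"]
k = 4

bloom = BloomFilter(n_bits=256, n_hashes=3)
counts = Counter()
for read in reads:
    for km in kmers(read, k):
        bloom.add(km)    # presence only
        counts[km] += 1  # exact multiplicity

# k-mers seen only once are candidates for sequencing errors.
singletons = sorted(km for km, c in counts.items() if c == 1)
```

The Bloom filter cannot distinguish a k-mer seen four times from one seen once, which is the information a multiplicity count would add. The trade-off is memory: the exact counter stores every distinct k-mer, whereas the filter uses a fixed number of bits.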