Recent advances in deep learning and specifically in generative adversarial networks have demonstrated surprising results in generating new images and videos upon request even using natural language as input. In this paper we present the first application of generative adversarial autoencoders (AAE) for generating novel molecular fingerprints with a defined set of parameters. We developed a 7-layer AAE architecture with the latent middle layer serving as a discriminator. As an input and output the AAE uses a vector of binary fingerprints and concentration of the molecule. In the latent layer we also introduced a neuron responsible for growth inhibition percentage, which when negative indicates the reduction in the number of tumor cells after the treatment. To train the AAE we used the NCI-60 cell line assay data for 6252 compounds profiled on MCF-7 cell line. The output of the AAE was used to screen 72 million compounds in PubChem and select candidate molecules with potential anti-cancer properties. This approach is a proof of concept of an artificially-intelligent drug discovery engine, where AAEs are used to generate new molecular fingerprints with the desired molecular properties.
We propose and develop a Lexicocalorimeter: an online, interactive instrument for measuring the “caloric content” of social media and other large-scale texts. We do so by constructing extensive yet improvable tables of food- and activity-related phrases, and assigning them sourced estimates of caloric intake and expenditure, respectively. We show that for Twitter, our naive measures of “caloric input”, “caloric output”, and the ratio of these measures all correlate strongly with health and well-being measures for the contiguous United States. Our caloric balance measure in many cases outperforms both its constituent quantities; is tunable to specific health and well-being measures such as diabetes rates; has the capability of providing a real-time signal reflecting a population’s health; and has the potential to be used alongside traditional survey data in the development of public policy and collective self-awareness. Because our Lexicocalorimeter is a linear superposition of principled phrase scores, we also show we can move beyond correlations to explore what people talk about in collective detail, and assist in the understanding and explanation of how population-scale conditions vary, a capacity unavailable to black-box type methods.
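The linear superposition of phrase scores described above can be illustrated with a minimal sketch. The phrase tables, their calorie values, and the function names are hypothetical placeholders rather than the authors' sourced estimates, and taking the ratio as output over input is an assumption made here for illustration.

```python
# Hypothetical phrase tables: the calorie values below are illustrative
# placeholders, not the sourced estimates used by the authors.
FOOD_CALORIES = {"pizza": 285.0, "apple": 95.0, "bacon": 43.0}
ACTIVITY_CALORIES = {"running": 600.0, "walking": 250.0, "watching tv": 70.0}


def phrase_contributions(phrase_counts, table):
    """Per-phrase share of the count-weighted mean score for one table."""
    total = sum(phrase_counts.get(phrase, 0) for phrase in table)
    if total == 0:
        return {phrase: 0.0 for phrase in table}
    return {
        phrase: phrase_counts.get(phrase, 0) * calories / total
        for phrase, calories in table.items()
    }


def caloric_scores(phrase_counts):
    """Return (caloric input, caloric output, output/input ratio) for a text."""
    # Each score is a plain sum of per-phrase contributions, so any score
    # can be traced back to the phrases responsible for it.
    c_in = sum(phrase_contributions(phrase_counts, FOOD_CALORIES).values())
    c_out = sum(phrase_contributions(phrase_counts, ACTIVITY_CALORIES).values())
    ratio = c_out / c_in if c_in > 0 else float("nan")
    return c_in, c_out, ratio


if __name__ == "__main__":
    counts = {"pizza": 2, "apple": 2, "running": 1, "walking": 3}
    print(caloric_scores(counts))
    print(phrase_contributions(counts, FOOD_CALORIES))
```

Because every score is a sum over phrases, comparing the per-phrase contributions of two populations shows which phrases drive a difference between them, which is the kind of inspection a black-box model would not allow.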
There are many challenges to measuring power input and force output from a flapping vertebrate. Animals can vary a multitude of kinematic parameters simultaneously, and methods for measuring power and force are either not possible in a flying vertebrate or are very time- and equipment-intensive. To circumvent these challenges, we constructed a robotic, multi-articulated bat wing that allows us to measure power input and force output simultaneously, across a range of kinematic parameters. The robot is modeled after the lesser dog-faced fruit bat, Cynopterus brachyotis, and contains seven joints powered by three servo motors. Collectively, this joint and motor arrangement allows the robot to vary wingbeat frequency, wingbeat amplitude, stroke plane, downstroke ratio, and wing folding. We describe the design, construction, programming, instrumentation, characterization, and analysis of the robot. We show that the kinematics, inputs, and outputs demonstrate good repeatability both within and among trials. Finally, we describe lessons about the structure of living bats learned from trying to mimic their flight in a robotic wing.
A new form of augmentative and alternative communication (AAC) device for people with severe speech impairment, the voice-input voice-output communication aid (VIVOCA), is described. The VIVOCA recognizes the disordered speech of the user and builds messages, which are converted into synthetic speech. System development was carried out employing user-centered design and development methods, which identified and refined key requirements for the device. A novel methodology for building small-vocabulary, speaker-dependent automatic speech recognizers with reduced amounts of training data was applied. Experiments showed that this method is successful in generating good recognition performance (mean accuracy 96%) on highly disordered speech, even when recognition perplexity is increased. The selected message-building technique traded off various factors including speed of message construction and range of available message outputs. The VIVOCA was evaluated in a field trial by individuals with moderate to severe dysarthria, which confirmed that they can make use of the device to produce intelligible speech output from disordered speech input. The trial highlighted some issues which limit the performance and usability of the device when applied in real usage situations, with mean recognition accuracy of 67% in these circumstances. These limitations will be addressed in future work.
Differential equation models can be used to describe the relationships between the current state of a system of constructs (e.g., stress) and how those constructs are changing (e.g., based on variable-like experiences). The following article describes a differential equation model based on the concept of a reservoir. With a physical reservoir, such as one for water, the level of the liquid in the reservoir at any time depends on the contributions to the reservoir (inputs) and the amount of liquid removed from the reservoir (outputs). This reservoir model might be useful for constructs such as stress, where events might “add up” over time (e.g., life stressors, inputs), but individuals simultaneously take action to “blow off steam” (e.g., engage coping resources, outputs). The reservoir model can provide descriptive statistics of the inputs that contribute to the “height” (level) of a construct and a parameter that describes a person’s ability to dissipate the construct. After discussing the model, we describe a method of fitting the model as a structural equation model using latent differential equation modeling and latent distribution modeling. A simulation study is presented to examine recovery of the input distribution and output parameter. The model is then applied to the daily self-reports of negative affect and stress from a sample of older adults from the Notre Dame Longitudinal Study on Aging. (PsycINFO Database Record © 2013 APA, all rights reserved).
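The reservoir idea can be made concrete with a small numerical sketch. It assumes the simplest possible form, dx/dt = u(t) - k x(t), where x is the level of the construct, u is the input, and k is the dissipation parameter; this first-order linear form, the Euler discretization, and all names are assumptions for illustration, not the fitted latent differential equation model described in the abstract.

```python
def simulate_reservoir(inputs, dissipation, level0=0.0, dt=1.0):
    """Euler integration of dx/dt = u(t) - dissipation * x(t).

    inputs      -- sequence of input amounts u(t), one per time step
    dissipation -- assumed first-order rate at which the level drains
    level0      -- starting level of the reservoir
    Returns the list of levels, starting with level0.
    """
    levels = [level0]
    for u in inputs:
        level = levels[-1]
        # Inflow raises the level; outflow is proportional to the current level.
        levels.append(level + dt * (u - dissipation * level))
    return levels


if __name__ == "__main__":
    # A constant daily stressor with moderate dissipation.
    for day, level in enumerate(simulate_reservoir([2.0] * 10, dissipation=0.5)):
        print(day, round(level, 3))
```

Under this assumed form, a constant input u settles at the level u / k, so a person with a larger dissipation parameter carries a lower steady level for the same stream of inputs.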
A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit the typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes, where each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores and shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study: 64.4% of the top-scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates VinaLC has very good early recovery of actives. © 2013 Wiley Periodicals, Inc.
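For readers unfamiliar with enrichment factors, the following minimal sketch uses the commonly cited definition: the fraction of actives among the top-ranked x% of the database divided by the fraction of actives in the whole database. The function name, the rounding of the cutoff, and the toy data are assumptions for illustration; the abstract does not state the exact formula the authors used.

```python
def enrichment_factor(scores, is_active, fraction):
    """Enrichment factor at the given top fraction of a ranked database.

    scores    -- docking scores, where a lower (more negative) score is better
    is_active -- booleans marking which compounds are known actives
    fraction  -- portion of the ranked database to inspect, e.g. 0.01 for 1%
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal length")
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (total_actives / len(ranked))


if __name__ == "__main__":
    # Toy database of ten compounds, three of them active (made-up data).
    toy_scores = [-10.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0]
    toy_active = [True, True, False, False, False, False, False, False, False, True]
    for frac in (0.1, 0.2, 0.5, 1.0):
        print(frac, enrichment_factor(toy_scores, toy_active, frac))
```

A value above 1 at small fractions means actives are concentrated near the top of the ranking, which is what "early recovery of actives" refers to; at a fraction of 1.0 the factor is 1 by construction.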
DNA circuits have been widely used to develop biological computing devices because of their high programmability and versatility. Here, we propose an architecture for the systematic construction of DNA circuits for analog computation based on DNA strand displacement. The elementary gates in our architecture include addition, subtraction, and multiplication gates. The input and output of these gates are analog, which means that they are directly represented by the concentrations of the input and output DNA strands respectively, without requiring a threshold for converting to Boolean signals. We provide detailed domain designs and kinetic simulations of the gates to demonstrate their expected performance. Based on these gates, we describe how DNA circuits to compute polynomial functions of inputs can be built. Using Taylor Series and Newton Iteration methods, functions beyond the scope of polynomials can also be computed by DNA circuits built upon our architecture.
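The last two steps can be sketched in ordinary software: once addition, subtraction, and multiplication are available, a truncated Taylor series approximates a function such as exp(x), and Newton iteration approximates a reciprocal without any division gate. The code below models the three gates as plain arithmetic functions. It is a numerical illustration only: it says nothing about strand design or kinetics, and the choice of target functions, the fixed coefficients, and all names are assumptions.

```python
# The three elementary gates, modeled as ideal arithmetic on real values.
def add(a, b):
    return a + b


def sub(a, b):
    return a - b


def mul(a, b):
    return a * b


def taylor_exp(x, terms=12):
    """Approximate exp(x) with the first `terms` Taylor terms, in Horner form."""
    result = 1.0
    for k in range(terms - 1, 0, -1):
        # 1/k is a fixed constant here, assumed to be set by design
        # rather than computed by a gate.
        coeff = 1.0 / k
        result = add(1.0, mul(mul(x, coeff), result))
    return result


def newton_reciprocal(a, x0, steps=6):
    """Approximate 1/a by iterating x <- x * (2 - a * x).

    Converges when the starting guess satisfies 0 < x0 < 2 / a, which also
    keeps every intermediate value non-negative.
    """
    x = x0
    for _ in range(steps):
        x = mul(x, sub(2.0, mul(a, x)))
    return x


if __name__ == "__main__":
    print(taylor_exp(1.0))
    print(newton_reciprocal(4.0, x0=0.1))
```

The reciprocal iteration squares its relative error at each step, so a handful of gate layers is enough for a reasonable starting guess, at the cost of a circuit whose depth grows with the number of iterations.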
The rodent somatosensory cortex includes well-defined examples of cortical columns, the barrel columns, that extend throughout the cortical depth and are defined by discrete clusters of neurons in layer 4 (L4) called barrels. Using the cell-type-specific Ntsr1-Cre mouse line, we found that L6 contains infrabarrels, readily identifiable units that align with the L4 barrels. Corticothalamic (CT) neurons and their local axons cluster within the infrabarrels, whereas corticocortical (CC) neurons are densest between infrabarrels. Optogenetic experiments showed that CC cells received robust input from somatosensory thalamic nuclei, whereas CT cells received much weaker thalamic inputs. We also found that CT neurons are intrinsically less excitable, revealing that both synaptic and intrinsic mechanisms contribute to the low firing rates of CT neurons often reported in vivo. In summary, infrabarrels are discrete cortical circuit modules containing two partially separated excitatory networks that link long-distance thalamic inputs with specific outputs.
Alcohol-to-jet (ATJ) is one of the technically feasible biofuel technologies. It produces jet fuel from sugary, starchy, and lignocellulosic biomass, such as sugarcane, corn grain, and switchgrass, via fermentation of sugars to ethanol or other alcohols. This study assesses the ATJ biofuel production pathway for these three biomass feedstocks, and advances existing techno-economic analyses of biofuels in three ways. First, we incorporate technical uncertainty for all by-products and co-products through statistical linkages between conversion efficiencies and input and output levels. Second, future price uncertainty is based on case-by-case time-series estimation, and a local sensitivity analysis is conducted with respect to each uncertain variable. Third, breakeven price distributions are developed to communicate the inherent uncertainty in breakeven price. This research also considers uncertainties in utility input requirements, fuel and by-product outputs, as well as price uncertainties for all major inputs, products, and co-products. All analyses are done from the perspective of a private firm.
Many recent models study the downstream projection from grid cells to place cells, while recent data have pointed out the importance of the feedback projection. We thus asked how grid cells are affected by the nature of the input from the place cells. We propose a single-layer neural network with feedforward weights connecting place-like input cells to grid cell outputs. Place-to-grid weights were learned via a generalized Hebbian rule. The architecture of this network closely resembles neural networks used to perform Principal Component Analysis (PCA). Both numerical results and analytic considerations indicate that if the components of the feedforward neural network were non-negative, the output converged to a hexagonal lattice. Without the non-negativity constraint the output converged to a square lattice. Consistent with experiments, grid spacing ratio between the first two consecutive modules was ~1.4. Our results express a possible linkage between place cell to grid cell interactions and PCA.
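The link between Hebbian learning and PCA can be illustrated with a toy sketch of Oja's rule, the single-output case of the generalized Hebbian rule. The two-dimensional data set, learning rate, starting weights, and the optional clipping used to enforce non-negativity are all assumptions for illustration; this sketch does not model place-cell inputs and does not reproduce hexagonal or square lattices.

```python
def oja_train(samples, weights, learning_rate=0.01, epochs=1000, nonnegative=False):
    """Train one linear output unit with Oja's rule.

    For each input x the output is y = w . x and the update is
    w <- w + learning_rate * y * (x - y * w), which drives w toward the
    leading principal component of zero-mean data while keeping |w| near 1.
    If nonnegative is True, weights are clipped at zero after every update
    (one simple way to impose the constraint, assumed here).
    """
    w = list(weights)
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + learning_rate * y * (xi - y * wi) for wi, xi in zip(w, x)]
            if nonnegative:
                w = [max(0.0, wi) for wi in w]
    return w


# Zero-mean toy data with most of its variance along the (1, 1) direction.
SAMPLES = [(2.0, 2.0), (-2.0, -2.0), (0.5, -0.5), (-0.5, 0.5)]

if __name__ == "__main__":
    print(oja_train(SAMPLES, [0.3, 0.1]))
    print(oja_train(SAMPLES, [0.3, 0.1], nonnegative=True))
```

On this data the learned weight vector aligns with the high-variance direction, which is the sense in which a Hebbian network of this kind performs PCA; the constraint only changes the outcome when the unconstrained solution would require negative weights.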