Informatics and Machine Learning - Stephen Winters-Hilt
1.9.1 Stochastic Carrier Wave (SCW) Analysis – Nanoscope Signal Analysis
The Nanoscope described in Chapter 14 builds on nanopore detection by introducing reporter molecules, arriving at a nanopore transduction detection paradigm. By engineering reporter molecules that produce stationary statistics (an SCW), together with ML signal analysis methods designed for rapid analysis of such signals, we arrive at a functioning “nanoscope.”
Nanopore detection is made possible by the following well‐established capabilities: (i) classic electrochemistry; (ii) a pore‐forming protein toxin in a bilayer; and (iii) a patch clamp amplifier. Nanopore transduction detection builds on this detection platform by adding (iv) an event‐transducer pore‐blockader that has stationary statistics and (v) ML tools for real‐time SCW signal analysis. The meaning of “real‐time” depends on the application. In the Nanoscope implementation discussed in Chapter 14, each signal is usually identified in less than 100 ms, with a calling accuracy of 99.9% when rejection is employed; accuracy improves further if, whenever a call is forced, a signal sample longer than 100 ms is used.
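The trade-off between rejection and forced calls can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the Nanoscope implementation: the function names, the toy mean-level score, the "bound"/"unbound" labels, the margin of 3.0, and the doubling of the observation window are all hypothetical. Without forcing, a low-confidence 100 ms sample is rejected (no call is made); with forcing, the sample is extended until the score is confident or a maximum duration is reached.

```python
import math

def mean_level_score(window, midpoint=0.5, sigma=0.1):
    """Toy signed score (hypothetical): distance of the window mean from a
    midpoint, in standard-error units. Positive favours 'bound'."""
    n = len(window)
    return (sum(window) / n - midpoint) * math.sqrt(n) / sigma

def call_signal(samples, sample_rate_hz, base_ms=100, max_ms=400,
                margin=3.0, force_call=False, score_fn=mean_level_score):
    """Return (label, duration_ms); label is None when the signal is rejected."""
    if not samples:
        raise ValueError("samples must be non-empty")
    duration_ms = base_ms
    while True:
        n = max(1, min(len(samples), int(sample_rate_hz * duration_ms / 1000)))
        score = score_fn(samples[:n])
        label = "bound" if score > 0 else "unbound"
        if abs(score) >= margin:
            return label, duration_ms      # confident call
        if not force_call:
            return None, duration_ms       # reject: weak signal, no call made
        if duration_ms >= max_ms or n == len(samples):
            return label, duration_ms      # forced call on the longest sample
        duration_ms *= 2                   # observe for longer, then retry
```

With a 1 kHz sampling rate, a borderline signal such as `[0.52] * 400` is rejected at 100 ms when `force_call` is False, and is called only after the window has grown when `force_call` is True. In this toy score the confidence grows with the square root of the sample length, which is one simple way a longer sample can support a more reliable forced call.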
Nanopore transduction detection (NTD) offers prospects for highly sensitive and discriminative biosensing. The NTD “Nanoscope” functionalizes a single nanopore with a channel current modulator that is designed to transduce events, such as binding to a specific target. Nanopore event transduction involves single‐molecule biophysics, engineered information flows, and nanopore cheminformatics. In the NTD functionalization, the transducer molecule is drawn into the channel by an applied potential but is too big to translocate; instead it becomes stuck in a bistable capture such that it modulates the channel’s ion flow in a distinctive way, with stationary statistics. If the channel modulator is bifunctional, with one end meant to be captured and modulated and the other end linked to an aptamer or antibody for specific binding, then we have the basis for a remarkably sensitive and specific biosensing capability.
In the NTD Nanoscope experiments [2], the molecular dynamics of a (single) captured non‐translocating transducer molecule provide a unique stochastic reference signal with stable statistics on the observed, single‐molecule blockaded channel current, somewhat analogous to a carrier signal in standard electrical engineering signal analysis. Discernible changes in blockade statistics, coupled with SSA signal processing protocols, enable highly detailed characterization of the interactions of the transducer molecule with binding targets (cognates) in the surrounding (extra‐channel) environment.
The transducer molecule is engineered to generate distinct channel blockade signals depending on its interaction with target molecules [2]. Statistical models are trained for each binding mode (bound and unbound, for example) by exposing the transducer molecule to zero or high (excess) concentrations of the target molecule. The transducer molecule is engineered so that these different binding states generate distinct signals with high resolution. Once the signals are characterized, the information can be used in a real‐time setting to determine whether trace amounts of the target are present in a sample, through a serial process of high‐frequency sampling and pattern recognition.
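The train-then-recognize procedure can be sketched in a few lines. This is a deliberately simplified stand-in, not the models used in the source: it assumes each binding mode can be summarized by a one-dimensional Gaussian over blockade current values, whereas the text relies on HMM-based features. The function names, the mode labels, and the calibration numbers in the usage note are hypothetical.

```python
import math
from statistics import mean, pstdev

def fit_mode(calibration_samples):
    """Fit a 1-D Gaussian (mean, std) to calibration data for one binding mode,
    e.g. recorded at zero target (unbound) or excess target (bound)."""
    return mean(calibration_samples), max(pstdev(calibration_samples), 1e-9)

def log_likelihood(window, model):
    """Log-likelihood of a window of samples under one Gaussian mode model."""
    mu, sd = model
    return sum(-0.5 * ((x - mu) / sd) ** 2 - math.log(sd)
               - 0.5 * math.log(2 * math.pi) for x in window)

def classify(window, models):
    """Return the name of the mode whose model best explains the window."""
    return max(models, key=lambda name: log_likelihood(window, models[name]))
```

For example, with `models = {"unbound": fit_mode([0.30, 0.32, 0.28, 0.31, 0.29]), "bound": fit_mode([0.50, 0.52, 0.48, 0.51, 0.49])}`, the call `classify([0.49, 0.51, 0.50], models)` selects the bound model. Applying `classify` to successive windows of a sampled signal corresponds to the serial sampling and pattern recognition step described above.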
Thus, in Nanoscope applications of the SSA Protocol, the molecular dynamics of the captured transducer molecule are engineered to generate, during transducer blockade, a unique reference signal with strongly (or weakly, or approximately) stationary signal statistics, analogous to a carrier signal in standard electrical engineering signal analysis. In these applications a signal is deemed “strongly” stationary if the EM/EVA projection (HMM method from Chapter 6) on the entire dataset of interest produces a discrete set of separable (non‐fuzzy domain) states. A signal is deemed “weakly” stationary if the EM/EVA projection can only produce a discrete set of states on subsegments (windowed sections) of the data sequence, but where state‐tracking is possible across windows (i.e. the non‐stationarity is slow enough for states to be tracked, similar to the adiabatic criterion in statistical mechanics). A signal is approximately stationary, in a general sense, if it is sufficiently stationary to still benefit, to some extent, from the HMM‐based signal processing tools (that assume stationarity).
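The distinction between strong and weak stationarity can be made concrete with a rough sketch. It does not implement the EM/EVA projection; as a clearly labeled substitute it uses a one-dimensional two-means fit to estimate two blockade levels. The separability threshold `min_sep`, the tracking tolerance `max_drift`, and all function names are hypothetical choices made for illustration.

```python
from statistics import pstdev

def two_level_fit(xs, iters=20):
    """1-D two-means fit (a stand-in for EM/EVA). Returns (low_mean, high_mean,
    separation), where separation is the gap between the two levels divided
    by the larger within-level spread."""
    lo, hi = min(xs), max(xs)
    if lo == hi:
        return lo, hi, 0.0
    low, high = [lo], [hi]
    for _ in range(iters):
        mid = (lo + hi) / 2
        new_low = [x for x in xs if x <= mid]
        new_high = [x for x in xs if x > mid]
        if not new_low or not new_high:
            break
        low, high = new_low, new_high
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    spread = max(pstdev(low), pstdev(high), 1e-12)
    return lo, hi, (hi - lo) / spread

def classify_stationarity(xs, window, min_sep=4.0, max_drift=0.5):
    """'strong': levels separable on the whole record; 'weak': separable only
    per window, with levels drifting slowly enough to track; else 'neither'."""
    if two_level_fit(xs)[2] >= min_sep:
        return "strong"
    fits = [two_level_fit(xs[i:i + window])
            for i in range(0, len(xs) - window + 1, window)]
    separable = bool(fits) and all(sep >= min_sep for _, _, sep in fits)
    # Track states across windows: each level may move by at most a fraction
    # (max_drift) of the gap between levels from one window to the next.
    trackable = all(
        max(abs(b[0] - a[0]), abs(b[1] - a[1])) <= max_drift * (a[1] - a[0])
        for a, b in zip(fits, fits[1:]))
    return "weak" if separable and trackable else "neither"
```

A two-level signal with fixed levels is labeled "strong" by this sketch, while the same signal with a slow baseline drift added loses separability on the full record but keeps it within short windows, and so is labeled "weak". The "approximately stationary" case in the text is defined by downstream usefulness rather than by a state-separability test, so it is not modeled here.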
The adaptive SSA ML algorithms for real‐time analysis of the stochastic signal generated by the transducer molecule can offer a “lock and key” level of signal discrimination. The heart of the signal processing algorithm is a generalized Hidden Markov Model (gHMM)‐based feature extraction method, implemented on a distributed processing platform for real‐time operation. For real‐time processing, the gHMM is used for feature extraction on stochastic sequential data, while classification and clustering analysis are implemented using an SVM. In addition, the design of the ML‐based algorithms allows for scaling to large datasets via real‐time distributed processing, and is adaptable to analysis of any stochastic sequential dataset. The ML software has also been integrated into the NTD Nanoscope [2] for “real‐time” pattern‐recognition informed (PRI) feedback [1–3] (see Chapter 14 for results). The methods used to implement the PRI feedback include distributed HMM and SVM implementations, which provide the processing speedup that is needed.
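The feature extraction stage can be illustrated with a small sketch. It uses a plain HMM with Gaussian emissions and Viterbi decoding, not the generalized HMM of the text, and it runs on a single process rather than a distributed platform. The shared emission width `sigma`, the symmetric `stay_prob` transition model, the requirement of at least two states, and the particular features (state occupancy and transition rate) are assumptions chosen for illustration. The resulting feature vector is the kind of fixed-length input that could then be passed to an SVM classifier, which is not shown.

```python
import math

def viterbi(obs, means, sigma, stay_prob):
    """Most likely state path for an HMM with Gaussian emissions.
    Assumes two or more states, a shared sigma, and a symmetric transition
    model (stay with stay_prob, otherwise move uniformly to another state)."""
    k = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1 - stay_prob) / (k - 1))

    def log_emit(x, s):
        # Gaussian log-density up to a constant shared by all states
        return -((x - means[s]) ** 2) / (2 * sigma ** 2)

    score = [log_emit(obs[0], s) - math.log(k) for s in range(k)]
    back = []
    for x in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(k):
            best = max(range(k), key=lambda r: prev[r]
                       + (log_stay if r == s else log_move))
            ptr.append(best)
            score.append(prev[best] + (log_stay if best == s else log_move)
                         + log_emit(x, s))
        back.append(ptr)
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):          # follow back-pointers to the start
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def path_features(path, k):
    """Fixed-length feature vector: per-state occupancy plus transition rate."""
    n = len(path)
    occupancy = [path.count(s) / n for s in range(k)]
    transitions = sum(1 for a, b in zip(path, path[1:]) if a != b)
    return occupancy + [transitions / max(n - 1, 1)]
```

For a blockade trace that sits at one level, switches to a second level, and returns, `viterbi` recovers the three dwell segments and `path_features` summarizes them as occupancy fractions and a transition rate. Because each trace is decoded independently, batches of traces could be processed in parallel, which is one plausible route to the distributed speedup mentioned above.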