1.10 Deep Learning using Neural Nets
ML provides a solution to the “Big Data” problem, whereby a vast amount of data is distilled down to its information essence. The ML solution sought is usually required to perform some task on the raw data, such as classification (of images) or translation of text from one language to another. In doing so, ML solutions are strongly favored where a clear elucidation of the features used in the classification is also revealed. This then allows a more standard engineering design cycle to be accessed, where the stronger features thereby identified may play a stronger role, or guide the refinement of related strong features, to arrive at an improved classifier. This is what is accomplished with the previously mentioned SSA Protocol.
So, given the flexibility of the SSA Protocol to “latch on” to signal that has a reasonable set of features, you might ask what is left? (Note that all communication protocols, both natural (genomic) and man‐made, have a “reasonable” set of features.) The answer is, simply, the case when the number of features is “unreasonable” (with their enumeration typically not even known). So instead of 100 features, or maybe 1000, we now have a situation with 100 000 to 100s of millions of features (such as with sentence translation or complex image classification). Obviously Big Data is necessary to learn with such a huge number of features present, so we are truly in the realm of Big Data to even begin with such problems, but now have the Big Features issue (e.g. Big Data with Big Features, or BDwBF). What must occur in such problems is a means to wrangle the almost intractably large feature set of information down to a much smaller feature set of information, e.g. an initial layer of processing is needed just to compress the feature data. In essence, we need a form of compressive feature extraction at the outset in order to not overwhelm the acquisition process. An example from the biology of the human eye is the layer of local neural processing at the retina before the nerve impulses even travel on to the brain for further layers of neural processing.
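As a rough illustration of such a compressive front‐end (a minimal sketch only; the layer sizes, names, and synthetic data below are assumed for illustration and are not drawn from the SSA Protocol or from Chapter 13), an autoencoder's first layer can be trained to distill a very large raw feature vector into a much smaller learned representation that downstream layers then consume:

# Minimal sketch of compressive feature extraction via an autoencoder bottleneck.
import numpy as np
import tensorflow as tf

n_raw = 10_000      # dimensionality of the raw ("Big") feature vector (assumed)
n_compressed = 128  # size of the compressed feature representation (assumed)

# Encoder: the compressive front-end layer.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(n_raw,)),
    tf.keras.layers.Dense(n_compressed, activation="relu", name="compressive_layer"),
])
# Decoder: reconstructs the raw input from the compressed features,
# forcing the encoder to retain the information essence of the data.
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(n_raw, activation="linear"),
])
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")

# Synthetic stand-in for the raw Big Data input.
x = np.random.rand(256, n_raw).astype("float32")
autoencoder.fit(x, x, epochs=1, batch_size=32, verbose=0)

# The trained encoder alone maps raw inputs to the compressed feature set
# consumed by the downstream classification (or translation) layers.
compressed = encoder.predict(x, verbose=0)
print(compressed.shape)  # (256, 128)

The bottleneck size (n_compressed here) is itself a design choice, and tuning it is part of the architecture search over compressive front‐ends discussed next.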
For translation we have a BDwBF problem. The feature set is so complex that the best approach is NN Deep Learning, where we assume no knowledge of the features but rediscover/capture those features in compressed feature groups that are identified, in the NN learning process, at the first layer of the NN architecture. This begins a process of tuning over NN architectures to arrive at a compressive feature acquisition with strong classification performance (or translation accuracy, in this example). This learning approach began seeing widespread application in 2006 and is now the core method for handling the Big Feature Set (BFS) problem. The BFS problem may or may not exist at the initial acquisition (“front‐end”) of your signal processing chain. NN Deep Learning to solve the BFS problem will be described in detail in Chapter 13, where examples using a Python/TensorFlow application to translation will be given. In the NN Deep Learning approach, the features are not explicitly resolvable, so improvements are initially brute force (even bigger data), since an engineering cycle refinement would involve the enormous parallel task of explicitly resolving the feature data to know what to refine.
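As a conceptual preview of that Chapter 13 material (a minimal sketch with assumed vocabulary sizes, sequence lengths, and synthetic data, not the book's actual TensorFlow example), the compressive first layer for translation is typically an embedding that maps the enormous one‐hot vocabulary feature space into a small dense vector space, on top of which an encoder–decoder network learns the translation mapping:

# Minimal sketch of an embedding-based encoder-decoder translation model.
import numpy as np
import tensorflow as tf

src_vocab, tgt_vocab = 8000, 8000   # vocabulary sizes (assumed)
embed_dim, units = 64, 128          # compressed feature size / hidden size (assumed)
src_len, tgt_len = 12, 12           # fixed sequence lengths for the sketch

# Encoder: compressive embedding followed by an LSTM summarizing the source sentence.
enc_in = tf.keras.Input(shape=(src_len,))
enc_emb = tf.keras.layers.Embedding(src_vocab, embed_dim)(enc_in)
_, h, c = tf.keras.layers.LSTM(units, return_state=True)(enc_emb)

# Decoder: generates target-language tokens conditioned on the encoder state.
dec_in = tf.keras.Input(shape=(tgt_len,))
dec_emb = tf.keras.layers.Embedding(tgt_vocab, embed_dim)(dec_in)
dec_out = tf.keras.layers.LSTM(units, return_sequences=True)(dec_emb, initial_state=[h, c])
logits = tf.keras.layers.Dense(tgt_vocab)(dec_out)

model = tf.keras.Model([enc_in, dec_in], logits)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Synthetic token data standing in for a parallel corpus.
src = np.random.randint(1, src_vocab, size=(64, src_len))
tgt_in = np.random.randint(1, tgt_vocab, size=(64, tgt_len))
tgt_out = np.random.randint(1, tgt_vocab, size=(64, tgt_len))
model.fit([src, tgt_in], tgt_out, epochs=1, batch_size=16, verbose=0)

Note that the learned embedding and recurrent weights play the role of the compressed feature groups: they are effective for the task but are not explicitly resolvable as named features, which is why refinement in this setting tends to proceed by larger data and architecture tuning rather than by the feature‐level engineering cycle used with the SSA Protocol.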