How effective is today's deep learning?


There are various myths surrounding artificial intelligence (AI). While some conjure up horror scenarios in which machines seize control over humanity, others hope for a paradise on earth where intelligent machines take over all the work and people do only what they enjoy.

A scientist at the German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern offers a more sober assessment:

"AI systems lack awareness. For example, today they translate better in real time than most simultaneous interpreters, but have no idea what language they are dealing with, what the text means and what effect it has on the reader." Human intelligence, roughly divided into the areas of perception, thinking, knowledge and learning / training, is far from being reached in its complexity and performance by AI systems, is the conclusion of the scientist.

The terms AI, machine learning (ML) and deep learning (DL) are often mixed up and used synonymously, but there are differences. Roughly speaking, machine learning is a subset of AI, and deep learning is a subset of machine learning. Artificial intelligence refers to machines taking on tasks that imitate human intelligence - planning, learning and problem-solving, for example, but also understanding language or recognizing objects, images and sounds.

A rough distinction is made between two types of AI: Artificial General Intelligence (AGI) corresponds to a machine alternative to human intelligence - with all its senses and abilities. Such systems are often referred to as cognitive systems. Artificial Narrow Intelligence (ANI) covers only certain aspects of human perception and intelligence, for example the recognition of images or faces, and is usually designed to solve specific tasks. Most of the AI applications currently used in companies fall into the ANI category; AGI scenarios, on the other hand, still belong largely to the realm of science fiction.

Machine learning and deep learning - the tools for AI

When machines can learn and perform certain tasks with the help of algorithms without having been explicitly programmed for them, this is known as machine learning. The underlying models are the key. Classic statistical models from the business intelligence era require a mathematical equation and a clear understanding of the variables involved.

Creating such statistical models takes a comparatively large amount of time and data work. Once the relationships are identified and the equation is established, they can indeed provide valuable insights for a business. But they have one disadvantage: if the data basis changes, the statistical model has to be rebuilt, at least in part or even completely.


Machine learning models, on the other hand, do not require rigid rules. Put simply, they observe inputs and outputs and derive their own correlations and equations from them. Such models can be developed comparatively quickly and with little effort; after all, there is no need to encode reliable data correlations into the model up front - the model is supposed to find them out for itself.

The disadvantage: the results such an ML model spits out at the beginning are imprecise, sometimes difficult to interpret and therefore not necessarily useful for the business. In return, several different models can be developed and tested in parallel with comparatively little effort. Another advantage is that ML models simply have to be retrained if the data basis changes.
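
To make the contrast concrete, here is a minimal Python sketch, assuming scikit-learn as the toolkit; the numbers are invented for illustration. The statistical approach hard-codes the equation, while the ML approach derives it from observed inputs and outputs.

```python
# Invented example: predicting revenue from advertising spend.
import numpy as np
from sklearn.linear_model import LinearRegression

# Classic statistical approach: the analyst fixes the equation up front,
# e.g. revenue = 50 * ad_spend + 1000, and only plugs in values.
def revenue_by_formula(ad_spend):
    return 50 * ad_spend + 1000

# ML approach: the model observes inputs and outputs and derives the
# relationship itself.
ad_spend = np.array([[1.0], [2.0], [3.0], [4.0]])     # inputs
revenue = np.array([1050.0, 1100.0, 1150.0, 1200.0])  # observed outputs

model = LinearRegression().fit(ad_spend, revenue)
print(model.coef_, model.intercept_)  # ~[50.] 1000.0 - learned, not hard-coded

# If the data basis changes, the model is simply refitted instead of
# re-deriving a formula by hand:
model.fit(ad_spend, revenue * 2)
```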

An example: as schoolchildren we learned what handwritten digits look like - in all kinds of variations, from teachers, parents, other children and so on. An oval, circular shape is a 0; a vertical line with a shorter stroke attached at the top left is a 1; and so on. What is relatively easy for our abstraction-capable brain can be enormously difficult for a machine.

The myriad variations in how digits can be written make it nearly impossible to write a dedicated program that reliably identifies handwritten numbers. If, for example, the circle of a zero is not completely closed, or the side stroke of a 1 is almost horizontal, the rules of a conventional software program break down.

Machine learning models, by contrast, are shown digits in the most varied writing styles and are told what to look out for. The number of lines as well as the number and position of intersection points can be important clues for the algorithm. Unimportant information includes, for example, the color or size of the digits.

If you feed the ML model with the relevant features to pay attention to, plus a sufficiently large number of examples, the machine can learn to recognize handwritten digits. There is one restriction, however: the model is only as good as the features provided - that is, as good as the person's ability to identify the defining characteristics of the object to be recognized.
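
A minimal sketch of this idea in Python, assuming scikit-learn and its bundled 8x8 digit images; the features here are simply the pixel intensities, chosen and supplied by the human rather than discovered by the model:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()             # 8x8 grayscale images of digits 0-9
X, y = digits.data, digits.target  # pixel values act as the provided features

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)          # learn from labeled examples
print("accuracy:", clf.score(X_test, y_test))
```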

Probably every one of us has helped with this at some point: when a website asks us to mark pictures containing traffic lights, shops or animals in order to determine whether a human or a machine is making the request (a CAPTCHA), an algorithm is being trained in the background.

Deep learning - learning without rules

Deep learning is a special type of machine learning in which so-called neural networks come into play. These give the algorithms even more freedom: whereas ML algorithms must be told explicitly how the data is structured and what to pay attention to, DL algorithms can be let loose on all kinds of data without any predefined set of rules - at least in theory.

Deep learning models promise that the underlying algorithms and neural networks can do without a prior definition of the relevant characteristics of the objects to be recognized. The models are trained with generally available data. The algorithm learns whether it was right or wrong and, based on this, defines its own criteria that are relevant, from the model's point of view, for correct recognition. The disadvantage of this method: training deep learning models takes a lot of data, a lot of time and a lot of computing capacity.
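
As a hedged illustration, the following Keras sketch (TensorFlow assumed to be installed) trains a small neural network on raw handwritten-digit pixels; unlike the ML example above, no one tells it which features such as lines or intersections matter:

```python
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0  # raw pixels, no hand-crafted features

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),    # hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),  # one output per digit
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Lots of data and compute: the model derives its own criteria in training.
model.fit(x_train, y_train, epochs=3)
```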

Models can abstract what they have learned

The basic idea behind ML and DL is that the models learn from data, generalize what they have learned and, ideally, can apply it to other, previously unknown data. Models consist of different components: data, algorithms and so-called hyperparameters - higher-level settings that describe and steer the learning process itself.

An example: to learn to play the piano, you need sheet music as well as information on certain styles of music and composition - that is the data. The algorithm states how hands and fingers should strike the keys in relation to the notes and other specifications such as the beat. Hyperparameters are practice intervals and duration, place and time of practice, type of piano, and so on. Put all this together and you get a piano-learning model. If it is trained sufficiently, it can be assumed that it will also be able to play previously untrained, unknown pieces of music.
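
The split into data, algorithm and hyperparameters can also be made concrete in code. A minimal sketch, assuming scikit-learn; the choice of classifier and parameter grid is purely illustrative:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)  # the data

algorithm = KNeighborsClassifier()   # the algorithm

# Hyperparameters: higher-level knobs of the learning process itself,
# not learned from the data but set (or searched) from outside.
hyperparams = {"n_neighbors": [1, 3, 5, 9]}

search = GridSearchCV(algorithm, hyperparams, cv=5)
search.fit(X, y)                     # train one model per setting
print(search.best_params_)           # the setting that worked best
```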

Machines also learn differently

Like humans, machines learn in different ways. There are various approaches to how ML models can be developed:

  • Supervised: In so-called supervised learning, a 'teacher' tells the algorithm whether it was right or wrong. The aim is to predict a certain development, such as the churn of a subscriber, or to recognize individual objects, such as handwriting. To do this, the algorithm is trained with parameters and data until the model achieves the desired performance (see the code sketch after this list).

  • Unsupervised: In unsupervised training, the algorithm receives no information as to whether it is right or wrong. It is left to the machine to independently establish correlations and relationships in the data. In the end, however, an analyst has to decide whether the results make sense and help the business move forward. Unsupervised learning usually comes into play when the answers are not yet known (unlike supervised learning, where they are: the customer cancels - which parameters indicated this in advance?). A typical area of application is the segmentation of customer groups, which can then be addressed with specific advertising messages and products.
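
The difference between the two approaches can be shown in a few lines. A minimal sketch on made-up customer data (scikit-learn assumed; the churn and segmentation framing is illustrative):

```python
# Invented customer data: [monthly logins, monthly spend].
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

X = np.array([[1, 20], [2, 25], [8, 80], [9, 90]])

# Supervised: a "teacher" supplies the known answers (1 = churned, 0 = stayed).
y = np.array([1, 1, 0, 0])
clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[2, 22]]))  # predict churn for a new customer

# Unsupervised: no answers are given; the algorithm segments the customers
# itself, and an analyst must judge whether the segments make business sense.
segments = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(segments)                # e.g. [0 0 1 1] - two customer groups
```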

The advantage of unsupervised learning is that the models are created practically by themselves; manual intervention is not necessary. In addition, this approach often yields surprising new insights into data that may open up new business options for companies. With this approach, however, it is difficult to assess whether the model is working properly, and various effects need to be watched for.

Sometimes a model works very well for a certain category of data - because it has been trained long and intensively on it - but cannot cope with new types of data added later. In this case, one speaks of overfitting. Underfitting occurs when too little data is available and the model produces only imprecise classifications. Such traits are sometimes difficult to recognize, so assessing and testing how well unsupervised models work can be time-consuming.
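
In practice, overfitting is usually spotted by holding data back. A minimal sketch, assuming scikit-learn: a large gap between training and test performance is the warning sign.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)
print("train:", model.score(X_train, y_train))  # near-perfect: memorized
print("test: ", model.score(X_test, y_test))    # noticeably lower: the gap
                                                # signals overfitting
```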

A supervised model, in contrast, is transparent and comprehensible: the data is structured, the result is clear, and the interpretation effort remains low. But supervised learning requires a lot of effort to prepare the required data and train the model.


There are also other learning approaches. In semi-supervised learning, the algorithm receives some information about the data and its structure, but the model then has to work things out largely on its own. In reinforcement learning, the algorithm receives feedback on individual steps as to whether they were right or wrong. Here, too, the answer is known, but there is not enough labeled data for a 'teacher' to accompany the algorithm through the entire learning process. This reinforcement-based method comes closest to human learning.
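
The step-by-step feedback of reinforcement learning can be illustrated with a tiny Q-learning toy - entirely illustrative, not a production setup: an agent on positions 0 to 4 learns to walk right towards a reward.

```python
import random

n_states, actions = 5, [-1, +1]  # the agent can move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

for _ in range(500):
    s = 0
    while s != 4:
        if random.random() < 0.2:
            a = random.choice(actions)                     # explore
        else:
            a = max(actions, key=lambda act: Q[(s, act)])  # exploit
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == 4 else 0.0  # feedback on this single step
        best_next = max(Q[(s_next, act)] for act in actions)
        Q[(s, a)] += 0.1 * (reward + 0.9 * best_next - Q[(s, a)])
        s = s_next

# After training, the learned policy in every state is "move right" (+1).
print([max(actions, key=lambda act: Q[(s, act)]) for s in range(4)])
```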

With active learning, the algorithm is given the opportunity to ask for the correct answers for some of the inputs; however, the machine has to decide for itself which questions promise the greatest information gain. Transfer learning means trying out an existing model in a different area of application with different data, which can save time and effort in developing a suitable model. To return to the piano-playing model: it could serve as the basis for a model that learns the accordion. Knowledge of musical notation is already available as data, as is the ability to use the fingers on a keyboard; the handling of the bass buttons and the bellows has to be learned from scratch.
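
A hedged sketch of transfer learning in Keras (TensorFlow assumed): a pretrained image network plays the role of the piano model, and only a small new "accordion" head is trained on top of it.

```python
import tensorflow as tf

# Reuse a network pretrained on ImageNet as the frozen base.
base = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg",
                                         input_shape=(96, 96, 3))
base.trainable = False  # keep the already-learned representations

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),  # new task: 5 classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(new_task_images, new_task_labels)  # only the new head is trained
```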

Deep learning mimics the human brain

Deep learning takes a slightly different approach than classic ML methods. The basis here are so-called neural networks, which are modeled on the functioning of the human brain - although the actual processes in the brain are far more complex than a neural network could ever map or imitate. The basic principle, however, is similar: in the brain, neurons are connected to one another via synapses, and these links grow stronger or weaker depending on activity. Individual neurons receive signals, evaluate and process them, and pass a response signal on to other neurons.

In an artificial neural network, too, individual computing units (neurons) are networked with one another in order to process information as intelligently as possible - in theory, at least. In practice, deep learning architectures consist of several layers of neurons: an input layer, several hidden layers in which the information is processed, and an output layer. The actual deep learning takes place in the hidden intermediate layers; they form the core of deep learning models.

New information arises between the individual hidden layers: different representations of the original input, for example a face in an image. One therefore also speaks of representation learning. Each representation is an abstraction of the preceding input signal, so that the successive layers filter increasingly simplified versions out of an originally complex input. To stay with the face example: the picture is reduced, step by step, to certain shapes, lines and colors.

In this way, the algorithm independently learns which characteristics are important and can correctly classify new input data using this simplified, generalized information. Training a deep learning model is mainly about adjusting the weights that determine how certain features are assessed, so that the error rate becomes lower and lower.
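
The weight adjustment at the heart of training can be shown without any framework. A minimal sketch with one neuron and invented data: each step nudges the weight against the error gradient, so the error keeps shrinking.

```python
inputs  = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]  # the hidden true relation: y = 2 * x
w = 0.0                    # the weight to be adjusted

for step in range(50):
    # Gradient of the mean squared error with respect to w.
    grad = sum(2 * (w * x - t) * x
               for x, t in zip(inputs, targets)) / len(inputs)
    w -= 0.05 * grad       # step against the error gradient
print(w)                   # approaches 2.0 - the error rate keeps falling
```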

Deep learning models - the more layers, the more complex

The basic structure of neural networks is always the same, but users can certainly specify how complex a model should be. This depends on the number of hidden layers, the number of neurons in those layers and their activation function, i.e. how they connect to the neurons in the next layer. Depending on the form, there are different types of neural networks, each geared towards particular purposes:

  • The simplest form of neural network is the Feedforward Neural Network (FNN). It consists of an input layer, one or more hidden layers and an output layer. Information flows in one direction, from input to output; there are no feedback loops or cycles.

  • Convolutional Neural Networks (CNNs) are particularly suitable for analyzing spatially arranged data such as images or videos. Different types of layers are used here: convolutional layers scan regions of the input with particular filters, for example for color or brightness. Subsequently, so-called pooling layers discard superfluous information and thus compress the amount of data to be processed, which reduces memory requirements and increases processing speed.

  • In Recurrent Neural Networks (RNNs), the neurons are organized in closed loops, meaning that outputs are fed back into the same neurons as inputs. This architecture is therefore particularly suitable for processing information that follows a temporal sequence, such as time series and language.

  • Generative Adversarial Networks (GANs) consist of two competing networks: a generator that produces artificial data and a discriminator that tries to distinguish it from real data. Trained against each other, the generator learns to produce increasingly realistic output. A minimal sketch of these architectures follows below.
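
A hedged Keras sketch (TensorFlow assumed) of the first three architecture types; the shapes are illustrative and the models are untrained:

```python
import tensorflow as tf
from tensorflow.keras import layers

# FNN: information flows in one direction, no cycles.
fnn = tf.keras.Sequential([
    layers.Dense(64, activation="relu", input_shape=(20,)),
    layers.Dense(1),
])

# CNN: convolutional layers filter local image regions, pooling compresses.
cnn = tf.keras.Sequential([
    layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),  # discards superfluous detail
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])

# RNN: outputs are fed back, suited to sequences such as text or time series.
rnn = tf.keras.Sequential([
    layers.SimpleRNN(32, input_shape=(None, 8)),
    layers.Dense(1),
])

# A GAN would pair two such networks - a generator and a discriminator -
# and train them against each other.
```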