Variations of SVM

The use of artificial intelligence to evaluate measurement data from intelligent pigs in pipelines

Konrad Reber, Herbert Willems, Alfred Otto Barbian, NDT Systems & Services AG, Stutensee
Marius Zoellner, Marco Ziegenmeyer, Research Center for Computer Science, IDS Group, Karlsruhe
Contact: Dr.-Ing. Konrad Reber


With the increasing automation of NDT measurements and the associated need to search large amounts of data quickly for indications, the need for computer-aided expert knowledge is growing. For some time now, methods of artificial intelligence have been used to represent the expert knowledge of experienced employees. Examples that have been assessed manually beforehand are collected and then used for training. At NDT Systems & Services AG there is a need to evaluate inspection data from pipelines recorded with intelligent pigs as quickly as possible. Here the problem is exacerbated by the fact that the data become available all at once, yet the entire data set must be screened for significant pipeline defects within a short period of time.
As a rule, the problem is solved by first searching the data for defect candidates. In a second step, the defect candidates are separated into relevant defect indications and irrelevant signals with the help of artificial intelligence. The performance of the system is therefore measured by how reliably it suppresses irrelevant indications. The interaction between defect search and classification must be taken into account.
In the past, neural networks were often used for the classification task. This article presents results that show how the use of so-called support vector machines (SVM) can facilitate handling and subsequent use. A comparison of SVM with neural networks is presented. The topic of retraining, i.e. the expansion of expert knowledge, is discussed, especially with regard to the selection of learning examples and their archiving. How can the scope of the learning examples be reduced without the knowledge decreasing again? How can it be shown that knowledge really increases with relearning and in what way does it do that?

1. Introduction

For the reliability of non-destructive testing, a distinction should be made between the two areas of data recording and data interpretation. While improvements on the side of data recording are primarily based on technological advances in hardware, improvements in the area of data interpretation are also largely characterized by the influence of human factors.
Although this human factor can hardly be neglected in the future, it should be placed on a basis of higher reproducibility through the use of so-called artificial intelligence. The decision-making in the interpretation becomes comprehensible and also predictable.
The use of artificial intelligence is often viewed with mixed feelings. The risk of human work becoming superfluous or at least less important often leads to a lack of acceptance. The term "artificial intelligence" certainly contributes to this. If it is viewed more pragmatically, it is actually a progressive tool and a standardization of decision-making processes in which a rule-based decision is not possible.
Artificial intelligence was introduced to the field of non-destructive testing some time ago. The degree of automation can be further increased, particularly in the area of automated processes in production. The aim of this article is to describe an area where the amount of data is so large that only the automation of the processes can lead to a reasonable data evaluation time.
The inspection of pipelines is now usually carried out with so-called intelligent pigs. These are inspection robots that are carried through the pipe by the medium and perform various types of tests. We focus here on ultrasonic inspection for corrosion. The intelligent pigs collect measurement data autonomously over several days and store them on board on mass storage devices [There are also cable-operated pigs that send their measurement data through a cable to a measuring station outside the pipeline. Here we limit ourselves to free-swimming pigs]. When the tool is retrieved, a large amount of data suddenly becomes available for evaluation. Nevertheless, a quick and reliable evaluation of the results is required. Because pipeline operators want a quick overview of the condition of the pipeline, it has become customary to produce a preliminary report listing the most dangerous defects. The full report is then delivered a few weeks later. However, there is a contradiction in identifying the most dangerous defects immediately without having evaluated all the recorded signals. A dangerous defect does not necessarily stand out through a high signal amplitude. As long as a signal has not been evaluated, it cannot be ruled out that it belongs to the set of dangerous defects. Therefore, most of the evaluation has to be carried out for the preliminary report already.
The need for help in quickly analyzing the large amount of data is therefore not diminished by the possibility of a preliminary report. Rather, this help is needed at an early stage.
The first solutions in this context using artificial intelligence can be found in [1]. In that work, neural networks are used. In a final step, the signals found are classified. It is typical of ultrasonic inspection that classification is the final step in the evaluation. The actual size of the defect (e.g. the residual wall thickness in the case of corrosion) is not a matter of interpretation in ultrasonic testing, in contrast to magnetic flux leakage testing. In ultrasonic testing, however, it is essential to distinguish corrosion from other types of defects.

2. The advantage of automatic signal classification

The automation of the evaluation can be divided into different steps. As soon as the entire amount of data is available, the first task is to create a pipe book, that is, a list of all girth welds. The second step consists of a search for defect candidates. Areas in which the measured wall thickness deviates from the nominal value are marked by creating a rectangle (box) that describes the position in the pipe. This process, also known as boxing, is carried out by practically all operators of intelligent pigs. In the language of pattern recognition, this is a segmentation: areas of no interest are separated from areas of interest.
The next step in automation is classification. In this step, the indications in the boxes are divided into different classes. Obviously, the quality of this step also depends largely on the success of the preceding boxing step. So there is an important interaction between the two processes. The classification depends on the generated boxes being suitable for classification. Boxing, on the other hand, can pay less attention to the suppression of irrelevant signals, because this is then carried out in the classification step. If the suppression of irrelevant signals in boxing is already very successful, the classification step saves little time and can possibly be omitted.
If the density of the signals is very high (e.g. because there is a lot of corrosion), the automatic system can no longer contribute much to acceleration because all relevant signals are checked manually anyway.
Errors occur in both steps of the evaluation. For the following considerations, the errors in the classification step are examined further. An error of the first type is that a defect that is actually relevant is mistakenly regarded as irrelevant and is therefore discarded. The defect is then missing from the final report. An error of the second type is that a signal that is actually irrelevant is mistaken for a defect. Only after verification does it turn out that there is no defect and that the excavation was unnecessary. Obviously, both errors can hardly be minimized at the same time. Errors of the first type are to be avoided in any case, while errors of the second type only reduce the efficiency of the inspection.

3. The advantages of support vector machines

Support vector machines belong to a relatively new family of kernel methods that combine the simplicity and efficiency of linear algorithms such as the perceptron algorithm with the flexibility of nonlinear systems such as neural networks and the rigor of statistical considerations. Because the learning step is reduced to a convex optimization problem, which can always be solved in polynomial time, the problem of local minima, which is typical for neural networks, decision trees and other nonlinear approaches, is avoided. Therefore, training of support vector machines is deterministic, and relearning is faster and easier. Furthermore, due to their origins in the principles of statistical learning theory, they are surprisingly insensitive to overtraining, especially under circumstances where other methods are impaired by the "curse of dimensionality". [For a target function that is to be estimated to a given accuracy, the amount of data required for the estimate increases exponentially with the dimension of the data.]
The basic idea of the kernel method is to first embed the data in a suitable vector space and then to identify relevant patterns in the resulting set using simple linear methods. If the embedding map is not linear, nonlinear relationships can also be recognized with linear algorithms. This mapping alone does not solve the problem, but it can be used very effectively in conjunction with the following two observations.

  • The support vector algorithm only needs the information about the relative positions of the data vectors in the embedding space, which is given by their inner products.
  • The inner products of the data vectors in the higher-dimensional embedding space can be calculated directly from the input data using the so-called kernel function.
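The second observation can be illustrated with a small sketch. It is not taken from the implementation described in this article; the homogeneous quadratic kernel and the explicit embedding `phi` are chosen here only as an assumed example, because for them the embedding can be written down by hand. The scalar product of the embedded vectors and the kernel value computed directly from the input data agree.

```python
import math


def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit embedding of a 2-D vector into 3-D space.

    Assumed example: this is the feature map that belongs to the
    homogeneous quadratic kernel k(x, y) = (x . y)^2.
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def quadratic_kernel(x, y):
    """Kernel value computed from the input vectors only."""
    return dot(x, y) ** 2


# Two hypothetical input vectors.
x = (1.0, 2.0)
y = (3.0, -1.0)

explicit = dot(phi(x), phi(y))     # scalar product in the embedding space
implicit = quadratic_kernel(x, y)  # same quantity without computing phi

print(explicit, implicit)
```

For kernels such as the Gaussian kernel the embedding space is infinite-dimensional, so only the kernel route is practical; the quadratic case is used here because both sides can be checked directly.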

Support vector classification offers an efficient way to find good separating hyperplanes in a high-dimensional vector space, where "good" stands for an optimal generalization bound and "efficient" for the ability to handle data sets of hundreds of thousands of vectors. Because generalization theory gives clear guidance for controlling the capacity of the classifier, overtraining can be prevented by controlling the margin of the hyperplanes. The actual value of the resulting decision function is called the activation. It is a measure of the distance between the projected data vector and the separating hyperplane. It can therefore be regarded as a quality measure for the classification result. A lower value indicates an uncertain classification, a higher value indicates a higher degree of certainty.
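The relation between activation and distance can be sketched for the simplest case of a linear decision function f(x) = w . x + b. The weight vector `w`, the offset `b` and the two test vectors are hypothetical values chosen for illustration; in a kernel SVM the activation is computed from the support vectors and the kernel function instead, and whether the implementation described here normalizes the activation is not stated.

```python
import math


def activation(w, b, x):
    """Value of the linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def signed_distance(w, b, x):
    """Geometric distance of x from the hyperplane w . x + b = 0.

    The sign tells on which side of the hyperplane x lies.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return activation(w, b, x) / norm


# Hypothetical hyperplane and feature vectors.
w = (3.0, 4.0)
b = -5.0
near = (1.0, 1.0)  # close to the hyperplane: uncertain classification
far = (3.0, 4.0)   # far from the hyperplane: more certain classification

for x in (near, far):
    print(x, activation(w, b, x), signed_distance(w, b, x))
```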

4. The implementation and quality tests

In the actual implementation, classes for metal loss, lamination, dent, installation and inclusion were introduced. In addition, there is a class for indications that cannot be clearly assigned (ambiguous). Finally, there is a class for indications that are not caused by a defect but represent meaningless signals and are therefore irrelevant.
The process by which a defect candidate is sorted into the class "ambiguous" is important for the concept. For all implementations of artificial intelligence, just as in real life, there are decisions that cannot be made clearly. In the ultrasonic testing of pipelines there are often cases of doubt in which, for example, it cannot be clearly stated whether an indication is an inclusion or a lamination. Laminations that are short or interrupted can just as easily be viewed as inclusions. Likewise, metal losses that are very shallow are probably not corrosion but natural variations in wall thickness and are therefore irrelevant. Some line must be drawn, and it is necessarily arbitrary.
With SVMs, this ambiguity becomes apparent in various ways. If the learning process is not yet complete, there are vectors that do not generate an activation for any of the classes. In this case, as well as in the case that all activations are below a defined threshold, the class "ambiguous" is assigned to the feature vector. This assignment reflects the incompleteness of learning, which of course can still occur after some learning time. When the learning is well advanced, several classes can be activated at once. In such a case, one could think of rules that dictate which class is finally entered. It seemed clearer, however, to classify these cases as undecidable as well. These indications will be classified by hand later anyway. There are no "ambiguous" learning examples.
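The assignment rule described above can be written down as a minimal sketch. The function name `assign_class`, the dictionary of activations per class and the threshold value of 0.0 are assumptions made for illustration; the actual threshold used in the implementation is not given in this article.

```python
def assign_class(activations, threshold=0.0):
    """Return the class label for one feature vector.

    activations: dict mapping class name -> activation of that class's SVM.
    A class counts as activated if its activation exceeds the threshold
    (the threshold value is an assumption of this sketch).
    """
    active = [name for name, value in activations.items() if value > threshold]
    if len(active) == 1:
        return active[0]
    # No class activated, or several classes activated at once:
    # both cases are treated as undecidable.
    return "ambiguous"


# Hypothetical activations for three defect candidates.
clear_case = {"metal loss": 1.3, "lamination": -0.8, "irrelevant": -1.1}
no_activation = {"metal loss": -0.2, "lamination": -0.4, "irrelevant": -0.1}
two_classes = {"metal loss": -0.9, "lamination": 0.6, "inclusion": 0.4}

for case in (clear_case, no_activation, two_classes):
    print(assign_class(case))
```

Treating multiply activated vectors as "ambiguous" avoids a separate set of tie-breaking rules, at the cost of more indications that have to be classified by hand.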

Quality check

In order to be able to check the quality of the classification as training progresses, the quality parameters must first be defined. The errors of the first and second type mentioned above can of course be used for this purpose. With the existing implementation, however, other values can also be considered. In order to compare the classification results with the correct results of an independent validation, the so-called confusion matrix is used. In this matrix, the number of defect candidates that have been assigned to a class is compared with the number of those actually belonging to this class. A set of boxes from a specific inspection that has been classified by hand is chosen as the validation set. This classification is assumed to be correct. The table below is the associated confusion matrix that resulted after classification with an SVM. For example, one can see that the SVM has classified 32 indications as lamination. Of these, 8 are really laminations, while 24 are irrelevant and must be discarded. All type 2 errors are highlighted with a blue frame. Together there are 1007 such errors out of a total of 15777 indications. Type 1 errors are indicated by a red frame. There are seven in total.

Other quality parameters are the number of boxes that have to be checked again and the number of indications that have been classified correctly or incorrectly. The number of indications that have to be checked again is given by the number of entries that have not been classified as irrelevant. The number of indications that have been correctly classified is given by the trace of the confusion matrix. For these indications, the classification result agrees with the manual reference classification.
Five SVMs were trained, each with an extended learning data set. Each learning data set is therefore a subset of the following one. The learning data was taken from six different data sets. They were chosen to maximize the variation within the set. The first data set consists of 1635 examples, the last one of 8273 examples. Several aspects are checked as training progresses.
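The quality parameters named above can be computed from pairs of manual and automatic class labels, as the following sketch shows. The label lists are hypothetical toy data and not results from this study. One assumption is made explicit in the code: an irrelevant signal that the classifier marks as "ambiguous" is not counted as a type 2 error here, because it has not been assigned to a defect class; it still counts among the indications to be checked again.

```python
from collections import Counter

IRRELEVANT = "irrelevant"
AMBIGUOUS = "ambiguous"


def quality_parameters(manual, automatic):
    """Derive the quality parameters from two parallel lists of labels."""
    # Confusion matrix as a mapping (manual class, automatic class) -> count.
    confusion = Counter(zip(manual, automatic))
    type1 = type2 = to_check = correct = 0
    for (actual, predicted), n in confusion.items():
        if actual != IRRELEVANT and predicted == IRRELEVANT:
            type1 += n  # relevant defect wrongly discarded
        if actual == IRRELEVANT and predicted not in (IRRELEVANT, AMBIGUOUS):
            type2 += n  # irrelevant signal taken for a defect (assumption above)
        if predicted != IRRELEVANT:
            to_check += n  # everything not discarded is checked again
        if actual == predicted:
            correct += n  # diagonal of the confusion matrix (trace)
    return {"type1": type1, "type2": type2,
            "to_check": to_check, "correct": correct}


# Hypothetical validation set of seven boxes.
manual = ["metal loss", "metal loss", "lamination", "irrelevant",
          "irrelevant", "irrelevant", "inclusion"]
automatic = ["metal loss", "irrelevant", "lamination", "lamination",
             "irrelevant", "ambiguous", "ambiguous"]

result = quality_parameters(manual, automatic)
print(result)
```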

Consistency with previous training sets

The indications that were initially classified correctly should not be classified incorrectly as learning progresses. A set of 204 indications was selected that were correctly classified with the smallest training set. Since similar indications are already contained in this training set, a sufficient number of such indications is available. The correctness of the classification is now checked with the models from all other training sets. The quality is described with the parameters mentioned. The results are shown in Figure 1.
The behavior was expected to the extent that the number of correctly classified indications would decrease slightly again. A further increase is not possible because of the way the set was selected. With just a little more learning, the quality decreases somewhat, but then remains constant. It is important that further learning does not continue to worsen the results for examples that have already been learned. Otherwise, further learning would not always have a positive effect, and the new learning examples would first have to be checked for their effect. The deterioration measured here is small.
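A consistency check of this kind can be sketched as follows. The reference labels and the predictions of three successively retrained models are hypothetical; only the procedure, comparing each model's output against a fixed set that the first model classified correctly, follows the description above.

```python
def retained_fractions(reference, predictions_per_model):
    """Fraction of the reference set still classified correctly by each model.

    reference: labels of indications that the first model classified correctly.
    predictions_per_model: one list of predicted labels per trained model,
    in the order of increasing training set size.
    """
    fractions = []
    for predictions in predictions_per_model:
        correct = sum(1 for ref, pred in zip(reference, predictions) if ref == pred)
        fractions.append(correct / len(reference))
    return fractions


# Hypothetical reference set of four indications and three models.
reference = ["metal loss", "lamination", "irrelevant", "irrelevant"]
predictions_per_model = [
    ["metal loss", "lamination", "irrelevant", "irrelevant"],  # first model
    ["metal loss", "ambiguous", "irrelevant", "irrelevant"],   # after relearning
    ["metal loss", "ambiguous", "irrelevant", "irrelevant"],   # after more relearning
]

fractions = retained_fractions(reference, predictions_per_model)
print(fractions)
```

By construction the first value is 1.0, so the curve can only stay level or fall; what matters is that it levels off rather than continuing to drop.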

Fig. 1: Change in the performance parameters as learning progresses for a data set that was correctly classified with the first learning data set.

Progress of learning

A set of indications was selected that are new to the SVM. This means that none of them are known to the SVM as learning examples. Usually, the quality with the smallest learning data set is the worst. As learning progresses, the quality should increase continuously. Figure 2 shows the number of type 2 errors and the number of indications still to be checked. Both values decrease. The quality becomes significantly better, especially after the first relearning. Many indications are assigned a defect class instead of the "ambiguous" class. In this way one can see how the progress in learning is also reflected in the quality of the result.

Fig. 2: Performance parameters for progressive learning on an independent validation set.

Two further parameters are shown in Figure 3. The number of indications without activation behaves similarly to the number of indications still to be checked shown above. After a steep drop following the first relearning, the curve drops slightly further. In particular, the effect of the first relearning can easily be explained by taking a closer look at the training set.

Fig. 3: Further performance parameters for progressive learning on an independent validation set.

In the second training data set, indications were added that are very similar to those in the validation set. The indications come from pipelines that are both approximately 40 years old, made of similar pipes, and convey the same product. Therefore, the degree of echo loss, the noise and other influences are similar. The feature vectors of the validation set are very close in feature space to the vectors from the learning data set.
The other curve in Figure 3 shows the course of the type 1 errors. Instead of the expected decrease, an increase can be observed. Although the increase occurs at a low level (the rate rises to almost one percent), the behavior is unsatisfactory and should be investigated further.
A statistical analysis was performed on all indications that were incorrectly discarded. For the last learning data set, these were 134 indications. Among them were three laminations and two inclusions. For defects of small size this is not a problem. There remain 129 metal losses. It must be noted that the size above which an indication is still considered relevant is an arbitrary definition. Often this limit is negotiated with the client. As a rule, metal losses must be reported if they exceed 1 mm. Figure 4 shows a depth histogram of these 129 indications.

Fig. 4: Depth histogram of indications with a type 1 error in the classification. The ambiguity of the learning data set is responsible for the errors.

The indications are treated separately depending on whether they lie in the base material or in the weld seam area. The defects are very shallow, especially in the base material. The fact that such defects are increasingly being sorted out must be attributed to the growing number of learning examples from seamless pipes. This type of pipe has many manufacturing-related variations in wall thickness. A reduction in wall thickness is therefore less unusual than in welded pipes. In welded pipes, small changes can already indicate the onset of corrosion, which is then of interest. For welded pipes there are many learning examples of wall thinning of only 0.6 mm. Although a defect of this size will usually not appear in a report, it was considered useful for the machine to decide on "metal loss" first and to leave the decision to discard or keep the indication to a later manual check.
In seamless pipes, however, wall thickness reductions of 0.6 mm are irrelevant signals that should be sorted out from the start. As it appears, the classifier cannot adequately infer from the feature data which of the two pipe types is present. It is therefore planned to set up separate SVM models for the two pipe types.

The influence of boxing

As is well known, boxing has a significant influence on the result of the evaluation. In magnetic flux leakage inspections, the size and position of the box can even influence the calculated defect depth. For the ultrasonic inspection dealt with here, the aim is to examine what influence boxing can have on the classification result. Of course, this only reflects how the boxes were selected for the learning examples. The boxes are created with an algorithm which in turn has parameters that influence the number, size and degree of jaggedness of the boxes. On the left-hand side of Figure 5, a metal-loss-like defect is shown, which is described by boxes of increasing size. A total of 10 boxes are available, each box fully enclosing the next smaller one. On the right-hand side of Figure 5, the activations of the "metal loss" and "irrelevant" classes are shown for all boxes in order of increasing size.

Fig. 5: The effect of an increasing box size on the activations of the individual classes. There is an optimal medium size.

The middle size shows the highest activation for metal loss, while for the smaller and the very large boxes the activation for irrelevant predominates. The metal loss classification would be correct in this case. The boxes should therefore always be created in the same way as they were obtained for the learning examples. Alternatively, one would have to include learning examples for all possible variations of box sizes in the learning data set.

5. The handling of the learning data

How the knowledge base is expanded is a crucial question for the value of the system as a whole. So far, such systems for knowledge representation have always been trained once at the beginning and then left that way. A major aspect of this work is to make the expansion of knowledge as easy as possible. The relearning process must be precisely defined so that the previous results can be archived properly and verified. The new learning examples are then obtained from an upcoming inspection. As soon as the manual check of the relevant indications has been carried out, the "true" classification is known. Any indications that were classified as "ambiguous" or incorrectly are then considered as new learning examples. This relearning can be done with any inspection project. In practice, however, that would be a very frequent process. It makes sense to relearn somewhat less often and then more intensively. This means that a new model has to be created and archived less frequently.
To make it easier to recognize the new learning data, a column for the automatic classification and a column for the result of the manual check are added to the table with the box information. An identifier can be set to mark important learning examples. In this way, the relearning can be carried out at any time.
As learning progresses, the data set becomes larger and larger. However, every relearning requires that the entire learning data set be taken into account. Due to the principle of the SVM, only a few feature vectors (the support vectors) that are relevant for the decision are selected in the end. As a rule, these are significantly fewer than the total number in the learning data set. It should therefore be possible to reduce the learning data set without reducing the capabilities of the classifier. Methods to reduce the learning data set in a targeted manner are currently the subject of research [3].
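The selection of new learning examples from the box table can be sketched as a simple filter. The record layout with the keys `id`, `automatic`, `manual` and `flagged` is hypothetical; it only mirrors the two classification columns and the identifier described above, and the box values are invented for illustration.

```python
def select_relearning_examples(boxes):
    """Return the boxes that qualify as new learning examples.

    A box qualifies if the automatic result was "ambiguous", if it differs
    from the manual check, or if the box was marked by hand as important.
    """
    selected = []
    for box in boxes:
        undecided = box["automatic"] == "ambiguous"
        wrong = box["automatic"] != box["manual"]
        marked = box.get("flagged", False)
        if undecided or wrong or marked:
            selected.append(box)
    return selected


# Hypothetical excerpt of a box table after the manual check.
boxes = [
    {"id": 1, "automatic": "metal loss", "manual": "metal loss"},
    {"id": 2, "automatic": "ambiguous", "manual": "lamination"},
    {"id": 3, "automatic": "irrelevant", "manual": "metal loss"},
    {"id": 4, "automatic": "irrelevant", "manual": "irrelevant", "flagged": True},
    {"id": 5, "automatic": "inclusion", "manual": "inclusion"},
]

new_examples = select_relearning_examples(boxes)
print([box["id"] for box in new_examples])
```

The manual class of each selected box then serves as its label in the extended learning data set.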

6. Summary

Support vector machines are an efficient means of realizing artificial intelligence in the analysis of data from pipeline inspections. Relearning and thus expanding the knowledge base is significantly easier than with other comparable methods. With the implementation presented, it is now possible to retrain on specific occasions, and a significant acceleration of the data evaluation is expected. The decision-making becomes comprehensible and independent of human factors.


  1. R. Suna, K. Berns, K. Germerdonk, A.O. Barbian, Pipeline diagnosis using backpropagation networks, Neuro-Nimes, 1993
  2. N. Cristianini, J. Shawe-Taylor, Support Vector and Kernel Methods, in: M. Berthold, D. J. Hand (Eds.), Intelligent Data Analysis, Springer-Verlag, Berlin Heidelberg New York, 2003
  3. M. Ziegenmeyer, Optimization and adaptation of the support vector classification motivated by real diagnostic applications, Master's thesis, FZI, Karlsruhe 2003