Applications of Machine Learning methods


Learning to fly

Symbolic machine learning methods were used to learn to control a (simulated) airplane. The state of the airplane, the readings on the control panel and the pilots' actions were recorded while pilots flew a single flight plan. These recordings were divided into situation-action pairs, and a machine learning method was used to discover which action was generally appropriate for a given situation. The resulting system was able to follow the same flight plan even though conditions varied between flights as a result of small disturbances. The first study (Sammut et al.) was done with a Cessna simulation and the second (Camacho) with an F-16 jet fighter.
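The idea of learning from situation-action pairs can be sketched as follows. This is a minimal illustration only, not the method used in the studies cited here: it assumes a made-up log of two readings (pitch and climb rate) and hypothetical action names, and it simply remembers the most frequent recorded action for each coarse bin of readings.

```python
from collections import Counter, defaultdict


def discretise(situation, step=5.0):
    """Map continuous readings to coarse bins so similar situations share a key."""
    return tuple(round(value / step) for value in situation)


def learn_policy(pairs, step=5.0):
    """Learn the most frequent recorded action for each discretised situation."""
    votes = defaultdict(Counter)
    for situation, action in pairs:
        votes[discretise(situation, step)][action] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def choose_action(policy, situation, step=5.0, default="hold"):
    """Look up the learned action; fall back to a default for unseen situations."""
    return policy.get(discretise(situation, step), default)


# Hypothetical log: (pitch in degrees, climb rate in m/s) -> pilot action
log = [
    ((10.0, 4.0), "elevator_down"),
    ((11.0, 5.0), "elevator_down"),
    ((-9.0, -4.0), "elevator_up"),
    ((-10.0, -5.0), "elevator_up"),
    ((0.5, 0.2), "hold"),
]
policy = learn_policy(log)
```

Calling choose_action(policy, (9.0, 4.5)) returns the action recorded for similar situations. A symbolic learner such as a decision tree would generalise further than this lookup table, which only covers bins seen in the log.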

More information:

C. Sammut, S. Hurst, D. Kedzier and D. Michie (1992) Learning to fly, in: D. Sleeman and P. Edwards (eds) Proceedings of the Ninth International Workshop on Machine Learning

R. Camacho (1998) Inducing Models of Human Control Skills, in: 10th European Conference on Machine Learning (ECML-98), Chemnitz, Germany

Learning to predict the chemical structure of diterpenes from magnetic resonance spectra

Diterpenes are organic compounds of low molecular weight with a skeleton of 20 carbon atoms. They are of significant chemical and commercial interest because of their use as lead compounds in the search for new pharmaceutical effectors. The interpretation of diterpene 13C NMR spectra normally requires specialists with detailed spectroscopic knowledge and substantial experience in natural products chemistry, more specifically knowledge of peak patterns and chemical structures. Given a database of peak patterns for diterpenes with known structure, we apply several ILP approaches to discover correlations between peak patterns and chemical structure. The performance achieved is close to that of domain experts, which suffices for practical use.

More information:

S. Dzeroski, S. Schulze-Kremer, K. Heidtke, K. Siems, D. Wettschereck and H. Blockeel (1998) Diterpene Structure Elucidation from 13C NMR Spectra with Inductive Logic Programming, Applied Artificial Intelligence, 12, p.363-384.

Learning to monitor intensive care patients

In the intensive care units of hospitals, a number of measurements are taken from patients at varying intervals. Computer systems recognise alarming conditions and then send a warning to the nurses on duty. Current systems fail to recognise a number of important conditions, or recognise them later than necessary. A critical condition can often be recognised early by noting changes in variables that cannot be explained by the day-night rhythm, medication, etc. This required the development of time-series methods for multivariate online data. Morik and her colleagues used a variety of machine learning methods in combination with "manual" knowledge acquisition to build a monitoring system that does a better job. The most important goals are early recognition of outlier data (both measurement errors and pathological phenomena), judgment of the effect of therapeutic measures, and discovery of unknown relations between physiological variables.

http://www-ai.cs.uni-dortmund.de/FORSCHUNG/PROJEKTE/SFB475C4/

and:

K. Morik, M. Imhoff, P. Brockhausen, T. Joachims and U. (2000) Knowledge discovery and knowledge validation in intensive care, Artificial Intelligence in Medicine, Vol. 19, 3, pp. 225-249.

Predicting customer loyalty

The purpose of this application is to understand, explain and predict customer behaviour, in particular loyalty to and satisfaction with a brand of cars and with a dealer. This understanding has a number of applications in marketing and in the general management of car sales. A study at Limburgs Universitair Centrum Diepenbeek, Belgium, used machine learning methods (decision tree learning) instead of the statistical methods that are normally used to construct a model of customer loyalty from data describing properties of customers, including their buying records. The resulting model showed better results than the regression models.

K. Vanhoof, J. Bloemer and K. Pauwels (1997) A Case Study in Loyalty and Satisfaction Research, in: M. van Someren and G. Widmer (eds.) Proceedings ECML, p.290-297

Learning to predict preferences for TV programmes

As more TV programmes become available via digital broadcasting, it becomes more difficult to choose between them. A team at University College Dublin used a variety of methods, including Machine Learning and Case-Based Reasoning, to learn the preferences of individuals. The system records which programmes are selected (and which are not), extracts the general preferences of users and uses this information to advise users about the available channels.

Smyth, B. & Cotter, P. (2000) A Personalised TV Listings Service for the Digital TV Age. Journal of Knowledge-Based Systems, vol. 13, pp. 53-59.

Discovering properties of chemicals

A number of studies performed by the Computing Lab, Oxford University, in collaboration with others address the problem of discovering unknown properties of chemicals. One such property is toxicity. In the research and development of new medicines, and of other chemicals, it is important to know in advance whether a chemical will be toxic. One of the purposes of these studies is to construct a model that can predict, from their chemical composition and structure, whether new chemicals will be toxic. Such a model can be used to avoid expensive testing, or even the chemical engineering needed to construct the chemical.

The input consisted of descriptions of the structures of chemicals together with their known toxicity. Inductive Logic Programming methods were used because the structure of organic chemicals is often important, which requires generalisation over structures in addition to values of attributes of chemicals.

For more information see:

Toxicology challenge: about a competition on learning predictive models

A. Srinivasan and R.D. King (1999). Feature construction with Inductive Logic Programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Mining and Knowledge Discovery, vol. 3, pp. 37-57.

R.D. King, A.Srinivasan, and M.J.E. Sternberg (1995). Relating chemical activity to structure: an examination of ILP successes. New Generation Computing, vol. 13, pp.411-433.

Learning to predict credit-worthiness

Assessing the credit-worthiness of an individual, a company or an organisation is a key activity of many financial institutions. Using records of clients together with information about their financial status that was collected later, a model can be constructed to predict credit-worthiness. Machine Learning methods, including Bayesian Learning (e.g. in the showrisk project at GMD) and decision tree learning (at the Jozef Stefan Institute and University of Ljubljana), have been successfully used for this problem.

For more information, see for example:

Gerhard Paaß, Jörg Kindermann: Bayesian Classification Trees with Overlapping Leaves Applied to Credit-Scoring, in: X. Wu, R. Kotagiri, K.B. Korb (eds.): Research and Development in Knowledge Discovery and Data Mining. Springer-Verlag, Berlin 1998, pp. 234-245

T. Urbancic, V. Krizman, I. Kononenko (eds) (1998)

Learning to predict and optimise pollution

The control settings of production processes affect the output of chemicals such as nitrogen into air or water. Because of regulations, it is important to minimise this output. In several applications, data were collected on different control settings, other parameters of the process and the resulting waste chemical output. Several types of machine learning and case-based reasoning methods were used to learn a model that predicts waste chemical output. This model was then used to set the controls such that waste chemical output was minimised. One example is the work of the University of Catalunya on waste water treatment.
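The two steps described here, predicting waste output and then choosing control settings that minimise the prediction, can be sketched as follows. This is only an illustration under stated assumptions: the recorded cases, the two control variables and the candidate settings are invented, and a simple nearest-neighbour average stands in for the case-based and machine learning methods that were actually used.

```python
def predict_output(cases, setting, k=2):
    """Case-based prediction: average the output of the k most similar recorded settings."""
    def squared_distance(case):
        recorded_setting, _ = case
        return sum((a - b) ** 2 for a, b in zip(recorded_setting, setting))

    nearest = sorted(cases, key=squared_distance)[:k]
    return sum(output for _, output in nearest) / len(nearest)


def best_setting(cases, candidates, k=2):
    """Pick the candidate control setting with the lowest predicted waste output."""
    return min(candidates, key=lambda setting: predict_output(cases, setting, k))


# Hypothetical records: ((control A, control B), measured waste output)
cases = [
    ((1.0, 1.0), 9.0),
    ((1.0, 2.0), 7.0),
    ((2.0, 1.0), 5.0),
    ((2.0, 2.0), 3.0),
    ((3.0, 2.0), 6.0),
]
candidates = [(1.0, 1.5), (2.0, 1.5), (3.0, 1.5)]
chosen = best_setting(cases, candidates)
```

In practice the candidate settings would be restricted to those the process can safely run at, and the prediction would be checked against new measurements before being trusted.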

Learning to recognise "spam" email messages

Email is increasingly used to distribute, on a very wide scale, messages in which most users are not interested: spam. A system can be trained to recognise spam messages and delete them, or store them in a special place for later scanning. A team at NCSR Demokritos, Greece, collected a large number of messages labelled as spam or not-spam and used a Bayesian approach to learn to recognise spam. The method was designed with a cost parameter so that the user can tune the system to avoid large "costs" as a result of missed emails.
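A naive Bayesian filter with a cost parameter can be sketched as follows. This is a minimal illustration, not the authors' implementation: the four training messages are invented, words are split on whitespace, Laplace smoothing is assumed, and the decision rule (flag a message only when the estimated spam probability exceeds cost / (1 + cost)) is one common way to express such a cost parameter.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label) pairs with label 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def spam_probability(model, text):
    """Estimate P(spam | text) with Laplace-smoothed word likelihoods."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        log_score[label] = score
    # Normalise in a numerically stable way
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])


def is_spam(model, text, cost=1.0):
    """A higher cost makes the filter more reluctant to flag a message as spam."""
    return spam_probability(model, text) > cost / (1.0 + cost)


# Hypothetical training messages
model = train([
    ("win cash now", "spam"),
    ("cheap cash offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
])
```

With cost=1.0 a message is flagged when spam is the more probable class; with cost=99 it is flagged only when the estimated spam probability exceeds 0.99, which reduces the risk of losing a legitimate message.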

More information:

I. Androutsopoulos, J. Koutsias, V. Chandrinos, G. Paliouras and C.D. Spyropoulos (2000) An evaluation of naive Bayesian anti-spam filtering, Procs. Machine Learning in the New Information Age.

Learning to filter advertisements on web pages

Many web pages contain advertisements that users may not want to download and watch. One approach is to train a system by pointing out advertisements, so that the system can learn to recognise and filter them. A team at University College Dublin implemented this approach in a system using Machine Learning methods.

Learning to detect intrusions in computer networks

Computer networks must be guarded against intruders: human users who are not entitled to use the network, or software that performs illegal operations. A team at the University of Torino used records of network operations that were classified as legal or illegal to learn to recognise illegal operations.

More information:

M. Mischiati and F. Neri (2000) Applying local search and genetic evolution in concept learning systems to detect intrusion in computer networks, in: R. Lopez de Mantaras and E. Plaza (eds) Proceedings European Conference on Machine Learning ECML-2000, Springer Verlag.

Designing a classification for documents

Documents can be classified into groups on the basis of common content or common character (e.g. scientific articles, letters, newspaper articles). Attempts are made to construct such classifications automatically. Several projects have made significant progress on this problem. The result is a hierarchy of class definitions, an ontology, with associated documents. This ontology can support document retrieval and indexing of new documents.

More information, for example:

V.A. Tamma, P.R.S. Visser, D. Malerba and D.M. Jones (2000) Computer-assisted ontology clustering for knowledge sharing,

Yahoo Planet project

Medical diagnosis and prognosis

A large number of applications concern recognising diseases and predicting the development of a disease. In these studies, patient records are collected and used to learn a model that can recognise a disease or predict its development. Some examples are:

M.Kukar, I.Kononenko, C.Groselj, K.Kralj, J.Fettich (1999) Analysing and improving the diagnosis of ischaemic heart disease with machine learning, Artificial Intelligence in Medicine, 16:25-50

Learns a model for predicting the progression of coronary heart disease under different treatments. Work done at the Jozef Stefan Institute and University of Ljubljana.

B. Zupan, J. Demšar, M. W. Kattan, J. R. Beck and I. Bratko (2000) Machine learning for survival analysis: a case study on recurrence of prostate cancer, Artificial Intelligence in Medicine, Vol. 20, pp. 59-75.

Uses descriptions of patients with prostate cancer, treatment and process to learn a model for prognosis of prostate cancer. A study by the Jozef Stefan Institute and University of Ljubljana; Baylor College of Medicine, Houston, USA; and Memorial Sloan Kettering Cancer Center, New York, USA.

P J D Andrews, A McQuatt, P A Jones, T Howells & D H Sleeman, Decision Tree Analyses of Demographic, Time Series Physiological and Medical Complication Data After Traumatic Brain Injury, British Journal of Anaesthesia, 82, 455-456, 1998.

A model for diagnosis of head injury (Dept. of Computing Science, Aberdeen).

Learning to recognise critical road sections

When a road is jammed, the cause can be elsewhere. A research group at Univ. Leuven used machine learning (Inductive Logic Programming) to learn, from accident descriptions and knowledge about the road system, to recognise the probable cause of a traffic problem.

S. Dzeroski, N. Jacobs, M. Molina and C. Moure (1998) ILP experiments in detecting traffic problems, in: C. Nédellec and C. Rouveirol (eds) Proceedings of the 10th European Conference on Machine Learning (ECML'98), LNAI, vol. 1398, Springer, pp. 61-66

Evaluating water quality

The ecological quality of water is measured by a combination of subjective and objective measures. To allow more objective measurement of the quality of water in rivers, data were collected on the water (e.g. amount of oxygen, variations in oxygen), on the location where it was sampled (e.g. flow, depth) and on human judgments of water quality.

A model was constructed by researchers at Jozef Stefan Institute and University of Ljubljana to predict human evaluation. The model relies on a combination of numerical regression methods and symbolic learning.

More information:

S. Dzeroski, J. Grbovic and W.J. Walley (1998) Machine Learning applications in biological classification of river water quality, in: R.S. Michalski, I. Bratko and M. Kubat (eds) Machine Learning and Data Mining: methods and applications, Chichester: John Wiley and Sons.

More applications

More descriptions of applications of Machine Learning can be found at:

R.S. Michalski, I. Bratko and M. Kubat (eds) (1998) Machine Learning and Data Mining: methods and applications, Chichester: John Wiley and Sons.

This book describes methods and applications of machine learning in several areas: design, medical diagnosis and reproducing human skills.

Several companies have websites that summarise successful applications of machine learning:

SPSS

Attar Software

WhiteCross

Deadalus (in Spanish)

Enprotelligence