Products for data mining
Data analysis problems have become ever more complex, with large numbers of data objects each represented by tens, hundreds or thousands of variables, as in functional genomics or spectroscopic outputs. Even with this much data, vast gaps remain, because our understanding of any problem domain is incomplete. Conventional methods based on empirical interpretation are thwarted by the resulting combinatorial explosion of solutions to be evaluated.
Irrespective of the nature of the data mining problem, the procedure for solving it is the same. Consider the simple problem of deciding whether or not to use each of 100 variables in a predictive model: this gives 2^100 (2 to the power of 100) possibilities, which is roughly 10^30. Compare this with the lifetime of the universe, which is 'only' about 10^17 seconds! Finding a solution to this comparatively simple problem by random search would therefore take an eternity. Fortunately, nature has shown us a way, a process which is both incredibly simple and phenomenally powerful: natural selection.
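The scale of this search space can be checked with a few lines of arithmetic. The sketch below uses the figures from the text (100 variables, 10^17 seconds); the evaluation rate of one billion candidate models per second is purely an assumption chosen for illustration.

```python
# Number of ways to include or exclude each of 100 variables.
n_variables = 100
n_subsets = 2 ** n_variables

# Lifetime of the universe in seconds, using the order-of-magnitude
# figure quoted in the text.
universe_seconds = 10 ** 17

# Assumption (not from the text): one billion candidate models
# can be evaluated every second.
evals_per_second = 10 ** 9

# Time needed to try every subset once, in whole universe lifetimes.
seconds_needed = n_subsets // evals_per_second
universe_lifetimes = seconds_needed // universe_seconds

print(f"candidate subsets: {n_subsets:.3e}")
print(f"universe lifetimes needed: {universe_lifetimes}")
```

Even at this generous assumed rate, trying every subset would take many thousands of universe lifetimes, which is why exhaustive or purely random search is not an option.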
Computing now has an equivalent approach to solving complex problems: evolutionary computing, the evolution of computer programs by methods of Darwinian selection.
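As a rough illustration of Darwinian selection applied to the variable-selection problem, here is a minimal genetic algorithm sketch. It is a generic textbook-style example, not the Genomic Computing technique used in the product. The function name `evolve_subset`, its parameters, and the toy fitness function (agreement with a hidden "ideal" subset) are all hypothetical; in practice the fitness would be the predictive quality of a model built on the selected variables.

```python
import random


def evolve_subset(fitness, n_vars, pop_size=60, generations=80,
                  mutation_rate=0.02, seed=0):
    """Evolve a bit mask (1 = use the variable) that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_vars)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def pick(pop):
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_gen = [best[:]]  # elitism: carry the best individual forward
        while len(next_gen) < pop_size:
            mum, dad = pick(population), pick(population)
            # Uniform crossover: each bit comes from one parent at random.
            child = [rng.choice(pair) for pair in zip(mum, dad)]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=fitness)
    return best


# Toy problem (an assumption for illustration): a hidden subset of 100
# variables is "ideal", and fitness counts positions that agree with it.
n_vars = 100
target = [random.Random(1).randint(0, 1) for _ in range(n_vars)]


def toy_fitness(mask):
    return sum(m == t for m, t in zip(mask, target))


best_mask = evolve_subset(toy_fitness, n_vars)
print("agreement with hidden subset:", toy_fitness(best_mask), "of", n_vars)
```

With these default settings the search evaluates only a few thousand candidates, a vanishing fraction of the 2^100 possible subsets, yet selection, crossover and mutation steadily push the population toward good solutions.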
TheGmax uses the novel Genomic Computing (GC) technique to tackle the hugely complex data mining problems found today in the life sciences. GC is a supervised method: the known output properties of one dataset are used to learn rules that generalize, so that predictions can be made for new samples in the same problem domain.
Download PowerPoint presentation of TheGMax (PDF)
Download table with typical application areas (PDF)
Download the article "A good method has two key properties" (PDF) by Professor Douglas Kell, one of the "inventors" of the genomic computing technique used in TheGMax.
Professor Douglas Kell suggests that genetic computing can offer a solution to the data mining and predictive modelling challenges of today. Download his article (PDF).
Douglas Kell further discusses the issue in the publication "Genotype–phenotype mapping: genes as computer programs" (PDF).