Tuned Data Mining (TDM) and TDMR
NOTE 03/2020: TDMR 2.2 is now available on CRAN.
The complex, often redundant and noisy data of real-world data mining (DM) applications frequently lead to inferior results when out-of-the-box DM models are applied. Tuning of parameters is therefore essential to achieve high-quality results. In this project we pursue an approach that tunes the parameters of the preprocessing phase and the modelling phase jointly. We propose the new framework TDM (Tuned Data Mining), which facilitates the search for good parameters and the comparison of different tuners by relying mostly on generic elements that are easily applied to new tasks.
The R package TDMR (Tuned Data Mining in R), freely available as open-source software from CRAN, is written to facilitate the setup, training and evaluation of data mining models. It puts special emphasis on tuning these models while simultaneously tuning certain preprocessing options. TDMR is designed to work with SPOT as the preferred tuner, but it also supports other tuners (CMA-ES, LHD, direct-search optimizers) for comparison.
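The idea of tuning a preprocessing option and a model parameter in one search can be illustrated with a small sketch. It is written in plain Python and does not use TDMR or SPOT; the synthetic data, the clipping step, the k-nearest-neighbour model and all function names are assumptions made for illustration only, and random search stands in for a real tuner.

```python
import random

def make_data(n, rng, outlier_rate):
    """Synthetic 1-D regression task: y = x^2 plus noise and optional gross outliers."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = x * x + rng.gauss(0.0, 0.05)
        if rng.random() < outlier_rate:
            y += rng.choice([-5.0, 5.0])   # corrupt this target
        data.append((x, y))
    return data

def preprocess(data, clip):
    """Preprocessing with one tunable parameter: clip targets to [-clip, clip]."""
    return [(x, max(-clip, min(clip, y))) for x, y in data]

def knn_predict(train, x, k):
    """Model with one tunable parameter: mean target of the k nearest neighbours."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def evaluate(params, train, valid):
    """Objective seen by the tuner: validation MSE of the whole pipeline."""
    clip, k = params
    cleaned = preprocess(train, clip)
    errors = [(knn_predict(cleaned, x, k) - y) ** 2 for x, y in valid]
    return sum(errors) / len(errors)

def random_search(train, valid, budget, rng):
    """Baseline tuner: sample (clip, k) jointly and keep the best configuration."""
    best_params, best_err = None, float("inf")
    for _ in range(budget):
        params = (rng.uniform(0.5, 6.0), rng.randint(1, 15))
        err = evaluate(params, train, valid)
        if err < best_err:
            best_params, best_err = params, err
    return best_params, best_err

rng = random.Random(42)
train = make_data(200, rng, outlier_rate=0.1)   # noisy training data
valid = make_data(100, rng, outlier_rate=0.0)   # clean validation data
best_params, best_err = random_search(train, valid, budget=30, rng=rng)
print("best (clip, k):", best_params, "validation MSE:", best_err)
```

Because `evaluate` treats the preprocessing and the model as one pipeline, any other tuner could be swapped in for `random_search` and compared on the same objective and budget.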
See the user manual TDMR-docu.pdf (CIOP-Report 02/2018 [Kone18a], last update March 2020) and the TDMR Tutorial (CIOP-Report 03/2018 [Kone18b], last update March 2020) for in-depth information on usage and development of the TDMR package.
TDMR 2.2, available on CRAN since March 2020, offers a simplified interface and integration with SPOT 2.0.
The TDMR documentation and tutorials have been rewritten to account for the simpler interface.
Project Members
Dr. Patrick Koch, TH Köln | Prof. Dr. Wolfgang Konen, TH Köln
Publications
Publications in the area of Tuned Data Mining (TDM) and TDMR:
2018
The TDMR 2.0 Package: Tuned Data Mining in R Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 02/2018, 2018, (Last update: April 2018 (original version: 2012)).
The TDMR 2.0 Tutorial: Examples for Tuned Data Mining in R Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 03/2018, 2018, (Last update: April 2018 (original version: 2012)).
2015
Efficient multi-criteria optimization on noisy machine learning problems Journal Article
In: Applied Soft Computing, Vol. 29, pp. 357–370, 2015.
2013
Subsampling strategies in SVM ensembles Proceedings Article
In: Hoffmann, Frank; Hüllermeier, Eyke (Eds.): Proceedings 23. Workshop Computational Intelligence, pp. 119–134, Universitätsverlag Karlsruhe, 2013.
SVM ensembles are better when different kernel types are combined Proceedings Article
In: Lausen, Berthold (Ed.): European Conference on Data Analysis (ECDA13), (under review), 2013.
2012
The TDMR Package: Tuned Data Mining in R Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 02/2012, 2012, (Last update: June 2017).
The TDMR Tutorial: Examples for Tuned Data Mining in R Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 03/2012, 2012, (Last update: May 2016).
Tuning and Evolution of Support Vector Kernels Journal Article
In: Evolutionary Intelligence, Vol. 5, pp. 153–170, 2012.
Efficient sampling and handling of variance in tuning data mining models Proceedings Article
In: Coello, Carlos A. Coello; Cutello, Vincenzo; others (Eds.): PPSN'2012: 12th International Conference on Parallel Problem Solving From Nature, Taormina, pp. 195–205, Springer, Heidelberg, 2012.
The TDMR Framework: Tuned Data Mining in R Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 02/2012, 2012.
2011
Ensemble Based Optimization and Tuning Algorithms Proceedings Article
In: Hoffmann, Frank; Hüllermeier, Eyke (Eds.): Proceedings 21. Workshop Computational Intelligence, pp. 119–134, Universitätsverlag Karlsruhe, 2011.
On the Tuning and Evolution of Support Vector Kernels Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 04/11, 2011, ISSN: 2191-365X.
Tuned Data Mining in R Proceedings Article
In: Hoffmann, Frank; Hüllermeier, Eyke (Eds.): Proceedings 21. Workshop Computational Intelligence, pp. 147–160, Universitätsverlag Karlsruhe, 2011.
Tuned Data Mining: A Benchmark Study on Different Tuners Proceedings Article
In: Krasnogor, Natalio (Ed.): GECCO '11: Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, pp. 1995–2002, 2011.
Self-configuration from a Machine-Learning Perspective Technical Report
Research Center CIOP (Computational Intelligence, Optimization and Data Mining) Cologne University of Applied Sciences, Faculty of Computer Science and Engineering Science, No. 05/11; arXiv: 1105.1951, 2011, ISSN: 2191-365X, (e-print published at http://arxiv.org/abs/1105.1951 and Dagstuhl Preprint Archive, Workshop 11181 "Organic Computing – Design of Self-Organizing Systems").
2010
Optimizing Support Vector Machines for Stormwater Prediction Proceedings Article
In: Bartz-Beielstein, Thomas; Chiarandini; Paquete; Preuss, Mike (Eds.): Proceedings of Workshop on Experimental Methods for the Assessment of Computational Systems joint to PPSN2010, pp. 47–59, TU Dortmund, 2010.
Optimization of Support Vector Regression Models for Stormwater Prediction Proceedings Article
In: Hoffmann, Frank; Hüllermeier, Eyke (Eds.): Proceedings 20. Workshop Computational Intelligence, pp. 146–160, Universitätsverlag Karlsruhe, 2010.
2009
Optimized Modelling of Fill Levels in Stormwater Tanks Using CI-based Parameter Selection Schemes (in German) Journal Article
In: at-Automatisierungstechnik, Vol. 57, No. 3, pp. 155–166, 2009.