WIC 2014 Tutorial - Hot Ideas for Interactive Knowledge Discovery and Data Mining for Biomedical Informatics

Extravaganza Tutorial on Hot Ideas for Interactive Knowledge Discovery and Data Mining in Biomedical Informatics

Andreas Holzinger
Medical University Graz, Institute for Medical Informatics, Statistics and Documentation, Research Unit HCI, Austrian IBM Watson Think Group
& Graz University of Technology, Institute of Information Systems and Computer Media, Graz, Austria

ABSTRACT: 
One of the grand challenges in our networked world is posed by large, complex, and often weakly structured data sets, together with massive amounts of unstructured information. This “big data” V4-challenge (Volume, Variety, Velocity, Veracity) is most evident in the biomedical domain: the trend towards precision P4-medicine (Predictive, Preventive, Participatory, Personalized) has resulted in an explosion in the amount of generated biomedical data sets, in particular -omics data, for example from genomics, proteomics, metabolomics, lipidomics, transcriptomics, epigenetics, microbiomics, fluxomics and phenomics. No medical professional today is capable of memorizing all these data. A synergistic combination of the methodologies and approaches of two areas offers ideal conditions for solving these challenges and for developing new, efficient and user-centered algorithms and tools: Human–Computer Interaction (HCI) and Knowledge Discovery & Data Mining (KDD), with the goal of supporting human intelligence with machine learning in order to interactively discover new, previously unknown insights into the data. This tutorial provides an overview of the HCI-KDD approach, outlines the knowledge discovery process chain, and then focuses on three promising hot topics: graph-based data mining, entropy-based data mining and topological data mining. The goal of this tutorial is to motivate and stimulate further research.
Keywords: Knowledge Discovery, Data Mining, Machine Learning, Biomedical Informatics, Big Data, Complex Data, Research Based Teaching.

Tutorial Audience
Attendees of the WI, IAT, BIH and AMT conferences with an interest in the problems and challenges of “big data” in biomedicine are welcome, as are scientists from other domains dealing with big data sets (e.g. astrophysics, social media, telecommunications, meteorology, finance and business informatics). Beyond basic computer science knowledge, no particular background is necessary.

Content of the Tutorial

In this tutorial the HCI-KDD approach (Figure 1) [1-4] will be outlined and some examples from biomedical informatics will be presented [5, 6].

Whilst interactive knowledge discovery encompasses the horizontal process ranging from physical aspects of data (left in Figure 1) to the human aspects of information processing (right), data mining goes in depth and includes the methods, algorithms and tools for finding patterns in the data.

This tutorial will present seven (the magical number “7”) important research areas outlined in Figure 1, including Area 1: data integration, data fusion and data mapping; Area 2: mining algorithms; and Area 6: data visualization [7-12]. The tutorial will focus particularly on three hot topics:

Area 3: Graph-based Data Mining (GDM) [13-15],

Area 4: Entropy-based Data Mining (EDM) [13, 16-19] and

Area 5: Topological Data Mining (TDM) [20].

In the biomedical domain – as in some other domains – issues of Area 7: privacy, data protection, safety and security are mandatory [21].

Figure 1: The big picture of the HCI-KDD approach. KDD encompasses the whole horizontal process chain from data to information and knowledge, that is, from the physical aspects of raw data to human aspects including attention, memory, vision and interaction as core topics in HCI, whilst DM as a vertical subject focuses on the development of methods, algorithms and tools for data mining (image taken from the hci4all.at website, as of May 2014).

More information on this tutorial can be found in [22].

REFERENCES:

  1. Holzinger, A.: On Knowledge Discovery and Interactive Intelligent Visualization of Biomedical Data - Challenges in Human–Computer Interaction & Biomedical Informatics. In: DATA 2012, pp. 9-20. INSTICC (2012)

  2. Holzinger, A.: Human–Computer Interaction & Knowledge Discovery (HCI-KDD): What is the benefit of bringing those two fields to work together? In: Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L. (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127, pp. 319-328. Springer, Heidelberg, Berlin, New York (2013)

  3. Holzinger, A., Dehmer, M., Jurisica, I.: Knowledge Discovery and interactive Data Mining in Bioinformatics - State-of-the-Art, future challenges and research directions. BMC Bioinformatics 15(Suppl 6), I1 (2014)

  4. Holzinger, A., Jurisica, I.: Knowledge Discovery and Data Mining in Biomedical Informatics: The future is in Integrative, Interactive Machine Learning Solutions. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges. Lecture Notes in Computer Science LNCS 8401, pp. 1-18. Springer, Heidelberg, Berlin (2014)

  5. Holzinger, A.: Biomedical Informatics: Computational Sciences meets Life Sciences. BoD, Norderstedt (2012)

  6. Holzinger, A.: Biomedical Informatics: Discovering Knowledge in Big Data. Springer, New York (2014)

  7. Otasek, D., Pastrello, C., Holzinger, A., Jurisica, I.: Visual Data Mining: Effective Exploration of the Biological Universe. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges. Lecture Notes in Computer Science LNCS 8401, pp. 19-34. Springer, Heidelberg, Berlin (2014)

  8. Holzinger, A., Bruschi, M., Eder, W.: On Interactive Data Visualization of Physiological Low-Cost-Sensor Data with Focus on Mental Stress. In: Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L. (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127, pp. 469-480. Springer, Heidelberg, Berlin (2013)

  9. Wong, B.L.W., Xu, K., Holzinger, A.: Interactive Visualization for Information Analysis in Medical Diagnosis. In: Holzinger, A., Simonic, K.-M. (eds.) Information Quality in e-Health, Lecture Notes in Computer Science, LNCS 7058, pp. 109-120. Springer Berlin Heidelberg (2011)
  10. Wiltgen, M., Holzinger, A.: Visualization in Bioinformatics: Protein Structures with Physicochemical and Biological Annotations. In: Zara, J., Sloup, J. (eds.) Central European Multimedia and Virtual Reality Conference (available in EG Eurographics Library), pp. 69-74. Czech Technical University (CTU), Prague (2005)
  11. Turkay, C., Jeanquartier, F., Holzinger, A., Hauser, H.: On Computationally-enhanced Visual Analysis of Heterogeneous Data and its Application in Biomedical Informatics. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining: State-of-the-Art and Future Challenges in Biomedical Informatics. Lecture Notes in Computer Science LNCS 8401, pp. 117-140. Springer, Berlin, Heidelberg (2014)
  12. Jeanquartier, F., Holzinger, A.: On Visual Analytics And Evaluation In Cell Physiology: A Case Study. In: Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L. (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127, pp. 495-502. Springer, Heidelberg, Berlin (2013)
  13. Holzinger, A., Ofner, B., Stocker, C., Valdez, A.C., Schaar, A.K., Ziefle, M., Dehmer, M.: On Graph Entropy Measures for Knowledge Discovery from Publication Network Data. In: Cuzzocrea, A., Kittl, C., Simos, D.E., Weippl, E., Xu, L. (eds.) Multidisciplinary Research and Practice for Information Systems, Springer Lecture Notes in Computer Science LNCS 8127, pp. 354-362. Springer, Heidelberg, Berlin (2013)
  14. Holzinger, A., Malle, B., Giuliani, N.: On Graph Extraction from Image Data. In: Slezak, D., Peters, J.F., Tan, A.-H., Schwabe, L. (eds.) Brain Informatics and Health, BIH 2014, Lecture Notes in Artificial Intelligence, LNAI 8609, pp. 557-568, in print. Springer, Heidelberg, Berlin (2014)
  15. Holzinger, A., Ofner, B., Dehmer, M.: Multi-touch Graph-Based Interaction for Knowledge Discovery on Mobile Devices: State-of-the-Art and Future Challenges. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining: State-of-the-Art and Future Challenges in Biomedical Informatics, Springer Lecture Notes in Computer Science LNCS 8401, pp. 241–254. Springer, Berlin, Heidelberg (2014)
  16. Holzinger, A., Hortenhuber, M., Mayer, C., Bachler, M., Wassertheurer, S., Pinho, A., Koslicki, D.: On Entropy-based Data Mining. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining: State-of-the-Art and Future Challenges in Biomedical Informatics. Lecture Notes in Computer Science LNCS 8401, pp. 209-226. Springer, Heidelberg, Berlin (2014)
  17. Mayer, C., Bachler, M., Hortenhuber, M., Stocker, C., Holzinger, A., Wassertheurer, S.: Selection of entropy-measure parameters for knowledge discovery in heart rate variability data. BMC Bioinformatics 15(Suppl 6), S2 (2014)
  18. Holzinger, A., Stocker, C., Peischl, B., Simonic, K.-M.: On Using Entropy for Enhancing Handwriting Preprocessing. Entropy 14(11), 2324-2350 (2012)
  19. Holzinger, A., Stocker, C., Bruschi, M., Auinger, A., Silva, H., Gamboa, H., Fred, A.: On Applying Approximate Entropy to ECG Signals for Knowledge Discovery on the Example of Big Sensor Data. In: Huang, R., Ghorbani, A., Pasi, G., Yamaguchi, T., Yen, N., Jin, B. (eds.) Active Media Technology, Lecture Notes in Computer Science, LNCS 7669, pp. 646-657. Springer, Berlin Heidelberg (2012)
  20. Holzinger, A.: On Topological Data Mining. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics: State-of-the-Art and Future Challenges. Lecture Notes in Computer Science LNCS 8401, pp. 331-356. Springer, Heidelberg, Berlin (2014)
  21. Kieseberg, P., Hobel, H., Schrittwieser, S., Weippl, E., Holzinger, A.: Protecting Anonymity in the Data-Driven Medical Sciences. In: Holzinger, A., Jurisica, I. (eds.) Interactive Knowledge Discovery and Data Mining: State-of-the-Art and Future Challenges in Biomedical Informatics, Springer Lecture Notes in Computer Science LNCS 8401, pp. 303-318. Springer, Berlin, Heidelberg (2014)
  22. Holzinger, A.: Extravaganza Tutorial on Hot Ideas for Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. In: Slezak, D., Peters, J.F., Tan, A.-H., Schwabe, L. (eds.) Brain Informatics and Health, BIH 2014, Lecture Notes in Artificial Intelligence, LNAI 8609, pp. 507-520, in print. Springer, Heidelberg, Berlin (2014)

ABOUT THE TUTOR

Andreas Holzinger is head of the Research Unit Human–Computer Interaction, Institute for Medical Informatics, Statistics and Documentation at the Medical University Graz, Lead at the HCI-KDD network, head of the first Austrian IBM Watson Think Group in close cooperation with the Software Group of IBM Austria, Associate Professor of Applied Informatics at the Faculty of Computer Science, Institute of Information Systems and Computer Media, and Lecturer at the Bioinformatics Group at Graz University of Technology. He serves as a consultant for the Canadian, Swiss, French and Dutch governments and for the German Excellence Initiative, and as a national expert in the European Commission (Lisbon Delegate). Andreas was Visiting Professor in Berlin, Innsbruck, Vienna, London, and Aachen.

Andreas and his group work consistently on a synergistic combination of the methodologies and approaches of two areas that offer ideal conditions for unraveling the problems of big, complex biomedical data: Human–Computer Interaction (HCI) and Knowledge Discovery/Data Mining (KDD), with the goal of supporting human intelligence with machine learning in order to discover new, previously unknown insights into the data. He is passionate about extending advanced methods including time (e.g. information entropy) and space (e.g. computational topology), along with user-centered software engineering methods, to create interactive software for mobile applications and content analytics techniques. Andreas follows three promising research streams: Graph-based Data Mining, Entropy-based Data Mining, and Topological Data Mining.

Since 1999 Andreas has participated in leading positions in 30+ multinational R&D projects (budget 3+ MEUR) and has 300+ publications and 4300+ citations, with an h-index of 29 and a g-index of 90 (as of May 26, 2014). He is founder and leader of the international Expert Network HCI-KDD.
More information: http://www.hci4all.at

 

 

University of Warsaw - WIC 2014 venue
