Data Mining Work Flow Ontology

Last updated by juk, on Wed, 11/09/2011 - 13:45

eProPlan page moved to: http://www.e-lico.eu/?q=node/323.

A major challenge for third-generation data mining and knowledge discovery systems is the integration of different data mining tools and services for data understanding, data integration, data preprocessing, data mining, evaluation and deployment, which are distributed across networked computer systems. In e-Lico WP6 we are building an intelligent discovery assistant (IDA) that is intended to support end users in the difficult and time-consuming task of designing KDD workflows out of these distributed services. The assistant will support the user in checking the correctness of workflows, understanding the goals behind given workflows, enumerating workflow completions generated by an AI planner, and storing, retrieving, adapting and repairing previous workflows. It should also be an open, easily extensible system. This is achieved by basing the system on a data mining ontology (DMO) that describes all the services (operators) together with their inputs and outputs, conditions and effects.
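To make the idea of operators described by conditions and effects more concrete, the following is a minimal Python sketch of how a workflow's correctness could be checked against such descriptions. It is not the eProPlan implementation and does not use the DMO itself: the `Operator` class, the `check_workflow` function, and the operator and fact names (`ReplaceMissingValues`, `TrainModel`, `has_missing_values`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator description: a name plus conditions and effects."""
    name: str
    conditions: frozenset                 # facts that must hold before the operator runs
    effects_add: frozenset                # facts that hold after the operator ran
    effects_del: frozenset = frozenset()  # facts that no longer hold afterwards


def check_workflow(initial_state, workflow):
    """Apply the operators in order.

    Returns (True, final_state) if every operator's conditions hold when it
    is applied, otherwise (False, name_of_first_failing_operator).
    """
    state = set(initial_state)
    for op in workflow:
        if not op.conditions <= state:
            return False, op.name
        state = (state - op.effects_del) | op.effects_add
    return True, state


# Illustrative operators (assumed names, not taken from the DMO).
replace_missing = Operator(
    name="ReplaceMissingValues",
    conditions=frozenset({"table"}),
    effects_add=frozenset({"no_missing_values"}),
    effects_del=frozenset({"has_missing_values"}),
)
train_model = Operator(
    name="TrainModel",
    conditions=frozenset({"table", "no_missing_values"}),
    effects_add=frozenset({"model"}),
)

raw_data = {"table", "has_missing_values"}
ok, result = check_workflow(raw_data, [replace_missing, train_model])
bad, failing = check_workflow(raw_data, [train_model])
```

In this sketch the first workflow is accepted because the preprocessing step establishes the fact that the modelling step requires, while the second is rejected at `TrainModel`. A planner could use the same operator descriptions to search for sequences that reach a goal fact such as `model`.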

This approach is described in:

The DMO for planning is divided into several parts:

These ontologies are developed using Protégé 4.0 (build 111 or higher).

To use Protégé 4.0 (build 111 or higher) for planning, we are developing the eProPlan plug-in. To use a Protégé 4.0 plug-in, first install Protégé, then place the jar file obtained via the links below into the folder named “Plugin” inside the Protégé directory.
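The installation step amounts to copying one file, which could be scripted roughly as follows. This is only a sketch: the function name `install_plugin` is hypothetical, the folder name defaults to “Plugin” as stated on this page (adjust it if your Protégé installation names the folder differently), and the paths in the usage note are placeholders.

```python
import shutil
from pathlib import Path


def install_plugin(jar_path, protege_dir, plugin_folder="Plugin"):
    """Copy a plug-in jar into the plug-in folder of a Protégé installation.

    plugin_folder defaults to the folder name given on this page; this is an
    assumption about the local installation, so it can be overridden.
    """
    target_dir = Path(protege_dir) / plugin_folder
    if not target_dir.is_dir():
        raise FileNotFoundError(f"Plug-in folder not found: {target_dir}")
    # shutil.copy returns the path of the newly created file
    return Path(shutil.copy(jar_path, target_dir))
```

A call would look like `install_plugin("path/to/plugin.jar", "path/to/protege")`, with both paths replaced by the actual locations on your machine. Protégé typically needs to be restarted before a newly added plug-in is loaded.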

eProPlan movies:

  • DMWF ontology link
  • Problem description in eProPlan link
  • Planning a data mining workflow with eProPlan link
  • eProPlan applied to a planning problem: missionaries and cannibals link

Comments, enhancement proposals and bug reports can be submitted in our bug-tracker.