By Hari Mailvaganam
Data mining has had a checkered history, mainly because of technical constraints imposed by the limitations of software design and architecture. Most of the algorithms used in data mining are mature and have been around for over twenty years. The next challenges in data mining are therefore not algorithmic but lie in software design methodology. Commonly used data mining algorithms are freely available, and techniques for optimizing data mining computing speed are well documented.
Most early data mining software was spun off from academia and built around a single algorithm. Its inability to integrate with external data sources, together with usability issues, resulted in data mining being marginalized.
The cost of data mining is still unnecessarily high, and projects are often not cost effective. New standards in data extraction and better software platforms hold promise that the barrier to entry will be lowered.
Data access standards such as OLE-DB, XML for Analysis and JSR will minimize the challenges of data access. Building user-friendly software interfaces for the end user is the next step in the evolution of data mining. A useful comparison is the increasing ease of use of OLAP client tools.
The J2EE and .NET software platforms offer a large spectrum of built-in APIs that enable smarter software applications.
DAT-A : Open Source Data Mining and OLAP on MySQL
DAT-A is an open source application built to allow intelligent data mining. By intelligent data mining, DAT-A's software architects mean a highly decoupled application that focuses the user's attention on the data mining results rather than on the data extraction or data modeling process. All data exchanges are in XML and SOAP to ensure interoperability.
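To make the idea of XML-based data exchange concrete, here is a minimal sketch of how a single mining result might be serialized with the XML APIs that ship with Java. It is not DAT-A's actual message format: the class name MiningResultXml and the element and attribute names (miningResult, model, input, prediction) are assumptions made for illustration, and the SOAP envelope is left out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: turns one prediction into an XML string.
// Element and attribute names are illustrative, not DAT-A's schema.
class MiningResultXml {
    static String toXml(String modelName, String input, String predictedLabel) {
        try {
            // Build a small DOM tree: <miningResult model="..."> with two children.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("miningResult");
            root.setAttribute("model", modelName);
            doc.appendChild(root);

            Element in = doc.createElement("input");
            in.setTextContent(input);
            root.appendChild(in);

            Element out = doc.createElement("prediction");
            out.setTextContent(predictedLabel);
            root.appendChild(out);

            // Serialize the DOM tree to text without the <?xml ...?> declaration.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(w));
            return w.toString();
        } catch (Exception e) {
            // Parser or transformer configuration problems are not recoverable here.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the result is plain XML, any client that can parse XML can consume it, whatever platform it runs on. That is the interoperability argument in practice.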
An enterprise version is also planned; it will be built on BEA WebLogic Server and will write to a Web Services interface.
Presently MySQL does not have built-in data mining modules. DAT-A applies a data mining abstraction layer on MySQL. The business logic for controlling the data mining model and model training is written in the J2EE framework.
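One way to picture such an abstraction layer is as a pair of small Java interfaces: one for where the training rows come from, and one for the model that is trained on them. The sketch below is an illustration under stated assumptions, not DAT-A's code. The names TrainingSource, MiningModel and FrequencyModel are hypothetical, the rows are supplied in memory instead of being read from MySQL, and the "model" is a deliberately trivial frequency count.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: supplies training rows. A real implementation could run a
// query against MySQL; here each row is simply {attributeValue, classLabel}.
interface TrainingSource {
    List<String[]> rows();
}

// Hypothetical: a trainable model that knows nothing about the storage engine.
interface MiningModel {
    void train(TrainingSource source);
    String predict(String attributeValue);
}

// Toy model: predicts the label seen most often for an attribute value.
class FrequencyModel implements MiningModel {
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    @Override
    public void train(TrainingSource source) {
        for (String[] row : source.rows()) {
            // Count how often each label occurs for each attribute value.
            counts.computeIfAbsent(row[0], k -> new HashMap<>())
                  .merge(row[1], 1, Integer::sum);
        }
    }

    @Override
    public String predict(String attributeValue) {
        Map<String, Integer> labels = counts.get(attributeValue);
        if (labels == null) {
            return null; // attribute value never seen during training
        }
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : labels.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }
}

class AbstractionDemo {
    public static void main(String[] args) {
        // In-memory rows stand in for the result of a database query.
        TrainingSource source = () -> Arrays.asList(
                new String[] {"student", "no"},
                new String[] {"student", "no"},
                new String[] {"student", "yes"},
                new String[] {"retired", "yes"});
        MiningModel model = new FrequencyModel();
        model.train(source);
        System.out.println(model.predict("student"));
    }
}
```

The point of the split is that the model code never touches the database directly. Swapping the in-memory source for a MySQL-backed one, or the toy model for a real algorithm, changes only one side of the interface.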
For the personal edition of DAT-A, the MySQL data mining application server is contained within the business logic developed on the J2EE framework layer. In the upcoming enterprise version, the business logic and data extraction controls will be hosted on BEA's WebLogic application server.
DAT-A will eventually be migrated to SourceForge.
Please contact us if you have any questions.