
e-Science 2008 4th IEEE International Conference on e-Science

Exhibits, Demos & Posters

Rule-based Classification Systems for Informatics


  • B. Krishnamurthy, Department of Chemical Engineering, Purdue University
  • T. Malik, Cyber Center, Purdue University
  • S. Stamatis, Department of Chemical Engineering, Purdue University
  • V. Venkatasubramanian, Department of Chemical Engineering, Purdue University
  • J. Caruthers, Department of Chemical Engineering, Purdue University


Classification of data is an important step in the evolution of scientific knowledge. Traditionally, human experts performed this classification. They can recognize the unique functional properties that are necessary and sufficient to place complex structures and phenomena into a particular class or group.

However, with the growth in scientific data and rapid changes in knowledge, it is no longer feasible for humans to classify every object by hand. The classification process must be automated to cope with the growing amount of data; otherwise, classification will become the rate-limiting step in scientific data analysis.

In this paper, we address the need for such automation in the SciAEther project and develop ChES, a fast and reproducible framework for classifying molecules in chemical data. Our framework captures human understanding through an ontology, and the diversity in classification types through a rule-based system, to classify complex molecular compounds.
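
To make the combination of an ontology and a rule-based system concrete, the following is a minimal sketch and not the ChES implementation. It assumes each molecule has already been reduced to a set of functional-group labels; the ontology classes, the labels ("hydroxyl", "carboxyl", "keto"), and the names `ONTOLOGY`, `Rule`, `RULES`, `ancestors`, and `classify` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List, Optional, Set

# Hypothetical toy ontology: each class maps to its parent (None marks the root).
ONTOLOGY: Dict[str, Optional[str]] = {
    "organic compound": None,
    "alcohol": "organic compound",
    "carbonyl compound": "organic compound",
    "carboxylic acid": "carbonyl compound",
    "ketone": "carbonyl compound",
}


@dataclass(frozen=True)
class Rule:
    """One classification rule: assign `target` when `condition` holds."""
    target: str
    condition: Callable[[FrozenSet[str]], bool]


# Illustrative rules over assumed functional-group labels (not real ChES rules).
RULES: List[Rule] = [
    Rule("alcohol", lambda groups: "hydroxyl" in groups),
    Rule("carboxylic acid", lambda groups: "carboxyl" in groups),
    Rule("ketone", lambda groups: "keto" in groups),
]


def ancestors(cls: str) -> List[str]:
    """Return every ancestor of `cls` by walking parent links up to the root."""
    chain: List[str] = []
    parent = ONTOLOGY[cls]
    while parent is not None:
        chain.append(parent)
        parent = ONTOLOGY[parent]
    return chain


def classify(groups: Iterable[str]) -> Set[str]:
    """Fire every matching rule, then add the ontology ancestors of each hit."""
    features = frozenset(groups)
    labels: Set[str] = set()
    for rule in RULES:
        if rule.condition(features):
            labels.add(rule.target)
            labels.update(ancestors(rule.target))
    return labels
```

In this sketch the rules decide the most specific classes, while the ontology supplies the more general ones, so a molecule labeled with both "hydroxyl" and "keto" is classified as an alcohol and a ketone and inherits their shared ancestors. Because the rules are plain data, the same input always yields the same labels, which is one simple way to obtain reproducible classification.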

We have tested our system with molecules from the PubChem repository and found that our knowledge-based, automatic classification matches, and sometimes surpasses, that of human experts.

