Valid Predictions with Insufficient Training Data


Good generalization of classifiers requires a sufficiently large training dataset that comprehensively covers the potential data domain. In the absence of such a complete training dataset, adaptive networks can still be trained with whatever data exist. However, these training datasets may be incomplete, insufficient, or contain information gaps. Only if the testing data reflect states that were previously captured by this incomplete training dataset will the trained algorithms predict reliably. Even though a learning algorithm trained with an insufficient dataset cannot generalize universally, it can still predict selected testing instances. With insufficient training data, the problem is to identify the testing data that are relevant to the training dataset. To attain this goal, the proposed method works in three phases. First, it clusters the training dataset and ranks the clusters according to their information density. Second, a supervised algorithm is trained to classify arbitrary testing data into these exemplary clusters of the training dataset. Third, it selects the testing data that belong to information-dense clusters and rejects unusual testing events. Our method improved the classifiers' prediction reliability.
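The three phases above can be sketched as follows. This is a minimal illustration, not the paper's implementation: k-means is assumed as the clustering step, cluster population stands in for information density, and nearest-centroid assignment stands in for the supervised classifier of the second phase.

```python
# Hypothetical sketch of the three-phase filtering described above.
# Assumptions (not from the paper): k-means clustering, density measured
# as samples per cluster, nearest-centroid assignment of testing data.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# An intentionally incomplete training set: one well-covered region
# (200 samples near the origin) and one sparsely covered region.
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(200, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(10, 2)),
])

# Phase 1: cluster the training data and rank clusters by density.
k = 2
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_train)
counts = np.bincount(km.labels_, minlength=k)
density_rank = np.argsort(counts)[::-1]   # densest cluster first
dense_clusters = {density_rank[0]}        # keep only the densest cluster

# Phase 2: classify arbitrary testing data into the training clusters.
X_test = np.array([[0.1, -0.2],   # falls in the dense cluster -> accept
                   [5.2, 4.9]])   # falls in the sparse cluster -> reject
test_labels = km.predict(X_test)

# Phase 3: select testing data in information-dense clusters,
# reject the unusual testing events.
accepted = [i for i, c in enumerate(test_labels) if c in dense_clusters]
print(accepted)
```

Only test points assigned to the densely populated training cluster are passed on for prediction; the rest are flagged as outside the classifier's reliable domain.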

  • Abstract
  • 1 Introduction
  • 2 Method
  • 3 Simulation
  • 4 Results
  • 5 Conclusions
  • References
