Early Fault Classification in Rotating Machinery With Limited Data Using TabPFN

Abstract

Intelligent fault detection and classification is a cornerstone of research on prognostics and health management of rotating machinery (RM). Correctly classifying and predicting RM faults not only increases productivity in industrial plants but also reduces maintenance costs. Datasets from real facilities, which are needed to train fault classifiers, often contain few samples because provoking faults in real scenarios to obtain data is expensive. This article proposes the use of the tabular prior-data fitted network (TabPFN) model for the classification of faults in RM. TabPFN is a model that has been pretrained on a large amount of synthetic data covering many causal relationships. This pretraining allows the model to approximate Bayesian inference on the data used for training. The advantages of this model are its ability to be trained with limited data without overfitting and its high speed (if a graphics processing unit (GPU) is available). To compare its performance with traditional algorithms for tabular classification, such as XGBoost and random forest, three public datasets were used. Results show that TabPFN classifies more accurately than these algorithms when training data are limited, which makes it suitable for deployment in real scenarios where little data is available from the monitored RM.
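As an illustration of the limited-data setting, the following minimal sketch draws a small, class-balanced training subset from a larger labeled dataset. It is an assumption about how such a comparison could be set up, not the procedure used in the article; the function name `sample_per_class`, the toy features, and the fault labels are all hypothetical.

```python
import random
from collections import defaultdict


def sample_per_class(features, labels, n_per_class, seed=0):
    """Return a subset with at most n_per_class samples for each label."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    # Draw the same number of samples from every class (fewer if a class is small).
    chosen = []
    for label in sorted(by_label):
        indices = by_label[label]
        k = min(n_per_class, len(indices))
        chosen.extend(rng.sample(indices, k))

    chosen.sort()  # keep the original sample order
    return [features[i] for i in chosen], [labels[i] for i in chosen]


# Hypothetical toy data: two features per sample, three fault classes.
features = [[float(i), float(i % 7)] for i in range(60)]
labels = ["healthy", "inner_race", "outer_race"] * 20

X_small, y_small = sample_per_class(features, labels, n_per_class=5)
```

Any classifier exposing a scikit-learn style `fit`/`predict` interface (for example `TabPFNClassifier` from the `tabpfn` package, or a random forest) could then be trained on `X_small`, `y_small` and scored on the held-out samples, repeating the draw with several seeds to reduce sampling noise.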

Publication
IEEE Sensors Journal