
Addressing the Accuracy-Cost Tradeoff in Material Property Prediction: A Teacher-Student Strategy
Document Type
Working Paper
Condensed Matter - Materials Science
Computer Science - Machine Learning
Deep learning has revolutionized the discovery of new materials: state-of-the-art models can now predict material properties from chemical composition alone, without requiring the material's structure. This cost-effective approach comes at the expense of accuracy, however. Specifically, Chemical Composition-based Property Prediction Models (CPMs) are significantly less accurate than Structure-based Property Prediction Models (SPMs). To address this gap, we propose a Teacher-Student (T-S) strategy in which a pre-trained SPM serves as the 'teacher' to improve the accuracy of the CPM. First, we demonstrate the generality of the strategy: on the Materials Project (MP) and Jarvis datasets, it improves the accuracy of CPMs with two distinct network architectures, CrabNet and Roost. With the T-S strategy, T-S CrabNet becomes the most accurate model among current CPMs. Second, the strategy is particularly effective on small datasets. When predicting formation energy on a reduced MP dataset containing only 5% of the samples, the T-S strategy improved CrabNet's accuracy by 37.1%, a larger gain than it achieves on the full dataset.
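
As a rough illustration of how a teacher signal can enter a student's training objective, the sketch below combines a label loss with a term that pulls the composition-based student's representation toward that of a pre-trained structure-based teacher. This is a generic knowledge-distillation pattern assumed for illustration only; the abstract does not specify the exact objective used, and the function name `ts_loss`, the weight `alpha`, and the choice of L1 and mean-squared terms are all hypothetical.

```python
def ts_loss(student_pred, target, student_emb, teacher_emb, alpha=0.5):
    """Hypothetical teacher-student loss for a single sample.

    student_pred : property value predicted by the student (CPM)
    target       : reference property value (e.g. formation energy)
    student_emb  : student's latent representation (list of floats)
    teacher_emb  : frozen teacher's (SPM) representation, same length
    alpha        : assumed weight on the teacher-matching term, in [0, 1]
    """
    if len(student_emb) != len(teacher_emb):
        raise ValueError("student and teacher embeddings must have equal length")
    if not student_emb:
        raise ValueError("embeddings must be non-empty")

    # Supervised term: absolute error against the labelled property.
    label_loss = abs(student_pred - target)

    # Distillation term: mean squared distance to the teacher embedding.
    match_loss = sum(
        (s - t) ** 2 for s, t in zip(student_emb, teacher_emb)
    ) / len(student_emb)

    return (1.0 - alpha) * label_loss + alpha * match_loss


# Example: one sample with a two-dimensional embedding.
loss = ts_loss(1.0, 0.5, [1.0, 2.0], [1.0, 0.0], alpha=0.5)
```

With `alpha=0` the student trains on labels alone, and larger values weight agreement with the teacher more heavily. In practice the teacher would be kept frozen and used only during training, so the student would still need only the chemical composition at prediction time.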