This study presents a machine learning (ML) framework for predicting the tensile strength of aluminum alloys from chemical composition and processing history, avoiding the high cost of extensive trial-and-error experimentation. An Extreme Gradient Boosting (XGBoost) model was trained on a 1,145-sample dataset compiled from the Mendeley Data repository. The dataset combines chemical inputs (wt.% of Al, Cu, Zn) with processing history (such as “Strain hardened” or “Artificially aged”). The goal was to determine whether the model could capture the complex, non-linear relationships that govern an alloy’s final strength. On the held-out test set, the model achieved an R² of 0.9416 and a root-mean-square error (RMSE) of 37.30 MPa, demonstrating high prediction accuracy across a wide range of tensile strength values (100–800 MPa). A permutation importance analysis was run to examine what the model had learned; it identified Copper (Cu), Aluminum (Al), and Zinc (Zn), in that order, as the three most influential alloying elements in determining tensile strength. Diagnostic checks confirmed that the model is statistically sound and unbiased. The study yields insights into composition-property correlations and offers a reliable tool for accelerating aluminum alloy design and optimization.
Keywords: Aluminum Alloys, Tensile Strength, Machine Learning, Extreme Gradient Boosting, Alloy Design
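The evaluation metrics and the permutation importance analysis summarized in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: `toy_predict` is a hypothetical stand-in for a trained regressor (it is not XGBoost and not the study's model), the feature ranges and coefficients are invented, and the data are synthetic rather than drawn from the study's dataset. Importance is measured here as the mean increase in RMSE when one feature column is shuffled, which is one common variant of the technique.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the target (e.g. MPa)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical stand-in for a trained model; coefficients are invented."""
    cu, zn, unused = row  # the third feature is deliberately ignored
    return 150.0 + 60.0 * cu + 40.0 * zn


# Synthetic data only: invented wt.% ranges plus Gaussian noise on the target.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 6), data_rng.uniform(0, 8), data_rng.uniform(0, 1)]
     for _ in range(200)]
y = [toy_predict(row) + data_rng.gauss(0, 10) for row in X]

y_pred = [toy_predict(row) for row in X]
r2 = r2_score(y, y_pred)
err = rmse(y, y_pred)
importances = permutation_importance(toy_predict, X, y)

print(f"R2 = {r2:.4f}, RMSE = {err:.2f}")
for name, imp in zip(["Cu", "Zn", "unused"], importances):
    print(f"{name}: mean RMSE increase = {imp:.2f}")
```

Because `permutation_importance` only needs a `predict` callable, the same function could in principle be pointed at any fitted regressor; a feature the model ignores, like `unused` here, receives an importance of zero.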