Manual lifting tasks are a leading cause of workplace injuries, necessitating reliable methods to evaluate and mitigate the associated risks. Accurately identifying lifting stages is a critical step in ergonomic risk assessment tools such as the Revised NIOSH Lifting Equation (RNLE), which requires precise determination of the lifting action's origin and destination points. This study addresses the challenge of lifting stage identification by developing a hybrid deep learning model that integrates Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) architectures. From a dataset of 117 videos, each capturing a single lifting action, 75,920 labeled frames were generated and classified into four stages: "ready," "start," "processing," and "end." Data augmentation and class balancing techniques were employed to address class imbalance. The proposed model achieved 99% accuracy under 5-fold cross-validation, effectively distinguishing between lifting stages even in asymmetric lifting scenarios. Misclassifications occurred primarily between visually similar stages such as "ready" and "processing." The results highlight the model's potential to automate lifting stage identification, a key requirement for implementing ergonomic assessments such as the RNLE. By providing accurate and scalable predictions, this approach offers a practical solution for improving workplace safety. Future work could extend the model's applicability to more diverse lifting conditions and real-time analysis.
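
To illustrate the kind of hybrid architecture the abstract describes, the sketch below shows a minimal CNN-LSTM that classifies each frame of a short video clip into one of the four lifting stages. This is not the authors' implementation; the class name `CNNLSTMStageClassifier`, all layer sizes, and the frame resolution are placeholder assumptions for the example.

```python
# Minimal CNN-LSTM sketch (assumed layer sizes, not the study's released code):
# a per-frame CNN extracts spatial features, an LSTM models their temporal order,
# and a linear head assigns each frame to one of the four lifting stages.
import torch
import torch.nn as nn

LIFTING_STAGES = ["ready", "start", "processing", "end"]  # labels from the study


class CNNLSTMStageClassifier(nn.Module):
    def __init__(self, num_classes: int = 4, cnn_features: int = 128, lstm_hidden: int = 64):
        super().__init__()
        # Per-frame CNN feature extractor (posture, load position, etc.).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, cnn_features), nn.ReLU(),
        )
        # LSTM captures the temporal progression of the lifting action.
        self.lstm = nn.LSTM(cnn_features, lstm_hidden, batch_first=True)
        # Classification head produces per-frame stage logits.
        self.head = nn.Linear(lstm_hidden, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        seq, _ = self.lstm(feats)
        return self.head(seq)  # (batch, time, num_classes)


if __name__ == "__main__":
    model = CNNLSTMStageClassifier()
    dummy_clip = torch.randn(2, 8, 3, 112, 112)  # 2 clips of 8 RGB frames, 112x112 px
    print(model(dummy_clip).shape)  # torch.Size([2, 8, 4])
```

In practice, the per-frame logits would be trained with a cross-entropy loss against the labeled stages, with class weighting or resampling used to offset the label imbalance noted above.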