In the context of smart reverse engineering and robotic warehousing, efficient and accurate 3D model generation remains a persistent challenge, especially in cluttered and noisy environments. This paper presents an AI-powered retrofitting framework that reconstructs 3D digital twins from monocular video using Neural Radiance Fields (NeRF), offering a cost-effective alternative to traditional hardware-intensive 3D scanning. The proposed pipeline integrates NeRF-based volumetric reconstruction with HSV-based background filtering, DBSCAN clustering, and KD-tree-based spatial recovery to recognize multiple objects from complex, real-world scenes. Demonstrated through an automotive case study, this approach enables component-level recognition and digital modeling without the need for physical tags or high-end scanners. The resulting point clouds are compact, pose-consistent, and directly applicable to downstream tasks such as CAD alignment and robotic manipulation. By addressing key limitations of conventional reverse engineering methods, this work offers a robust and scalable solution for 3D modeling under real-world industrial constraints.
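To make the multi-object isolation stages named above concrete, the following is a minimal sketch of how HSV-based background filtering, DBSCAN clustering, and KD-tree-based spatial recovery could be chained on a colored point cloud exported from a NeRF reconstruction. All thresholds (hue range, `eps`, recovery radius), the toy data, and the helper names are illustrative assumptions for this sketch, not the paper's implementation or parameter values.

```python
"""Sketch of the post-NeRF segmentation stages: HSV background removal,
DBSCAN clustering, and KD-tree spatial recovery. Thresholds are illustrative."""
import numpy as np
from matplotlib.colors import rgb_to_hsv
from sklearn.cluster import DBSCAN
from scipy.spatial import cKDTree


def filter_background_hsv(colors_rgb, hue_range=(0.20, 0.45), sat_min=0.30):
    """Return a boolean foreground mask.

    Assumes a roughly uniform backdrop colour (here: greenish hues); points
    whose hue falls inside `hue_range` with enough saturation are dropped.
    """
    hsv = rgb_to_hsv(colors_rgb)  # (N, 3), all channels in [0, 1]
    is_bg = (
        (hsv[:, 0] >= hue_range[0])
        & (hsv[:, 0] <= hue_range[1])
        & (hsv[:, 1] >= sat_min)
    )
    return ~is_bg


def cluster_objects(points, eps=0.02, min_samples=30):
    """Group foreground points into per-object clusters; label -1 is noise."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points)


def recover_dropped_points(kept_pts, kept_labels, dropped_pts, radius=0.01):
    """KD-tree spatial recovery: re-attach colour-filtered points that lie
    within `radius` of an existing cluster (object points whose colour
    happened to match the background)."""
    clustered = kept_labels >= 0
    tree = cKDTree(kept_pts[clustered])
    cluster_ids = kept_labels[clustered]
    dist, idx = tree.query(dropped_pts, k=1)
    recovered = dist <= radius
    return dropped_pts[recovered], cluster_ids[idx[recovered]]


if __name__ == "__main__":
    # Toy scene: two small "objects" above a flat greenish backdrop.
    rng = np.random.default_rng(0)
    obj_a = rng.normal([0.0, 0.0, 0.3], 0.01, (500, 3))
    obj_b = rng.normal([0.2, 0.1, 0.3], 0.01, (500, 3))
    backdrop = np.c_[rng.uniform(-0.3, 0.5, (2000, 2)), np.zeros(2000)]
    xyz = np.vstack([obj_a, obj_b, backdrop])
    rgb = np.vstack([
        np.tile([0.7, 0.2, 0.2], (450, 1)),   # reddish part of object A
        np.tile([0.2, 0.8, 0.3], (50, 1)),    # greenish patch on object A
        np.tile([0.2, 0.3, 0.8], (500, 1)),   # bluish object B
        np.tile([0.2, 0.8, 0.3], (2000, 1)),  # greenish backdrop
    ])

    fg = filter_background_hsv(rgb)
    labels = cluster_objects(xyz[fg])
    rec_pts, rec_ids = recover_dropped_points(xyz[fg], labels, xyz[~fg])
    print(f"foreground: {fg.sum()}, clusters: {labels.max() + 1}, "
          f"recovered: {len(rec_pts)}")
```

In this toy run the greenish patch on object A is removed by the colour filter but re-attached by the KD-tree step, which is the kind of loss the spatial-recovery stage is meant to compensate for; real NeRF point clouds would additionally require scale and pose normalisation before the metric thresholds make sense.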