With the growing demand for real-time service in the internet era, especially dynamic requests for last-mile delivery, route planning is becoming more computationally expensive than ever before. Many supply chains (SCs) adopt joint distribution from multiple depots to reduce transportation costs and delivery times. However, providing real-time, high-quality solutions for such complex routing problems remains challenging. Current solution methods such as mathematical programming and heuristics suffer from scalability issues and long computation times. In contrast, artificial intelligence, especially Deep Reinforcement Learning (DRL), provides a general-purpose framework for sequential decision-making that has produced good results on many challenging real-life problems. Applying DRL to route multiple vehicles is nontrivial, however, because joint distribution requires an effective mechanism for collaboration and communication among all agents while they carry out the delivery mission. In this research, a collaborative Multi-Agent Deep Reinforcement Learning (MADRL) approach is proposed for routing multiple vehicles in the SC. The proposed MADRL model combines deep learning and reinforcement learning to generate routing policies for all agents in real time. Experimental results show that the proposed learning model obtains fast, high-quality solutions for complex delivery problems. Furthermore, the generalization ability of MADRL is validated by testing the trained model on problems of different scales.