In the evolving landscape of global trade and digital finance, optimizing liquidity and managing credit risk in supply chain finance (SCF) have become pressing challenges. This study proposes a reinforcement learning (RL)-based framework that dynamically optimizes cash flow strategies within SCF ecosystems. By integrating machine-learning-driven credit evaluation, multi-agent decision modeling, and adaptive policy learning, the framework addresses long-standing inefficiencies in conventional SCF practice: rigid pricing structures, static credit scoring methodologies, and misaligned stakeholder incentives. SCF interactions are formulated as a stochastic Markov decision process (MDP), with game-theoretic mechanisms capturing the negotiation dynamics between borrowers and lenders. Numerical experiments on both synthetic data and empirical records from multinational corporations show that the RL-based policy significantly reduces borrower financing costs, improves default risk management, and increases platform profitability across a range of macro-financial conditions. The proposed approach provides a scalable, interpretable, and resilient decision-support system and illustrates the potential of artificial intelligence to reshape SCF.