Serverless computing has reshaped the cloud computing landscape by offering auto-scalability, streamlined operational management, and granular billing. As its adoption grows, performance and cost optimization challenges have emerged in hybrid architectures that combine private servers and public cloud clusters. Central to these challenges are achieving optimal response latency and balancing performance against cost. To address them, this paper introduces an adaptive routing service designed for hybrid environments that leverages real-time function metrics. The service comprises three components: a monitor that captures performance metrics and raises alarms for predefined anomalies; a forecaster that predicts function latency on each cluster, including wait and execution times, and produces a request distribution across clusters that equalizes the overall function latency; and a router that processes incoming requests according to the forecaster’s predictions. Based on user-defined objectives, the forecaster can be directed either to minimize latency or to optimize execution cost by trading off wait and execution time. Comprehensive evaluations on AWS and Azure clusters using the open-source FaaS framework Apache OpenWhisk demonstrate the effectiveness of our approach: it yields a 9% improvement in average latency, a 45% decrease in the standard deviation of latency, and a 17% cost reduction compared to conventional 50-50 routing. The evaluations also show that a higher monitoring frequency leads to quicker convergence.
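To make the forecaster's equalization step concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a simple linear latency model in which a cluster's predicted latency is a fixed execution time plus a wait time that grows in proportion to the share of requests routed to it. The function name `equalizing_split`, the parameters `exec_s` and `wait_per_share_s`, and all numeric values are hypothetical.

```python
from typing import Dict


def equalizing_split(exec_s: Dict[str, float],
                     wait_per_share_s: Dict[str, float]) -> Dict[str, float]:
    """Return request shares (summing to 1) that equalize predicted latency.

    Assumed model (an illustration, not taken from the paper):
        latency_c(x) = exec_s[c] + wait_per_share_s[c] * x
    where x is the fraction of requests sent to cluster c and every
    wait_per_share_s[c] is strictly positive.
    """
    active = set(exec_s)
    while True:
        inv = sum(1.0 / wait_per_share_s[c] for c in active)
        # Common latency level at which the shares of active clusters sum to 1.
        level = (1.0 + sum(exec_s[c] / wait_per_share_s[c] for c in active)) / inv
        # A cluster whose execution time alone reaches the level gets no traffic.
        too_slow = {c for c in active if exec_s[c] >= level}
        if not too_slow:
            break
        active -= too_slow
    return {
        c: (level - exec_s[c]) / wait_per_share_s[c] if c in active else 0.0
        for c in exec_s
    }


if __name__ == "__main__":
    # Hypothetical forecasts for two clusters, in seconds.
    shares = equalizing_split(
        exec_s={"aws": 0.2, "azure": 0.3},
        wait_per_share_s={"aws": 0.4, "azure": 0.4},
    )
    print(shares)
```

Under this assumed model, the faster cluster receives the larger share until the added wait time offsets its execution-time advantage. A router could then sample a target cluster for each request using these shares as weights.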