In Internet of Things (IoT) systems, accurately forecasting the runtime of inference models on heterogeneous devices is essential for optimizing resource allocation, particularly in the contexts of Model Distributed Inferencing (MDI) and Data Distributed Inferencing (DDI). This paper examines the application of Gradient Boosting Regression (GBR) as a predictive modeling technique for estimating runtime in both MDI and DDI scenarios. GBR offers a favorable trade-off between interpretability, robustness to noise, and suitability for moderately sized datasets. The study reviews prior work on IoT inference optimization and highlights the challenges posed by device heterogeneity and the importance of model interpretability in MDI and DDI setups. The primary contribution of this research is the novel application of GBR to predict machine learning inference runtime in MDI and DDI contexts, an approach that is particularly valuable when empirical data is limited and newly introduced devices must be characterized. The paper details the GBR algorithm's use, its hyperparameters, and custom loss functions tailored to MDI and DDI. The results section evaluates GBR's performance across computational regimes, including MDI and DDI, and offers insight into the model's balance between accuracy and complexity. A performance comparison with prior models underscores GBR's efficacy in predicting runtime for MDI and DDI. This work contributes to the ongoing discourse on IoT optimization and predictive modeling.
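To make the modeling setup concrete, the following is a minimal sketch of GBR-based runtime prediction using scikit-learn. The feature names (CPU clock, memory, device count, partition size), the synthetic data, and the hyperparameter values are illustrative assumptions rather than the paper's dataset or tuned configuration; the paper's custom loss functions are approximated here with the built-in Huber loss for robustness to noisy runtime measurements.

```python
# Minimal sketch: predicting inference runtime with Gradient Boosting Regression.
# Features, data, and hyperparameters below are illustrative assumptions, not
# the paper's actual dataset or configuration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 500

# Hypothetical device/workload features for an MDI/DDI setting:
# CPU clock (GHz), RAM (GB), number of cooperating devices, and the size
# of the model partition or data shard assigned to each device (MB).
X = np.column_stack([
    rng.uniform(0.8, 2.4, n),   # cpu_ghz
    rng.uniform(0.5, 8.0, n),   # ram_gb
    rng.integers(1, 9, n),      # n_devices
    rng.uniform(1, 50, n),      # partition_mb
])

# Synthetic runtime (seconds) with noise, only to make the example runnable.
y = 2.0 * X[:, 3] / (X[:, 0] * X[:, 2]) + 0.1 * rng.standard_normal(n)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Placeholder hyperparameters; in practice these would be tuned per scenario.
gbr = GradientBoostingRegressor(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=3,
    loss="huber",   # robust stand-in for a custom loss on noisy runtimes
    random_state=0,
)
gbr.fit(X_train, y_train)

pred = gbr.predict(X_test)
print(f"MAE on held-out runtimes: {mean_absolute_error(y_test, pred):.3f} s")
```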