This article introduces a versatile multi-task learning framework (UMT-Net) and an adaptive task weighting (ATW) training method, specifically designed for resource-constrained scenarios that demand parameter-efficient networks. The adaptable UMT-Net architecture comprises a global-shared encoder-based backbone, task-specific self-attention modules, inter-task joint-attention fusion modules, and feature-aggregating decoders. The ATW technique accounts for both short-term variation and long-term statistics of task losses, leading to a more stable training process. Extensive experiments on the CityScapes and NYUv2 datasets show that UMT-Net outperforms baseline methods while requiring less computation, fewer model parameters, and lower inference latency. In addition, we conducted experiments on the autonomous driving dataset BDD100K and achieved state-of-the-art performance. Furthermore, we deployed the model and tested its generalization in real-world scenarios. Finally, the architecture can be scaled down to a compact model with far fewer parameters, lower computational cost, and shorter inference time while maintaining competitive performance, making it suitable for deployment on mobile devices.
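As an illustration of the kind of scheme the abstract describes, the sketch below derives per-task loss weights from both a short-term signal (the recent descent ratio of each task's loss) and a long-term one (an exponential moving average of past losses). The specific formulas, the `window`, `ema_decay`, and `temperature` parameters, and the softmax normalization are all assumptions for illustration, not the paper's actual ATW method.

```python
import numpy as np

def atw_weights(loss_history, window=3, ema_decay=0.9, temperature=2.0):
    """Illustrative adaptive task weighting (assumed scheme, not the
    paper's exact ATW): combine a short-term loss-descent ratio with a
    long-term EMA statistic, then normalize with a softmax."""
    loss_history = np.asarray(loss_history, dtype=float)  # shape (T, K): T steps, K tasks
    T, K = loss_history.shape
    # Short-term: how much each task's loss fell over the last `window` steps
    # (ratio near 1 means the task has stopped improving).
    recent = loss_history[-1] / (loss_history[max(0, T - window)] + 1e-8)
    # Long-term: exponential moving average of each task's loss.
    ema = loss_history[0].copy()
    for t in range(1, T):
        ema = ema_decay * ema + (1.0 - ema_decay) * loss_history[t]
    # Tasks improving slowly, or whose current loss is high relative to
    # their long-term average, receive larger weights.
    score = (recent + loss_history[-1] / (ema + 1e-8)) / temperature
    w = np.exp(score - score.max())          # numerically stable softmax
    return K * w / w.sum()                   # weights sum to the task count K
```

A training loop would recompute these weights each epoch from the logged task losses and use them to scale the per-task terms of the total loss.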