Temporal action localization (TAL) plays a crucial role in video understanding, aiming to determine the class of each action instance and pinpoint its starting and ending frames in untrimmed videos. This paper introduces an anchor-free TAL approach built on a U-shaped network. Through a backbone and feature pyramid network, the input video is transformed into a 1D feature sequence that encapsulates the video's spatio-temporal information. Within the U-shaped feature pyramid, features from temporal convolutional layers at different resolutions are merged, aggregating information across multiple temporal scales. A preliminary regressor and classifier then generate initial proposal sequences. For each proposal, the predicted temporal region is extracted and key boundary features are obtained through a boundary pooling network. These boundary features, combined with the pyramid features, are used to produce refined temporal regression and action classification predictions. Because our approach dispenses with predefined anchors, it reduces the number of outputs and the complexity of hyperparameter tuning. Experiments on the THUMOS14 dataset demonstrate the clear advantages and strong performance of our anchor-free TAL method.
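
To make the anchor-free prediction step concrete, the following is a minimal illustrative sketch, assuming a PyTorch-style implementation: for every temporal position on every pyramid level, a shared head predicts class scores and the distances to the action's start and end, with no predefined anchors. All module names (e.g., `AnchorFreeHead`), channel widths, and hyperparameters are hypothetical and do not correspond to the paper's actual code.

```python
# Illustrative anchor-free head for TAL over a 1D feature pyramid.
# Names, channel widths, and shapes are assumptions, not the paper's code.
import torch
import torch.nn as nn


class AnchorFreeHead(nn.Module):
    """Predict, at each temporal position of each pyramid level,
    class scores and offsets to the action start/end (no anchors)."""

    def __init__(self, in_channels: int = 256, num_classes: int = 20):
        super().__init__()
        self.cls_branch = nn.Sequential(
            nn.Conv1d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(in_channels, num_classes, kernel_size=1),
        )
        self.reg_branch = nn.Sequential(
            nn.Conv1d(in_channels, in_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            # 2 outputs per position: distances to the start and end boundaries
            nn.Conv1d(in_channels, 2, kernel_size=1),
        )

    def forward(self, pyramid_feats):
        # pyramid_feats: list of tensors, each of shape (B, C, T_level)
        cls_logits, boundary_offsets = [], []
        for feat in pyramid_feats:
            cls_logits.append(self.cls_branch(feat))                    # (B, num_classes, T)
            # ReLU keeps predicted start/end distances non-negative
            boundary_offsets.append(torch.relu(self.reg_branch(feat)))  # (B, 2, T)
        return cls_logits, boundary_offsets


if __name__ == "__main__":
    # Toy pyramid with three temporal resolutions for a batch of two clips.
    feats = [torch.randn(2, 256, t) for t in (128, 64, 32)]
    head = AnchorFreeHead(in_channels=256, num_classes=20)
    scores, offsets = head(feats)
    print([s.shape for s in scores], [o.shape for o in offsets])
```

Because each temporal position directly regresses its own start/end offsets, the output size scales only with the pyramid length rather than with a grid of anchor scales and ratios, which is the source of the reduced output volume and hyperparameter burden described above.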