To reduce the reliance of existing mainstream vision-based crack detection algorithms on annotated datasets, this paper presents an unsupervised semantic segmentation-based approach. The proposed method first employs the Felzenszwalb-Huttenlocher algorithm for pre-segmenting the image, generating superpixels. Subsequently, a with an autoencoder model is designed to progressively approximate the superpixel segmentation results, and the optimal model is obtained by Bayesian optimization. Through comparative experiments with existing algorithms, it has been demonstrated that the proposed method performance comparable to supervised algorithms, even without the need for labeled data. As a result, the deployment complexity of the algorithm is significantly reduced, while expanding its applicability. [ABSTRACT FROM AUTHOR]