To improve the autonomy and intelligence of express UAVs using visual information, this paper studies the cognitive understanding of delivery scenarios, focusing on the difficulty of detecting small-scale targets in complex scenes. A target detection method based on conventional scale is proposed. First, the problems of existing small-target detection architectures are analyzed. Then, a small-target detection method is proposed that configures the backbone network and constructs a candidate-box generation model, which improves the accuracy of small-scale target detection and recognition. Finally, experiments are carried out in which an express UAV uses vision to identify landmarks, packages, and obstacles; they verify that visual information helps the UAV grasp packages accurately, deliver packages, correct its course, and avoid obstacles. The results show that visual information improves the autonomy of express UAVs when performing tasks.