• Ignorance scores are a novel algorithm to explore digital biodiversity knowledge.
• We used ignorance scores to explore GBIF species records for the Caatinga.
• Our analysis reveals taxonomic and spatial biases in available species records.
• Accessibility and convenience are associated with recording effort in the Caatinga.
• Ignorance scores are a simple but useful indicator of species recording effort.

The availability of quality information about species distributions is clearly central to the development of successful conservation efforts. Digital records of species occurrences are increasingly available and have been used in a number of conservation applications, such as species distribution models and conservation prioritization efforts. However, our knowledge of species distributions is still affected by several shortfalls which limit our capacity for effective action if not properly scrutinized. Ignorance scores have been recently proposed as an intuitive and straightforward indicator of biodiversity knowledge availability, but to date their usefulness in assessing biases in species occurrence data has been poorly explored in the scientific literature. We used ignorance scores to characterize and identify the factors driving the availability of recent species occurrence records in the Global Biodiversity Information Facility (GBIF) for multiple taxa in the Caatinga ecoregion, the largest seasonally dry tropical forest in the world. Specifically, we calculated ignorance scores based on species records within 10 × 10 km cells covering the Caatinga region and modelled the relationship between ignorance scores and a set of socio-geographical variables using generalized additive models for location, scale and shape (gamlss). Most studied taxa had high ignorance scores across the Caatinga, indicating a low availability of recent species records in GBIF for this region.
Our results also suggest that factors associated with accessibility and convenience are the main correlates of species recording effort in this region. Ignorance scores were lower at intermediate values of road and human population density, indicating that observers tend to avoid both urban and inaccessible areas. We also found evidence of increased recording effort in areas close to universities and protected areas, while vegetation cover seemingly had little effect on ignorance scores. Overall, our results suggest that efforts to compile and digitize recent species occurrence records should be encouraged in order to improve our knowledge of this region's unique biodiversity and the efforts to preserve it. Furthermore, ignorance scores are a useful indicator of the availability and distribution of species occurrence records in the Caatinga. We discuss a range of potential extensions to this indicator that could expand its scope for future applications. [ABSTRACT FROM AUTHOR]
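
The abstract does not state how the ignorance scores were computed. As a rough illustration only, the sketch below assumes the commonly used "half-ignorance" formulation, in which a cell with N records scores O_0.5 / (N + O_0.5), where O_0.5 is the number of records at which ignorance is taken to be 0.5. The function names, the default `o_half=1.0`, and the use of projected coordinates in metres are all assumptions made for this example, not details from the study.

```python
import math
from collections import Counter

CELL_SIZE_M = 10_000  # 10 x 10 km cells, matching the grid size in the abstract


def cell_index(x_m, y_m, cell_size=CELL_SIZE_M):
    """Map projected coordinates (metres) to an integer (column, row) grid cell."""
    return (math.floor(x_m / cell_size), math.floor(y_m / cell_size))


def ignorance_scores(records, o_half=1.0, cell_size=CELL_SIZE_M):
    """Return {cell: score} using the assumed half-ignorance formula.

    records: iterable of (x_m, y_m) occurrence coordinates.
    o_half:  assumed number of records at which ignorance equals 0.5.
    """
    counts = Counter(cell_index(x, y, cell_size) for x, y in records)
    return {cell: o_half / (n + o_half) for cell, n in counts.items()}


def score_for_cell(scores, cell):
    """Cells with no records are treated as fully unknown (score 1.0)."""
    return scores.get(cell, 1.0)


# Hypothetical occurrence records, purely for demonstration.
example_records = [(500, 500), (9_999, 0), (10_000, 0), (25_000, 15_000)]
example_scores = ignorance_scores(example_records)
```

Scores run from values near 0 (many records, well-sampled cell) to 1 (no records). In this form the indicator depends only on record counts per cell, which is what makes it simple to compute across many taxa.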