Taxonomic instability in (multi)sets of phylogenetic trees is often caused by missing data, analytical artefacts and/or data incongruence due to homoplasy or loci with different evolutionary histories. This thesis focuses primarily on methods for subsetting and summarising heterogeneous (multi)sets of trees, and on an approach to mitigate the effects of non-effective overlap caused by non-random patterns of missing data. The definition of tree islands is generalised to any tree-to-tree distance metric, which allows these tree subsets to be identified easily in any tree distribution, and not just as a byproduct of heuristic parsimony tree searches. Expanding on earlier studies, partitioned-by-island, weighted-by-island-size and rarefied-by-island-size consensus methods are proposed, and the effect of islands on topology-based taxonomic instability tests is explored. An R package, islandNeighbours, that extracts islands from trees on the same leaf set is described and applied to a Bayesian tree distribution. For trees on non-identical leaf sets, a new subsetting strategy based on tree-to-supertree distances, termed clumps of trees, is proposed and applied to multiple tree (multi)sets with the newly developed clumpy Python pipeline. An approach combining (gene-)tree jackknifing on matrix representation of splits with Concatabominations (a heuristic compatibility-based taxonomic instability test, Siu-Ting et al. 2015) is proposed to identify instances of non-effective overlap in a newly inferred caecilian Tree of Life, as well as candidate loci for targeted taxon sampling aimed at improving taxonomic overlap. This approach is also compared with the mathematical gene sampling sufficiency approach.
Lastly, a morphological dataset used to illustrate the presence and effects of islands, and the effects of focal tree choice on clumps, is thoroughly reanalysed, and an easily implemented tool for comparing branch support measures across trees with identical leaf sets is described and illustrated with trees inferred from a hypothetical dataset.