Data for Pan-cancer classification of single cells in the tumour microenvironment. Most files are saved in .RDS format and can be loaded in R using the readRDS function.

main_figure_scripts.zip contains the plotting scripts for all main figures and the script to generate gold_standard_cells_matrix_for_fig2bc.RDS.
run_scATOMIC_on_query_data.R contains the script used to run scATOMIC on query data.
training_files_for_scATOMIC_core.tar.gz contains the count matrices, metadata files, and scripts used to train the core scATOMIC model. The reference_datasets folder contains the training matrices and metadata that were used to derive marker genes and balanced training matrices using the get_markers_per_layer.R and generate_balanced_training.R scripts. The markers folder contains the results of differential gene expression between cell types at each layer. The balanced_training folder contains the class-balanced matrices used to train each layer in the train_classifiers.R script. The classifier_outputs folder contains the output models and gene lists used.
harmonized_results_matrix_and_metadata_all_patients_all_cells.RDS is a matrix of all cells with their associated metadata, scATOMIC annotations, and annotations from other tools.
gold_standard_cells_matrix_for_fig2bc.RDS is a filtered subset of cells with a more confident, gold-standard ground truth, used for external validation in plot_Figure_2BC.R.
We also provide count matrices for each patient in split_by_patient_matrices.tar.gz.
prediction_list_files.tar.gz contains the results of run_scATOMIC() for each patient.
scATOMIC_annotations_per_sample.tar.gz contains the summarized results matrix for each individual patient.
Pal_et_al_breast_cancer_data.tar.gz contains the results of scATOMIC in each breast cancer subtype sample.
kfold_results_and_scripts.tar.gz contains the results and scripts used for k-fold cross-validation.
scATOMIC_1.1.0.tar.gz contains the R package (version used in the manuscript).
scripts_run_other_methods.zip contains the scripts used to train or build references for, and to run, the other annotation methods that scATOMIC is compared against in Fig. 2C.
metastatic_summary.RDS contains the data used for plotting Figure 6.
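
As a minimal sketch of loading one of the .RDS files in R: the helper load_rds below is hypothetical (it is not part of the scATOMIC package), and the commented-out call assumes the file has been downloaded into the current working directory. The demonstration uses a small stand-in object written to a temporary file so that the snippet runs without the data.

```r
# Hypothetical helper (not part of scATOMIC): read an .RDS file and stop
# with a clear message if the path does not exist.
load_rds <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  readRDS(path)
}

# Stand-in object saved to a temporary file, so the example is self-contained.
# The values are placeholders and do not come from the dataset.
demo_path <- tempfile(fileext = ".RDS")
demo_object <- data.frame(cell = c("cell_1", "cell_2"),
                          label = c("a", "b"))
saveRDS(demo_object, demo_path)

demo_loaded <- load_rds(demo_path)
str(demo_loaded)

# With the real data in the working directory (assumed location), the same
# call would look like this:
# results <- load_rds("harmonized_results_matrix_and_metadata_all_patients_all_cells.RDS")
# dim(results)
```

The .tar.gz archives need to be extracted before the .RDS files inside them can be read; base R's untar() is one way to do that.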