Benchmarking computational methods for single-cell chromatin data analysis

The analysis of single-cell ATAC-seq (scATAC-seq) data is challenging because the data are high-dimensional and sparse. In our recent preprint, we benchmarked current scATAC-seq data processing pipelines and evaluated their strengths and weaknesses across different representative datasets. We found that different approaches perform best on different types of datasets.

Our assessments spanned various stages of the data processing workflow. Using existing metrics as well as novel ones that we developed, we gained insight into individual steps, including the choice between genomic peaks and tiles, different peak calling methods, and the selection of latent space dimensions.

We found that SnapATAC and SnapATAC2 are preferred for datasets with complex cell-type structures, that a feature aggregation strategy performs best for simpler datasets, and that the widely used latent semantic indexing (LSI)-based methods tend to underperform and show strong library size bias.
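
For readers unfamiliar with LSI, the following is a minimal Python sketch of the general idea: a TF-IDF transform of the cell-by-feature count matrix followed by a truncated SVD. It is not the benchmark code from the preprint. The TF-IDF variant, the simulated toy data, and the helper names `lsi_embedding` and `depth_correlation` are all assumptions made for illustration. The second helper shows one simple way to look for library size bias, by correlating each latent component with the log of the per-cell total count.

```python
import numpy as np


def lsi_embedding(counts, n_components=10):
    """TF-IDF followed by truncated SVD on a cells x features count matrix.

    Illustrative only: this is one common TF-IDF variant, not necessarily
    the one used by any particular scATAC-seq package.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]

    # Term frequency: counts normalised by each cell's library size.
    library_size = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(library_size, 1.0)

    # Inverse document frequency: down-weight features open in many cells.
    feature_freq = (counts > 0).sum(axis=0)
    idf = np.log(1.0 + n_cells / np.maximum(feature_freq, 1))

    tfidf = tf * idf

    # Truncated SVD: keep the leading components, scaled by singular values.
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def depth_correlation(embedding, library_size):
    """Absolute Pearson correlation of each component with log library size."""
    log_depth = np.log1p(np.asarray(library_size, dtype=float))
    return np.array([
        abs(np.corrcoef(embedding[:, k], log_depth)[0, 1])
        for k in range(embedding.shape[1])
    ])


# Simulated toy data (an assumption): two cell types, variable capture rate.
rng = np.random.default_rng(0)
n_cells, n_features = 200, 500
capture_rate = rng.uniform(0.02, 0.3, size=(n_cells, 1))
labels = rng.integers(0, 2, size=n_cells)
accessibility = rng.uniform(0.2, 1.0, size=(2, n_features))
counts = rng.binomial(1, capture_rate * accessibility[labels])

embedding = lsi_embedding(counts, n_components=10)
corr = depth_correlation(embedding, counts.sum(axis=1))
print(np.round(corr, 2))
```

A high correlation for a given component suggests that it tracks sequencing depth more than biology. This check is only a rough diagnostic and is not one of the metrics used in the benchmark.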

Check out the publication here: https://doi.org/10.1101/2023.08.04.552046
