Overview

In this course we introduce basic analysis of single-cell RNAseq data, with a specific focus on the 10X system. The course is divided into three sessions. In the first session, we will introduce how to interpret the Cell Ranger QC report and how to analyze data with the LOUPE Browser. We will also demonstrate the customized analyses we can currently support. In the second session, we will demonstrate how to process scRNAseq data and make QC reports with Seurat. In the third session, we will discuss how to do more advanced analysis and QC with Bioconductor packages.

Cell Ranger results


Cell Ranger

Cell Ranger is a widely used software package for single-cell sequencing data analysis. It supports several kinds of analysis for 10X single-cell sequencing data:

  • Alignment and counting
  • Analysis of VDJ changes
  • Aggregation of datasets
  • Conversion of BCL files (raw files) into FASTQ files
  • Generation of customized GTF and genome files

Basic reports of Cell Ranger

Session information

Sequencing QC

Mapping QC

General information

  • How many cells should we get?
  • Is the throughput enough?

It depends on the sample characteristics and the library complexity.

How to determine cell numbers?

Knee plot

  • A knee plot is used to determine the number of real cells in a single-cell dataset.
  • The x-axis represents the cell barcodes ranked by UMI count. The y-axis represents the UMI count detected per cell barcode.
  • A knee plot usually contains two bumps. We take the mid-point of the first bump as the cut-off to differentiate real cells from background (empty droplets).
  • Once real cells and empty droplets are defined, we can estimate the transcripts detected in real cells and in empty droplets.
  • The transcripts in empty droplets, so-called ambient RNA, are cell-free RNAs falsely captured in droplets. We will demonstrate how to remove the interference of ambient RNA in session 3.
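
The ranking behind a knee plot, and one simple way to place a cut-off on it, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names, the `expected_cells` parameter, and the toy counts are hypothetical. The cut-off rule (one tenth of the 99th-percentile UMI count among the top expected cells) is a simple order-of-magnitude heuristic used here for illustration, and is not necessarily the method Cell Ranger applies in its report.

```python
def rank_barcodes(umi_counts):
    """Return (rank, umi_count) pairs, highest UMI count first (rank starts at 1)."""
    ordered = sorted(umi_counts, reverse=True)
    return list(enumerate(ordered, start=1))


def order_of_magnitude_cutoff(umi_counts, expected_cells):
    """Hypothetical heuristic: cut-off = robust maximum / 10.

    The robust maximum is the 99th-percentile UMI count among the
    `expected_cells` barcodes with the most UMIs.
    """
    top = sorted(umi_counts, reverse=True)[:expected_cells]
    if not top:
        raise ValueError("no barcodes given")
    # Index of the 99th percentile when counting down from the largest value.
    idx = int(round(0.01 * (len(top) - 1)))
    robust_max = top[idx]
    return robust_max / 10


def count_cells(umi_counts, cutoff):
    """Number of barcodes at or above the cut-off, treated as real cells."""
    return sum(1 for c in umi_counts if c >= cutoff)


# Toy data (made up): 5 cell-containing droplets and 30 empty droplets.
toy_counts = [10000, 9000, 8000, 7000, 6000] + [50] * 10 + [10] * 20
cutoff = order_of_magnitude_cutoff(toy_counts, expected_cells=5)
n_cells = count_cells(toy_counts, cutoff)
```

Plotting the output of `rank_barcodes` on log-log axes gives the knee plot itself; barcodes below the cut-off are the candidates for empty droplets carrying ambient RNA.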

How to determine throughput?

  • Downsample the reads and evaluate the median genes per cell at different throughputs (mean reads per cell)
  • Downsample the reads and evaluate the proportion of unique UMIs at different throughputs (mean reads per cell)

Genes vs throughput

The number of genes detected does not seem saturated yet.
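
The downsampling idea behind a genes-versus-throughput curve can be sketched in Python as below. This is an illustrative sketch, not Cell Ranger's implementation: it assumes each mapped read has already been reduced to a (cell barcode, UMI, gene) tuple, and the function names and toy reads are hypothetical.

```python
import random
from statistics import median


def downsample_reads(reads, fraction, seed=0):
    """Randomly keep a fixed fraction of reads (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(reads, int(len(reads) * fraction))


def median_genes_per_cell(reads):
    """Median number of distinct genes detected per cell barcode."""
    genes_by_cell = {}
    for barcode, _umi, gene in reads:
        genes_by_cell.setdefault(barcode, set()).add(gene)
    if not genes_by_cell:
        return 0
    return median(len(genes) for genes in genes_by_cell.values())


def genes_vs_throughput(reads, fractions, seed=0):
    """Return (mean reads per cell, median genes per cell) for each fraction.

    Mean reads per cell uses the number of cells in the full dataset.
    """
    n_cells = len({barcode for barcode, _umi, _gene in reads})
    curve = []
    for fraction in fractions:
        sub = downsample_reads(reads, fraction, seed)
        curve.append((len(sub) / n_cells, median_genes_per_cell(sub)))
    return curve


# Toy reads (made up): one (cell barcode, UMI, gene) tuple per mapped read.
toy_reads = [
    ("A", "u1", "g1"), ("A", "u1", "g1"), ("A", "u2", "g2"), ("A", "u3", "g3"),
    ("B", "u1", "g1"), ("B", "u1", "g1"),
]
curve = genes_vs_throughput(toy_reads, fractions=[0.5, 1.0])
```

If the median genes per cell keeps rising as the fraction approaches 1.0, the library is not yet saturated and deeper sequencing would likely detect more genes.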

Saturation vs throughput

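  • Saturation: measured from the proportion of uniquely detected UMIs among all reads with a valid barcode and UMI. Cell Ranger reports sequencing saturation as 1 minus this proportion, so a higher value means more reads are duplicates of UMIs already seen.
  • It can be used to evaluate both sequencing saturation and library complexity, for example the level of PCR duplication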

The library does not seem saturated yet and shows lower library complexity, probably due to more PCR duplicates.
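
As a worked illustration of the saturation metric, the Python sketch below computes it as 1 minus the proportion of unique (cell barcode, UMI, gene) combinations among all reads, which follows the way Cell Ranger defines sequencing saturation. The function names and toy reads are hypothetical, and the sketch assumes reads have already been filtered to those with a valid barcode and UMI.

```python
def sequencing_saturation(reads):
    """1 - (unique barcode/UMI/gene combinations) / (total reads)."""
    if not reads:
        raise ValueError("no reads given")
    return 1 - len(set(reads)) / len(reads)


def duplicate_reads(reads):
    """Reads that repeat an already-seen barcode/UMI/gene combination."""
    return len(reads) - len(set(reads))


# Toy reads (made up): one (cell barcode, UMI, gene) tuple per mapped read.
toy_reads = [
    ("A", "u1", "g1"), ("A", "u1", "g1"), ("A", "u2", "g2"), ("A", "u3", "g3"),
    ("B", "u1", "g1"), ("B", "u1", "g1"),
]
saturation = sequencing_saturation(toy_reads)  # 4 unique out of 6 reads
n_duplicates = duplicate_reads(toy_reads)
```

A value near 0 means almost every read reveals a new molecule; a value near 1 means additional reads mostly resample molecules that were already counted.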

tSNE plot and clustering

The cells seem to be grouped by UMI counts.

Data visualization with LOUPE Browser

  • Please install the latest version
  • How to access the expression of particular genes
  • How to use split view
  • How to define sub-groups
  • How to calculate differential gene expression
  • Could I import projections and more information generated by external programs?

Customized Analysis


Demonstration

Here we use an article published by the Fuchs Lab as an example to demonstrate the customized analysis we have done here at the BRC. Please access this article.

  • General QC plots
  • Data integration and normalization
  • Clustering
  • Cell type annotation
  • Pseudotime analysis
  • Differential gene expression and single-cell pathway analysis

The next two sessions

We will also use the data from the eLife paper in the next two sessions. You can download the data from the GEO link or from the DropBox link.