Loupe Browser


Course materials

Links to material and slides for this course can be found on github.

Or can be downloaded as a zip archive from here.

Course materials

Once the zip file in unarchived. All presentations as HTML slides and pages will be available in the directories underneath.

  • r_course/presentations/slides/ Presentations as an HTML slide show.
  • r_course/presentations/singlepage/ Presentations as an HTML single page.

10X scRNAseq datasets


Sample Prep

There are several single cell sequencing technologies. The most common is from 10X Genomics.

overview
overview

A scRNAseq workflow

We often need many tools across several computational languages to analyze these complex experiments. Deciding what is appropriate often depends on the data set and its QC metrics. Our first step is nearly always to run Cell Ranger. overview

.cloupe file

The product of this QC report, count matrices for downstream analysis and a Loupe file. The .cloupe extension denotes the Loupe file; a specialized file which can be opened by Loupe Browser.

At this point the sequenced reads have been mapped to genes and empty/bad droplets have been filtered out.

Loupe Browser

Loupe Browser is a visualization and analysis software for scRNAseq. It plays a key role in exploratory analysis and QC allowing us to:

  • Check QC markers
  • Review clustering
  • Check marker genes
  • Run differentials
  • Subset the data

Install Loupe Browser

Loupe Browser can be installed from the Loupe Browser website. The current version is 8, though most older versions are still available.

Opening Loupe Browser

When we open Loupe Browser we are greeted by a Home page.

overview
overview

Opening Loupe Browser

When we open Loupe Browser we are greeted by a Home page.

overview
overview

Opening Loupe Browser

When we open Loupe Browser we are greeted by a Home page.

  • Center

  • Tutorial Files - Contains several example data sets.

  • Recent Files - Contains prior used data sets.

  • Top Right

  • Open Loupe Files - Open a .cloupe file you have locally

  • Left

  • Documentation and Tutorials - Guides for using the browser

  • Public Datasets - 10X’s repository of datasets

Lets start

We will start out using the data set for AML in the Tutorial Files section. This data set is from 3 donors. 2 healthy bone marrow samples and 1 from a patient with AML. There are 8,414 cells.

Zheng, G., Terry, J., Belgrader, P. et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun 8, 14049 (2017). https://doi.org/10.1038/ncomms14049

Panels

The data will take a few seconds to load in and then we can see our data and the main control panels.

  • View (Our Data) [Middle]
  • Projections [Top]
  • Toolbar [Left]
  • Differential Expression [Bottom Right]
overview
overview

Data and Projections

By default our data is shown as a t-SNE. This is a popular type of Dimension Reduction. This simplifies our very complex highly dimensional data into a 2D space. In this space each dot is a cell and the local relationship between cells is preserved.

  • We can Zoom in using the Zoom icon [Bottom left].
  • We can drag and drop to navigate around
  • We can alter the dot size using projection settings [Top]
  • We can hide most of the UI with corner square button [Top]

Toolbar

Bulk of the activities we will want to do is contained within the left toolbar.

  • Clusters
  • Features
  • Reanalyze
  • Advanced Selection
  • V(D)J Clonotypes
  • Search for Features
overview
overview

Clustering


Clustering

Lets start with clustering. Cell Ranger does some basic clustering for you. It performs clustering in two ways:

  • Graph-Based
  • K-means

The cells are colored by the Graph-Based clustering method be default. We can use the toolbar to switch easily between the two.

Clustering

overview
overview

Graph-Based vs K-means

  • Graph-based: This is built on relationship between cells, grouping cells together based on a nearest-neighbor model.

  • K-means: This assigns cells to a cluster based on their distance to a pre-determined number of cluster centers. This typically leads to round clusters.

Generally graph-based is better for identifying more complex spatial patterns. K-means is typically simpler.

Clustering

All clusters are shown. We can focus on just a few quite easily by clicking on the check box. We can also edit color and name by clicking on the three dots.

overview
overview

Clustering

Alternatively, we can use the split view to help clarify the distribution of the different plots.

overview
overview

Clustering and QC

This is our first chance to really assess the quality of our data.

We are looking for clearly defined clusters. For this we need easily distinguishable groups of cells that have classified correctly into clusters.

We don’t want an inkblot with indistinguishable/overlapping cluster assignments.

Clustering

You can download the cluster information easily by clicking the download button. This gives you a table containing every Cell Barcode and the cluster it is assigned to. This is a .csv file which can be opened in Excel/Numbers/Sheets or by Python/R.

overview
overview

Custom Groups

This data has several custom markers associated with it: AMLStatus and LibraryID.

We can also check the distribution of these identities across our tSNE in much the same way we can review the clusters.

Adding Custom Markers

We can add our own groups. We can do this in two ways.

  1. Upload a file .csv containing Cell Barcodes and group information.
  • If you download the course material there is an example files in: LoupeBrowser/r_course/data/smoke_status_metadata.csv
  1. Manually create rules to create groups in Loupe Browser (we will show you this later)

Diffential Expression


Diffential Expression

We can also get differential expression results from our analysis. This can be another important step in the QC of our data and start to provide insight into the biological findings.

The are various ways we can use the differential analysis:

  • Markers
  • Pairwise Comparisons
  • Pseudobulk

Markers

By default differential gene expression is calculated for each cluster. It does this by comparing each cluster to everything else using an exact negative binomial test.

The result are a set of marker genes for that cluster.

We can view the results as a Feature Table or a Heatmap.

In this case we can see markers like CD3D, IL32, CD3E up-regulated in Clusters 1, 6 and 12, suggesting they are some kind of Immune Cells.

Markers

overview
overview

Pairwise Comparisons

Once we have some understanding of our data we may want to start do some specific comparison between clusters i.e. Cluster 1 vs Cluster 6, or use some of the group information i.e. AMLstatus.

Let’s look at Cluster 1 vs Cluster 6. To do this we select just those clusters.

Pairwise Comparisons

Once we have some understanding of our data we may want to start do some specific comparison between clusters i.e. Cluster 1 vs Cluster 6, or use some of the group information i.e. AMLstatus.

Let’s look at Cluster 1 vs Cluster 6. To do this we select just those clusters and click Run Differential then Between Selected Clusters.

Pairwise Comparisons

Again we can review results as a Feature Table or a Heatmap. This has some interesting results. There are several RPS and RPL genes. These are associated with the ribosome and could be a sign of QC issues.

We want to keep an eye out for problem gene families like mtRNA and rRNA as they could be symptomatic of technical issues with the data. We will come back to this.

Pairwise Comparisons

overview
overview

Pseudobulk

The last method is only applicable if you have replicates. Pseudobulk leverages your replicates and aggregates the information across cells to allow a more traditional bulk RNAseq statistical determination.

A pseudobulk approach can help deal with noise and variability scRNAseq and negate some of the overpowering you can have with the high n of cells.

Though not the best experimental design, we can run pseudobulk on this dataset as we have 2 control samples (even though we only have 1 AML).

Let’s start by clicking Run Differential then Across Multiple Samples.

Pseudobulk

A separate window will pop up. We will then provide our differential design.

  1. Select our group. This will be the subset of cells that you want to focus your differential on i.e. Cluster 1. We can run it on all our clusters at once.

  2. Experimental comparison. What do you want to compare? In this case we will look at AML status. We can then select which will be A or B in our comparison. We will pick patient as A; traditionally perturbations are used as the numerator for fold change comparisons.

  3. Select samples. We will just include all our samples.

Pseudobulk

overview
overview

Pseudobulk

When we review the heatmap we can see Cluster 14 has a lot of changes (typically the loss of expression). We can also see a few genes very up-regulated. We can see in Cluster 2 HBG1 and HBG2 which are known to be involved in AML.

overview
overview

Features


Features

Once we have reviewed our clusters we may next want to check the distribution of expression of specific genes or features. This can useful for several reasons:

  • Check QC markers - i.e. mtRNA or rRNAs
  • Check cluster markers
  • Check differential results
  • Check genes of interest

All we have to do is click on the Feature panel and type in our gene of interest.

QC Features

Here we check the ribosomal small subunit RPS4Y1. There is clear asymmetric distribution of expression. This would be a red flag, but AML is known to have abnormal expression of ribosomal subunits.

overview
overview

QC Features

We can also check which clusters are gene is expressed in by going down to the differential expression panel and clicking on Expression Distribution. This shows us violin plots, broken down into whatever groups we have selected in the Cluster panel.

overview
overview

Cluster Features

We can check our cluster specific features. Earlier we saw CD3D as a top cluster marker.

overview
overview

Differential Features

Likewise we can follow up with our differential results. Lets look at HBG2.

overview
overview

Custom Features

NPM1 is a well known gene which is often mutant in AML.

overview
overview

Customise Feature Plot

We can customize the feature plot a lot using the tool panel in the top right.

We can alter the scale/transformation of the counts, color scheme and add filters. We can also export cells that express our gene into custom groups (more on that soon).

overview
overview

Feature Lists

The genes we have looked at have been automatically added to a gene list. We can generate several of these list, rename them or modify them easily in Loupe Browser. Alternatively we can export/import them.

overview
overview

Multiple Feature Lists

If you have multiple gene lists you can start to parse out complex interactions by making a co-expression plot. This shows the distribution of multiple gene lists across your tSNE.

Feature Plot

We can also cross-reference individual genes in a similar way by looking at the Feature Plot. This is hidden in the Projection Type selection drop down.

We can use this to find sub populations of cells dependent on multiple markers for example: CD8+ T-cells. These will have CD3D and CD8 expression. We can also subset the cells just to clusters 1, 6 and 12 to help focus on these cells.

Custom Groups


Custom Groups

So far we have shown you how to make custom groups by importing gene lists into Loupe Browser. We can also create custom groups while in Loupe Browser.

We have seen allusions to this while looking at filtering of Features based on expression.

Let’s filter CD3D based on a minimum expression value of 1.5.

Save Barcodes

After we have filtered our cells we can then save these barcodes. To save them we need to: - Name the selected cluster: What is the name of these cells? i.e. Immune - Add to new Group: What is the type of labeling we are doing? i.e. Annotation

Save Barcodes

We can now just go to the Clusters section and we will see our new Group. We can now select this. Just like we would any other cluster.

Lets try to add more cell types of interest to our Annotation group. Lets start by looking for fibroblasts using the VIM marker gene.

Save Barcodes

Several clusters seem to correspond to fibroblasts. We could use our threshold filtering to select these cells or we could instead use point and click selection using the Freehand Selection tool. There is also a Draw Selection depending on your preference. Lets start by grabbing one of these clusters.

overview
overview

Save Barcodes

We can add a selection to an existing group, so lets use Freehand Selection again to highlight the other major fibroblast group.

Custom Groups

When we go back to the Cluster tab and check our new groups we can see the two approaches have slightly different results.

As scRNAseq has dropouts the labeling of the Immune cells is a lot patchier, that said it maybe is more of an accurate representation of the original data.

overview
overview

Advanced Selection

As you may have noticed there is also a dedicated pane to filtering and selecting cells based on various criteria: expression level, cluster identity etc.

To create new selection we create Rules.

Our Immune Cell Cd3d+ group is specifically T Cells. We could subset this using rules to find CD8+ T-cells. First we will check CD8B expression levels in the feature tab. It seems a 0.5 filter on the log2 scale is sufficient.

Advanced Selection

We can then set up a rule to filter our cells based on the fact that they are in our Immune group AND also they are CD8B+ > 0.5 log2(counts). We can even export this new selection to a group by saving barcodes.

overview
overview

Reanalyze


Reanalyze

A lot of the time there are issues with the QC that need to be remedied. Most of the time this will require taking the data into R/Python to use a collection of scRNAseq tools. But Loupe Browser does allow for some optimization with out leaving the GUI.

The Reanalyze tab allows you access to custom filtering of low quality cells, clustering and dimension reduction.

Review Barcodes

The first step is to review the barcodes. This allows you to filter cells that you think are bad or unnecessary, based on whatever criteria.

You can supply the barcodes you want to filter as a .csv or you could filter based on the groups you have made. For example if you knew you were not interested in the T-cell portion of the cells than we can filter them out. Domain-specific knowledge is important for this decision making. Whatever groups are selected on the cluster pane are included, so we can deselect the Clusters 1, 6 and 12.

Review Barcodes

overview
overview

UMI thresholds

We can then customize filtering of cells based on UMI values. Often we would filter top and bottom cells as they represent doublets or empty droplets respectively.

overview
overview

Gene thresholds

Similarly we can then customize filtering of cells based on number of Genes expressed. Often we would filter top and bottom cells as they represent doublets or empty droplets respectively.

overview
overview

Mitochondrial UMI thresholds

We will then set thresholds for the amount of mitochondrial signal. mtRNA is one of the first signs of ambient RNA contamination, and cells with high mtRNA often will confound results.

This is a human data set, so we can select the Human genome. There is also mouse as an option. If you work with a different organism you can supply the mitochondrial genes in a .csv format.

Mitochondrial UMI thresholds

We often start with a 10% cut-off (1% for snRNAseq), but if you are seeing a lot of ambient RNA you may have to be flexible around this.

overview
overview

Reanalyze

Lastly, we can decide on some of the outputs. This includes what dimension reduction to perform. So far we have just looked at tSNE, but actually UMAPs are the most popular.

You can customize some of the options used in the dimension reduction by adjusting the parameters. This allows you to control PCA, tSNE and UMAP. Generally though it is best to stick to the default unless you know what you’re doing.

Once the analysis is given a name, we can then click Recluster. This may take a few minutes.

Reanalyze

All our prior work is preserved but now we have some additional projections, including a UMAP.

overview
overview

Import/Export


Working with your own files

Thus far we have been working with some demo data. As mentioned we can get Loupe files from our own experiments (or public data) when we run Cell Ranger. This will take FastQ files as an input.

Alternatively, we can use publicly accessible loupe files from the 10X website or other sources. These will have to be downloaded.

Once we have the files we simply open a new Loupe Browser file session and then click Open Loupe File and find the downloaded file.

Working with your own files

Unfortunately, most repositories do not require loupe files to be submitted with data sets i.e. GEO. That said, often an R object like .rds files, are submitted.

There is a package called loupeR from 10X which can convert a Seurat object from R into a loupe file. You can include in the export a variety of Group IDs as well, so any annotation that has been added can also be exported.

This makes it really easy to go from more advanced bioinformatic tools to a much more user friendly Loupe Browser.

loupeR

A quick piece of example code:

library(loupeR)

create_loupe_from_seurat(seu_obj, 
                         output_dir = "loupe", 
                         output_name = "scRNA_annotated_Igfilter")

loupeR

When you import it will look like this, with all the inherited categories.

overview
overview

loupeR

Unfortunately there is not an equivalent utility for Python, but it is relatively straightforward to export from Scanpy, read into R and then export again into Loupe Browser.

Alternatively, you can import all aspects of a data set into Loupe Browser individually. We have already seen how to import Group and Feature lists. You can also import projections. By using the “…” next to Projection Type.

Export and Saving

The analysis you perform in Loupe Browser can easily be saved. When you click File > Save the loupe file you are working with is updated. Making it easy to come back to the work or share it with others.

You can also save any of the gene lists, group information or differential results by clicking on export buttons next to each of those panels.

Lastly, you can export high res SVG files (similar to PDF) of your t-SNE/UMAP. These files can be read into Illustrator/Inkscape.

Summary

Hopefully we have shown you that you can complete a number of task with relative ease in Loupe Browser:

  • Check QC markers
  • Review clustering
  • Check marker genes
  • Run differentials
  • Subset the data

Limitations

  • There is less controls of parameters i.e. clustering
  • There are not the same tools available i.e. doublets, ambient correction, annotation
  • Reproducibility is harder to maintain as it is a GUI
overview
overview

Time for an exercise!

Exercises found here

Answers to exercise

Answers can be found here