Handout for WashU EpiGenome Browser Tutorial
Bold blue text represents browser functions, such as facet table or matplot.
Bold purple text represents browser buttons, Tracks, and Apps.
Bold gray text represents browser menus accessed by right-click on a track, track name, or colormap block, such as Configure or Information.
Topics and definitions are numbered or bulleted. To follow along with the demo, follow the instructions at the ➽ arrows.
Follow the demo as depicted in the screenshots. Places to click are marked with red circles. Circled numbers (e.g. ①) denote the order in which to click. Screenshots are ordered by bracketed letters (e.g. [A] and [B]) as necessary.
To start again at any point, load the saved session status for the section just completed, indicated by the ✰ and green highlighted-text.
This tutorial has 12 demos:
[1-3] Loading the EpiGenome Browser and public data hubs
[4-5] Exploring Metadata
 Genome navigation
[8-9] Track visualization & customization
 Adding new tracks
 Using EpiGenome Browser apps
 Loading human genetic variation tracks
Loading the EpiGenome Browser and public data hubs
- To access the Browser, go to epigenomegateway.wustl.edu/browser. Browse the menu to select the genome you want to use.
- For today, click “human hg19”.
- Then click “Public hubs” to access publicly available data.
- Load the public data hub.
- A data hub is a collection of publicly available data. Several are available, including hubs of data from the Roadmap Epigenomics Project and the ENCODE Consortium. The “Reference human epigenomes from Roadmap Epigenomics Consortium” is a collection of 4 hubs.
- Click the “Reference human epigenomes from Roadmap Epigenomics Consortium” box to access these hubs.
- Then load the “Roadmap Data from GEO” hub by clicking the “Load” button.
- Click the “X” at the top of any floating window or press “Esc” to close it.
- Orientation to the WashU EpiGenome Browser view.
- Use the colormap panel and information box to find metadata terms associated with tracks.
- The information box for each track displays metadata terms associated with that track and information about the experiment that generated that track.
- [A] Right-click a track to access the information box to learn more about the sample. [B] See how the metadata are organized hierarchically.
- The colormap organizes metadata terms for quick display.
- [A] Add metadata terms to the colormap by right-clicking one of the titles, “Assay” or “Sample”, and selecting “+ Add metadata terms”.
- [B] Mark the “Track type” check box to add that category to the metadata colormap.
- [C] Drag the RefSeq genes track above the chromosome ideogram. It now acquires a box in the metadata colormap. Mouse over the box to see that the “Track type” is “hammock”.
- The facet table is an intuitive way to browse large amounts of data by using metadata to stratify datasets.
- The green numbers in each cell represent the number of tracks available for that Assay/Sample. The red number is the number of tracks for that Assay/Sample currently displayed.
- Access the facet table through the Tracks button. Then click the track number icon.
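The two numbers in each facet table cell can be thought of as counts of tracks grouped by their Assay and Sample metadata. The sketch below illustrates that idea only; the track records and the `displayed` flag are made-up examples, not the browser's internal data model.

```python
from collections import Counter

# Hypothetical track metadata, invented for illustration.
tracks = [
    {"assay": "H3K9me3", "sample": "fibroblast", "displayed": True},
    {"assay": "H3K9me3", "sample": "keratinocyte", "displayed": True},
    {"assay": "RNA-seq", "sample": "fibroblast", "displayed": False},
    {"assay": "RNA-seq", "sample": "keratinocyte", "displayed": False},
    {"assay": "RNA-seq", "sample": "keratinocyte", "displayed": False},
]

# Tracks available per Assay/Sample cell.
available = Counter((t["assay"], t["sample"]) for t in tracks)

# Tracks currently displayed per Assay/Sample cell.
displayed = Counter(
    (t["assay"], t["sample"]) for t in tracks if t["displayed"]
)

for cell in sorted(available):
    print(cell, "available:", available[cell], "displayed:", displayed[cell])
```

A cell with no displayed tracks reports 0, because a `Counter` returns 0 for missing keys.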
- Genome navigation in the WashU EpiGenome Browser.
- Use the navigation box to relocate to the EDC locus. Enter the locus coordinates in the “coordinate” search box: chr1:151880250-153605000.
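The coordinate string above follows the common chromosome:start-end pattern. As a quick check of how large the region is, the sketch below splits the string and computes its span. The function name `parse_region` is a hypothetical helper written for this handout, not part of the browser.

```python
import re

def parse_region(text):
    """Split a 'chrN:start-end' string into (chromosome, start, end)."""
    match = re.fullmatch(r"(chr[\w.]+):([\d,]+)-([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"not a chr:start-end region: {text!r}")
    chrom = match.group(1)
    # Allow thousands separators such as 151,880,250.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end

chrom, start, end = parse_region("chr1:151880250-153605000")
span_bp = end - start
print(chrom, start, end, span_bp)  # span_bp is 1724750, about 1.72 Mb
```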
- The sessions app allows you to save a browsing status to revisit later. Each session can have several versions.
- This workshop has a pre-saved session that includes a version for each stage of the tutorial. If you get lost during the workshop, you can catch up by loading the version for the section just completed.
- [A] Click the Apps button to enter the Sessions function from the apps menu.
- [B] Enter the session ID in the “Retrieve” text box and click “Retrieve”. The session ID for this workshop is ddSQFzBWvS.
- [C] Then select the 1) EDC locus version of this session.
Track visualization & customization
- Track visualization and customization are done by right-clicking a track to open the Configure menu.
- [A] Enter the configure menu for H3K9me3 fibroblasts by right-clicking on the H3K9me3 track.
- [B] Click the “positive” button to change the color to blue. To exit the menu, click anywhere outside the configure menu box.
- You can change the y-axis scale, track height, and other rendering features in the Configure menu.
- Use the matplot feature to compare two numerical tracks on the same y-axis scale.
- [A] Use the multiple select function to select both H3K9me3 tracks: hold the Shift key and right-click each H3K9me3 track. The selected tracks are highlighted in yellow.
- [B] Then right-click to open the options menu and choose Apply matplot.
- Using matplot, the two tracks can share the same y-axis. This makes the data easy to compare, in this case for the same histone modification ChIP-seq in two different cell types.
- To catch up, open the saved session and the status: 2) H3K9me3 matplot.
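The reason a shared y-axis makes the two H3K9me3 tracks comparable can be shown with a small sketch: if each track is scaled to its own maximum, peaks of very different height look the same, whereas one common range preserves the difference. The signal values below are invented, and the function is an illustration of the idea rather than the browser's matplot code.

```python
def shared_y_range(*tracks):
    """Return one (min, max) range that covers every track."""
    lows = [min(values) for values in tracks]
    highs = [max(values) for values in tracks]
    return min(lows), max(highs)

# Made-up signal values for two cell types.
fibroblast = [0.0, 2.5, 8.0, 3.0]
keratinocyte = [0.0, 1.0, 20.0, 4.5]

y_min, y_max = shared_y_range(fibroblast, keratinocyte)

# Peak height as a fraction of the axis, per-track scale vs shared scale.
own_scale = [max(t) / max(t) for t in (fibroblast, keratinocyte)]
shared_scale = [max(t) / y_max for t in (fibroblast, keratinocyte)]
print(y_min, y_max, own_scale, shared_scale)
```

On their own scales both peaks fill the full axis height; on the shared scale the fibroblast peak reaches only 40% of the axis.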
Adding new tracks
- Use the dataset search box to find data sets using keywords.
- To test our hypothesis that the EDC genes are cell type-specifically expressed, we will add keratinocyte and fibroblast RNA-seq datasets.
- [A] Click the Tracks button to access the dataset search box. Search for “keratinocyte AND RNA”.
- Note: the search function is case-sensitive and plural-sensitive!
- [B] Then choose the result “RNA-seq of Penis Foreskin Keratinocyte Primary Cells”. It will turn green. Click the green “Add 1 track” button to add the track.
- Repeat the search for “fibroblast AND RNA”. Add the “RNA-seq of Penis Foreskin Fibroblast Primary Cells” track.
- Both tracks appear at the bottom of the browser view.
- To catch up, open the saved session and the status: 3) RNA-seq tracks.
Using EpiGenome Browser apps
- Use applications found in the browser’s Apps menu to generate quantitative tests of our hypothesis.
- The Gene set app allows the user to submit a list of genes or genomic positions to the browser for analysis.
- To examine our hypothesis that the SPRR genes are differentially expressed between epidermal keratinocytes and dermal fibroblasts, we will use the SPRR gene set list.
- [A] Open the Apps menu and navigate to the Gene & region set app.
- The “SPRR genes” gene set has been pre-loaded in the session version for this workshop.
- [B] Click the “SPRR genes” box and the “edit” button to view the gene set configurations.
- In the genes panel, the double green arrow button can be used to find more gene models for genes in the gene set. This workshop uses all RefSeq gene models.
- Below the genes panel are the gene-region specifications. These tell the browser which gene-associated regions to analyze. We want to analyze the entire gene plus the 5 kb upstream region.
- [C] Configure the gene set region:
- (1) Click the “change >>” button to access the default gene-regions.
- (2) Ensure that the radio button for “5’ and 3’ flanking” is selected.
- (3) Move the green cursor to the “5 kb” mark.
- (4) Then select the “entire gene or interval” radio button.
- [D] To trigger the gene set view, click on the banner box for “SPRR genes”. Then click the app “gene set view”.
- The browser is now displaying the specified gene and gene-associated regions contiguously in the browser.
- To catch up, open the saved session and the status: 4) Gene set view.
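The gene-region setting used in this demo (entire gene plus 5 kb upstream) depends on strand: upstream lies before the start of a plus-strand gene but after the end of a minus-strand gene. The sketch below shows that arithmetic. The function name and the example coordinates are hypothetical, and this is not a description of how the browser computes regions internally.

```python
def region_with_upstream(start, end, strand, upstream_bp=5000):
    """Extend a gene interval by upstream_bp on its 5' side.

    start and end are genomic coordinates with start <= end.
    """
    if strand == "+":
        # 5' end is at 'start'; do not run past the chromosome start.
        return max(0, start - upstream_bp), end
    if strand == "-":
        # 5' end is at 'end' for a minus-strand gene.
        return start, end + upstream_bp
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")

# Made-up gene coordinates for illustration.
print(region_with_upstream(10000, 12000, "+"))
print(region_with_upstream(10000, 12000, "-"))
```

A fuller version would also clip minus-strand extensions at the chromosome end, which requires knowing the chromosome length.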
- To test the hypothesis that the DNase hypersensitivity regions are proximal enhancers, we will use the Scatterplot app to plot DNase vs H3K4me1 signal in the keratinocyte sample.
- [A] Navigate to the Scatterplot app in the Apps menu.
- [B] In the Scatterplot app window:
- (1) Click “Choose a gene set” button and select the “SPRR genes”.
- (2) Click “choose track>>” to select the x-axis track.
- (3) Select “H3K4me1 of Penis Foreskin Keratinocyte Primary Cells”.
- (4) Click “choose track>>” to select the y-axis track.
- (5) Select “DNase hypersensitivity of Penis Foreskin Keratinocyte Primary Cells”.
- (6) Click “SUBMIT” to run the app.
- The app window now displays the scatterplot with the specified x- and y-axes. The result shows a general positive correlation: an increase in H3K4me1 signal corresponds with an increase in DNase hypersensitivity in keratinocytes. Configure the scatterplot using the options below the graph.
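One way to put a number on the positive trend seen in the scatterplot is the Pearson correlation coefficient. The sketch below computes it in plain Python on invented values; the handout does not say that the Scatterplot app reports this statistic, so treat it as a separate, optional check.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

# Made-up per-region signal values, not taken from the workshop data.
h3k4me1 = [1.0, 2.0, 3.5, 5.0, 6.0]
dnase = [0.8, 2.1, 3.0, 5.5, 5.9]
r = pearson_r(h3k4me1, dnase)
print(round(r, 3))
```

Values near 1 indicate a strong positive linear relationship, values near 0 indicate none, and values near -1 indicate a strong negative one.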
- Last, use the Gene plot app to quantify the RNA signal over SPRR genes in both cell types.
- [A] Navigate to the Apps menu and use the app search box to search for the “Gene plot” app. Select “Gene plot” to enter the gene plot app window.
- [B] Configure the gene plot data:
- (1) In the first box of the app window, click the “Choose a gene set” button and select “SPRR genes”.
- (2) In the “Data track” box, click “Select numerical track >>” and (3) choose “RNA-seq of Penis Foreskin Keratinocyte Primary Cells”.
- [C] Configure the gene plot graph:
- (4) In the “Graph type” box, select the “gene parts” plot.
- (5) Click the orange button to “Make gene plot”.
- [D] The gene plot shows the data for the chosen RNA-seq track over the SPRR gene bodies. Data are averaged, binned, and plotted on a normalized “metagene” along the x-axis. The y-axis shows the RNA-seq values.
- [D] To generate the same plot for fibroblast RNA-seq data, click the “Go back” button.
- [E] Then change the numerical track to “RNA-seq of Penis Foreskin Fibroblast Primary Cells” and click the “Make gene plot” button again.
- [F] The gene plot now shows data from the fibroblast RNA-seq track over the same set of genes. The SPRR genes have much lower expression in this cell type.
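The “metagene” idea used by the gene plot can be sketched as follows: each gene's signal is divided into the same number of bins regardless of gene length, each bin is averaged, and the bins are then averaged across genes. The exact binning and normalization used by the Gene plot app are not described in this handout, so the code below is an assumption-based illustration with invented values.

```python
def bin_means(values, n_bins):
    """Average a signal into n_bins roughly equal-width bins."""
    if len(values) < n_bins:
        raise ValueError("signal is shorter than the number of bins")
    means = []
    for i in range(n_bins):
        lo = i * len(values) // n_bins
        hi = (i + 1) * len(values) // n_bins
        chunk = values[lo:hi]
        means.append(sum(chunk) / len(chunk))
    return means

def metagene(profiles, n_bins):
    """Bin every gene's signal, then average each bin across genes."""
    binned = [bin_means(profile, n_bins) for profile in profiles]
    return [sum(column) / len(column) for column in zip(*binned)]

# Two made-up genes of different lengths.
gene_a = [1, 1, 3, 3]
gene_b = [2, 4, 6, 8, 10, 12]
print(metagene([gene_a, gene_b], n_bins=2))
```

Because every gene is reduced to the same number of bins, genes of different lengths can be compared along a single normalized x-axis.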
Loading human genetic variation tracks
- View human genetic variation tracks in the EpiGenome Browser.
- The EpiGenome Browser allows visualization of human genetic variation. HapMap and dbSNP tracks are found under the “Variation” category of Annotation tracks.
- [A] Click the “X” to exit the gene set view to return to the linear genome.
- [B] Click the Tracks button and select “Annotation tracks”.
- [C] Scroll to “Population variation,” then choose “dbSNP release 137”.
- [D] The dbSNP track is displayed below the genes track. SNPs are color-coded according to their type of mutation, e.g. deletion or insertion.