Summary information of the gene expression study is displayed below.

A table containing information for each of the experimental conditions used in the gene expression study is displayed below. Each experimental condition relates to a column in the gene expression dataset in the 'Gene Expression Dataset' tab.

A table containing the gene expression data is displayed below. Each column relates to an experimental condition, each row relates to a gene, and each value relates to a gene expression value for that gene under that experimental condition. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.
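As a rough illustration of the transformations named above, the sketch below applies a counts per million scaling followed by a log2 transformation to one experimental condition. It is written in Python rather than the R used by the tool. The function names, the offset of 1 added before taking the log, and the example counts are all assumptions made for illustration, and KNN imputation is left out.

```python
import math


def counts_per_million(counts):
    """Scale one condition's raw counts so that they sum to one million."""
    library_size = sum(counts)
    return [c / library_size * 1_000_000 for c in counts]


def log2_transform(values, offset=1.0):
    """Take log2 after adding an offset (an assumption) so zeros stay defined."""
    return [math.log2(v + offset) for v in values]


# Hypothetical raw counts for four genes in a single experimental condition.
raw_counts = [10, 0, 30, 60]
cpm = counts_per_million(raw_counts)
log_cpm = log2_transform(cpm)
```

Scaling by library size is what makes values from conditions with different sequencing depths comparable before the log is taken.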

Generated using R plotly. The plot below displays the distribution of the values of the genes in the dataset. This plot is useful for checking whether the data is normalized before performing differential expression analysis. If the density curves are similar from one experimental condition to the next, this indicates that the data is normalized and cross-comparable. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

Generated using R plotly. The plot below displays the distribution of the values of the genes in the dataset. The quartiles are calculated using the linear method. Viewing the distribution can be useful for determining if the data in the dataset is suitable for differential expression analysis. Generally, median-centred values are indicative that the data is normalized and cross-comparable. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.
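The box-and-whisker summary described above can be sketched as follows, in Python rather than R. Quartiles are computed by linear interpolation between ordered values using the standard library's "inclusive" method. Treating this as equivalent to plotly's linear method is an assumption, since the exact interpolation rule may differ. The condition names and values are hypothetical.

```python
import statistics


def five_number_summary(values):
    """Return the minimum, quartiles and maximum drawn by a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Hypothetical log-transformed expression values for two conditions.
samples = {
    "condition_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "condition_B": [1.5, 2.5, 3.5, 4.5, 5.5],
}
summaries = {name: five_number_summary(v) for name, v in samples.items()}

# A small spread between the medians suggests median-centred data.
medians = [s["median"] for s in summaries.values()]
median_spread = max(medians) - min(medians)
```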

Generated using R prcomp and plotly. Principal component analysis (PCA) reduces the dimensionality of multivariate data to two dimensions that can be visualized graphically with minimal loss of information.

Eigenvalues correspond to the amount of the variation explained by each principal component (PC). The plot displays the eigenvalues against the number of dimensions. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.
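To illustrate how eigenvalues relate to explained variation, the Python sketch below converts a list of eigenvalues into the proportion of variance explained by each PC. The eigenvalues are hypothetical. In R, prcomp reports standard deviations (sdev), whose squares are the eigenvalues.

```python
def proportion_of_variance(eigenvalues):
    """Divide each eigenvalue by the total to get its share of the variance."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def cumulative(values):
    """Running total, used to see how many PCs are needed for a given share."""
    running, out = 0.0, []
    for v in values:
        running += v
        out.append(running)
    return out


# Hypothetical eigenvalues for three principal components.
eigenvalues = [6.0, 3.0, 1.0]
explained = proportion_of_variance(eigenvalues)
cumulative_explained = cumulative(explained)
```

Here PC1 would account for 60% of the variation, and PC1 and PC2 together for roughly 90%.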

Generated using R prcomp and R plotly. Principal component analysis (PCA) reduces the dimensionality of multivariate data to two dimensions that can be visualized graphically with minimal loss of information.

Eigenvalues correspond to the amount of the variation explained by each principal component (PC). The plot displays the coordinates (scores) of each individual (row) in the gene expression dataset on the top two principal components (PC1 and PC2). The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

Generated using R limma and plotly. The plot below is used to check the mean-variance relationship of the expression data, after fitting a linear model. It can help show if there is a lot of variation in the data. Each point represents a gene. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

Generated using R cor and heatmaply. The plot below compares the correlation values of the samples in a heatmap. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.
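The values behind such a heatmap can be sketched as a matrix of pairwise correlations between experimental conditions. The Python example below uses the Pearson coefficient, which is the default of R cor. Whether the tool uses that default is an assumption, and the condition names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every condition (column) with every other condition."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical expression values for four genes under three conditions.
expression = {
    "cond_1": [1.0, 2.0, 3.0, 4.0],
    "cond_2": [2.0, 4.0, 6.0, 8.0],
    "cond_3": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(expression)
```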

Generated using R prcomp and R plotly. Principal component analysis (PCA) reduces the dimensionality of multivariate data to two dimensions that can be visualized graphically with minimal loss of information.

Eigenvalues correspond to the amount of the variation explained by each principal component (PC). The plot displays the coordinates of each variable (column) in the gene expression dataset on the top two principal components (PC1 and PC2). The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

Generated using R prcomp and R plotly. Principal component analysis (PCA) reduces the dimensionality of multivariate data to a small number of dimensions (here three) that can be visualized graphically with minimal loss of information.

Eigenvalues correspond to the amount of the variation explained by each principal component (PC). The plot displays the coordinates of each variable (column) in the gene expression dataset on the top three principal components (PC1, PC2 and PC3). The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

Generated using R umap and plotly. Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique useful for visualizing how genes are related to each other. The number of nearest neighbours used in the calculation is indicated in the graph. The values are displayed post KNN imputation, count per million transformation and log transformation if selected.

A table containing information for each of the experimental conditions used in the gene expression study is displayed below. In the group column, select the experimental conditions you want to include in group 1, group 2 or N/A if you want the experimental condition excluded from differential gene expression analysis. During differential gene expression analysis, group 1 is compared against group 2.

**Select the experimental conditions to include in Group 1.**

**Select the experimental conditions to include in Group 2.**

The parameters for differential gene expression analysis are displayed below. Please select the appropriate parameters and click analyse to perform differential gene expression analysis.

Generated using R limma. The table below displays the top differentially expressed genes between the groups selected.

adj.P.Val is the P-value after adjustment for multiple testing. This column is generally recommended as the primary statistic by which to interpret results. Genes with the smallest P-values will be the most reliable.

P.Value is the raw P-value.

t is the moderated t-statistic.

B is the B-statistic, or log-odds that the gene is differentially expressed.

logFC is the log2 fold change between the two experimental conditions.

F is the moderated F-statistic, which combines the t-statistics for all the pair-wise comparisons into an overall test of significance for that gene.
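To make the adjustment for multiple testing more concrete, the Python sketch below implements the Benjamini-Hochberg procedure, which is the default adjustment method of limma's topTable. Whether the tool uses that method depends on the parameters selected, so treat it as an assumption. The raw P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
```

Each adjusted value is at least as large as its raw P-value, which is why fewer genes pass a given cutoff after adjustment.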

Generated using R limma and plotly. Use this plot to view the distribution of the P-values in the analysis results. The P-value here is the same as in the top differentially expressed genes table and is computed using all selected contrasts. While the displayed table is limited in size, this plot shows the 'big picture' by displaying the P-value distribution for all analyzed genes.
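The counts behind a P-value histogram can be sketched as below, in Python. The number of bins and the example P-values are assumptions for illustration. A tall first bin relative to the others is commonly read as a sign that some genes are differentially expressed.

```python
def p_value_histogram(p_values, n_bins=20):
    """Count P-values falling into equal-width bins across [0, 1]."""
    counts = [0] * n_bins
    for p in p_values:
        # A P-value of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical raw P-values for seven genes.
p_values = [0.001, 0.02, 0.04, 0.35, 0.55, 0.97, 1.0]
hist = p_value_histogram(p_values, n_bins=10)
```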

Generated using R limma (vennDiagram). The diagram displays the number of differentially expressed genes versus the number of non-differentially expressed genes.

Generated using R limma (qqt) and plotly. The plot shows the quantiles of a data sample against the theoretical quantiles of a Student's t distribution. This plot helps to assess the quality of the limma test results. Ideally the points should lie along a straight line, meaning that the values of the moderated t-statistic computed during the test follow their theoretically predicted distribution.

Generated using R limma and plotly. The volcano plot displays statistical significance (-log10 P-value) versus magnitude of change (log2 fold change) and is useful for visualizing differentially expressed genes. Highlighted genes are significantly differentially expressed at the selected adjusted P-value cutoff.
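The coordinates of a volcano plot can be sketched as follows, in Python. The sketch assumes that the y axis uses the raw P-value and that highlighting uses the adjusted P-value, which is one reading of the description above. The gene names, values and cutoff are hypothetical.

```python
import math


def volcano_points(genes, cutoff=0.05):
    """Compute x (log2 fold change), y (-log10 P) and a highlight flag."""
    points = []
    for gene in genes:
        points.append({
            "gene": gene["gene"],
            "x": gene["logFC"],
            # Assumption: the y axis is based on the raw P-value.
            "y": -math.log10(gene["P.Value"]),
            "highlighted": gene["adj.P.Val"] < cutoff,
        })
    return points


# Hypothetical rows from a differential expression results table.
results = [
    {"gene": "GENE_A", "logFC": 2.5, "P.Value": 0.001, "adj.P.Val": 0.01},
    {"gene": "GENE_B", "logFC": -0.2, "P.Value": 0.5, "adj.P.Val": 0.8},
]
points = volcano_points(results, cutoff=0.05)
```

Genes far from zero on the x axis and high on the y axis combine a large change with strong statistical support.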

Generated using R limma and plotly. The mean difference (MD) plot displays log2 fold change versus average log2 expression values and is useful for visualizing differentially expressed genes. Highlighted genes are significantly differentially expressed at the selected adjusted p-value cutoff.
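For a single gene, the two quantities on a mean difference plot can be sketched as below, in Python. The direction of the fold change (group 1 minus group 2) and the simple unweighted means are assumptions, and the values are hypothetical log2 expression values.

```python
def md_point(group_1_values, group_2_values):
    """Return the average log2 expression and log2 fold change for one gene."""
    mean_1 = sum(group_1_values) / len(group_1_values)
    mean_2 = sum(group_2_values) / len(group_2_values)
    all_values = group_1_values + group_2_values
    return {
        "average_expression": sum(all_values) / len(all_values),
        # Assumption: fold change is reported as group 1 minus group 2.
        "log_fc": mean_1 - mean_2,
    }


# Hypothetical log2 expression values for one gene in each group.
point = md_point([8.0, 9.0], [5.0, 6.0])
```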

Generated using R limma and heatmaply. The heatmap displays the expression values of the top differentially expressed genes for each experimental condition. The expression values are displayed post KNN imputation, count per million transformation, log transformation, normalisation and limma precision weights if selected.

Gene enrichment analysis is performed using Enrichr. Information on each of the databases is available from the Enrichr website via the link below. Enrichr

**Select the column containing the gene symbols and input any missing gene symbols.**

Generated using R enrichR. The table below displays the gene sets identified from the genes, including several summary and statistical values.

Generated using R enrichR and plotly. The bar chart displays the gene sets along the y axis and the user-selected column along the x axis. The bars are ordered based on the user-selected column.

Generated using R enrichR and plotly. The volcano plot displays statistical significance (-log10 P value) versus odds ratio and is useful for visualizing the statistically significant gene sets.

Generated using R enrichR and plotly. The Manhattan plot displays the gene sets along the x axis and the user-selected column along the y axis. The points are ordered based on the user-selected column.