Welcome to TeaProt!
The online proteomics/transcriptomics analysis pipeline featuring novel underrepresented PTM genesets.
TeaProt is an online Shiny tool that integrates upstream transcription factor enrichment analysis with downstream pathway analysis through an easy-to-use interactive interface. TeaProt maps the user’s omics data to online databases to provide a collection of annotations on drug-gene interactions, subcellular localizations, phenotypic functions, gene-disease associations and enzyme-gene interactions, useful for further analyses. Users can combine TeaProt and urPTMdb for a novel and easy-to-use online proteomics/transcriptomics analysis pipeline featuring novel underrepresented genesets to allow the discovery of downstream cellular processes, upstream transcriptional regulation and classes of PTMs potentially regulated by a user’s intervention.
1. Uploading your data
- Convert the file to an accepted format
  - Accepted formats include '.csv', '.txt', '.xls', '.xlsx'
- Make sure your file contains the following types of columns:
  - Identifiers (gene names / UniProt ID / ENSEMBL ID)
  - P-values
  - Fold change values (log2)
- Click “Download demo data” to see an example of the expected layout
- Once the above is checked, press “Browse” to upload your data
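Before uploading, it can help to confirm that the file header contains the columns listed above. The following is a minimal sketch, assuming a comma-separated file; the function name `check_columns` and the column names `gene`, `log2FC` and `pvalue` are hypothetical, since TeaProt lets you pick whichever column names your file uses.

```python
import csv
import io


def check_columns(csv_text, required):
    """Return the required column names that are missing from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list if the file has no rows at all
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical example data; replace with the contents of your own file.
example = "gene,log2FC,pvalue\nAkt1,1.25,0.003\nTp53,-0.80,0.04\n"
missing = check_columns(example, ["gene", "log2FC", "pvalue"])
print(missing)  # an empty list means all required columns are present
```

For '.txt' files with tab-separated values, passing `delimiter="\t"` to `csv.reader` would be the corresponding change.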
2. Preparing for analysis
- Select the identifier column from the drop-down box
- Select the p-value column from the drop-down box
- Select the fold change column from the drop-down box
- Choose the species from which your data is sourced
- Choose a p-value cutoff as a determinant for significance
- Choose a (log2) fold change cutoff as a determinant for significance
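The way the two cutoffs combine can be illustrated with a short sketch. It assumes that an entry counts as significant only when it passes both the p-value cutoff and the absolute log2 fold change cutoff; the default values and the use of strict versus inclusive comparisons are assumptions for illustration, not TeaProt's documented behavior.

```python
def classify(p_value, log2_fc, p_cutoff=0.05, fc_cutoff=1.0):
    """Label one entry as "up", "down" or "NS" (non-significant).

    Assumption: significance requires p_value < p_cutoff AND
    |log2_fc| >= fc_cutoff. The defaults are illustrative only.
    """
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        return "up"
    if p_value < p_cutoff and log2_fc <= -fc_cutoff:
        return "down"
    return "NS"


# Hypothetical rows: (identifier, p-value, log2 fold change)
rows = [("Akt1", 0.003, 1.6), ("Tp53", 0.01, -2.1), ("Myc", 0.30, 3.0)]
for gene, p, fc in rows:
    print(gene, classify(p, fc))
```

Stricter cutoffs shrink the "up" and "down" groups, which in turn changes every downstream analysis that compares those groups.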
3. Start the analysis
- Press “Start” to initiate analysis
4. View analysis
- Press “Analysis” on the sidebar to view the results and annotated datasets
|Linux||Ubuntu 20.04.1 LTS||87.0.4280.88||78.0.1||n/a||n/a|
This research was supported by use of the Nectar Research Cloud and by the University of Melbourne Research Platform Services. The Nectar Research Cloud is a collaborative Australian research platform supported by the National Collaborative Research Infrastructure Strategy. This work was funded by an Australian National Health and Medical Research Council Ideas Grant (APP1184363) and The University of Melbourne Driving Research Momentum program.
Human Protein Atlas subcellular localization data was obtained from http://www.proteinatlas.org and has previously been described in Thul PJ et al., A subcellular map of the human proteome. Science. (2017). Drug-gene interaction data was obtained from DGIdb (https://www.dgidb.org/downloads). Genotype-phenotype associations were downloaded from the International Mouse Phenotyping Consortium (IMPC, www.mousephenotype.org). Enzymatic annotations were retrieved from BRENDA (https://www.brenda-enzymes.org/). Disease-Gene annotations were retrieved from DisGeNet (https://www.disgenet.org/). Transcription factor data was downloaded from CHEA3 (maayanlab.cloud/chea3/). The DNA vector image used in the TeaProt banner on the Welcome page was obtained from Vecteezy (Human dna design Vectors by Vecteezy).
The underrepresented PTM gene-set database.
urPTMdb is a database of gene-sets covering currently underrepresented post-translational modifications (PTMs). Previously published studies and datasets (PRIDE / MassIVE) are analyzed to identify substrates or interactions relating to PTMs. We have analyzed the results of 58 studies, generating 141 gene-sets covering 18 underrepresented PTMs. Additionally, we generated pathway gene-sets of the primary enzymes involved in the PTMs, as well as consensus gene-sets where replicate studies were available.
The code to generate urPTMdb is accessible at github.com/JeffreyMolendijk/urPTMdb.
urPTMdb is included as an option in the fgsea tab of TeaProt for analysis of your uploaded dataset. Alternatively, urPTMdb can be downloaded for use in external tools by clicking the download button on the right. urPTMdb is provided in ‘.gmt’ format.
|Number of studies:||58|
|Number of PTMs:||18|
|Number of gene-sets:||141|
urPTMdb is generated by analyzing the genes reported by many studies to create novel PTM-related gene-sets. urPTMdb is provided in three versions: one containing the original identifiers, and two in which genes from other species have been converted to the species of interest. It is recommended to download the database for the species you plan to analyze. In TeaProt, the database used is determined by the species selected at the start of the analysis.
- urPTMdb Original - Contains the gene identifiers as reported in the original studies
- urPTMdb Human - All mouse genes have been converted to human homologs using HomoloGene
- urPTMdb Mouse - All human genes have been converted to mouse homologs using HomoloGene
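For use in external tools, a downloaded ‘.gmt’ file can be read with a few lines of code. The sketch below assumes the standard GMT layout, in which each line holds a gene-set name, a description and then the member genes, all separated by tabs. The function name `parse_gmt` and the example gene-set names are hypothetical and do not come from urPTMdb.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into a dict of {gene_set_name: [genes]}."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines (need name, description, >=1 gene)
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]  # drop empty trailing fields
    return gene_sets


# Hypothetical two-line GMT content for illustration only.
example_gmt = (
    "EXAMPLE_SET_A\tdescription A\tAkt1\tTp53\tMyc\n"
    "EXAMPLE_SET_B\tdescription B\tGapdh\tActb\n"
)
sets = parse_gmt(example_gmt)
print(sorted(sets))
print(sets["EXAMPLE_SET_A"])
```

To read a file from disk, pass the result of `open(path, encoding="utf-8").read()` to `parse_gmt`, where `path` points to the downloaded file.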
About the Table
User-uploaded input data is annotated with information from various sources. The annotated table contains information on:
- Drug Interaction
- Cell ontology
- Associated disease
Export options are available at the bottom of the table
About the Analysis
Analyses are performed to summarize the p-values and fold changes in your data.
- Bar graphs that show the distribution of p-values and fold-changes in the data
- Volcano plot that shows the fold-changes and corresponding p-values of each data point
- (Hover over each data point to view the exact values)
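As a rough illustration of how a volcano plot positions each data point, the sketch below uses the conventional transform: log2 fold change on the x-axis and -log10(p-value) on the y-axis. That TeaProt uses exactly this transform is an assumption, and the function name `volcano_point` is hypothetical.

```python
import math


def volcano_point(log2_fc, p_value):
    """Return (x, y) coordinates for a conventional volcano plot.

    Assumption: y = -log10(p), so smaller p-values sit higher on the plot.
    p_value must be greater than 0, otherwise math.log10 raises ValueError.
    """
    return (log2_fc, -math.log10(p_value))


# Hypothetical data points: (log2 fold change, p-value)
for fc, p in [(1.6, 0.001), (-2.1, 0.01), (0.1, 0.5)]:
    x, y = volcano_point(fc, p)
    print(f"x={x:.2f}, y={y:.2f}")
```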
About the Analysis
Your data is mapped to online databases to provide annotations. For each set of graphs below, your data is mapped to a different database. The first graph of each section displays the number of genes that could be annotated by the mapped database. The second graph displays the annotation
BRENDA enzymatic reactions
About the Analysis
Analyses are performed to show how changes in gene expression relate to several annotations, including (1) subcellular localization, (2) DisGeNet, (3) drug-gene interactions and (4) International Mouse Phenotyping Consortium (IMPC) genotype-phenotype associations. A Pearson’s Chi-squared test based on protein annotations (subcellular localization) indicates whether specific annotations are primarily found in upregulated, downregulated or non-significant (NS) proteins. Only localizations with positive residuals in the upregulated group are shown. The data in the figure is colored by Pearson residuals, and sized by the absolute Pearson residuals.
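To make the Pearson residuals concrete, here is a minimal sketch of how they are computed for a contingency table: each residual is (observed - expected) / sqrt(expected), where expected = row total × column total / grand total, and the Chi-squared statistic is the sum of the squared residuals. The table layout (annotations as rows; up, down and NS as columns) and the counts are made up for illustration, and this is not taken from TeaProt's source code.

```python
import math


def pearson_residuals(table):
    """Return the matrix of Pearson residuals for a table of observed counts.

    Assumes every expected count is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            residual_row.append((observed - expected) / math.sqrt(expected))
        residuals.append(residual_row)
    return residuals


# Hypothetical counts. Rows: two localizations; columns: up, down, NS.
observed = [[30, 10, 60],
            [10, 30, 60]]
res = pearson_residuals(observed)
chi_squared = sum(r * r for row in res for r in row)
print([[round(r, 2) for r in row] for row in res])
print(round(chi_squared, 2))
```

A positive residual in the "up" column means the annotation occurs among upregulated proteins more often than expected under independence, which is the criterion the text above uses to decide which localizations are shown.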
IMPC genotype-phenotype Associations
About the Analysis
This analysis depends on the fold-change values in your data. The graph displays the most enriched biological pathways associated with differential expression. In the input section, choose the geneset collection that you want the analysis to be based on. After running the analysis, the results will be displayed in the following tabs:
- panel: Image showing the top x positively and negatively enriched pathways
- table: Table showing all fgsea results
- volcano: Volcano plot showing the p-value and NES of each tested geneset
- single: Tab showing fgsea enrichment and coloured volcano plot for a single geneset of interest
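For readers unfamiliar with the idea behind gene-set enrichment, the sketch below computes a simplified GSEA-style running-sum enrichment score for one geneset. It is an illustration of the general technique only: fgsea's own algorithm, its p-value estimation and the normalization that produces the NES are not reproduced here, and the function name `enrichment_score` and the example genes are hypothetical.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Simplified weighted running-sum enrichment score.

    ranked: list of (gene, statistic) pairs, e.g. log2 fold changes.
    gene_set: set of gene names.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    ordered = sorted(ranked, key=lambda pair: pair[1], reverse=True)
    hits = [abs(stat) ** weight if gene in gene_set else 0.0
            for gene, stat in ordered]
    hit_total = sum(hits)
    n_miss = sum(1 for gene, _stat in ordered if gene not in gene_set)
    if hit_total == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses
    running = 0.0
    best = 0.0
    for (gene, _stat), hit in zip(ordered, hits):
        if gene in gene_set:
            running += hit / hit_total   # step up, weighted by the statistic
        else:
            running -= 1.0 / n_miss      # step down by a constant amount
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list of (gene, log2 fold change).
ranked = [("A", 3.0), ("B", 2.0), ("C", -1.0), ("D", -2.0)]
print(enrichment_score(ranked, {"A", "B"}))
print(enrichment_score(ranked, {"C", "D"}))
```

A geneset whose members cluster at the top of the ranking gets a positive score, and one whose members cluster at the bottom gets a negative score, which corresponds to the positively and negatively enriched pathways shown in the panel tab.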