Welcome to TeaProt!

The online proteomics/transcriptomics analysis pipeline featuring novel underrepresented PTM genesets.

Introduction

TeaProt is an online Shiny tool that integrates upstream transcription factor enrichment analysis with downstream pathway analysis through an easy-to-use interactive interface. TeaProt maps the user’s omics data to online databases to provide a collection of annotations on drug-gene interactions, subcellular localizations, phenotypic functions, gene-disease associations and enzyme-gene interactions, useful for further analyses. Users can combine TeaProt and urPTMdb into an online proteomics/transcriptomics analysis pipeline featuring novel underrepresented genesets, allowing the discovery of downstream cellular processes, upstream transcriptional regulation and classes of PTMs potentially regulated by a user’s intervention.



Tutorial

1. Uploading your data

  • Convert the file to the right format
    • accepted formats include '.csv', '.txt', '.xls', '.xlsx'
  • Make sure your file contains the following types of columns:
    • Identifiers (gene names/ UniProt ID/ ENSEMBL ID)
    • P-values
    • Fold change values (log2)
  • Click “Download demo data” to see an example of the expected format
  • Once the above is checked, press “Browse” to upload your data
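
The column requirements above can be checked before uploading. The following is a minimal sketch, not part of TeaProt: it assumes a comma-delimited file, and the function name `check_columns` and the column names `gene`, `pvalue` and `log2FC` are hypothetical placeholders for whatever your file uses.

```python
import csv
import io

def check_columns(text, id_col, p_col, fc_col, delimiter=","):
    """Return a list of problems found in a delimited table.

    An empty list means the identifier, p-value and fold change
    columns are present and every row parses.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in (id_col, p_col, fc_col)
                if c not in header]
    if problems:
        return problems
    # The header is line 1, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        if not (row[id_col] or "").strip():
            problems.append(f"line {line_no}: empty identifier")
        try:
            p = float(row[p_col])
            if not 0.0 <= p <= 1.0:
                problems.append(f"line {line_no}: p-value outside [0, 1]")
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: p-value is not numeric")
        try:
            float(row[fc_col])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: fold change is not numeric")
    return problems

# Hypothetical two-row example; the column names are assumptions.
demo = "gene,pvalue,log2FC\nAkt1,0.001,1.8\nMtor,0.20,-0.3\n"
print(check_columns(demo, "gene", "pvalue", "log2FC"))
```

For a tab-delimited '.txt' file, pass `delimiter="\t"`. Excel files would need to be exported to text first, since this sketch only uses the standard library.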

2. Preparing for analysis

  • Select the identifier column from the drop-down box
  • Select the p-value column from the drop-down box
  • Select the fold change column from the drop-down box
  • Choose the species from which your data is sourced
  • Choose a p-value cut off as a determinant for significance
  • Choose a (log2) fold change cutoff as a determinant for significance
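
To illustrate how the two cutoffs can act together, here is a small sketch. It assumes that an entry counts as significant only when it passes both the p-value cutoff and the absolute log2 fold change cutoff; TeaProt's exact rule is not stated here, and the function name and default values are hypothetical.

```python
def classify(p_value, log2_fc, p_cutoff=0.05, fc_cutoff=1.0):
    """Label one entry as "up", "down" or "NS" (non-significant).

    Assumption: both cutoffs must be passed for significance.
    The default cutoffs are illustrative, not TeaProt's defaults.
    """
    if p_value < p_cutoff and abs(log2_fc) > fc_cutoff:
        return "up" if log2_fc > 0 else "down"
    return "NS"

print(classify(0.001, 1.8))   # passes both cutoffs, positive fold change
print(classify(0.001, 0.3))   # significant p-value, but fold change too small
```

Stricter cutoffs shrink the up and down groups, which affects every downstream analysis that compares those groups against the non-significant background.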

3. Start the analysis

  • Press “Start” to initiate analysis

4. View analysis

  • Press “Analysis” on the sidebar to view the results and annotated datasets


Browser compatibility

OS       Version             Chrome        Firefox  Microsoft Edge  Safari
Linux    Ubuntu 20.04.1 LTS  87.0.4280.88  78.0.1   n/a             n/a
MacOS    10.13.6             87.0.4280.67  83.0     n/a             13.1.2
Windows  10                  87.0.4280.88  83.0     87.0.664.55     n/a



Contact

For technical support, please email support@coffeeprot.com. To contact the Parker lab, please email ben.parker@unimelb.edu.au.



Citation

Molendijk J, Yip R, Parker BL. urPTMdb/TeaProt: Upstream and Downstream Proteomics Analysis. J Proteome Res. 2022 Jun 27. doi: 10.1021/acs.jproteome.2c00048. Epub ahead of print. PMID: 35759515.



Acknowledgements

This research was supported by use of the Nectar Research Cloud and by the University of Melbourne Research Platform Services. The Nectar Research Cloud is a collaborative Australian research platform supported by the National Collaborative Research Infrastructure Strategy. This work was funded by an Australian National Health and Medical Research Council Ideas Grant (APP1184363) and The University of Melbourne Driving Research Momentum program.

Human Protein Atlas subcellular localization data was obtained from http://www.proteinatlas.org and has previously been described in Thul PJ et al., A subcellular map of the human proteome. Science. (2017). Drug-gene interaction data was obtained from DGIdb (https://www.dgidb.org/downloads). Genotype-phenotype associations were downloaded from the International Mouse Phenotyping Consortium (IMPC, www.mousephenotype.org). Enzymatic annotations were retrieved from BRENDA (https://www.brenda-enzymes.org/). Disease-Gene annotations were retrieved from DisGeNet (https://www.disgenet.org/). Transcription factor data was downloaded from CHEA3 (maayanlab.cloud/chea3/). The DNA vector image used in the TeaProt banner on the Welcome page was obtained from Vecteezy (Human dna design Vectors by Vecteezy).



urPTMdb

The underrepresented PTM gene-set database.


urPTMdb is a database of gene-sets covering currently underrepresented post-translational modifications (PTMs). Previously published studies and datasets (PRIDE / MassIVE) are analyzed to identify substrates or interactions relating to PTMs. We have analyzed the results of 58 studies, generating 141 gene-sets covering 18 underrepresented PTMs. Additionally, we generated pathway gene-sets of the primary enzymes involved in the PTMs, as well as consensus gene-sets where replicate studies were available.






Code access

The code to generate urPTMdb is accessible at github.com/JeffreyMolendijk/urPTMdb.



Using urPTMdb

urPTMdb is included as an option in the fgsea tab of TeaProt for analysis of your uploaded dataset. Alternatively, urPTMdb can be downloaded for use in external tools by clicking the download button on the right. urPTMdb is provided in ‘.gmt’ format.
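
For use in external tools, a '.gmt' file can be read with a few lines of code. The sketch below assumes the common GMT layout (one gene-set per line, tab-separated: set name, description, then the member genes). The function name `parse_gmt` and the gene-set names in the example are hypothetical and are not taken from urPTMdb.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into {gene_set_name: set_of_genes}."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        # Drop empty fields left by trailing tabs.
        gene_sets[name] = {g for g in genes if g}
    return gene_sets

# Hypothetical content; real gene-set names come from the downloaded file.
gmt_text = "SET_A\tsource description\tAKT1\tMTOR\nSET_B\tsource description\tTP53\n"
for name, genes in parse_gmt(gmt_text).items():
    print(name, sorted(genes))
```

To parse the downloaded database, read the file contents into a string first and pass that string to `parse_gmt`.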

Download urPTMdb

Number of studies: 58
Number of PTMs: 18
Number of gene-sets: 141
File size: 1,188 KB

urPTMdb is generated by analyzing the genes reported by many studies to create novel PTM-related gene-sets. urPTMdb is provided in three versions: one containing the original identifiers, and two in which genes from other species have been converted to the species of interest. It is recommended to download the database for the species you plan to analyze. In TeaProt, the database used is determined by the species selected at the start of the analysis.


  • urPTMdb Original - Contains the gene identifiers as reported in the original studies
  • urPTMdb Human - All mouse genes have been converted to human homologs using HomoloGene
  • urPTMdb Mouse - All human genes have been converted to mouse homologs using HomoloGene

Browse geneset

About the Table

User-uploaded input data is annotated with information from various sources. The annotated table contains information on:

  1. Drug Interaction
  2. Cell ontology
  3. Associated disease

Export options are available at the bottom of the table.

Annotated table

About the Analysis

These analyses summarize the p-values and fold-changes of your data.

  1. Bar graphs that show the distribution of p-values and fold-changes in the data
  2. Volcano plot that shows the fold-changes and corresponding p-values of each data point
    • (Hover over each data point to view the exact values)
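
As a rough illustration of how a volcano plot is built, each data point can be converted to a pair of coordinates. This sketch assumes the usual convention of plotting the log2 fold change on the x-axis and -log10(p-value) on the y-axis. TeaProt's exact axes are not specified here, and the function name is hypothetical.

```python
import math

def volcano_point(log2_fc, p_value):
    """Return (x, y) for one data point: x = log2 fold change, y = -log10(p)."""
    if p_value <= 0:
        raise ValueError("p-value must be greater than 0 to take a logarithm")
    return (log2_fc, -math.log10(p_value))

x, y = volcano_point(1.5, 0.01)
print(x, round(y, 3))
```

With this convention, smaller p-values sit higher on the plot, so strongly changed and highly significant entries appear in the upper left and upper right corners.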

Distributions

Volcano plot

About the Analysis

Your data is mapped to online databases to provide annotations. For each set of graphs below, your data is mapped to a different database. The first graph of each section displays the number of genes that could be annotated by the mapped database. The second graph displays the annotation

Drug-gene interaction

Subcellular localization

IMPC procedure

DisGeNet disease

BRENDA enzymatic reactions

About the Analysis

These analyses show changes in gene expression in relation to several annotations: (1) subcellular localization, (2) DisGeNet, (3) drug-gene interactions and (4) International Mouse Phenotyping Consortium genotype-phenotype associations. A Pearson’s Chi-squared test based on protein annotations (for example, subcellular localization) indicates whether specific annotations are primarily found in upregulated, downregulated or non-significant (NS) proteins. Only localizations with positive residuals in the upregulated group are shown. The data in the figure is colored by Pearson residuals, and sized by the absolute Pearson residuals.
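
The Pearson residuals mentioned above can be illustrated with a small contingency table. The sketch below uses the standard definition, (observed - expected) / sqrt(expected), where the expected count is row total × column total / grand total. The counts are made up for illustration and the function name is hypothetical; this is not TeaProt's code.

```python
import math

def pearson_residuals(table):
    """Compute Pearson residuals for {annotation: {group: count}}.

    Assumes every row total and column total is greater than zero.
    """
    groups = sorted({g for row in table.values() for g in row})
    row_total = {a: sum(row.get(g, 0) for g in groups) for a, row in table.items()}
    col_total = {g: sum(row.get(g, 0) for row in table.values()) for g in groups}
    grand_total = sum(row_total.values())
    residuals = {}
    for annotation, row in table.items():
        residuals[annotation] = {}
        for g in groups:
            expected = row_total[annotation] * col_total[g] / grand_total
            residuals[annotation][g] = (row.get(g, 0) - expected) / math.sqrt(expected)
    return residuals

# Made-up counts of proteins per localization and regulation group.
counts = {
    "Mitochondria": {"up": 30, "down": 10, "NS": 60},
    "Nucleus": {"up": 10, "down": 30, "NS": 60},
}
res = pearson_residuals(counts)
# Keep only annotations over-represented among upregulated proteins.
shown = [a for a, r in res.items() if r["up"] > 0]
print(shown)
```

A positive residual means the annotation occurs more often in that group than expected if annotation and regulation were independent, and a negative residual means it occurs less often.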

Subcellular Localizations

DisGeNet

Drug-gene interactions

IMPC genotype-phenotype Associations

About the Analysis

This analysis depends on the fold-change values in your data. The graph displays the most enriched biological pathways that are associated with the differential expression. In the input section, choose the geneset collection that you want the analysis to be based on. After running the analysis, the results will be displayed in the following tabs:

  • panel: Image showing the top x positively and negatively enriched pathways
  • table: Table showing all fgsea results
  • volcano: Volcano plot showing the p-value and NES of each tested geneset
  • single: Tab showing fgsea enrichment and coloured volcano plot for a single geneset of interest
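
To give an intuition for the enrichment values reported in these tabs, the sketch below computes the running-sum enrichment score used in preranked gene set enrichment analysis, with genes ranked by fold change. It is a simplified illustration only: it does not reproduce the fgsea package, and it does not compute p-values or the normalized enrichment score (NES). The function name and example genes are hypothetical.

```python
def enrichment_score(ranked, gene_set):
    """Weighted running-sum enrichment score for a preranked list.

    ranked: list of (gene, score) pairs sorted by score, highest first.
    Genes in the set raise the running sum in proportion to |score|;
    genes outside the set lower it by a constant step.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hit_total = sum(abs(s) for g, s in ranked if g in gene_set)
    n_hits = sum(1 for g, _ in ranked if g in gene_set)
    n_miss = len(ranked) - n_hits
    if n_hits == 0 or n_miss == 0 or hit_total == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running = 0.0
    best = 0.0
    for gene, score in ranked:
        if gene in gene_set:
            running += abs(score) / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical genes ranked by log2 fold change.
ranked = [("A", 2.0), ("B", 1.5), ("C", 0.5), ("D", -0.5), ("E", -1.0), ("F", -2.0)]
print(round(enrichment_score(ranked, {"A", "B"}), 3))
print(round(enrichment_score(ranked, {"E", "F"}), 3))
```

A gene set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score, which corresponds to the positively and negatively enriched pathways shown in the panel tab.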

input