What it is
GVP Scout is an Excel spreadsheet that searches peptide data exported from X!Tandem using the Global Proteome Machine (GPM). The spreadsheet uses logical formulas to find single amino acid variants (SAVs) associated with nsSNPs, which are inherited and detected as genetically variant peptides (GVPs). We include a search for all GVPs, but are especially interested in GVPs that have a global minor allele frequency (MAF) > 0.5%. The spreadsheet reports putative GVPs, which must then be confirmed by inspecting the spectra and then validated genetically. Only GVPs found in hair are currently used here.
How to use it
Data must first be analyzed in the GPM Fury software. Default GPM Fury settings are used in the advanced tab, with the following exceptions: exclusion of viruses and prokaryotes, peptide and protein log(e) of -1, fragment mass error of 20 ppm, parent mass error of 100 ppm, and inclusion of point mutations. Note that these settings should be adjusted to suit the instrumentation and instrument settings used. After analysis, export the peptide data and paste columns A-Q into the first tab of GVP Scout as shown...
After the calculations are finished, click on the Variant List tab in Excel to obtain the list of new variants and existing GVPs.
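For a sense of scale on the mass error settings used in the GPM Fury analysis, the short Python sketch below converts a ppm tolerance into an absolute mass window. It is not part of GVP Scout or GPM Fury; the function names and the example masses are hypothetical and chosen only for illustration.

```python
def ppm_window(mass: float, ppm: float) -> float:
    """Return the absolute mass tolerance for a given ppm setting."""
    return mass * ppm / 1e6


def within_tolerance(observed: float, theoretical: float, ppm: float) -> bool:
    """True if the observed mass lies within +/- ppm of the theoretical mass."""
    return abs(observed - theoretical) <= ppm_window(theoretical, ppm)


if __name__ == "__main__":
    # Hypothetical masses, for illustration only.
    print(ppm_window(1000.0, 20))    # window at the 20 ppm fragment setting
    print(ppm_window(1000.0, 100))   # window at the 100 ppm parent setting
    print(within_tolerance(1000.05, 1000.0, 20))
    print(within_tolerance(1000.05, 1000.0, 100))
```

At a mass of 1000, the 20 ppm fragment setting corresponds to a window of about 0.02 and the 100 ppm parent setting to about 0.1, which is why the same measured deviation can pass one setting and fail the other.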
How it works
Excel logical formulas are used to search for modifications associated with nsSNPs for every peptide identified in the dataset. Note that only minor alleles will flag an identification. The gene name, SAP, and RSID are extracted from each putative variant. These candidates are then compared to a list of existing GVPs and a list of putative GVPs found in the hair proteome with MAF > 0.5%. Three lists are then generated: new variants, variants with MAF > 0.5%, and existing GVPs.
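The comparison step can be sketched outside Excel as follows. This is a minimal Python illustration, not the spreadsheet's actual formulas: the `Candidate` type, the function name, the placeholder gene names and RSIDs, and the precedence (existing GVPs are checked before the MAF > 0.5% list) are all assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Candidate(NamedTuple):
    """A putative variant with the three fields extracted by the spreadsheet."""
    gene: str
    sap: str
    rsid: str


def sort_candidates(
    candidates: Iterable[Candidate],
    existing_rsids: Set[str],
    common_rsids: Set[str],
) -> Tuple[List[Candidate], List[Candidate], List[Candidate]]:
    """Split candidates into (new variants, variants MAF > 0.5%, existing GVPs).

    Assumption: a candidate on the existing-GVP list is reported there even
    if it also appears on the MAF > 0.5% list.
    """
    new, common, existing = [], [], []
    for cand in candidates:
        if cand.rsid in existing_rsids:
            existing.append(cand)
        elif cand.rsid in common_rsids:
            common.append(cand)
        else:
            new.append(cand)
    return new, common, existing


if __name__ == "__main__":
    # Placeholder values, not real genes or RSIDs.
    cands = [
        Candidate("GENE_A", "A1B", "rs_placeholder_1"),
        Candidate("GENE_B", "C2D", "rs_placeholder_2"),
        Candidate("GENE_C", "E3F", "rs_placeholder_3"),
    ]
    new, common, existing = sort_candidates(
        cands,
        existing_rsids={"rs_placeholder_1"},
        common_rsids={"rs_placeholder_2"},
    )
    print(new, common, existing, sep="\n")
```

Matching on RSID alone is a simplification; a stricter comparison could also require the gene name and SAP to agree.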
A word of caution
Putative GVPs are not held to stringent quality standards and therefore must be confirmed by checking the mass spectral data. Ensure that transitions exist around and including the variant in question. Also ensure that the quality of the whole spectrum is adequate.
GVPs in the variant list may have MAF > 0.5%. We have assembled a putative list of GVPs with MAF > 0.5% based on hair proteomes from our own samples. However, proteomes may differ depending on the processing or analysis methods, and may therefore include different proteins that contain GVPs with MAF > 0.5%.
Putative GVPs that have been checked proteomically must meet further standards of confirmation: the tryptic sequence must be unique, the RSID must correspond to a missense mutation, and the mass shift must not be due to a chemical modification. Otherwise, the peptide may not be confirmed as a GVP. Confirmed GVPs must then undergo validation via DNA genotyping.
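The confirmation criteria can be tracked as a simple checklist. The Python sketch below is only a bookkeeping aid, assuming each criterion has already been assessed manually; the class and field names are hypothetical and none of this is performed by GVP Scout itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConfirmationChecks:
    """Manually assessed criteria for one putative GVP (names are illustrative)."""
    spectrum_checked: bool
    tryptic_sequence_unique: bool
    rsid_is_missense: bool
    shift_not_chemical_modification: bool


def failed_checks(checks: ConfirmationChecks) -> List[str]:
    """Return the names of all criteria that were not met, in field order."""
    return [name for name, passed in vars(checks).items() if not passed]


def is_confirmed(checks: ConfirmationChecks) -> bool:
    """A peptide counts as a confirmed GVP only if every criterion is met."""
    return not failed_checks(checks)


if __name__ == "__main__":
    peptide = ConfirmationChecks(
        spectrum_checked=True,
        tryptic_sequence_unique=True,
        rsid_is_missense=False,
        shift_not_chemical_modification=True,
    )
    print(is_confirmed(peptide))
    print(failed_checks(peptide))
```

A confirmed result here still precedes genetic validation; DNA genotyping remains a separate step.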
Dr. Glendon Parker has a patent based on the use of genetically variant peptides for human identification (US 8,877,455 B2, Australian Patent 2011229918, Canadian Patent CA 2794248, and European Patent EP11759843.3). The patent is owned by Parker Proteomics LLC. Protein-Based Identification Technologies LLC (PBIT) has an exclusive license to develop the intellectual property and is co-owned by Utah Valley University and GJP. This ownership of PBIT and associated intellectual property does not alter policies on sharing data and materials. These financial conflicts of interest are administered by the Research Integrity and Compliance Office, Office of Research at the University of California, Davis to ensure compliance with University of California Policy.
Please cite the following manuscripts if you use GVP Scout in published work.
Insert FSIG citation
Craig, R., & Beavis, R. C. (2004). TANDEM: matching proteins with tandem mass spectra. Bioinformatics, 20(9), 1466-1467.
Craig, R., Cortens, J. P., & Beavis, R. C. (2004). Open source system for analyzing, validating, and storing protein identification data. Journal of Proteome Research, 3(6), 1234-1242.