Peptide Match User Guide

PIR Peptide Match lets users quickly retrieve all occurrences of a query peptide in the UniProtKB protein sequences. The matches are tabulated and can be narrowed to a specific set of organisms by browsing the taxonomy tree or taxonomy groups. From the web page, the query peptide can also be searched against popular peptide spectral databases and libraries (such as GPMDB, PeptideAtlas and the NIST Peptide Library) to check whether it is present in any of them. Batch retrieval is available from the web interface, and programmatic access is provided via RESTful web services.

If you want to link your peptide sequence directly to our search engine, please use a URL of the following form:
http://proteininformationresource.org/cgi-bin/peptidematch?peptide=XXXX (e.g. http://proteininformationresource.org/cgi-bin/peptidematch?peptide=AAVEEGIVLGGGCALLR).
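
As a minimal sketch, such a link can also be assembled in code. The class and method names below are hypothetical; only the base URL and the `peptide` parameter come from this guide. Trimming, upper-casing and URL-encoding the peptide are precautions assumed here, not requirements stated by the service.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class PeptideLink {
    // Base URL and parameter name as given in this guide.
    private static final String BASE =
        "http://proteininformationresource.org/cgi-bin/peptidematch";

    // Hypothetical helper: returns the direct search link for one peptide.
    public static String linkFor(String peptide) {
        // Assumption: surrounding whitespace and letter case are not significant.
        String cleaned = peptide.trim().toUpperCase(Locale.ROOT);
        return BASE + "?peptide=" + URLEncoder.encode(cleaned, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(linkFor("AAVEEGIVLGGGCALLR"));
    }
}
```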

Publication
Chuming Chen; Zhiwen Li; Hongzhan Huang; Baris E. Suzek; Cathy H. Wu; UniProt Consortium. A Fast Peptide Match Service for UniProt Knowledgebase. Bioinformatics 2013; doi: 10.1093/bioinformatics/btt484.


Single Peptide Match

Users can find exact matches for a peptide sequence query in the selected database. The search can be performed against:

  • UniProtKB, which is the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. It consists of two sections: a section containing manually-annotated records with information extracted from literature and curator-evaluated computational analysis (UniProtKB/Swiss-Prot), and a section with computationally analyzed records that await full manual annotation (UniProtKB/TrEMBL).
  • A subset of UniProtKB entries belonging to a single organism that has a complete proteome.
  • A subset of UniProtKB entries belonging to a set of organisms that have reference proteomes.

User Inputs

The main user interface for single peptide match is shown below. The first section deals with selecting the database to be searched. The second section is where the user enters the peptide sequence. The "?" in the upper-right corner links to the detailed user guide.

By default, the entire UniProtKB is searched. However, if a user is interested only in a certain organism or a set of organisms, only the corresponding subset of UniProtKB entries is searched. For example, a user may be interested in an organism in the UniProt complete proteome set (a complete proteome consists of the set of proteins of an organism whose genome has been completely sequenced). In that case, the second option lets the user specify the organism by entering either the organism name or the corresponding NCBI taxonomy ID, as shown below.

If a user is interested in a larger set of organisms and wants to select multiple organisms to search against, the third option lists the organisms of the UniProt reference proteome set in alphabetical order. A reference proteome is the complete proteome of a well-studied model organism or of an organism relevant to biomedical research. The selected organisms are tracked while the user makes multiple selections, as shown below.

Furthermore, the search results can be limited to UniRef100 representative sequences within UniProtKB to remove redundancy. A Leucine (L) and Isoleucine (I) equivalent search is also implemented to support MS-based proteomics studies.
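
To illustrate what Leucine/Isoleucine equivalence means for an exact match, here is a small local sketch. It is not the service's implementation, which this guide does not describe; the class and method names are hypothetical. Both strings are normalized by replacing I with L before an ordinary substring search, because the two residues have the same mass and are usually indistinguishable by mass spectrometry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LeqiMatch {
    // Replace every I with L so that the two residues compare as equal.
    static String normalize(String seq) {
        return seq.toUpperCase(Locale.ROOT).replace('I', 'L');
    }

    // Return the 1-based start positions of every exact occurrence of the
    // peptide in the protein, optionally treating L and I as equivalent.
    public static List<Integer> findMatches(String protein, String peptide, boolean leqi) {
        String p = leqi ? normalize(protein) : protein;
        String q = leqi ? normalize(peptide) : peptide;
        List<Integer> starts = new ArrayList<>();
        if (q.isEmpty()) {
            return starts;
        }
        int from = 0;
        while (true) {
            int idx = p.indexOf(q, from);
            if (idx < 0) {
                break;
            }
            starts.add(idx + 1);
            from = idx + 1; // step by one so overlapping occurrences are found
        }
        return starts;
    }

    public static void main(String[] args) {
        String protein = "MKAIVEGLLR"; // made-up sequence for illustration
        System.out.println(findMatches(protein, "ALVEG", false)); // prints []
        System.out.println(findMatches(protein, "ALVEG", true));  // prints [3]
    }
}
```

With equivalence switched off, "ALVEG" does not occur in the made-up protein; with it switched on, the stretch "AIVEG" starting at position 3 counts as a match.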

Results Display

Once the search is done, the matched results are displayed in a table as shown below.

The table is organized into 10 sections:

1 - Query Peptide Sequence
Shows the query peptide sequence, the selected sequence data set and the search options chosen by the user.

2 - Search query peptide in proteomics databases
We provide a dynamic search of the query peptide against the following four major proteomic/peptide spectral databases from our web application using the search engines of those individual databases.

  • GPMDB: A database constructed to utilize the information obtained by GPM servers to aid in the difficult process of validating peptide MS/MS spectra as well as protein coverage patterns.
  • NIST Peptide Libraries: A set of libraries of peptide tandem mass spectra that serves as a biochemical reference database for protein and proteome analysis.
  • PeptideAtlas: A multi-organism, publicly accessible compendium of peptides identified in a large set of tandem mass spectrometry proteomics experiments.
  • PRIDE: A centralized, standards compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence.
Follow the links on the result page to view the mass spectra of the query peptide as shown below.

The summary lists all the organisms that have mass spectra of the query peptide. Users can follow the links on the right to go to the corresponding proteomics database and view the actual mass spectra.

3 - Summary of Matched Organisms
Up to 5 of the most matched organisms, plus the remaining organisms grouped as "Others", are shown as a pie chart with the number and percentage of proteins in each category. Clicking the count for an organism shows the matched proteins of that organism. The "Others" link leads to the paginated table of all matched organisms and their matched proteins, as shown in section 6.

4 - Summary of Matched Taxonomy Groups
Up to 5 of the most matched taxonomy groups, plus the remaining taxonomy groups grouped as "Others", are shown as a pie chart with the number and percentage of proteins in each category. Clicking the count for a taxonomy group shows the matched proteins of that taxonomy group. The "Others" link leads to the taxonomy group view, as shown in section 7.

5 - Paginated matched proteins
Shows the pagination information for the matched proteins and lets users navigate between pages.

6 - Browse by organism
Click to view the matched proteins organized by organisms as shown below.

7 - Browse by taxonomic group
Click to view the taxonomy groups of matched proteins as shown below.

At the top is information about the query peptide sequence entered by the user, the matched organisms and the total number of protein sequences matched. The matched results are then divided into 4 tables: Archaea, Bacteria, Eukaryota and Others. In each table, the first column lists the taxonomy groups with links to NCBI taxonomy. The second column gives the number of matched protein entries, with a detailed view of those matched proteins. The third column gives the percentage of matched proteins in the corresponding taxonomy group.

8 - Browse by taxonomic tree
Click to view the taxonomy tree of matched proteins as shown below. Each tree node includes the taxonomy name with a link to NCBI taxonomy, the taxonomic rank if one exists, and the number of proteins whose lineage contains this taxon. The number links to a tab view of those proteins.

9 - Save current table view and do analysis: BLAST, FASTA, Pattern Match, Multiple Alignment and Domain Display
The current view of the result table can be saved to the user's local computer. The results are saved for the selected entries or, if no proteins are selected, for all entries. Clicking "Table" saves the displayed columns as a Tab-delimited text file, which can be imported into a spreadsheet for easier viewing or analysis. Clicking "FASTA" saves the IDs and sequences in FASTA format.

Users can select matched protein entries for further analysis with the sequence analysis programs available on the Results page. First, select the protein(s) using the checkboxes on the left side of the table, then click the corresponding analysis tool.

  • Click the "BLAST" or "FASTA" button, and a new query page will be displayed, along with the parameters that were selected in the initial search.
  • Click "Pattern Match" to search against the PROSITE database.
  • For multiple alignment, check at least 2 proteins (but no more than 70), then click the "Multiple Alignment" button. This opens the multiple alignment form, from which you can select one of the alignment programs: ClustalW, T-Coffee or Muscle. The result page displays the alignment and an alignment viewer. For ClustalW and T-Coffee, the neighbor-joining tree and alignment can be viewed, edited and saved using either the PIR-TAV viewer or Jalview. For Muscle, only Jalview is available.
  • The "Domain Display" option shows Pfam domains (if present) in graphical format.

10 - Results Table
The matched proteins are displayed in a table with the following columns:

Protein AC/ID
The Protein AC/ID refers to the UniProtKB identifiers. Below these identifiers, users may choose either the iProClass or the UniProtKB view of each protein entry. The source of the UniProtKB sequence is shown as UniProtKB/Swiss-Prot or UniProtKB/TrEMBL, depending on whether the protein sequence is from Swiss-Prot or TrEMBL.

If the user specified that the search should be limited to UniRef100 representative sequences, the "UniRef100 Cluster ID" and the corresponding "Representative Protein AC" are also displayed.

Protein Name
The common name given to a protein, which identifies its function or specifies its features.

Length
Number of amino acid residues in the matched protein.

Organism Name
The genus and species of the source organism from which the sequence originates. A link to NCBI taxonomy information is provided.

Protein Links to Proteomics Databases
If a protein can be found in the NIST peptide library, PRIDE or PeptideAtlas, this column shows the corresponding links to these external proteomics databases.

Immune Epitope Database
Links to Immune Epitope Database and Analysis Resource.

Match Range
This column shows where the query peptide matches within the sequence; the query peptide is displayed in red.


Batch Peptide Match

Users can find exact matches for a set of peptide sequence queries in the selected database. The search can be performed against:

  • UniProtKB, which is the central hub for the collection of functional information on proteins, with accurate, consistent, and rich annotation. It consists of two sections: a section containing manually-annotated records with information extracted from literature and curator-evaluated computational analysis (UniProtKB/Swiss-Prot), and a section with computationally analyzed records that await full manual annotation (UniProtKB/TrEMBL).
  • A subset of UniProtKB entries belonging to a single organism that has a complete proteome.
  • A subset of UniProtKB entries belonging to a set of organisms that have reference proteomes.
The batch peptide match results can be downloaded for further analysis.

User Inputs

The main user interface for batch peptide match is shown below. It is very similar to that of the single peptide match: the "Sequence data set selection" section is exactly the same as in Single Peptide Match, shown above. The only difference is in the "Input query peptides" section. Users can enter multiple peptide sequences, one per line, or put them in a text file, one peptide per line, and upload the file. As in Single Peptide Match, the search results can be limited to UniRef100 representative sequences within UniProtKB to remove redundancy. A Leucine (L) and Isoleucine (I) equivalent search is also implemented to support MS-based proteomics studies.
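
A minimal sketch of preparing such an input, one peptide per line, follows. The class and method names are hypothetical, and the cleanup rules (trimming, upper-casing, skipping blank entries, dropping duplicates) are assumptions made for illustration rather than requirements stated by the service.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class BatchInput {
    // Hypothetical helper: turn raw peptide strings into batch input text,
    // one peptide per line.
    public static String toBatchText(List<String> rawPeptides) {
        Set<String> unique = new LinkedHashSet<>(); // keeps first-seen order
        for (String raw : rawPeptides) {
            String peptide = raw.trim().toUpperCase(Locale.ROOT);
            if (!peptide.isEmpty()) {
                unique.add(peptide);
            }
        }
        return String.join("\n", unique);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
            "AAVEEGIVLGGGCALLR", " aaveegivlgggcallr ", "", "GIVLGGGCALLR");
        // The resulting text can be pasted into the input box or saved to a file.
        System.out.println(toBatchText(raw));
    }
}
```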

Search Progress Report

Depending on the user inputs, a batch peptide match may take a while to complete. We therefore provide a progress report, shown below, which displays the peptide sequence currently being searched, as well as the counts of matched proteins and organisms once it has finished.

Download Batch Peptide Match Reports

Once the whole batch job has finished successfully, the job ID becomes a link. Clicking it lets the user select the types of match reports to download for further analysis.

The example match reports can be viewed from the links below:


RESTful Web Services API

In addition to the web interface, the Peptide Match application can be accessed via RESTful web services. REST (REpresentational State Transfer) is a style of building web services that makes it easy to interact with a web site programmatically. A programmatic interface, also called an Application Programming Interface (API), allows users to write scripts or programs that perform peptide matches rather than relying on a browser, which makes it easier to integrate the search with downstream analysis. The services provided are listed below.

In general, the user needs to provide the following parameters: the query peptide sequence(s), the database range (either the whole UniProtKB or a subset of organisms) and the format of the results. Currently, the tab, xls, fasta and xml formats are supported, as described below:

  • tab returns the matched protein(s) in a Tab-delimited format with the following information:
    • Protein accession number
    • Protein ID
    • Protein name
    • Protein sequence length
    • Organism name
    • Matched range(s) (there may be several if the query peptide matches multiple locations in a protein sequence)
    • Proteomic databases
    • IEDB
  • xls returns the matched protein(s) with the same information as in tab format, for easy importing into Excel.
  • fasta returns the matched protein(s) in FASTA format. The definition line of each protein record in the fasta file contains the following information, separated by "^|^":
    • Protein accession number
    • Protein ID
    • Protein name
    • PIRSF ID
    • PIRSF name
    • Organism name
    • Organism taxonomy ID
    • Organism taxonomy group name
    • Organism taxonomy group ID
    • Matching peptides with their corresponding matched ranges
  • xml returns the full information of matched protein(s) in XML format: The XML schema can be found here.
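
As a sketch of how the "^|^"-separated definition line of the fasta format might be taken apart, the following splits a line into its fields in the order listed above. The class name, the short field labels and the sample line are hypothetical placeholders (no real service output is reproduced here), and the leading ">" is assumed to be present as in ordinary FASTA.

```java
import java.util.regex.Pattern;

public class DefLineParser {
    // Field order as listed in this guide; the short labels are made up here.
    static final String[] FIELD_NAMES = {
        "accession", "id", "name", "pirsfId", "pirsfName", "organism",
        "taxonId", "taxonGroupName", "taxonGroupId", "matches"
    };

    // Split a definition line on the literal "^|^" separator. Pattern.quote is
    // needed because '^' and '|' are regular-expression metacharacters; the
    // limit of -1 keeps empty trailing fields.
    public static String[] splitDefLine(String defLine) {
        String line = defLine.startsWith(">") ? defLine.substring(1) : defLine;
        return line.split(Pattern.quote("^|^"), -1);
    }

    public static void main(String[] args) {
        // Placeholder line for illustration only, not actual service output.
        String sample = ">ACC1^|^ID1^|^Name one^|^^|^^|^Homo sapiens^|^9606^|^Group^|^1^|^PEPTIDE";
        String[] fields = splitDefLine(sample);
        int n = Math.min(fields.length, FIELD_NAMES.length);
        for (int i = 0; i < n; i++) {
            System.out.println(FIELD_NAMES[i] + " = " + fields[i]);
        }
    }
}
```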

Single Peptide Match

  • Example 1: get matches of a peptide sequence in UniProtKB

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    curl -F query=AAVEEGIVLGGGCALLR http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    curl -F query=AAVEEGIVLGGGCALLR -F format=tab http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=xls http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl.xls

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=fasta http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl.fasta

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=xml http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl.xml

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    curl -F query=AAVEEGIVLGGGCALLR -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    curl -F query=AAVEEGIVLGGGCALLR -F format=tab -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=xls -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=fasta -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    curl -F query=AAVEEGIVLGGGCALLR -F format=xml -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_1_curl_uniref100_leqi.xml

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR" -O example_1_wget.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=tab" -O example_1_wget.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xls" -O example_1_wget.xls

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=fasta" -O example_1_wget.fasta

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xml" -O example_1_wget.xml

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&uniref100=y&leqi=y" -O example_1_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=tab&uniref100=y&leqi=y" -O example_1_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xls&uniref100=y&leqi=y" -O example_1_wget_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=fasta&uniref100=y&leqi=y" -O example_1_wget_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xml&uniref100=y&leqi=y" -O example_1_wget_uniref100_leqi.xml

  • Example 2: get matches of a peptide in a set of organisms from UniProtKB

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=tab http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=xls http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl.xls

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=fasta http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl.fasta

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=xml http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl.xml

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=tab -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=xls -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=fasta -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    curl -F query=AAVEEGIVLGGGCALLR -F organism=9606,10090 -F format=xml -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_2_curl_uniref100_leqi.xml

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&organism=9606,10090" -O example_2_wget.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=tab&organism=9606,10090" -O example_2_wget.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xls&organism=9606,10090" -O example_2_wget.xls

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=fasta&organism=9606,10090" -O example_2_wget.fasta

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xml&organism=9606,10090" -O example_2_wget.xml

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (the default when no format is given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&organism=9606,10090&uniref100=y&leqi=y" -O example_2_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in Tab-delimited format (requested explicitly):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=tab&organism=9606,10090&uniref100=y&leqi=y" -O example_2_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xls&organism=9606,10090&uniref100=y&leqi=y" -O example_2_wget_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=fasta&organism=9606,10090&uniref100=y&leqi=y" -O example_2_wget_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR&format=xml&organism=9606,10090&uniref100=y&leqi=y" -O example_2_wget_uniref100_leqi.xml

  • Example 3: programmatically getting matches of a peptide in a set of organisms from UniProtKB

    Perl

    use strict;
    use warnings;
    use LWP::UserAgent;

    # REST endpoint of the Peptide Match service.
    my $base = 'http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest';

    # Request parameters: output format, query peptide and NCBI taxonomy IDs.
    my $params = {
      format => 'tab',
      query => 'AAVEEGIVLGGGCALLR',
      organism => '9606,10090'
    };

    my $agent = LWP::UserAgent->new(agent => "libwww-perl");
    # Follow redirects for POST requests as well as GET.
    push @{$agent->requests_redirectable}, 'POST';

    my $response = $agent->post($base, $params);

    # While the server sends a Retry-After header, wait and poll again.
    while (my $wait = $response->header('Retry-After')) {
      print STDERR "Waiting ($wait)...\n";
      sleep $wait;
      $response = $agent->get($response->base);
    }

    # Print the result on success; otherwise report the HTTP status and stop.
    $response->is_success ?
      print $response->content :
      die 'Failed, got ' . $response->status_line .
        ' for ' . $response->request->uri . "\n";

    Java
    
    import java.io.InputStream;
    import java.io.UnsupportedEncodingException;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;
    import java.util.logging.Logger;
    
    public class Example_3
    {
      private static final String PEPTIDEMATCH_SERVER = "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest";
      private static final Logger LOG = Logger.getAnonymousLogger();
    
      private static void run(ParameterNameValue[] params)
        throws Exception
      {
        StringBuilder locationBuilder = new StringBuilder(PEPTIDEMATCH_SERVER + "?");
        for (int i = 0; i < params.length; i++)
        {
          if (i > 0)
            locationBuilder.append('&');
          locationBuilder.append(params[i].name).append('=').append(params[i].value);
        }
        String location = locationBuilder.toString();
        URL url = new URL(location);
        LOG.info("Submitting...");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        HttpURLConnection.setFollowRedirects(true);
        conn.setDoInput(true);
        conn.connect();
    
        int status = conn.getResponseCode();
        while (true)
        {
          int wait = 0;
          String header = conn.getHeaderField("Retry-After");
          if (header != null)
            wait = Integer.valueOf(header);
          if (wait == 0)
            break;
          LOG.info("Waiting (" + wait + ")...");
          conn.disconnect();
          Thread.sleep(wait * 1000);
          conn = (HttpURLConnection) new URL(location).openConnection();
          conn.setDoInput(true);
          conn.connect();
          status = conn.getResponseCode();
        }
        if (status == HttpURLConnection.HTTP_OK)
        {
          LOG.info("Got an OK reply");
          InputStream reader = conn.getInputStream();
          URLConnection.guessContentTypeFromStream(reader);
          StringBuilder builder = new StringBuilder();
          int a = 0;
          while ((a = reader.read()) != -1)
          {
            builder.append((char) a);
          }
          System.out.println(builder.toString());
        }
        else
          LOG.severe("Failed, got " + conn.getResponseMessage() + " for "
            + location);
        conn.disconnect();
      }
    
      public static void main(String[] args)
        throws Exception
      {
        run(new ParameterNameValue[] {
          new ParameterNameValue("format", "tab"),
          new ParameterNameValue("query", "AAVEEGIVLGGGCALLR"),
          new ParameterNameValue("organism", "9606,10090"),
        });
      }
    
      private static class ParameterNameValue
      {
        private final String name;
        private final String value;
    
        public ParameterNameValue(String name, String value)
          throws UnsupportedEncodingException
        {
          this.name = URLEncoder.encode(name, "UTF-8");
          this.value = URLEncoder.encode(value, "UTF-8");
        }
      }
    }
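
    Both programs above keep polling while the response carries a Retry-After header. The Java version passes the header value straight to Integer.valueOf, which throws if the value is not a plain integer (HTTP also allows a date in this header). A minimal sketch of a more defensive parse follows. RetryAfterSketch and delaySeconds are hypothetical names, and treating an unparseable value as "no wait" is an assumption, not documented service behaviour.

```java
// Hypothetical helper: converts a Retry-After header value into a wait time in seconds.
class RetryAfterSketch {
  // Returns 0 when the header is absent, negative, or not a plain integer
  // (assumption: in those cases the caller stops polling).
  static int delaySeconds(String header) {
    if (header == null) return 0;
    try {
      int seconds = Integer.parseInt(header.trim());
      return Math.max(seconds, 0);
    } catch (NumberFormatException e) {
      return 0;
    }
  }

  public static void main(String[] args) {
    System.out.println(delaySeconds("5"));
    System.out.println(delaySeconds(null));
  }
}
```

    In the polling loop, the result could replace the Integer.valueOf(header) call, so the loop ends cleanly when the header is missing or malformed.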
    										

Batch Peptide Match

  • Example 4: get matches of a list of peptides in UniProtKB

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=tab http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=xls http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=fasta http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=xml http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=tab -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=xls -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=fasta -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F format=xml -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_4_curl_uniref100_leqi.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK" -O example_4_wget.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=tab" -O example_4_wget.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xls" -O example_4_wget.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=fasta" -O example_4_wget.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xml" -O example_4_wget.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&uniref100=y&leqi=y" -O example_4_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=tab&uniref100=y&leqi=y" -O example_4_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xls&uniref100=y&leqi=y" -O example_4_wget_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=fasta&uniref100=y&leqi=y" -O example_4_wget_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xml&uniref100=y&leqi=y" -O example_4_wget_uniref100_leqi.xml
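
    In every batch call above, the query parameter is simply the peptides joined by commas. The sketch below shows one way to prepare that value from a list in Java. The class BatchQuerySketch and its method joinPeptides are hypothetical, and trimming whitespace, upper-casing, and dropping blank or repeated entries are assumptions made for tidiness rather than requirements stated by the service.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: turns a list of peptides into the comma-separated "query" value.
class BatchQuerySketch {
  static String joinPeptides(List<String> peptides) {
    // LinkedHashSet keeps first-seen order and drops repeated peptides (an assumption).
    Set<String> cleaned = new LinkedHashSet<String>();
    for (String p : peptides) {
      String s = p.trim().toUpperCase(Locale.ROOT);
      if (!s.isEmpty()) cleaned.add(s);
    }
    StringBuilder sb = new StringBuilder();
    for (String s : cleaned) {
      if (sb.length() > 0) sb.append(',');
      sb.append(s);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(joinPeptides(Arrays.asList("AAVEEGIVLGGGCALLR", " svqyddvpeyk ", "")));
  }
}
```

    The returned string can be passed as the value of query in any of the curl or wget commands of Example 4.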

  • Example 5: get matches of a list of peptides in a set of organisms from UniProtKB

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=tab http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=xls http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=fasta http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=xml http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=tab -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=xls -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=fasta -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK -F organism=9606,10090 -F format=xml -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_5_curl_uniref100_leqi.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&organism=9606,10090" -O example_5_wget.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=tab&organism=9606,10090" -O example_5_wget.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xls&organism=9606,10090" -O example_5_wget.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=fasta&organism=9606,10090" -O example_5_wget.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xml&organism=9606,10090" -O example_5_wget.xml

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&organism=9606,10090&uniref100=y&leqi=y" -O example_5_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=tab&organism=9606,10090&uniref100=y&leqi=y" -O example_5_wget_uniref100_leqi.tab

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xls&organism=9606,10090&uniref100=y&leqi=y" -O example_5_wget_uniref100_leqi.xls

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=fasta&organism=9606,10090&uniref100=y&leqi=y" -O example_5_wget_uniref100_leqi.fasta

    Searching "AAVEEGIVLGGGCALLR" and "SVQYDDVPEYK" in all Homo sapiens (9606) and Mus musculus (10090) sequences plus isoforms in UniProtKB, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    wget "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest?query=AAVEEGIVLGGGCALLR,SVQYDDVPEYK&format=xml&organism=9606,10090&uniref100=y&leqi=y" -O example_5_wget_uniref100_leqi.xml

  • Example 6: get matches of a list of peptides in a set of organisms from UniProtKB using files as inputs

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_file.tab

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=tab http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_file.tab

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=xls http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_file.xls

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=fasta http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_file.fasta

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, not limited to UniRef100 representative sequences, not treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=xml http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_file.xml

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format (no format parameter given):
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_uniref100_leqi_file.tab

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in Tab-delimited format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=tab -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_uniref100_leqi_file.tab

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XLS format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=xls -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_uniref100_leqi_file.xls

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in FASTA format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=fasta -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_uniref100_leqi_file.fasta

    Searching query peptides (one peptide per line, listed in the file "queryFile.txt") within the organisms (taxonomy IDs listed in "organismFile.txt") of UniProtKB sequences plus isoforms, limited to UniRef100 representative sequences, treating Leucine (L) and Isoleucine (I) as equivalent, output is in XML format:
    curl -F queryFile=@queryFile.txt -F organismFile=@organismFile.txt -F format=xml -F uniref100=y -F leqi=y http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest -o example_6_curl_uniref100_leqi_file.xml
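
    Example 6 uploads its peptides and taxonomy IDs from two plain-text files. The guide states that the query file holds one peptide per line; the sketch below assumes the organism file uses the same one-ID-per-line layout, which is consistent with how the Example 7 programs read both files but is not spelled out here. The class InputFilesSketch and its method writeInputs are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes the two input files used by the curl commands of Example 6.
class InputFilesSketch {
  // Writes one entry per line into dir/queryFile.txt and dir/organismFile.txt.
  static void writeInputs(Path dir, List<String> peptides, List<String> taxonIds) {
    try {
      Files.write(dir.resolve("queryFile.txt"), peptides, StandardCharsets.UTF_8);
      // Assumption: one taxonomy ID per line, mirroring the query file layout.
      Files.write(dir.resolve("organismFile.txt"), taxonIds, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Creates both files in the current working directory.
    writeInputs(Paths.get("."),
        Arrays.asList("AAVEEGIVLGGGCALLR", "SVQYDDVPEYK"),
        Arrays.asList("9606", "10090"));
  }
}
```

    Running main in the directory where curl is invoked creates the files that -F queryFile=@queryFile.txt and -F organismFile=@organismFile.txt upload.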

  • Example 7: programmatically getting matches of a list of peptides in a set of organisms from UniProtKB

    Perl
    
    use strict;
    use LWP::UserAgent;
    use Getopt::Long;
    
    my $base = 'http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest';
    
    my ($queryFile, $organismFile, $format);
    
    if(@ARGV < 2 or !GetOptions('queryFile=s' => \$queryFile, 'organismFile=s' => \$organismFile, 'format=s' => \$format)) {
    	usage();
    	exit 1;
    }
    
    sub usage {
    	print "Unknown option: @_\n" if (@_);
    	print "usage: perl example_7.pl [--queryFile QUERYFILE] [--organismFile ORGANISMFILE] [--format FORMAT]\n";	
    }
    
    if(!$format) {
    	$format ='tab';
    }
    
    my $query = "";
    my $organism = "";
    
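    # Read one peptide per line and join them with commas.
    open(QUERY, '<', $queryFile) or die "Can't open $queryFile\n";
    while (my $line = <QUERY>) {
    	chomp $line;               # drop the newline so it is not sent with the peptide
    	next unless length $line;  # skip blank lines
    	$query .= $line . ",";
    }
    close(QUERY);
    $query =~ s/\,$//;
    
    # Read one taxonomy ID per line and join them with commas.
    open(ORGANISM, '<', $organismFile) or die "Can't open $organismFile\n";
    while (my $line = <ORGANISM>) {
    	chomp $line;
    	next unless length $line;
    	$organism .= $line . ",";
    }
    $organism =~ s/\,$//;
    close(ORGANISM);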
    
    my $params = {
      format => $format,
      query => $query,
      organism => $organism 
    };
    
    
    my $agent = LWP::UserAgent->new(agent => "libwww-perl");
    push @{$agent->requests_redirectable}, 'POST';
    
    my $response = $agent->post("$base", $params);
    
    while (my $wait = $response->header('Retry-After')) {
      print STDERR "Waiting ($wait)...\n";
      sleep $wait;
      $response = $agent->get($response->base);
    }
    
    $response->is_success ?
      print $response->content :
      die 'Failed, got ' . $response->status_line .
        ' for ' . $response->request->uri . "\n";
    
    										

    Java
    
    import java.io.*;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLConnection;
    import java.net.URLEncoder;
    import java.util.logging.Logger;
    
    public class Example_7
    {
      private static final String PEPTIDEMATCH_SERVER = "http://research.bioinformatics.udel.edu/peptidematch/webservices/peptidematch_rest";
      private static final Logger LOG = Logger.getAnonymousLogger();
    
      private static void run(ParameterNameValue[] params)
        throws Exception
      {
        StringBuilder locationBuilder = new StringBuilder(PEPTIDEMATCH_SERVER + "?");
        for (int i = 0; i < params.length; i++)
        {
          if (i > 0)
            locationBuilder.append('&');
          locationBuilder.append(params[i].name).append('=').append(params[i].value);
        }
        String location = locationBuilder.toString();
        URL url = new URL(location);
        LOG.info("Submitting...");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        HttpURLConnection.setFollowRedirects(true);
        conn.setDoInput(true);
        conn.connect();
    
        int status = conn.getResponseCode();
        while (true)
        {
          int wait = 0;
          String header = conn.getHeaderField("Retry-After");
          if (header != null)
            wait = Integer.valueOf(header);
          if (wait == 0)
            break;
          LOG.info("Waiting (" + wait + ")...");
          conn.disconnect();
          Thread.sleep(wait * 1000);
          conn = (HttpURLConnection) new URL(location).openConnection();
          conn.setDoInput(true);
          conn.connect();
          status = conn.getResponseCode();
        }
        if (status == HttpURLConnection.HTTP_OK)
        {
          LOG.info("Got an OK reply");
          InputStream reader = conn.getInputStream();
          URLConnection.guessContentTypeFromStream(reader);
          StringBuilder builder = new StringBuilder();
          int a = 0;
          while ((a = reader.read()) != -1)
          {
            builder.append((char) a);
          }
          System.out.println(builder.toString());
        }
        else
          LOG.severe("Failed, got " + conn.getResponseMessage() + " for "
            + location);
        conn.disconnect();
      }
    
      public static void main(String[] args)
        throws Exception
      {
    	int i = 0;
    	String arg = null;
    	String queryFile = "";
    	String organismFile = "";
    	String query = "";
    	String organism = "";
    	String format = "tab";
    	while(args.length > i && args[i].startsWith("-")) {
    		arg = args[i++];
    		if(arg.equals("-queryFile")) {
    			if(args.length > i) {
    				queryFile = args[i++];
    				query = readFileContent(queryFile);
    			} 
    			else {
    				System.err.println("-queryFile requires a filename");
    			}
    		}
    		else if(arg.equals("-organismFile")) {
    			if(args.length > i) {
    				organismFile = args[i++];
    				organism = readFileContent(organismFile);
    			} 
    			else {
    				System.err.println("-organismFile requires a filename");
    			}
    		}
    		else if(arg.equals("-format")) {
    			if(args.length > i) {
    				format = args[i++];
    			} 
    			else {
    				System.err.println("-format requires a value");
    			}
    		}
    	}
    	if(args.length < 2) {
    		System.err.println("Usage: Example_7 -queryFile queryFileName -organismFile organismFileName [-format tab|xls|fasta|xml]");
    		System.exit(1);
    	}
        run(new ParameterNameValue[] {
          new ParameterNameValue("format", format),
          new ParameterNameValue("query", query),
          new ParameterNameValue("organism", organism),
        });
      }
      private static String readFileContent(String fileName) { 
    	String fileContent = "";
    	try {
    		FileReader input = new FileReader(fileName);
    		BufferedReader bufRead = new BufferedReader(input);
    		String line;
    		line = bufRead.readLine();
    		while(line != null) {
    			fileContent += line+",";
    			line = bufRead.readLine();
    		}
    		bufRead.close();
    	}catch(IOException ioe) {
    		ioe.printStackTrace();
    	}
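    	// Drop the trailing comma; the guard avoids an exception when the file is empty or unreadable.
    	if(fileContent.endsWith(",")) {
    		fileContent = fileContent.substring(0, fileContent.length() - 1);
    	}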
    	return fileContent;
      }
    
      private static class ParameterNameValue
      {
        private final String name;
        private final String value;
    
        public ParameterNameValue(String name, String value)
          throws UnsupportedEncodingException
        {
          this.name = URLEncoder.encode(name, "UTF-8");
          this.value = URLEncoder.encode(value, "UTF-8");
        }
      }
    }
    										

© 2016 Protein Information Resource