New in GWYRE 2.0
GENERAL
- Data on interactions were collected from IntAct and BioGRID databases. Information on proteins was collected from UniProt.
- AlphaFold2-multimer fold-and-dock models are incorporated into the database instead of the template-based docking models.
- Models and experimental structures are also provided for non-human protein pairs.
- The web application has been implemented in Python using the Flask web
application framework. The visualization page has been re-implemented with JavaScript using D3.js for the variant viewer
and JSmol for the structure viewer.
EXPERIMENTAL STRUCTURES
- Expression tag information is extracted from the asymmetric PDB files, and the residues belonging to those expression tags have been removed from the experimental structure file.
- The remaining residues, which we term the trimmed experimental sequence, have been renumbered to match the numbering of the corresponding canonical UniProt sequence.
- Sequences shown on the variant panels use the trimmed experimental sequences.
- Chain IDs in the experimental structures are changed to A (larger chain in the pair as determined by the original UniProt sequence) and B (smaller chain in the pair).
- A mapping of the original residue numbering to the numbering in the original UniProt sequence is included in the downloadable files. The mapping
specifies the removed and missing residues. In the mmCIF file, a summary of the mapping is included in the REMARKS at the top. The accompanying
plain text information file (referenced in the mmCIF file and included in the download) contains more specific information about the renumbering changes.
- Structure files are downloaded in mmCIF format.
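The trimming, renumbering, and chain assignment described above can be sketched as follows. This is a minimal illustration, not the GWYRE implementation: the Residue record, the function names, the "REMOVED" marker, and the tie-breaking rule for equal-length chains are all assumptions made for the example, and the alignment of each residue to a UniProt position is taken as given input.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple, Union


class Residue(NamedTuple):
    """Hypothetical record for one residue of a deposited chain."""
    original_number: int             # numbering in the deposited structure
    amino_acid: str                  # one-letter code
    uniprot_position: Optional[int]  # position in the canonical UniProt sequence (None for tags)
    is_expression_tag: bool


def trim_and_renumber(
    residues: List[Residue],
) -> Tuple[List[Tuple[int, str]], Dict[int, Union[int, str]]]:
    """Drop expression-tag residues and renumber the rest to UniProt positions.

    Returns the trimmed sequence as (uniprot_position, amino_acid) pairs and
    a mapping from original numbering to the new numbering or "REMOVED".
    """
    trimmed: List[Tuple[int, str]] = []
    mapping: Dict[int, Union[int, str]] = {}
    for res in residues:
        if res.is_expression_tag:
            mapping[res.original_number] = "REMOVED"
            continue
        if res.uniprot_position is None:
            # Assumption: every non-tag residue has an aligned UniProt position.
            raise ValueError(f"residue {res.original_number} has no UniProt position")
        trimmed.append((res.uniprot_position, res.amino_acid))
        mapping[res.original_number] = res.uniprot_position
    return trimmed, mapping


def assign_chain_ids(length_first: int, length_second: int) -> Tuple[str, str]:
    """Give chain A to the protein with the longer UniProt sequence.

    Assumption: when both lengths are equal, the first protein keeps chain A.
    """
    if length_first >= length_second:
        return ("A", "B")
    return ("B", "A")


# Made-up example: a two-residue tag followed by residues aligned to positions 25 and 26.
example = [
    Residue(-1, "G", None, True),
    Residue(0, "S", None, True),
    Residue(1, "M", 25, False),
    Residue(2, "K", 26, False),
]
trimmed, mapping = trim_and_renumber(example)
print(trimmed)
print(mapping)
print(assign_chain_ids(350, 120))
```

Keeping the mapping as a separate dictionary mirrors the idea of shipping the renumbering information alongside the structure rather than only applying it silently.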
SEARCH PAGE
- When searching for an interaction, an optional secondary search is now available. It restricts the results returned by the primary search term.
- All searches now accept partial search terms. For example, enter P55 instead of the full accession P55160, kinase to find all
interactions in which one of the protein names includes kinase, or PPA to match the gene name B'KAPPA.
- You can now search by Protein Name. This search is done across all names associated with the protein.
- Gene searches are performed against the gene name and gene synonyms.
- When you perform a search by UniProt accession or partial accession, the returned results will include matches based on UniProt secondary accessions.
Proteins matched on a secondary accession are indicated by the accession shown in bold red, together with an information button.
Hovering over the information button shows a box listing all the accessions for that protein.
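The search behavior described above can be sketched as follows. This is an illustrative sketch only: the Protein record, the function names, and the choice of case-insensitive substring matching are assumptions, and the example records are made-up placeholders rather than real UniProt annotations.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Protein:
    """Hypothetical protein record holding the searchable fields."""
    accession: str
    secondary_accessions: List[str] = field(default_factory=list)
    names: List[str] = field(default_factory=list)
    gene: str = ""
    gene_synonyms: List[str] = field(default_factory=list)


def _contains(text: str, term: str) -> bool:
    # Assumption: partial matching is a case-insensitive substring test.
    return term.lower() in text.lower()


def accession_match(protein: Protein, term: str) -> Optional[str]:
    """Return "primary", "secondary", or None so the UI can flag secondary hits."""
    if _contains(protein.accession, term):
        return "primary"
    if any(_contains(acc, term) for acc in protein.secondary_accessions):
        return "secondary"
    return None


def name_match(protein: Protein, term: str) -> bool:
    """Search across all names associated with the protein."""
    return any(_contains(name, term) for name in protein.names)


def gene_match(protein: Protein, term: str) -> bool:
    """Search the gene name and its synonyms."""
    return any(_contains(g, term) for g in [protein.gene, *protein.gene_synonyms])


Interaction = Tuple[Protein, Protein]
Predicate = Callable[[Protein], object]


def search(
    interactions: List[Interaction],
    primary: Predicate,
    secondary: Optional[Predicate] = None,
) -> List[Interaction]:
    """Keep interactions where either protein satisfies the primary search,
    then narrow them with the optional secondary search."""
    hits = [pair for pair in interactions if any(primary(p) for p in pair)]
    if secondary is not None:
        hits = [pair for pair in hits if any(secondary(p) for p in pair)]
    return hits


# Made-up records for illustration only.
prot_a = Protein("Q00001", ["P55999"], ["Example kinase 1"], "GENEA", ["B'KAPPA"])
prot_b = Protein("Q00002", [], ["Example phosphatase"], "GENEB", [])
prot_c = Protein("P55000", [], ["Other protein"], "GENEC", [])
interactions = [(prot_a, prot_b), (prot_b, prot_c)]

by_accession = search(interactions, lambda p: accession_match(p, "P55"))
narrowed = search(
    interactions,
    lambda p: accession_match(p, "P55"),
    secondary=lambda p: gene_match(p, "PPA"),
)
print(len(by_accession), len(narrowed))
```

Returning the match type from accession_match, rather than a plain boolean, is one way to let the results page decide which accessions to highlight as secondary matches.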
VISUALIZATION
- A new layout has been designed around the common display resolutions of monitors, tablets, and phones.
- The visual interface shows multiple experimental structures for the same interaction when such structures are available.
- If multiple binding sites exist for the interaction, the representative structure for the first binding site will be first on the list and displayed
by default. Each binding site has its own representative structure.
- A new feature has been added to show only the variants on the interface for each protein.
- For experimental structures, variants located on portions of the sequence that were not resolved in the experiment are not shown.
- When a variant is selected, the view zooms in on the corresponding portion of the structure, which changes color and switches to a ball-and-stick representation.
- Alternative color schemes are available for users with color vision deficiency.
- For the AlphaFold models, pLDDT and PAE graphs are provided as additional information to help users evaluate the reliability of the models.
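The variant filtering rules listed above (hiding variants on unresolved residues, and optionally showing only interface variants) can be sketched as follows. This is a hedged illustration, not the actual viewer code, which is implemented in JavaScript: the Variant record, the function name, and the variant labels are hypothetical, and the sets of resolved and interface positions are assumed to be computed elsewhere.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class Variant(NamedTuple):
    """Hypothetical variant record."""
    position: int  # residue position in UniProt numbering
    label: str     # display label, e.g. "R25C" (made-up format)


def visible_variants(
    variants: Iterable[Variant],
    resolved_positions: Set[int],
    interface_positions: Optional[Set[int]] = None,
    interface_only: bool = False,
) -> List[Variant]:
    """Select the variants to display for one protein of an interaction.

    Variants on residues that are not resolved in the structure are always
    hidden. When interface_only is set, only variants on interface residues
    are kept.
    """
    shown = [v for v in variants if v.position in resolved_positions]
    if interface_only:
        interface = interface_positions or set()
        shown = [v for v in shown if v.position in interface]
    return shown


# Made-up example: residues 25-30 are resolved, residues 27 and 28 are on the interface.
variants = [Variant(10, "A10T"), Variant(26, "K26E"), Variant(27, "L27P")]
resolved = set(range(25, 31))
interface = {27, 28}

print(visible_variants(variants, resolved))
print(visible_variants(variants, resolved, interface, interface_only=True))
```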