TCW GO Help

The version of GO that was used for the TCW annotation is shown at the bottom of the sTCWdb Overview. Unless TCW was built with the latest GO database, there will be differences from what is found on amigo.geneontology.org as the gene ontology is constantly being updated. The following example is dated 4-Mar-2021, and screen shots are from:

Amigo http://amigo.geneontology.org/amigo/term/GO:0000166
QuickGO https://www.ebi.ac.uk/QuickGO/GTerm?class=GO:0000166

Proteins in the UniProt database have been assigned GO terms. TCW has the following mappings:

  • Protein names to sequences (from a heuristic search program such as BLAST or DIAMOND),
  • GO terms to proteins (from UniProt),
  • GO terms to sequences (the GOs assigned to the proteins that hit the sequence).
When a protein is assigned a GO term, the ancestors of that term inherit the hit and the sequences of the hit, because ancestors are increasingly general terms that all apply. Hence, the GO terms in TCW are all the assigned (direct) and inherited (indirect) GO terms.
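
The inheritance rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TCW's implementation: the `PARENTS` map, the term names, and both helper functions are hypothetical.

```python
# Toy child -> parents map; all term names are hypothetical placeholders.
PARENTS = {
    "GO:child": ["GO:parent1", "GO:parent2"],
    "GO:parent1": ["GO:root"],
    "GO:parent2": ["GO:mid"],
    "GO:mid": ["GO:root"],
    "GO:root": [],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following child -> parent edges."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def go_terms_for_protein(assigned, parents=PARENTS):
    """Split a protein's GO terms into assigned (direct) and inherited (indirect)."""
    direct = set(assigned)
    indirect = set()
    for term in direct:
        indirect |= ancestors(term, parents)
    return direct, indirect - direct


direct, indirect = go_terms_for_protein(["GO:child"])
print(sorted(direct))
print(sorted(indirect))
```

The same closure applies to a sequence: it receives the direct and indirect terms of every protein that hits it.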

For example, nucleotide binding (GO:0000166) 'is_a' nucleoside phosphate binding (GO:1901265) and 'is_a' small molecule binding (GO:0036094). This is often referred to as GO:0000166 being the child of GO:1901265 and GO:1901265 being the parent of GO:0000166; likewise for GO:0036094. GO:0000166 inherits four other GOs as shown in the figure below.

Figure 2: Ancestor Paths and Graph for Nucleotide binding (GO:0000166)
A. QuickGO Graph     B. TCW GO Ancestor Path
In Figure 2A, the paths can be traced by following the arrows up the tree. In Figure 2B, the numbers on the right of the paths table correspond to 'levels'. However, a term may be in multiple levels, e.g. nucleotide binding is in level 4 for Path3 but level 5 for Path2; it is only shown in its lowest level.

To quote http://geneontology.org/doc/faq, "GO terms do not occupy strict fixed levels in the hierarchy. Because GO is structured as a graph, terms would appear at different 'levels' if different paths were followed through the graph. This is especially true if one mixes the different relations used to connect terms." Nevertheless, TCW computes levels and assigns each GO its lowest level.
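
To make the multi-level point concrete, the sketch below lists every level a term can occupy in a toy graph. It is an illustration only: the graph and the `levels` helper are hypothetical, the root is assumed to be level 1, and taking the smallest number as the 'lowest level' is an assumption about TCW's choice (swap `min` for `max` if the deepest level is meant).

```python
# Toy child -> parents map with hypothetical terms; the root is taken as level 1.
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["a"],
    "c": ["a", "b"],  # reachable from the root by two paths of different length
}


def levels(term, parents=PARENTS):
    """Return the set of levels at which `term` appears, one per path to the root."""
    if not parents[term]:
        return {1}
    found = set()
    for parent in parents[term]:
        found |= {level + 1 for level in levels(parent, parents)}
    return found


all_levels = levels("c")
print(sorted(all_levels))  # every level the term occupies
print(min(all_levels))     # one reading of 'lowest level' (assumption)
```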

Figure 2: Neighborhood for Nucleotide binding (GO:0000166)
A. Amigo Neighbors (Partial Children List)     B. TCW Neighbors

NOTE: In TCW, only GOs that have an assigned or inherited hit will be in the database. Hence, a GO's descendants will not necessarily all be in the TCW database (e.g. GO:0016502 is a child, but is not listed in the TCW display since it is not in the database). However, all ancestors will be in the database since they are inherited by at least this GO.

The relation types used in TCW are 'is_a' and 'part_of'.

Graph terms

In computer science, a graph has nodes and edges, and paths between nodes are formed by following the edges. The graph is directed if the edges have arrows; here, a child node points to its parent node. In the gene ontology graphs, the GO terms are represented by nodes, and an edge from a child node to a parent node indicates that the child is a specialized term of the parent. The gene ontology has three root nodes: biological_process, cellular_component, and molecular_function. GO is not strictly hierarchical, as a node can have more than one parent. See http://geneontology.org/page/ontology-structure for a more in-depth discussion.

TCW does not use edges that cross the three trees.

E-values

A GO term may be assigned or inherited by many protein hits in the TCW database. Each protein hit has an e-value indicating how well it aligned to the sequence. The GO-protein hit is assigned the best e-value over all the hit-sequence pairs.
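
As a rough sketch of this rule, the code below keeps the smallest (best) e-value for each GO-protein pair. The records and the `best_evalues` helper are hypothetical, and reading 'GO-protein hit' as a (GO, protein) pair is an assumption, not a description of TCW's schema.

```python
# Hypothetical hit records: (protein, sequence, e-value). Smaller is better.
HITS = [
    ("proX", "seqY", 1e-50),
    ("proX", "seqZ", 1e-20),
    ("proW", "seqY", 1e-5),
]

# Hypothetical GO terms per protein, assigned and inherited combined.
PROTEIN_GOS = {
    "proX": {"GO:0000166"},
    "proW": {"GO:0000166", "GO:0036094"},
}


def best_evalues(hits, protein_gos):
    """Return {(GO, protein): best e-value over all sequences the protein hits}."""
    best = {}
    for protein, _seq, evalue in hits:
        for go in protein_gos.get(protein, ()):
            key = (go, protein)
            if key not in best or evalue < best[key]:
                best[key] = evalue
    return best


for key, evalue in sorted(best_evalues(HITS, PROTEIN_GOS).items()):
    print(key, evalue)
```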

Examples

Say protein X (proX) aligns to sequence Y (seqY) with an e-value, and GO:0000166 is assigned to proX with an evidence code EC, then seqY is also said to be annotated with GO:0000166.
  • Since proX is assigned GO:0000166, it also inherits all GO:0000166 ancestors.
  • Since seqY is annotated with GO:0000166, it also inherits all GO:0000166 ancestors.
  • Since GO:0000166 is assigned to proX, all its ancestors inherit proX.
  • Since GO:0000166 annotates seqY, all its ancestors also annotate seqY.
  • GO:0000166 has EC from proX and e-value from proX-seqY hit.
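
The bullets above can be sketched as a small GO table for seqY. Everything here is illustrative: the data structures and the `sequence_go_table` helper are hypothetical, the parent map is truncated to a single edge, and the evidence code "IDA" and the e-value are made-up inputs. Only the single-hit case of the example is handled.

```python
PARENTS = {"GO:0000166": ["GO:0036094"], "GO:0036094": []}  # truncated, illustrative
ASSIGNED = {"proX": {"GO:0000166": "IDA"}}  # protein -> {GO: evidence code}
HITS = {"seqY": {"proX": 1e-30}}            # sequence -> {protein: e-value}


def ancestors(term):
    """Return all ancestors of `term` in the toy PARENTS map."""
    found = set()
    for parent in PARENTS.get(term, []):
        found.add(parent)
        found |= ancestors(parent)
    return found


def sequence_go_table(seq):
    """Return {GO: (evidence code or None, e-value)} for one sequence."""
    table = {}
    for protein, evalue in HITS[seq].items():
        for go, ec in ASSIGNED.get(protein, {}).items():
            table[go] = (ec, evalue)                   # assigned: carries the EC
            for anc in ancestors(go):
                table.setdefault(anc, (None, evalue))  # inherited: no EC
    return table


for go, (ec, evalue) in sorted(sequence_go_table("seqY").items()):
    print(go, ec, evalue)
```

A row with an evidence code is an assigned GO; a row with `None` is inherited.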
Searching in TCW: If you search for GO:0000166 in a TCW database using the GO annotation basic search, it will be shown in the table only if it, or any of its descendants, has been assigned to a protein in the database.

Select GO:0000166 in the GO table followed by View Sequences; this will show all sequences that have been annotated by this GO, either assigned or inherited.

In the resulting table of sequences, select a sequence followed by View Selected Sequence; the Sequence Detail view will be shown for the sequence. Select Show GO for all hits to see whether it is assigned (has EC) or inherited (no EC).

See All GO View for a detailed flow through one example.

Evidence codes (EC)

Evidence codes are shown in the pop-up hit table if there is a direct assignment. They are also shown in the sequence GO table.

Experimental evidence
  EXP  Inferred from Experiment
  IMP  Inferred from Mutant Phenotype
  IDA  Inferred from Direct Assay
  IGI  Inferred from Genetic Interaction
  IPI  Inferred from Physical Interaction
  IEP  Inferred from Expression Pattern

Author statement
  TAS  Traceable Author Statement
  NAS  Non-traceable Author Statement

Curatorial judgement
  IC   Inferred by Curator
  ND   No biological Data available

Computational analysis
  ISS  Inferred from Sequence or structural Similarity
  ISA  Inferred from Sequence Alignment
  ISO  Inferred from Sequence Orthology
  ISM  Inferred from Sequence Model
  IGC  Inferred from Genomic Context
  IBA  Inferred from Biological aspect of Ancestor
  IBD  Inferred from Biological aspect of Descendant
  IKR  Inferred from Key Residues
  IRD  Inferred from Rapid Divergence
  RCA  Inferred from Reviewed Computational Analysis

Automatically-assigned
  IEA  Inferred from Electronic Annotation

According to http://geneontology.org/page/guide-go-evidence-codes, all evidence codes are assigned by a curator except the last one (IEA, Inferred from Electronic Annotation). The same page states: "Evidence codes are not statements of the quality of the annotation. Within each evidence code classification, some methods produce annotations of higher confidence or greater specificity than other methods, in addition the way in which a technique has been applied or interpreted in a paper will also affect the quality of the resulting annotation. Thus evidence codes cannot be used as a measure of the quality of the annotation."