METHODOLOGY
Criteria for assigning amplified and overexpressed genes
To be considered in this analysis, a gene must lie within a minimal region of amplification, and overexpression of the normal gene must accompany the amplification. Genes that meet these criteria are classified as Class IV genes. Genes were further classified according to their scores, as determined by the evidence shown in the table.
- Class III genes require 1 or 2 points, indicating significant evidence of their involvement in cancer development.
- Class II genes require 3 or more points, indicating substantial evidence of their involvement in cancer development.
- Class I genes require that a drug targeting the encoded protein is used to treat patients and that its efficacy has been shown in clinical trials.
Supporting evidence | Criteria | Assigned score |
---|---|---|
Clinical correlation | Expression correlated with clinical outcome | 1 point |
Knowledge of cancer genes and control pathways | Mutation and amplification mutually exclusive | 1 point |
Knowledge of cancer genes and control pathways | Sole identified gene within the amplicon | 1 point |
Knowledge of cancer genes and control pathways | Inherited mutation predisposed to the same cancer type | 1 point |
Knowledge of cancer genes and control pathways | Mutation or amplification and overexpression of other genes in the same pathway | 1 point |
Biological evidence | Overexpression causes biological effect | 1 point* |
Biological evidence | siRNA knockdown or targeted drug causes biological effect in the cell containing the amplified and overexpressed gene, but not in cells lacking the amplified and overexpressed gene | 1 point* |
Animal studies | Substantial evidence from animal experiments | 1 point |
* Modulation of gene expression affecting biological properties can only be counted once.
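The scoring rules above can be sketched in a few lines of Python. This is an illustrative reading of the table, not the authors' own tooling: the `Evidence` field names are hypothetical, the gene is assumed to have already met the Class IV entry criteria, and a Class I assignment is assumed to take precedence over the point score.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical record of the supporting evidence found for one gene."""
    clinical_correlation: bool = False
    mutation_amplification_exclusive: bool = False
    sole_gene_in_amplicon: bool = False
    inherited_mutation_same_cancer: bool = False
    pathway_genes_altered: bool = False
    overexpression_effect: bool = False
    knockdown_or_drug_effect: bool = False
    animal_studies: bool = False
    # Assumption: True when a drug targeting the protein is in clinical use
    # and its efficacy has been shown in clinical trials.
    targeted_drug_with_efficacy: bool = False


def score(e: Evidence) -> int:
    """Sum the points from the table, one point per satisfied row."""
    points = sum([
        e.clinical_correlation,
        e.mutation_amplification_exclusive,
        e.sole_gene_in_amplicon,
        e.inherited_mutation_same_cancer,
        e.pathway_genes_altered,
        e.animal_studies,
    ])
    # Footnote rule: the two starred biological-evidence rows count once in total.
    if e.overexpression_effect or e.knockdown_or_drug_effect:
        points += 1
    return points


def classify(e: Evidence) -> str:
    """Return the class (I to IV) for a gene that already meets the entry criteria."""
    if e.targeted_drug_with_efficacy:
        return "I"
    points = score(e)
    if points >= 3:
        return "II"
    if points >= 1:
        return "III"
    return "IV"
```

For example, a gene supported only by clinical correlation and animal studies scores 2 points and falls in Class III under this reading.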
Integrated datasets methodology
This methodology is used to screen for additional Class III genes using integrated datasets of DNA copy number and gene expression. We cross-reference the recurrent gene amplifications and overexpression reported in these studies with mutations in established cancer genes.
- Identify papers reporting parallel genome-wide analysis of DNA copy number and gene expression data.
- For each integrated dataset, we identify all genes for which there is a significant relationship between increased gene copy number and overexpression. Examples of the criteria used are a statistically significant correlation between copy number and expression over all samples, a statistically significant difference between samples with the amplicon and those without, or a greater than two-fold overexpression compared to the median expression level of all samples.
- For each individual cancer type, we use the KEGG Pathway database to determine whether the amplified and overexpressed gene occurs in the same control pathway as mutated genes listed in both the COSMIC database and the Cancer Gene Census.
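Two of the screening steps above lend themselves to a small sketch: the two-fold overexpression criterion and the pathway cross-reference. The function names, the data layout, and the sample and gene identifiers below are all hypothetical, and expression values are assumed to be on a linear (not log) scale. The statistical-significance criteria are not covered here.

```python
from statistics import median


def overexpressed_amplified_samples(expression, amplified, fold=2.0):
    """Amplified samples whose expression exceeds `fold` times the median of all samples.

    expression: dict of sample ID -> linear-scale expression value for one gene.
    amplified:  set of sample IDs carrying the amplicon.
    """
    baseline = median(expression.values())
    return sorted(s for s in amplified if expression[s] > fold * baseline)


def shared_pathways(gene, pathway_members, cosmic_genes, census_genes):
    """Pathways containing `gene` and at least one other gene listed as mutated in both sources."""
    mutated = (cosmic_genes & census_genes) - {gene}
    return sorted(
        name for name, members in pathway_members.items()
        if gene in members and members & mutated
    )


# Hypothetical data for illustration only; not taken from any real dataset.
expression = {"T1": 1.0, "T2": 1.2, "T3": 0.9, "T4": 5.0, "T5": 4.1}
amplified = {"T4", "T5"}
pathways = {"pathway_1": {"GENE_A", "GENE_B"}, "pathway_2": {"GENE_A", "GENE_C"}}

hits = overexpressed_amplified_samples(expression, amplified)
shared = shared_pathways("GENE_A", pathways, {"GENE_B", "GENE_C"}, {"GENE_B"})
print(hits, shared)
```

In this toy input the median expression is 1.2, so both amplified samples pass the two-fold threshold, and only `pathway_1` contains a gene that appears in both mutation lists.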
We then review each of these genes to confirm that:
- it occurs within a recurrent minimal region of amplification (MRA),
- the amplification has been reported in at least 3 tumours or cell lines,
- the predicted biological effects are likely to contribute to cancer development, based on current knowledge of cancer mechanisms for the cancer type under consideration.
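The three review checks can be recorded per candidate gene as in the sketch below. The `Candidate` type and its field names are hypothetical; the third check is a reviewer's judgement, so it is represented here only as a flag set by hand.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical review record for one amplified and overexpressed gene."""
    gene: str
    in_recurrent_mra: bool     # lies within a recurrent minimal region of amplification
    reported_samples: int      # tumours or cell lines in which the amplification was reported
    plausible_mechanism: bool  # reviewer's judgement for the cancer type under consideration


def passes_review(c: Candidate, min_samples: int = 3) -> bool:
    """True only when all three review criteria are satisfied."""
    return (
        c.in_recurrent_mra
        and c.reported_samples >= min_samples
        and c.plausible_mechanism
    )


# Placeholder gene names; not real review outcomes.
candidates = [
    Candidate("GENE_A", True, 5, True),
    Candidate("GENE_B", True, 2, True),   # too few reported samples
    Candidate("GENE_C", False, 8, True),  # outside a recurrent MRA
]
retained = [c.gene for c in candidates if passes_review(c)]
print(retained)
```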
PubMed search
"2009"[Publication Date] AND cancer AND ( ( (amplification[Title/Abstract] OR copy number[Title/Abstract] OR CGH[Title/Abstract] OR gain[Title/Abstract]) AND expression[Title/Abstract] AND (microarray[Title/Abstract] OR SNP[Title/Abstract] OR array[Title/Abstract]) ) OR ( integrative[Title/Abstract] ) )
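For anyone wishing to repeat the search programmatically, the sketch below rebuilds the query string and wraps it in a request URL for the NCBI E-utilities `esearch` endpoint. The helper names are hypothetical, parameterising the publication year is an assumption made for reuse, and the generated query is logically equivalent to the one above but uses slightly different parenthesisation. No request is sent by this code.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def tiab(term):
    """Restrict a term to the Title/Abstract field."""
    return f"{term}[Title/Abstract]"


def any_of(terms):
    """Join terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(year):
    """Rebuild the search string for a given publication year."""
    alteration = any_of(tiab(t) for t in ("amplification", "copy number", "CGH", "gain"))
    platform = any_of(tiab(t) for t in ("microarray", "SNP", "array"))
    combined = f"({alteration} AND {tiab('expression')} AND {platform})"
    return f'"{year}"[Publication Date] AND cancer AND ({combined} OR {tiab("integrative")})'


def esearch_url(query, retmax=100):
    """Build (but do not send) an esearch request URL for the PubMed database."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


print(build_query(2009))
print(esearch_url(build_query(2009)))
```

The resulting URL can be fetched with any HTTP client; NCBI's usage guidelines on request rates apply.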