Taverna has now moved to the Apache Software Foundation. For updated information, see Apache Taverna (incubating).

Graves disease

Graves disease scenario

The aim of this scenario is to identify and characterise genes located in regions of human chromosomes that show linkage to Graves disease (GD) (shown in the figure below). GD is an autoimmune disease of the thyroid in which an individual's immune system attacks cells in the thyroid gland, resulting in hyperthyroidism. The hyperthyroidism is caused by thyroid-stimulating autoantibodies, secreted by lymphocytes of the immune system, which stimulate the thyrotrophin receptor.

Figure: Graves disease scenario

Affymetrix microarray studies

The GD candidate genes were identified by microarray analysis. Affymetrix U95A arrays were probed with RNA extracted from CD4-positive lymphocytes from four GD patients and four healthy controls. The four GD microarray datasets were then compared with the four control datasets using the Affymetrix data mining tool to identify differentially expressed genes.
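The comparison of patient and control datasets can be illustrated with a minimal sketch. It is not the Affymetrix data mining tool, whose method is not described here; it assumes a simple log2 fold change between group means, and the function names, the threshold of 1.0 and the probe names in the usage note are hypothetical.

```python
from math import log2
from statistics import mean


def log2_fold_change(patient_values, control_values):
    """Return log2(mean patient signal / mean control signal) for one probe."""
    return log2(mean(patient_values) / mean(control_values))


def differentially_expressed(expression, threshold=1.0):
    """Select probes whose absolute log2 fold change reaches the threshold.

    `expression` maps a probe name to a pair of lists:
    (signals from GD patients, signals from healthy controls).
    The threshold is an assumption for illustration only.
    """
    hits = {}
    for probe, (patients, controls) in expression.items():
        change = log2_fold_change(patients, controls)
        if abs(change) >= threshold:
            hits[probe] = change
    return hits
```

Calling `differentially_expressed({"probe_A": ([400, 420, 380, 400], [100, 100, 100, 100])})` with these made-up signals would report `probe_A`, since its mean signal is four times higher in the patient group. A real analysis would also need normalisation and a statistical test, which this sketch omits.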

Annotation pipeline

More than 50 genes were found to be differentially expressed in CD4-positive lymphocytes from GD patients. To understand why these genes were expressed in lymphocytes from GD patients but not in healthy individuals, the GD biologist would like to use myGrid to query public databases such as EMBL, GO, HGVBASE and MEDLINE for information about gene structure and function, chromosome location, the presence of single nucleotide polymorphisms (SNPs), expression control features and association with other genetic diseases. The experimental conditions and diseases in which the expression of the candidate genes is significantly altered also need to be identified from OMIM.

Genotype assay design system

SNPs are small (single base pair) genetic variations found in the genome among individuals. The differential expression of the candidate genes in GD individuals may be due to, or related to, the presence of SNPs associated with GD. The GD biologist is interested in identifying those SNPs which are found in her GD patients and determining their frequency.

Restriction fragment length polymorphism (RFLP) assays are developed to genotype the SNPs in her candidate genes. The region flanking the SNP on both sides is amplified using the polymerase chain reaction (PCR). The amplified PCR product is digested with a suitable restriction enzyme (i.e. one that will cut at one SNP allele and not the other), and the products are run on agarose gels to view product size and determine the genotype.

The GD biologist would like to use myGrid to:

  1. Query databases to retrieve SNP information associated with candidate genes.
  2. Aid in the design of primers (short pieces of DNA which mark the start and end points of the section of the DNA sequence which she wants to amplify) for the PCR experiment.
  3. Select the restriction enzyme that is specific to a particular SNP for the RFLP experiment.
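The enzyme selection in step 3 can be sketched as follows. This is an illustration only, not the myGrid service: it assumes the SNP is given as two alleles plus its left and right flanking sequence, and it checks a small hand-picked table of enzymes. The recognition sites listed are the standard ones for these enzymes; the function names and the example sequences are hypothetical.

```python
# Recognition sites of a few common restriction enzymes. All five sites are
# palindromic, so scanning one strand is enough for this sketch.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
    "MspI": "CCGG",
}


def count_sites(sequence, site):
    """Count occurrences of a recognition site, including overlapping ones."""
    width = len(site)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == site
    )


def diagnostic_enzymes(left_flank, alleles, right_flank, enzymes=ENZYMES):
    """Find enzymes that cut the amplified region differently for each allele.

    Returns a dict mapping enzyme name to the allele with more cut sites.
    An enzyme that cuts both alleles equally often cannot distinguish them
    and is left out.
    """
    first, second = alleles
    first_seq = left_flank + first + right_flank
    second_seq = left_flank + second + right_flank
    result = {}
    for name, site in enzymes.items():
        first_cuts = count_sites(first_seq, site)
        second_cuts = count_sites(second_seq, site)
        if first_cuts != second_cuts:
            result[name] = first if first_cuts > second_cuts else second
    return result
```

For a made-up region with left flank `ACGTGAAT`, alleles `T` and `C`, and right flank `CAGGT`, the `T` allele completes the EcoRI site `GAATTC` while the `C` allele does not, so `diagnostic_enzymes("ACGTGAAT", ("T", "C"), "CAGGT")` reports EcoRI as cutting the `T` allele. Comparing site counts rather than mere presence keeps an enzyme usable when the amplified region already contains another copy of its site.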

3D protein structure & effect of coding SNP on protein active site

Any SNPs occurring in the coding regions of a candidate gene may potentially give rise to a change in the amino acid sequence of the protein encoded by the gene. The GD biologist would like to use myGrid to:

  1. Query a protein structure database, e.g. PDB or MSD, to determine whether a structure of the protein encoded by her candidate gene is available. If so, view the protein structure to study how it relates to the function of the protein.
  2. Obtain information about the protein, e.g. its function and functional domains, by querying SWISS-PROT and InterPro. Use Sheffield’s AMBIT Web service to retrieve information about an active site whose characteristics may be altered by a coding SNP that has changed the amino acid sequence of the protein at the position where the active site is encoded.
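Whether a coding SNP changes the amino acid sequence can be checked directly against the standard genetic code. The sketch below is an illustration, not part of the myGrid workflow; it assumes the coding sequence is supplied in frame, on the coding strand, with a 0-based SNP position, and the function name `snp_effect` is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(coding_sequence, position, alt_base):
    """Classify a single-base change in an in-frame coding sequence.

    `position` is the 0-based index of the SNP within `coding_sequence`.
    Returns (reference amino acid, alternative amino acid, kind), where
    kind is "synonymous", "missense" or "nonsense" and "*" marks a stop.
    """
    offset = position % 3
    codon_start = position - offset
    ref_codon = coding_sequence[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return ref_aa, alt_aa, kind
```

In the made-up sequence `ATGGAAGTT` (Met-Glu-Val), changing the base at position 4 from `A` to `T` turns the codon `GAA` into `GTA`, so `snp_effect("ATGGAAGTT", 4, "T")` classifies the change as a missense substitution of glutamate by valine. Only missense and nonsense SNPs alter the protein, so these are the ones worth checking against active site information.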

The workflow for the Graves Disease analysis is published on myExperiment.

Publications

Articles and papers about the success of Taverna for Graves Disease research.