Welcome to the PGSB Wheat Genome Database
Major Breakthrough in Deciphering Bread Wheat’s Genetic Code. The UK bread wheat sequence and genome analysis was published in Nature on Nov. 29, 2012. For more information, please visit our wheat (UK 454 sequence instance) genome database.
Read more in the international press release or in the HMGU press release (in German). Listen to a radio feature (mp3; 2.0 MB, in German) about the Nature wheat article. (Copyright: Lucian Haas / www.lucianhaas.de)
Bread wheat has a hexaploid genome with a size of approximately 17 Gb, making it one of the largest and most complex plant genomes. Wheat is of fundamental importance to world agriculture, with an estimated 2007 harvest of ~550 million tons.
In this project the genome of bread wheat (Chinese Spring line) was sequenced to 5x coverage using 454 technology. The wheat raw sequence reads can be downloaded at EMBL/Genbank under SRA study ERP000319.
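As a rough back-of-the-envelope illustration of what 5x coverage means for a genome of this size, the expected raw sequence volume is simply genome size times average coverage. The function name below is illustrative, and the inputs are the approximate values quoted above, not exact project statistics.

```python
def expected_raw_bases(genome_size_bp: float, coverage: float) -> float:
    """Approximate total sequenced bases for a given average coverage."""
    return genome_size_bp * coverage


GENOME_SIZE_BP = 17e9  # ~17 Gb, the approximate genome size quoted above
COVERAGE = 5           # 5x average coverage

total = expected_raw_bases(GENOME_SIZE_BP, COVERAGE)
print(f"~{total / 1e9:.0f} Gb of raw sequence")  # prints: ~85 Gb of raw sequence
```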
Constructed from these raw reads, the Orthologous Assembly (OA) dataset represents a genic sub-assembly. We constructed a set of orthologous representative grass genes incorporating genes from Brachypodium distachyon, Sorghum bicolor, Oryza sativa and Hordeum vulgare. We mapped and assembled wheat raw reads on each OG (Orthologous Group) representative using stringent parameters to avoid collapsing of homologous sequences. Sub-assembly sequences were assigned sub-genome predictions (A, B, D or X for unknown) using a trained machine learning classifier.
The Low Copy-number Genome assembly (LCG) was constructed by filtering out repetitive sequences and assembling the remaining low-copy sequences de novo using gsAssembler from the Newbler package (development version 2.6pre) with the "-large" parameter.
The data sets represent outputs of the BBSRC-funded grant "Mining the allohexaploid wheat genome for useful sequence polymorphisms" (BB/G013004/1, BB/G012865/1). The grant is led by Prof. Keith Edwards (University of Bristol) and is a collaboration between Prof. Neil Hall and Dr. Anthony Hall (University of Liverpool), Dr. Gary Barker (University of Bristol) and Prof. Mike Bevan (John Innes Centre).
This website in PGSB PlantsDB gives access to the UK wheat data in several ways:
- A.) Bulk download of OA and LCG assemblies: please go to "Download" http://pgsb.helmholtz-muenchen.de/plant/wheat/uk454survey/download/index.jsp
- B.) Search for wheat OA sequences using a known gene from one of the reference organisms Brachypodium, Sorghum, rice or barley (fl-cDNA): go to "Search" http://pgsb.helmholtz-muenchen.de/plant/wheat/uk454survey/searchjsp/index.jsp and enter the gene identifier in the search field under "Search for Genetic Elements", e.g. Bradi1g00370.1 if you have a gene in Brachypodium and want to retrieve the corresponding wheat genic sub-assemblies.
- C.) Use the BLAST server http://pgsb.helmholtz-muenchen.de/plant/search.jsp to search your sequence against the set of orthologous genes (OGs) from Brachypodium, Sorghum, rice or barley. Select either the BLAST database 'TriticumAestivum_UK454_OGreps_CDS.fa' (for BLAST against the nucleotide DB) or 'TriticumAestivum_UK454_OGreps_PEP.fa' (for BLAST against the protein DB), depending on your input sequence type. This should give you one or more hits against a reference organism gene (for example a Brachypodium gene). If you identify sufficient homology for your purposes, use the identifier of the gene found to retrieve the corresponding wheat sub-assembly sequences as in B.)
WARNING: if you get an error such as "No Analysis results found for element_name: Bradi1g00270.3", this particular gene is not part of the OG set. This may mean that we did not find enough (or any) wheat sequence data associated with this gene.
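For option C, the choice of BLAST database depends only on whether the query is a nucleotide or a protein sequence. A minimal Python sketch of that decision follows. The helper name `choose_blast_db` and the alphabet heuristic are assumptions made for illustration and are not part of the PGSB site; the two database names are the ones listed above.

```python
NUCLEOTIDE_DB = "TriticumAestivum_UK454_OGreps_CDS.fa"
PROTEIN_DB = "TriticumAestivum_UK454_OGreps_PEP.fa"

# Letters treated as nucleotide codes by this simple heuristic (assumption).
NUCLEOTIDE_LETTERS = set("ACGTUN")


def choose_blast_db(sequence: str) -> str:
    """Return the database name matching the query type (hypothetical helper)."""
    # Ignore whitespace, digits and gap characters; compare case-insensitively.
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("empty query sequence")
    # A query made up only of nucleotide letters is treated as DNA/RNA.
    if all(c in NUCLEOTIDE_LETTERS for c in residues):
        return NUCLEOTIDE_DB
    return PROTEIN_DB


print(choose_blast_db("ATGGCGTCCAAG"))  # nucleotide query
print(choose_blast_db("MASKLLVEQW"))    # protein query
```

The heuristic is deliberately crude: a short peptide consisting only of A, C, G, T or N would be classified as nucleotide, so it is worth checking the result by eye for unusual queries.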
This project also produced both raw sequences and an assembly for the wheat transcriptome. These data were uploaded to EMBL/Genbank under accession number ERP001415.
http://www.cerealsdb.uk.net/ provides an additional resource for some of these data, including access to wheat SNPs.
A detailed description of these data and their generation can be found in the supplementary material of the Nature publication (http://www.nature.com/nature/journal/v491/n7426/full/nature11650.html). If you have any questions about this website or its data content, please contact Manuel Spannagl (firstname.lastname@example.org) or Klaus Mayer (email@example.com).