Identification of Conserved Promoter Motifs and Transcription Factor Binding Sites in Plant Promoters

Explore the identification of conserved promoter motifs and transcription factor binding sites in plant promoters through wet-lab and in silico methods. Discover the significance of transcription factor binding sites, experimentally verified sites, de novo motif discovery, and real promoter structure in understanding gene regulation. Utilize databases of orthologous promoters for comparative analysis and annotation of transcription start sites in plants and chordates.

  • Promoter Motifs
  • Transcription Factor Binding Sites
  • Gene Regulation
  • Orthologous Promoters
  • Plant Promoters

Presentation Transcript


  1. Identifying conserved promoter motifs and transcription factor binding sites in plant promoters
       Endre Sebestyén, ARI-HAS, Martonvásár, Hungary
       26 November 2009, RCPGD Annual Meeting

  2. Transcription factor binding sites
       • TFs bind short, often degenerate DNA sequences
       • Promoters are variable-length 5′ sequences with TFBSs
       • TFBSs are usually conserved within a nonconserved surrounding sequence
       • Some well-known TFBSs
           • TATA box
           • GC box
           • CpG island
       • Lots of other, less general TFBSs
       • Similarly expressed genes, or homologues, should contain similar TFBSs
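The idea of a short, degenerate binding sequence on slide 2 can be made concrete with IUPAC nucleotide codes. The following is a minimal sketch, not part of the original talk: the function name `find_consensus`, the TATA-like consensus `TATAWAW`, and the promoter fragment are all illustrative assumptions.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_consensus(consensus, sequence):
    """Return 0-based start positions where the degenerate consensus matches."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # The lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]

# Illustrative TATA-like consensus and a made-up promoter fragment.
print(find_consensus("TATAWAW", "GGCTATAAATAGCC"))
```

Only the forward strand is searched here; a real scan would usually also check the reverse complement.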

  3. Transcription

  4. TFBS search and promoter analysis
       • Wet-lab methods
           • DNase footprinting
           • Electrophoretic mobility shift assay
           • ChIP-chip, ChIP-seq
       • In silico methods
           • Experimentally verified sites
               • Consensus sequences
               • Consensus matrices
           • De novo motif discovery
               • Oligo frequency
               • Phylogenetic footprinting
               • Other methods
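To illustrate what a consensus matrix from slide 4 does in practice, here is a hedged sketch of building a log-odds position weight matrix from aligned sites and sliding it along a sequence. The aligned sites, the uniform background of 0.25, the pseudocount, and the threshold are assumptions chosen for the example, not values from the talk.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at or above threshold."""
    hits = []
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits

# Made-up aligned binding sites, for illustration only.
sites = ["GCCGAC", "ACCGAC", "GCCGAC", "GCCGAT"]
pwm = build_pwm(sites)
print(scan(pwm, "TTGCCGACAA", threshold=5.0))
```

Compared with a plain consensus string, the matrix scores near-matches on a continuous scale, so the threshold controls the trade-off between missed sites and false positives.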

  5. Experimentally verified sites
       • TRANSFAC
       • JASPAR
       • PLACE
       • PlantCARE

  6. De novo motif discovery
       • Orthologous gene groups
           • Evolutionarily conserved functional sites
       • Co-regulated genes
           • Same tissue, body part
           • Same developmental stage
           • Etc.

  7. Real promoter structure
       • No general motifs
           • No TATA box, GC box, etc.
       • Lots of false positive TFBSs
           • With both wet-lab and in silico methods
       • Sometimes no apparent common TFBSs between coregulated genes

  8. Database of Orthologous Promoters
       • Orthologous promoter sequence collections
       • Based on a BLAST search with first exons of reference species
           • Plants (Viridiplantae)
               • Reference species: Arabidopsis thaliana
           • Chordates
               • Reference species: Homo sapiens
       • 500/1000/3000 bp 5′ upstream regions
       • Conserved sequence regions
       • Annotations
           • Xrefs to other databases
           • Annotated transcription start sites
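Slide 8 mentions 500, 1000 and 3000 bp 5′ upstream regions. As a rough sketch of how such a region can be cut from a genomic sequence given a transcription start site, consider the helper below. It does not reproduce the DoOP pipeline; the function names, the 0-based coordinate convention, and the toy sequence are assumptions.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def upstream_region(chromosome, tss, strand, length):
    """Return up to `length` bases upstream of the TSS, read 5' to 3'.

    `tss` is the 0-based index of the first transcribed base (an assumed
    convention). The region is truncated at the chromosome ends.
    """
    if strand == "+":
        start = max(0, tss - length)
        return chromosome[start:tss]
    # On the minus strand, "upstream" lies at higher forward coordinates.
    end = min(len(chromosome), tss + 1 + length)
    return reverse_complement(chromosome[tss + 1:end])

toy = "ACGTTGCAAT"  # made-up sequence
for size in (3, 500):
    print(size, upstream_region(toy, 4, "+", size), upstream_region(toy, 4, "-", size))
```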

  9. DoOP http://doop.abc.hu

  10. DoOP cluster number

  11. DoOP subsets
       • Cluster > Subset
       • Subset: collection of evolutionarily monophyletic sequences in a cluster
       • Plant subsets
           • Brassicaceae
               • Arabidopsis thaliana
               • Brassicaceae species
           • Eudicotyledons
               • Grape, Solanum species, papaya, tobacco
           • Magnoliophyta
               • Maize, rice
           • Viridiplantae

  12. DoOP subsets

  13. [Figure: chart comparing DoOP versions v1.5, v1.6 and v1.8; vertical axis from 0 to 45000. Legend: Arabidopsis thaliana, Brassica rapa, Brassica oleracea, Boechera stricta, Carica papaya, Populus trichocarpa, Ricinus communis, Vitis vinifera, Medicago truncatula, Lotus japonicus, Brassica napus, Nicotiana tabacum, Solanum lycopersicum, Oryza sativa, Zea mays, Glycine max, Capsella rubella, Physcomitrella patens, Sorghum bicolor, Arabidopsis lyrata, Solanum tuberosum, Other]

  14. Gene types
       • Gene Ontology
       • Standardized annotation for genes
           • Biological process
               • What does it do?
               • Transcription, translation, stress response, etc.
           • Cellular component
               • Where is it located?
               • Membrane, ribosome, cytosol, etc.
           • Molecular function
               • How does it work?
               • Dehydrogenase, ATP binding, etc.

  15. Gene types
       • Gene Ontology
       • 500 bp promoters
       • Search for significantly enriched terms in annotation
           • Brassicaceae
           • Eudicotyledons
           • Magnoliophyta
           • Viridiplantae
       • BP: transcription, translation, protein folding, stress response
       • CC: plasma membrane, ribosome parts
       • MF: ATP/GTP binding, DNA binding, ribosome parts
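Slide 15 does not say which statistical test was used to find enriched Gene Ontology terms. A one-sided hypergeometric test is a common choice for this kind of question, so the sketch below uses it as a stand-in; the function name and the toy numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in the sample.

    population: all genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes in the subset of interest (e.g. one DoOP subset)
    hits:       genes in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Toy numbers: 10 genes, 5 annotated, 2 sampled, both annotated.
print(hypergeom_enrichment_p(10, 5, 2, 2))
```

When many terms are tested at once, the resulting p-values would normally need a multiple-testing correction before any term is called significant.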

  16. Motif generation
       • Phylogenetic footprinting
           • Functional TFBSs should be conserved
       • Local sequence alignment
       • Define conserved regions
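Slide 16 names local sequence alignment as the way conserved regions are defined, without naming the tool. As an illustration only, here is a compact Smith-Waterman local alignment of two promoter fragments; the scoring values (match 2, mismatch -1, linear gap -2) and the sequences are assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the best-scoring cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Two made-up orthologous promoter fragments sharing one conserved block.
print(smith_waterman("TTTTGCCGACTTTT", "AAAGCCGACAAA"))
```

The aligned block is the candidate conserved region; in a real analysis several orthologous sequences would be compared rather than a single pair.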

  17. Motif generation [Figure panels: eudicotyledons, Magnoliophyta, Brassicaceae]

  18. Motif statistics
       Motif number by upstream region length (bp)

       Subset            500       1000      3000
       Brassicaceae      323411    410720    893788
       eudicotyledons    13863     20192     34353
       Magnoliophyta     2009      2211      1938
       Viridiplantae     589       565       372

  19. Motif statistics
       % conserved, by upstream region length (bp)

       Length    Brassicaceae    eudicotyledons    Magnoliophyta    Viridiplantae
       500       22              5                 6                4
       1000      19              3                 5                2
       3000      16              2                 2                1

       Avg length, by upstream region length (bp)

       Length    Brassicaceae    eudicotyledons    Magnoliophyta    Viridiplantae
       500       9               7                 8                9
       1000      9               7                 9                9
       3000      9               7                 8                9

  20. TFBS databases

       Database     TFBSs
       TRANSFAC     977
       JASPAR       18
       PLACE        416
       PlantCARE    646
       ABS          650
       AGRIS        72

       • Lots of redundant data
       • Low quality, not updated
       • More than 100 different versions of the TATA box

  21. Synthetic biology
       • Synthetic biology
           • iGEM competition
           • BioBricks
           • MIT Registry of Standard Biological Parts
               • UV-responsive promoter
               • Promoter expressed in roots
               • Etc.
       • Synthetic promoters
           • Define basic promoter elements
           • Build and use custom-made promoters
           • Gene expression more or less when and where you want it

  22. SNP conservation
       • Gene expression levels change because
           • Regulatory elements change
           • Usually NOT protein-coding regions
       • Conserved promoter regions might be functional regulatory elements
       • Search for SNPs in these regions
       • These SNPs might be interesting for breeders, as they are likely to be functional
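The search suggested on slide 22 amounts to an interval lookup: keep the SNPs whose position falls inside a conserved promoter region. The sketch below assumes 0-based, half-open `(start, end)` coordinates within a single promoter; the function name and all positions are made up for illustration.

```python
def snps_in_conserved_regions(snp_positions, conserved_regions):
    """Return (snp, region) pairs where the SNP lies inside a conserved region.

    Regions are (start, end) pairs, 0-based and half-open (assumed convention).
    """
    hits = []
    for snp in sorted(snp_positions):
        for start, end in conserved_regions:
            if start <= snp < end:
                hits.append((snp, (start, end)))
                break  # report each SNP once
    return hits

snps = [5, 12, 30, 47]          # made-up SNP positions
regions = [(10, 20), (25, 31)]  # made-up conserved regions
print(snps_in_conserved_regions(snps, regions))
```

For genome-scale data, sorting the regions and using a binary search or an interval tree would avoid the nested loop.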

  23. A real example
       • Vilmos Soós, Endre Sebestyén, Angéla Juhász, János Pintér, Marnie E. Light, Johannes Van Staden, Ervin Balázs (2009) Stress-related genes define essential steps in the response of maize seedlings to smoke-water. Functional and Integrative Genomics, Volume 9, Number 2, Pages 231-242; doi:10.1007/s10142-008-0105-8
       • Microarray experiments
           • Maize kernels (Mv 540)
           • 24 and 48 h control vs. smoke-treated samples
           • Up- and downregulated genes
       • Promoter sequences up to 1500 bp were extracted if available

  24. Analysis of promoters
       • TRANSFAC database version 12.1
           • Collection of TFBSs
           • More than 100 plant TFBSs
           • DRE element: GCCGAC
       • Scan for the TFBSs in the maize promoters
           • Up- and downregulated
       • Also count the frequencies of all 5-8mer sequences
           • In all available maize promoters, not only the up- or downregulated ones
       • Calculate the over- or underrepresentation of a TFBS as follows
           • Observed frequency in up- or downregulated promoters divided by the expected frequency in all promoters
           • If ratio > 1: overrepresented
           • If ratio < 1: underrepresented
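The ratio described on slide 24 can be sketched in a few lines. The slide does not state how frequencies were normalised, so this example assumes occurrences per possible window position, pooled over promoters, on the forward strand only; the function names and the toy promoters are also assumptions. The DRE element GCCGAC is taken from the slide.

```python
def count_overlapping(motif, sequence):
    """Count occurrences of motif in sequence, allowing overlaps."""
    count, start = 0, sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count

def motif_frequency(motif, promoters):
    """Occurrences per possible window position, pooled over all promoters."""
    occurrences = sum(count_overlapping(motif, p) for p in promoters)
    windows = sum(max(0, len(p) - len(motif) + 1) for p in promoters)
    return occurrences / windows if windows else 0.0

def representation_ratio(motif, regulated, all_promoters):
    """Observed frequency in regulated promoters over expected frequency in all.

    Returns None when the motif never occurs in the background set.
    """
    expected = motif_frequency(motif, all_promoters)
    if expected == 0:
        return None
    return motif_frequency(motif, regulated) / expected

# Made-up promoters; the regulated set is a subset of the background.
regulated = ["AAGCCGACAA"]
background = ["AAGCCGACAA", "TTTTTTTTTT"]
ratio = representation_ratio("GCCGAC", regulated, background)
print(ratio, "overrepresented" if ratio > 1 else "underrepresented")
```

A ratio alone says nothing about significance, so with small promoter sets a statistical test on the underlying counts would be a sensible follow-up.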

  25. Analysis of promoters
       • Results
           • Binding sites related to
               • Organogenesis
               • Meristem development
               • Housekeeping functions
               • Biotic stress
               • Cold and dehydration stress
               • ABA-related motifs

  26. Thank you for your attention!
