Dr. Atchison on regulating gene expression. 2nd of 3 biochem lectures this afternoon. bleah.

Transcription - copying a DNA gene into an RNA copy of the gene. Several billion base pairs in the average mammalian genome. Most genes start expression at a single base pair, so the polymerase has to know where to start.

Global considerations: chromatin structure - DNA is all wound up, mostly inaccessible. So, polymerase doesn't worry about the 97% that is inaccessible. What controls this accessibility? Well, nucleosomes - DNA wound around an octamer of histones (two copies each of the 4 core histones) - are tightly packed up in the 30nm fiber. Dr. Pherson spoke about huge loops attached to the protein scaffold of the chromosome - those loops may be targeted by various enzymes and may be key to identifying which regions are accessible.

What controls accessibility? Modification of histones via acetylation, phosphorylation or glycosylation. Specific proteins can modify histones. Also, variant histones such as macro-H2A exist w/in the cell. These variants may target certain genes for expression. They may look partly like a histone and partly like something else, so when they are used to build a nucleosome, they may act as a target.

Locus control regions (LCRs) - may open up/remodel chromatin from those large loops. LCRs can open and organize chromatin w/in a loop. So far, all LCRs are tissue specific, only operating in one cell type. They may be a means of identifying DNA regions that are to be transcribed in a particular cell type. Deficiency in one causes β-thalassemia.

Methylation: eukaryotic cells can methylate their DNA at CpG (cytosine - guanine) dinucleotides. This methylation seems to play a significant role in chromatin structure and in gene regulation, e.g. the TAT gene is expressed in liver, not in other tissues, and is methylated in every tissue but liver. Methylation patterns can change during development to allow gene activation. Patterns are first laid down during gametogenesis.
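Side note to self: a CpG dinucleotide is just a C immediately followed by a G on the same strand, so the candidate methylation sites in a sequence are easy to list. A minimal sketch; the function name and the example sequence are made up for illustration and are not from the lecture or handout.

```python
def cpg_positions(seq: str) -> list[int]:
    """Return the 0-based index of every C that is immediately followed by a G."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G"]


# Hypothetical toy sequence, not a real gene.
example = "ATCGGCGTACGA"
print(cpg_positions(example))  # [2, 5, 9]
```

This only finds where methylation could occur. Whether a given site actually is methylated in a given tissue can't be read off the sequence.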
This imprinting of methylation patterns can lead to sex-related differences in gene expression. Juvenile Huntington's disease: preferentially inherited from the male, related to methylation pattern. Infantile myotonic muscular dystrophy: congenital form inherited from the female. Diabetes - if inherited from male only, genes may contribute to disease. The distribution of methylation patterns is called imprinting. They change during development. Knocking out the methylation segment is lethal, too.

INITIATION OF TRANSCRIPTION

DNA sequences - promoter DNA sequences control the initiation and frequency of transcription. These DNA sequences consist of two types of DNA elements: basal promoter elements and regulatory promoter elements.

                      __________
    ________TATAA_____|  gene  |____
    ---------------------------------- DNA

If the TATAA box is w/in 20-30 base pairs of the gene, it will initiate transcription. Some genes don't have this; TATAA-less promoters just have the initiator sequence. The basal promoter contains the TATAA box and an initiator sequence. TATAA binding protein (TBP) binds to the TATAA box, which is the first, perhaps rate-limiting, step in transcriptional initiation. The TBP looks like a saddle, kinda. It hangs over the DNA (and TATAA box, of course). When it binds to the TATAA box, it bends the DNA, making kind of a kink in it, bringing some regulatory proteins into association w/other regions of the promoter.

So, TBP binds down on top of the TATAA box. Once that happens, another protein, TFIIA, can bind to the promoter to make a pre-initiation complex. TFIIA interacts w/TBP, helps stabilize that interaction, and helps displace any inhibitors of the TBP/TATAA interaction. Then TFIIB joins, and then and only then can the RNA polymerase, which doesn't directly recognize DNA, join the party. The RNA polymerase comes in with TFIIF, then other things join in - [see handout p 11. note that IID can be called TBP in the first steps. note the polymerase's tail portion of repeating subunits tyr-ser-pro-thr-ser-pro-ser]. This sequence repeats 26-52 times depending on species. It is easily phosphorylated (on ser, thr, tyr, that is) and has to be phosphorylated for transcription to start. It is believed that TFIIH does the actual phosphorylating. Then ribonucleoside triphosphates are added and transcription proceeds.

Note that even TATAA-less promoters require TBP, which must then be recruited to the promoter by interaction with proteins that bind to the initiator DNA sequence.

Recall that each cell type has different genes turned on and off. Something must be regulating this. Modulation of transcription rates occurs due to the action of other regulatory proteins called transcription factors. These factors bind to the regulatory DNA sequences (upstream promoter elements) and then somehow communicate w/the proteins on the basal promoter complex. This requires TBP associated factors (TAFs). The complex of TBP with its associated TAFs is called TFIID. The TFIID complex is what allows the basal promoter complex to interact with regulatory proteins bound at the distant regulatory sites. Regulation really occurs this way... in liver, may not have a particular TAF, so something might not occur...

Recently the ability to clone the genes encoding transcription factors has been gained, so suddenly there's a surge in the info available concerning the proteins that control transcription. Almost all of the transcription factors cloned so far are composed of at least two distinct functional protein domains: one that interacts w/DNA, and one that interacts w/the basal complex. The latter is generally called the "activation domain" because it activates transcription.

Structure of transcription factors:

DNA BINDING DOMAIN

This part of the transcription factor is responsible for recognizing and binding to specific DNA sequences. These domains can be categorized into one of several groups.
Zinc fingers - contain conserved cysteine/histidine residues that complex w/zinc. These fingers recognize and bind to specific DNA sequences.

Homeodomains - made of certain conserved AAs, often involved in control of development.

Leucine zippers and Adjacent Basic Regions - made of an amphipathic alpha helix with leucine residues occurring every 7 AAs (3.5 AAs per turn), so the leucines all face the same side of the helix. Proteins dimerize via their leucine zippers, with the long hydrophobic side chains on one side of the helix, and bind to DNA via adjacent basic regions.

Helix-Loop-Helix - HLH domains make amphipathic alpha helices that allow transcription factors to dimerize. Again, DNA binding is by an adjacent basic domain.

TRANSCRIPTIONAL ACTIVATION DOMAIN

Acidic Domains - acid blobs. Composed of acidic amphipathic alpha helices - may activate transcription by directly interacting with TBP, TFIIB, or TFIIA. May maintain TBP in a productive complex on the TATAA box. See handout for rest of list.

Why do we CARE about this? Turns out that transcription factor defects are common in diseases, especially cancers associated with chromosomal translocations. These translocations disrupt transcription factor genes. Some transcription factor genes are proto-oncogenes, which cause cancer when mutated.

MyoD is an HLH (helix loop helix) protein. You can take a fibroblast, put MyoD in it, and it will change into a muscle cell. MyoD can dimerize w/different proteins. If it dimerizes w/another HLH protein called E2A, the cell turns into muscle. But if it dimerizes w/an inhibitor (Id), then it is turned off, and the cell won't be a muscle cell. This MyoD, therefore, is MAJORLY important in determining muscle development.

E2A can dimerize by itself in B cells. If you lose E2A, you don't have any B cells. GATA-1 - if you lose it, you don't make any RBCs - embryonic lethal defect. Other transcription factors can incorrectly dimerize, causing a leukemia.
You can also have the right protein around, but maybe it isn't activated because it hasn't been phosphorylated for some reason.
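
Going back to the leucine zipper description (a leucine every 7 residues, so they all line up on one face of the helix): a rough sketch of scanning a protein sequence for that spacing. The function name, the 4-in-a-row threshold, and the toy sequence are all assumptions made up for illustration, not anything from the lecture, and a hit only means the spacing is there, not that the protein really forms a zipper.

```python
def has_leucine_heptad(protein: str, min_repeats: int = 4) -> bool:
    """True if leucine (L) appears at positions i, i+7, i+14, ... for
    at least min_repeats consecutive heptads somewhere in the sequence."""
    p = protein.upper()
    span = 7 * (min_repeats - 1)  # distance from first to last leucine checked
    for start in range(len(p) - span):
        if all(p[start + 7 * k] == "L" for k in range(min_repeats)):
            return True
    return False


# Hypothetical toy sequence (one-letter codes), not a real protein:
# leucines at positions 0, 7, 14, 21.
toy = "LAAAAAALAAAAAALAAAAAALAAAAAA"
print(has_leucine_heptad(toy))  # True
```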