Transcription rates of E. coli genes are strongly affected by several factors, including gene dosage, protein occupancy on genes, supercoiling, and interference and competition between genes. Several of these factors have been measured individually. However, because they can vary simultaneously, a comprehensive understanding of how they together affect transcription at specific sites in the genome has not been achieved. In particular, it remains unclear whether genomic position plays a role in determining transcription levels.
To study whether the position of a gene in the E. coli genome affects its transcription, we designed a transposon assay that randomly inserts a barcode together with a reporter gene into the E. coli genome. When the reporter gene is expressed, its transcript continues into the adjacent genomic region, which allows the insertion site to be determined. RNA-seq is then used to read out the expression level of the reporter gene at each insertion and, through the unique barcode of each insertion, to assign that expression level to its genomic location. In this way, we profile transcription levels at more than 100,000 random positions in the E. coli genome. We then use these position-resolved data to investigate factors that may influence transcription levels.
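The barcode-based mapping step described above can be sketched as follows. This is only an illustration under stated assumptions, not the pipeline used in this work: the function name `expression_by_position`, the counts-per-million normalization, and the pseudocount are all hypothetical choices, and each barcode is assumed to have already been assigned a single insertion coordinate.

```python
import math
from collections import Counter


def expression_by_position(barcode_reads, barcode_to_position, pseudocount=1.0):
    """Return {genomic position: log2(counts per million + pseudocount)}.

    barcode_reads: iterable of barcode strings, one per RNA-seq read
        (hypothetical input format).
    barcode_to_position: dict mapping each barcode to its insertion
        coordinate (assumed to be determined beforehand).
    """
    # Count how many reads carry each barcode.
    counts = Counter(barcode_reads)

    # Keep only barcodes whose insertion site is known.
    mapped = {bc: n for bc, n in counts.items() if bc in barcode_to_position}
    total = sum(mapped.values())
    if total == 0:
        return {}

    profile = {}
    for bc, n in mapped.items():
        # Normalize to counts per million mapped reads (an assumed choice).
        cpm = n / total * 1e6
        # Assumes one barcode per position; a later barcode at the same
        # position would overwrite an earlier one.
        profile[barcode_to_position[bc]] = math.log2(cpm + pseudocount)
    return profile


# Toy usage with made-up barcodes and coordinates.
reads = ["AAA", "AAA", "CCC", "GGG"]
positions = {"AAA": 100, "CCC": 2000}
profile = expression_by_position(reads, positions)
```

In this toy input the barcode `GGG` has no known insertion site, so its read is dropped before normalization.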
We find that transcription levels across the genome display an overall periodic pattern, with peaks near the middle of each chromosomal macrodomain and generally reduced levels in the Ter macrodomain. We have also analyzed the correlation between transcription levels and genomic features such as supercoiling rate, transcriptionally silent Extended Protein Occupancy Domains (tsEPODs), RNA polymerase occupancy, and the occupancies of various histone-like proteins. Preliminary data show that overall protein occupancy, H-NS binding, and Fis binding are highly correlated with transcription levels at the genome scale.
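A genome-scale correlation of this kind could be computed along the following lines. This is a minimal sketch under assumptions: the fixed-width binning, the use of the Pearson coefficient, and the names `bin_means`, `pearson`, and `correlate_with_feature` are illustrative and are not taken from the analysis itself. The numbers in the usage example are made up.

```python
import math
from collections import defaultdict


def bin_means(profile, bin_size):
    """Average expression values of insertions falling in each fixed-width bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for position, value in profile.items():
        b = position // bin_size
        sums[b] += value
        counts[b] += 1
    return {b: sums[b] / counts[b] for b in sums}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlate_with_feature(profile, feature_by_bin, bin_size):
    """Correlate binned expression with a per-bin feature track.

    feature_by_bin: dict {bin index: feature value}, e.g. a protein
        occupancy signal averaged over the same bins (hypothetical input).
    """
    means = bin_means(profile, bin_size)
    # Only bins present in both tracks can be compared.
    shared = sorted(set(means) & set(feature_by_bin))
    return pearson([means[b] for b in shared],
                   [feature_by_bin[b] for b in shared])


# Toy usage: expression per position and a made-up feature track.
toy_profile = {10: 1.0, 20: 3.0, 1500: 4.0, 2500: 6.0}
toy_feature = {0: 0.9, 1: 0.5, 2: 0.1}
r = correlate_with_feature(toy_profile, toy_feature, bin_size=1000)
```

A rank-based coefficient such as Spearman's could be substituted for `pearson` if the relationship is expected to be monotonic rather than linear.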