Genetics 153:179-219 (September 1999)

An Exploration of the Sequence of a 2.9-Mb Region of the Genome of Drosophila melanogaster

M. Ashburner, S. Misra, J. Roote, S. Lewis, R. Blazej, T. Davis, C. Doyle, R. Galle, R. George, N. Harris, G. Hartzell, D. Harvey, L. Hong, K. Houston, R. Hoskins, G. Johnson, C. Martin, A. Moshrefi, M. Palazzolo, M. Reese, A. Spradling, G. Tsang, K. Wan, K. Whitelaw, B. Kimmel, S. Celniker and G.M. Rubin.

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species.
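The density and percentage figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a region length of exactly 2,900,000 bp, taken from the "2.9-Mb" in the title; the precise length is not stated here, so the results are approximate. The function and variable names are illustrative only.

```python
# Assumption: region length rounded to 2.9 Mb, as given in the title.
REGION_BP = 2_900_000

# Counts quoted in the abstract.
PROTEIN_CODING_GENES = 218
TRANSPOSABLE_ELEMENTS = 17
GENES_MATCHING_EST = 95
GENES_WITH_SIMILARITY = 144


def kb_per_feature(region_bp: int, count: int) -> float:
    """Average spacing, in kb, between features spread over the region."""
    return region_bp / count / 1000


def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole


gene_spacing_kb = kb_per_feature(REGION_BP, PROTEIN_CODING_GENES)
te_spacing_kb = kb_per_feature(REGION_BP, TRANSPOSABLE_ELEMENTS)
est_pct = percent(GENES_MATCHING_EST, PROTEIN_CODING_GENES)
similarity_pct = percent(GENES_WITH_SIMILARITY, PROTEIN_CODING_GENES)

print(f"one protein-coding gene every {gene_spacing_kb:.1f} kb")
print(f"one transposable element every {te_spacing_kb:.1f} kb")
print(f"genes matching a Drosophila EST: {est_pct:.1f}%")
print(f"genes with similarity to other organisms: {similarity_pct:.1f}%")
```

Rounded to whole numbers, these values agree with the 13 kb, 171 kb, 44%, and 66% reported above.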

Adh Sequence Data

Note: The following sequence files are very large. To save one to disk without displaying it in your browser, position your mouse over the link and press the right mouse button to open a menu. In Netscape Communicator, choose the "Save Link As" command; in Internet Explorer, choose "Download Link to Disk".

Java applet for interactive browsing of the Adh annotations

If your Web browser supports Java 1.1, you can browse the Adh annotations in our graphical annotated contig viewer, Ribbon. The ribbon for Adh takes several minutes to load because it is so large. When it comes up, you will see the entire 3-Mb Adh region represented as colored bars on both sides of the central axis. The purple bars represent the genomic clones that were sequenced. Green bars represent annotated genes, while blue bars under the green bars are supporting homologies. Turquoise triangles mark P-element insertions, and mustard-yellow bars represent transposons.

The Ribbon applet is a work in progress; we are continually debugging and improving it. Please check back periodically for updated versions. Ribbons for the rest of the Drosophila genome will soon be made available.

GASP1: The Genome Annotation Assessment Project

Members of the BDGP recently organized the first Genome Annotation Assessment Project (GASP1) to evaluate the state of the art in genome annotation. GASP1 participants were asked to submit annotations of the Adh region, which at that time was still unpublished, allowing a "blind" test.