This directory contains real start codons and faked start codons. The
faked ATG sites were selected from a window of +/- 100bp around the
actual start codon.

This data is from:
ftp://www-hgc.lbl.gov/pub/genesets/Drosophila/multi_exon_GB.sets

This data set was created to build different splice site models.

ATGreal_100_100.fa.gz     start codon sites
ATGfaked_100_100.fa.gz    faked "ATG" sites

Both start codon data sets include 100bp of the upstream region and
100bp of the downstream region.

===================
Martin Reese, 26apr99
mgreese@lbl.gov
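
Reading the files (illustrative sketch, not part of the original data
release): the Python below shows one way to read the gzipped FASTA files
and count how many records carry "ATG" right after the 100bp upstream
region. The function names and the 0-based offset of 100 are assumptions
made for this example; the exact record layout is not documented here,
so check a few sequences by hand before relying on the offset.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def has_atg_at(seq, offset=100):
    """Return True if 'ATG' starts at the 0-based offset (assumed layout)."""
    return seq[offset:offset + 3].upper() == "ATG"


def summarize(path, offset=100):
    """Return (record count, records with ATG at the assumed offset)."""
    total = hits = 0
    with gzip.open(path, "rt") as handle:
        for _, seq in read_fasta(handle):
            total += 1
            hits += has_atg_at(seq, offset)
    return total, hits
```

For example, summarize("ATGreal_100_100.fa.gz") returns the number of
records in the file and how many of them have ATG at the assumed
position. If the two numbers differ, the offset assumption is probably
wrong for this data.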