Genome

Genome Project

Project Objectives:

1) Use the Human Genome Sequence together with the known human protein sequences to accurately locate, within the human genome, the nucleotide sequences that correspond to those proteins.
2) Once the nucleotide sequences have been identified, derive accurate consensus sequences and statistical information from them.
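
As a rough illustration of what a consensus sequence means in objective 2, the sketch below takes a set of already aligned, equal-length nucleotide sequences and reports the most frequent base in each column along with its frequency. The function name `consensus` and the example sequences are made up for illustration; the project's actual procedure is not described on this page.

```python
from collections import Counter


def consensus(aligned):
    """Return (consensus string, per-column frequency of the winning base).

    Assumes the sequences are already aligned and of equal length
    (an assumption of this sketch, not a statement about the project).
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must all have the same length")

    bases = []
    freqs = []
    for column in zip(*aligned):
        # most_common(1) gives the single most frequent base in this column
        base, count = Counter(column).most_common(1)[0]
        bases.append(base)
        freqs.append(count / len(aligned))
    return "".join(bases), freqs


# Hypothetical example sequences, invented for this sketch.
example = ["GTAAGT", "GTGAGT", "GTAAGA"]
seq, freqs = consensus(example)
print(seq)
print([round(f, 2) for f in freqs])
```

The per-column frequencies are one simple form of the "statistical information" mentioned above: a column with frequency 1.0 is fully conserved in the sample, while lower values show where the sequences disagree.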

Methodology:
The project uses the protein databases from NCBI and EMBL together with the Human Genome Sequence (HGS), and proceeds in three stages:
Stage 1: Locate all nucleotide sequences within the HGS by reverse mapping protein sequences to their locations in the genome. By definition, the hypothetical protein sequences are not used to identify consensus sequences.
Stage 2: Analyze the nucleotide sequences to determine accurate consensus sequences and statistics.
Stage 3: Using the information from Stage 2, scan the genome for previously unidentified genes.
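
One simple way to picture the reverse mapping in Stage 1 is a six-frame translation search: translate both strands of a stretch of genome in all three reading frames and look for the protein as an exact substring. This is only a sketch under strong assumptions (exact matches, no introns, the standard genetic code). Human genes are usually split across exons, so a real mapping would need a splice-aware approach. The names `translate`, `reverse_complement`, and `locate_protein` are hypothetical and not taken from the project.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate DNA codon by codon; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def locate_protein(genome, protein):
    """Return (strand, start, end) hits, 0-based half-open on the forward strand.

    Sketch only: finds exact, contiguous (intron-free) matches.
    """
    if not protein:
        raise ValueError("protein sequence must not be empty")
    n = len(genome)
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            peptide = translate(seq[frame:])
            pos = peptide.find(protein)
            while pos != -1:
                start = frame + 3 * pos
                end = start + 3 * len(protein)
                if strand == "-":
                    # convert reverse-strand coordinates back to the forward strand
                    start, end = n - end, n - start
                hits.append((strand, start, end))
                pos = peptide.find(protein, pos + 1)
    return hits


# Hypothetical toy sequence: "MAK" is encoded by ATG GCT AAA starting at offset 2.
toy_genome = "CCATGGCTAAATGAGG"
print(locate_protein(toy_genome, "MAK"))
print(locate_protein(reverse_complement(toy_genome), "MAK"))
```

Because a protein can match at more than one location, a search like this naturally produces both a count of distinct proteins found and a larger count of total matches, which is consistent with the Unique and Non-Unique columns reported in the database status.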

Current Status: Stage 1, as of 7/5/2009 11:20:49 AM
Database Status:
 

Chromosome  Unique  Non-Unique  Non-Hypothetical  Hypothetical  Finished on
CHR1          2734        4928              1997          2931  11/3/2002 10:46:13 AM
CHR10         1624        2660              1434          1226  10/31/2002 9:25:43 PM
CHR11         1784        3142              1379          1763  10/30/2002 11:47:36 PM
CHR12         1436        2466               913          1553  10/30/2002 1:23:36 AM
CHR13         1014        2914              1591          1323  10/29/2002 4:02:29 AM
CHR14         1030        1754               725          1029  10/28/2002 12:18:36 AM
CHR15          795        1756               817           939  10/27/2002 10:09:25 AM
CHR16         1173        1971               957          1014  10/26/2002 11:10:10 AM
CHR17         1284        1807               818           989  10/25/2002 7:48:14 PM
CHR18          887        1484               731           753  10/25/2002 7:40:47 AM
CHR19         1179        1601               657           944  10/24/2002 4:26:58 AM
CHR2          2362        4109              1821          2288  10/23/2002 7:04:59 PM
CHR20          749        1061               491           570  10/22/2002 6:18:41 AM
CHR21          403         562               284           278  10/21/2002 3:34:15 PM
CHR22          603         840               412           428  10/20/2002 11:08:11 PM
CHR3          1782        3504              1373          2131  10/21/2002 12:51:14 AM
CHR4          1515        3093              1191          1902  10/19/2002 7:02:40 AM
CHR5          1697        3316              1415          1901  10/18/2002 3:33:35 AM
CHR6          1980        3702              1748          1954  10/17/2002 12:40:45 AM
CHR7          1784        3142              1640          1502  10/15/2002 9:50:08 PM
CHR8          1453        2707              1242          1465  10/14/2002 9:33:31 PM
CHR9          1335        2302              1024          1278  10/14/2002 12:30:48 AM
CHRX          1325        3248              1056          2192  10/13/2002 7:03:29 AM
CHRY           265         821               234           587  10/12/2002 1:55:25 PM

Genome Wide Status:
Total unique proteins found: 28507 of 38050 (74%)
Total protein sequences found: 58890, including duplicates
Total hypothetical sequences: 32940, including duplicates
Total non-hypothetical sequences: 25950, including duplicates
Total unique hypothetical sequences: 17605 of 23440 (75%)
Total unique non-hypothetical sequences: 10902 of 14610 (74%)
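
The genome-wide totals can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures listed above; the variable names are illustrative, and the use of integer truncation for the percentages is an assumption that happens to reproduce the reported values.

```python
# Figures copied from the genome-wide status above.
unique_found, unique_total = 28507, 38050
hypo_unique_found, hypo_unique_total = 17605, 23440
nonhypo_unique_found, nonhypo_unique_total = 10902, 14610
all_found = 58890        # all matches, including duplicates
hypo_found = 32940       # hypothetical matches, including duplicates
nonhypo_found = 25950    # non-hypothetical matches, including duplicates


def percent(found, total):
    """Whole-number percentage, truncated (assumed, not documented)."""
    return 100 * found // total


checks = {
    "hypothetical + non-hypothetical = all matches":
        hypo_found + nonhypo_found == all_found,
    "unique hypothetical + unique non-hypothetical = unique found":
        hypo_unique_found + nonhypo_unique_found == unique_found,
    "hypothetical + non-hypothetical database sizes = database total":
        hypo_unique_total + nonhypo_unique_total == unique_total,
}

for label, ok in checks.items():
    print(f"{label}: {'OK' if ok else 'MISMATCH'}")

print("unique proteins found:", percent(unique_found, unique_total), "%")
print("unique hypothetical:", percent(hypo_unique_found, hypo_unique_total), "%")
print("unique non-hypothetical:", percent(nonhypo_unique_found, nonhypo_unique_total), "%")
```

All three sums are consistent with each other, and truncating rather than rounding explains why 28507 of 38050 (about 74.9%) is reported as 74%.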