Sister Jennifer wrote:
Chilli, that's a funny pic, but what is it meant to be?
I haven't the foggiest idea
Just made me smile
GB, rather. I just checked and the human genome has roughly 3.3 billion base pairs. Since you can store a DNA sequence with just the 4 letters ATGC, and you only need one of the two strands to have all the information, that makes roughly 1.6 billion bytes of data at one byte per letter = 1.6 GB. But since you only need 4 symbols, even that isn't accurate: you could store those 4 letters in just 2 bits (00 = A, 10 = T, 01 = G, 11 = C), so one byte can encode 4 bases. That makes it 400 MB of data. Since parts of the DNA are highly redundant, you could probably get that down to around 75 MB with a compression algorithm.
Gotta check if that is correct.
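To make the 2-bit idea concrete, here is a minimal sketch in Python of packing four bases into one byte, using the mapping from the post (00 = A, 10 = T, 01 = G, 11 = C). The function names pack and unpack are made up for illustration, and padding a short final chunk with zero bits is an assumption of this sketch, not something any real genome format is claimed to do.

```python
# 2-bit codes as given in the post: 00 = A, 10 = T, 01 = G, 11 = C.
CODE = {"A": 0b00, "T": 0b10, "G": 0b01, "C": 0b11}
BASE = {bits: letter for letter, bits in CODE.items()}


def pack(seq: str) -> bytes:
    """Pack a string of A/T/G/C into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            # Shift the previous bases up and append the next 2-bit code.
            byte = (byte << 2) | CODE[base]
        # Assumption: a short final chunk is left-aligned and zero-padded.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Reverse pack(); n_bases is needed to drop the padding at the end."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


if __name__ == "__main__":
    packed = pack("ATGCATGC")
    print(len(packed), "bytes for 8 bases")
    print(unpack(packed, 8))
```

Because the packed form does not record its own length, the number of bases has to be stored separately, which is a negligible overhead at genome scale.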
Edit: Minor correction: Dividing by 2 isn't allowed here, since a base pair already counts as a single symbol: one strand determines the other, so each of the 3.3 billion pairs takes 2 bits. So that makes around 800 MB for the whole genome. The coding part of that is about 3%, which amounts to about 24 MB, but since we don't know what the other approximately 97% of the DNA does, it's probably safer not to ditch it. Otherwise evolution would probably have taken care of that already.
Edit II: Oh, and if we're talking about the complete genome, not the one in sperm or egg cells, then we have to multiply by 2, since all other cells are diploid (they have 2 complete sets of DNA, one from the father and one from the mother).
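For anyone who wants to redo the arithmetic, here is a rough sketch in Python using only the figures from this post (3.3 billion base pairs, 2 bits per base pair, about 3% coding). The constant and function names are made up for illustration, and MB is taken to mean 10^6 bytes.

```python
BASE_PAIRS = 3_300_000_000   # haploid human genome, figure from the post
BITS_PER_BASE_PAIR = 2       # four symbols (A, T, G, C) fit in 2 bits
CODING_FRACTION = 0.03       # "about 3%" as stated in the post


def size_mb(base_pairs: int, bits_per_pair: int = BITS_PER_BASE_PAIR) -> float:
    """Uncompressed size in MB (10^6 bytes) at the given bits per base pair."""
    return base_pairs * bits_per_pair / 8 / 1_000_000


haploid_mb = size_mb(BASE_PAIRS)            # one copy, as in sperm or egg cells
coding_mb = haploid_mb * CODING_FRACTION    # the coding part only
diploid_mb = 2 * haploid_mb                 # both copies, as in other cells

if __name__ == "__main__":
    print(f"haploid: {haploid_mb:.0f} MB")
    print(f"coding part: {coding_mb:.2f} MB")
    print(f"diploid: {diploid_mb:.0f} MB")
```

That comes to roughly 825 MB for one copy, about 25 MB for the coding part, and about 1.65 GB for a diploid cell, all before any compression.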