Wednesday, April 13, 2005

more gmail

I got this email from a buddy:
    On the Gmail login page they now have a continuously increasing, running counter of the max storage space allocated to each account. I've seen running counts like this for "interest paid" on savings accounts at a given bank, which is straightforward since the number simply grows as a function of the interest rate.

    What could Google be doing that allows the storage space to continue growing at a small but constant rate? When they recently increased from 1 GB to 2 GB I saw some vaguely worded press release that basically said "we have better technology/methods" for storing information rather than "we bought a bunch more servers."


That's a great question. I hadn't even thought about that. I kind of just assumed that it was a marketing gimmick implying that they are adding capacity. I thought maybe they're doing it at a predictable enough rate that they can just put a counter there for advertising, kind of like how McDonald's has the 'number of burgers served' counter.
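If the counter really is just advertising on top of a predictable growth rate, it could be as simple as the sketch below. Everything here is an assumption for illustration: the function name, the starting quota, and the rate of 3 MB per day are made up, not anything Google has published.

```python
from datetime import datetime, timedelta


def advertised_quota_mb(now, start_time, start_mb, mb_per_day):
    """Linear counter: the quota grows at a constant rate from a known start.

    All parameters are hypothetical; nothing here reflects Gmail's real numbers.
    """
    elapsed_days = (now - start_time).total_seconds() / 86400
    # Before the start time the counter just shows the starting value.
    return start_mb + mb_per_day * max(elapsed_days, 0.0)


# Made-up example: start at 2048 MB and add 3 MB per day.
start = datetime(2005, 4, 1)
now = start + timedelta(days=10)
print(advertised_quota_mb(now, start, 2048.0, 3.0))
```

A counter like this needs no knowledge of the actual disks at all, which is exactly why it says so little about what is happening behind it.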

'Compression' would be an obvious but incomplete answer. Compression of historical data is actually tricky. You face this speed/space tradeoff when you compress, but just saying that you'll compress 'old' data doesn't work because data often goes through cycles of being relevant (think of your bank statements where, at tax time, you may need to go back one year in time). Maybe they've figured out better ways of predicting what data you're interested in, so that they can afford to compress some data a lot, some data a little, some data not at all.
Mail is also a huge revenue generator for them and for people placing ads. Maybe they're good enough at managing their system that the incremental cost of adding storage is less than the predicted incremental ad revenue they're getting (but that answer still boils down to 'more servers').

What else? If you owned all of the email servers in the entire world, maybe you could afford to store a single copy of every email message and just show it to all the people who are meant to receive it. I doubt Google has reached that critical mass yet. But maybe they do this with large attachments (since I figure you're only going to send a 300 MB file to someone with a Gmail account). That means they're actually storing less data. Actually, come to think of it, I think this is quite likely, since the Google File System would probably make it very easy to do that.
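Here is a rough sketch of that single-copy idea, keyed by a hash of the attachment's contents. The class and method names are hypothetical, and this is only a guess at the general technique, not a description of how Gmail or the Google File System actually works.

```python
import hashlib


class SingleInstanceStore:
    """Toy single-copy attachment store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blobs = {}      # content digest -> attachment bytes, stored once
        self._mailboxes = {}  # user -> list of digests that user can see

    def deliver(self, recipients, attachment):
        digest = hashlib.sha256(attachment).hexdigest()
        # Only the first delivery of a given attachment stores its bytes.
        self._blobs.setdefault(digest, attachment)
        for user in recipients:
            self._mailboxes.setdefault(user, []).append(digest)
        return digest

    def logical_bytes(self):
        # What the users collectively think they are storing.
        return sum(
            len(self._blobs[d])
            for digests in self._mailboxes.values()
            for d in digests
        )

    def physical_bytes(self):
        # What the service actually keeps on disk.
        return sum(len(blob) for blob in self._blobs.values())


store = SingleInstanceStore()
store.deliver(["alice", "bob", "carol"], b"x" * 1000)
print(store.logical_bytes(), store.physical_bytes())
```

The gap between the two totals is the space the service gets back, and it grows with every extra recipient of the same file.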
    What could Google be doing that allows the storage space to continue growing at a small but constant rate?

None of my answers addresses whether the counter means anything more than a general prediction of their overall capacity growth (e.g. 200 GB worth of disk per day) combined with the average account usage. For example, if they have 1 TB of capacity and 100 users, they could actually advertise an account limit of more than 10 GB if the average usage is much less than the limit, just as long as they're constantly forecasting their demand. That's oversubscription, sometimes loosely called overprovisioning.
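The arithmetic behind that is short enough to sketch. The function name and the 25% average fill are assumptions picked for the example; only the 1 TB and 100 users come from the paragraph above (taking 1 TB as 1000 GB).

```python
def max_advertised_quota_gb(capacity_gb, users, avg_fill_fraction):
    """Largest per-account limit whose *expected* usage still fits on disk.

    avg_fill_fraction is the (assumed) average share of its quota that an
    account actually uses, e.g. 0.25 means accounts are a quarter full.
    """
    if users <= 0:
        raise ValueError("users must be positive")
    if not 0 < avg_fill_fraction <= 1:
        raise ValueError("avg_fill_fraction must be in (0, 1]")
    return capacity_gb / (users * avg_fill_fraction)


# 1 TB shared by 100 users: 10 GB each if every account were full...
print(max_advertised_quota_gb(1000, 100, 1.0))
# ...but a larger limit if accounts are only a quarter full on average.
print(max_advertised_quota_gb(1000, 100, 0.25))
```

This only covers the average case. A real service would also need headroom for usage spikes, which is where the constant demand forecasting comes in.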

What do you think? What is the significance (if any) of the counter?

Thanks, Sood, for a great question to start my day.
