"The durability of your volume depends both on the size of your volume and the p...

bestes · on Aug 21, 2008

I'm really impressed they even acknowledge the possibility of failures instead of just touting availability and reliability stats.

Also, 10 times more reliable than a hard drive is pretty good. For me, the takeaway is that I should treat this as a hard drive and NOT as a magical solution that allows me to ignore everything I have learned about systems administration.

Babysitting servers at 2AM really drives home lessons for me, and the urge to simply declare those problems moot is strong, but giving in to it would be a mistake.

river_styx · on Aug 21, 2008

I don't see how this is different from the risks associated with any other storage service. Just be sure to have a proper backup strategy/maintenance plan.

khangtoh · on Aug 21, 2008

Why is this unacceptable? You do realize that there is no 0% failure storage in existence.

mdasen · on Aug 22, 2008

It's unacceptable to the poster because they don't want to have to deal with the realities of computing infrastructure. They want a turn-key, no-hassle, no-think solution.

Which is fine to want.

The EBS failure rate is 0.1-0.5% annually. That's awesome. At 0.1%, the mean time to failure works out to about 1,000 years. At 0.5%, it is about 200 years. At either point, I'm more than comfortable. I can snapshot the drive daily/weekly/monthly, and when you combine the snapshots as a backup with the likelihood of failure in any given year being so low, I can rest easy.
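As a rough sanity check on those figures, here is a minimal sketch that treats each year as an independent trial with a fixed failure probability. That constant-rate model is an assumption, not something the EBS announcement states, and the function names are made up for illustration.

```python
def mean_years_to_failure(annual_failure_rate: float) -> float:
    """Expected years until the first failure, assuming the same
    independent failure probability every year (geometric model)."""
    return 1.0 / annual_failure_rate


def survival_probability(annual_failure_rate: float, years: int) -> float:
    """Probability that the volume survives the given number of years."""
    return (1.0 - annual_failure_rate) ** years


if __name__ == "__main__":
    for rate in (0.001, 0.005):  # the 0.1% and 0.5% figures quoted above
        print(
            f"annual rate {rate:.1%}: "
            f"mean time to failure ~{mean_years_to_failure(rate):.0f} years, "
            f"10-year survival {survival_probability(rate, 10):.2%}"
        )
```

The 10-year survival number is arguably the more useful one for planning, since nobody keeps a volume for centuries.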

Compare that with commercial drives, which are likely to fail in a decade, give or take... Well, I know where I'd rather have my data.

Getting back to the original poster, (s)he wants a system that takes care of the snapshots without intervention. So, package up EBS along with auto-snapshotting with an SLA saying that you won't lose more than a day of data and you have a business. You charge a premium because unlike EBS, you're reliable. All the while, you are just EBS with S3 - something that the individual could do themselves. EBS is unlikely to ever fail for your clients since I'm guessing no one is going to have a client for 100 years and even if that happens, you just restore one of the daily snapshots. Nice! You've lived up to your SLA while getting paid!

Think about it, the backups aren't where the cost is. 1000 PUT requests with 4MB chunks means 4GB is backed up for a penny there in terms of the number of requests. Data transfer from EC2 to S3 is free. So, those aren't the costs. As long as you can consolidate the delta backups so you aren't storing a version for every day since inception, I don't see why this service couldn't be offered for 2x the cost of EBS.
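The request-cost arithmetic above can be sketched as follows. The price of one cent per 1000 PUT requests and the 4MB chunk size are taken from the comment itself and treated as assumptions here. The helper name is hypothetical, and S3 storage charges are deliberately left out.

```python
import math

PUT_PRICE_PER_1000_USD = 0.01  # assumed: the per-request price quoted in the comment
CHUNK_MB = 4                   # assumed: chunk size quoted in the comment


def put_request_cost(volume_gb: float,
                     chunk_mb: int = CHUNK_MB,
                     price_per_1000: float = PUT_PRICE_PER_1000_USD) -> float:
    """Request cost in USD of uploading volume_gb in chunk_mb pieces.

    Uses 1 GB = 1000 MB to match the back-of-envelope math in the thread.
    Covers PUT requests only, not storage or transfer.
    """
    chunks = math.ceil(volume_gb * 1000 / chunk_mb)
    return chunks / 1000 * price_per_1000


if __name__ == "__main__":
    for size_gb in (4, 100, 1000):
        print(f"{size_gb} GB -> ${put_request_cost(size_gb):.2f} in PUT requests")
```

Even a full 1000 GB upload comes to a few dollars in requests under these assumptions, which supports the point that the backups themselves aren't the expensive part.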

patrickg-zill · on Aug 21, 2008

I realize there is no 0% failure, 100% guaranteed storage.

However, saying "yep, your storage space is now dead and you have lost all your files stored there" is not something I feel comfortable with. Why not offer something more resilient?

Tichy · on Aug 21, 2008

Because it would get more expensive?

Probably you could combine several of these "drives" for more resilience?

jrockway · on Aug 21, 2008

"Why not offer something more resilient?"

Why force that added expense on people that don't want it? Some people might be using the disk as temporary storage, or something.

Anyway, you can easily back up to S3... so everyone gets what they want.

evgen · on Aug 21, 2008

What magic storage device have you been using that has a failure rate better than the industry average for disk drives? If you are really concerned about this issue you can just run a software RAID over these raw block devices.

redorb · on Aug 21, 2008

Yeah, I was thinking you can stack these things and almost get 99.99999% (with around 3-4 of them). Of course that makes the cost more, but not 4x.
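A quick way to check that estimate is to assume the mirrored volumes fail independently and that all data is lost only if every copy fails in the same year. Both are simplifying assumptions: correlated failures would make things worse, and replacing a failed copy promptly would make them better. The function names are illustrative only.

```python
import math


def mirrored_annual_loss(p_single: float, copies: int) -> float:
    """Probability that every mirrored copy fails in the same year,
    assuming independent failures and no rebuild of failed copies."""
    return p_single ** copies


def nines(p_loss: float) -> float:
    """Durability expressed as a count of nines (7.0 means 99.99999%)."""
    return -math.log10(p_loss)


if __name__ == "__main__":
    for p in (0.001, 0.005):  # the quoted 0.1% and 0.5% annual failure rates
        for copies in (1, 2, 3, 4):
            loss = mirrored_annual_loss(p, copies)
            print(f"p={p:.1%} copies={copies}: "
                  f"annual loss {loss:.3g} (~{nines(loss):.1f} nines)")
```

Under these assumptions, three copies at the pessimistic 0.5% rate already land near seven nines.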

newt0311 · on Aug 21, 2008

Stacking would be the wrong way to go about it. With ZFS, you could still get 99.99999...% but with only a slight percentage increase in necessary storage space.

rcoder · on Aug 21, 2008

I don't think you read closely enough: they define the failure rate as describing the likelihood of complete loss of the volume, not minor data corruption.

Unless you're mirroring your data across multiple drives, there is no way ZFS can magically recreate a volume that simply ceases to exist. Think of it this way: how would ZFS help you in your own server room if a non-mirrored disk caught fire and melted? Answer: it wouldn't. You'd go to backups, like you would with any other filesystem.

jrockway · on Aug 21, 2008

"Answer: it wouldn't. You'd go to backups, like you would with any other filesystem."

Nuh-uh! ZFS is magical and not bounded by the "laws of reality". After all, Apple said they liked it!

evgen · on Aug 21, 2008

You will also pay for increased I/O transaction costs. TANSTAAFL.

newt0311 · on Aug 21, 2008

It is completely unacceptable for a normal FS. Something a bit more advanced like ZFS may be able to cope. What I am wondering about is why Amazon didn't implement such a solution on a massive scale themselves and then run EBS on top of it.

evgen · on Aug 21, 2008

Because some users don't need reliability beyond what a normal hard drive offers, and they really shouldn't be compelled to pay for it. If you need reliability on these devices you can run software RAID or ZFS over the raw block devices being offered and tune the cost/reliability equation in whatever manner makes the most sense for your application.