Ben Johnson - by Wikipedia

That was the first time I heard the word "restore", and I remember we had a good discussion about exactly that in school. Well, I was a kid learning English through song lyrics (now you know why my English isn't great ;-) I heard it in an ad from a sugar company here in Brazil, with Ben Johnson as the protagonist saying the line: "I restore my energy with sugar." That was before Ben's drama, while his image was still a "winning" one. "Restore" has a tricky meaning in Brazilian Portuguese…
Anyway, this post is about ZFS, and… restore. How to restore a whole server… It does not need to be as fast as Ben Johnson was (with or without drugs ;-), but it needs to be fast. The backup procedure has been the same for many, many years, and transferring the data over the wire, then restoring from it, is not a solution at all. That was the solution in the old days, because the backup media was tape, so the "put/get" procedure was needed.
Now we can make backups directly on disks. Following the ZFS discussion list, we see that many users do exactly that. So why restore at all? Why not just use the copy?
How can we have an acceptable MTTR when restoring a 4 TB backup? Or a 12 TB one? We all know stories about companies with days of downtime, just doing a restore. And doing a restore can lead to yet another problem, because like any process, it can go wrong.
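To put rough numbers on that MTTR question, here is a back-of-envelope calculation, assuming an ideal, fully saturated gigabit link (real-world throughput will be worse, so treat this as a lower bound):

```shell
# Restoring 4 TB over gigabit Ethernet at its theoretical 125 MB/s:
BYTES=$((4 * 1000000000000))   # 4 TB in bytes (decimal)
RATE=$((125 * 1000000))        # 1 Gbit/s = 125 MB/s, best case
SECS=$((BYTES / RATE))
echo "$((SECS / 3600)) hours minimum"   # prints "8 hours minimum"
# For a 12 TB backup, triple it: more than a day of downtime at wire speed.
```

And that is before counting tape seeks, filesystem overhead, or verification passes.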
So, my point is that I think the only way to get an acceptable MTTR is to replicate the data to another system and use it in case of chaos. Look, I'm talking about a real crash! The case where the filesystem for some reason is unusable, corrupted by a bug or something else. But is a 1:1 procedure the only, or really the best, solution? What if we get another crash? What about having a backup server that holds data for more than one machine, and using it in case of a crash? Depending on the RAID level used on the backup server, we could use its disks to generate a clone…
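A replication setup like that can be sketched with ZFS snapshots and send/receive. The pool, dataset, and host names below (`tank`, `backuphost`, `backup/host1`) are made up for illustration:

```shell
# On the production host: take a recursive snapshot of the pool.
zfs snapshot -r tank@nightly

# Stream it to the backup server over SSH; -R sends the snapshot with
# all descendant datasets and properties, -u keeps it unmounted there.
zfs send -R tank@nightly | ssh backuphost zfs receive -Fdu backup/host1

# Later snapshots go incrementally, so only the changed blocks cross the wire:
zfs snapshot -r tank@nightly2
zfs send -R -i tank@nightly tank@nightly2 | ssh backuphost zfs receive -Fdu backup/host1

# On the backup server, a writable clone of a received snapshot can stand in
# for the crashed machine without touching the replicated copy itself:
ssh backuphost zfs clone backup/host1@nightly2 backup/host1-recovery
```

With this scheme the "restore" is just promoting data that is already on disks on the other side, instead of pushing terabytes back over the network.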
But the fact is that hard disks are still a big problem in our world. Even a gigabit Ethernet network is too much for a mid-range server: a storage box with a moderate number of files cannot transfer terabytes of data in a sustained way. Well, what do YOU think?
peace