ZFS Internals (part #4)
From the MANUAL page: The zdb command is used by support engineers to diagnose failures and gather statistics. Since the ZFS file system is always consistent on disk and is self-repairing, zdb should only be run under the direction by a support engineer.
In this post we will do something simple that nevertheless shows a great feature of ZFS. Once more we will use an ext3 filesystem to understand a ZFS feature…
Here you can see the layout of the ext3 filesystem; we will use the same file for today’s test:
    # mount -oloop fakefs mnt
    # debugfs
    debugfs: show_inode_info bash_completion
    Inode: 12 Type: regular Mode: 0644 Flags: 0x0 Generation: 495766590
    User: 1000 Group: 1000 Size: 216529
    File ACL: 0 Directory ACL: 0
    Links: 1 Blockcount: 432
    Fragment: Address: 0 Number: 0 Size: 0
    ctime: 0x48d4f6a3 -- Sat Sep 20 10:12:03 2008
    atime: 0x48d4f628 -- Sat Sep 20 10:29:36 2008
    mtime: 0x48d4f6a3 -- Sat Sep 20 10:12:03 2008
    BLOCKS:
    (0-11):24638-24649, (IND):24650, (12-52):24651-24691
    TOTAL: 54
The file we will work on is the only one on that filesystem: bash_completion.
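As a sanity check, the numbers in that debugfs output fit together. Here is a minimal sketch in shell arithmetic, assuming the 4096-byte block size used throughout this post and assuming that Blockcount is counted in 512-byte units. The variable names are mine, just for illustration.

```shell
#!/bin/sh
# Values copied from the debugfs output above.
size=216529        # "Size:" field, in bytes
bs=4096            # filesystem block size (assumption stated in the lead-in)

# Data blocks needed for the file, rounded up: logical blocks 0..52.
data_blocks=$(( (size + bs - 1) / bs ))

# ext3 keeps 12 direct pointers in the inode; past that it needs one
# indirect block, which is the "(IND):24650" entry.
total_blocks=$(( data_blocks + 1 ))

# Blockcount, expressed in 512-byte units.
sectors=$(( total_blocks * bs / 512 ))

echo "data blocks:  $data_blocks"
echo "total blocks: $total_blocks"
echo "blockcount:   $sectors"
```

That gives 53 data blocks, 54 blocks in total and a blockcount of 432, which matches the `(0-11)` plus `(12-52)` ranges, `TOTAL: 54` and `Blockcount: 432`.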
So, let’s see the head of that file:
    # bash_completion - programmable completion functions for bash 3.x
    # (backwards compatible with bash 2.05b)
    #
    # $Id: bash_completion,v 1.872 2006/03/01 16:20:18 ianmacd Exp $
    #
    # Copyright (C) Ian Macdonald ian@caliban.org
    #
    # This program is free software; you can redistribute it and/or modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation; either version 2, or (at your option)
Ok, that’s just ten lines of that file, and we have 4096 bytes of data per block of our filesystem. So, let’s umount that filesystem, and read the first block that has the first ten lines (block 24638).
    # umount fakefs
    # fsck.ext3 -v fakefs
    e2fsck 1.40.8 (13-Mar-2008)
    fakefs: clean, 12/25600 files, 1896/25600 blocks
    # dd if=fakefs of=/tmp/out bs=4096 skip=24638 count=1
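To make the `skip=` part less magic: with `bs=4096`, dd simply jumps `skip * bs` bytes into the input before it starts copying. Here is a small sketch that computes that offset for our block and then shows the same mechanism on a tiny scratch file with 4-byte "blocks". The scratch file and the variable names are made up for illustration and are not part of the original session.

```shell
#!/bin/sh
set -eu

bs=4096
block=24638
# Byte offset inside the image where block 24638 begins.
offset=$(( block * bs ))
echo "block $block starts at byte $offset"

# Same idea in miniature: four 4-byte "blocks" in a scratch file.
scratch=$(mktemp)
printf 'AAAABBBBCCCCDDDD' > "$scratch"

# skip=2 jumps over blocks 0 and 1, count=1 copies exactly one block.
third=$(dd if="$scratch" bs=4 skip=2 count=1 2>/dev/null)
echo "block 2 of the scratch file: $third"

rm -f "$scratch"
```

So our block starts at byte 100917248 of the image, and the scratch run prints `CCCC`, the third block.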
Now, imagine some bad guy acting wild…
    # vi /tmp/out

Change the line

    # This program is free software; you can redistribute it and/or modify

to

    # This program isn't free software; you can't redistribute it , modify

ps.: That is the problem with flat files: to change them we need to worry about the size of what we are changing.
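A quick way to check that the two versions of that line really have the same length, so the block stays at 4096 bytes, is to let the shell count the characters. This is just a sketch; the variable names are mine.

```shell
#!/bin/sh
# The original line and the tampered one, as quoted in the post.
old="# This program is free software; you can redistribute it and/or modify"
new="# This program isn't free software; you can't redistribute it , modify"

# ${#var} expands to the number of characters in var (plain ASCII here,
# so characters and bytes are the same thing).
echo "old: ${#old} characters"
echo "new: ${#new} characters"

if [ "${#old}" -eq "${#new}" ]; then
    same_length=yes
else
    same_length=no
fi
echo "same length: $same_length"
```

The extra characters in "isn't" and "can't" are paid for by shortening "and/or" to ",", which is why nothing after that line moves.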
So, after we did that, we can put the block back into the filesystem.
    # dd if=fakefs of=/tmp/fake2 skip=24639 bs=4096
    # dd if=fakefs of=/tmp/fake1 count=24638 bs=4096
    # dd if=/tmp/out of=/tmp/out2 ibs=4096 obs=4095 count=1
    # cat /tmp/out2 >> /tmp/fake1
    # cat /tmp/fake2 >> /tmp/fake1
    # cp -pRf /tmp/fake1 fakefs
    # fsck.ext3 -v fakefs
    e2fsck 1.40.8 (13-Mar-2008)
    fakefs: clean, 12/25600 files, 1896/25600 blocks
    # mount -oloop fakefs mnt/
    # debugfs
    debugfs: open fakefs
    debugfs: show_inode_info bash_completion
    Inode: 12 Type: regular Mode: 0644 Flags: 0x0 Generation: 495766590
    User: 1000 Group: 1000 Size: 216529
    File ACL: 0 Directory ACL: 0
    Links: 1 Blockcount: 432
    Fragment: Address: 0 Number: 0 Size: 0
    ctime: 0x48d4f6a3 -- Sat Sep 20 10:12:03 2008
    atime: 0x48d4fac0 -- Sat Sep 20 10:29:36 2008
    mtime: 0x48d4f6a3 -- Sat Sep 20 10:12:03 2008
    BLOCKS:
    (0-11):24638-24649, (IND):24650, (12-52):24651-24691
    TOTAL: 54
    debugfs: quit
    # head mnt/bash_completion
    # bash_completion - programmable completion functions for bash 3.x
    # (backwards compatible with bash 2.05b)
    #
    # $Id: bash_completion,v 1.872 2006/03/01 16:20:18 ianmacd Exp $
    #
    # Copyright (C) Ian Macdonald ian@caliban.org
    #
    # This program isn't free software; you can't redistribute it , modify
    # it under the terms of the GNU General Public License as published by
    # the Free Software Foundation; either version 2, or (at your option)
    # ls -l mnt/bash_completion
    -rw-r--r-- 1 leal leal 216529 2008-09-20 10:12 bash_completion
ps.: I used dd because I think it is simple, and I am simulating a hard disk with a 100 MB file. But remember that in a real scenario the same thing can be done by rewriting just that one block, or exactly as shown here.
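For the "rewriting just that one block" variant, dd can write in place with `seek=` and `conv=notrunc`. The sketch below runs on a throwaway 8-block file so it is safe to try anywhere; the file names and the block number 5 are stand-ins I made up. It is not what I ran above.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
img="$workdir/scratch.img"   # hypothetical stand-in for fakefs
blk="$workdir/block"         # hypothetical stand-in for /tmp/out
bs=4096
target=5                     # stand-in for block 24638

# An 8-block "disk" full of zeros, and one block full of 'X' bytes.
dd if=/dev/zero of="$img" bs=$bs count=8 2>/dev/null
dd if=/dev/zero bs=$bs count=1 2>/dev/null | tr '\0' 'X' > "$blk"

size_before=$(wc -c < "$img" | tr -d ' ')

# seek= positions on the OUTPUT file; conv=notrunc keeps dd from
# truncating everything that comes after the block being written.
dd if="$blk" of="$img" bs=$bs seek=$target count=1 conv=notrunc 2>/dev/null

size_after=$(wc -c < "$img" | tr -d ' ')

# Read the block back and compare it with what we wrote.
if dd if="$img" bs=$bs skip=$target count=1 2>/dev/null | cmp -s - "$blk"; then
    block_matches=yes
else
    block_matches=no
fi

echo "size before=$size_before after=$size_after block_matches=$block_matches"
rm -rf "$workdir"
```

Against the image from this post the equivalent would presumably be `dd if=/tmp/out of=fakefs bs=4096 seek=24638 count=1 conv=notrunc`, with no need to split the file and glue it back together.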
OK, a silent data “corruption”! A really bad one… and the filesystem does not know anything about it. And don’t forget that all the attributes of that file are identical. We could use a “false” file for days without knowing… What about ZFS? I tell you: in ZFS that is not going to happen! Don’t believe me? Then you will need to wait for the next part… ;-)
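In the meantime, here is a small sketch of why nothing on the ext3 side raised a flag: the size is the same, so only a comparison of the contents can tell the two versions apart. It uses two scratch files holding just the changed line, and `cksum` as one possible way to fingerprint the content. The file names are made up for illustration.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
printf '%s\n' "# This program is free software; you can redistribute it and/or modify" > "$workdir/original"
printf '%s\n' "# This program isn't free software; you can't redistribute it , modify" > "$workdir/tampered"

# What ls -l would show: the byte counts.
size_a=$(wc -c < "$workdir/original" | tr -d ' ')
size_b=$(wc -c < "$workdir/tampered" | tr -d ' ')

# A fingerprint of the content itself (first field of cksum is the CRC).
sum_a=$(cksum < "$workdir/original" | cut -d' ' -f1)
sum_b=$(cksum < "$workdir/tampered" | cut -d' ' -f1)

echo "sizes:     $size_a vs $size_b"
echo "checksums: $sum_a vs $sum_b"

if [ "$size_a" -eq "$size_b" ] && [ "$sum_a" != "$sum_b" ]; then
    verdict="same size, different content"
else
    verdict="unexpected"
fi
echo "$verdict"

rm -rf "$workdir"
```

Of course this only helps if somebody recorded the fingerprint before the bad guy showed up, and nothing in our test did that.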
peace.