ZFS: Trivia
zdb output for one sparse file:
    Object  lvl   iblk   dblk  lsize  asize  type
       818    4    16K   128K  40.0G  27.3G  ZFS plain file (K=inherit) (Z=inherit)
                                 264  bonus  ZFS znode
        path    /some/sparsefile
        uid     0
        gid     0
        atime   Sun Mar 29 15:23:52 2009
        mtime   Sat Mar 28 20:19:15 2009
        ctime   Sun Mar 29 15:23:52 2009
        crtime  Sun Mar 29 14:48:29 2009
        gen     438161
        mode    100600
        size    42943381504
        parent  202
        links   1
        xattr   0
        rdev    0x0000000000000000
Indirect blocks:
         0 L3  0:4bd77872400:c00 1:4bede85b800:c00 4000L/800P F=254607 B=438255
         0 L2  1:4ba28eb4000:2400 0:4b8c2320000:2400 4000L/1e00P F=16384 B=438168
         0 L1  0:4b861790000:2000 1:4b9c7d8a000:2000 4000L/1a00P F=128 B=438161
         0 L0  1:4b9c7774000:7c00 20000L/6a00P F=1 B=438161
     20000 L0  1:4b9c7770000:4000 20000L/3600P F=1 B=438161
     40000 L0  1:4b9c777c000:1c000 20000L/18000P F=1 B=438161
     60000 L0  1:4b9c77bd800:25800 20000L/20000P F=1 B=438161
     80000 L0  1:4b9c7798000:25800 20000L/20000P F=1 B=438161
     a0000 L0  1:4b9c7808800:25800 20000L/20000P F=1 B=438161
     c0000 L0  1:4b9c77e3000:25800 20000L/20000P F=1 B=438161
     e0000 L0  1:4b9c782e000:25800 20000L/20000P F=1 B=438161
    100000 L0  0:4b860f62800:25800 20000L/20000P F=1 B=438161
    120000 L0  1:4b9c7853800:25800 20000L/20000P F=1 B=438161
    140000 L0  0:4b860f3d000:25800 20000L/20000P F=1 B=438161
    160000 L0  0:4b860f17800:25800 20000L/20000P F=1 B=438161
    180000 L0  0:4b860ef2000:25800 20000L/20000P F=1 B=438161
    1a0000 L0  1:4b9c7879000:25800 20000L/20000P F=1 B=438161
    1c0000 L0  0:4b860f88000:25800 20000L/20000P F=1 B=438161
    1e0000 L0  0:4b860fad800:25800 20000L/20000P F=1 B=438161
    200000 L0  0:4b860fd3000:25800 20000L/20000P F=1 B=438161
    ...
195591 blocks, each 0x25800 bytes (150KB) in size. And…
# zdb -R mypool:0:4bd7782a800:25800:r
Found vdev type: raidz
Assertion failed: size <= (1ULL << 17) (0x25800 <= 0x20000), file ../../../uts/common/fs/zfs/zio.c, line 396
Abort
And...
# zdb -R mypool:0:4bd7782a800:20000:r
Found vdev type: raidz
Segmentation Fault
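The assertion makes more sense once the fields in the block lines of the zdb output are decoded. As far as I understand the format, each DVA is vdev:offset:asize (vdev in decimal, offset and allocated size in hex), followed by the logical and physical sizes in hex, the fill count (F=) and the birth txg (B=). Here is a minimal Python sketch that parses one such line; the name parse_block_line and the regular expression are only my illustration, not part of zdb.

```python
import re

# One "Indirect blocks" line as printed above, for example:
#   60000 L0 1:4b9c77bd800:25800 20000L/20000P F=1 B=438161
# Assumed layout: offset, level, one or more DVAs, lsize/psize, fill, birth txg.
LINE_RE = re.compile(
    r"^\s*(?P<off>[0-9a-f]+)\s+L(?P<level>\d+)\s+"
    r"(?P<dvas>(?:\d+:[0-9a-f]+:[0-9a-f]+\s+)+)"
    r"(?P<lsize>[0-9a-f]+)L/(?P<psize>[0-9a-f]+)P\s+"
    r"F=(?P<fill>\d+)\s+B=(?P<birth>\d+)\s*$"
)


def parse_block_line(line):
    """Decode one zdb block line into plain integers (sizes in bytes)."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("not a recognized block line: %r" % line)
    dvas = []
    for dva in m.group("dvas").split():
        vdev, offset, asize = dva.split(":")
        dvas.append({
            "vdev": int(vdev),           # top-level vdev number
            "offset": int(offset, 16),   # offset within that vdev
            "asize": int(asize, 16),     # allocated size, parity included
        })
    return {
        "file_offset": int(m.group("off"), 16),
        "level": int(m.group("level")),
        "dvas": dvas,
        "lsize": int(m.group("lsize"), 16),
        "psize": int(m.group("psize"), 16),
        "fill": int(m.group("fill")),
        "birth": int(m.group("birth")),
    }


if __name__ == "__main__":
    blk = parse_block_line(
        "60000 L0 1:4b9c77bd800:25800 20000L/20000P F=1 B=438161")
    print(blk["lsize"], blk["psize"], blk["dvas"][0]["asize"])
```

On the 60000 L0 line this gives an allocated size of 0x25800 (153600 bytes, 150KB) against a logical size of 0x20000 (128KB). The allocated size on raidz includes parity, and it is this value that exceeds the 1ULL << 17 limit in the assertion.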
peace.
I have made a modification to zdb that gives you a way to find the blocks on raid-z. The block numbers that zdb is reporting in your output above are absolute across all disks. There is a kernel routine, vdev_raidz_map_alloc(), which takes the block number and physical size and returns the mapping to the specific vdev, offset, and size of where the data is placed. So, I modified zdb to have a “-Z” option that, given block offset and physical size, dumps the block offsets, vdevs, and physical sizes to see how the block is striped. See my blog (https://mbruning.blogspot.com) for more information. By the way, knowing the block and device still causes problems with zdb for raidz. For instance:
# zdb -R mypool:0.1:4bd7782a800:25800:r
(0.1 would be the second disk in the raidz array) also does not work. I get the block offsets and sizes on each disk using my modified zdb -Z, and then use dd to dump the blocks.
I’ve got to learn more about sparse files. I recently had my first significant encounter with one, and ended up very confused (Why is my file bigger than the filesystem?).
Thanks for jostling my brain into action!
I get the assertion failure from zdb on any raid-z file or metadata. I didn’t notice the original post was for a sparse file.
Hello Bruning and Simmons,
Actually, I’m investigating a big problem with ZFS involving sparse files, snapshots, and fragmentation. zdb seems not to work with raidz, as Bruning said, but what concerns me is the wrong information it reports about the blocks. zdb shows many blocks bigger than 128KB, and some sparse files like this one, which consume 500GB in production, are consuming 3TB on the backup server (without snapshots) after many modifications to the original files. I suspect 128KB fragmentation…
If you are interested, I have a tool that will give you a bit more information about the blocks. Here is an output example:
# ./raidzmap tank:0:f800:800
Columns = 5, bigcols = 0, asize = c00, firstdatacol = 1
devidx = 4, offset = 3000, size = 200
devidx = 0, offset = 3200, size = 200
devidx = 1, offset = 3200, size = 200
devidx = 2, offset = 3200, size = 200
devidx = 3, offset = 3200, size = 200
#
This is from a raidz pool with 5 disks. The raidzmap command uses the same syntax as zdb -R, but shows the locations on the disks, along with the allocated sizes, for a given block offset and physical size as reported by zdb. Let me know and I’ll either email it or place it where you can ftp it.
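The example output can be reproduced with a few lines of arithmetic. What follows is a rough Python sketch of the mapping as I understand it from the OpenSolaris vdev_raidz_map_alloc() of that time; it is not the raidzmap source. It assumes 512-byte sectors (ashift = 9), single parity unless told otherwise, and a physical size that is a multiple of the sector size. It also leaves out the special case in which, as far as I recall, the kernel swaps the parity column and the first data column for some single-parity blocks. The names raidz_map, dcols and nparity are mine.

```python
def raidz_map(offset, psize, dcols, nparity=1, ashift=9):
    """Sketch of how a raidz block is spread over the child disks.

    offset, psize: the block's offset and physical size in bytes, as zdb prints them
    dcols:         number of disks in the raidz vdev
    Returns the per-disk (devidx, offset, size) tuples plus summary values.
    """
    b = offset >> ashift                   # starting sector across all disks
    s = psize >> ashift                    # data sectors in the block
    f = b % dcols                          # disk that holds the first column
    o = (b // dcols) << ashift             # byte offset on that disk
    q = s // (dcols - nparity)             # full rows of data
    r = s - q * (dcols - nparity)          # data sectors left over
    bc = 0 if r == 0 else r + nparity      # columns that get one extra sector
    acols = bc if q == 0 else dcols        # columns actually used

    cols = []
    asize = 0
    for c in range(acols):
        col = f + c
        coff = o
        if col >= dcols:                   # wrap around to the next row
            col -= dcols
            coff += 1 << ashift
        size = (q + (1 if c < bc else 0)) << ashift
        cols.append((col, coff, size))
        asize += size

    # The allocation is rounded up to a multiple of (nparity + 1) sectors.
    unit = (nparity + 1) << ashift
    asize = -(-asize // unit) * unit
    return {"columns": acols, "bigcols": bc, "asize": asize,
            "firstdatacol": nparity, "cols": cols}


if __name__ == "__main__":
    m = raidz_map(0xf800, 0x800, dcols=5)
    print("Columns = %d, bigcols = %d, asize = %x, firstdatacol = %d"
          % (m["columns"], m["bigcols"], m["asize"], m["firstdatacol"]))
    for devidx, off, size in m["cols"]:
        print("devidx = %d, offset = %x, size = %x" % (devidx, off, size))
```

With the arguments from the example (offset f800, size 800, 5 disks) the sketch returns the same five columns and asize = c00. Under the same assumptions, a 128KB block (psize 20000) on a hypothetical 7-disk raidz1 comes out at asize 25800, which matches the numbers in the original post, though the post does not say how that pool is laid out.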
Hello Bruning!
Yes, it would be really nice to have a better view of the block allocation on raidz. I did not look at the zdb code, but I think the information zdb reports for raidz is the problem. The assertion in zio.c is weird… I think your tool will help me investigate this problem. Is there source code available? Here you can see the whole history:
https://www.opensolaris.org/jive/thread.jspa?threadID=101078&tstart=0
Thanks for the comment!
Leal
Hi Marcelo. You can download the source and a readme at ftp://ftp.bruningsystems.com/raidzmap.tar
Let me know if it helps.
max
Thanks Bruning! I’m sure it will help…
Hello all..
Seems like I’m hitting this bug (and related ones):
https://bugs.opensolaris.org/view_bug.do?bug_id=6792701
I think some of them may be fixed only on Solaris… I don’t know for sure. But many of the cases in those bugs are similar to the problems I’m having. Thanks to mikee for pointing them out.
This is a beginner type question. I have a program that runs on Windows that has zdb files. Can I download the files onto a Mac and open them with Quicken?