Wednesday, February 18, 2009

ZFS with LZJB Compression

I've still been working on tweaking my FreeBSD-based six-disk ZFS setup. It's mainly going to be used for file storage, so it crossed my mind that it might be best to use some type of filesystem-level compression. Initially, as in all of the tutorials, I was going to use gzip. However, with a small amount of googling I discovered that LZJB is the better choice in most cases: it compresses less tightly than gzip, but it is much faster. In fact, because its CPU and memory utilization is so low, LZJB outperforms an uncompressed filesystem under many circumstances. This is mainly due to fewer bytes being read from or written to the disk. Also, with the 2x-3x compression, I can easily bump up my redundancy from two to three copies of important files with almost no cost.
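To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Every figure in it (disk throughput, compression ratio, decompression speed) is a made-up assumption for illustration, not a measurement from my pool, and the model ignores parity, metadata and caching.

```python
def effective_read_mb_s(disk_mb_s, ratio, decompress_mb_s):
    """Logical MB/s delivered when reading compressed data.

    The disk delivers disk_mb_s of compressed bytes, which expand to
    disk_mb_s * ratio logical bytes, capped by how fast the CPU can
    decompress (expressed in logical MB/s).
    """
    return min(disk_mb_s * ratio, decompress_mb_s)


def raw_space_per_logical_gb(copies, ratio):
    """Raw GB consumed per logical GB stored, ignoring parity and metadata."""
    return copies / ratio


# Hypothetical numbers, chosen only to illustrate the argument.
DISK_MB_S = 80.0          # assumed sequential disk throughput
RATIO = 2.0               # assumed compression ratio (low end of 2x-3x)
DECOMPRESS_MB_S = 300.0   # assumed LZJB decompression speed

if __name__ == "__main__":
    print("effective read MB/s:", effective_read_mb_s(DISK_MB_S, RATIO, DECOMPRESS_MB_S))
    print("raw GB per logical GB, 2 copies uncompressed:", raw_space_per_logical_gb(2, 1.0))
    print("raw GB per logical GB, 3 copies compressed:", raw_space_per_logical_gb(3, RATIO))
```

Under these assumptions the disk is the bottleneck, so compression raises the effective read rate, and three compressed copies take less raw space than two uncompressed ones. If the CPU were slower than the disk, the `min` would flip the result.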

The real question is, is my compressed data more susceptible to corruption? Of course, the answer to this is yes. Each bit is holding more information and so flipping it causes more information to be lost. It seems to me though that keeping three copies more than offsets this. I just need to remember to schedule a regular zpool scrub in crontab.
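For the scheduled check, a small wrapper script that cron can call might look like the sketch below. The script name and the idea of passing the pool name as an argument are my own assumptions; the only real command involved is `zpool scrub`, which starts a scrub of the named pool.

```python
import subprocess
import sys


def scrub_command(pool):
    """Build the argument list that starts a scrub of the given pool."""
    return ["zpool", "scrub", pool]


def start_scrub(pool):
    """Run zpool scrub and return its exit status."""
    return subprocess.run(scrub_command(pool)).returncode


if __name__ == "__main__":
    # The pool name must be given explicitly; nothing is scrubbed by default.
    if len(sys.argv) == 2:
        sys.exit(start_scrub(sys.argv[1]))
    print("usage: scrub.py POOL")
```

A weekly crontab entry pointing at this script with the pool name as its argument would take care of the remembering.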

Truthfully, using compression in a filesystem leaves me a bit uneasy under any circumstances. I lost more than one installation to Stacker and DiskDoubler corruption back in the day. It may not have been a lot of bytes I lost, but those BBS lists and Space Quest save files were worth a lot more to me then than any image or music file is to me now.

1 comment:

Anonymous said...

"The real question is, is my compressed data more susceptible to corruption? Of course, the answer to this is yes. Each bit is holding more information and so flipping it causes more information to be lost."
It's not so simple. First, you have fewer bits, so the probability of having one flipped decreases. Second, with one bit flipped you're going to lose the whole block regardless of whether the data is compressed or not, and the block is the same size in both cases.
Now there's the case of what happens when you lose a block, have no backup and go to some recovery service. It seems to me that no compression is better here, but I know almost nothing about it.
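
The bit-flip argument above can be put in rough numbers. This is only a sketch: it assumes independent bit flips at a fixed per-bit rate, a 128 KB logical block, and a 2x compression ratio, all of which are illustrative values rather than anything measured.

```python
import math


def p_block_hit(stored_bytes, bit_error_rate):
    """Probability that at least one of the stored bits flips.

    Uses log1p/expm1 so that tiny per-bit rates do not vanish in
    floating-point rounding.
    """
    bits = stored_bytes * 8
    return -math.expm1(bits * math.log1p(-bit_error_rate))


# Hypothetical numbers chosen only for illustration.
LOGICAL_BLOCK = 128 * 1024   # assumed logical block size in bytes
RATIO = 2.0                  # assumed compression ratio
BER = 1e-12                  # assumed independent per-bit flip probability

p_plain = p_block_hit(LOGICAL_BLOCK, BER)
p_packed = p_block_hit(int(LOGICAL_BLOCK / RATIO), BER)

# Either way a hit costs the whole logical block, so the expected loss
# per block scales with the hit probability alone.
print("uncompressed:", p_plain)
print("compressed:  ", p_packed)
print("ratio:       ", p_packed / p_plain)
```

With these assumptions the compressed block occupies half as many bits on disk, so it is about half as likely to be hit, while the amount of data lost per hit is the same logical block in both cases.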