Replacement for Zones and ZFS in Linux

I used OpenSolaris some years ago and really liked how stable the system was, but I switched back to Linux for a few reasons: there was no energy saving for my CPU (which is supported in Windows, FreeBSD and Linux, just not in OSOL), there were no drivers for my DVB-T card, and I never found out how to use OpenSolaris for games (at that time I needed a low-latency kernel and 1000 Hz for my gaming mouse, and I wasn't able to configure that properly in OpenSolaris). This isn't bad at all; I think OSOL was/is more optimized for server usage. However, there are some features I had in OpenSolaris that I'd love to have in Linux.

For example Zones. You could simply create a BrandZ Linux Zone or a Solaris Zone (which is, put simply, just a container with "some sort" of virtualization). You can read more about Zones here. A few months later I found something similar for Linux: it's called OpenVZ. With OpenVZ you can set up a container with memory, CPU and other limits, and you can use different distributions.

At the time I tested OpenVZ there were some problems though. One of the main disadvantages of OpenVZ is, in my humble opinion, that you can't define swap per Virtual Environment (VE). Some applications get confused if there is no swap, and even if you "trick" them by just adding some value to the displayed swap, it's not optimal. For example: I've got a server with 8 GB RAM and 4 GB swap. I'd like to be able to give every VE 1 GB of swap without touching the swap of the host system, as I'd keep the swap for the VEs on another disk. When I was testing OpenVZ this wasn't possible and configuring memory was a mess. I'm still unsure what the correct way to do that is; however, yesterday I read that there's vswap, which simplifies the memory setup, as you just have to say how much RAM and how much swap a VE should have.

The next disadvantage of OpenVZ is, or was, that you couldn't use PREEMPT (low-latency kernel) with it: the kernel segfaulted on boot. So for game servers this was useless. Apart from these problems OpenVZ is a very, very good solution and you should really consider taking a look at it.
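For the vswap setup mentioned above, the configuration could look roughly like this. It's only a sketch, assuming a vswap-capable OpenVZ kernel and a vzctl that knows the --ram and --swap options; the VE ID 101 and the sizes are made-up examples, and the helper set_ve_memory is my own name, not part of OpenVZ. By default it only prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: give one VE a fixed amount of RAM and swap via vswap.
# Assumptions: vswap-capable OpenVZ kernel, vzctl installed;
# the VE ID 101 and the sizes below are hypothetical examples.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = really run it

set_ve_memory() {
    ctid="$1"; ram="$2"; swap="$3"
    cmd="vzctl set $ctid --ram $ram --swap $swap --save"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$cmd"
    else
        $cmd
    fi
}

# 1 GB RAM and 1 GB swap for VE 101
set_ve_memory 101 1G 1G
```

Run it with DRY_RUN=0 as root on the host once the printed command looks right.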

The next thing I had in OpenSolaris which I miss in Linux is, of course, ZFS. However, there are two options to get ZFS on Linux. Option 1 is zfs-fuse. When I tested zfs-fuse it was extremely slow and buggy. For example, I set up a raidz2 and then pulled out 2 disks; the ZFS part segfaulted and the module got unloaded. When I loaded the module again it couldn't find the ZFS pool anymore, so I had to recreate the pool. Things like that made me stay away from zfs-fuse in the past.

But hey, wait. There's the second option: zfsonlinux provides a native ZFS module for the kernel, if you compile it yourself. I've been testing that implementation for 4 days now and I'm impressed, because apart from two bugs it works really, really well, and I'm considering switching to it from my software RAID 5.

The first bug: no PREEMPT. The module segfaults if you use it with PREEMPT (low-latency desktop), while "desktop" and "server (no preemption)" seem to work fine so far; currently I'm only testing with "no preemption". The second bug is due to double caching. Currently it's not possible to mount ZFS directly, so you have to use a zvol and place your favorite FS onto it. Now Linux caches reads, and ZFS does too. This results in something like swapping (it copies stuff from RAM to RAM) without swap being used: you can see 100-170% kswapd CPU load and a load of 10-20, and after some minutes the box hangs. It's still there, just extremely slow. One way around this issue is:

zfs set primarycache=none pool/volume

Make sure you do this only for the volume: if your pool is zfstest and your volume is wdp, use zfstest/wdp. Don't do it on zfstest directly, as this will be extremely bad for performance. Another thing you can do, just to be safe, is to limit the cache to 1 GB (if you've got enough RAM; I've got 3 GB here). So instead of doing modprobe zfs, you'd do:

modprobe zfs zfs_arc_max=1073741824
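The number above is just 1 GB written in bytes. If you want a different limit, a small helper like the following does the math. This is a sketch: gib_to_bytes is a name I made up, and the "options" line assumes your distribution reads module options from a file under /etc/modprobe.d/ (for example a zfs.conf you create yourself). The script only prints the lines, it doesn't load anything.

```shell
#!/bin/sh
# Sketch: turn a cache size in GiB into the byte value zfs_arc_max expects.

gib_to_bytes() {
    # 1 GiB = 1024 * 1024 * 1024 bytes
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

arc_max="$(gib_to_bytes 1)"

# Print the modprobe call instead of running it (loading needs root).
echo "modprobe zfs zfs_arc_max=$arc_max"

# Print the line that would keep the limit across reboots.
# Assumption: module options are read from /etc/modprobe.d/*.conf.
echo "options zfs zfs_arc_max=$arc_max"
```

Once the module is loaded, the active value should be readable from /sys/module/zfs/parameters/zfs_arc_max.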

And you should be fine, with results similar to a software RAID 5 (tested and compared on 3x500GB).
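Since setting primarycache=none on the pool itself is the mistake to avoid, a tiny guard around the command can help. This is only a sketch under one assumption: a name without a "/" is the pool itself, and anything with a "/" is a volume inside it. The function names is_pool_root and primarycache_cmd are mine, and the script only prints the zfs command rather than running it.

```shell
#!/bin/sh
# Sketch: refuse to touch primarycache on the pool itself.
# Assumption: a dataset name without "/" is the pool's root dataset.

is_pool_root() {
    case "$1" in
        */*) return 1 ;;   # has a slash: something inside a pool
        *)   return 0 ;;   # no slash: the pool itself
    esac
}

primarycache_cmd() {
    if is_pool_root "$1"; then
        echo "refusing: $1 is a pool, not a volume" >&2
        return 1
    fi
    echo "zfs set primarycache=none $1"
}

primarycache_cmd zfstest/wdp       # prints the command for the volume
primarycache_cmd zfstest || true   # refused, message goes to stderr
```

After running the real command you can check the result with zfs get primarycache zfstest/wdp.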

Now you've got two of the (in my humble opinion) best features of OpenSolaris in your Linux. However, the Solaris kernel is much more stable and I'd really prefer it over Linux because of that stability; the file hierarchy also makes more sense in OpenSolaris than in any Linux distribution I've seen so far.
