General :: Performance - Best Available Technology For Layered Disk Cache
Oct 17, 2010
I've just bought a 6-core Phenom with 16 GB of RAM. I use it primarily for compiling and video encoding (and occasional web/db). I'm finding all activities get disk-bound and I just can't keep all 6 cores fed. I'm buying an SSD RAID to sit between the HDD and tmpfs. I want to set up a "layered" filesystem where reads are cached in tmpfs but writes safely go through to the SSD. I want files (or blocks) that haven't been read lately on the SSD to then be written back to an HDD using a compressed FS or block layer.
So basically reads:
- Check tmpfs
- Check SSD
- Check HD
And writes:
- Straight to SSD (for safety), then tmpfs (for speed)
And periodically, or when space gets low:
- Move least frequently accessed files down one layer.
I've seen a few projects of interest. CacheFS, cachefsd, and bcache seem pretty close, but I'm having trouble determining which are practical. bcache seems a little risky (early adoption); CacheFS seems tied to specific network filesystems. There are "union" projects, unionfs and aufs, that let you mount filesystems over each other (a USB device over a DVD, usually), but both are distributed as a patch, and I get the impression this sort of "transparent" mounting was going to become a kernel feature rather than an FS.
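For reference, a minimal sketch of how bcache wires an SSD in front of an HDD; bcache was very young at this point, so treat it as illustrative only, and the device names are assumptions:
Code:
    # assumption: /dev/sda1 is the backing HDD partition, /dev/sdb the SSD
    make-bcache -B /dev/sda1                  # format the backing device
    make-bcache -C /dev/sdb                   # format the SSD as a cache set
    echo /dev/sda1 > /sys/fs/bcache/register  # register both with the kernel
    echo /dev/sdb > /sys/fs/bcache/register
    # attach the cache set to the backing device, using the UUID make-bcache printed
    echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
    mkfs.ext4 /dev/bcache0                    # then use /dev/bcache0 like any disk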
I know the kernel has a built-in disk cache, but it doesn't seem to work well with compiling: I see a 20x speed improvement when I move my source files to tmpfs. I think it's because the standard buffers are dedicated to a specific process, and compiling creates and destroys thousands of processes during a build (just guessing there). It looks like I really want those files precached.
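In the meantime, the 20x win can at least be scripted by hand; a sketch of the tmpfs build setup described above (the mount point, size, and project path are assumptions):
Code:
    # assumption: ~8 GB of the 16 GB RAM can be spared for the build tree
    sudo mount -t tmpfs -o size=8G tmpfs /mnt/build
    cp -a ~/src/myproject /mnt/build/
    cd /mnt/build/myproject && make -j6    # one job per core
    cp -a /mnt/build/myproject ~/src/      # copy results back; tmpfs vanishes on reboot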
Is it possible to use one fast disk as a giant file cache?
That is, automatically copying frequently accessed data to that one disk, and transparently redirecting reads and writes to that disk, so that the other drives would only have to be accessed occasionally.
(Writes would have to be forwarded to the other disks after a while, of course.)
Advantages:
- The other drives could be powered down most of the time, reducing power, heat, and noise.
- The speed of the other drives would not matter much.
- The cache disk could be solid state.
How can I set such a system up?
What OS supports these options? Is this possible at all using Windows or Linux?
Example:
There are 3 disks with 1 TB each. Most of the files are accessed only rarely, but about 5% of each disk is used frequently.
Which files are used frequently may change over time.
A 150 GB solid-state disk should cache the currently frequently accessed files, so that access times are faster and the drives can be put into power-saving mode.
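The power-saving half is doable today with hdparm, independent of any caching layer; a sketch (device names are assumptions):
Code:
    # spin the data disks down after 10 minutes idle (-S 120 means 120 * 5 seconds)
    sudo hdparm -S 120 /dev/sdb /dev/sdc /dev/sdd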
DNS cache server: This is probably more of a network question, but I figured someone who is a network expert might know. Currently my organization has DNS servers, but my question is: would setting up a cache server improve performance any? When I first thought about it, I thought probably not. But since it stores information in RAM, that made me think maybe it would improve network performance a little.
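For reference, a caching resolver is cheap to try; a minimal dnsmasq sketch (the upstream address is an assumption):
Code:
    # /etc/dnsmasq.conf -- forward to the existing corporate DNS, answer repeats from RAM
    server=10.0.0.1     # assumption: address of the existing internal DNS server
    cache-size=10000    # number of names to keep cached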
I needed a larger cache because I have some videos stored on another Samba server and playback is laggy. I set the options cache=20000 and cache-min=10, and that helped those videos play smoothly, but it caused all 1280x720 MP4 files stored on my local drive to lag and A/V-desync, with the mplayer message: **** Your system is too SLOW to play this! ****
I tried cache values from 1000 to 80000, and they lag in every case. But without the "cache" option these videos play well. For now I've commented "cache" out of the config.
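One possible middle ground, assuming your MPlayer build supports protocol profiles: enable the big cache only for network URLs instead of globally. A sketch of ~/.mplayer/config:
Code:
    # no global "cache" line, so local files keep the default behaviour

    [protocol.smb]    # applied automatically when playing smb:// URLs
    cache=20000
    cache-min=10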
(Ubuntu Linux server, 64-bit.) I was troubleshooting a problem with a file (~3.0 GB) which I had just downloaded, but it was failing the integrity test, when I discovered something really unusual. First, this is the MD5 of the file after download, which didn't match the expected value:
This was really unexpected. Since I have a lot of RAM, I suspected this was an effect of caching and that something was going awry with it. I decided to retry with the whole file from disk, and to my surprise:
Code:
    ~% sudo sysctl -w vm.drop_caches=3    # this Linux command invalidates everything in the memory cache
    vm.drop_caches = 3
    ~% md5sum media.iso
    2992aa6270f6e1de9154730ed3beedc1  media.iso
I redid it and now it seems to stay consistent, although this still isn't the value I was expecting. Certainly, the contents in the memory cache were different from the contents on disk. This is the big problem. To fix the download, I created a torrent on the source machine and opened it on the target machine. Five 1 MB chunks out of ~3.0 GB failed the integrity check. I used the torrent to fix those chunks, and now the file integrity is OK. The problem now is to determine where the data got out of sync.
I tested the memory with memtest86+, all but the bit fade test. I was expecting to see some failing memory module, but there wasn't anything; everything is OK. The filesystem is Ext4, over LVM2, over a 3-disk RAID5 array. Ext4 is considered stable, and if data were inconsistent between disks, mdadm would have warned, but there is nothing in the logs. The S.M.A.R.T. error logs are clean, and the disks are new (less than 30 days of power-on hours). I've looked for reports of data-loss bugs in my current kernel (2.6.35), but there doesn't seem to be anything, as far as I can tell. What else could I check, or where exactly could the defect/bug be? It is Ubuntu 10.10 64-bit, Core i7 930, 6 GB non-ECC RAM.
Update: I confirmed that the files are being correctly written to disk; the pages are being altered after they are read from disk, while in memory. I did a lot more memtests (I left it doing the bit fade test overnight), and still nothing. All memory modules seem OK. Some more tests:
Code:
    ~% md5sum media.iso
    cc8bcf1ce67ff7704eadc2222650c087  media.iso
    ~% cp media.iso tmp
    ...
(direcat is a version of cat that reads with O_DIRECT, that is, bypassing the page cache.) There is a clear pattern: it always happens to the 2nd byte in a 16-byte alignment. In that byte, almost always bit 4 (LSB) flips to one, but there was one instance where bit 2 flipped to zero.
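For anyone without direcat, plain GNU dd can do a comparable uncached read; a sketch:
Code:
    # read through the page cache
    md5sum media.iso
    # read the same file with O_DIRECT, bypassing the page cache
    dd if=media.iso iflag=direct bs=1M 2>/dev/null | md5sum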
I am running openSUSE 11.2 (32-bit); my CPU only supports 32 bits. I have a hardware RAID device. My system has 4 GB of RAM. When I configure my system to use only 3 GB, 2 GB, or even 1 GB, using mem=1024M in GRUB, my RAID performance is much better than when letting the system use the default 4 GB available. Can anyone explain to me why this is? Is there anything I can do, i.e. kernel configuration, that will help performance when running with all 4 GB enabled?
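For reference: on 32-bit kernels, memory above roughly 896 MB is "highmem" and is handled through a slower path, which is one plausible reason limiting the RAM helps. A diagnostic sketch, assuming a stock openSUSE kernel:
Code:
    # which highmem model the kernel was built with (HIGHMEM4G vs HIGHMEM64G/PAE)
    grep HIGHMEM /boot/config-$(uname -r)
    # whether dirty (to-be-written) pages may live in highmem
    sysctl vm.highmem_is_dirtyable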
I have doubts regarding storage:
- How do I configure the events of the Storage Processor?
- What performance issues come up daily on a critical production server?
- What are the first steps for a disk performance check?
- What are the first steps for a Storage Processor performance check?
- What are the first steps for a MetaLUN performance check?
Using KInfoCenter's Memory module, it shows my 2 GB of RAM. Approx. 14% is used for application data, and the disk cache has been anywhere from 29-35%, leaving approx. 1 GB free. Can this "disk cache" be reduced, leaving more memory free, or is this determined by the OS?
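For reference, that "disk cache" is the kernel page cache, and the kernel reclaims it automatically whenever applications need the memory, so it isn't really "used". You can inspect or even drop it from a shell; a sketch:
Code:
    free -m    # the "cached" column is the disk cache; "-/+ buffers/cache" shows truly free RAM
    sync && sudo sysctl -w vm.drop_caches=3    # drops the cache (normally unnecessary)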
How much of a performance impact does full disk encryption (say, AES-256) have on disk-related activities? On one particular project I'm involved in, I am trying to weigh security vs. performance issues.
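A rough way to size the overhead on the actual hardware rather than guessing; a sketch (the mapper name is an assumption):
Code:
    openssl speed aes-256-cbc           # raw cipher throughput on this CPU
    sudo hdparm -t /dev/sda             # raw disk read speed
    sudo hdparm -t /dev/mapper/crypt0   # assumption: the dm-crypt mapping to compare against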
Ubuntu 10.10 is dual booted but it is my primary OS.
Unfortunately it's on the outer edges of the disk in an extended partition.
This has always bugged me, with regards to read/write performance.
Do my concerns of reduced performance have any foundation? Should I bite the bullet and reformat the drive, installing Ubuntu first?
I ran the disk read benchmark, and my read speeds went from 100 MB/s at the beginning of the test to just 55 MB/s at the end. I have no idea whether the position in the test corresponds to the physical position on the disk, or whether the recorded speed is affected by other factors such as the test's own workings.
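For what it's worth, outer tracks hold more sectors per revolution, so sequential throughput normally does fall from the start of a drive to the end; direct reads at both ends can confirm it. A sketch (device name and size are assumptions):
Code:
    # assumption: /dev/sda is a 500 GB disk
    sudo dd if=/dev/sda of=/dev/null bs=1M count=512 iflag=direct              # outer edge
    sudo dd if=/dev/sda of=/dev/null bs=1M count=512 skip=470000 iflag=direct  # inner edge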
It seems like the optimal use would be as a cache for the regular hard drives in my computer, eliminating the need for a fast hard drive, so I can just use a slow 2 TB (~US$100) drive with an SSD cache.
Is there a good way to do this yet?
It seems like it would be nice to be able to exclude some files from caching, for things like bittorrent.
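bcache (still maturing at this point) is the closest match I know of, and it addresses the bittorrent concern indirectly: it can bypass the cache for large sequential I/O rather than excluding files by name. A sketch, assuming a bcache device already exists:
Code:
    # don't cache sequential streams larger than 4 MB (e.g. bulk torrent traffic)
    echo 4M > /sys/block/bcache0/bcache/sequential_cutoff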
I am experiencing disk write performance issues and I cannot find the cause. I have an LSI 9211-8i SAS 2 controller (latest firmware), CentOS 5.5 with the latest x86_64 kernel (2.6.18-194.17.4.el5 #1 SMP, with the latest LSI driver v7.00, dated Jul 27), and Seagate Cheetah ST3600057SS drives. These drives have a standard sustained write performance of >200 MB/s (and read as well); with Fedora Core 13 (same machine), issuing dd if=/dev/zero of=/dev/sdo bs=1024k count=16384 (a 16 GB direct device write) normally reaches 213 MB/s (repeated tries). On CentOS 5.5 I am getting speeds around 110-113 MB/s. iostat does not show anything specific (just 1.3% wait, CPU 99.7% idle). There are 14 drives; I tried several of them, same figures. Reads go around 200 MB/s.
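Two block-layer knobs that often differ between distro kernels and are worth comparing between the Fedora and CentOS installs; a diagnostic sketch (device name from the post):
Code:
    cat /sys/block/sdo/queue/scheduler        # which I/O elevator is active
    cat /sys/block/sdo/queue/max_sectors_kb   # largest request the kernel will issue
    echo deadline > /sys/block/sdo/queue/scheduler   # try a simpler elevator for raw dd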
I'm working on a few servers running CentOS and using Postfix. I don't know what the exact problem is, but we are having problems with the disk space being maxed out at 100 GB. What we think is happening is that Postfix is either caching or logging all the emails we send out. We sent 250k emails (500 KB apiece) over the weekend and had trouble with that quantity. It seems some of those emails were queued up for a retried send, but we didn't have sufficient disk space for that. Something broke; I'm not sure what.
What I want to do is find and change the config setting that controls Postfix's email retrying, and possibly limit it (not sure if this will fix my problem). Or turn off/limit whatever way Postfix logs/caches emails, so that it won't take up all the disk space when mail is queued for retry. Again, I'm totally lost here (on both what's going on and how to fix it), and I'm not sure what more information is needed to address this problem.
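The behaviour described sounds like Postfix's deferred queue (under /var/spool/postfix) rather than a cache; a sketch of inspecting and trimming it (the 1-day lifetimes are assumptions, not recommendations):
Code:
    postqueue -p | tail -n 1              # one-line summary of what is sitting in the queue
    du -sh /var/spool/postfix/deferred    # how much disk the retry queue is using
    # give up on undeliverable mail after 1 day instead of the 5-day default
    postconf -e 'maximal_queue_lifetime = 1d'
    postconf -e 'bounce_queue_lifetime = 1d'
    postfix reload
    postsuper -d ALL deferred             # drop everything currently deferred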
I upgraded my old Kubuntu installation to 10.10 Maverick Meerkat, and I am now experiencing a really annoying problem. I boot my computer and everything seems fine for a while, but eventually my disk performance drops to horrible levels. It's not gradual: it's fine one second and abysmal the next.
If I do "cp file1 file2" and then kill the cp process after 10 seconds, only a couple of MB have been copied. When I run dmesg after the performance degradation, I see this:
After installing Ubuntu to a flash drive and customizing it heavily, I am running out of room (8 percent remaining on a 4 GB flash drive). While I could go buy another flash drive, I already have plenty of storage between a 500 GB external hard drive and a D-Link NAS. What I would like to do is use unionfs to mount an image (around 6 GB in size) over my 4 GB flash drive: when it is mounted, all file changes should be synced to the unionfs image, thus "archiving" my flash drive. I have created the 6 GB image and formatted it as Ext4 (the same type as my flash drive), but when I mount it (using 'unionfs-fuse /media/SG_EHDD/devices/unionfs.dev -o cow /'), any changes that I make are not made on the image. I think I am missing an option, namely -o chroot=SOME_TEXT, but when I put -o chroot=/media/SG_EHDD/devices/, it says it cannot load /media/SG_EHDD/devices//media/SG_EHDD/devices/unionfs.dev. It also gives me the same errors when I cd to /media/SG_EHDD/devices and run unionfs-fuse with ./unionfs.dev for the path to the filesystem image.
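One likely gap, as far as I can tell: unionfs-fuse takes already-mounted directories as branches (in =RW/=RO form), not a filesystem image, so the ext4 image has to be loop-mounted first. A sketch under that assumption (the image path is from the post; the mount points are mine):
Code:
    sudo mkdir -p /mnt/overlay /mnt/union
    # loop-mount the 6 GB ext4 image so it becomes a directory tree
    sudo mount -o loop /media/SG_EHDD/devices/unionfs.dev /mnt/overlay
    # union mount: copy-on-write changes go to the image, the flash drive's / stays read-only
    unionfs-fuse -o cow /mnt/overlay=RW:/=RO /mnt/union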
Can FUSE filesystems be layered/nested? (I would use the verb "stacked", but that might confuse the issue with kernel-module file system stacking in the same process space.) I've scoured the web and haven't been able to find an answer to this question. I've also tested empirically for the answer, with inconclusive results. I have numerous use cases in mind for this, e.g.: an MP3-tagging FS on top of ZFS-FUSE; a union FS on top of NTFS-3G; a caching FS on top of a checksumming FS on top of a compression FS on top of an encryption FS on top of a WebDAV FS; and so on. (Yes, I realize the last one would be a stretch in any scenario.) Or basically, ANY special-purpose FS on top of ZFS-FUSE or NTFS-3G. (My particular interest is the former.)
I have tested this out, specifically encfs on top of ZFS-FUSE. It didn't work. I don't recall the exact problems, nor did I document the results, but the outcome was a very unhappy filesystem. (It did actually all mount without error, if I remember correctly.) I tried many workarounds, and the end result was a non-functioning file system (I know that's vague; as is my memory). So that got me thinking: is it even possible? Is FUSE even designed for this? And if it is, is it designed in such a way that layered filesystems are reliably abstracted from each other, so that they don't care and can't know what the underlying FS is? I do realize there would be performance penalties with these use cases, many non-exact alternatives, etc., so it would be nice not to receive the use-case/alternatives lectures that (some) in the Linux community feel compelled to provide.
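For reference, the failed test has a simple shape: two FUSE mounts chained through the VFS. A sketch, assuming a zfs-fuse pool mounted at /pool:
Code:
    # zfs-fuse is already serving /pool (FUSE mount #1)
    mkdir -p /pool/.encrypted ~/clear
    encfs /pool/.encrypted ~/clear   # encfs (FUSE mount #2) keeps its ciphertext on zfs-fuse
In principle each mount only talks to the VFS and shouldn't care what sits below it; in practice, quirks of the lower FS (locking, mmap, extended attributes) are exactly where such stacks tend to break.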
Every time I right-click on my CD/DVD drive with a DVD+RW in it, it freezes for about 10 minutes and then unfreezes with nothing happening. I think I formatted the disc before using ImgBurn, but when I did, it went from 4.7 GB capacity to 3 GB. Is it possible that I accidentally turned my double-layer DVD into single-layer? If so, is there a way to turn it back to double-layer?
I want to simplify some of my rules, so I want to create rules for certain services like XMPP, web, etc., since some of them use multiple ports and I toggle them on/off a lot. Can I simply put the jump-to-chain rules in the INPUT chain, and once a sub-chain runs, does control return to the INPUT chain after the jump rule? I want to do this so I don't have a ton of rules in the INPUT chain. I think that if I simply make a list of all the chains to jump to in the INPUT chain, it will work itself through all of them until it finds a matching filter in one of them, correct?
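For reference: a packet that reaches the end of a user-defined chain without matching falls through (an implicit RETURN) to the rule after the jump in INPUT, so yes, the chains are walked in order. A sketch (the ports are the standard XMPP/web ones; the chain names are mine):
Code:
    iptables -N XMPP
    iptables -A XMPP -p tcp --dport 5222 -j ACCEPT   # client-to-server
    iptables -A XMPP -p tcp --dport 5269 -j ACCEPT   # server-to-server
    iptables -N WEB
    iptables -A WEB -p tcp --dport 80 -j ACCEPT
    iptables -A WEB -p tcp --dport 443 -j ACCEPT
    # INPUT just walks the sub-chains; non-matching packets return and continue
    iptables -A INPUT -j XMPP
    iptables -A INPUT -j WEB
    # toggling a service off is then a single rule: iptables -D INPUT -j XMPP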
I don't understand this error nor do I know how to solve the issue that is causing the error. Anyone care to comment?
Quote:
Error: Caching enabled but no local cache of //var/cache/yum/updates-newkey/filelists.sqlite.bz2 from updates-newkey
I know, JohnVV: "Install a supported version of Fedora, like Fedora 11." This is on a box that has all 11 releases of Fedora installed. It's a toy and I like to play around with it.
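For reference, that message usually means yum was run in cache-only mode (-C) before any local cache existed; a sketch of the usual ways out, assuming the repository is still reachable:
Code:
    yum makecache   # populate the local cache for all enabled repos
    # ...or simply run the command without -C so yum is allowed to download metadata
    yum update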
I was laughing about klackenfus's post with the ancient RH install, and then work had me dig up an old server that has been out of use for some time. It has some proprietary binaries installed that intentionally try to hide files to prevent copying (and we are no longer paying for support or have the install binaries), so a clean install is not preferable.
Basically it has been out of commission for so long that the apt-get upgrade download is larger than the /var partition (apt caches to /var/cache/apt/archives).
I can upgrade the bigger packages manually until I get under the threshold, but then I learn nothing new. So I'm curious: can I redirect apt's cache to a specified folder, either on the command line or via a config setting?
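apt does let you point the archive cache elsewhere; a sketch of both the one-shot and the permanent form (the target path is an assumption):
Code:
    # one-shot, on the command line
    apt-get -o Dir::Cache::archives=/mnt/bigdisk/apt-archives upgrade

    # permanent, in /etc/apt/apt.conf (or a snippet under /etc/apt/apt.conf.d/)
    Dir::Cache::archives "/mnt/bigdisk/apt-archives/";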
I installed Squid cache on my Ubuntu 10.10 server and it works fine, but I want to know how to make it cache all file types, like .exe, .mp3, .avi, etc. The other thing I want to know is how to let my clients fetch files from the cache at full speed, since I'm using a MikroTik system to provide PPPoE for clients, and I match it with my Ubuntu Squid box.
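A sketch of the squid.conf side, assuming Squid 2.7 (some of these refresh_pattern options were dropped in later 3.x releases); the sizes and patterns are assumptions:
Code:
    # allow big downloads into the cache
    maximum_object_size 512 MB
    cache_dir ufs /var/spool/squid 20000 16 256   # 20 GB on-disk cache

    # keep media/binary files for up to 30 days regardless of their headers
    refresh_pattern -i \.(exe|mp3|avi|zip|iso)$ 43200 90% 43200 override-expire ignore-no-cache ignore-private
Cache hits are served at whatever rate the LAN and the MikroTik queues allow, so "full speed" mostly means exempting the Squid box's traffic from the PPPoE rate limit.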
Basically, my portable computer's speakers have "Dolby Home Theater - Virtual Surround Sound" written on them. Now, I figure there has to be some software (in Windows) that controls the whole thing. Is there something similar in Ubuntu?
Consider that symbolic links at times can be broken: the target reference can be lost or misplaced, and usage of the symbolic link becomes broken or deterred. I propose the following: symbolic links from now on carry a checksum of the original file they are linked to. The original file will carry a property that checks whether it has been changed; this property will compute the new checksum, reconfigure the symbolic link, and keep the link working. Furthermore, if the system were to change, there would exist the possibility of having a symlink-comparison feature that allows the user to scan the entire filesystem for files that meet the checksum criteria, so that broken symbolic links can once more be re-established.
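Today's closest approximation can be scripted; a sketch that finds broken links and hunts for a relocated target by checksum (the paths and the stored checksum variable are assumptions):
Code:
    # list broken symbolic links under /home
    find /home -xtype l
    # assumption: $STORED_MD5 holds the checksum saved when the link was created
    find /home -type f -exec md5sum {} + | grep "$STORED_MD5"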
I just wanted to know: with my laptop set to ondemand, will this affect performance in any way? I realize it increases the clock speed to maximum when the CPU is under load, but does the time it takes to go from ondemand's low clock to full speed affect performance? Will there be any noticeable difference between the two setups? I have a dual-core Intel at 2.2 GHz when in performance mode; when ondemand is set and there is no load, it downclocks to 800 MHz.
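For reference, the switch itself typically takes on the order of microseconds and the kernel exposes the figure, so the practical cost is the short ramp-up interval rather than the transition. A sketch (cpu0 shown; repeat per core):
Code:
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor             # current governor
    cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency   # in nanoseconds
    # pin to full speed for a benchmark, then switch back to ondemand
    echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor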
I have a pretty powerful desktop running Fedora and was wondering if I could use the new SPICE VDI technology to create a dumb-terminal virtual desktop using just a 2nd monitor, keyboard, and mouse (no thin client). Is this possible using USB ports? Would I need a 2nd video card for the monitor?