
2015.06.18 Roll Your Own Hybrid SSD/HDD (SSHD)

posted Jun 18, 2015, 4:26 AM by Troy Cheek   [ updated Jun 19, 2015, 1:59 PM ]
I just spent the last couple of days removing some old games and programs from my SSD drive.  I had to install the last Steam game I bought on the HDD because I was running out of space.  Like many people, I cheaped out (relatively speaking) when I had my last computer system built.  I bought a small SSD (120 GB) for my operating system and frequently used programs.  I also bought a large HDD (2 TB) for storing less-frequently used programs, data, huge video files, mp3 collections, etc.  The small SSD cost me about as much as the huge HDD.  I could probably buy bigger and faster ones today for half of what I spent a few years ago.  But as I was deleting some files I didn't think I'd ever use again and moving some others to the storage drive, I got to thinking:  Why is this necessary?  I've got a computer with a million times the processing power it took to fake putting a man on the moon.  Why am I manually moving files around?  Isn't this the sort of menial task my computer is supposed to be doing for me?  Isn't my computer supposed to be doing things to make my life easier instead of the other way around?

SSD = Solid State Drive
HDD = Hard Disk Drive
SSHD = Solid State Hybrid Drive

When I first set up my new system, I made some compromises.  Windows 7 was installed on the SSD.  Power button to usable desktop in 17 seconds!  The big video files I was editing were created, edited, and rendered on the HDD.  I told Windows 7 to put my Documents, Pictures, Music, and Downloads on the HDD.  I installed games and programs on the SSD, but configured them to put downloaded levels and other data on the HDD.  As the SSD filled up, I manually deleted files or moved them to the HDD.  I even turned on drive compression to squeeze a few extra gigabytes out of the SSD.

Surely, there's some way to combine the fast access times of an SSD with the storage capacity of the HDD.

It turns out there is a way and don't call me Shirley.

It turns out there's such a thing as a hybrid drive, combining a small SSD and a large HDD in a single package.  To the computer they're connected to, they look like a single drive.  The SSD might be as small as 4 or 8 GB, which is smaller than the smallest dedicated SSD.  The hybrid drive uses this as a disk cache in different ways depending on the implementation.  In some implementations, drive writes go straight to the HDD.  Drive reads come from the HDD.  But as you read more and more, the drive figures out what files are used most often and moves those to the SSD.  Eventually, for frequently used files at least, you get the read performance of a full SSD.  Writes are HDD speed, however.  In other implementations, all drive writes go to the SSD first.  When the drive is idle, the data is copied to the HDD.  For drive writes, at least those smaller than the size of the SSD, you have full SSD performance.  If you read the data before it's purged, you get SSD read speeds as well.  I'm sure some implementations combine the two in some way, shape, form, or fashion.

It turns out that you don't need a special hybrid drive to get hybrid performance.  There's at least one Windows hardware product that lets you hook up an SSD and a HDD to the same device.  This hardware uses the SSD as a cache for the HDD, much as described above.  There's at least one way to do the same with software in Mac OS, at least until Apple figures out that's cutting into the sales of hybrid drives and disables that utility.  I'm sure Linux has this problem beat all to hell already.

The problem with most of these products, however, is that they use the SSD as a sort of temporary home for data as it moves to and from the HDD.  That's not what I bought an SSD for.  I bought it to store my data quickly.  I don't want to load a program a dozen times and maybe see launch times decrease because some controller finally realized that I'm running it a lot.  I don't want a 120 GB drive to sit idle most of the time, only to be partly filled after I save a project file, then copied over to the HDD.

What I'd like is a product which lets me use my SSD as an SSD.  Only when the SSD starts getting full are old, seldom used files automatically moved over to the HDD.  If I start using the file again, it's automatically copied back to the SSD.  This process seems simple to me.  I don't see why it's so difficult.  I don't see why I need a special part SSD/part HDD drive, a certain Intel motherboard chipset, a chunk of fancy hardware, Mac OS, Linux, or something other than plain old Windows and a couple of standard drives.

It turns out there is a partial solution.  SSD Boost Manager will allow you to move files from your SSD to your HDD.  It will even create symbolic links or junctions or whatever they're called.  This means that as far as you or the operating system or the files themselves are concerned, they're still in the original location.  Furthermore, you can move the files back at any time.  It's great.  It's fine.  It's wonderful.  It's French!  I speak a little French, but generally I am le sucks at it.  I don't want to trust my data to my imperfect understanding of the language.  According to reviews, there's a way to switch the interface to English, but that option doesn't appear to be available in the only version I've been able to locate and download.  Also, the project seems to have been abandoned since 2011, about 4 years ago as of this writing.  The biggest problem is that SSD Boost Manager automates a lot of things, but it's still up to the user to decide what to move from SSD to HDD.  As I mentioned before, the computer is supposed to be making my life easier, not the other way around.

What I'm asking for is a simple, plain, automatic hybrid utility that can move files from the SSD to the HDD and back.  It will sit silently in the background watching the SSD, scanning all files except in folders we've told it to exclude, like maybe the Windows folder because that's probably best left on the SSD anyway.  When the SSD gets filled to a user-specified amount, say 80%, the utility will check the SSD looking for the least used files or the oldest files or the files whose "last accessed" date is oldest or whatever's easiest to program.  Let's call these "stale" files.  Enough of these stale files will be moved to the HDD to bring the SSD back under 80%.  Symbolic links or junctions will be created so that as far as the user is concerned, these stale files are still on the SSD.  If he tries to access a file, he'll be able to in the usual fashion, and he might not even notice that it's loading off the slower HDD.  If a file is accessed like this, it's no longer considered stale.  The utility will at the first opportunity silently move the file back to the SSD and remove the symbolic link.  Optionally, if SSD usage drops below a certain threshold, say the user defines this as 50%, then the freshest of the stale files will be moved back from the HDD to fill the SSD to that level.
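To make the "stale file" idea concrete, here's a rough Python sketch of just the selection step.  It's not the utility itself, only an illustration.  The function names (find_stale_files, pick_files_to_move) and the high_water parameter are made up for this example, and it assumes last access times are actually being recorded on the drive.

```python
import os
from typing import Iterable, List, Tuple


def find_stale_files(root: str, excluded: Iterable[str] = ()) -> List[Tuple[float, int, str]]:
    """Return (last_access_time, size, path) for files under root, stalest first."""
    excluded_dirs = {os.path.normcase(os.path.abspath(p)) for p in excluded}
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded folders in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.normcase(os.path.abspath(os.path.join(dirpath, d))) not in excluded_dirs
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # already moved earlier; only a link lives here
            try:
                st = os.stat(path)
            except OSError:
                continue  # locked or vanished file; skip it
            found.append((st.st_atime, st.st_size, path))
    found.sort()  # oldest access time first
    return found


def pick_files_to_move(candidates, used_bytes: int, total_bytes: int,
                       high_water: float = 0.80) -> List[str]:
    """Pick the stalest files until usage would fall to the high-water mark."""
    to_free = used_bytes - int(total_bytes * high_water)
    chosen = []
    for _atime, size, path in candidates:
        if to_free <= 0:
            break
        chosen.append(path)
        to_free -= size
    return chosen
```

In real use, the used and total byte counts could come from shutil.disk_usage on the SSD, and the excluded list is where something like the Windows folder would go.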

And that's it.  I'm fairly tempted to try to code something like this myself.  Windows 7 ships with the "last access" functionality disabled by default, but you can fix that by using the command fsutil behavior set disablelastaccess 0 with administrator privileges.  You can create symbolic links by using the mklink command.  The hard part would be scanning all the files on a drive and keeping track of which files had been moved, so you'd know where to move them back to.
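The move-and-link step might look something like the Python sketch below.  The names demote and promote and the idea of mirroring the original folder layout under an hdd_root folder are my own assumptions for illustration.  It uses os.symlink rather than shelling out to mklink, and on Windows, creating symbolic links normally needs administrator rights.

```python
import os
import shutil


def demote(ssd_path: str, hdd_root: str) -> str:
    """Move a file to the HDD and leave a symbolic link at its old location."""
    # Mirror the original path under hdd_root, e.g. C:\Games\x -> <hdd_root>\C\Games\x
    drive, tail = os.path.splitdrive(os.path.abspath(ssd_path))
    hdd_path = os.path.join(hdd_root, drive.replace(":", ""), tail.lstrip("\\/"))
    os.makedirs(os.path.dirname(hdd_path), exist_ok=True)
    shutil.move(ssd_path, hdd_path)
    os.symlink(hdd_path, ssd_path)  # the old path still works, just slower
    return hdd_path


def promote(ssd_path: str) -> None:
    """Undo demote(): drop the link and move the real file back to the SSD."""
    if not os.path.islink(ssd_path):
        raise ValueError(f"{ssd_path} is not a symbolic link")
    hdd_path = os.path.realpath(ssd_path)  # where the data actually lives
    os.remove(ssd_path)  # removes only the link, not the data
    shutil.move(hdd_path, ssd_path)
```

A real utility would also have to cope with files that are open or locked at the time, which this sketch ignores.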

Anybody know the best way to keep track of 121,955 files in 13,448 folders?
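One possible answer, and I'm not claiming it's the best one, is a small SQLite database with one row per moved file.  Here's a minimal Python sketch using the standard sqlite3 module.  The table name, column names, and function names are all invented for this example.

```python
import sqlite3


def open_catalog(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the catalog of files that now live on the HDD."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS moved (
               ssd_path    TEXT PRIMARY KEY,  -- where the file used to live
               hdd_path    TEXT NOT NULL,     -- where it lives now
               size        INTEGER NOT NULL,
               last_access REAL NOT NULL      -- access time when it was moved
           )"""
    )
    return conn


def record_move(conn, ssd_path, hdd_path, size, last_access):
    """Remember that a file was moved off the SSD."""
    conn.execute(
        "INSERT OR REPLACE INTO moved VALUES (?, ?, ?, ?)",
        (ssd_path, hdd_path, size, last_access),
    )
    conn.commit()


def lookup(conn, ssd_path):
    """Return the HDD location of a moved file, or None if it was never moved."""
    row = conn.execute(
        "SELECT hdd_path FROM moved WHERE ssd_path = ?", (ssd_path,)
    ).fetchone()
    return row[0] if row else None


def forget(conn, ssd_path):
    """Drop the record once the file is back on the SSD."""
    conn.execute("DELETE FROM moved WHERE ssd_path = ?", (ssd_path,))
    conn.commit()


def freshest(conn, limit):
    """The 'freshest of the stale' files: candidates for moving back first."""
    rows = conn.execute(
        "SELECT ssd_path FROM moved ORDER BY last_access DESC LIMIT ?", (limit,)
    )
    return [r[0] for r in rows]
```

A hundred thousand rows or so is a small table by SQLite standards, and keeping the catalog in a file on disk means the utility would remember where everything went after a reboot.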

Update on June 19, 2015

I've done some more research.  Apple Mac OS X has something called a Fusion Drive which combines an SSD and a HDD into one device that stores data on the SSD part and then moves data back and forth to the HDD part depending on what's most used.  And it turns out that this functionality is part of OS X itself (Core Storage?) and not the hardware, so with a little effort you can do the same with your own separate SSD and HDD devices.

In the Windows world, there's Storage Spaces and tiered storage.  Apparently, in Windows Server 2012 R2 Bacon Ranch Edition, you can set up your SSD and HDD into a single logical drive where the "hot" frequently used data is automatically moved to the SSD while the "cold" infrequently used data is moved to the HDD, all at the sub-file level.  Exactly what I'm looking for.  Unfortunately, data tiering seems to only be available in the server editions, and Storage Spaces is only available in Windows 8 or later.  In fact, I read somewhere that Storage Spaces is "intentionally incompatible" with Windows 7.

Now, I can understand that some programs won't run on some versions of Windows because a feature they depend on isn't available or works differently.  I don't think I can ever understand intentionally writing a program that will only work on one version of Windows when it could work on some/many/all of them.

Speaking of it could work...

I'm about halfway through coding a utility to automatically move files between SSD and HDD depending on usage.  This GFA BASIC 32 program clocks in so far at 240 lines.  The compiled version is 10 KB.  I'm going to say that again: halfway done with this project and I'm at 240 lines of BASIC.  The fact that someone else hasn't done this already is scary.  This is a one-banana problem, people!  I'm tempted to set up some kind of open source project and set a bounty just to see this done by someone who actually knows how to code.