ReFS

Re-Implemented FS?

Re-Designed FS?

Resilient FS?

It’s actually all of the above.

ReFS is the new file system that Microsoft’s Steven Sinofsky announced on 16 Jan 2012. Unlike previous attempts by Microsoft to develop a new FS (WinFS, an unholy and aborted RDBMS/OODBMS-as-a-FS that sat on top of NTFS), this represents a much more logical and less insane evolution of file systems. It is reminiscent in some minor ways of the awesome ZFS (which powers my home server, incidentally), developed by the group then called Sun Microsystems (they’re dead, eaten by a Grue/Oracle).

What can it do?

After reading the MS posts about this system I can see 3 huge advantages to using this FS.

1) GREATER INTEGRITY

The metadata that the OS uses is verified with a 64-bit checksum. A checksum is the product of a mathematical operation over a series of data that creates a fairly unique digest of the contents of that data.

Simply put, a checksum is a fairly unique fingerprint of data, which should change if as little as a single bit in the data set is modified. For example, using the SHA1 checksum, the phrase “The name is bond james bond” is digested to “47b5b5cb374b6adf5523aff8b45c742cb03cda48” but “The name is bond James bond” is digested to “a497f3764a972469d89b4d689eeaf71779b8ec7b”. This is used to verify that what was written to a disk is what we intended. This is important because no current storage mechanism is 100% reliable, and silent errors can leave hidden corrupted files that only surface in future.
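For anyone who wants to try the fingerprint idea at home, here is a minimal Python sketch using the standard hashlib module. The helper name sha1_hex is my own, and the exact digests depend on the precise bytes hashed (encoding, quote characters, any trailing newline), so they may not match the values quoted above. The point is only that a one-letter change gives a completely different digest.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of the UTF-8 bytes of text as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Two phrases that differ only in the case of one letter.
lower = sha1_hex("The name is bond james bond")
upper = sha1_hex("The name is bond James bond")

print(lower)
print(upper)
print("digests differ:", lower != upper)
```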

This would function something like this:

[Hold data in RAM] >> [Calculate Checksum of Data: CHK0] >> [Write Data to Disk] >> [Read Data from disk and calculate its Checksum:CHK1]  >> [Compare CHK0 with CHK1, if they differ then attempt to write again]
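The flow above could be sketched in Python roughly as follows. This is my illustration of the idea, not how ReFS actually does it: the function names, the retry count and the choice of SHA-1 are all assumptions, and a real file system does this far below the level of ordinary file I/O.

```python
import hashlib
import os


def checksum(data: bytes) -> str:
    """Digest used as the fingerprint (SHA-1 here purely for illustration)."""
    return hashlib.sha1(data).hexdigest()


def write_verified(path: str, data: bytes, max_attempts: int = 3) -> str:
    """Write data, read it back, and retry if the checksums differ.

    Hypothetical helper: returns the checksum on success and raises
    OSError if every attempt fails verification.
    """
    chk0 = checksum(data)  # CHK0: checksum of the data held in RAM
    for _ in range(max_attempts):
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to the device
        # Note: this read may be served from the OS cache rather than the
        # platters, which is one reason a real FS verifies at a lower level.
        with open(path, "rb") as f:
            chk1 = checksum(f.read())  # CHK1: checksum of what was read back
        if chk0 == chk1:
            return chk0
    raise OSError(f"could not verify {path} after {max_attempts} attempts")
```

Calling write_verified("somefile.bin", b"payload") returns the digest once the read-back matches, so the caller could store it and compare again on later reads.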

Checksumming of the metadata is automatic, but for file contents it is optional and enabled through “integrity streams”. This makes me unhappy, as I prefer to have my data verified as well, so my home server will stick with ZFS.

 

2) SCALABLE

By using B+ trees exclusively, this FS offers scalable storage capacity and can grow while still offering efficient operation times.

This means that limits on file size, volume size and the number of files/directories are now bounded only by 64-bit numbers. The maximum volume size is 1 yobibyte (2^80 bytes) when using 64 KB clusters, and the maximum file size is 16 exbibytes (2^64 bytes). For perspective, a terabyte is roughly 2^40 bytes, so we are talking about limits that shouldn’t affect us for a while (us as a species; I’ll likely be dead by then… as will you).
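To sanity-check those sizes, here is a quick bit of Python arithmetic. It assumes the volume limit comes from a 64-bit cluster count multiplied by the cluster size, which is my reading of the numbers rather than a quote from Microsoft. The variable names are mine.

```python
# Binary size units, in bytes.
TIB = 2**40  # tebibyte
EIB = 2**60  # exbibyte
YIB = 2**80  # yobibyte

CLUSTER_SIZE = 64 * 1024  # 64 KB clusters = 2**16 bytes
MAX_CLUSTERS = 2**64      # assumption: clusters are addressed by a 64-bit number

max_volume_bytes = MAX_CLUSTERS * CLUSTER_SIZE
max_file_bytes = 2**64    # file size bounded by a 64-bit byte count

assert max_volume_bytes == YIB      # 2**64 * 2**16 = 2**80 bytes = 1 YiB
assert max_file_bytes == 16 * EIB   # 2**64 = 16 * 2**60 bytes = 16 EiB

# How many tebibytes fit in the maximum volume: 2**80 / 2**40 = 2**40.
print(max_volume_bytes // TIB)
```

In other words the largest volume holds about a trillion tebibytes, which is why I’m not worried about hitting the limit.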

 

 

3) FOREVER ONLINE

OK, that’s a little bit of bait: if you turn the system off, the disk will not be available. But if you wish to perform a low-level operation (like a disk check) you shouldn’t need to take the volume offline. No more rebooting to fix corruption (which shouldn’t affect ReFS as much anyway; a little bit of irony here).

Where does this fit in?

At the core of Windows, deep in the kernel.

NTFS.SYS = NTFS upper layer API/semantics engine / NTFS on-disk store engine; ReFS.SYS = Upper layer engine inherited from NTFS / New on-disk store engine

Above is the image, produced by Microsoft, representing the change in FS features. In case you haven’t noticed, it is as clear as tar and low on details (what is the bump? Shared features to be relocated? Implementation bloat? A collection of ugly hacks?).

The blue blocks represent API calls and logic for software to consume the features offered by this FS driver. Much like with NTFS.sys, I would think that these calls are not to be consumed by end programmers but instead by the OS kernel, which abstracts them into programmer API calls (http://technet.microsoft.com/en-us/library/cc781134(WS.10).aspx). If performed correctly, this migration should be transparent to application software.

 

NTFS Architecture

 

The red blocks are where all the exciting changes take place. For example, a blue box command called writeBytes() could be the same in both NTFS and ReFS, but the red block logic for NTFS could simply write the data to a free area, while the ReFS implementation could be similar to the write function previously proposed in this article which (for the lazy) is:

[Hold data in RAM] >> [Calculate Checksum of Data: CHK0] >> [Write Data to Disk] >> [Read Data from disk and calculate its Checksum:CHK1] >> [Compare CHK0 with CHK1, if they differ then attempt to write again]

Then why is the blue box bigger than the red box?

The blue box commands each abstract a combination of common red box commands to achieve a goal, e.g. a red box “open a directory” command could be combined with a red box “list files/directories” function to perform a blue box search function.
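As a toy illustration of that layering, here is a Python sketch in which a “blue box” search is built purely from two “red box” primitives. Everything here is hypothetical: the class InMemoryStore, its methods open_directory and list_entries, and the search function are names I made up, and the real driver interfaces look nothing like this.

```python
from typing import Dict, List


class InMemoryStore:
    """Stand-in for a 'red box' store engine, backed by a dict of directories."""

    def __init__(self, tree: Dict[str, List[str]]) -> None:
        self._tree = tree  # maps a directory path to the names it contains

    def is_directory(self, path: str) -> bool:
        return path in self._tree

    def open_directory(self, path: str) -> str:
        """Red box primitive: validate a directory and return a handle to it."""
        if path not in self._tree:
            raise FileNotFoundError(path)
        return path

    def list_entries(self, handle: str) -> List[str]:
        """Red box primitive: list the names inside an opened directory."""
        return list(self._tree[handle])


def search(store: InMemoryStore, root: str, name: str) -> List[str]:
    """Blue box operation: find every path called name below root."""
    matches = []
    pending = [root]
    while pending:
        handle = store.open_directory(pending.pop())
        for entry in store.list_entries(handle):
            path = handle.rstrip("/") + "/" + entry
            if entry == name:
                matches.append(path)
            if store.is_directory(path):
                pending.append(path)  # descend into subdirectories later
    return sorted(matches)
```

Swapping InMemoryStore for a different engine with the same two primitives would leave search untouched, which is the whole appeal of keeping the blue box stable while the red box changes underneath.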

What’s the catch?

Well, there are 3.

1) This is untested

This is a big drawback. Although I’m sure MS have tested every aspect of this system with dog-fooding, fuzzing and unit tests, there is still the risk that some aspect of this FS is not implemented 100% correctly and will kill your data and eat your children if given the chance!

If MS want me to trust this they should move all of their internal systems and source code repos to ReFS and inform the world what happened 2 years later… This won’t happen, so I will wait until all the bugs have been repaired.

2) Windows Server 8 only and not backwards compatible

A ReFS volume cannot be accessed by Windows 7/2008 R2 or earlier, so if your one Win 8 server dies you may have to wait for a reinstall to get essential data off your SAN. If this delays the payroll processing for your company you had better blame a virus, or there may be an old-fashioned angry mob at your door with cacti.

3) Not default and non-bootable

This FS is not the default in Windows Server 8 and cannot be booted from. This says a lot about Microsoft’s confidence in their new baby.

More potential issues

It appears that MS will be dropping some NTFS features. Of these, 2 concern me most: EFS and hard links.

EFS

EFS is the Encrypting File System that has been incorporated into NTFS since Windows 2000 and is used to secure data in corporate and personal environments (NTFS permissions mean nothing if the ‘protected’ disk is accessed with a live Linux disk or another Windows install, as they are easily overcome). I think this feature may be the bump in the red box.

I hope that the companion Storage Spaces feature or some other userland tool will add support for EFS, as this is an essential feature for backwards compatibility. That said, BitLocker seems to be the way of the future for Windows encryption, so this could be the end of EFS (you will be moving files to the new volume anyway).

HARD LINKS

Removing hard links removes POSIX compatibility. This could mean that systems such as Cygwin may no longer function on a ReFS volume, which may be bad for academic workstations that need GCC and other such UNIX tools in Windows (this is essential for system modelling with SPIN).

OTHER REMOVALS

There are some other features being removed, but I personally am not worried, as userland tools could easily replace these functions (and compression is less relevant in the time of 3 TB hard disks).

Q) What semantics or features of NTFS are no longer supported on ReFS?

The NTFS features we have chosen to not support in ReFS are: named streams, object IDs, short names, compression, file level encryption (EFS), user data transactions, sparse, hard-links, extended attributes, and quotas.

 

More info:

http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx


