The Deduplicating Warp-speed Advanced Read-only File System.

A fast, high-compression read-only file system for Linux and Windows.

DwarFS is a read-only file system with a focus on achieving very high compression ratios, in particular for very redundant data.

This probably doesn't sound very exciting, because if it's redundant, it should compress well. However, I found that other read-only, compressed file systems don't do a very good job at making use of this redundancy. See here for a comparison with other compressed file systems.

DwarFS also doesn't compromise on speed. For my use cases, I've found it to be on par with or faster than SquashFS. For my primary use case, DwarFS compression is an order of magnitude better than SquashFS compression, building the file system is 6 times faster, accessing files on DwarFS is typically faster, and it uses fewer CPU resources.

Features
- Clustering of files by similarity using a similarity hash function, which makes it easier to exploit the redundancy across file boundaries
- Documentation available
- Segmentation analysis across file system blocks in order to reduce the size of the uncompressed file system
- Categorization framework to categorize files or even fragments of files and then process individual categories differently
- Highly multi-threaded implementation
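
To give a rough idea of what clustering files by similarity means, here is a minimal sketch in Python. It is not the DwarFS implementation: the SimHash-style fingerprint, the greedy ordering, and the names `simhash`, `hamming` and `order_by_similarity` are all assumptions made up for illustration. The point is only that files with similar content get similar fingerprints, so they can be placed next to each other before compression.

```python
import hashlib


def simhash(data: bytes, ngram: int = 4, bits: int = 64) -> int:
    """Toy SimHash fingerprint (illustrative, not the hash DwarFS uses).

    Similar inputs tend to produce fingerprints that differ in few bits.
    """
    counts = [0] * bits
    for i in range(max(1, len(data) - ngram + 1)):
        # Hash each byte n-gram to a `bits`-wide integer.
        digest = hashlib.blake2b(data[i:i + ngram], digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Every n-gram votes +1 or -1 on each bit position.
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    # A bit is set in the fingerprint if the majority voted for it.
    return sum(1 << b for b in range(bits) if counts[b] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def order_by_similarity(files: dict[str, bytes]) -> list[str]:
    """Greedy ordering: start with the alphabetically first file, then
    repeatedly append the remaining file whose fingerprint is closest
    to the one placed last."""
    prints = {name: simhash(data) for name, data in files.items()}
    remaining = sorted(prints)
    order = [remaining.pop(0)]
    while remaining:
        last = prints[order[-1]]
        nearest = min(remaining, key=lambda name: hamming(prints[name], last))
        remaining.remove(nearest)
        order.append(nearest)
    return order
```

Calling `order_by_similarity` with a mapping of file names to contents returns the names in an order where near-duplicates tend to end up adjacent. A compressor that sees the files in that order has a better chance of finding the shared data within its window, which is the redundancy across file boundaries mentioned above.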