Someone once told me a interesting quote – “data grows to encompass all storage”. Although drives are getting bigger, things we store gets bigger too. For home users, this is probably fine – a 3TB external USB drive just sets you back a $100 or so. However, for enterprise storage, the growing storage is not so simple. We can’t just simply hook up 1000s of USB external drives, and hope for them to work.
Enterprise storage is crazily expensive, probably 10 to 20 times more expensive than commodity USB storage. With that in mind, and future requirements coming in (dropbox anyone?), we have decided to roll our own distributed storage to enable us to meet the computing requirements of the near future.
Our basic idea is simple. Run a distributed file system that provides the backend storage. Multiple services can layer on top of it to provide different services, e.g. NFS, SMB, volume and block storage.
We have decided to go with Ceph, as it can provide both object, block and filesystem storage. Ceph also integrates nicely with OpenStack, providing the block storage layer for OpenStack volumes. This means that a user on SoC cloud can spin up a VM, and attach a separate (bigger) volume (e.g. /dev/vdb) to it. The OS of the VM still remains on the physical machine, which the (bigger) volume is in the more redundant large storage, insulated from any single machine failure.