New Linux kernel (v2.6.34) includes Ceph kernel client
Saturday, May 29, 2010 10:02:11 AM
Ceph is the result of petabyte-scale storage research at the Storage Systems Research Center at the University of California, Santa Cruz. Sage Weil did most of the development on Ceph as part of his Ph.D. thesis there. He and his advisor Scott Brandt, along with Ethan Miller, Darrell Long and Carlos Maltzahn, published the paper "Ceph: A Scalable, High-Performance Distributed File System" in the Proceedings of the 7th Conference on Operating Systems Design and Implementation (OSDI '06), November 2006.
Features:
1. Seamless scaling: A Ceph file system can be expanded by adding storage nodes called Object Storage Devices (OSDs). (Recent distributed file systems are based on object-based storage, in which hard disks are replaced with OSDs that combine a CPU, network interface, and local cache with an underlying disk or RAID.) Ceph migrates data onto new devices in order to maintain a balanced distribution of data. This effectively utilizes all available resources (disk bandwidth and spindles) and avoids data hot spots (e.g., active data residing primarily on old disks while newer disks sit empty and idle).
2. Strong reliability and fast recovery: All data in Ceph is replicated across multiple OSDs. If any OSD fails, data is automatically re-replicated to other devices. However, unlike typical RAID systems, the replicas for data on each disk are spread out among a large number of other disks, and when a disk fails, the replacement replicas are also distributed across many disks. This allows recovery to proceed in parallel (with dozens of disks copying to dozens of other disks), removing the need for explicit “spare” disks (which are effectively wasted until they are needed).
3. Adaptive MDS: The Ceph metadata server (MDS) is designed to dynamically adapt its behavior to the current workload. As the size and popularity of the file system hierarchy change over time, that hierarchy is dynamically redistributed among the available metadata servers in order to balance load and use server resources most effectively. (In contrast, current file systems force system administrators to carve their data set into static “volumes” and assign volumes to servers. Volume sizes and workloads inevitably shift over time, forcing administrators to constantly shuffle data between servers or manually allocate new resources where they are currently needed.) Similarly, if thousands of clients suddenly access a single file or directory, that metadata is dynamically replicated across multiple servers to distribute the workload.
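To make the first two features more concrete, here is a small Python sketch of hash-based data placement. It is not Ceph's code: Ceph uses its own placement algorithm (CRUSH, described in the paper linked below), while this toy uses plain rendezvous hashing as a stand-in. The function name `place`, the OSD names and the object names are all made up for illustration. The sketch shows that adding an OSD moves only a fraction of the objects, and that when an OSD fails, its replacement replicas land on many different surviving OSDs, so recovery can run in parallel.

```python
import hashlib
from collections import Counter


def place(obj, osds, replicas=2):
    """Pick `replicas` OSDs for an object by rendezvous hashing.

    Illustrative stand-in only; Ceph itself uses CRUSH, not this function.
    """
    def score(osd):
        digest = hashlib.sha256(f"{obj}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The OSDs with the highest scores for this object hold its replicas.
    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster: 10 OSDs holding 10,000 objects, 2 replicas each.
objects = [f"obj{i}" for i in range(10000)]
osds = [f"osd{i}" for i in range(10)]
before = {o: place(o, osds) for o in objects}

# Seamless scaling: add one OSD and count the objects whose placement changes.
grown = osds + ["osd10"]
after = {o: place(o, grown) for o in objects}
moved = [o for o in objects if before[o] != after[o]]

# Fast recovery: osd3 fails; see where the replacement replicas end up.
survivors = [d for d in osds if d != "osd3"]
affected = [o for o in objects if "osd3" in before[o]]
new_homes = Counter()
for o in affected:
    for d in place(o, survivors):
        if d not in before[o]:
            new_homes[d] += 1

print(f"objects moved after adding an OSD: {len(moved)} of {len(objects)}")
print(f"objects that lost a replica on osd3: {len(affected)}")
print(f"OSDs receiving replacement replicas: {len(new_homes)}")
```

In this toy model, every object that moves after the expansion gains a replica on the new OSD and nothing else is reshuffled. After the failure, each affected object keeps its surviving replica and gets one new copy on a different OSD, with no dedicated spare disk involved.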
For more information, see the links below.
Useful links:
1. Ceph - Home page
2. Wiki
3. Ceph: A Scalable, High-Performance Distributed File System










