


Ceph is a clustered storage system that uses Object Storage Devices (OSDs), accessed over a TCP/IP network, as its backing store. Ceph uses the CRUSH algorithm, together with the RADOS cluster map distribution algorithm, to determine where a client's data is placed in the storage available on the cluster. CRUSH and RADOS offer an automated way of providing fault-tolerant and scalable storage.

Most of the code that implements Ceph is found in user space, but there are three major components of Ceph implemented in the Linux kernel: libceph, kRBD, and CephFS. libceph, the Ceph library, implements common features required by Ceph system components in the kernel, and includes the kernel implementation of the Ceph messenger. kRBD is the kernel implementation of the RADOS Block Device. It presents a block device interface (e.g., /dev/rbd1) that is backed by Ceph OSD storage. Finally, CephFS is a native Linux file system with POSIX user semantics, which provides Ceph's fault tolerance and scalability.

Ceph Cleanup Tasks

As with all software, the Ceph code has portions that could benefit from some cleanup. Below is a list of tasks of varying sizes, as a starting point.

* In the OSD client, the function handle_reply() never uses its con argument, so it can (should) be removed. (Trivial)

  • DONE by Ioana Ciornei: 8a703a3 libceph: remove con argument in handle_reply()

* In the OSD client, the macro osd_req_op_data() is not defined safely. It evaluates its parameters more than once. (Easy)

  • DONE by Shraddha Barke: 5ca1346 libceph: evaluate osd_req_op_data() arguments only once
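To illustrate the kind of problem described in the osd_req_op_data() task, here is a minimal sketch of a macro that evaluates an argument twice, next to a variant that evaluates each argument once. The macro names, next_index() and the counting helpers are hypothetical and exist only for this example; this is not the actual osd_req_op_data() code. The safe variant assumes the GNU C statement-expression and __typeof__ extensions (GCC or Clang).

```c
#include <assert.h>

/* Hypothetical unsafe macro: "idx" appears twice in the expansion. */
#define CLAMP_INDEX_UNSAFE(idx, max) ((idx) < (max) ? (idx) : (max))

/*
 * Hypothetical safe variant: each argument is copied into a local
 * exactly once, so side effects in the arguments happen only once.
 * Relies on GNU C statement expressions.
 */
#define CLAMP_INDEX_SAFE(idx, max) ({           \
        __typeof__(idx) idx_ = (idx);           \
        __typeof__(max) max_ = (max);           \
        idx_ < max_ ? idx_ : max_;              \
})

static int calls;

/* An argument with a side effect: counts how often it is evaluated. */
static int next_index(void)
{
        return ++calls;
}

/* Returns how many times the unsafe macro evaluated its first argument. */
static int count_evaluations_unsafe(void)
{
        calls = 0;
        (void)CLAMP_INDEX_UNSAFE(next_index(), 10);
        return calls;
}

/* Returns how many times the safe macro evaluated its first argument. */
static int count_evaluations_safe(void)
{
        calls = 0;
        (void)CLAMP_INDEX_SAFE(next_index(), 10);
        return calls;
}

/* Self-check: the unsafe form runs next_index() twice, the safe form once. */
static void sketch_self_check(void)
{
        assert(count_evaluations_unsafe() == 2);
        assert(count_evaluations_safe() == 1);
}
```

With the unsafe form, the value that is compared and the value that is returned come from two different calls, which is the sort of surprise a single-evaluation macro avoids.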

* In the Ceph messenger, the name of the field out_kvec_left could be changed to be a little clearer about the purpose it serves. (Trivial, but hard if you want to understand the reason for the suggested change.)

* read_partial_msg_data() could use its local variable cursor in several places instead of &msg->cursor. (Trivial change, but you must ensure it's correct.)

  • DONE by Shraddha Barke: 621a56f libceph: use local variable cursor instead of &msg->cursor

* In the Ceph messenger, there are several blocks of almost identical code that could be factored out into a helper function. The code in question grabs a connection's incoming message (if any), replacing the in_msg pointer with NULL if needed. (Simple refactoring, but nontrivial.)
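The shape of such a helper might look like the sketch below. Every type and function name here (sketch_msg, sketch_con, sketch_con_take_in_msg() and so on) is hypothetical, and the reference count is a plain integer stand-in. The real messenger code also has locking and sanity checks that this sketch leaves out, so treat it as an outline of the refactoring rather than a patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the messenger's message and connection. */
struct sketch_msg {
        int refs;                       /* simplified reference count */
};

struct sketch_con {
        struct sketch_msg *in_msg;      /* incoming message, or NULL */
};

/* Drop one reference to a message (simplified, no freeing). */
static void sketch_msg_put(struct sketch_msg *msg)
{
        assert(msg->refs > 0);
        msg->refs--;
}

/*
 * The factored-out helper: detach the connection's incoming message,
 * leaving in_msg NULL, and hand the message (or NULL) to the caller.
 */
static struct sketch_msg *sketch_con_take_in_msg(struct sketch_con *con)
{
        struct sketch_msg *msg = con->in_msg;

        con->in_msg = NULL;
        return msg;
}

/* One example call site; others would follow the same two steps. */
static void sketch_con_reset(struct sketch_con *con)
{
        struct sketch_msg *msg = sketch_con_take_in_msg(con);

        if (msg)
                sketch_msg_put(msg);
}
```

Returning the detached message, instead of dropping it inside the helper, lets each caller decide what to do with it. Whether that split suits the existing call sites is something to check against the real code.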

* The Ceph messenger abstracts types of data that can be carried over a connection (page list, page array, or bio). A message has a list of data items making up what's being sent or received, and the message has a cursor to keep track of which is the current data item and how much of that data item has already been sent or received. Each data item type defines three functions:

  • cursor_init() initializes a cursor for a message;

  • data_next() returns a page pointer, and an offset and length of the next bytes of the message to be transferred; and

  • data_advance() advances the cursor to reflect that some number of message bytes have been consumed (sent or received).

With some work, these functions could be made more object-oriented, defining a data item type to include a block of these function pointers. (Difficult; this involves some pretty advanced refactoring.)
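As a rough illustration of the idea, the sketch below bundles the three functions into a table of function pointers and has generic code call through that table. All names (sketch_data_ops, sketch_cursor, the flat_ functions) are hypothetical. To stay short, it uses a flat byte buffer as the only data item type, and data_next() returns a plain pointer and length where the messenger returns a page pointer, offset and length.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_cursor;

/* Hypothetical per-type operations block. */
struct sketch_data_ops {
        void (*cursor_init)(struct sketch_cursor *cur,
                            const void *data, size_t len);
        const void *(*data_next)(struct sketch_cursor *cur, size_t *len);
        void (*data_advance)(struct sketch_cursor *cur, size_t bytes);
};

struct sketch_cursor {
        const struct sketch_data_ops *ops;  /* selects the data item type */
        const unsigned char *base;
        size_t offset;                      /* bytes already consumed */
        size_t resid;                       /* bytes still to transfer */
};

/* "Flat buffer" data item type, used here in place of pages or bios. */
static void flat_cursor_init(struct sketch_cursor *cur,
                             const void *data, size_t len)
{
        cur->base = data;
        cur->offset = 0;
        cur->resid = len;
}

static const void *flat_data_next(struct sketch_cursor *cur, size_t *len)
{
        *len = cur->resid;
        return cur->resid ? cur->base + cur->offset : NULL;
}

static void flat_data_advance(struct sketch_cursor *cur, size_t bytes)
{
        assert(bytes <= cur->resid);
        cur->offset += bytes;
        cur->resid -= bytes;
}

static const struct sketch_data_ops flat_ops = {
        .cursor_init  = flat_cursor_init,
        .data_next    = flat_data_next,
        .data_advance = flat_data_advance,
};

/*
 * Generic code: consumes the data in chunks of at most max_chunk bytes
 * without knowing the data item type. Returns the number of chunks.
 */
static size_t sketch_transfer(struct sketch_cursor *cur, size_t max_chunk)
{
        size_t steps = 0;
        size_t len;

        assert(max_chunk > 0);
        while (cur->ops->data_next(cur, &len) != NULL) {
                cur->ops->data_advance(cur, len < max_chunk ? len : max_chunk);
                steps++;
        }
        return steps;
}
```

A caller would set the cursor's ops pointer to the table for its data item type, call cursor_init() through it, and then rely on generic code such as sketch_transfer(). Adding another data item type would then mean adding another ops table, without touching the generic code.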


last edited 2016-01-20 16:08:02 by AlexElder