== Ceph ==

[[http://ceph.com/|Ceph]] is a clustered storage system that uses Object
Storage Devices (OSDs) accessed over a TCP/IP network as backing store.
Ceph uses the CRUSH algorithm and RADOS cluster map distribution algorithm
to determine how data on a client is placed in storage available on the
cluster.  CRUSH and RADOS offer an automated way of providing fault-tolerant
and scalable storage.

Most of the code that implements Ceph is found in user space, but there
are three major components of Ceph implemented in the Linux kernel:
libceph, kRBD, and CephFS.  libceph, the Ceph library, implements common
features required by Ceph system components in the kernel, and includes
the kernel implementation of the Ceph messenger.  kRBD is the kernel
implementation of the RADOS Block Device.  It presents a block device
interface (e.g., /dev/rbd1) which is backed by Ceph OSD storage.
Finally, CephFS is a native Linux file system with POSIX user
semantics, which provides Ceph's fault tolerance and scalability.

=== Ceph Cleanup Tasks ===

As with all software, the Ceph code has portions that could benefit from
some cleanup.  Below is a list of tasks of varying sizes, as a starting
point.

* In the OSD client, the function {{{handle_reply()}}} never uses its {{{con}}}
argument, so the argument can (and should) be removed. (Trivial)
 '''DONE''' by Ioana Ciornei:  
 8a703a3 libceph: remove con argument in handle_reply()

* In the OSD client, the macro {{{osd_req_op_data()}}} is not defined safely:
it evaluates its parameters more than once, so an argument with side
effects would misbehave. (Easy)
 '''DONE''' by Shraddha Barke:  
 5ca1346 libceph: evaluate osd_req_op_data() arguments only once
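
To illustrate why multiple evaluation matters, here is a minimal sketch in user-space C. All names ({{{demo_req}}}, {{{DEMO_OP_UNSAFE}}}, {{{DEMO_OP_SAFE}}}, {{{demo_next()}}}) are hypothetical stand-ins and not the kernel's definitions. The safe variant relies on a GNU C statement expression, which is an assumption about the compiler; it is one common way to evaluate each macro argument exactly once, not necessarily the approach taken in the commit above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a request with an array of operations. */
struct demo_op {
	int payload;
};

struct demo_req {
	unsigned int num_ops;
	struct demo_op ops[4];
};

/* Unsafe: 'req' and 'which' each appear twice in the expansion. */
#define DEMO_OP_UNSAFE(req, which) \
	((which) < (req)->num_ops ? &(req)->ops[(which)] : NULL)

/*
 * Safer: copy each argument into a local variable exactly once.
 * The ({ ... }) form is a GNU C extension (statement expression).
 */
#define DEMO_OP_SAFE(req, which)					\
({									\
	struct demo_req *__req = (req);					\
	unsigned int __which = (which);					\
	__which < __req->num_ops ? &__req->ops[__which] : NULL;		\
})

/* Argument with a side effect: returns *idx, then increments it. */
static unsigned int demo_next(unsigned int *idx)
{
	return (*idx)++;
}
```

With the unsafe macro, passing {{{demo_next(&i)}}} as the index increments {{{i}}} twice and checks the bound against a different index than the one actually used; with the safe macro, {{{i}}} is incremented once.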

* In the Ceph messenger, the name of the field {{{out_kvec_left}}} could be
changed to be a little clearer about the purpose it serves. (Trivial,
but hard if you want to understand the reason for the suggested change.)

* {{{read_partial_msg_data()}}} could use its local variable {{{cursor}}}
in several places instead of {{{&msg->cursor}}}.  (Trivial change, but
you must ensure it's correct.)
 '''DONE''' by Shraddha Barke:  
 621a56f libceph: use local variable cursor instead of &msg->cursor

* In the Ceph messenger, there are several nearly identical blocks of
code that could be factored out into a helper function.  The code in
question grabs a connection's incoming message (if any), replacing the
{{{in_msg}}} pointer with NULL if needed.  (Simple refactoring, but
nontrivial.)
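
The shape of such a helper might look like the sketch below. It is written as stand-alone user-space C with hypothetical types and names ({{{demo_con}}}, {{{demo_msg}}}, {{{demo_con_take_in_msg()}}}); the real messenger code also has to deal with locking and message reference counts, which are deliberately left out here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a message and a connection. */
struct demo_msg {
	int id;
};

struct demo_con {
	struct demo_msg *in_msg;	/* incoming message, or NULL */
};

/*
 * Detach and return the connection's incoming message, if any.
 * After the call, con->in_msg is always NULL.
 */
static struct demo_msg *demo_con_take_in_msg(struct demo_con *con)
{
	struct demo_msg *msg = con->in_msg;

	con->in_msg = NULL;
	return msg;
}
```

Each of the duplicated blocks would then reduce to a single call, followed by whatever the caller does with the returned message.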

* The Ceph messenger abstracts types of data that can be carried over
a connection (page list, page array, or bio).  A message has a list of
data items making up what's being sent or received, and the message
has a cursor to keep track of which is the current data item and how
much of that data item has already been sent or received.  Each data
item type defines three functions:
    * {{{cursor_init()}}} initializes a cursor for a message;
    * {{{data_next()}}} returns a page pointer, and an offset and length of the next bytes of the message to be transferred; and
    * {{{data_advance()}}} advances the cursor to reflect that some number of message bytes have been consumed (sent or received).
With some work, these functions could be made more object-oriented,
defining a data item type to include a block of these function
pointers. (Difficult: some pretty advanced refactoring.)
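
As a rough idea of what "a block of function pointers" could look like, here is a minimal user-space sketch. Everything in it is hypothetical: the struct and function names, the {{{DEMO_CHUNK}}} size standing in for a page size, and the single "flat buffer" data type standing in for the real page list, page array, and bio types. It is meant to show the pattern, not the messenger's actual data structures.

```c
#include <stddef.h>

#define DEMO_CHUNK 4096	/* stand-in for a page size (assumption) */

struct demo_cursor;

/* One table of operations per data item type. */
struct demo_data_ops {
	void (*cursor_init)(struct demo_cursor *c, size_t length);
	const char *(*data_next)(struct demo_cursor *c,
				 size_t *offset, size_t *length);
	void (*data_advance)(struct demo_cursor *c, size_t bytes);
};

struct demo_cursor {
	const struct demo_data_ops *ops;	/* selects the data type */
	const char *buf;	/* backing data for this item */
	size_t pos;		/* bytes already consumed */
	size_t resid;		/* bytes still to transfer */
};

static void flat_cursor_init(struct demo_cursor *c, size_t length)
{
	c->pos = 0;
	c->resid = length;
}

/* Returns the start of the current chunk, plus offset/length within it. */
static const char *flat_data_next(struct demo_cursor *c,
				  size_t *offset, size_t *length)
{
	size_t room = DEMO_CHUNK - c->pos % DEMO_CHUNK;

	*offset = c->pos % DEMO_CHUNK;
	*length = c->resid < room ? c->resid : room;
	return c->buf + (c->pos - *offset);
}

static void flat_data_advance(struct demo_cursor *c, size_t bytes)
{
	c->pos += bytes;
	c->resid -= bytes;
}

static const struct demo_data_ops flat_data_ops = {
	.cursor_init = flat_cursor_init,
	.data_next = flat_data_next,
	.data_advance = flat_data_advance,
};

/* Generic caller: consumes a whole item without knowing its type. */
static size_t demo_consume_all(struct demo_cursor *c, size_t length)
{
	size_t steps = 0;

	c->ops->cursor_init(c, length);
	while (c->resid > 0) {
		size_t offset, len;

		c->ops->data_next(c, &offset, &len);
		c->ops->data_advance(c, len);
		steps++;
	}
	return steps;
}
```

The point of the pattern is that {{{demo_consume_all()}}} only goes through {{{c->ops}}}, so adding another data type would mean adding another ops table rather than another branch in the generic code.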