Ceph is an open-source, massively scalable, software-defined storage system which provides object, block and file system storage in a single platform. It runs on commodity hardware, saving you costs and giving you flexibility, and because it's in the Linux kernel, it's easy to consume.
Ceph is able to manage:
Object Storage: Ceph provides seamless access to objects using native language bindings or radosgw, a REST interface that's compatible with applications written for S3 and Swift.
Block Storage: Ceph's RADOS Block Device (RBD) provides access to block device images that are striped and replicated across the entire storage cluster.
File System: Ceph provides a POSIX-compliant network file system that aims for high performance, large data storage, and maximum compatibility with legacy applications (not yet stable)
Whether you want to provide Ceph Object Storage and/or Ceph Block Device services to Cloud Platforms, deploy a Ceph Filesystem or use Ceph for another purpose, all Ceph Storage Cluster deployments begin with setting up each Ceph Node, your network and the Ceph Storage Cluster. A Ceph Storage Cluster requires at least one Ceph Monitor and at least two Ceph OSD Daemons. The Ceph Metadata Server is essential when running Ceph Filesystem clients.
OSDs: A Ceph OSD Daemon (OSD) stores data, handles data replication, recovery, backfilling, rebalancing, and provides some monitoring information to Ceph Monitors by checking other Ceph OSD Daemons for a heartbeat. A Ceph Storage Cluster requires at least two Ceph OSD Daemons to achieve an active + clean state when the cluster makes two copies of your data (Ceph makes 2 copies by default, but you can adjust it).
Monitors: A Ceph Monitor maintains maps of the cluster state, including the monitor map, the OSD map, the Placement Group (PG) map, and the CRUSH map. Ceph maintains a history (called an "epoch") of each state change in the Ceph Monitors, Ceph OSD Daemons, and PGs.
MDSs: A Ceph Metadata Server (MDS) stores metadata on behalf of the Ceph Filesystem (i.e., Ceph Block Devices and Ceph Object Storage do not use MDS). Ceph Metadata Servers make it feasible for POSIX file system users to execute basic commands like ls, find, etc. without placing an enormous burden on the Ceph Storage Cluster.
Ceph stores a client's data as objects within storage pools. Using the CRUSH algorithm, Ceph calculates which placement group should contain the object, and further calculates which Ceph OSD Daemon should store the placement group. The CRUSH algorithm enables the Ceph Storage Cluster to scale, rebalance, and recover dynamically.
Test case
If you want to test with Vagrant and VirtualBox, I've made a Vagrantfile for it running on Debian Wheezy:
```ruby
# -*- mode: ruby -*-
# vi: set ft=ruby :

ENV['LANG'] = 'C'

# Vagrantfile API/syntax version. Don't touch unless you know what you're doing!
VAGRANTFILE_API_VERSION = "2"

# Insert all your VMs with configs
boxes = [
  { :name => :mon1, :role => 'mon' },
  { :name => :mon2, :role => 'mon' },
  { :name => :mon3, :role => 'mon' },
  { :name => :osd1, :role => 'osd', :ip => '192.168.33.31' },
  { :name => :osd2, :role => 'osd', :ip => '192.168.33.32' },
  { :name => :osd3, :role => 'osd', :ip => '192.168.33.33' },
]

$install = <<INSTALL
wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
echo deb http://ceph.com/debian/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
aptitude update
aptitude -y install ceph ceph-deploy openntpd
INSTALL

Vagrant::Config.run do |config|
  # Default box OS
  vm_default = proc do |boxcnf|
    boxcnf.vm.box = "deimosfr/debian-wheezy"
  end

  # For each VM, add a public and private card. Then install Ceph
  boxes.each do |opts|
    vm_default.call(config)
    config.vm.define opts[:name] do |config|
      config.vm.network :bridged, :bridge => "eth0"
      config.vm.host_name = "%s.vm" % opts[:name].to_s
      config.vm.provision "shell", inline: $install

      # Create 8G disk file and add private interface for OSD VMs
      if opts[:role] == 'osd'
        config.vm.network :hostonly, opts[:ip]
        file_to_disk = 'osd-disk_' + opts[:name].to_s + '.vdi'
        config.vm.customize ['createhd', '--filename', file_to_disk, '--size', 8 * 1024]
        config.vm.customize ['storageattach', :id, '--storagectl', 'SATA', '--port', 1,
                             '--device', 0, '--type', 'hdd', '--medium', file_to_disk]
      end
    end
  end
end
```
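You can then bring all the VMs up with the usual Vagrant command:

```
vagrant up
```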
This will spawn VMs with the required hardware and install Ceph on them. After booting those instances, you will have all the Ceph servers ready.
Installation
First node
To get the latest version, we're going to use the official repositories:
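These are the same steps the Vagrantfile above provisions: import the release key, add the Ceph repository for your Debian release and install the packages (openntpd is included because monitors need synchronized clocks):

```
# Import the Ceph release key and add the official repository
wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
echo deb http://ceph.com/debian/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list

# Install Ceph, the deployment tool and an NTP daemon
sudo aptitude update
sudo aptitude -y install ceph ceph-deploy openntpd
```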
A Ceph Monitor maintains maps of the cluster state, including the monitor map, the OSD map, the Placement Group (PG) map, and the CRUSH map. Ceph maintains a history (called an "epoch") of each state change in the Ceph Monitors, Ceph OSD Daemons, and PGs.
Add the first monitor
To create the first monitor node (mon1 here; only the first for now, because ceph-deploy has problems adding the other monitors):
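Roughly, the sequence with ceph-deploy looks like this (a sketch, run from an admin node that can SSH to mon1; exact subcommand behaviour may vary between ceph-deploy versions):

```
# Run from a dedicated working directory on the admin node
ceph-deploy new mon1          # generate the initial ceph.conf and monitor keyring
ceph-deploy install mon1      # install the Ceph packages on mon1
ceph-deploy mon create mon1   # create and start the monitor daemon
ceph-deploy gatherkeys mon1   # fetch the admin and bootstrap keyrings locally
```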
To add the 2 other monitor nodes (mon2 and mon3) to the cluster, you'll need to edit the configuration on a monitor and on the admin node. You'll have to set mon_host, mon_initial_members and public_network in the configuration:
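For example, in the ceph.conf used by your admin node; the values below are placeholders to adapt to your setup:

```
[global]
mon_initial_members = mon1, mon2, mon3
mon_host = <ip of mon1>, <ip of mon2>, <ip of mon3>
public_network = <your public subnet>
```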
Then you need to exchange SSH keys to be able to connect to the target machines remotely.
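For example, from the admin node (hostnames taken from the Vagrantfile above; adjust the user to whatever account ceph-deploy connects with):

```
# Generate a key pair on the admin node if you don't have one yet
ssh-keygen -t rsa

# Push the public key to every monitor and OSD node
for host in mon1 mon2 mon3 osd1 osd2 osd3; do
    ssh-copy-id $host
done
```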
OSD
A Ceph OSD Daemon (OSD) stores data, handles data replication, recovery, backfilling, rebalancing, and provides some monitoring information to Ceph Monitors by checking other Ceph OSD Daemons for a heartbeat. A Ceph Storage Cluster requires at least two Ceph OSD Daemons to achieve an active + clean state when the cluster makes two copies of your data (Ceph makes 2 copies by default, but you can adjust it).
Add an OSD
To deploy a Ceph OSD, we'll first erase the remote disk and create a GPT table on the dedicated disk 'sdb':
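A sketch with ceph-deploy, assuming the OSD hosts are osd1/osd2/osd3 with a dedicated /dev/sdb as in the Vagrantfile (the host:disk syntax may differ slightly between ceph-deploy versions):

```
# Wipe the disks and write a fresh GPT partition table
ceph-deploy disk zap osd1:sdb osd2:sdb osd3:sdb

# Prepare and activate the OSDs on those disks
ceph-deploy osd create osd1:sdb osd2:sdb osd3:sdb
```

You can then check that the new OSDs show up in the CRUSH tree: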
```
> ceph osd tree
# id    weight     type name        up/down  reweight
-1      0.02998    root default
-2      0.009995           host osd1
3       0.009995                   osd.3    up       1
-3      0.009995           host osd3
1       0.009995                   osd.1    up       1
-4      0.009995           host osd2
2       0.009995                   osd.2    up       1
```
Remove an OSD
Removing an OSD is unfortunately not yet integrated in ceph-deploy, so part of it has to be done by hand. First, look at the current status:
```
> ceph osd tree
# id    weight     type name        up/down  reweight
-1      0.03998    root default
-2      0.01999            host osd1
0       0.009995                   osd.0    down     0
3       0.009995                   osd.3    up       1
-3      0.009995           host osd3
1       0.009995                   osd.1    up       1
-4      0.009995           host osd2
2       0.009995                   osd.2    up       1
```
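Here osd.0 is down. A sketch of the usual manual removal sequence for it:

```
# Mark the OSD out so its data gets rebalanced to the other OSDs
ceph osd out 0

# On the OSD host, stop the daemon if it is still running
sudo service ceph stop osd.0

# Remove it from the CRUSH map, delete its key and remove it from the cluster
ceph osd crush remove osd.0
ceph auth del osd.0
ceph osd rm 0
```

Once the rebalancing is done, the OSD no longer appears in the tree: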
```
> ceph osd tree
# id    weight     type name        up/down  reweight
-1      0.02998    root default
-2      0.009995           host osd1
3       0.009995                   osd.3    up       1
-3      0.009995           host osd3
1       0.009995                   osd.1    up       1
-4      0.009995           host osd2
2       0.009995                   osd.2    up       1
```
RBD
To use block devices, you only need a working OSD configuration with a created pool. Nothing else is required :-)
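A minimal sketch (the image name and size are just examples, and the default 'rbd' pool is assumed to exist):

```
# Create a 1 GB image in the 'rbd' pool
rbd create myimage --size 1024

# Map it through the kernel RBD module, then format and mount it
sudo rbd map myimage --pool rbd
sudo mkfs.ext4 /dev/rbd0      # the mapped device usually shows up as /dev/rbd0
sudo mount /dev/rbd0 /mnt
```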
Configuration
OSD
OSD Configuration
Global OSD configuration
The Ceph Client retrieves the latest cluster map, and the CRUSH algorithm calculates how to map the object to a placement group, then how to assign the placement group to a Ceph OSD Daemon dynamically. By default Ceph keeps 2 replicas, and you can change it to 3 by adding these lines to the Ceph configuration:
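For example (osd pool default size is the relevant option; min size is shown as a common companion setting and is an assumption on my part):

```
[global]
# Keep 3 copies of each object instead of 2
osd pool default size = 3
# Still accept I/O when only 1 replica is available (degraded mode)
osd pool default min size = 1
```

New pools will then be created with 3 replicas by default.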
The OSD nodes have 2 network interfaces (private and public). To configure them properly, update the configuration file on your admin machine as follows:
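A sketch, assuming the host-only 192.168.33.0/24 network from the Vagrantfile is the private (cluster) side and the bridged interface is the public one (adjust the public subnet to your own):

```
[osd]
# Client traffic goes over the public network,
# replication and heartbeats over the cluster (private) network
public_network = <your public subnet>
cluster_network = 192.168.33.0/24
```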
To avoid a service restart for a simple modification, you can interact directly with Ceph to change some values at runtime. First of all, you can get all the current values of your Ceph cluster:
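For example, through the admin socket of an OSD (the socket path and the injected option are illustrative):

```
# Dump the full running configuration of osd.1 via its admin socket
sudo ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok config show

# Inject a new value into running daemons without restarting them
ceph tell osd.* injectargs '--osd_max_backfills 1'
```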
To locate the object file on the hard drive, look in this folder (/var/lib/ceph/osd/ceph-1/current). Then take the placement group from the previous result (3.47) and the object hash (af0f2847). The file will be located here:
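Assuming the usual FileStore layout where each placement group has a <pgid>_head directory under the OSD's current folder, something like this should find it:

```
# PG 3.47 and object hash af0f2847 come from the previous output
ls /var/lib/ceph/osd/ceph-1/current/3.47_head/ | grep af0f2847
```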