Friday, July 09, 2010

Notes on Cassandra

I tried out Cassandra.
So far this covers installation and startup only.
I plan to run performance tests next.
If anyone has useful information, please let me know.

Official site
 http://cassandra.apache.org/
Reference
 http://symfoware.blog68.fc2.com/blog-entry-286.html

●Server
 192.168.2.56
 192.168.2.63
 192.168.2.67

●Install
 # wget apache-cassandra-incubating-*.*.*-bin.tar.gz
 # tar zxvf apache-cassandra-incubating-*.*.*-bin.tar.gz
 
 Then follow README.txt:
 # mkdir /var/log/cassandra
 # mkdir /var/lib/cassandra

●Start
 foreground
 # bin/cassandra -f
 daemon
 # bin/cassandra -p /var/run/cassandra.pid
 
●Stop
 kill `cat /var/run/cassandra.pid`
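
The stop step can be wrapped in a small helper so that a missing pid file is reported clearly and the caller knows when the process has really exited. This is only a sketch: the function name stop_cassandra and the 30-second limit are assumptions made for illustration, not something Cassandra ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): stop the daemon that was
# started with "bin/cassandra -p <pidfile>" and wait for it to exit.
stop_cassandra() {
    pidfile=${1:-/var/run/cassandra.pid}

    if [ ! -f "$pidfile" ]; then
        echo "pid file not found: $pidfile" >&2
        return 1
    fi

    pid=$(cat "$pidfile")
    kill "$pid" || return 1    # plain kill (SIGTERM), same as the command above

    # Poll for up to 30 seconds (arbitrary limit) until the process is gone.
    tries=0
    while kill -0 "$pid" 2>/dev/null; do
        tries=$((tries + 1))
        if [ "$tries" -ge 30 ]; then
            echo "process $pid is still running" >&2
            return 1
        fi
        sleep 1
    done
    rm -f "$pidfile"
}
```

Usage would be `stop_cassandra /var/run/cassandra.pid`; the function returns non-zero if the pid file is missing or the process does not exit in time.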

●Insert / Read
 # bin/cassandra-cli --host localhost
 cassandra> set Keyspace1.Standard1['jsmith']['first'] = 'John'
 Value inserted.
 cassandra> set Keyspace1.Standard1['jsmith']['last'] = 'Smith'
 Value inserted.
 cassandra> set Keyspace1.Standard1['jsmith']['age'] = '42'
 Value inserted.
 cassandra> get Keyspace1.Standard1['jsmith']
  (column=last, value=Smith; timestamp=1258206613381)
  (column=first, value=John; timestamp=1258206609055)
  (column=age, value=42; timestamp=1258206616769)
  Returned 3 rows.

※Port
 The JMX port appears to be set in the startup script.
 In ./bin/cassandra.in.sh:
 -Dcom.sun.management.jmxremote.port=8080
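
To check which JMX port a given installation uses, the flag can be pulled out of the startup script. A minimal sketch, assuming the option appears in the file in the form shown above; jmx_port is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the JMX port configured in a startup script.
# Only the first occurrence of the flag is reported.
jmx_port() {
    sed -n 's/.*-Dcom\.sun\.management\.jmxremote\.port=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}
```

For example, `jmx_port ./bin/cassandra.in.sh` prints 8080 for the setting above.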

●Cluster settings
conf/storage-conf.xml

 With the following addresses:
 Own address 192.168.2.56
 Node 1 192.168.2.63
 Node 2 192.168.2.67

# diff -u conf/storage-conf.xml.org conf/storage-conf.xml
--- conf/storage-conf.xml.org 2010-03-25 16:35:02.000000000 +0900
+++ conf/storage-conf.xml 2010-03-25 16:43:46.000000000 +0900
@@ -173,7 +173,9 @@
   ~ the ring. You must change this if you are running multiple nodes!
  -->
  <Seeds>
-     <Seed>127.0.0.1</Seed>
+     <Seed>192.168.2.56</Seed>
+     <Seed>192.168.2.63</Seed>
+     <Seed>192.168.2.67</Seed>
  </Seeds>


@@ -196,7 +198,7 @@
   ~ (hostname, name resolution, etc), and the Right Thing is to use the
   ~ address associated with the hostname (it might not be).
  -->
- <ListenAddress>127.0.0.1</ListenAddress>
+ <ListenAddress>192.168.2.56</ListenAddress>

  <StoragePort>7000</StoragePort>

@@ -210,7 +212,7 @@
   ~ Leaving this blank has the same effect it does for ListenAddress,
   ~ (i.e. it will be based on the configured hostname of the node).
  -->
- <ThriftAddress>127.0.0.1</ThriftAddress>
+ <ThriftAddress>0.0.0.0</ThriftAddress>

  <ThriftPort>9160</ThriftPort>
 
●Perl Module
Net::Cassandra
http://search.cpan.org/~lbrocard/Net-Cassandra/lib/Net/Cassandra.pm

use strict;
use warnings;
use Net::Cassandra;

my $cassandra = Net::Cassandra->new( hostname => 'localhost' );
my $client    = $cassandra->client;

my $key       = '123';
my $timestamp = time;

# Insert one column into Keyspace1 / Standard1 under row key $key.
eval {
    $client->insert(
        'Keyspace1',
        $key,
        Net::Cassandra::Backend::ColumnPath->new(
            { column_family => 'Standard1', column => 'name' }
        ),
        'Leon Brocard',
        $timestamp,
        Net::Cassandra::Backend::ConsistencyLevel::ZERO
    );
};
# On failure, report the reason carried by the exception.
die $@->why if $@;

[gijutsu@server_test_56 ~]$ perl cassandra.pl
Leon Brocard / 1269585801 at cassandra.pl line 36.

●Thrift
 The API can also be called directly through the low-level Thrift interface.
 http://wiki.apache.org/cassandra/ThriftExamples
 http://wiki.apache.org/cassandra/API

▼Install
Installing the required packages on CentOS 5
 http://wiki.apache.org/thrift/GettingCentOS5Packages
 # yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel
 
 $ wget http://www.apache.org/dyn/closer.cgi?path=/incubator/thrift/0.2.0-incubating/thrift-*.*.*-incubating.tar.gz
 $ tar -xvzf thrift-*.*.*-incubating.tar.gz
 $ cd thrift-*.*.*-incubating
 $ ./configure
 $ make
 # make install
 
▼Generate the Perl client
 # cd /path/to/cassandra
 # thrift --gen perl cassandra.thrift
 
 The code is generated under ./gen-perl.


●Backup & Restore

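 ▼Replication is automatic once configured (nothing else to do)
  With Consistency Level = QUORUM and a suitable replication factor,
  QUORUM = replication factor / 2 + 1 (integer division), so with a replication factor of 2 or more each QUORUM write reaches at least two nodes.
  Example: 2 = 2/2 + 1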
  http://wiki.apache.org/cassandra/ArchitectureOverview
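
The QUORUM formula above can be checked for a few replication factors with shell integer arithmetic. This is just the arithmetic from the formula; the function name quorum is made up for the example.

```shell
#!/bin/sh
# QUORUM = replication factor / 2 + 1, using integer division.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

for rf in 1 2 3 5; do
    echo "replication factor $rf -> QUORUM $(quorum "$rf")"
done
```

This prints QUORUM values of 1, 2, 2 and 3 for replication factors 1, 2, 3 and 5. With a replication factor of 1 a quorum is a single node, so the "two or more nodes" property only holds from a replication factor of 2 upward.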
 
 ▼Manual Backup
 http://wiki.apache.org/cassandra/Operations
 >Cassandra can snapshot data while online using nodetool snapshot. You can then back up those snapshots using any desired system, although leaving them where they are is probably the option that makes the most sense on large clusters.
>Currently, only flushed data is snapshotted (not data that only exists in the commitlog). Run nodetool flush first and wait for that to complete, to make sure you get all data in the snapshot.
>To revert to a snapshot, shut down the node, clear out the old commitlog and sstables, and move the sstables from the snapshot location to the live data directory.
>You can get an eventually consistent backup by flushing all nodes and snapshotting; no individual node's backup is guaranteed to be consistent but if you restore from that snapshot then clients will get eventually consistent behavior as usual.

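So, to back up, run the following on every server (older releases ship the tool as nodeprobe; it was later renamed nodetool):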
 # bin/nodeprobe -host 127.0.0.1 flush keyspace_name
 # bin/nodeprobe -host 127.0.0.1 snapshot
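
The flush-then-snapshot sequence could be scripted across the three servers roughly as follows. This is a sketch under some assumptions: it runs from one machine and points -host at each node, which only works if each node's JMX port is reachable from there, and the DRY_RUN switch and function names are invented for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: flush, then snapshot, on each node in turn.
NODEPROBE=${NODEPROBE:-bin/nodeprobe}   # tool path as used above
DRY_RUN=${DRY_RUN:-1}                   # hypothetical switch: 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_node() {
    host=$1
    keyspace=$2
    # Take the snapshot only after the flush has succeeded.
    run "$NODEPROBE" -host "$host" flush "$keyspace" || return 1
    run "$NODEPROBE" -host "$host" snapshot
}

for host in 192.168.2.56 192.168.2.63 192.168.2.67; do
    backup_node "$host" keyspace_name || echo "backup failed on $host" >&2
done
```

Setting DRY_RUN=0 in the environment makes the script execute the commands instead of printing them.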

To delete the snapshots:
 # bin/nodeprobe -host 127.0.0.1 clearsnapshot

To restore:
# service cassandra stop
# rm -fR /var/lib/cassandra/commitlog/*
# rm -fR /var/lib/cassandra/data/keyspace_name/*.db
# mv /var/lib/cassandra/data/keyspace_name/snapshots/snapshot_number/* /var/lib/cassandra/data/keyspace_name/
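
Because the restore steps delete the live data before the snapshot is moved in, it may be worth checking that the snapshot really exists first. The helper below is a hypothetical sketch of that idea: restore_keyspace is not a Cassandra command, it assumes the directory layout used in the commands above, and it assumes the node has already been stopped.

```shell
#!/bin/sh
# Hypothetical helper: restore one keyspace from a named snapshot.
# Assumes the node is already stopped (e.g. "service cassandra stop").
restore_keyspace() {
    base=$1        # e.g. /var/lib/cassandra
    keyspace=$2    # e.g. keyspace_name
    snapshot=$3    # e.g. snapshot_number
    snapdir="$base/data/$keyspace/snapshots/$snapshot"

    # Refuse to delete anything unless the snapshot holds at least one file.
    if [ ! -d "$snapdir" ] || [ -z "$(ls -A "$snapdir")" ]; then
        echo "snapshot not found or empty: $snapdir" >&2
        return 1
    fi

    rm -fR "$base"/commitlog/*                 # clear the old commit log
    rm -fR "$base/data/$keyspace"/*.db         # clear the live sstables
    mv "$snapdir"/* "$base/data/$keyspace/"    # move the snapshot files into place
}
```

A call such as `restore_keyspace /var/lib/cassandra keyspace_name snapshot_number` performs the same three steps as above, but returns non-zero without touching anything if the snapshot directory is missing or empty.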

● Remove Nodes
http://wiki.apache.org/cassandra/Operations
>nodetool decommission to a live node, or nodetool removetoken (to any other machine) to remove a dead one.

● Add Nodes
http://wiki.apache.org/cassandra/Operations
>turn AutoBootstrap on in the configuration file, and start it
>Cassandra does not automatically remove data from nodes that "lose" part of their Token Range to a newly added node.
>Run nodetool cleanup on the source node(s) when you are satisfied the new node is up and working.

▼ Example
[root@server_test_56 ~]# /usr/local/apache-cassandra/bin/nodetool -host 127.0.0.1 ring
Address        Status  Load     Range                                    Ring
                                135232399386442117766962527369711771046
192.168.2.63   Up      2.93 GB  47285341667652056976755595911753436599   |<--|
192.168.2.56   Up      3.21 GB  135232399386442117766962527369711771046  |-->|

Add new node
[root@server_test_67 ~]# vi conf/storage-conf.xml
...
<AutoBootstrap>true</AutoBootstrap>
...
[root@server_test_67 ~]# /etc/init.d/cassandra start

[root@server_test_56 ~]# /usr/local/apache-cassandra/bin/nodetool -host 127.0.0.1 ring
Address        Status  Load     Range                                    Ring
                                135232399386442117766962527369711771046
192.168.2.63   Up      2.93 GB  47285341667652056976755595911753436599   |<--|
192.168.2.240  Up      1.52 GB  91232259285710818461640913214875976980   |   |
192.168.2.56   Up      3.21 GB  135232399386442117766962527369711771046  |-->|

Clean up source node.
[root@server_test_56 ~]# /usr/local/apache-cassandra/bin/nodetool -host 127.0.0.1 cleanup

[root@server_test_56 ~]# /usr/local/apache-cassandra/bin/nodetool -host 127.0.0.1 ring
Address        Status  Load     Range                                    Ring
                                135232399386442117766962527369711771046
192.168.2.63   Up      2.93 GB  47285341667652056976755595911753436599   |<--|
192.168.2.240  Up      1.52 GB  91232259285710818461640913214875976980   |   |
192.168.2.56   Up      1.61 GB  135232399386442117766962527369711771046  |-->|
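
Before running cleanup as in the example above, the ring output can be checked to confirm the new node is listed as Up. A minimal sketch, assuming the column layout shown in the ring listings (address in the first column, status in the second); node_is_up is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "nodetool ring" output on stdin and succeed
# only if the given address is listed with status "Up".
node_is_up() {
    awk -v addr="$1" '
        $1 == addr && $2 == "Up" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

It could be used as `nodetool -host 127.0.0.1 ring | node_is_up 192.168.2.240 && nodetool -host 127.0.0.1 cleanup`, so that cleanup only runs once the new node appears in the ring.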
