Hadoop’s HBase with Java API – ||

Configuring standalone HBase on CentOS :-
Prerequisites if you are working from a Windows machine:
a) PuTTY
b) WinSCP
1) First download the Linux-based HBase release package :-
a) file to be downloaded – hbase-0.94.24.tar.gz
b) using the following link – http://mirror.metrocast.net/apache/hbase/

2) Copy the downloaded package to a folder of your own.
For example, I copied it to
/root/hadoop/
using WinSCP.
3) Now extract the archive using the following command :-
tar xvfz hbase-0.94.24.tar.gz
This command extracts the archive and lets you access the bin, conf and other folders inside it.
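The copy-and-extract steps can also be wrapped in a small shell function. This is only a sketch under assumptions: the function name unpack_hbase is hypothetical, the tarball is assumed to be on the machine already, and the extracted folder is assumed to be named after the tarball (as it is for hbase-0.94.24.tar.gz).

```shell
#!/bin/sh
# Hypothetical helper: unpack an HBase tarball into a target directory
# and print the path of the extracted folder.
unpack_hbase() {
    tarball="$1"      # e.g. /root/hadoop/hbase-0.94.24.tar.gz
    target_dir="$2"   # e.g. /root/hadoop
    [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
    mkdir -p "$target_dir" || return 1
    # x = extract, z = gunzip, f = read from file, -C = extract into target_dir
    tar xzf "$tarball" -C "$target_dir" || return 1
    # Assumption: the extracted folder is the tarball name minus ".tar.gz"
    echo "$target_dir/$(basename "$tarball" .tar.gz)"
}
```

For example, unpack_hbase /root/hadoop/hbase-0.94.24.tar.gz /root/hadoop would print /root/hadoop/hbase-0.94.24 once the archive is extracted.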
4) Now check whether Java is installed on that machine using :-
readlink -f $(which java)
If it is not, install a JDK first, since you are required to set the JAVA_HOME environment variable before starting HBase.
5) If you get an error indicating that Java is not installed even though it is on your system, it is probably in a non-standard location. Prior to 0.98.5, HBase attempted to detect the location of Java if the variable was not set. You can set the variable via your operating system’s usual mechanism, but HBase provides a central mechanism, conf/hbase-env.sh. Edit this file, uncomment the line starting with JAVA_HOME, and set it to the appropriate location for your operating system.
What I wrote looks like :-
export JAVA_HOME=/home/installs/jdk1.7.0_51
This way you are setting JAVA_HOME just for the HBase system.
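If you are unsure what to put in JAVA_HOME, one option is to derive it from the path that readlink prints. A minimal sketch, assuming the resolved path ends in /bin/java; the function name java_home_from_binary is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: turn the resolved path of the java binary into a
# JAVA_HOME candidate by dropping the trailing "/bin/java".
java_home_from_binary() {
    java_bin="$1"                # e.g. /home/installs/jdk1.7.0_51/bin/java
    echo "${java_bin%/bin/java}" # POSIX suffix removal
}

# Typical use on the machine itself:
#   java_home_from_binary "$(readlink -f "$(which java)")"
```

The printed directory is the value to write after export JAVA_HOME= in conf/hbase-env.sh.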

6) Edit conf/hbase-site.xml, which is the main HBase configuration file. You only need to specify the directory on the local filesystem where HBase and ZooKeeper write data. By default, a new directory is created under /tmp.
Editing this file means writing the following snippet, where $hbase_rootdir stands for a directory of your choice :-
<configuration>

<property>
<name>hbase.rootdir</name>
<value>file:///$hbase_rootdir/hbase-${user.name}/hbase</value>
</property>

<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/testuser/zookeeper</value>
</property>
</configuration>
Skipping this configuration does not mean your system won’t work: even if you leave the configuration tag empty, the file is complete and HBase falls back to a directory under /tmp. However, many servers are configured to delete the contents of /tmp upon reboot, so it is always advised to store data elsewhere.
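The same file can be generated from the shell instead of being typed by hand. This is a sketch, not the only way to do it: write_hbase_site is a hypothetical function name, and the data directories are passed in as arguments (for example /home/testuser/hbase, which is an assumed path, and /home/testuser/zookeeper from the snippet above).

```shell
#!/bin/sh
# Hypothetical helper: write a minimal standalone hbase-site.xml.
write_hbase_site() {
    conf_file="$1"   # e.g. conf/hbase-site.xml
    root_dir="$2"    # absolute local directory for HBase data
    zk_dir="$3"      # absolute local directory for ZooKeeper data
    # Unquoted EOF so that $root_dir and $zk_dir are expanded
    cat > "$conf_file" <<EOF
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$root_dir</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$zk_dir</value>
  </property>
</configuration>
EOF
}
```

Calling write_hbase_site conf/hbase-site.xml /home/testuser/hbase /home/testuser/zookeeper overwrites the existing file, so keep a backup if you have already edited it.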

7) Now it is time to start HBase. From the bin directory, run the following command :-
sh start-hbase.sh
You’ll get the following output :-

starting master, logging to /root/hadoop/hbase-0.94.24/bin/../logs/hbase-root-master-psd058.psnet.com.out

8) Now run the command :-
jps
which gives the output :-

5945 HMaster
6168 Jps

This verifies that you have one running process called HMaster. In standalone mode HBase runs all daemons within this single JVM, i.e. the HMaster, a single HRegionServer, and the ZooKeeper daemon.
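If you want to check this from a script rather than by eye, the jps output can be filtered. A small sketch, assuming jps prints one "pid name" pair per line as shown above; hmaster_running is a hypothetical name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if jps-style output (read from stdin)
# lists a process whose name is exactly HMaster.
hmaster_running() {
    # Compare the second field of each "<pid> <name>" line;
    # exit status 0 if found, 1 otherwise
    awk '$2 == "HMaster" { found = 1 } END { exit !found }'
}

# Typical use:
#   jps | hmaster_running && echo "HBase is up"
```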
Use HBase for the first time
Connect to HBase.
From the bin directory, run :-
sh hbase shell
or, from the HBase installation directory :-
./bin/hbase shell
The HBase shell prompt appears :-
hbase(main):001:0>

  1. Display HBase Shell Help Text.
    Type help and press Enter to display some basic usage information for HBase Shell, as well as several example commands. Notice that table names, rows and columns must all be enclosed in quote characters.
  2. Create a table.
    Use the create command to create a new table. You must specify the table name and the ColumnFamily name.
    hbase> create 'pstest', 'first_column'
    0 row(s) in 1.2200 seconds
  3. List information about your table.
    Use the list command to confirm that your table exists.
    hbase> list 'pstest'
    TABLE
    pstest
    1 row(s) in 0.0350 seconds

    => ["pstest"]
  4. Put data into your table.
    To put data into your table, use the put command.
    hbase(main):003:0> put 'pstest', 'row1', 'first_column:a', 'people'
    0 row(s) in 0.0830 seconds
    hbase(main):006:0> put 'pstest', 'row1', 'first_column:b', 'strong'
    0 row(s) in 0.0030 seconds
    hbase(main):014:0> put 'pstest', 'row2', 'first_column:b', 'works'
    0 row(s) in 0.0030 seconds
    Here, we insert three values, one at a time. The first insert is at row1, column first_column:a, with a value of 'people'. Columns in HBase are composed of a column family prefix, first_column in this example, followed by a colon and then a column qualifier suffix, a in this case.
  5. Scan the table for all data at once.
    One of the ways to get data from HBase is to scan. Use the scan command to scan the table for data. You can limit your scan, but for now, all data is fetched.
    hbase(main):015:0> scan 'pstest'
    ROW COLUMN+CELL
    row1 column=first_column:a, timestamp=1418299874020, value=people
    row1 column=first_column:b, timestamp=1418299901536, value=strong
    row2 column=first_column:b, timestamp=1418300237140, value=works
    2 row(s) in 0.0170 seconds
  6. Get a single row of data.
    hbase(main):018:0> get 'pstest', 'row1'
    COLUMN CELL
    first_column:a timestamp=1418299874020, value=people
    first_column:b timestamp=1418299901536, value=strong
    2 row(s) in 0.0200 seconds
  7. Disable a table.
    If you want to delete a table or change its settings, as well as in some other situations, you need to disable the table first, using the disable command. You can re-enable it using the enable command.
    hbase(main):025:0> disable 'pstest'
    0 row(s) in 0.0570 seconds
    hbase(main):028:0> enable 'pstest'
    0 row(s) in 1.1880 seconds

    Disable the table again if you tested the enable command above:
    hbase(main):029:0> disable 'pstest'
    0 row(s) in 0.0570 seconds

  8. Drop the table.
    To drop (delete) a table, use the drop command.
    hbase> drop 'pstest'
    0 row(s) in 0.2900 seconds
  9. Exit the HBase Shell.
    To exit the HBase Shell and disconnect from your cluster, use the quit command. HBase is still running in the background.
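The walkthrough above can also be replayed without typing each command, by writing the commands to a file and passing that file to the HBase shell. The sketch below only generates the file; write_pstest_script is a hypothetical name, and the second put uses first_column:b so that the result matches the scan output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: write the walkthrough's commands to a script file
# that can be handed to the HBase shell.
write_pstest_script() {
    script_file="$1"
    # Quoted 'EOF' keeps the OS shell from expanding anything in the commands
    cat > "$script_file" <<'EOF'
create 'pstest', 'first_column'
put 'pstest', 'row1', 'first_column:a', 'people'
put 'pstest', 'row1', 'first_column:b', 'strong'
put 'pstest', 'row2', 'first_column:b', 'works'
scan 'pstest'
get 'pstest', 'row1'
disable 'pstest'
drop 'pstest'
exit
EOF
}
```

Assuming your HBase version accepts a script path as the argument to the shell, running write_pstest_script /tmp/pstest.rb followed by ./bin/hbase shell /tmp/pstest.rb should execute the commands in order and return to the OS prompt.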

Procedure Stop HBase

  1. In the same way that the bin/start-hbase.sh script is provided to conveniently start all HBase daemons, the bin/stop-hbase.sh script stops them.
    [root@psd058 bin]# sh stop-hbase.sh
    stopping hbase.................
  2. After issuing the command, it can take several minutes for the processes to shut down. Use the jps command to make sure that the HMaster process is gone; in standalone mode the HRegionServer runs inside the same JVM, so it stops along with it.
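Since the shutdown can take a while, a script may want to wait for it instead of checking once. A minimal polling sketch, assuming jps is on the PATH; the function name wait_for_hmaster_exit and its default values are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: poll jps until HMaster disappears or attempts run out.
wait_for_hmaster_exit() {
    attempts="${1:-30}"   # how many times to check
    delay="${2:-2}"       # seconds to sleep between checks
    while [ "$attempts" -gt 0 ]; do
        # No "<pid> HMaster" line in the jps output means HBase has stopped
        if ! jps | grep -q ' HMaster$'; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1   # HMaster was still listed after the last check
}
```

For example, sh stop-hbase.sh followed by wait_for_hmaster_exit 60 5 would wait up to about five minutes before giving up.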

Other Random Commands :-
hbase(main):030:0> status 'detailed'
version 0.94.24
0 regionsInTransition
master coprocessors: []
1 live servers
psd058.abc.com:60213 1418295818203
requestsPerSecond=0, numberOfOnlineRegions=3, usedHeapMB=38, maxHeapMB=983
-ROOT-,,0
numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=11, writeRequestsCount=1, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=8, currentCompactedKVs=8, compactionProgressPct=1.0
.META.,,1
numberOfStores=1, numberOfStorefiles=0, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=39, writeRequestsCount=4, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN
pstest,,1418299145748.0a0cf7955f471c67d935d4c9c3aee96d.
numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=0, storefileSizeMB=0, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=1, writeRequestsCount=0, rootIndexSizeKB=0, totalStaticIndexSizeKB=0, totalStaticBloomSizeKB=0, totalCompactingKVs=0, currentCompactedKVs=0, compactionProgressPct=NaN
0 dead servers

hbase(main):038:0> describe 'pstest'
DESCRIPTION                                                          ENABLED
 'pstest', {NAME => 'first_column', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}    true
1 row(s) in 0.0230 seconds

hbase(main):001:0> drop 'test'
ERROR: Table test is enabled. Disable it first.'
Here is some help for this command:
Drop the named table. Table must first be disabled: e.g. "hbase> drop 't1'"

hbase(main):002:0> disable 'test'
0 row(s) in 1.1840 seconds
hbase(main):003:0> drop 'test'
0 row(s) in 1.1160 seconds

stay tuned for more…
