Thursday, 6 October 2016

How To Stream Text Data Into HBase Using Apache Flume

Pre-Requisites of Flume Project:

Project Compatibility:
1. hadoop-2.6.0 + hbase-0.98.4 + flume-1.6.0
2. hadoop-2.7.2 + hbase-1.1.2 + flume-1.7.0

NOTE: Make sure that all the components of the combination you choose are installed.

Flume Project Download Links:

`hadoop-2.6.0.tar.gz` ==> link
`apache-flume-1.6.0-bin.tar.gz` ==> link
`kalyan-text-hbase-agent.conf` ==> link
`kalyan-flume-project-0.1.jar` ==> link


1. Create the "kalyan-text-hbase-agent.conf" file with the content below

agent.sources = NETCAT
agent.channels = MemChannel
agent.sinks = HBASE

agent.sources.NETCAT.type = netcat
agent.sources.NETCAT.bind = localhost
agent.sources.NETCAT.port = 3000
agent.sources.NETCAT.channels = MemChannel

agent.sinks.HBASE.type = hbase
agent.sinks.HBASE.table = sample1
agent.sinks.HBASE.columnFamily = cf
agent.sinks.HBASE.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
agent.sinks.HBASE.serializer.incrementColumn = inc
agent.sinks.HBASE.channel = MemChannel

agent.channels.MemChannel.type = memory
agent.channels.MemChannel.capacity = 1000
agent.channels.MemChannel.transactionCapacity = 100
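
A sink that is not bound to a channel never receives events, so it can help to check the wiring before starting the agent. The following is only a minimal sketch and is not part of Flume: `parse_properties` and `check_wiring` are hypothetical helper names, and the parser assumes simple `key = value` lines.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines; skip blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_wiring(props, agent="agent"):
    """Return a list of problems found in the source/sink channel bindings."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    # Sources use the plural key 'channels' (a space-separated list).
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source} has no channels")
        for channel in bound:
            if channel not in declared:
                problems.append(f"source {source}: unknown channel {channel}")
    # Sinks use the singular key 'channel' (exactly one channel).
    for sink in props.get(f"{agent}.sinks", "").split():
        channel = props.get(f"{agent}.sinks.{sink}.channel")
        if channel is None:
            problems.append(f"sink {sink} has no channel")
        elif channel not in declared:
            problems.append(f"sink {sink}: unknown channel {channel}")
    return problems
```

Calling `check_wiring(parse_properties(open("kalyan-text-hbase-agent.conf").read()))` returns an empty list when every source and sink is bound to a declared channel, and a list of messages otherwise.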

2. Copy the "kalyan-text-hbase-agent.conf" file into the "$FLUME_HOME/conf" folder

3. Copy the "kalyan-flume-project-0.1.jar" file into the "$FLUME_HOME/lib" folder
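
A misspelled or unset FLUME_HOME is an easy mistake at this point, so a quick check that both files landed in the right place may save some debugging later. This is a minimal sketch; `missing_flume_files` is a hypothetical helper name, and the file names are the ones used in this post.

```python
import os


def missing_flume_files(flume_home):
    """Return the expected file paths that do not exist under flume_home."""
    expected = [
        os.path.join(flume_home, "conf", "kalyan-text-hbase-agent.conf"),
        os.path.join(flume_home, "lib", "kalyan-flume-project-0.1.jar"),
    ]
    return [path for path in expected if not os.path.isfile(path)]
```

For example, `missing_flume_files(os.environ["FLUME_HOME"])` raises a `KeyError` if FLUME_HOME is not set at all, and otherwise lists whichever of the two files is still missing.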

4. Prepare HBase for the Flume + HBase integration

Follow the steps below

i. Start HBase using the '' command.

ii. Verify whether HBase is running with the "jps" command

iii. Connect to HBase using the 'hbase shell' command

iv. List all the tables in HBase using the 'list' command

v. Create the HBase table named 'sample1' with the column family 'cf' using the command below.

create 'sample1', 'cf'

vi. Read the data from the HBase table 'sample1' using the command below.

scan 'sample1'
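
The create and scan commands above can also be run without typing them interactively. The sketch below assumes that the `hbase` launcher is on the PATH and that `hbase shell` reads commands from standard input when they are piped to it; `build_hbase_script` and `run_hbase_shell` are hypothetical helper names, not part of HBase.

```python
import subprocess


def build_hbase_script(table="sample1", family="cf"):
    """Build the HBase shell commands that create the table and scan it."""
    return f"create '{table}', '{family}'\nscan '{table}'\n"


def run_hbase_shell(script, hbase_bin="hbase"):
    """Pipe the script into 'hbase shell' and return the completed process.

    Assumption: the shell executes the piped commands and exits at end of input.
    """
    return subprocess.run(
        [hbase_bin, "shell"],
        input=script,
        text=True,
        capture_output=True,
    )
```

A call such as `print(run_hbase_shell(build_hbase_script()).stdout)` would then show the shell output for both commands.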

5. Execute the command below to load text data into HBase using Flume

$FLUME_HOME/bin/flume-ng agent -n agent --conf $FLUME_HOME/conf -f $FLUME_HOME/conf/kalyan-text-hbase-agent.conf -Dflume.root.logger=DEBUG,console

6. Connect to the socket server (the Flume netcat source) using the command below

telnet localhost 3000

NOTE: Type some sample text into the telnet session. The netcat source turns each line you send into one Flume event.
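
If telnet is not available, the same lines can be sent from a short script. This is a minimal sketch under two assumptions: the agent listens on localhost:3000 as configured in this post, and the netcat source keeps its default behaviour of replying with one acknowledgement line per event. `send_lines` is a hypothetical helper name.

```python
import socket


def send_lines(lines, host="localhost", port=3000, timeout=5.0):
    """Send each line to the netcat source and collect one reply per line."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        reader = sock.makefile("r", encoding="utf-8", newline="\n")
        for line in lines:
            # The netcat source treats every newline-terminated line as one event.
            sock.sendall((line + "\n").encode("utf-8"))
            # Assumption: acknowledgements are enabled, so a reply line arrives.
            replies.append(reader.readline().strip())
    return replies
```

For example, `send_lines(["hello flume", "hello hbase"])` sends two events. If acknowledgements have been turned off in the source configuration, the read will wait until the timeout expires, so this sketch only fits the default setup.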

7. Verify the data in the console

8. Verify the data in HBase

Execute the commands below to get the data from the HBase table 'sample1'

count 'sample1'

scan 'sample1'
