flume-1.6.0
java-1.7
NOTE: Make sure that install all the above components
Flume Project Download Links:
`hadoop-2.6.0.tar.gz` ==> link
`apache-flume-1.6.0-bin.tar.gz` ==> link
`kalyan-twitter-hdfs-agent.conf` ==> link
`kalyan-flume-project-0.1.jar` ==> link
-----------------------------------------------------------------------------
1. create "kalyan-twitter-hdfs-agent.conf" file with below content
agent.sources = Twitter
agent.channels = MemChannel
agent.sinks = HDFS
agent.sources.Twitter.type = com.orienit.kalyan.flume.source.KalyanTwitterSource
agent.sources.Twitter.channels = MemChannel
agent.sources.Twitter.consumerKey = ********
agent.sources.Twitter.consumerSecret = ********
agent.sources.Twitter.accessToken = ********
agent.sources.Twitter.accessTokenSecret = ********
agent.sources.Twitter.keywords = hadoop, big data, analytics, bigdata, cloudera, data science, data scientiest, business intelligence, mapreduce, data warehouse, data warehousing, mahout, hbase, nosql, newsql, businessintelligence, cloudcomputing
agent.sinks.HDFS.type = hdfs
agent.sinks.HDFS.channel = MemChannel
agent.sinks.HDFS.hdfs.path = hdfs://localhost:8020/user/flume/tweets
agent.sinks.HDFS.hdfs.fileType = DataStream
agent.sinks.HDFS.hdfs.writeFormat = Text
agent.sinks.HDFS.hdfs.batchSize = 100
agent.sinks.HDFS.hdfs.rollSize = 0
agent.sinks.HDFS.hdfs.rollCount = 100
agent.sinks.HDFS.hdfs.useLocalTimeStamp = true
agent.channels.MemChannel.type = memory
agent.channels.MemChannel.capacity = 1000
agent.channels.MemChannel.transactionCapacity = 100
2. Copy "kalyan-twitter-hdfs-agent.conf" file into "$FUME_HOME/conf" folder
3. Copy "kalyan-flume-project-0.1.jar" file into"$FLUME_HOME/lib" folder
4. Execute the below command to `Extract data from Twitter into Hadoop using Flume`
$FLUME_HOME/bin/flume-ng agent -n agent --conf $FLUME_HOME/conf -f $FLUME_HOME/conf/kalyan-twitter-hdfs-agent.conf -Dflume.root.logger=DEBUG,console
5. Verify the data in console
6. Verify the data in hdfs location is "hdfs://localhost:8020/user/flume/tweets"
Share this article with your friends.
ReplyDeleteThanks for the this good information.
Big Data and Hadoop Online Training