Wednesday, 26 October 2016

Twitter Data Sentiment Analysis Using Hive

Pre-Requisites of Twitter Data + Hive + Sentiment Analysis Project:
hadoop-2.6.0
hive-1.2.1
java-1.7

NOTE: Make sure all of the above components are installed before you begin.

Twitter Data + Hive + Sentiment Analysis Project Download Links:
`hadoop-2.6.0.tar.gz` ==> link
`apache-hive-1.2.1-src.tar.gz` ==> link
`sentimentanalysis-hive.jar` ==> link
`tweets` ==> link

-----------------------------------------------------------------------------

1. Create a `sentimentanalysis` folder on your machine

command: mkdir ~/sentimentanalysis




2. Download the sample tweets (or collect Twitter data using Flume) for the sentiment analysis and copy the file to the '~/sentimentanalysis' folder

Note: Download sample tweets link




Example: Sample Tweets

i am learning hadoop course
i am good in hadoop
i am learning hadoop
i am not feeling well
why we need bigdata
i am not happy with rdbms
ravi is not working today
india got the world cup
learn hadoop from kalyan blog
learn spark from kalyan blog
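
If the download is not at hand, the ten sample lines above can be written to the file directly. This is only a minimal sketch: the `TWEETS_DIR` variable is introduced here for illustration and defaults to the folder created in step 1.

```shell
# TWEETS_DIR is an illustrative variable, not part of the original project.
TWEETS_DIR="${TWEETS_DIR:-$HOME/sentimentanalysis}"
mkdir -p "$TWEETS_DIR"

# Write the ten sample tweets, one tweet per line.
cat > "$TWEETS_DIR/tweets" <<'EOF'
i am learning hadoop course
i am good in hadoop
i am learning hadoop
i am not feeling well
why we need bigdata
i am not happy with rdbms
ravi is not working today
india got the world cup
learn hadoop from kalyan blog
learn spark from kalyan blog
EOF

# Print the number of lines as a quick sanity check.
wc -l < "$TWEETS_DIR/tweets"
```

Each line becomes one row of the Hive table created later, so the file should contain exactly one tweet per line.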


3. Verify the file contents using the cat command

command: cat ~/sentimentanalysis/tweets




4. Start Hadoop using the command below

command: start-all.sh





5. Verify that Hadoop is running using the "jps" command
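
On a single-node Hadoop 2.x setup, `jps` typically lists NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager after `start-all.sh`. The check could be scripted roughly as follows; the function name `missing_daemons` is made up for this example, and the daemon list assumes a pseudo-distributed install.

```shell
# Reads `jps` output on stdin and prints every expected daemon that is absent.
# Returns 0 when all daemons are present, 1 otherwise.
missing_daemons() {
  md_output="$(cat)"
  md_status=0
  for md_daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare the second column exactly, so "SecondaryNameNode"
    # is not mistaken for "NameNode".
    if ! printf '%s\n' "$md_output" |
        awk -v d="$md_daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "missing: $md_daemon"
      md_status=1
    fi
  done
  return "$md_status"
}

# Usage (requires a JDK on the PATH):
#   jps | missing_daemons
```

If the function prints nothing, all five daemons were found in the `jps` listing.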





6. Open the NameNode web UI in a browser using the URL below

http://localhost:50070/dfshealth.html





7. Load the sample tweets into HDFS

hadoop fs -mkdir -p /kalyan/sentimentanalysis/hive/input

hadoop fs -put ~/sentimentanalysis/tweets /kalyan/sentimentanalysis/hive/input

8. Create the kalyan database in Hive using the commands below

CREATE DATABASE IF NOT EXISTS kalyan;

USE kalyan;






9. Create the tweets table in Hive on top of the sample tweets

CREATE EXTERNAL TABLE IF NOT EXISTS kalyan.tweets (tweet string) LOCATION '/kalyan/sentimentanalysis/hive/input';




10. Display the tweets table data in Hive using a SELECT query

SELECT * FROM kalyan.tweets;




11. Download the `sentimentanalysis-hive.jar` file and copy it to the '~/sentimentanalysis' folder

Note: Download sentimentanalysis-hive.jar link




12. Load the `sentimentanalysis-hive.jar` into HDFS

hadoop fs -put ~/sentimentanalysis/sentimentanalysis-hive.jar /kalyan/sentimentanalysis/hive







13. Add the jar file to the Hive classpath using the command below

ADD JAR <PATH OF THE JAR FILE>;

ADD JAR hdfs://localhost:8020/kalyan/sentimentanalysis/hive/sentimentanalysis-hive.jar;
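
The `hdfs://localhost:8020` prefix has to match the `fs.defaultFS` value of your cluster; many single-node setups use port 9000 instead. As a hedged sketch, the helper below (`jar_uri` is a hypothetical name, not part of the project) joins a file-system URI and an HDFS path into the string that `ADD JAR` expects.

```shell
# Joins a file-system URI ($1) and an absolute HDFS path ($2).
jar_uri() {
  # ${1%/} strips one trailing slash so the result has no doubled "/".
  printf '%s%s\n' "${1%/}" "$2"
}

jar_uri "hdfs://localhost:8020" /kalyan/sentimentanalysis/hive/sentimentanalysis-hive.jar

# On a machine with Hadoop installed, the URI can be read from the configuration:
#   jar_uri "$(hdfs getconf -confKey fs.defaultFS)" /kalyan/sentimentanalysis/hive/sentimentanalysis-hive.jar
```

If `ADD JAR` fails with a connection error, comparing the URI in the command with the configured `fs.defaultFS` is a reasonable first check.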





14. Define the sentiment function in Hive

Hive supports both temporary functions (available only in the current session) and permanent functions (registered in the metastore):

i. Create a temporary function using the command below

CREATE TEMPORARY FUNCTION <function name> AS 'UDF CLASS NAME WITH PACKAGE';

CREATE TEMPORARY FUNCTION sentiment AS 'com.orienit.kalyan.sentimentanalysis.hive.udf.SentimentUdf';





ii. Create a permanent function using the command below

CREATE FUNCTION <db name>.<function name> AS 'UDF CLASS NAME WITH PACKAGE' USING JAR '<PATH OF THE JAR FILE>';

CREATE FUNCTION kalyan.sentiment AS 'com.orienit.kalyan.sentimentanalysis.hive.udf.SentimentUdf' USING JAR 'hdfs://localhost:8020/kalyan/sentimentanalysis/hive/sentimentanalysis-hive.jar';





15. Verify that the function is registered in Hive using the command below

SHOW FUNCTIONS;

16. Describe the function in Hive using the commands below

DESCRIBE FUNCTION EXTENDED <function name>;

DESCRIBE FUNCTION EXTENDED sentiment;

DESCRIBE FUNCTION EXTENDED kalyan.sentiment;

17. Analyse the tweets with the sentiment function using the command below

SELECT tweet, sentiment(tweet) FROM kalyan.tweets;




18. Create the 'sentimenttweets' table in Hive using the command below


CREATE TABLE IF NOT EXISTS kalyan.sentimenttweets (tweet string, sentiment int) LOCATION '/kalyan/sentimentanalysis/hive/output';





19. Insert the sentiment tweets data into the `sentimenttweets` table

INSERT OVERWRITE TABLE kalyan.sentimenttweets SELECT tweet, sentiment(tweet) FROM kalyan.tweets;





20. Retrieve the sentiment tweets data from the `sentimenttweets` table

SELECT tweet, sentiment FROM kalyan.sentimenttweets;





21. Retrieve the sentiment tweets data from the `sentimenttweets` table with readable labels using a CASE statement

SELECT tweet,
       CASE
         WHEN sentiment = 1 THEN 'positive'
         WHEN sentiment = 0 THEN 'neutral'
         WHEN sentiment = -1 THEN 'negative'
       END AS sentiment_label
FROM kalyan.sentimenttweets;
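
The same mapping can also be applied outside Hive to the files written in step 19. The sketch below assumes the table uses Hive's default text format, where fields are separated by the Ctrl-A character (`\001`), which is what Hive normally uses when `CREATE TABLE` gives no `ROW FORMAT` clause. The function name `label_sentiments` is hypothetical, and unlike the CASE statement (which returns NULL) it prints "unknown" for any other value.

```shell
# Reads Hive text-format rows (tweet \001 sentiment) on stdin and
# prints "tweet<TAB>label" for each row.
label_sentiments() {
  awk 'BEGIN { FS = "\001"; OFS = "\t" }
       {
         label = "unknown"                       # anything other than 1, 0, -1
         if ($2 == 1)       label = "positive"
         else if ($2 == 0)  label = "neutral"
         else if ($2 == -1) label = "negative"
         print $1, label
       }'
}

# Usage (requires a running HDFS with the output of step 19):
#   hadoop fs -cat '/kalyan/sentimentanalysis/hive/output/*' | label_sentiments
```

Doing the mapping in the query keeps everything in Hive; the shell version is only convenient for a quick look at the raw output files.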





