Technology Blogs by SAP
shivamshukla12
Contributor
5,813
Hi, I want to share what I learned while streaming tweets with Java and inserting them into SAP HANA for text and token analysis. The idea is simple: apply SAP HANA's built-in text analysis to real-time information. I found Twitter to be a good source of real-time text for understanding how that analysis works. Below is a short implementation. Many people and organizations have already built something similar, so I am only adding my own experience, learning and challenges here.

Before you implement text analysis, please keep the following in mind.

 

  • Which language will you write the code in? I used Java; you can use Python as well.

  • How will you get real-time data, i.e. do you have access to an API that provides it? The Twitter APIs provide real-time information. For example, you can analyze political, sports, technology or geo-tagged tweets.


I chose to analyze tweets related to SAP HANA (#SAPHANA, #IoT, #SAP). These hashtags are used later to fetch tweets through the Twitter API.
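To make the hashtag idea concrete, here is a minimal sketch of how the three hashtags could be combined into one search term. The class and method names are hypothetical and not part of the downloadable app; the only thing taken from Twitter is the OR operator of its search syntax.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; it is not part of the downloadable app.
class SearchQueryBuilder {

    // Joins the hashtags with Twitter's OR search operator,
    // adding a leading '#' where it is missing.
    static String buildQuery(List<String> tags) {
        return tags.stream()
                .map(String::trim)
                .filter(t -> !t.isEmpty())
                .map(t -> t.startsWith("#") ? t : "#" + t)
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        // Prints: #SAPHANA OR #IoT OR #SAP
        System.out.println(buildQuery(List.of("SAPHANA", "#IoT", "SAP")));
    }
}
```

The resulting string can be used as the search term in the configuration step further down.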




You will be taken to the developer page at Twitter. Click Create New App and fill in the required information.





Create your Twitter Application



The next step is to note down all the security tokens needed for consuming the Twitter APIs. Below is a snapshot of my security tokens.





Now click Create Access Token.

Your access token will be generated successfully.

 

Download the latest version of the Twitter API library for use in your project. The latest version of Twitter4J is available here:

http://twitter4j.org/en/index.html

Below is a snapshot of the latest Twitter4J API -

The Twitter API libraries will be used later.

Install the SAP HANA Client if it is not already installed. Get it from the SAP Service Marketplace; it contains the JDBC library for accessing HANA from Java.

Go to Service Marketplace -> Software Downloads -> Installation and Upgrades -> Browse Our Download Catalog -> SAP In-Memory (SAP HANA) -> SAP HANA Platform and download the HANA Client.

Below is a snapshot of the HDB Client. The important thing to notice is that it must contain the JDBC library.

Install the HDB Client on your machine (check whether you need the 32-bit or 64-bit version before downloading).

Download Twitter-analysis App here

Once the above steps are done, open the Eclipse IDE, switch to the Java perspective and, in the Package Explorer, right-click -> Import.

 









Click Finish -> Your project will be imported into the Package Explorer.



Switch to the HANA Development perspective to create the table that will store the tweet information. Execute the commands below in the SAP HANA SQL Console.

      SET SCHEMA "<YOUR_SCHEMA>";

      -- Twitter's own tweet IDs are 64-bit; storing them unchanged would need BIGINT instead of INTEGER.
      CREATE COLUMN TABLE TWEETS(
        "ID" INTEGER NOT NULL,
        "USER_NAME" NVARCHAR(100),
        "CREATED_AT" DATE,
        "TEXT" NVARCHAR(140),
        "HASH_TAGS" NVARCHAR(100),
        PRIMARY KEY("ID"));
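The column widths above matter on the Java side, because a value longer than the column will be rejected on insert. Below is a small sketch, under my own assumptions, of trimming values to those widths before writing them. The class is hypothetical and not taken from the app. Keep in mind that Twitter later raised the tweet length limit to 280 characters, so you may prefer to widen the "TEXT" column rather than truncate.

```java
// Hypothetical helper: trims values to the column widths of the TWEETS table above.
class TweetColumns {
    static final int USER_NAME_MAX = 100;  // "USER_NAME" NVARCHAR(100)
    static final int TEXT_MAX = 140;       // "TEXT" NVARCHAR(140)
    static final int HASH_TAGS_MAX = 100;  // "HASH_TAGS" NVARCHAR(100)

    // Returns the value unchanged if it fits, otherwise cuts it to maxChars
    // UTF-16 units without splitting a surrogate pair (e.g. an emoji) in half.
    static String fit(String value, int maxChars) {
        if (value == null || value.length() <= maxChars) return value;
        if (maxChars <= 0) return "";
        int end = maxChars;
        if (Character.isHighSurrogate(value.charAt(end - 1))) end--;
        return value.substring(0, end);
    }

    public static void main(String[] args) {
        String longText = "x".repeat(200);
        System.out.println(fit(longText, TEXT_MAX).length()); // prints 140
    }
}
```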



After creating the table in HANA, go to the configuration folder and change the settings for HANA and Twitter connectivity. Open the Java configuration file and make the following changes:



 

1- If you are behind a proxy, set the proxy variable to true and enter the proxy details.

2- Enter the HANA database host, port, user, schema and password.

3- Enter the Twitter tokens received above, including the consumer key and consumer secret.

4- Enter the search term you want to fetch from Twitter, such as #SAP or #SAPHANA.
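The four settings above can also be checked before anything is executed. This is only a sketch: the app keeps its settings in a Java configuration file, while here I assume a properties-style text, and every key name (hana.host, twitter.consumerKey, proxy.enabled and so on) is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical configuration check; key names are assumptions, not the app's real ones.
class AppConfig {
    static final List<String> REQUIRED = List.of(
            "hana.host", "hana.port", "hana.user", "hana.password", "hana.schema",
            "twitter.consumerKey", "twitter.consumerSecret",
            "twitter.accessToken", "twitter.accessTokenSecret",
            "twitter.searchTerm");

    // Parses properties-style text ("key=value" per line).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Lists every required setting that is absent or blank.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            if (props.getProperty(key, "").trim().isEmpty()) missing.add(key);
        }
        // Proxy details are only required when the proxy flag is switched on.
        if (Boolean.parseBoolean(props.getProperty("proxy.enabled", "false"))) {
            for (String key : List.of("proxy.host", "proxy.port")) {
                if (props.getProperty(key, "").trim().isEmpty()) missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = parse("hana.host=localhost\nhana.port=30015\nproxy.enabled=true\n");
        System.out.println("Missing settings: " + missingKeys(props));
    }
}
```

Running the check first gives a clear list of missing settings instead of a connection error later on.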

 

After updating the above details, test both connections.

    Test Connection to Twitter

    Open TwitterConnection.java & execute the file.



    Test Connection to SAP HANA

    Open HDBConnection.java & execute the file.
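If you want to see what a HANA connection test boils down to, here is a minimal JDBC sketch. It is not the code of HDBConnection.java. The host, port, schema, user and password are placeholders, and I am assuming the usual jdbc:sap:// URL form with the currentschema option and that the HANA JDBC jar from the HDB Client is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical connection check; not the code of HDBConnection.java.
class HanaConnectionCheck {

    // Builds a HANA JDBC URL; the schema is passed as the currentschema option.
    static String jdbcUrl(String host, int port, String schema) {
        return "jdbc:sap://" + host + ":" + port + "/?currentschema=" + schema;
    }

    // Returns true only if a connection can be opened and answers within 5 seconds.
    static boolean canConnect(String url, String user, String password) {
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            return conn.isValid(5);
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your own system details.
        String url = jdbcUrl("hana.example.com", 30015, "MY_SCHEMA");
        System.out.println(url);
        System.out.println(canConnect(url, "MY_USER", "secret"));
    }
}
```

Without the JDBC jar on the classpath the check simply reports a failed connection, which is a quick hint that the HDB Client library is not yet visible to the project.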



Before executing the TwitterSearch.java file, configure the Twitter API libraries properly. Otherwise the application will not run and you will encounter errors such as the source of a class not being found, so here is how to configure the build path for the Twitter API libraries.

Right-click the project.



Click Configure Build Path -> Java Build Path -> Add External JARs -> go to the libraries folder of Twitter4J -> select all JARs.

Make sure all JARs are available in the libraries folder.

  

Click Apply. This makes all the classes available to your application. You can see all the JARs in the Referenced Libraries folder.
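A quick way to confirm that the build path is right is to ask the JVM whether it can load one class from each library. This is only a sketch with a hypothetical class name of my own. twitter4j.TwitterFactory comes from the Twitter4J jars, and com.sap.db.jdbc.Driver is, to my knowledge, the driver class in the HANA JDBC jar.

```java
// Hypothetical build-path check; run it from inside the imported project.
class ClasspathCheck {

    // Returns true if the named class can be found on the classpath.
    // The class is looked up without being initialized.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Twitter4J on classpath: " + isAvailable("twitter4j.TwitterFactory"));
        System.out.println("HANA JDBC on classpath: " + isAvailable("com.sap.db.jdbc.Driver"));
    }
}
```

If either line prints false, go back to the build path dialog and add the missing JARs.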

TweetDAO.java is used for inserting the tweet data into the HANA system. The SQL statement is prepared first and then executed.
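The prepare-then-execute pattern looks roughly like the sketch below. This is not the actual TweetDAO.java, only an illustration with standard JDBC against the TWEETS table defined earlier; the class and method names are my own.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical sketch of a prepared INSERT; not the actual TweetDAO.java.
class TweetInsertSketch {
    static final String INSERT_SQL =
            "INSERT INTO TWEETS (\"ID\", \"USER_NAME\", \"CREATED_AT\", \"TEXT\", \"HASH_TAGS\") "
            + "VALUES (?, ?, ?, ?, ?)";

    // Prepares the statement, binds one tweet and executes the insert.
    // Assumption: the id fits the "ID" column. Twitter's own tweet IDs are
    // 64-bit, so an INTEGER column is too small for them.
    static void insert(Connection conn, long id, String userName,
                       java.util.Date createdAt, String text, String hashTags)
            throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            stmt.setLong(1, id);
            stmt.setString(2, userName);
            stmt.setDate(3, new java.sql.Date(createdAt.getTime()));
            stmt.setString(4, text);
            stmt.setString(5, hashTags);
            stmt.executeUpdate();
        }
    }

    public static void main(String[] args) {
        System.out.println(INSERT_SQL);
    }
}
```

Binding values through placeholders also avoids trouble with quotes and other special characters inside tweet text.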



With all the configuration and code in place, it is time to invoke the Twitter API to fetch data from Twitter and insert the tweets into the HANA system. Execute the TwitterSearch.java file.
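How the HASH_TAGS column gets its value is not shown in this post, so the following is only one possible approach: pulling the hashtags out of the tweet text with a regular expression. The class is hypothetical. Twitter4J also exposes hashtags directly on a status through getHashtagEntities(), which may be what you prefer in a real project.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: derives a HASH_TAGS value from the tweet text.
class HashtagExtractor {
    // '#' followed by letters, digits or underscores.
    private static final Pattern HASHTAG = Pattern.compile("#\\w+");

    // Returns the distinct hashtags in order of first appearance, comma separated.
    static String extract(String tweetText) {
        Set<String> tags = new LinkedHashSet<>();
        Matcher m = HASHTAG.matcher(tweetText);
        while (m.find()) {
            tags.add(m.group());
        }
        return String.join(",", tags);
    }

    public static void main(String[] args) {
        // Prints: #SAPHANA,#IoT
        System.out.println(extract("Learning #SAPHANA text analysis with #IoT data #SAPHANA"));
    }
}
```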



Go to the HANA system and run a SELECT on the "TWEETS" table.



 

Now leverage the text analysis capabilities of SAP HANA by creating a full-text index on the TWEETS table. Here is the syntax:

CREATE FULLTEXT INDEX "TWEETS_FTI" ON "TWEETS"("TEXT")
TEXT ANALYSIS ON
CONFIGURATION 'EXTRACTION_CORE';

When you execute the above command, a full-text index is created on the table and text analysis runs on the table's data. In addition, a table named $TA_TWEETS_FTI is created, which contains the token information for the tweets in the TWEETS table.



Below is the structure of table $TA_TWEETS_FTI -



Now you can preview the data of $TA_TWEETS_FTI to get a better understanding of the text analysis performed by SAP HANA.
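Besides the data preview in HANA Studio, the token counts can also be read from Java. Here is a sketch under a few assumptions: the connection already points to the schema that holds the index, the $TA table has the usual TA_TOKEN column, and the class and method names are my own.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reading token counts from the $TA table.
class TokenPreview {

    // Builds a query that counts how often each token occurs, most frequent first.
    static String topTokensSql(int limit) {
        if (limit <= 0) throw new IllegalArgumentException("limit must be positive");
        return "SELECT \"TA_TOKEN\", COUNT(*) AS \"OCCURRENCES\" "
                + "FROM \"$TA_TWEETS_FTI\" "
                + "GROUP BY \"TA_TOKEN\" "
                + "ORDER BY COUNT(*) DESC "
                + "LIMIT " + limit;
    }

    // Runs the query and returns token -> count in result order.
    static Map<String, Long> topTokens(Connection conn, int limit) throws SQLException {
        Map<String, Long> counts = new LinkedHashMap<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(topTokensSql(limit))) {
            while (rs.next()) {
                counts.put(rs.getString(1), rs.getLong(2));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(topTokensSql(10));
    }
}
```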



 

So here is the Analysis done by SAP HANA Text Analysis capability -



In the image above you can see that the search term #SAPHANA is highlighted and has the highest count in the table. You can now build your data model on this $TA_TWEETS_FTI table and apply different WHERE clauses for analysis, such as tweets that combine SAP HANA & IoT, or SAP HANA & Cloud.
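One way such a combination could be counted is sketched below. It assumes that the $TA table carries the "ID" key column of TWEETS next to TA_TOKEN, and that hashtags show up as tokens like #SAPHANA; check both against your own $TA_TWEETS_FTI data first. The class is hypothetical.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper: counts tweets that contain all of the given tokens.
class CoOccurrenceQuery {

    // Upper-cases the tokens and removes duplicates, keeping the order.
    static List<String> normalize(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.toUpperCase(Locale.ROOT))
                .distinct()
                .collect(Collectors.toList());
    }

    // A tweet matches when every one of the tokenCount tokens occurs in it.
    static String sql(int tokenCount) {
        if (tokenCount < 1) throw new IllegalArgumentException("need at least one token");
        String placeholders = String.join(", ", Collections.nCopies(tokenCount, "?"));
        return "SELECT COUNT(*) FROM ("
                + "SELECT \"ID\" FROM \"$TA_TWEETS_FTI\" "
                + "WHERE UPPER(\"TA_TOKEN\") IN (" + placeholders + ") "
                + "GROUP BY \"ID\" "
                + "HAVING COUNT(DISTINCT UPPER(\"TA_TOKEN\")) = " + tokenCount
                + ") AS \"MATCHES\"";
    }

    static int countTweetsWithAll(Connection conn, List<String> tokens) throws SQLException {
        List<String> distinct = normalize(tokens);
        try (PreparedStatement stmt = conn.prepareStatement(sql(distinct.size()))) {
            for (int i = 0; i < distinct.size(); i++) {
                stmt.setString(i + 1, distinct.get(i));
            }
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getInt(1) : 0;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sql(normalize(List.of("#SAPHANA", "#IoT")).size()));
    }
}
```

Calling countTweetsWithAll(conn, List.of("#SAPHANA", "#IoT")) would then give the number of tweets that mention both.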



Queries and questions are most welcome.

 

Thanks.

21 Comments
Former Member
Really great article to have an idea  about text analysis with real time data....
rahulanand
Explorer
Very elaborative and nice blog!
former_member196080
Participant
Thanks Shivam, it's a nice article.

 

Can you please advise what type of account we should have on Twitter? I tried to create an account on the dev site and it asks which API we need. Is it the Search API that we should request?

 
shivamshukla12
Contributor
Hi Rubane ,

 

When I created the account it did not ask for any API, but yes, you can request the Search API if you are going to extract Twitter data based on some token.

 

Thanks,

Shivam
former_member196080
Participant

Hi Shivam,

 

Somehow I managed to create app in twitter. I used your dump and updated properties from my app.

 

But when I reached the point where I need to test the connection with Twitter, I got the error below (SSL is required):

Exception in thread “main” 403:The request is understood, but it has been refused. An accompanying error message will explain why. This code is used when requests are being denied due to update limits (https://support.twitter.com/articles/15364-about-twitter-limits-update-api-dm-and-following).
message – SSL is required

 

I updated the jar files from twitter4j and this error went away, but while testing the Twitter connection I'm now getting:

 

Exception in thread "main" java.lang.AssertionError: java.lang.IllegalAccessException: Class twitter4j.internal.logging.Logger can not access a member of class twitter4j.StdOutLoggerFactory with modifiers ""
at twitter4j.internal.logging.Logger.getLoggerFactoryIfAvailable(Logger.java:90)
at twitter4j.internal.logging.Logger.<clinit>(Logger.java:46)
at twitter4j.auth.OAuthAuthorization.<clinit>(OAuthAuthorization.java:46)
at twitter4j.auth.AuthorizationFactory.getInstance(AuthorizationFactory.java:40)
at twitter4j.TwitterFactory.<clinit>(TwitterFactory.java:39)
at com.saphanatutorial.util.TwitterConnection.getInstance(TwitterConnection.java:26)
at com.saphanatutorial.util.TwitterConnection.main(TwitterConnection.java:35)
Caused by: java.lang.IllegalAccessException: Class twitter4j.internal.logging.Logger can not access a member of class twitter4j.StdOutLoggerFactory with modifiers ""
at sun.reflect.Reflection.ensureMemberAccess(Unknown Source)
at java.lang.Class.newInstance(Unknown Source)
at twitter4j.internal.logging.Logger.getLoggerFactoryIfAvailable(Logger.java:83)
... 6 more

 

Any idea how to resolve this?

 

Thanks

Former Member
Thanks Shivam for the share. For people like me who have shifted focus from technology to business, blogs such as these with screen-shots for each step are a great respite. Thanks again!
hai_murali_here
Product and Topic Expert
Hi,

I too got the same error.

Did you manage to resolve that?

 
hai_murali_here
Product and Topic Expert
Hi,

For your info....

I got it resolved by adding the following line in TwitterConnection.java file

cb.setUseSSL(true)

 

Rgds,

Murali
Hi! Thanks for all the information. The page has a "Download Twitter-analysis App" link, but it redirects to a URL where you can't download the file. Can you help me?

Thanks in advance

Kevin
shivamshukla12
Contributor
Hi Kevin ,

 

Go here please - http://twitter4j.org/archive/twitter4j-4.0.7.zip

Thanks,

Shivam
Hi Shivam,

Thanks for the information. I could not download the Twitter analysis app. When I tried the link you provided, it showed "the site can't be reached". Can you help with this?

bi_lead
Discoverer
Hi Shivam,

When I try to import the Twitter4j-4.0.7.zip file into Eclipse, it says no project was found to import.
shivamshukla12
Contributor
Make sure you are in the Java perspective, or share a screenshot of what you tried to do.
Inayath24
Product and Topic Expert

Hello Shivam,

I am not able to access the site. Has it changed? Please share the site where I can download the source.

Thanks

vkris215
Participant
Hi Shivam,

When I select the twitter4j-4.0.7 zip file, it shows the warning below. Can you please suggest how to proceed? I am in the Java perspective.



Thanks.
harshil_joshi
Contributor
Hello Krishna Chaitanya,

 

Please follow below steps.

(1) Create a New Project (for e.g. TEMP) in Java Perspective

(2)  Right Click --> Show In --> System Explorer --> Copy .classpath and .project file

(3) Paste to twitter4j-4.0.7 zip

(4) Delete the Newly Created Project (TEMP)

(5) Now Try to Import
vkris215
Participant
Thanks Harshil. It worked now.
vkris215
Participant
Hi Harshil,

 

When I imported the twitter4j-4.0.7 zip file, I couldn't see the src folder or the packages under it. Any reason?



 

 
harshil_joshi
Contributor
You can create the src folder.

 

I am facing the error below after importing the Twitter API. Can you help me with that?

 

vkris215
Participant
Can you let me know how to do that? I'm unable to see any .java files.

Also, where exactly is it that you're facing the above error ?