The Hitchhiker’s Guide to Ruby On Rails Galaxy

Records of my voyage through RoR Galaxy


How to parse a tweet's text with Ruby to pull out the ‘@’ and ‘#’ parts

Posted by arjunghosh on March 5, 2009

Well, a lot of us love @twitter and also Ruby, and sometimes work on both :)

I often have to do the following with a tweet:

Take out the ‘@’ parts (i.e. @replies) and the ‘#’ parts (i.e. hashtags) from a tweet and separate them from the text part.

For example, we have a tweet:

@myfriend1 @myfriend2 this is a sample text #link #text

Now I want this tweet to be separated into the following two Arrays:

['myfriend1', 'myfriend2']

['link', 'text']

and the text only: "this is a sample text"

So first I had to build a RegEx, and then, using the ever-useful .gsub method of Ruby, I created the following:

tags, replies = [], []
parsed_text = tweet.text.gsub(/ ?(@\w+)| ?(#\w+)/) { |a| ((a.include?('#')) ? tags : replies) << a.strip.gsub(/#|@/, ''); '' }

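So parsed_text holds the final text only (with a leading space left over where the @replies were removed, which .strip takes care of). tags is an Array that will contain the hashtags and replies is an Array that will contain the @replies. Both must already exist as empty Arrays before the gsub call runs.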
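
To try this outside a Twitter client, here is a minimal self-contained sketch of the same idea. It assumes the tweet is a plain String standing in for tweet.text, and the method name parse_tweet is a hypothetical name of my own, not something from the Gist.

```ruby
# Minimal sketch: split a tweet into @replies, #hashtags and plain text.
# `parse_tweet` is a hypothetical helper name; the input is a plain String
# standing in for `tweet.text`.
def parse_tweet(text)
  replies = []
  tags    = []

  parsed_text = text.gsub(/ ?(@\w+)| ?(#\w+)/) do |match|
    # Pick the target array by the symbol found in the match,
    # then store the name without its '@' or '#'.
    (match.include?('#') ? tags : replies) << match.strip.gsub(/#|@/, '')
    '' # replace the whole match with nothing
  end

  # strip removes the space left over at the edges
  [replies, tags, parsed_text.strip]
end

replies, tags, text = parse_tweet('@myfriend1 @myfriend2 this is a sample text #link #text')
p replies # => ["myfriend1", "myfriend2"]
p tags    # => ["link", "text"]
p text    # => "this is a sample text"
```

Wrapping the one-liner in a method keeps the two Arrays local, so repeated calls do not pile results into the same tags and replies.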

The RegEx / ?(@\w+)| ?(#\w+)/ matches the hashtags and the @replies (each with an optional leading space), and the block then places them in two separate arrays.

The RegEx /#|@/, used with '' as the replacement, only removes the ‘@’ and ‘#’ symbols from the extracted array elements.
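
To see what each expression does on its own, here is a small sketch using String#scan. Note one assumption: I dropped the capture groups from the first RegEx here, because scan returns the groups instead of the full match when groups are present. The variable names are mine, for illustration only.

```ruby
sample = '@myfriend1 @myfriend2 this is a sample text #link #text'

# With no capture groups, scan returns every full match,
# including the optional leading space.
matches = sample.scan(/ ?@\w+| ?#\w+/)
p matches # => ["@myfriend1", " @myfriend2", " #link", " #text"]

# The second expression only removes the symbols from each match;
# strip drops the leading space first.
cleaned = matches.map { |m| m.strip.gsub(/#|@/, '') }
p cleaned # => ["myfriend1", "myfriend2", "link", "text"]
```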

You can download the code from Gist here: http://gist.github.com/78498

Also while working on creating the above regular expressions, I found this interesting RegEx testing site called www.rubular.com which will help you write regular expressions very easily.
