Which Twitter Python Library to Use?
The first thing you realize: there are no official Python libraries for Twitter (only Java, Objective-C and Swift are available). There are quite a few third-party libraries, but none is a clear winner or actively maintained. The good news is that these libraries work for common use cases.
Below are some basic stats for Twitter Python libraries as of 30 Jan 2018.
Name | GitHub Stars | Issues | Last Release |
---|---|---|---|
python-twitter | 2039 | 42 | May 21, 2017 |
tweepy | 4792 | 244 | Nov 20, 2015 |
twython | 1472 | 49 | Apr 30, 2016 |
I can't recall why I selected Twython, but Tweepy is fairly popular (though it has not been updated for more than 2 years).
NOTE: the Twitter API is basically RESTful HTTP calls + JSON.
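To make that note concrete, here is a minimal sketch of what a search request looks like at the HTTP level. It only builds the URL and decodes a JSON body. The helper names build_search_url and parse_statuses are made up for illustration, the sample body is hand-written, and a real request also needs an OAuth Authorization header, which the library adds for you.

```python
import json
from urllib.parse import urlencode

# Standard search endpoint of the Twitter REST API v1.1
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, **params):
    """Return the GET URL for a search query (hypothetical helper)."""
    params = dict(params, q=query)
    # sort the parameters so the resulting URL is deterministic
    return SEARCH_URL + "?" + urlencode(sorted(params.items()))


def parse_statuses(body):
    """Decode a JSON response body and return its list of tweets."""
    return json.loads(body).get("statuses", [])


url = build_search_url("#WhatIDoNow", count=25, result_type="recent")
print(url)

# A hand-written body shaped like a search response, not real API output
sample_body = '{"statuses": [{"id_str": "1", "text": "hello"}]}'
print(parse_statuses(sample_body))
```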
Setup Twython
Install Twython via pip.
pip install twython
Twitter has 2 types of Authentication:
- OAuth 1: user authentication (tweet, follow, DM, etc.)
- OAuth 2: application authentication (search tweets or read user's public timeline)
You need to register your Twitter App with Twitter. After registration, you need the Consumer Key (API Key) and Consumer Secret (API Secret).
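The examples in this post hardcode the key and secret for brevity. One possible alternative, sketched here, is to read them from environment variables. The variable names TWITTER_APP_KEY and TWITTER_APP_SECRET and the helper load_app_credentials are my own choices for this sketch, not something Twitter or Twython prescribes.

```python
import os


def load_app_credentials(environ=os.environ):
    """Read the consumer key and secret from environment variables.

    TWITTER_APP_KEY and TWITTER_APP_SECRET are names chosen for this
    sketch only.
    """
    try:
        return environ["TWITTER_APP_KEY"], environ["TWITTER_APP_SECRET"]
    except KeyError as exc:
        raise RuntimeError("missing environment variable: {}".format(exc.args[0]))


# Demo with a plain dict standing in for the real environment
APP_KEY, APP_SECRET = load_app_credentials(
    {"TWITTER_APP_KEY": "YOUR_APP_KEY", "TWITTER_APP_SECRET": "YOUR_APP_SECRET"}
)
print(APP_KEY)
```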
OAuth 1: User Authentication
Use the OAuth 1 authentication method if the application needs to tweet, follow, DM, etc. on behalf of the user.
Start User Authentication.
NOTE: I am using Flask for the web app: app, session, redirect, request.
    from flask import Flask, abort, redirect, request, session
    from twython import Twython

    app = Flask(__name__)  # session also needs app.secret_key to be set
    APP_KEY = 'YOUR_APP_KEY'
    APP_SECRET = 'YOUR_APP_SECRET'

    @app.route('/connect_twitter')
    def connect_twitter():
        twitter = Twython(APP_KEY, APP_SECRET)
        # callback_url is for web application only
        auth = twitter.get_authentication_tokens(callback_url='https://www.mydomain.com/callback')
        # save these values in session / temporary storage
        session['oauth_token'] = auth['oauth_token']
        session['oauth_token_secret'] = auth['oauth_token_secret']
        # redirect user to this url
        redirect_url = auth['auth_url']
        return redirect(redirect_url)
Handle User Authentication callback.
    # Twython, APP_KEY and APP_SECRET must be available at module level
    @app.route('/callback')
    def callback():
        # if user denied authorization
        is_denied = request.values.get('denied')
        if is_denied:
            return "USER DENIED"
        oauth_verifier = request.values.get('oauth_verifier')
        if not oauth_verifier:
            abort(401, 'missing oauth_verifier')
        twitter = Twython(APP_KEY, APP_SECRET, session['oauth_token'], session['oauth_token_secret'])
        final_step = twitter.get_authorized_tokens(oauth_verifier)
        # store these permanently in database
        OAUTH_TOKEN = final_step['oauth_token']
        OAUTH_TOKEN_SECRET = final_step['oauth_token_secret']
        # get user credential
        twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
        # https://developer.twitter.com/en/docs/accounts-and-users/manage-account-settings/api-reference/get-account-verify_credentials
        data = twitter.verify_credentials()
        user_id = data['id_str']
        name = data['name']
        username = data['screen_name']
        # a Flask view must return a response
        return "Connected as @{}".format(username)
OAuth 2: Application Authentication
Use the OAuth 2 authentication method if the application just needs to search for tweets or read a user's public timeline. OAuth 1 can perform the operations of OAuth 2 as well.
    from twython import Twython

    APP_KEY = 'YOUR_APP_KEY'
    APP_SECRET = 'YOUR_APP_SECRET'

    twitter = Twython(APP_KEY, APP_SECRET, oauth_version=2)
    # store access token permanently in database
    ACCESS_TOKEN = twitter.obtain_access_token()

    # use the following to make calls for search, etc.
    twitter = Twython(APP_KEY, access_token=ACCESS_TOKEN)
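For the curious, the token exchange behind application authentication is a single HTTP POST. The sketch below follows Twitter's documented app-only flow (URL-encode the key and secret, join them with a colon, Base64-encode, send as Basic auth). It only prepares the request and does not send it. It is my own illustration with a hypothetical helper name, not Twython's actual implementation.

```python
import base64
from urllib.parse import quote

# Token endpoint for application-only authentication
TOKEN_URL = "https://api.twitter.com/oauth2/token"


def bearer_token_request(app_key, app_secret):
    """Return (url, headers, body) for an app-only token request."""
    credentials = "{}:{}".format(quote(app_key, safe=""), quote(app_secret, safe=""))
    encoded = base64.b64encode(credentials.encode("ascii")).decode("ascii")
    headers = {
        "Authorization": "Basic " + encoded,
        "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
    }
    return TOKEN_URL, headers, "grant_type=client_credentials"


url, headers, body = bearer_token_request("YOUR_APP_KEY", "YOUR_APP_SECRET")
print(headers["Authorization"])
```

The JSON response to that POST carries the access token, which is the value stored as ACCESS_TOKEN above.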
Post Tweet
OAuth 1 authentication is required.
    from twython import Twython

    # retrieve OAUTH_TOKEN, OAUTH_TOKEN_SECRET from database
    twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    twitter.update_status(status="My First Bot Tweet!")
Refer to the Twython documentation for posting tweets with images and videos.
Search Tweet
Either OAuth 1 or OAuth 2 can be used to perform a search.
    import datetime

    # retrieve ACCESS_TOKEN from database
    twitter = Twython(APP_KEY, access_token=ACCESS_TOKEN)
    # https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
    results = twitter.cursor(twitter.search, q="#WhatIDoNow", result_type='recent', count=25, tweet_mode='extended')
    max_str_id = None
    for _result in results:
        str_id = _result['id_str']
        # compare ids as integers: comparing strings breaks for ids of different lengths
        if max_str_id is None or int(str_id) > int(max_str_id):
            max_str_id = str_id
        # if tweet_mode='extended', use _result['full_text']
        text = _result['text'] if 'text' in _result else _result['full_text']
        # check if is retweet
        is_retweet = 'retweeted_status' in _result or 'quoted_status' in _result
        # generate tweet url
        user_id = _result['user']['id_str']
        username = _result['user']['screen_name']
        post_id = _result['id_str']
        url = "https://twitter.com/{}/status/{}".format(username, post_id)
        # Mon Sep 24 03:35:21 +0000 2012
        created = datetime.datetime.strptime(_result['created_at'], '%a %b %d %H:%M:%S +0000 %Y')
        # hashtags
        hashtags = [_hashtag['text'].lower() for _hashtag in _result['entities']['hashtags']]
    # you might want to save max_str_id if you plan to use since_id in next query.
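The created_at parsing above hardcodes +0000 and produces a naive datetime. If you prefer a timezone-aware value, a small variation using %z should work. This is a sketch with a helper name of my own, using the sample timestamp from the comment above.

```python
from datetime import datetime, timezone

# Format of the created_at field, e.g. "Mon Sep 24 03:35:21 +0000 2012"
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse a created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


created = parse_created_at("Mon Sep 24 03:35:21 +0000 2012")
print(created.isoformat())
```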
NOTE: there is a bug in Twython 3.4.0 where it goes into an infinite loop, returning the same tweet over and over, in certain circumstances (especially when the since_id parameter is used). Manually apply this patch.
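If patching the library is not an option, a defensive wrapper on the caller's side can at least stop the loop. The sketch below is not the patch mentioned above, just a workaround idea: stop_on_repeat is a hypothetical helper that ends iteration as soon as an id it has already seen comes back. A fake generator stands in for the misbehaving cursor.

```python
def stop_on_repeat(results):
    """Yield tweets until an already-seen id_str shows up again."""
    seen = set()
    for tweet in results:
        if tweet["id_str"] in seen:
            break  # the cursor started repeating itself, so stop
        seen.add(tweet["id_str"])
        yield tweet


def looping_results():
    """Stand-in for a cursor that repeats the same page forever."""
    while True:
        yield {"id_str": "101"}
        yield {"id_str": "100"}


tweets = list(stop_on_repeat(looping_results()))
print([t["id_str"] for t in tweets])
```

With a real cursor, the call would look like stop_on_repeat(twitter.cursor(twitter.search, ...)).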
NOTE: use tweet_mode='extended' for proper 280-character support and read the tweet from full_text. If tweet_mode='extended' is not provided, text returns a truncated tweet with a URL to the rest of the tweet. I assume this is done to maintain backward compatibility. Without tweet_mode='extended', entities.hashtags might also be incomplete if a hashtag is in the truncated portion of the tweet.
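A small helper can hide the difference between the two modes. This is a sketch operating on plain dicts. The function name and both sample tweets are made up for illustration and are not real API output.

```python
def tweet_text(tweet):
    """Return full_text when present, else the (possibly truncated) text."""
    return tweet.get("full_text") or tweet.get("text", "")


# Hand-written samples shaped like the two response styles
extended_tweet = {"full_text": "a long tweet that ends with #WhatIDoNow"}
compat_tweet = {"text": "a long tweet that... https://t.co/xyz", "truncated": True}

print(tweet_text(extended_tweet))
print(tweet_text(compat_tweet))
```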
NOTE: to check if a tweet is a retweet, check for retweeted_status and quoted_status (quoted retweet).
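If you need to tell the two cases apart rather than lump them together, the same check can return a label. A minimal sketch on plain dicts, where retweet_kind is a name of my own and the sample tweets are hand-written:

```python
def retweet_kind(tweet):
    """Return 'retweet', 'quote', or None for an original tweet."""
    if "retweeted_status" in tweet:
        return "retweet"
    if "quoted_status" in tweet:
        return "quote"
    return None


print(retweet_kind({"id_str": "1", "retweeted_status": {"id_str": "0"}}))
print(retweet_kind({"id_str": "2", "quoted_status": {"id_str": "0"}}))
print(retweet_kind({"id_str": "3"}))
```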
NOTE: if the since_id parameter is not provided and result_type='recent' is used when performing a search, the results are returned from newest to oldest.
NOTE: even though we specified count=25, twitter.cursor might return more than 25 results, because count=25 is used as the page size (25 results per HTTP call under the hood).
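If you really want at most 25 tweets, you can cap the generator yourself. A sketch using itertools.islice, with a fake endless generator standing in for twitter.cursor(...) so it runs without network access:

```python
from itertools import islice


def fake_cursor():
    """Stand-in for twitter.cursor(...): an endless stream of tweets."""
    tweet_id = 1000
    while True:
        yield {"id_str": str(tweet_id)}
        tweet_id -= 1


# Take only the first 25 items, no matter how many pages are available
first_25 = list(islice(fake_cursor(), 25))
print(len(first_25))
```

With the real library the equivalent would be islice(twitter.cursor(twitter.search, ...), 25).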
NOTE: check Search Tweets for search parameters.
NOTE: Search Tweets doesn't seem to return retweets/quoted retweets, though older Stack Overflow posts suggest otherwise.
NOTE: twitter.search returns results as per the Twitter search API, while twitter.cursor(twitter.search, ...) is a convenient method that handles paging and automatically retrieves the next page of results.
You probably don't want to repeat the same query and have Twitter keep returning the same results. Use the above code to perform the first query, then save max_str_id. Use max_str_id for the since_id parameter in the next query.
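One detail worth care when tracking max_str_id: id_str values are strings, and comparing strings goes wrong once ids differ in length ("999" sorts after "1000"). The sketch below compares them as integers. newest_id is a hypothetical helper, and the tweets are hand-written dicts.

```python
def newest_id(tweets, previous=None):
    """Return the largest id_str seen, comparing ids as integers."""
    best = int(previous) if previous is not None else None
    for tweet in tweets:
        current = int(tweet["id_str"])
        if best is None or current > best:
            best = current
    return str(best) if best is not None else None


batch = [{"id_str": "999"}, {"id_str": "1000"}]
max_str_id = newest_id(batch)
print(max_str_id)

# A later batch with nothing newer keeps the previous value
print(newest_id([{"id_str": "998"}], previous=max_str_id))
```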
The following will show new results since max_str_id (excluding max_str_id).
    results = twitter.cursor(twitter.search, q="#WhatIDoNow", result_type='recent', count=25, tweet_mode='extended', since_id=max_str_id)
    for _result in results:
        print(_result)
NOTE: if the since_id parameter is provided, the results are returned from oldest to newest.
Query User Tweets
We shall use OAuth 1 to query the authenticated user's home timeline (tweets posted by the current user and the accounts they follow).
    twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
    results = twitter.cursor(twitter.get_home_timeline, tweet_mode='extended')
NOTE: refer to Get Tweet timelines for parameters and returned results.
NOTE: retweets are included in the results; use the exclude_replies parameter to control whether replies are returned.
https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-home_timeline
You should be able to use either OAuth 1 or OAuth 2 to query a specific user's timeline.
    results = twitter.cursor(twitter.get_user_timeline, screen_name='lovewhatidonow', tweet_mode='extended')