Let's play with Python and the python-twitter library to figure out how many Twitter followers I share with my friends.
The analysis is summarized in this Notebook, but the full Flask app is also available if you fork this project and want to run it yourself.
Requirements
First things first, let's install the library. Alternatively, you can set the library as a custom requirement for this project in the settings.
!pip install python-twitter==3.4.2
The Twitter client
To run this experiment, you need a Twitter App. If you don't have one yet, visit this page and follow the instructions: https://apps.twitter.com/
Once you have your App credentials, you can initialize the Twitter client like this:
NOTE: I'm reading the credentials from env vars. You can do the same by defining custom env vars in the Project settings.
import os
import twitter
api = twitter.Api(
    consumer_key=os.environ['TWITTER_CONSUMER_KEY'],
    consumer_secret=os.environ['TWITTER_CONSUMER_SECRET'],
    access_token_key=os.environ['TWITTER_ACCESS_TOKEN_KEY'],
    access_token_secret=os.environ['TWITTER_ACCESS_TOKEN_SECRET'],
    sleep_on_rate_limit=True  # we will talk about this later
)
api.VerifyCredentials()
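Since the client above reads its credentials from environment variables, a missing variable shows up as a bare KeyError. The sketch below is one way to check for that up front. The helper name `missing_credentials` is hypothetical and not part of python-twitter; only the four variable names come from the cell above.

```python
import os

# Names of the environment variables read when building the client.
REQUIRED_VARS = [
    'TWITTER_CONSUMER_KEY',
    'TWITTER_CONSUMER_SECRET',
    'TWITTER_ACCESS_TOKEN_KEY',
    'TWITTER_ACCESS_TOKEN_SECRET',
]


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print('Missing env vars: ' + ', '.join(missing))
```

Running it before creating the client gives a readable list of what still needs to be defined in the Project settings.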
We are good to go! Our Twitter API client is set up, so we can start sending requests to the Twitter API.
Accounts
Let's define the list of accounts we want to use, that is, the friends whose followers we want to compare.
I will list my friends below, but you probably want to replace the list with your own friends.
ACCOUNTS = [
'santiagobasulto',
'martinzugnoni',
'ivanzugnoni',
'yosoymatias',
'jperelli',
'bruno_dimartino',
# add more friends here.
]
Twitter API rate limits note
It's important to mention that, if you use accounts with a large number of followers (more than 75,000 in total), the app might take a long time to load.
The reason is the Twitter API's rate limits. We're using the GetFollowerIDs API method, which returns pages of at most 5,000 account IDs. As explained here, Twitter allows 15 requests to this endpoint per 15-minute window. That means that once your listed accounts add up to about 75,000 followers (15 requests of 5,000 IDs each), you will hit the rate limit and the app will sleep for up to 15 minutes, until the window resets, before making the next request.
That's what the sleep_on_rate_limit option is for. When you reach the Twitter API rate limit, the app sleeps for a while instead of raising an exception.
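To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes the figures quoted above (5,000 IDs per page, 15 requests per 15-minute window) and that every account costs at least one request. The function name `estimate_wait_minutes` is made up for this note, and the result is a worst-case estimate rather than a measurement.

```python
import math

PAGE_SIZE = 5000          # IDs returned per page, as quoted above
REQUESTS_PER_WINDOW = 15  # requests allowed per window, as quoted above
WINDOW_MINUTES = 15       # length of a rate-limit window


def estimate_wait_minutes(follower_counts):
    """Return (requests needed, worst-case minutes spent sleeping)."""
    # Each account needs at least one request, even with very few followers.
    requests = sum(max(1, math.ceil(n / PAGE_SIZE)) for n in follower_counts)
    windows = math.ceil(requests / REQUESTS_PER_WINDOW)
    # The first window needs no waiting; each extra one costs a full sleep.
    return requests, max(windows - 1, 0) * WINDOW_MINUTES


# Six small accounts fit comfortably in one window.
print(estimate_wait_minutes([542, 216, 86, 244, 124, 129]))
# A single account with 80,000 followers needs a 16th request.
print(estimate_wait_minutes([80000]))
```

Note that the request count is driven by pages per account, so many small accounts can also use up the window even if their follower total is far below 75,000.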
Shared friends
Alright! Without further introduction, let's compute the list of shared followers.
from itertools import combinations
FOLLOWERS_CACHE = {}
# warm up followers cache.
# requesting Twitter followers (depending on the amount) might be very slow
# save them in a local cache to avoid successive requests of the same data.
for account in ACCOUNTS:
    if account not in FOLLOWERS_CACHE:
        FOLLOWERS_CACHE[account] = api.GetFollowerIDs(screen_name=account)
result = []
for i in range(1, len(ACCOUNTS) + 1):
    # compute every combination of i accounts from the list above.
    # across all values of i, the combinations look something like:
    # ('santiagobasulto',)
    # ('santiagobasulto', 'martinzugnoni')
    # ('santiagobasulto', 'martinzugnoni', 'ivanzugnoni')
    # ('santiagobasulto', 'martinzugnoni', 'ivanzugnoni', 'yosoymatias')
    # ...
    for comb in combinations(ACCOUNTS, i):
        sets = [set(FOLLOWERS_CACHE[account]) for account in comb]
        # when the combination contains more than one account, compute the
        # intersection of their followers. Otherwise count the followers
        # of the single account.
        if len(sets) > 1:
            intersec = sets[0].intersection(*sets[1:])
            result.append({'sets': list(comb), 'size': len(intersec)})
        else:
            result.append({'sets': list(comb), 'size': len(sets[0])})
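To see what the loop produces without calling the Twitter API, here is the same idea run on a tiny made-up cache. The account names and follower IDs are invented for illustration, and `shared_sizes` is a hypothetical wrapper around the logic above.

```python
from itertools import combinations

# Made-up follower IDs, standing in for FOLLOWERS_CACHE.
toy_cache = {
    'alice': [1, 2, 3, 4],
    'bob': [3, 4, 5],
    'carol': [4, 5, 6],
}


def shared_sizes(followers_by_account):
    """Return one {'sets': [...], 'size': n} dict per non-empty combination."""
    accounts = list(followers_by_account)
    result = []
    for i in range(1, len(accounts) + 1):
        for comb in combinations(accounts, i):
            sets = [set(followers_by_account[name]) for name in comb]
            # Intersecting a single set with nothing returns the set itself,
            # so one expression covers both the single and the shared case.
            shared = sets[0].intersection(*sets[1:])
            result.append({'sets': list(comb), 'size': len(shared)})
    return result


for entry in shared_sizes(toy_cache):
    print(entry)
```

With n accounts this yields 2**n - 1 entries (7 for the toy data, 63 for six accounts), so the list grows quickly as you add friends.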
Let's take a look at the result list, and try to understand what it means.
The list will look similar to this:
[{'sets': ['santiagobasulto'], 'size': 542},
{'sets': ['martinzugnoni'], 'size': 216},
{'sets': ['ivanzugnoni'], 'size': 86},
{'sets': ['yosoymatias'], 'size': 244},
{'sets': ['jperelli'], 'size': 124},
{'sets': ['bruno_dimartino'], 'size': 129},
{'sets': ['santiagobasulto', 'martinzugnoni'], 'size': 65},
{'sets': ['santiagobasulto', 'ivanzugnoni'], 'size': 9},
{'sets': ['santiagobasulto', 'yosoymatias'], 'size': 9},
...
Each dict in the list represents the number of followers shared between the accounts listed under the 'sets' key.
For example:
{'sets': ['santiagobasulto', 'martinzugnoni'], 'size': 65},
means that santiagobasulto and I share 65 followers on Twitter.
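If you want to pull a single number out of the list, a small lookup like the following sketch works. The helper `shared_size` is hypothetical, and the sample data is just a few entries copied from the output above.

```python
# A few entries copied from the example output above.
sample_result = [
    {'sets': ['santiagobasulto'], 'size': 542},
    {'sets': ['martinzugnoni'], 'size': 216},
    {'sets': ['santiagobasulto', 'martinzugnoni'], 'size': 65},
]


def shared_size(result, *accounts):
    """Return the size recorded for exactly these accounts, or None."""
    wanted = set(accounts)
    for entry in result:
        # Compare as sets so the order of the account names doesn't matter.
        if set(entry['sets']) == wanted:
            return entry['size']
    return None


print(shared_size(sample_result, 'martinzugnoni', 'santiagobasulto'))
```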
Why this format? Simply because we want to use the Venn.js library to plot our results.
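Venn.js takes an array of objects that each have a sets list and a size, which is exactly the shape of the dicts above. As a minimal sketch, the list can be serialized with the standard json module before it is handed to the page. How the Flask app actually passes the data to its template is not shown here, so treat this as an assumption about one possible approach.

```python
import json

# A few entries in the same shape as the result list above.
sample_result = [
    {'sets': ['santiagobasulto'], 'size': 542},
    {'sets': ['martinzugnoni'], 'size': 216},
    {'sets': ['santiagobasulto', 'martinzugnoni'], 'size': 65},
]

# Serialize to a JSON string that JavaScript code can parse as an array.
payload = json.dumps(sample_result)
print(payload)
```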
Plot the result
As mentioned above, we will use the https://github.com/benfred/venn.js/ library to plot our results in the HTML template. The library is pretty straightforward to use, so I won't go into much detail about how to do it.
For further details, you may want to take a look at the venn_friends.html template.
If you did everything right, your plot result should look similar to this:
Running the Flask app
Want to run the app yourself? Just make sure the packages in requirements.txt are installed, and execute this in the terminal:
python app.py
You should see the output saying
Running on http://0.0.0.0:5000/
That's all! Now open the webview down below and see if the app is working.