Vulnerable By Design

16. How to build a timeline (Twitter's version)

May 22 '23

tl;dr

Twitter open sourced its tweet recommendation algorithm. Sort of. Is this the beginning of a new era of platform transparency? Chris isn’t so sure.

Transcript

Hello and welcome to Vulnerable By Design with me, Chris Onrust. In this episode: curious adventures in open source. Are you ready?

Twitter published code for its tweet recommendation algorithm

Something momentous happened on the social media platform Twitter the other day. Errr … no, sorry, there still isn’t a default edit button. No, instead what happened, is that Twitter open sourced—as in: made public—the code for its tweet recommendation algorithm. Which is the code that determines which tweets will show up in the ‘For you’-tab on Twitter.

In theory, this code being public means that, if you were so inclined, you could just go to Twitter’s account on the collaborative software development platform GitHub (which is owned by Microsoft), look for a code repository (which is basically a folder) called ‘the-algorithm’, and inspect how Twitter decides which tweets you get to see, and in what order.

In theory, this move is massive. I haven’t seen TikTok, YouTube or Instagram making any of their algorithms public. Quite the contrary, I’d say. Most often, with these for-profit platforms, code and data are treated as a massive secret: something that users need not know about, and something that competitors should be kept as far away from as possible.

So let’s ask: Why did Twitter want to make its tweet recommendation algorithm public? I’d say one possible reason is: Always keep your promises. Twitter’s billionaire owner had conducted a Twitter poll about whether Twitter’s algorithms should be public. They said they would honour the poll’s results. The poll came out 83% in favour. So, naturally, they’re going to deliver on that, aren’t they?

But a bigger picture reason points to some really lofty ideals. Transparency! Accountability! Public scrutiny! Making Twitter the least gameable system on the Internet!

Permit me to link this to the original spirit of the free software movement. There the idea is that if you make the source code of a system freely and publicly available, then anyone—friends, foes, whoever, you and I included—anyone will have the freedom to do a bunch of things. Such as: study how the system works. Use the system in whatever way you wish, for any purpose. Even the freedom to change and improve it in whatever way you like, and to distribute copies of the system—original or modified—to others, so that everyone can benefit.

In publishing the source code of its recommendation algorithm, Twitter said it would usher in a new era of transparency, and that it even anticipated that people would suggest improvements to make Twitter better. So let’s take a peek and see what we can learn from the published code and its accompanying blog post.

Data and models

We learn that Twitter’s tweet recommendation system uses lots and lots of data. Data about: users, who they follow, and the tweets they engage with. It also uses huge models to predict how likely you are to click on, read, like, retweet or reply to a given tweet; and how likely you are to interact with a given user; as well as which users and tweets are similar to what you’re already interacting with.

By the way, Twitter has decided that it can use this idea of similarity as a stand-in for relevance. Aaaah! Relevance! Shall we do a brief let-that-sink-in moment? Using similarity as a stand-in for relevance means that when Twitter says it wants to show you the relevant tweets, it will just show you more of the same. In a world in which similarity stands in for relevance, forget about new horizons. Forget about breaking free from old habits. Forget about diving into something completely fresh and unknown. This is a world that likes its users to stay in their lane, be more predictable, be more like the rest. Maybe because it can then more reliably squish you into an advertising target audience?

Anyway. With its data and its models in place, Twitter constructs the ‘For you’-timeline in three stages. First, it finds tweets it could potentially show you: the candidate sourcing stage. Second, it puts those tweets in the right order: the ranking stage. And third, it removes tweets that it deems you do not want to, or should not, access: the filtering stage. Now, it might seem a bit odd to do the ranking before the filtering, because why would you want to spend energy on ranking tweets that you’re not even going to show? But that’s how it’s organised in here.
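To make the three stages concrete, here is a minimal sketch of such a pipeline. It is not Twitter's code: the `Tweet` class, the function names and the pre-filled `score` field are all hypothetical, and the real candidate sourcing is far more involved than taking the first 1500 items. It only shows the order of operations described above: source, then rank, then filter.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    author: str
    text: str
    score: float = 0.0  # hypothetical engagement score, assumed to be known already


def source_candidates(all_tweets, limit=1500):
    # Stage 1: pick a pool of tweets that could be shown.
    # Here we simply take the first `limit` tweets (an illustrative shortcut).
    return list(all_tweets)[:limit]


def rank(candidates):
    # Stage 2: order the pool, highest score first.
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def filter_tweets(ranked, blocked_authors):
    # Stage 3: drop tweets the user should not see, keeping the ranked order.
    return [t for t in ranked if t.author not in blocked_authors]


def build_timeline(all_tweets, blocked_authors):
    # Ranking happens before filtering, mirroring the order described in the episode.
    return filter_tweets(rank(source_candidates(all_tweets)), blocked_authors)
```

Swapping stages 2 and 3 in `build_timeline` would give the same timeline in this toy version, while ranking fewer tweets, which is exactly why the published order looks a bit odd.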

Stage 1: Candidate sourcing

So let’s take a look at what happens in each of these stages. And we’ll start with candidate sourcing. Assume, for the sake of argument of course, that you are a Twitter user. When you open your Twitter ‘For you’-feed, Twitter in some way has to determine which of the billions of tweets out there it should show you. I mean, imagine the horror scenario in which, out of the blue, you would only get fed tweets by the owner of Tesla!

An (in my opinion) reasonable way to solve this timeline construction problem is to say: Which accounts does this user follow? Present the tweets of those accounts in reverse chronological order (newest first). And if two tweets were posted at exactly the same time, then stick on some randomiser to put one of them on top.
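That simple timeline fits in a few lines. The sketch below is only an illustration of the idea: the `Tweet` class, its `posted_at` field (assumed to be a timestamp in seconds) and the function name are made up for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Tweet:
    author: str
    text: str
    posted_at: int  # assumed: seconds since some fixed starting point


def chronological_timeline(tweets, followed, rng=None):
    # Optional random generator, so the tie-breaking can be made repeatable.
    if rng is None:
        rng = random.Random()
    # Keep only tweets from accounts the user has chosen to follow.
    own = [t for t in tweets if t.author in followed]
    # Newest first; a random number decides the order of tweets posted at the same time.
    return sorted(own, key=lambda t: (-t.posted_at, rng.random()))
```

No models, no predictions, no training data: the only inputs are who you follow and when something was posted.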

In the good old days, Twitter used to be part of this highly respected (I mean: respected by me) show-only-posts-from-followed-accounts gang. But as anyone who’s dipped into Twitter over the past years will tell you: no longer. This was known, but Twitter’s code publication and its accompanying blog post reveal exactly to what extent things have changed.

Today, at most 50% (fif-ty per-cent) of the tweets on the ‘For you’-timeline will be from accounts that you’ve actually chosen to follow. The other 50% comes from accounts of whom you’ve never indicated in any way that you want to see any of the tweets whatsoever, but which Twitter decided to shove in your face anyway. Even more so, apparently, for accounts that are paying the 8 US dollars a month for a Twitter Blue subscription. Which to me sounds awfully like paid promotion? As a contributor to the online forum HackerNews commented: ‘I have spent significant effort creating a network and there you go, putting in 50% crap I don’t want to see? I despise your algorithm.’

And beware! The more likely Twitter thinks you are to engage with a certain account, the more of their tweets you will see. This is why, for your own safety and security, even though the urge may be strong, you should never, ever, ever bite when someone posts a barmy philosophy joke on Twitter. Because you will be stuck with barmy philosophy jokes for the rest of your life.

At the end of the candidate sourcing stage, Twitter has a selection of roughly 1500 tweets that it could show you. Now it needs to decide in what order to show them. Enter: the ranking stage.

Stage 2: Ranking

For ranking, Twitter uses a machine learning model. For each tweet, this machine learning model takes into account thousands of features. It then outputs ten labels for how likely you are to engage with the tweet, meaning: clicking, reading, favouriting, replying, and so on. All of that information then is combined and weighted into an overall score per tweet. The higher the score, the higher up the tweet will appear in your feed.

In terms of this weighting, at the time of writing this episode, a tweet that is likely to get favourited will get a smallish boost up in the ranking. A likely retweet is a bit more valuable in this ranking game, with one retweet being equivalent to two favourites. A predicted reply is a different league altogether though: one reply is worth 27 favourites. And if a reply to a tweet in turn is likely to get engaged with by the author of the original tweet, then that is worth a ranking boost equivalent to a massive 150 favourites. On the flip side, if a tweet is predicted to be likely to get reported, then that cancels out the boosting effect of 738 favourites. So yeah, reply and report with caution, my friends.
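The arithmetic can be sketched as a weighted sum. The weights below are the ones just mentioned, expressed in units of one favourite. Everything else is an assumption for illustration: the label names are invented, only five of the ten labels are covered, and the predicted probabilities would in reality come from Twitter's model, which we cannot run.

```python
# Weights relative to one favourite, as quoted in the episode.
# The label names are hypothetical, not the identifiers used in Twitter's code.
WEIGHTS_IN_FAVOURITES = {
    "favourite": 1.0,
    "retweet": 2.0,
    "reply": 27.0,
    "reply_engaged_by_author": 150.0,
    "report": -738.0,
}


def tweet_score(predicted):
    # `predicted` maps a label to the predicted probability (0.0 to 1.0)
    # that the user will take that action; missing labels count as 0.
    return sum(
        weight * predicted.get(label, 0.0)
        for label, weight in WEIGHTS_IN_FAVOURITES.items()
    )


# Example: a tweet that is fairly likely to be favourited,
# with a very small predicted chance of being reported.
example = tweet_score({"favourite": 0.5, "retweet": 0.1, "report": 0.001})
```

In this example, a predicted report probability of one in a thousand is already enough to outweigh the favourite and retweet predictions combined.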

None of this ranking matters though, if the tweet doesn’t make it past the third stage: filtering.

Stage 3: Filtering

During the filtering stage, Twitter removes tweets from accounts you’ve blocked or muted, and tweets that Twitter, for whatever reason, does not want to show you, including tweets that it deems offensive or that have been placed on a ban list.
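As a rough sketch of what such a filter does: the `Tweet` class, the `flags` field and the flag names used here are hypothetical, and how Twitter actually decides which flag a tweet gets is not something this example can show.

```python
from dataclasses import dataclass, field


@dataclass
class Tweet:
    author: str
    text: str
    # Hypothetical platform-assigned labels, for example "offensive" or "banned".
    flags: set = field(default_factory=set)


def filter_timeline(ranked, blocked, muted, hidden_flags):
    # Authors the user never wants to see.
    hidden_authors = blocked | muted
    # Keep the ranked order; drop tweets by hidden authors and tweets
    # carrying any flag that the platform has chosen to hide.
    return [
        t
        for t in ranked
        if t.author not in hidden_authors and not (t.flags & hidden_flags)
    ]
```

The first condition is under your control. The second one is not: whoever fills in `flags` and `hidden_flags` decides what disappears.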

Curious (to me) is that Twitter also has a filtering policy set up for misinformation, which includes the category of ‘GovernmentRequested’. Now what to make of that? It suggests to me that Twitter takes into account government requests to classify certain tweets as misinformation, and that Twitter will then offer a helping hand to block that information out. If true, then well … I have got views. Given that quite a number of governments have themselves been a massive source of deception, misinformation, if not outright lying (including, up to this day, spreading public health misinformation around COVID, and misinformation around the measures needed to tackle our planetary emergency), you could well question the wisdom of making governments the judge of what is and isn’t misinformation. But yeah, not so at Twitter.

Mix and serve

Once the filtering stage is complete and candidate tweets have been sourced, ranked, and filtered, Twitter will throw in just a sprinkle of non-tweet material that you didn’t ask for, such as ads and following recommendations. Et voilà! The ‘For you’-feed is done. Get ready to be engaged!

Publishing its tweet recommendation algorithm was meant to be the start of Twitter’s big leap into transparency. But was it?

Maybe the code that Twitter published can give you some glimmers of insights. Insights about some factors that go into timeline ranking. Insights that may even help you pierce through some of the spurious advice that sometimes gets spewed in clickbaity how-to-beat-the-algorithm articles. For example: ‘Use hashtags, so that your post can be found’ is actually not that useful advice, because it turns out that having multiple hashtags will actually dampen or de-boost your post. So maybe that is good to know for some people in some cases.

Crucial bits missing

But … there is just a tiny complication that I neglected to mention so far. The code is kaputt. Redacted. What Twitter published from the ‘For you’ algorithm is incomplete.

Missing are at least: the data used to train Twitter’s recommendation algorithm; most of the underlying models with which you could evaluate the behaviour of the algorithm; and much of the code related to Twitter’s trust and safety systems. For example, the code shows that Twitter has a ‘Not-safe-for-work’-classifier (NSFW), but it omits the code that would allow you to see how Twitter decides what is and isn’t safe for work. Twitter says it has removed this information to protect users. Maybe? But regardless of what the reason is, the result is: that code is not here.

Crucial bits of information that you would need to run the system are actually missing. Hence, lofty potential benefits such as: being able to run the system as you wish, modifying it to help others—those were never even going to be an option here. In fact, we have no way of telling for sure whether the code that Twitter published is what is actually delivering tweets to you ‘in production’, as they say.

Twitter’s path away from open, free, transparent

Now, let’s place that in perspective. Twitter’s incomplete so-called open sourcing comes on the back of a couple of moves where Twitter actually has become less open, less free, and less transparent. Do you care to hear them?

In November last year, Twitter fired its Machine Learning Ethics, Transparency and Accountability team, which had been in charge of making Twitter’s algorithms fairer and more transparent. This spring, Twitter decided that access to its API (its application programming interface, which is what you can use to work with tweets and tweet data at a large scale), which used to be free, is now going to come with a price tag of between 42,000 and 210,000 USD per month.

Who would be willing to pay that? Who would be able to pay that? If you are, by some miracle, willing and able to pay that, please consider spending that money on getting people access to clean drinking water, food security, and housing instead. In any case, the reality is that this will leave many people basically priced out of doing any large scale research on, or doing audits of, Twitter.

I’ve seen quite some academic researchers be upset about Twitter’s move to paywall its API. But my dear friends: What did you expect? Did you really think that out of the sheer benevolence of its own good heart, a for-profit company would continue to give you freebies?

[Sound of glass breaking, metal being smashed]

This is what happens, Larry. You see what happens, Larry?

– Walter Sobchak [fictional character]

This is what happens, Larry, if you let your global communications infrastructure be run by a billionaire or oligopolist. On top of that, Twitter has been suspending accounts of journalists and of anti-fascists (also known as an-tee-fa), and has been blocking links to competitor platforms such as Mastodon or Substack.

Twitter says it wants to open up its algorithms for public scrutiny. But when earlier this year, a GitHub user named ‘FreeSpeechEnthusiast’ published what is believed to have been Twitter source code, Twitter responded with that good old repressive tool of the copyright infringement notice under the United States’s Digital Millennium Copyright Act, demanding the information be taken down.

Open washing and transparency theatre

Okay, I get it. Many people today say they are tired of being manipulated online. Companies register that. They register that users—or hey, let’s be radical! call them ‘people’? They register that people prefer brands with a reputation of being transparent, open, green, small, local and organic (ha ha), over brands with a reputation for being secretive manipulators. So if that is what the people want, then that is what the people will get. Resulting in a proliferation of open washing, transparency theatre, and that curious species of fish called the red herring.

Of course, it’s not just Twitter doing this sort of thing. We’ve seen similar moves many times in recent months. A Large-Language-Models-as-a-Service provider that calls itself ‘OpenAI’, while not being open at all. The Twitter spinoff Bluesky, that claims to be decentralised but asks users of the Bluesky app to sign up for a centrally administered waiting list. Or Meta who, according to recent reports, is now dabbling in federated systems. Like … what? Meta giving up total control?

As someone wiser than me once commented: selective disclosure is often propaganda. Twitter’s supposed open sourcing of its tweet recommendation algorithm has given us some nice selective disclosure. Draw your conclusions.

Listening to community input

Twitter’s owner said that once the code was public, Twitter might get some useful community feedback about possible improvements or bug fixes, or whatever else might help make Twitter better. Well, it turns out they did!

At the time of writing this episode, the ‘the-algorithm’ repository already has 190 so-called pull requests (which is when someone proposes certain code changes, which the developer who controls the repository can then ‘pull in’ to the original code). The most popular such pull request, in terms of number of positive comments and thumbs up, is by a GitHub user named Cosmelon, who claims to have ‘optimized the algorithm’. Optimized, by deleting all of the 1,782 files in Twitter’s ‘the-algorithm’ GitHub repository.

Never too late to start listening to community input, is it?

Closing

Thank you for listening to Vulnerable By Design. If you would like to hear more or get in touch, we would love to hear from you! You will find all of our episodes and information on vulnerablebydesign.net. I’m Chris Onrust. Thank you for listening, and bye for now.