It’s never a bad time to back up your data. But for Twitter users, it’s arguably more urgent than usual, what with the platform’s recent … unpredictability. Mass firings and resignations, whiplash policy changes and crippled infrastructure don’t instill a ton of confidence that Twitter will remain stable well into the distant future. That’s why it’s worth considering archiving your account for posterity.
Twitter’s long offered a tool for archiving account data, which at present allows you to copy your data in a machine-readable format that’s portable to a select few other services. But while the tool works well for simple backups, the archives it creates aren’t particularly user friendly. There isn’t an obvious way to quickly organize the tens to thousands of tweets an archive might contain, for instance, or drill down within an archive for specific types of tweets and embedded media.
Fortunately, thanks to the open source community, there are freely available tools for those who wish to exercise more control over their Twitter archives. They don’t subvert Twitter’s archive request process (you’ll need an account archive directly from Twitter to use many of the tools), but they make working with Twitter archives less painful while expanding the archives’ usefulness, at least in theory.
Note that not all of the tools are necessarily easy to use for non-developers. Many require knowledge of Python and other programming languages, and any tool that accesses Twitter’s API needs keys from a Twitter Developer account. (Disclaimer: Don’t provide tools access to your account if you don’t fully trust them.) But the tools at the very least provide basic setup instructions to help novices get up and running.
Perhaps the most comprehensive of the bunch is the Twitter Archive Parser, which aims to fix or work around some of the more egregious flaws in Twitter’s archiving system (e.g., shortened links and tweets stored in a complex code structure). The tool converts tweets and even direct messages into Markdown, the markup language supported by most content management systems and editors, and also into HTML, complete with embedded images, videos and links.
The Twitter Archive Parser goes beyond the Twitter archiver’s bare-bones functionality: It replaces shortened URLs with their original versions, copies tweeted images to a folder (for easier sorting), outputs lists of your followers and the people you’re following, and downloads images in their original sizes. (By default, Twitter’s archiver swaps out full-sized images in tweets for smaller ones.)
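To give a sense of what replacing shortened URLs involves, here is a minimal Python sketch. It is not the Twitter Archive Parser’s actual code. It assumes the archive stores tweets in a JavaScript file that assigns a JSON array to a variable, and that each tweet lists its links under `entities.urls` with `url` and `expanded_url` fields; the sample data and the function names `load_archive_js` and `expand_links` are made up for illustration.

```python
import json


def load_archive_js(text):
    """Parse an archive .js file, assumed to be `<variable> = <JSON array>`."""
    # Everything after the first "=" is treated as plain JSON (an assumption).
    _, _, payload = text.partition("=")
    return json.loads(payload)


def expand_links(tweet):
    """Return the tweet text with each shortened link swapped for its original."""
    text = tweet["full_text"]
    for link in tweet.get("entities", {}).get("urls", []):
        text = text.replace(link["url"], link["expanded_url"])
    return text


# Hypothetical sample shaped like the assumed archive format.
sample = '''window.YTD.tweets.part0 = [
  {"tweet": {"id_str": "1", "full_text": "Read this https://t.co/abc",
             "entities": {"urls": [{"url": "https://t.co/abc",
                                    "expanded_url": "https://example.com/post"}]}}}
]'''

for entry in load_archive_js(sample):
    print(expand_links(entry["tweet"]))
```

Because the original addresses are already stored in the archive under these assumptions, a sketch like this never has to contact Twitter or follow redirects.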
If a more user-friendly archive viewing experience is all you’re after, however, Twitter Archive Browser fits the bill. It displays your entire Twitter timeline going back to your very first tweet and lets you browse your direct message history offline. As you’d expect, Twitter Archive Browser will remain completely functional in the event you delete your Twitter account, showing any media you’ve uploaded, including images and videos.
Another Python-based tool, Taupe, is more limited in its capabilities than, say, Twitter Archive Parser. But it does exactly what it advertises: It extracts the URLs of your tweets, retweets, replies, quote tweets and “likes” from a personal Twitter archive. (Taupe is a loose acronym for “Twitter archive URL parser.”)
Taupe takes a Twitter archive, extracts the URLs corresponding to the tweets, retweets and such, and outputs the results in a spreadsheet format that can be used with other software and services, such as the Internet Archive’s Wayback Machine. Taupe has limitations. For example, because the Twitter archive format for “likes” doesn’t contain a timestamp, Taupe can’t know or show exactly when individual tweets were liked. Still, it’s one of the simplest ways to quickly convert historical Twitter data into a more usable format.
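The general idea of turning archive entries into a spreadsheet of URLs can be sketched in a few lines of Python. This is not Taupe’s implementation or its output format. It assumes each archive entry carries an `id_str` and, for tweets, a `created_at` value, and that a tweet’s address can be built from an account handle plus that ID; the handle `example_user` and the two column names are placeholders.

```python
import csv
import io


def tweet_rows(entries, handle):
    """Yield one row per archive entry with its timestamp (if any) and URL."""
    for entry in entries:
        tweet = entry["tweet"]
        url = f"https://twitter.com/{handle}/status/{tweet['id_str']}"
        # Some entry types may lack a timestamp, so fall back to an empty cell.
        yield {"created_at": tweet.get("created_at", ""), "url": url}


def to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["created_at", "url"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical entries shaped like the assumed archive format.
entries = [
    {"tweet": {"id_str": "20", "created_at": "Tue Mar 21 20:50:14 +0000 2006"}},
    {"tweet": {"id_str": "21"}},
]
print(to_csv(tweet_rows(entries, "example_user")))
```

The second sample entry has no timestamp, which mirrors the limitation with “likes”: the URL can still be listed, but the date cell stays empty.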
As a complement to Taupe, there’s the self-descriptive Export your Twitter Bookmarks tool, which saves all your Twitter bookmarks, including photos, videos and fully expanded URLs attached to tweets, in a Markdown file. (Twitter archives don’t include bookmarked tweets.) And for bulk-deleting tweets in an archive, Twitter Archive Browser comes in handy. It can auto-delete tweets from a certain timeframe or containing certain keywords.
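Picking out tweets by timeframe or keyword is straightforward once the archive is loaded. The Python sketch below only selects matching tweets; it deletes nothing and is not taken from any of the tools mentioned here. It assumes each tweet has a `full_text` field and a `created_at` timestamp in the form `Tue Mar 21 20:50:14 +0000 2006`; the function name `matches` and the sample tweets are invented for illustration.

```python
from datetime import datetime, timezone

# Assumed timestamp layout for the archive's created_at field.
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def matches(tweet, start=None, end=None, keywords=()):
    """Return True if the tweet falls in [start, end) and contains a keyword.

    Any filter left at its default is ignored.
    """
    created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
    in_range = (start is None or created >= start) and (
        end is None or created < end
    )
    text = tweet["full_text"].lower()
    has_keyword = not keywords or any(k.lower() in text for k in keywords)
    return in_range and has_keyword


# Hypothetical tweets shaped like the assumed archive format.
tweets = [
    {"id_str": "1", "created_at": "Mon Jan 03 10:00:00 +0000 2011",
     "full_text": "Hot take about pizza"},
    {"id_str": "2", "created_at": "Fri Jun 05 12:30:00 +0000 2020",
     "full_text": "Another pizza post"},
    {"id_str": "3", "created_at": "Sat Jun 06 09:00:00 +0000 2020",
     "full_text": "Weather is nice"},
]

cutoff = datetime(2015, 1, 1, tzinfo=timezone.utc)
old_ids = [t["id_str"] for t in tweets if matches(t, end=cutoff)]
pizza_ids = [t["id_str"] for t in tweets if matches(t, keywords=["pizza"])]
print(old_ids, pizza_ids)
```

A deletion tool would then pass the selected IDs to Twitter’s API, which is the step that requires developer credentials.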
Folks in need of more exhaustive Twitter timeline pruning will want to try Twitter Cleaner, which can automatically delete tweets, retweets and favorites from an archive. Twitter Cleaner can also remove entries from an active timeline, but that requires a Twitter developer account.
What if you’re only interested in specific artifacts from your Twitter account, like photos? While there’s no way around downloading your entire account archive, some tools help to surface only the items of interest within that archive.
For example, Twitter Photo Downloader processes your Twitter archive to create a local database containing all your photos. It’ll even work for image galleries and photos from retweets, albeit not for videos and GIFs. (You’ll see a single still frame in place of a GIF.) Twitter Archive Parser is even more stripped down. The tool converts individual tweets in an archive into PDFs for storage purposes.
Well, those are all the open source tools we’ve spotted for managing Twitter archives so far. If we missed any, feel free to send us an email and we’ll see about adding them to the list.
Source @TechCrunch