This R package facilitates replication of Twitter-based research by providing a convenient function to download lists of tweets. It handles the common programming tasks involved: it ensures a user does not exceed Twitter's rate limits, and it saves tweets in moderately sized files. While a user could perform these tasks in their own code, doing so may be beyond the comfort level of many users.
The input for the package is a list of tweet ID numbers. See https://archive.org/details/gaza-tweets for an example.
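ID lists such as the one linked above are typically distributed as plain-text files with one tweet ID per line. A small illustrative sketch of loading one (the file here is a temporary stand-in, not part of the package): reading IDs as character strings avoids precision loss, since tweet IDs are 64-bit integers larger than R's default numeric type can represent exactly.

```r
# Stand-in for a downloaded ID file; the IDs below are illustrative only.
id_file <- tempfile(fileext = ".txt")
writeLines(c("1108225284891435008",
             "1108225285189240832"), id_file)

# Keep IDs as character strings to avoid 64-bit integer precision loss.
ids <- readLines(id_file)
length(ids)
```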
The output of the package is the downloaded tweets, returned as a tibble or written to JSON files. Examples of both are below.
This package limits the rate of tweet downloading so that Twitter's limit of 90,000 tweets per 15 minutes is not exceeded. If you choose to download the tweets to JSON files, a new JSON file is created for every 90,000 tweet ID numbers.
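The chunking described above can be sketched in a few lines of base R (an illustrative sketch, not the package's actual implementation): split the ID vector into groups of at most 90,000, one group per rate-limit window and per JSON file.

```r
# Split a vector of tweet IDs into groups of at most 90,000, matching the
# 90,000-tweets-per-15-minutes rate limit described above.
chunk_size <- 90000
ids <- as.character(seq_len(200000))   # illustrative stand-in IDs

groups <- split(ids, ceiling(seq_along(ids) / chunk_size))
length(groups)             # 200,000 IDs -> 3 groups
sapply(groups, length)     # 90,000 + 90,000 + 20,000
```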
Tweets that have been deleted or made private cannot be downloaded.
Users must obtain their own consumer key, consumer secret, access token, and access token secret from https://developer.twitter.com.
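The package does not provision credentials for you. Assuming your workflow uses the rtweet package to hold the token (an assumption; check this package's documentation for how it expects credentials), the four values can be combined like this. All key values below are placeholders.

```r
# Hypothetical credential setup via rtweet (an assumption, not this package's
# documented API). Replace every placeholder with your own values from
# https://developer.twitter.com.
library(rtweet)

token <- create_token(
  app             = "my_research_app",     # name you gave your Twitter app
  consumer_key    = "YOUR_CONSUMER_KEY",
  consumer_secret = "YOUR_CONSUMER_SECRET",
  access_token    = "YOUR_ACCESS_TOKEN",
  access_secret   = "YOUR_ACCESS_SECRET"
)
```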
kevincoakley, with zacharyst sending annoying e-mails.
Added a parameter called group_start that takes the list of split tweet ID groups and keeps only those from group_start to the end of the list. That way, if a download is interrupted, which is likely for large corpora, the user can restart the download at the group_start chunk rather than from the beginning.
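The restart behaviour can be sketched as follows (illustrative, not the package's actual code): after splitting the IDs into 90,000-ID groups, drop the groups that were already downloaded and keep those from group_start onward.

```r
# Sketch of resuming an interrupted download at a given chunk.
chunk_size <- 90000
ids <- as.character(seq_len(250000))   # illustrative stand-in IDs
groups <- split(ids, ceiling(seq_along(ids) / chunk_size))  # 3 groups

group_start <- 2                                 # first chunk finished; resume here
remaining <- groups[group_start:length(groups)]
length(remaining)                                # 2 groups left to download
```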
Added a line that prints an estimate of how long a download will take.
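A rough back-of-the-envelope version of such an estimate (an assumption about the formula; the package may compute it differently) is one 15-minute rate-limit window per group of 90,000 tweet IDs:

```r
# Estimate download time: roughly one 15-minute window per 90,000-ID group.
n_ids <- 1000000
n_groups <- ceiling(n_ids / 90000)   # 12 groups for one million IDs
est_minutes <- n_groups * 15         # ~180 minutes
est_minutes
```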
This project is licensed under the BSD License - see the LICENSE.md file for details