Richard Wen
rrwen.dev@gmail.com
Module for extracting Twitter data to PostgreSQL databases.
- Install PostgreSQL
- Install Node.js
- Install twitter2pg via npm
npm install --save twitter2pg
For the latest developer version, see Developer Install.
The usage examples show how to get Twitter data into a PostgreSQL table named twitter_data with a tweets jsonb column:
row | tweets |
---|---|
1 | {...} |
2 | {...} |
3 | {...} |
... | ... |
Create an appropriate PostgreSQL table with psql before running the usage examples:
- -h: host address
- -p: port number
- -d: database name
- -U: user name with table creation permissions
- -c: PostgreSQL query
psql -h localhost -p 5432 -d postgres -U postgres -c "CREATE TABLE twitter_data(tweets jsonb);"
- Search for tweets with keyword twitter using a GET request
- Filter tweets with jsonata to only return the array inside statuses
- Insert the filtered tweets into a PostgreSQL table named twitter_data
- Each row of the tweets column in the twitter_data table contains one tweet
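To picture the filtering step: a search/tweets response wraps the tweets in a statuses array alongside other metadata, and the statuses expression keeps only that array. The sketch below imitates this with plain property access instead of calling jsonata, and the response object is made-up sample data, so treat it as an illustration rather than the module's actual code path.

```javascript
// Hypothetical, trimmed-down shape of a search/tweets response (sample data only)
var response = {
  statuses: [
    {id_str: '1', text: 'first tweet'},
    {id_str: '2', text: 'second tweet'}
  ],
  search_metadata: {count: 2}
};

// Stand-in for the jsonata expression 'statuses': keep only the statuses array
function selectStatuses(data) {
  return Array.isArray(data.statuses) ? data.statuses : [];
}

// Each element of the filtered array corresponds to one row in the tweets column
var rows = selectStatuses(response).map(function(tweet) {
  return JSON.stringify(tweet);
});

console.log(rows.length); // 2
```

Everything outside statuses (here, search_metadata) is dropped before anything reaches PostgreSQL.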
var twitter2pg = require('twitter2pg');
var options = {
pg: {},
twitter: {},
jsonata: 'statuses' // filter tweets for statuses array only
};
// (options_twitter) Twitter API options
options.twitter = {
method: 'get', // get, post, delete, or stream
path: 'search/tweets', // api path
params: {q: 'twitter'} // query tweets
};
// (options_twitter_connection) Twitter API connection keys
options.twitter.connection = {
consumer_key: '***', // default: process.env.TWITTER_CONSUMER_KEY
consumer_secret: '***', // default: process.env.TWITTER_CONSUMER_SECRET
access_token_key: '***', // default: process.env.TWITTER_ACCESS_TOKEN_KEY
access_token_secret: '***' // default: process.env.TWITTER_ACCESS_TOKEN_SECRET
};
// (options_pg) PostgreSQL options
// In query, $1 are the JSON tweets
options.pg = {
table: 'twitter_data',
column: 'tweets',
query: 'INSERT INTO $options.pg.table($options.pg.column) SELECT * FROM json_array_elements($1);'
};
// (options_pg_connection) PostgreSQL connection details
options.pg.connection = {
host: 'localhost', // default: process.env.PGHOST
port: 5432, // default: process.env.PGPORT
database: 'postgres', // default: process.env.PGDATABASE
user: 'postgres', // default: process.env.PGUSER
password: '***' // default: process.env.PGPASSWORD
};
// (twitter2pg_rest) Query tweets using REST API into PostgreSQL table
twitter2pg(options).catch(err => {
console.error(err.message);
});
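The query string above uses $options.pg.table and $options.pg.column as placeholders for the table and column names, while $1 is left for PostgreSQL to bind to the JSON tweets. How twitter2pg performs the substitution internally is not shown here, so the following is only a sketch of the idea; fillQuery is a hypothetical helper, not part of the module.

```javascript
// Hypothetical helper: replace the table and column placeholders in a query
// template, leaving $1 untouched for PostgreSQL parameter binding
function fillQuery(pgOptions) {
  return pgOptions.query
    .split('$options.pg.table').join(pgOptions.table)
    .split('$options.pg.column').join(pgOptions.column);
}

var pgOptions = {
  table: 'twitter_data',
  column: 'tweets',
  query: 'INSERT INTO $options.pg.table($options.pg.column) SELECT * FROM json_array_elements($1);'
};

console.log(fillQuery(pgOptions));
// INSERT INTO twitter_data(tweets) SELECT * FROM json_array_elements($1);
```

With the placeholders filled in, json_array_elements($1) expands the filtered array so that each tweet lands in its own row.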
- Stream tweets to track keyword twitter
- When a tweet is available, insert the tweet into a PostgreSQL table named twitter_data
- Each tweet is inserted as one row in the tweets column of the twitter_data table
var twitter2pg = require('twitter2pg');
var options = {};
// (options_twitter) Twitter API options
options.twitter = {
method: 'stream', // get, post, delete, or stream
path: 'statuses/filter', // api path
params: {track: 'twitter'} // track tweets
};
// (options_twitter_connection) Twitter API connection keys
options.twitter.connection = {
consumer_key: '***', // default: process.env.TWITTER_CONSUMER_KEY
consumer_secret: '***', // default: process.env.TWITTER_CONSUMER_SECRET
access_token_key: '***', // default: process.env.TWITTER_ACCESS_TOKEN_KEY
access_token_secret: '***' // default: process.env.TWITTER_ACCESS_TOKEN_SECRET
};
// (options_pg) PostgreSQL options
// In query, $1 are the JSON tweets
options.pg = {
table: 'twitter_data',
column: 'tweets',
query: 'INSERT INTO $options.pg.table($options.pg.column) VALUES($1);'
};
// (options_pg_connection) PostgreSQL connection details
options.pg.connection = {
host: 'localhost', // default: process.env.PGHOST
port: 5432, // default: process.env.PGPORT
database: 'postgres', // default: process.env.PGDATABASE
user: 'postgres', // default: process.env.PGUSER
password: '***' // default: process.env.PGPASSWORD
};
// (twitter2pg_stream) Stream tweets into PostgreSQL table
var stream = twitter2pg(options);
stream.on('error', function(error) {
console.error(error.message);
});
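Both examples hard-code the connection values, but the comments note that each one defaults to an environment variable. A minimal sketch of building the connection objects from the environment instead is shown below; fromEnv and buildConnections are hypothetical helpers introduced for illustration, and the fallback values are simply the ones used in the examples above.

```javascript
// Hypothetical helper: read a setting from an environment object, with a fallback
function fromEnv(env, name, fallback) {
  return env[name] !== undefined && env[name] !== '' ? env[name] : fallback;
}

// Hypothetical helper: collect Twitter and PostgreSQL connection settings
function buildConnections(env) {
  return {
    twitter: {
      consumer_key: env.TWITTER_CONSUMER_KEY,
      consumer_secret: env.TWITTER_CONSUMER_SECRET,
      access_token_key: env.TWITTER_ACCESS_TOKEN_KEY,
      access_token_secret: env.TWITTER_ACCESS_TOKEN_SECRET
    },
    pg: {
      host: fromEnv(env, 'PGHOST', 'localhost'),
      port: Number(fromEnv(env, 'PGPORT', 5432)),
      database: fromEnv(env, 'PGDATABASE', 'postgres'),
      user: fromEnv(env, 'PGUSER', 'postgres'),
      password: env.PGPASSWORD
    }
  };
}

var connections = buildConnections(process.env);
console.log(connections.pg.host);
```

The results could then be assigned to options.twitter.connection and options.pg.connection, which keeps keys and passwords out of the source file.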
See Documentation for more details.
Reports for issues and suggestions can be made using the issue submission interface.
When possible, ensure that your submission is:
- Descriptive: has informative title, explanations, and screenshots
- Specific: has details of environment (such as operating system and hardware) and software used
- Reproducible: has steps, code, and examples to reproduce the issue
Code contributions are submitted via pull requests:
- Ensure that you pass the Tests
- Create a new pull request
- Provide an explanation of the changes
A template of the code contribution explanation is provided below:
## Purpose
The purpose can mention goals such as bug fixes, new features, and other improvements.
## Description
The description is a short summary of the changes made such as improved speeds or features, and implementation details.
## Changes
The changes are a list of general edits made to the files and their respective components.
* `file_path1`:
* `function_module_etc`: changed loop to map
* `function_module_etc`: changed variable value
* `file_path2`:
* `function_module_etc`: changed loop to map
* `function_module_etc`: changed variable value
## Notes
The notes provide any additional text that does not fit into the above sections.
For more information, see Developer Install and Implementation.
Install the latest developer version with npm from GitHub:
npm install git+https://github.com/rrwen/twitter2pg
Install from git cloned source:
- Ensure git is installed
- Clone into current path
- Install via npm
git clone https://github.com/rrwen/twitter2pg
cd twitter2pg
npm install
- Clone into current path
git clone https://github.com/rrwen/twitter2pg
- Enter into folder
cd twitter2pg
- Ensure devDependencies are installed and available
- Run tests with a .env file (see tests/README.md)
- Results are saved to tests/log with each file corresponding to a version tested
npm install
npm test
Use documentationjs to generate HTML documentation in the docs folder:
npm run docs
See JSDoc style for formatting syntax.
- Ensure git is installed
- Inside the twitter2pg folder, add all files and commit changes
- Push to GitHub
git add .
git commit -a -m "Generic update"
git push
- Update the version in package.json
- Run tests and check for OK status (see tests/README.md)
- Generate documentation
- Login to npm
- Publish to npm
npm test
npm run docs
npm login
npm publish
The module twitter2pg uses the following npm packages for its implementation:
npm | Purpose |
---|---|
twitter2return | Connects to the Twitter REST and Streaming Application Programming Interfaces (APIs) using twitter, and filters the results with jsonata before they are inserted into PostgreSQL |
pg | Insert Twitter data to PostgreSQL tables |
twitter2return <-- Extract Twitter data from API and Filter JSON data
|
pg <-- Insert filtered Twitter data into PostgreSQL table
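The flow in the diagram can be sketched as a small promise chain. This is only an illustration of the order of operations: fetchTweets, filter, and insertRows are hypothetical stand-ins passed in as arguments, not the twitter2return or pg APIs, and they are stubbed here so the sketch runs without Twitter or PostgreSQL.

```javascript
// Sketch of the extract -> filter -> insert flow with injected stand-ins
function runPipeline(fetchTweets, filter, insertRows) {
  return fetchTweets()
    .then(function(data) { return filter(data); })
    .then(function(filtered) { return insertRows(filtered); });
}

// Stubs: made-up sample data and an in-memory array instead of a table
var inserted = [];
var done = runPipeline(
  function() {
    return Promise.resolve({statuses: [{id_str: '1'}, {id_str: '2'}]});
  },
  function(data) { return data.statuses; },
  function(rows) {
    inserted = inserted.concat(rows);
    return rows.length;
  }
);

done.then(function(count) {
  console.log(count + ' tweets inserted');
});
```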