Category Archives: Storify

Good Bye Storify

For almost a year I’ve been helping social media curation tool Storify as a software engineer with their Node.js apps, Backbone.js front-end development as well as supporting Storify API, implementing Twitter API v1.1 intergration, writing blog posts and answering Storify API questions. We had some great moments and a few weeks ago I summed them up in a post.

Continue reading

Express.js Tutorial: Instagram Gallery Example App with Storify API

Note: This text is a part of Express.js Guide: The Comprehensive Book on Express.js.

An example of an Express.js app using Storify API as a data source is a continuation of introduction to Express.js tutorials.

Continue reading

Node.js OAuth1.0 and OAuth2.0: Twitter API v1.1 Examples

Recently we had to work on modification to accommodate Twitter API v1.1. The main difference between Twitter API v1.1 and, soon to be deprecated, Twitter API v1.0 is that most of the REST API endpoints now require user or application context. In other words, each call needs to be performed via OAuth 1.0A or OAuth 2.0 authentication.

Node.js OAuth

Node.js OAuth

At Storify we run everything on Node.js so it was natural that we used oauth module by Ciaran Jessup: NPM and GitHub. It’s mature and supports all the needed functionality but lacks any kind of examples and/or interface documentation.

Here are the examples of calling Twitter API v1.1, and a list of methods. I hope that nobody will have to dig through the oauth module source code anymore!

OAuth 1.0

Let start with a good old OAuth 1.0A. You’ll need four values to make this type of a request to Twitter API v1.1 (or any other service):

  1. Your Twitter application key, a.k.a., consumer key
  2. Your Twitter secret key
  3. User token for your app
  4. User secret for your app

All four of them can be obtained for your own apps at dev.twitter.com. In case that the user is not youself, you’ll need to perform 3-legged OAuth, or Sign in with Twitter, or something else.

Next we create oauth object with parameters, and call get() function to fetch a secured resource. Behind the scene get() function constructs unique values for the request header — Authorization header. The method encrypts URL, timestamp, application and other information in a signature. So the same header won’t work for another URL or after a specific time window.

var OAuth = require('OAuth');
var oauth = new OAuth.OAuth(
      'https://api.twitter.com/oauth/request_token',
      'https://api.twitter.com/oauth/access_token',
      'your Twitter application consumer key',
      'your Twitter application secret',
      '1.0A',
      null,
      'HMAC-SHA1'
    );
    oauth.get(
      'https://api.twitter.com/1.1/trends/place.json?id=23424977',
      'your user token for this app', 
      //you can get it at dev.twitter.com for your own apps
      'your user secret for this app', 
      //you can get it at dev.twitter.com for your own apps
      function (e, data, res){
        if (e) console.error(e);        
        console.log(require('util').inspect(data));
        done();      
      });    
});

OAuth Echo

OAuth Echo is similar to OAuth 1.0. If you’re a Delegator (service to which requests to Service Provider are delegated by Consumer) all you need to do is just pass the value of x-verify-credentials-authorization header to the Service Provider in Authorization header. Twitter has a good graphics on OAuth Echo.

There is OAuthEcho object which inherits must of its methods from normal OAuth class. In case if you want to write Consumer code (or for functional tests, in our case Storify is the delegator) and you need x-verify-credentials-authorization/Authorization header values, there is a authHeader method. If we look at it, we can easily reconstruct the headers with internal methods of oauth module such as _prepareParameters() and _buildAuthorizationHeaders(). Here is a function that will give us required values based on URL (remember that URL is a part of Authorization header):

  function getEchoAuth(url) { 
  //helper to construct echo/oauth headers from URL
    var oauth = new OAuth('https://api.twitter.com/oauth/request_token',
      'https://api.twitter.com/oauth/access_token',
      "AAAAAAAAAAAAAAAAAAAA",
      //test app token
      "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB", 
      //test app secret
    '1.0A',
    null,
      'HMAC-SHA1');
    var orderedParams = oauth._prepareParameters(
      "1111111111-AAAAAA", //test user token
    "AAAAAAAAAAAAAAAAAAAAAAA", //test user secret
    "GET",
    url
    );
    return oauth._buildAuthorizationHeaders(orderedParams);
  }

From your consumer code you can maker request with superagent or other http client library (e.g., node.js core http module’s http.request):

var request = require('super agent');

request.post('your delegator api url')
  .send({...}) 	
  //your json data
  .set(
    'x-auth-service-provider',
    'https://api.twitter.com/1.1/account/verify_credentials.json')
  .set(
    'x-verify-credentials-authorization',
    getEchoAuth("https://api.twitter.com/1.1/account/verify_credentials.json"))
  .end(function(res){console.log(res.body)});

OAuth2

OAuth 2.0 is a breeze to use comparing to the other authentication methods. Some argue that it’s not as secure, so make sure that you use SSL and HTTPS for all requests.

 var OAuth2 = OAuth.OAuth2;    
 var twitterConsumerKey = 'your key';
 var twitterConsumerSecret = 'your secret';
 var oauth2 = new OAuth2(
   twitterconsumerKey,
   twitterConsumerSecret, 
   'https://api.twitter.com/', 
   null,
   'oauth2/token', 
   null);
 oauth2.getOAuthAccessToken(
   '',
   {'grant_type':'client_credentials'},
   function (e, access_token, refresh_token, results){
     console.log('bearer: ',access_token);
     oauth2.get('protected url', 
       access_token, function(e,data,res) {
         if (e) return callback(e, null);
         if (res.statusCode!=200) 
           return callback(new Error(
             'OAuth2 request failed: '+
             res.statusCode),null);
         try {
           data = JSON.parse(data);        
         }
         catch (e){
           return callback(e, null);
         }
         return callback(e, data);
      });
   });

Please note the JSON.parse() function, oauth module returns string, not a JavaScript object.

Consumers of OAuth2 don’t need to fetch the bearer/access token for every request. It’s okay to do it once and save value in the database. Therefore, we can make requests to protected resources (i.e. Twitter API v.1.1) with only one secret password. For more information check out Twitter application only auth.

Node.js oauth API

Node.js oauth OAuth

oauth.OAuth()

Parameters:

  • requestUrl
  • accessUrl
  • consumerKey
  • consumerSecret
  • version
  • authorize_callback
  • signatureMethod
  • nonceSize
  • customHeaders

Node.js oauth OAuthEcho

oauth.OAuthEcho()

Parameters:

  • realm
  • verify_credentials
  • consumerKey
  • consumerSecret
  • version
  • signatureMethod
  • nonceSize
  • customHeaders

OAuthEcho sharers the same methods as OAuth

Node.js oauth Methods

Secure HTTP request methods for OAuth and OAuthEcho classes:

OAuth.get()

Parameters:

  • url
  • oauth_token
  • oauth_token_secret
  • callback

OAuth.delete()

Parameters:

  • url
  • oauth_token
  • oauth_token_secret
  • callback

OAuth.put()

Parameters:

  • url
  • oauth_token
  • oauth_token_secret
  • post_body
  • post_content_type
  • callback

OAuth.post()

Parameters:

  • url
  • oauth_token
  • oauth_token_secret
  • post_body
  • post_content_type
  • callback

https://github.com/ciaranj/node-oauth/blob/master/lib/oauth.js

Node.js oauth OAuth2

OAuth2 Class

OAuth2()

Parameters:

  • clientId
  • clientSecret
  • baseSite
  • authorizePath
  • accessTokenPath
  • customHeaders

OAuth2.getOAuthAccessToken()

Parameters:

  • code
  • params
  • callback

OAuth2.get()

Parameters:

  • url
  • access_token
  • callback

https://github.com/ciaranj/node-oauth/blob/master/lib/oauth2.js

The authors of node.js oauth did a great job but currently there are 32 open pull requests (mine is one of them) and it makes me sad. Please let them know that we care about improving Node.js ecosystem of modules and developers community!

UPDATE: Pull request was successfully merged!

Useful Twitter API v1.1 Resources

Just because they are vast and not always easy to find.

Tools

Decreasing 64-bit Tweet ID in JavaScript

As some of you might know, JavaScript is only able to handle integers up to 53-bit in size. This post, Working with large integers in JavaScript (which is a part of Numbers series) does a great job at explaining general concepts on dealing with large numbers in JS.

64-bit Tweet ID is "rounded" in JS

64-bit Tweet ID is “rounded” in JS

I had to do some research on the topic when I was re-writing some JavaScript code responsible for handling Twitter search in Storify editor: we had tweet duplicates in results! In this article, Working with Timelines, Twitter official documentation says:

Environments where a Tweet ID cannot be represented as an integer with 64 bits of precision (such as JavaScript) should skip this step.

So true, because id and id_str fields in a Twitter API response were different. Apparently, JavaScript engine just “rounds” inappropriately large numbers. :-( The task was complicated by the fact that I needed to subtract 1 from the last tweet’s ID to prevent its reappearance in a second search response. After the subtraction I could have easily passed the value to max_id parameter of Twitter API.

I’ve come across different solutions, but decided to write my own function which is simple to understand and not heavy on resources. Here is a script to decrease tweet ID which is a 64-bit number in JavaScript without libraries or recursion, to use with max_id or since_id in Twitter API:

function decStrNum (n) {
    n = n.toString();
    var result=n;
    var i=n.length-1;
    while (i>-1) {
      if (n[i]==="0") {
        result=result.substring(0,i)+"9"+result.substring(i+1);
        i --;
      }
      else {
        result=result.substring(0,i)+(parseInt(n[i],10)-1).toString()+result.substring(i+1);
        return result;
      }
    }
    return result;
}

To check if it works, you can run these logs:

console.log("290904187124985850");
console.log(decStrNum("290904187124985850"));
console.log("290904187124985851");
console.log(decStrNum("290904187124985851"));
console.log("290904187124985800");
console.log(decStrNum("290904187124985800"));
console.log("000000000000000001");
console.log(decStrNum("0000000000000000001"));

Alternative solution which I’ve found in a StackOverflow question was suggested by Bob Lauer, but it involves recursion and IMHO is more complicated:

function decrementHugeNumberBy1(n) {
    // make sure s is a string, as we can't do math on numbers over a certain size
    n = n.toString();
    var allButLast = n.substr(0, n.length - 1);
    var lastNumber = n.substr(n.length - 1);

    if (lastNumber === "0") {
        return decrementHugeNumberBy1(allButLast) + "9";
    }
    else {      
        var finalResult = allButLast + (parseInt(lastNumber, 10) - 1).toString();
        return trimLeft(finalResult, "0");
    }
}

function trimLeft(s, c) {
    var i = 0;
    while (i < s.length && s[i] === c) {
        i++;
    }

    return s.substring(i);
}

Now, if you’re the type of person who likes to shoot sparrows with a howitzer, there are full-blown libraries to handle operations on large numbers in JavaScript; just to name a few: BigInteger, js-numbers and javascript-bignum.

MongoDB migration with Node and Monk

Recently one of our top users complained that their Storify account was unaccessible. We’ve checked the production database and it appeares to be that the account might have been compromised and maliciously deleted by somebody using user’s account credentials. Thanks to a great MongoHQ service, we had a backup database in less than 15 minutes.
There were two options to proceed with the migration:

  1. Mongo shell script
  2. Node.js program

Because Storify user account deletion involves deletion of all related objects — identities, relationships (followers, subscriptions), likes, stories — we’ve decided to proceed with the latter option. It worked perfectly, and here is a simplified version which you can use as a boilerplate for MongoDB migration (also at gist.github.com/4516139).

Restoring MongoDB Records

Restoring MongoDB Records

Let’s load all the modules we need: Monk, Progress, Async, and MongoDB:

var async = require('async');
var ProgressBar = require('progress');
var monk = require('monk');
var ObjectId=require('mongodb').ObjectID;

By the way, made by LeanBoost, Monk is a tiny layer that provides simple yet substantial usability improvements for MongoDB usage within Node.JS.

Monk takes connection string in the following format:

username:password@dbhost:port/database

So we can create the following objects:

var dest = monk('localhost:27017/storify_localhost');
var backup = monk('localhost:27017/storify_backup');

We need to know the object ID which we want to restore:

var userId = ObjectId(YOUR-OBJECT-ID); 

This is a handy restore function which we can reuse to restore objects from related collections by specifying query (for more on MongoDB queries go to post Querying 20M-Record MongoDB Collection. To call it, just pass a name of the collection as a string, e.g., "stories" and a query which associates objects from this collection with your main object, e.g., {userId:user.id}. The progress bar is needed to show us nice visuals in the terminal.

var restore = function(collection, query, callback){
  console.info('restoring from ' + collection);
  var q = query;
  backup.get(collection).count(q, function(e, n) {
    console.log('found '+n+' '+collection);
    if (e) console.error(e);
    var bar = new ProgressBar('[:bar] :current/:total :percent :etas', { total: n-1, width: 40 })
    var tick = function(e) {
      if (e) {
        console.error(e);
        bar.tick();
      }
      else {
        bar.tick();
      }
      if (bar.complete) {
        console.log();
        console.log('restoring '+collection+' is completed');
        callback();                
      }
    };
    if (n>0){
      console.log('adding '+ n+ ' '+collection);
      backup.get(collection).find(q, { stream: true }).each(function(element) {
        dest.get(collection).insert(element, tick);
      });        
    } else {
      callback();
    }
  });
}

Now we can use async to call the restore function mentioned above:

async.series({
  restoreUser: function(callback){   // import user element
    backup.get('users').find({_id:userId}, { stream: true, limit: 1 }).each(function(user) {
      dest.get('users').insert(user, function(e){
        if (e) {
          console.log(e);
        }
        else {
          console.log('resored user: '+ user.username);
        }
        callback();
      });
    });
  },

  restoreIdentity: function(callback){  
    restore('identities',{
      userid:userId
    }, callback);
  },

  restoreStories: function(callback){
    restore('stories', {authorid:userId}, callback);
  }

  }, function(e) {
  console.log();
  console.log('restoring is completed!');
  process.exit(1);
});

The full code is available at gist.github.com/4516139 and here:

var async = require('async');
var ProgressBar = require('progress');
var monk = require('monk');
var ms = require('ms');
var ObjectId=require('mongodb').ObjectID;

var dest = monk('localhost:27017/storify_localhost');
var backup = monk('localhost:27017/storify_backup');

var userId = ObjectId(YOUR-OBJECT-ID); // monk should have auto casting but we need it for queries

var restore = function(collection, query, callback){
  console.info('restoring from ' + collection);
  var q = query;
  backup.get(collection).count(q, function(e, n) {
    console.log('found '+n+' '+collection);
    if (e) console.error(e);
    var bar = new ProgressBar('[:bar] :current/:total :percent :etas', { total: n-1, width: 40 })
    var tick = function(e) {
      if (e) {
        console.error(e);
        bar.tick();
      }
      else {
        bar.tick();
      }
      if (bar.complete) {
        console.log();
        console.log('restoring '+collection+' is completed');
        callback();                
      }
    };
    if (n>0){
      console.log('adding '+ n+ ' '+collection);
      backup.get(collection).find(q, { stream: true }).each(function(element) {
        dest.get(collection).insert(element, tick);
      });        
    } else {
      callback();
    }
  });
}

async.series({
  restoreUser: function(callback){   // import user element
    backup.get('users').find({_id:userId}, { stream: true, limit: 1 }).each(function(user) {
      dest.get('users').insert(user, function(e){
        if (e) {
          console.log(e);
        }
        else {
          console.log('resored user: '+ user.username);
        }
        callback();
      });
    });
  },

  restoreIdentity: function(callback){  
    restore('identities',{
      userid:userId
    }, callback);
  },

  restoreStories: function(callback){
    restore('stories', {authorid:userId}, callback);
  }

  }, function(e) {
  console.log();
  console.log('restoring is completed!');
  process.exit(1);
});
           

To launch it, run npm install/update and change hard-coded database values.

Querying 20M-Record MongoDB Collection

Storify saves a lot of meta data about social elements: tweets, Facebook status updates, blog posts, news articles, etc. MongoDB is great for storing such unstructured data but last week I had to fix some inconsistency in 20-million-record Elements collection.

The script was simple: find elements, see if there are no dependencies, delete orphan elements, neveretheless it was timing out or just becoming unresponsive. After a few hours of running different modifications I came up with the working solution.

Here are some of the suggestions when dealing with big collections on Node.js + MongoDB stack:

Befriend Shell

Interactive shell, or mongo, is a good place to start. To launch it, just type mongo in your terminal window:

$ mongo

Assuming you have correct paths set-up during your MongoDB installation, the command will start the shell and present angle brace.

>

Use JS files

To execute JavaScript file in a Mongo shell run:

$ mongo fix.js --shell

Queries look the same:

db.elements.find({...}).limit(10).forEach(printjson);

To output results use:

print();

or

printjson();

To connect to a database:

db = connect("<host>:<port>/<dbname>")

Break Down

Separate your query into a few scripts with smaller queries. You can output each script to a file (as JSON or CSV) and then look at the output and see if your script is doing what it is actually supposed to do.

To execute JavaScript file (fix.js) and output results into another file (fix.txt) instead of the screen, use:

$ mongo fix.js > fix.txt --shell

or

$ mongo --quiet fix.js > fix.txt --shell

Check count()

Simply run count() to see the number of elements in the collection:

 db.collection.count();

or a cursor:

 db.collection.find({…}).count();

Use limit()

You can apply limit() function to your cursor without modifying anything else in a script to test the output without spending too much time waiting for the whole result.

For example:

 db.find({…}).limit(10).forEach(function() {…});

or

 db.find({…}).limit(1).forEach(function() {…});

is better than using:

 db.findOne({…})

because findOne() returns single document while find() and limit() still returns a cursor.

Hit Index

hint() index will allow you to manually use particular index:

 db.elemetns.find({…}).hint({active:1, status:1, slug:1});

Make sure you have actual indexes with ensureIndex():

 db.collection.ensureIndex({…})

Narrow Down

Use additional criteria such as $ne, $where, $in, e.g.:

db.elements.find({ $and:[{type:'link'}
  ,{"source.href":{$exists:true}}
  ,{'date.created':{$gt: new Date("November 30 2012")}}
  ,{$where: function () {
    if (this.meta&&this.data&&this.data&&this.data.link) {
      return this.meta.title!=this.data.link.title;
    } else {
      return false;
    }}} 
  , {'date.created': {$lt: new Date("December 2 2012")}}]}).forEach(function(e, index, array){
    print(e._id.str);
    });

My First Week At Storify

Last week I joined Storify — a destination for curated social media news. Storify helps you sort through the noise to find the voices online that matter. To find more about Storify take a look at the guided tour.

Storify co-founder Burt and I met a couple months ago for the first time and I’m glad that we did. There were three main reasons for me to come on board: great team, awesome product and company vision, and cool tech stack that I’m passionate about: Node.js+Express+MongoDB.

Storify on Nodejs.org

Storify on Nodejs.org

The first week at Storify exceeded my expectations! So far there were: 4 team lunches, one birthday party, two (!) break-ins. In addition, I’ve worked on the front-page on my second day and had a chance to SSH to production servers.

A few word about the office, besides free snacks and espresso and being close to everything, there are two other startups, Buffer and HomeLight. The funny thing is that I’ve discovered and fallen in love with Buffer just a few weeks ago and now I’ve met with Leo and sit next to their brilliant team!

By the way, Storify is hiring bright minds: Operations Engineer and Front-End Engineer. If you want to do work on interesting things check out full job description.