Tag Archives: MongoDB

Querying 20M-Record MongoDB Collection

Storify saves a lot of metadata about social elements: tweets, Facebook status updates, blog posts, news articles, etc. MongoDB is great for storing such unstructured data, but last week I had to fix some inconsistencies in a 20-million-record Elements collection.

The script was simple: find elements, check whether they have any dependencies, and delete the orphans. Nevertheless, it kept timing out or becoming unresponsive. After a few hours of trying different modifications, I came up with a working solution.

Here are some suggestions for dealing with big collections on the Node.js + MongoDB stack:

Befriend Shell

The interactive shell, mongo, is a good place to start. To launch it, just type mongo in your terminal window:

$ mongo

Assuming your paths were set up correctly during the MongoDB installation, the command will start the shell and present an angle-bracket prompt:

>

Use JS files

To execute a JavaScript file in the mongo shell and stay in the shell afterwards, run:

$ mongo fix.js --shell

Queries look the same:

db.elements.find({...}).limit(10).forEach(printjson);

To output results use:

print();

or

printjson();

To connect to a database:

db = connect("<host>:<port>/<dbname>")
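As an illustration of what a small fix.js could contain, here is a minimal dry-run sketch of the "find elements, check dependencies, collect orphans" flow. It is not the script used at Storify: the stories collection and its elementIds field are hypothetical names chosen for the example, and the sketch only counts orphans instead of deleting them.

```javascript
// fix.js: a dry-run sketch. The `stories` collection and `elementIds`
// field are assumptions made for this example.

// Collect the ids of documents that nothing depends on.
// `cursor` is anything with forEach (a shell cursor or a plain array);
// `hasDependencies` is a predicate supplied by the caller.
function findOrphanIds(cursor, hasDependencies) {
  var orphanIds = [];
  cursor.forEach(function (element) {
    if (!hasDependencies(element)) {
      orphanIds.push(element._id);
    }
  });
  return orphanIds;
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var orphans = findOrphanIds(
    db.elements.find({}, {_id: 1}).limit(100),
    function (element) {
      // Hypothetical check: is the element referenced by any story?
      return db.stories.count({elementIds: element._id}) > 0;
    }
  );
  print("orphans found: " + orphans.length);
}
```

Keeping the orphan check in a plain function makes it easy to try on a handful of hand-written documents before pointing it at the real collection.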

Break Down

Separate your query into a few scripts with smaller queries. You can output each script to a file (as JSON or CSV) and then look at the output and see if your script is doing what it is actually supposed to do.

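To execute a JavaScript file (fix.js) and write the results to another file (fix.txt) instead of the screen, use:

$ mongo fix.js > fix.txt

or, to leave the shell's startup messages out of the file:

$ mongo --quiet fix.js > fix.txt

Drop the --shell flag here: it keeps the shell waiting for input after the script finishes, and with the output redirected you won't see the prompt.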
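For CSV output, one possible approach is a small helper that prints one line per document. The sketch below is only an example: the chosen columns (_id, type, meta.title) are borrowed from fields that appear in the queries in this post, and the limit of 100 is an arbitrary test value.

```javascript
// Turn an array of values into one CSV line, quoting fields that need it.
function toCsvLine(values) {
  return values.map(function (value) {
    var text = (value === null || value === undefined) ? "" : String(value);
    if (/[",\n]/.test(text)) {
      // Double any embedded quotes and wrap the field in quotes.
      text = '"' + text.replace(/"/g, '""') + '"';
    }
    return text;
  }).join(",");
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  print(toCsvLine(["_id", "type", "title"])); // header row
  db.elements.find({type: "link"}).limit(100).forEach(function (e) {
    var title = e.meta ? e.meta.title : "";
    print(toCsvLine([e._id.str, e.type, title]));
  });
}
```

With the output redirected to a file, the result can be opened in a spreadsheet to check that the script selects what it is supposed to.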

Check count()

Simply run count() to see the number of elements in the collection:

 db.collection.count();

or the number of documents matched by a query:

 db.collection.find({…}).count();

Use limit()

You can apply the limit() function to your cursor, without modifying anything else in the script, to test the output without waiting for the whole result.

For example:

 db.collection.find({…}).limit(10).forEach(function() {…});

or

 db.collection.find({…}).limit(1).forEach(function() {…});

is better than using:

 db.collection.findOne({…})

because findOne() returns a single document, while find() with limit() still returns a cursor, so the rest of the script (forEach() and so on) keeps working unchanged.
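One way to switch between a quick test run and the full run is to wrap the limit in a tiny helper. This is just a sketch; withOptionalLimit and TEST_LIMIT are names invented for the example.

```javascript
// Apply a limit only when one is requested, so the same script can be
// used for a quick test run and for the full run.
function withOptionalLimit(cursor, maxDocs) {
  return maxDocs > 0 ? cursor.limit(maxDocs) : cursor;
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var TEST_LIMIT = 10; // hypothetical switch: set to 0 to process everything
  withOptionalLimit(db.elements.find({type: "link"}), TEST_LIMIT)
    .forEach(printjson);
}
```

Because the helper always hands back a cursor, the forEach() handler stays exactly the same in both modes.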

Hint Index

hint() lets you manually force the query to use a particular index:

 db.elements.find({…}).hint({active:1, status:1, slug:1});

Make sure the index actually exists; create it with ensureIndex() (replaced by createIndex() in later MongoDB versions):

 db.collection.ensureIndex({…})
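Since a hint for an index that does not exist makes the query fail, it can help to check the collection's indexes first. The sketch below compares key specifications from getIndexes(); the helper names are made up for the example, and the key spec is the one from the hint() call above.

```javascript
// Join an index key spec into a string such as "active_1_status_1_slug_1",
// the pattern MongoDB uses for the default names of ordinary indexes.
function indexKeyString(keys) {
  return Object.keys(keys).map(function (field) {
    return field + "_" + keys[field];
  }).join("_");
}

// True when `indexes` (as returned by getIndexes()) contains an index
// with exactly these keys, in this order.
function hasIndex(indexes, keys) {
  var wanted = indexKeyString(keys);
  return indexes.some(function (index) {
    return indexKeyString(index.key) === wanted;
  });
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var keys = {active: 1, status: 1, slug: 1};
  if (!hasIndex(db.elements.getIndexes(), keys)) {
    print("index missing, hint() would fail: " + indexKeyString(keys));
  }
}
```

Comparing key specs rather than index names means the check still works when the index was created with a custom name.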

Narrow Down

Use additional criteria such as $ne, $where, and $in to cut down the number of documents the script has to process, e.g.:

db.elements.find({ $and: [
  {type: 'link'},
  {'source.href': {$exists: true}},
  {'date.created': {$gt: new Date("November 30 2012")}},
  // $where runs this function against each candidate document
  {$where: function () {
    if (this.meta && this.data && this.data.link) {
      return this.meta.title != this.data.link.title;
    } else {
      return false;
    }
  }},
  {'date.created': {$lt: new Date("December 2 2012")}}
]}).forEach(function (e) {
  // print the id of every matching element
  print(e._id.str);
});
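The date criteria in the query above can also be used to walk a large collection one small window at a time. The following is a rough sketch of that idea, not the original script: dateWindows is a hypothetical helper, the one-day window size is arbitrary, and the example only counts the matching documents in each window.

```javascript
// Split the range [start, end) into consecutive windows of `days` days.
function dateWindows(start, end, days) {
  var stepMs = days * 24 * 60 * 60 * 1000;
  var windows = [];
  for (var from = start.getTime(); from < end.getTime(); from += stepMs) {
    var to = Math.min(from + stepMs, end.getTime());
    windows.push({from: new Date(from), to: new Date(to)});
  }
  return windows;
}

// This part only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  var start = new Date(Date.UTC(2012, 10, 30)); // November 30, 2012 (UTC)
  var end = new Date(Date.UTC(2012, 11, 2));    // December 2, 2012 (UTC)
  dateWindows(start, end, 1).forEach(function (w) {
    var n = db.elements.count({
      type: "link",
      "date.created": {$gte: w.from, $lt: w.to}
    });
    print(w.from + " " + n);
  });
}
```

Each window uses $gte on its start and $lt on its end, so no document is counted twice and none falls between two windows.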

Rapid Prototyping with JS is out!

The Book is on LeanPub

Rapid Prototyping with JS is a hands-on book which introduces you to rapid software prototyping using the latest cutting-edge web and mobile technologies including NodeJS, MongoDB, BackboneJS, Twitter Bootstrap, LESS, jQuery, Parse.com, Heroku and others.

The book has 84 pages in PDF format (13,616 words, to be precise) and includes step-by-step set-up instructions, best-practice advice, a web development overview, and 11 code examples (also available ready-to-go in the GitHub repository azat-co/rpjs). Pricing is flexible ($9.99–19.99).

Order your copy of Rapid Prototyping with JS at LeanPub: leanpub.com/rapid-prototyping-with-js.

Rapid Prototyping with JS: Learn how to build web and mobile apps using JavaScript and Node.js

The LeanPub platform allows readers to receive unlimited future updates (the current version of the book is 0.3) and to read the book in the most popular digital formats: PDF, ePub/iPad, and MOBI/Kindle. The PDF version has footnote links, which makes it suitable for printing.

Download a free sample at samples.leanpub.com/rapid-prototyping-with-js-sample.pdf.

What Readers Say

Rapid Prototyping with JS is being successfully used at StartupMonthly as a training manual. Here are some of our trainees’ testimonials:

“Thanks a lot to all and special thanks to Azat and Yuri. I enjoyed it a lot and felt motivated to work hard to know these technologies.” — Shelly Arora

“Thanks for putting this workshop together this weekend… what we did with Bootstrap + Parse was really quick & awesome.” — Mariya Yao

“Thanks Yuri and all of you folks. It was a great session – very educative, and it certainly helped me brush up on my Javascript skills. Look forward to seeing/working with you in the future.” — Sam Sur

Who This Book is For

The book is designed for advanced-beginner and intermediate-level web and mobile developers: somebody who has just started programming, or somebody who is an expert in other technologies like Ruby on Rails, PHP, and Java and wants to learn JavaScript and Node.js.

Rapid Prototyping with JS, as you can tell from the name, is about taking your idea to a functional prototype in the form of a web or a mobile application as fast as possible. This thinking adheres to the Lean Startup methodology. Therefore, this book would be more valuable to startup founders, but big companies’ employees might also find it useful, especially if they plan to add new skills to their resume.

Prerequisite

Mac OS X or UNIX/Linux systems are highly recommended for this book’s examples and for web development in general, although it’s still possible to hack your way on a Windows-based system.

Contents

Acknowledgment

Introduction

  1. Who This Book is For
  2. Prerequisite
  3. What to Expect
  4. Notation
  5. Web Basics: Hyper Text Markup Language, Cascading Style Sheets, JavaScript
  6. Agile Methodologies: Scrum, Test-Driven Development, Continuous Deployment, Paired Programming
  7. Node.js
  8. NoSQL and MongoDB
  9. Cloud Computing
  10. HTTP Requests and Responses
  11. RESTful API

Getting Started

  1. Development Folder
  2. Browsers
  3. IDEs and Text Editors
  4. Version Control Systems
  5. Local HTTP Servers
  6. Database: MongoDB
  7. Other Components: NodeJS, jQuery, LESS
  8. SSH Keys
  9. GitHub
  10. Windows Azure
  11. Heroku
  12. Cloud9

Building Front-End Application

  1. JSON
  2. AJAX
  3. Cross-Domain Calls
  4. jQuery
  5. Twitter Bootstrap
  6. LESS
  7. BackboneJS
  8. Example of using Twitter REST API and jQuery
  9. Parse.com
  10. Message Board with Parse.com
  11. Message Board with Parse.com: REST API and jQuery version
  12. Pushing to GitHub
  13. Deployment to Windows Azure
  14. Deployment to Heroku
  15. Message Board with Parse.com: JavaScript SDK and BackboneJS version
  16. Deploying Message Board to PaaS
  17. Enhancing Message Board

Building Back-End Application

  1. Building “Hello World” in NodeJS
  2. NodeJS Core Modules
  3. Node Package Manager
  4. Deploying “Hello World” to PaaS
  5. Deploying to Windows Azure
  6. Deploying to Heroku
  7. Message Board: Run-Time Memory version
  8. Test Case for Message Board
  9. MongoDB Shell
  10. MongoDB Native Driver
  11. MongoDB on Heroku: MongoHQ MongoHQ URL
  12. BSON
  13. Message Board: MongoDB version
Putting it All Together

  1. Different Domain Deployment
  2. Changing Endpoints
  3. Message Board Application
  4. Deployment
  5. Same Domain Deployment

Further Reading

About the Author

Order your copy of Rapid Prototyping with JS at LeanPub: leanpub.com/rapid-prototyping-with-js.

Pilot Rapid Prototyping with JavaScript and NodeJS Class

Traditional Computer Science education sucks big time when it comes to modern agile technologies like Ruby on Rails, Django, NodeJS, and NoSQL databases. Last time I checked, the most that was offered was classes in Web Design I, Web Design II and Photoshop Basics. WTF?! Don’t get me wrong. I have a Master’s degree in Information Systems Technology and value fundamentals, but I was never taught anything up-to-date. There was some ASP, some C++, some SQL, but most of my learning I had to do on my own. Sure, there is tons of information online and in books, but not everybody has the time, dedication, focus and self-discipline to master a new technical skill this way. Reading a book or watching a screencast is just not enough. The best learning comes 25% from books, 25% from peer-to-peer communication and discussion, and 25% from the student-to-teacher relationship; the last 25% is time and practice on your own.

I saw a huge need for effective technical training and decided to validate my idea. I already had plenty of teaching experience, both from my college years, during which I wrote my first textbook and had it published for my classmates’ curriculum a year later, and from teaching yoga classes. I needed a pilot class, so I approached the startup accelerator and fund StartupMonthly and offered to develop and teach the “Rapid Prototyping with JavaScript and NodeJS” training.

I chose JavaScript and NodeJS because students can use the same language for both front-end and back-end development. Their brains don’t have to switch, which saves time and speeds up the learning process. NodeJS is becoming more and more popular due to its real-time support, and I’m very passionate about this technology. The training runs over a long weekend, starting on Friday night with an optional Q&A session on setting up your environment. Then we have two full days on Saturday and Sunday, making the course 16 hours total. This way, people who have full-time jobs don’t have to take time off to attend. The class is very hands-on and, as much as possible, in line with the principles of Flipped Teaching.

Day 1

The goal was not to make a profit, so we priced the training very aggressively, at a half to a third of our competitors’ market price, in order to attract students. The results were amazing! The goal was to sell at least 10 seats, and we had 15 people in our first class! Big thanks to Yuri Rabinovich, the killer StartupMonthly team and its vast network of people interested in technology :)

Day 2

Then the hard work began. In the true spirit of lean startup methodology (hey, this is what we teach, right?), the manual had only the bare minimum of information and was tailored towards intermediate web and JavaScript developers. The majority were doing well, but I couldn’t say that for everyone. This was good feedback for me, and it helped me improve the manual by including many simple steps and additional terminal commands for deployment and Git.

“Optimize, but not over optimize”

Overall, students were tired but happy with the number of new technologies they’d tried. It was sort of a Chinese Buffet of Programming: you don’t have to try everything; you pick only what you want and indulge in it :) Here is the list of topics to give you an idea:

  • Agile, Continuous Deployment, TDD, Pair Programming
  • Basic front-end technologies: JavaScript, HTML, CSS
  • NodeJS and its advantages. Event driven programming.
  • MongoDB and Document Store and Key-Value concepts.
  • JSON, structure and examples.
  • Cloud computing. Cloud platforms: Windows Azure, Heroku.
  • Structure of HTTP Request and Response: headers, body, methods
  • RESTful API, examples and advantages.
  • Overview of HTML: structure, tags and syntax. Inclusion of CSS, JavaScript files/tags.
  • jQuery: AJAX, cross-domain calls and JSONP
  • Twitter Bootstrap: grid layout, form components, icons
  • LESS: mixins, variables and compilation.
  • BackboneJS: structure, events, view, sub-views, models, collections and event listeners and event binding.
  • Parse.com: plain REST API calls with jQuery ajax function and JavaScript SDK with Backbone compatible library.
  • Generating SSH keys; configuring Git, GitHub, Heroku and Windows Azure for deployment.
  • Installation and basic configuration of NodeJS and MongoDB in local environment.
  • Deployment of NodeJS and MongoDB and static/front-end applications to PaaS cloud services like Windows Azure and Heroku with Git.
  • Building sample applications with NodeJS, jQuery, BackboneJS, Twitter Bootstrap, MongoDB, Parse.com and other tools/technologies. Deploying it to cloud services.
  • Building your own idea/prototype and presenting it. Deploying it to cloud services.
  • Practicing Paired Programming and Test-Driven Development techniques.

Next Billion-Dollar Idea

By the end of the weekend, we had 3 teams of 2 to 3 people each. The teams built, or started to build, applications based on their own ideas. One of them was a remake of Reddit with better UX/UI, and another was a service for angry ex-girlfriends to post (mostly negative, I suspect) feedback on their ex-boyfriends :)

Here are some testimonials from the students:

“Thanks Yuri and all of you folks. It was a great session – very educative, and it certainly helped me brush up on my Javascript skills. Look forward to seeing/working with you in the future” – Sam Sur.

“Thanks for putting this workshop together this weekend… what we did with Bootstrap + Parse was really quick & awesome” – Mariya Yao.

“Thanks a lot to all and special thanks to Azat and Yuri. I enjoyed it a lot and felt motivated to work hard to know these technologies” – Shelly Arora.

Q&A Session

Next weekend, August 10–12, 2012, I’m teaching the second class of “Rapid Prototyping with JavaScript and NodeJS”. I’m excited to share my experience and passion with another 10–20 smart people and make a small dent in technical education!

“Advanced Prototyping with JavaScript and NodeJS” and “Mobile Prototyping with JavaScript” trainings are coming on the weekend of August 25–26, 2012. We have other cities like Los Angeles and New York in the pipeline and (knock on wood) the future of the “Rapid Prototyping” series looks very promising.