another year, another bot

My 2015 has been filled with long and arduous international and domestic travel, plenty of family business, a cross-country move, and a deferred to-do list a mile high. I haven't been keeping regular hours since the beginning of the year, so I'm glad that the dust has settled and I have a little free time to learn and work on new things again.

I finally made a version of the bang bot that is police-oriented (and less pervy-sounding). The Washington Post's fatal police shootings database is reliably updated, and it's on GitHub so it was much easier for me to get raw data from (otherwise I would have to do some manual scraping-- not hard, just time-consuming).

The bot's checking mechanism was the simplest one I could come up with:

  1. Pull the latest .csv file on WaPo's GitHub
  2. Check if the number of lines is greater than that of the last stored version, and
  3. Tweet that number if it is.

The implementation's still got a few kinks (apparently you can't use Twitter's API to tweet the same thing twice, which may be a problem soon), but my previous experience with Node made building it fairly easy.

It was the deployment that gave me some trouble. I used Heroku to run my app this time, and it took me a while to realize that the reason my files weren't properly saving was because Heroku workers don't do data persistence (derp). So I had to pick a data storage addon. I would have liked to use MongoDB to make the app JS end-to-end, but it wasn't free. And I'm pretty familiar with Postgres after having used it for a class earlier this year, but I ultimately went with Redis, another NoSQL choice that's fast and has an easy-to-use Node.js client. I hear that relational databases are the way to go if you have a data structure that's even remotely complicated, but I only had to store two things.

A screenshot of callback hell. Can you tell I'm using Node?

Besides storage, I still haven't quite gotten the hang of JavaScript's asynchronous processes yet (please refer to the callback hell in the screenshot above). I tried to keep things readable, but I think I'm gonna just bite the bullet next time and use async.

The reason I started this project was to get an update on Twitter whenever WaPo confirms another death by police shooting, and I'm glad I reached that goal. But seeing the bot tweet for the first time on production was the first time I had ever felt sad about seeing something I built actually work.

Now that I'm getting into full-stack development, I'm exploring what else I can do with open data. My final project for the back-end class that I took this spring used data made available by San Francisco's Human Services Agency. Similar municipal open data initiatives are happening in New York City, which is where I'm based now. I hope to build bigger and more useful applications with some of this information in the future.