Natural Language Processing at Foursquare

Last month, Maryam Aly and I gave a workshop for NYU tech week on how Natural Language Processing is integrated into the Foursquare app and our technology stack. Afterward, we gave the students a hands-on introduction to NLTK. Hakka Labs took video of the first part of the workshop and …

On this 4sqDay, we’re celebrating our amazing global community of superusers. Happy 4sqDay!

foursquare: Five years ago on 4/16, some of Foursquare’s biggest fans in Tampa started what would become the tradition of 4sqDay (the day was chosen because it’s 4/4²… four squared!). It quickly became a day to gather as a community and celebrate …

Digging into the Dirichlet Distribution

This is a link to my talk on the Dirichlet Distribution at the machine learning meetup: http://www.hakkalabs.co/articles/the-dirichlet-distribution/ The open source project I referenced lives here: https://github.com/maxsklar/BayesPy Feel free to jump in if you’re interested! I have a paper on it that unfortunately was not accepted to AISTATS (they cited lack of impact; I disagree). I’ll …

DataGotham Slides

By popular demand, here are my slides from DataGotham. The video is posted below, but the slides are more readable on their own.

Terrible statistical writing: NatGeo Global Warming Article

I find it really hard to inform myself on certain issues when the mathematical arguments presented in articles just make no sense. I was reading this recent article by National Geographic called “does the ‘global warming pause’ debate miss big picture.” It starts out by stating that there’s been a decrease in the rate of …

DataGotham 2013 Talk, or Japanese vs Russian reviews

I gave a talk recently at DataGotham (http://www.youtube.com/watch?v=1KfK0zOSo5U), and I’ve gotten a lot of questions about one particular stat that I gave in that talk. If you write a tip in Russian, then you’re 3 times as likely to hate the place as if you write it in Japanese. Where does that come from? Well, …

Casino Random Number Generator

Here’s how it works: the outcome of each round of the game is either 0 or 1. Before the outcome is decided, players place bets on either side. The total amounts bet on each side are confidential. After betting is closed, the outcome is calculated as the one with the least amount bet on it. …
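The rule for a single round can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `round_outcome`, the `(side, amount)` representation of a bet, and returning `None` on a tie are all hypothetical, since the excerpt does not say how bets are recorded or how ties are resolved.

```python
from typing import Iterable, Optional, Tuple


def round_outcome(bets: Iterable[Tuple[int, float]]) -> Optional[int]:
    """Return the side (0 or 1) with the smaller total amount bet on it.

    Each bet is a (side, amount) pair. Returns None when both totals are
    equal; tie handling is an assumption, not something the post specifies.
    """
    totals = {0: 0.0, 1: 0.0}
    for side, amount in bets:
        if side not in totals:
            raise ValueError(f"side must be 0 or 1, got {side!r}")
        if amount < 0:
            raise ValueError("bet amounts must be non-negative")
        totals[side] += amount

    # The outcome is the side with the least money on it.
    if totals[0] < totals[1]:
        return 0
    if totals[1] < totals[0]:
        return 1
    return None  # assumed tie behavior


# Side 0 has 60.0 on it and side 1 has 20.0, so the outcome is 1.
example_bets = [(0, 50.0), (1, 20.0), (0, 10.0)]
print(round_outcome(example_bets))
```

Because the totals stay confidential until betting closes, a player cannot simply read off which side is lighter; the sketch only covers the final calculation, not the betting process itself.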

Thoughts on Data Science, Local Recommendations, New York City, and Technology Innovation