Python

Predicting Yelp ratings using textual reviews

The Internet is truly full of free and fascinating datasets! The other day I found the Yelp Dataset Challenge, which includes, among other things, over 1 million reviews (most of them recent) along with their respective 5-star ratings - excellent text mining material! To enter the competition (which ends on 12/31/14) you have to be a current student (which I’m not), but everyone is welcome to play around with the data.

Analyzing 八零后 (China's post-80s)

If you are not from China or living there, you are probably not familiar with the term 八零后, or post-80s. If you are, like me, I think you’ll agree that it is probably one of the most widely used and abused terms in modern China. Quite literally, it refers to Chinese people born in the 1980s (me included). The reason it has gained so much more attention and exposure than, say, 九零后 (post-90s) or 零零后 (post-00s) is, I think, that our generation has seen and been through so many things that prior generations never experienced and that later ones simply take as the norm.

What mining my own emails told me about myself

On Tuesday last week, I attended a data visualization meetup organized by Data Science LA; the topic was the most recent Eyeo Festival. Of all the talks that Amelia shared with us, what impressed and inspired me the most was Nicholas Felton’s personal data projects. In case you are not familiar with him, every year he publishes an annual report documenting the personal data projects and experiments he conducted throughout the year.

An analysis of Artsy’s Twitter followers

This weekend I decided to learn more about Twitter and its handy API. The subject of my analysis is Artsy, a fine-art website that provides a Pandora-like service. I was curious to find out where their followers are from, what their Twitter activity is like, what other interests they have, and, specifically, what kind of stereotype clusters they fall into because, you know, it’s important and I didn’t have anything better to do.