Technical Posts

Building a story-telling Twitter bot

Background Recently, I came across this blog post written by Vicki Boykis in which she documented her process of building a Twitter bot that tweets Soviet artworks at scheduled intervals. Inspired by the idea (and motivated by boredom), I decided to build a Twitter bot of my own that tells questionably coherent stories through a series of tweets. I loved this idea because, first of all, I love literature, especially the classics.

Do I really need attention in my seq2seq model?

Background Since the idea of attention was introduced (Bahdanau et al., 2015), it has become the norm to add it to a seq2seq model, especially for translation. It is such an intuitive and powerful idea (not to mention the added benefit of peeking into an otherwise black-box model) that many tutorials and blog posts make it sound as if one should not even bother with a model without it, as the results would surely be inferior.

Second Language Acquisition Modeling using Duolingo's Data

Background Recently, I discovered the Second Language Acquisition Modeling (SLAM) challenge hosted by Duolingo earlier this year, in which participants were asked to predict the per-token error rate for a given language learner based on their past learning history. To conclude the competition, the team also wrote a paper summarizing the results, the approaches taken by the various participants, and their respective effectiveness.

Second attempt at building a language translator

Background A few weeks ago, I experimented with building a language translator using a simple sequence-to-sequence model. Since then, I had been itching to add the extra attention layer that I had been reading so much about. After much, much research, I came across (quite accidentally) this MOOC series offered by fast.ai, where in Lesson 13, instructor Jeremy Howard walked the students through a practical implementation of the attention mechanism using PyTorch.

First attempt at building a language translator

Background After having tried my hand at LSTMs and built a text generator, I became interested in sequence-to-sequence models, particularly their applications in language translation. It all started with this TensorFlow tutorial where the authors demonstrated how they built an English-to-French translator using such a model and successfully translated “Who is the president of the United States?” into French with the correct grammar (“Qui est le président des États-Unis?

When Jane Austen, Oscar Wilde, and F. Scott Fitzgerald walk into a bar

Background Lately I’ve been spending a lot of time learning about deep learning, particularly its applications in natural language processing, a field I have been immensely interested in. Before deep learning, my foray into NLP had been mainly about sentiment analysis and topic modeling. These projects are fun, but they are all limited to analyzing an existing corpus of text, whereas I’m also interested in applications that generate text themselves.

An analysis of foreign language learning using Duolingo's data

Background Earlier this year, I decided to learn French, something I’ve been thinking about for a long time. Foreign language learning has always been something magical to me: I had a great time learning English when I was at school (my mother tongue is Mandarin Chinese), so much so that I would devote all my time to it and ignore my other subjects (not recommended). Hence, when I signed up for a beginner’s class at my local Alliance Française and started taking classes regularly, it felt like a homecoming to me.

Predicting Yelp ratings using textual reviews

The Internet is truly full of free and fascinating datasets! I found this Yelp Dataset Challenge the other day that includes, among other things, over 1 million reviews (most of which are recent) along with their respective 5-star ratings - excellent text mining material! To enter the competition (which ends on 12/31/14), you have to be a current student (which I’m not), but everyone is welcome to play around with the data.

Notes to machine learning

Data transformation, Resampling techniques, Regression models, Smoothing, Neural networks, Support vector machines, K-nearest neighbors, Trees, Random forests, Gradient boosting trees, Cubist, Measuring performance in classification models, Linear classification models, Latent Dirichlet allocation

These are the notes I took while reading An Introduction to Statistical Learning and Applied Predictive Modeling. Some of the notes also came from other sources, but the majority of them are from these two books.

Quora challenge - answer classification

Last night I tried my hand at a Quora challenge to classify user-submitted answers as ‘good’ or ‘bad.’ All the information is anonymized, including the variable names, but you can tell by looking at their values what some of them may represent. For example, some appear to be count data or summary statistics based on counts, and, given that many of the values are 0 and the distributions are heavily right-skewed, they seem to be some measure of the writers’ reputations, the number of upvotes an answer received, or the number of follow-up comments.