Learning to Manipulate Symbols

February 18, 2015
Speaker: Wojciech Zaremba, Doctoral Candidate, Courant Institute, NYU/Google


Machine learning approaches have proven highly effective for statistical pattern recognition problems, such as those encountered in speech or vision. However, contemporary statistical models can barely handle high-level reasoning problems and symbolic manipulation. I consider two tasks that require such skills: (1) finding mathematical identities, and (2) evaluating computer programs.

We explore how learning can be applied to the discovery of mathematical identities. Specifically, we propose methods for finding computationally efficient versions of a given target expression: that is, a new expression that computes the same result as the target but has lower complexity (in time and/or space).
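
To make the idea of an "efficient version of a target expression" concrete, here is a small hand-written sketch. The identity and the function names are illustrative assumptions chosen for this example; they are not taken from the talk, and nothing here is learned. The target expression is the sum of all entries of a matrix product, and the cheaper equivalent replaces the triple loop with a dot product of column sums and row sums.

```python
# Illustrative only: a hand-picked identity, not a result from the talk.
# Matrices are plain lists of rows so the example needs no third-party library.

def sum_of_product_naive(A, B):
    """Sum of all entries of the product A @ B, using O(n*m*p) multiplications."""
    n, m, p = len(A), len(B), len(B[0])
    total = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                total += A[i][k] * B[k][j]
    return total


def sum_of_product_fast(A, B):
    """Same value in O(n*m + m*p) time: dot(column sums of A, row sums of B)."""
    col_sums_A = [sum(row[k] for row in A) for k in range(len(B))]
    row_sums_B = [sum(row) for row in B]
    return sum(a * b for a, b in zip(col_sums_A, row_sums_B))


A = [[1, 2], [3, 4], [5, 6]]        # 3 x 2
B = [[7, 8, 9], [10, 11, 12]]       # 2 x 3

# Both expressions compute an identical result; only the cost differs.
assert sum_of_product_naive(A, B) == sum_of_product_fast(A, B)
```

The two functions agree on every input of compatible shape, which is exactly the property an identity-discovery method would need to verify before accepting the cheaper expression.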

Execution of computer programs requires dealing with multiple nontrivial concepts. To execute a program, a system has to understand numerical operations, the branching of if-statements, the assignment of variables, the compositionality of operations, and more. We show that Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units can accurately evaluate short, simple programs. The LSTM reads the program character by character and computes the program's output.
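
As a rough sketch of what the input and target for such a task might look like, the snippet below builds one training pair: the program text as a sequence of character indices, and the text the program prints. The example program, the helper names, and the vocabulary scheme are all assumptions made for illustration; the abstract does not specify the data format, and no network is trained here.

```python
import contextlib
import io

# Hypothetical example program (not from the talk). It uses an assignment,
# an if-statement, and arithmetic, the concepts listed above.
PROGRAM = "a=3\nif a>2:\n  a=a*5\nprint(a+1)"


def run_program(source):
    """Execute trusted source text and return what it prints (the target)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})  # only safe for programs we wrote ourselves
    return buffer.getvalue().strip()


def encode_chars(text, vocab):
    """Map each character to an integer index, one model input per character."""
    return [vocab[ch] for ch in text]


target = run_program(PROGRAM)

# Assumed vocabulary: every character seen in the program or its output.
vocab = {ch: i for i, ch in enumerate(sorted(set(PROGRAM + target)))}

inputs = encode_chars(PROGRAM, vocab)    # what a character-level model would read
outputs = encode_chars(target, vocab)    # what it would be asked to produce
```

A sequence model in this setup would consume `inputs` one index at a time and be scored on how closely its predicted characters match `outputs`.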

Hosted by Colin Raffel.

500 W. 120th St., Mudd 1310, New York, NY 10027    212-854-3105               