High Dimensional Graphical Model Selection

August 31, 2011
CS open area (Mudd 4th floor)
Speaker: Anima Anandkumar (EECS Dept, U.C. Irvine)
Hosted by:  Columbia EE/CS Networking Seminar


Capturing complex interactions among a large set of variables is a challenging task. Probabilistic graphical models, or Markov random fields, provide a graph-based framework for capturing such dependencies. Graph estimation is a key task, since it reveals important relationships among the variables. I will present a unified view of graph estimation and propose a simple local algorithm for graph estimation using only low-order statistics of the data. We establish that the algorithm estimates the graph consistently with low sample complexity for a class of graphical models satisfying certain structural and parameter criteria. We explicitly characterize these model classes and point out interesting relationships between the graph structure and the parameter regimes required for tractable learning. Many graph families, such as the classical Erdős-Rényi random graphs, random regular graphs, and small-world graphs, can be learnt efficiently under our framework.
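
As a rough illustration of what a local test on low-order statistics can look like, the sketch below assumes jointly Gaussian data and declares an edge between two variables only if their empirical conditional covariance stays above a threshold for every small conditioning set. This is an assumed example, not necessarily the algorithm presented in the talk; the names estimate_graph, max_sep_size, and threshold are hypothetical.

```python
from itertools import combinations

import numpy as np


def conditional_cov(cov, i, j, S):
    """Cov(X_i, X_j | X_S) for jointly Gaussian variables (Schur complement)."""
    if not S:
        return cov[i, j]
    S = list(S)
    c_SS = cov[np.ix_(S, S)]
    return cov[i, j] - cov[i, S] @ np.linalg.solve(c_SS, cov[S, j])


def estimate_graph(samples, max_sep_size=1, threshold=0.1):
    """Return the set of edges (i, j), i < j, that pass the local test.

    samples: array of shape (n_samples, n_variables).
    max_sep_size and threshold are illustrative tuning parameters.
    """
    p = samples.shape[1]
    cov = np.cov(samples, rowvar=False)  # only second-order statistics are used
    edges = set()
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k not in (i, j)]
        # Smallest |conditional covariance| over all small conditioning sets.
        min_abs = min(
            abs(conditional_cov(cov, i, j, S))
            for size in range(max_sep_size + 1)
            for S in combinations(others, size)
        )
        if min_abs > threshold:
            edges.add((i, j))
    return edges
```

Each pair is tested using only covariances among a handful of variables, so the cost per pair depends on the size of the conditioning sets rather than on a global optimization over the whole graph.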

The second part of the work is motivated by the following question: can we discover hidden influences acting on the observed variables? We consider latent tree models for capturing hidden relationships. We develop novel algorithms for learning the unknown high-dimensional latent tree structure. Our algorithms are amenable to efficient implementation of the Bayesian Information Criterion (BIC), which trades off the number of hidden variables against the accuracy of the model fit. An experiment on S&P 100 financial data reveals sectorization of the companies, and an experiment on newsgroups data automatically categorizes words into different topics.
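
For readers unfamiliar with BIC, the following minimal sketch shows the trade-off in its standard form: the log-likelihood of a fitted model is penalized by half the number of free parameters times the log of the sample size, and the candidate with the highest score is kept. The function names and the dictionary layout of the candidates are hypothetical, and how the talk's algorithms compute the likelihood of a latent tree is not shown here.

```python
import math


def bic_score(log_likelihood, num_params, num_samples):
    """Standard BIC in 'higher is better' form: fit minus complexity penalty."""
    return log_likelihood - 0.5 * num_params * math.log(num_samples)


def select_by_bic(candidates, num_samples):
    """Pick the candidate with the highest BIC score.

    candidates: list of dicts with keys 'log_likelihood' and 'num_params'
    (a hypothetical layout used only for this illustration).
    """
    return max(
        candidates,
        key=lambda m: bic_score(m["log_likelihood"], m["num_params"], num_samples),
    )
```

Adding a hidden variable raises the parameter count, so under this score it is kept only when the gain in log-likelihood exceeds the added penalty.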

Speaker Biography

Anima Anandkumar has been a faculty member in the EECS Dept. at U.C. Irvine since Aug. 2010. She was previously a post-doctoral researcher in the Stochastic Systems Group at MIT. She received her B.Tech in Electrical Engineering from IIT Madras in 2004 and her PhD from Cornell University in 2009. She is the recipient of the 2011 ACM Sigmetrics Best Paper Award, the 2009 ACM Sigmetrics Best Thesis Award, the 2008 IEEE Signal Processing Society Young Author Best Paper Award, and the 2008 IBM Fran Allen PhD Fellowship. Her research interests are in the areas of high-dimensional statistics, networking, and information theory, with a focus on probabilistic graphical models.
