Vishal Misra Develops Revolutionary Search Tool for Data-Hungry ESPN Cricket Fans
Misra’s “Ask Here First” offers easy data access for all users and can be applied to any data-rich field, from sports and e-commerce to investment banking
ESPNcricinfo has been the most popular single-sport website in the world for decades, and now millions of avid cricket fans have an exciting new tool: “AskCricinfo,” an easy and fast way to access the vast amount of ESPNcricinfo data. “AskCricinfo” helps cricket fans quickly find answers to all kinds of questions, from “when was the last time three left-arm pacers played for a team in an IPL game?” to “what is the record of Babar Azam against wrist spinners in the middle overs in the PSL?” to “Virat Kohli vs Jasprit Bumrah and Kagiso Rabada.” The recently launched search engine answers queries on scoring data, and the beta version is now up on the web for cricket fans to geek out over.
Columbia Engineering’s Vishal Misra developed the technology for the search engine together with the stats and product teams at ESPNcricinfo. Cricket is a statistics-rich sport, and although ESPNcricinfo has a huge database known as Statsguru, it is accessed only rarely, mostly by true nerds comfortable with its somewhat daunting interface.
The original Cricinfo website was co-founded in the 1990s by Misra, a longtime cricket fan who is a professor of computer science and electrical engineering. He is also friends with veteran cricket journalist Sambit Bal, who happens to be the editor in chief of ESPNcricinfo.
“Sambit and I were chatting last year about why the Statsguru database is so difficult to use and if there was anything we could do to improve it,” said Misra, who is also a member of Columbia’s Data Science Institute.
Weeks after that conversation, GPT-3, a new deep-learning language model, was released, and Misra saw on Twitter how the model could translate a natural-language question (i.e., a question posed in English) into Structured Query Language (SQL), which machines can understand.
“I thought I could apply the capabilities of newer transformer-based language models like GPT-3 to this ESPNcricinfo database project,” Misra explained, “but after I started playing with GPT-3, I realized it could solve only 5% of the problem on its own. After struggling with GPT-3 for a few days, I had a ‘eureka moment’ when I developed an architecture where I could leverage language models like GPT-3 to do part of the task, and use other language modeling and parsing techniques to push that 5% closer to 100%.”
Misra recognized that the architecture required an intimate knowledge of both language models and parsing and compilers, so he expanded his technical team, spun out a company called “Ask Here First,” and got to work. For help, he turned to two of his former PhD students, Abhinav Kamra and Hanhua Feng, who graduated in 2007 and have extensive experience building commercial products. Said Misra, “Abhinav, Hanhua, and I form the core technical team of the company. We speak the same mathematical language, complement each other’s skills, and make rapid progress on ideas with intellectually stimulating brainstorming sessions.”
Extracting information from databases is often unwieldy and involves cumbersome web-form interfaces; many databases require an analyst who can use specialized programming languages such as SQL, Java, C++, or Python to get at the answer. With his architecture, Misra saw that he could turn the problem around: take human input (questions or search queries) and convert it into a structured language like SQL that machines understand. His “Ask Here First” technology can “translate” the question into a programming language, find the answer, and quickly send it back to the user.
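To make the translate, find, and return steps concrete, here is a minimal sketch in Python. It is not the “Ask Here First” architecture, which the article does not describe in detail. The table schema, the placeholder data (“Batter A,” “Batter B”), and the function names `build_demo_db`, `question_to_sql`, and `ask` are all hypothetical, and a single hand-written pattern stands in for the language-model and parsing stages.

```python
import re
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE innings (batter TEXT, league TEXT, runs INTEGER)")
    conn.executemany(
        "INSERT INTO innings VALUES (?, ?, ?)",
        [("Batter A", "IPL", 50), ("Batter A", "IPL", 30), ("Batter B", "PSL", 70)],
    )
    return conn


# Assumption: one fixed question shape replaces the real translation step.
PATTERN = re.compile(
    r"how many runs did (?P<batter>.+) score in the (?P<league>\w+)\??$",
    re.IGNORECASE,
)


def question_to_sql(question: str) -> tuple[str, tuple[str, str]]:
    """Translate a supported English question into parameterized SQL."""
    match = PATTERN.match(question.strip())
    if match is None:
        raise ValueError(f"unsupported question: {question!r}")
    sql = "SELECT COALESCE(SUM(runs), 0) FROM innings WHERE batter = ? AND league = ?"
    return sql, (match["batter"], match["league"].upper())


def ask(conn: sqlite3.Connection, question: str) -> int:
    """Translate the question, run the query, and return the answer."""
    sql, params = question_to_sql(question)
    (total,) = conn.execute(sql, params).fetchone()
    return total


if __name__ == "__main__":
    db = build_demo_db()
    print(ask(db, "How many runs did Batter A score in the IPL?"))
```

The user only ever types English. The SQL is generated and executed behind the scenes, and values are passed as query parameters rather than pasted into the SQL string.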
The model’s first application in industry is ESPNcricinfo’s AskCricinfo search engine, which can now answer search queries about cricket scoring data and more in a matter of seconds.
Misra has developed a very broad natural language interface that works not just for AskCricinfo but for any kind of backend (e.g., SQL, NoSQL, REST, or any other application programming interface (API) that a backend database understands), whether the field is sports, e-commerce, or finance. Anyone can get started in less than 10 minutes with just a few training examples. Misra applied for a patent in September 2020 and, in addition to releasing the query engine on ESPNcricinfo, is currently in talks and test deployments with several major Fortune 20 companies.
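The article does not say how those few training examples are used. One common way to steer a model like GPT-3 with a handful of examples is few-shot prompting, where question and query pairs are placed ahead of the new question. The Python sketch below illustrates that general technique only. The `Example` class, the `build_prompt` function, and the sample e-commerce question are hypothetical, and no model is actually called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    """One hypothetical training pair: an English question and its backend query."""
    question: str
    query: str


def build_prompt(examples: list[Example], question: str, target: str = "SQL") -> str:
    """Assemble a few-shot prompt: an instruction, the worked pairs, then the new question."""
    lines = [f"Translate each question into {target}."]
    for ex in examples:
        lines.append(f"Q: {ex.question}")
        lines.append(f"A: {ex.query}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # the model would be asked to complete this line
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [
        Example(
            "How many orders shipped today?",
            "SELECT COUNT(*) FROM orders WHERE ship_date = DATE('now');",
        )
    ]
    print(build_prompt(demo, "How many orders are pending?"))
```

Changing the `target` argument and the example pairs is all it takes to point the same prompt at a different backend, such as a REST call format instead of SQL.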
“There is always talk about data, how much there is, how it can be harnessed, what can and cannot be done with it,” said Misra. “Our new technology will democratize access to all kinds of data, not just cricket statistics—everyone can extract deep insights from structured databases just by ‘asking’ it in English, through text or voice. ‘Ask Here First’ will enable everyone around the world—from casual fans of a game to C-level executives—to use data to its full potential.”