Location: New York, NY

Welcome

I am Alex (Oleksandr) Polozov, a senior staff research scientist at Google, formerly at X, the moonshot factory. I teach machines to write and analyze source code, and I am broadly interested in program synthesis from examples and natural language, AI-assisted software engineering, and neuro-symbolic reasoning. You might want to check out my publications, notable writing, talks, or blog posts. I also spend an inordinate amount of time complaining about technology and reviewing movies on Twitter.

Previously, I was a principal researcher in the Deep Learning group at Microsoft Research, Redmond. There, I helped create PROSE, a framework for mass-market development of programming-by-example technologies, and shipped multiple program-synthesis-driven tools.

Before that, I completed my Ph.D. in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, advised by Sumit Gulwani and Zoran Popović. Originally from 🇺🇦 Ukraine.


Latest news

November 2023
Wow, what a year. So much has happened that I lost track of updating the news.
April 2022
“PaLM: Scaling Language Modeling with Pathways” released on arXiv. Our team at X collaborated with Google Research on 🌴 PaLM – a single 540B-parameter dense language model for multiple domains and tasks, trained on two TPU v4 Pods. We created PaLM-Coder – an adaptation of PaLM fine-tuned on code and evaluated on software engineering tasks. As we found out, a single PaLM-Coder model can write code, translate code between languages, follow chains of reasoning, and fix build errors better than dedicated models, despite being trained on 11x less code than its closest competitors. Moreover, these remarkable abilities keep improving with scale and further training.
Google AI blog: Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance.
January 2022
“Synchromesh: Reliable code generation from pre-trained language models” will (virtually) appear at ICLR’22.
December 2021
“Neurosymbolic Programming”, our survey of techniques and representations for bridging neural and symbolic approaches to AI and programming, will be published in Foundations and Trends® in Programming Languages. Jointly written with Swarat Chaudhuri (UT Austin), Kevin Ellis (Cornell), Rishabh Singh (Google X), Armando Solar-Lezama (MIT), and Yisong Yue (Caltech).
October 2021
I moved to San Francisco and joined a team at X, the moonshot factory!
August 2021
“Programming Puzzles” will (virtually) appear at NeurIPS’21, the Datasets & Benchmarks track.
May 2021
“KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers” will (virtually) appear at ACL’21.
Check out our blog post “Conversations with data: Advancing the state of the art in language-driven data exploration” at the Microsoft Research Blog, summarizing SCoRe, StruG, and RAT-SQL.
March 2021
“Structure-Grounded Pretraining for Text-to-SQL” will (virtually) appear at NAACL’21.
January 2021
“SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing” will (virtually) appear at ICLR’21.
December 2020
“SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing” and “Learning to Infer Run-Time Invariants from Source Code” presented at the Computer-Assisted Programming (CAP) workshop at NeurIPS 2020.
October 2020
“Structure-Grounded Pretraining for Text-to-SQL” released on arXiv.
July 2020
I gave a talk “Neuro-Symbolic Program Synthesis from Natural Language and Demonstrations” at the 9th Workshop on Synthesis (SYNT).
June 2020
“Neuro-Symbolic Visual Reasoning: Disentangling ‘Visual’ from ‘Reasoning’” will (virtually) appear at ICML’20.
April 2020
“Learning Web-based Procedures by Reasoning over Explanations and Demonstrations in Context” will (virtually) appear at ACL’20.
November 2019
“RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers” released on arXiv.
September 2019
I gave a talk “From Examples to Natural Language and Back” at the “State of the Art in Program Synthesis” workshop hosted by Synthetic Minds.
“Program Synthesis and Semantic Parsing with Learned Code Idioms” to appear at NeurIPS’19.
July 2019
I gave a talk on “Program Understanding, Synthesis, and Verification with Graph Neural Networks” at the Learning & Reasoning with Graph-Structured Representations workshop at ICML 2019. Talk recording and slides are available online.
June 2019
“Program Synthesis and Semantic Parsing with Learned Code Idioms” released on arXiv.
May 2019
At ICLR 2019 in New Orleans, we presented our recent work on generative code modeling with GNNs. Also, Gustavo Soares and I showed the first public demo of our new tool for automating repetitive source code editing on the fly, powered by the PROSE framework.
March 2019
“Are My Invariants Valid? A Learning Approach” released on arXiv.
December 2018
“Generative Code Modeling with Graphs” to appear at ICLR’19.
September 2018
Our new neuro-symbolic technique, execution-guided decoding, has helped two Microsoft Research models take the top two spots on the WikiSQL leaderboard!
“IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles” released on arXiv.
“Robust Text-to-SQL Generation with Execution-Guided Decoding” released on arXiv.
July 2018
New blog post: “Program Synthesis in 2017-18”.
June 2018
“Execution-Guided Neural Program Decoding” to appear at NAMPI’18.
FlashProfile to appear at OOPSLA’18.
New site layout.
May 2018
“Generative Code Modeling with Graphs” released on arXiv.
April 2018
I will be attending ICLR 2018 in Vancouver to present our work on neural-guided deductive search. Let me know if you want to meet up!
February 2018
Presented “Program Synthesis via Neural-Guided Deductive Search” at the Machine Learning + Programming Languages Workshop at UW.
Neural-Guided Deductive Search to appear at ICLR’18.