Phone: +1 (425) 623-4121
Office: CSE 362 


I am Alex (Oleksandr) Polozov, a fifth-year graduate student in the Computer Science & Engineering Department at the University of Washington. My advisors are Sumit Gulwani and Zoran Popović. I work on inductive program synthesis (also known as programming by example), formal logic, and their applications in the areas of education and data wrangling.

I received my B.S. in System Analysis with honors from the National Technical University of Ukraine "Kyiv Polytechnic Institute" in 2012.

I am on the market for research positions in Autumn 2017!
Please check my CV and the Research Statement.


My interests lie in the applications of formal methods (software engineering, program synthesis, answer set programming) for math/programming education, artificial intelligence, and data manipulation. Currently, I am working on the following projects:

  • PROSEa meta-framework for program synthesis in domain-specific languages from inductive specifications (input-output examples)
  • Personalized mathematical word problem generation using answer set programming
  • LaSE: end-user query languages for semi-structured data extraction

Please see my CV and the projects page for more details.

Publications and Talks

  • Learning Syntactic Program Transformations from Examples   [ICSE 2017]
    Reudismam Rolim, Gustavo Soares, Loris D'Antoni, Oleksandr Polozov, Sumit Gulwani, Rohit Gheyi, Ryo Suzuki, Bjoern Hartmann
    [abstract]    [preprint]

    IDEs, such as Visual Studio, automate common transformations, such as Rename and Extract Method refactorings. However, extending these catalogs of transformations is complex and time-consuming. A similar phenomenon appears in intelligent tutoring systems where instructors have to write cumbersome code transformations that describe "common faults" to fix similar student submissions to programming assignments.

    We present Refazer, a technique for automatically generating program transformations. Refazer builds on the observation that code edits performed by developers can be used as examples for learning transformations. Example edits may share the same structure but involve different variables and subexpressions, which must be generalized in a transformation at the right level of abstraction. To learn transformations, Refazer leverages state-of-the-art programming-by-example methodology using the following key components: (a) a novel domain-specific language (DSL) for describing program transformations, (b) domain-specific deductive algorithms for synthesizing transformations in the DSL, and (c) functions for ranking the synthesized transformations.

    We instantiate and evaluate Refazer in two domains. First, given examples of edits used by students to fix incorrect programming assignment submissions, we learn transformations that can fix other students' submissions with similar faults. In our evaluation conducted on 4 programming tasks performed by 720 students, our technique helped to fix incorrect submissions for 87% of the students. In the second domain, we use repetitive edits applied by developers to the same project to synthesize a program transformation that applies these edits to other locations in the code. In our evaluation conducted on 59 scenarios of repetitive edits taken from 3 C# open-source projects, Refazer learns the intended program transformation in 83% of the cases.

  • Program Synthesis in the Industrial World: Inductive, Incremental, Interactive   [SYNT 2016]
    Oleksandr Polozov, Sumit Gulwani, and the rest of the PROSE team
    [project info]    [extended abstract]    [slides]
  • FlashMeta: A Framework for Inductive Program Synthesis   [SPLASH (OOPSLA) 2015]
    Oleksandr Polozov, Sumit Gulwani
    [abstract]    [project info]    [pdf]    [slides]

    Inductive synthesis, or programming-by-examples (PBE) is gaining prominence with disruptive applications for automating repetitive tasks in end-user programming. However, designing, developing, and maintaining an effective industrial-quality inductive synthesizer is an intellectual and engineering challenge, requiring 1-2 man-years of effort.

    Our novel observation is that many PBE algorithms are a natural fall-out of one generic meta-algorithm and the domain-specific properties of the operators in the underlying domain-specific language (DSL). The meta-algorithm propagates example-based constraints on an expression to its subexpressions by leveraging associated witness functions, which essentially capture the inverse semantics of the underlying operator. This observation enables a novel program synthesis methodology called data-driven domain-specific deduction (D4), where domain-specific insight, provided by the DSL designer, is separated from the synthesis algorithm.

    Our FlashMeta framework implements this methodology, allowing synthesizer developers to generate an efficient synthesizer from the mere DSL definition (if properties of the DSL operators have been modeled). In our case studies, we found that 10+ existing industrial-quality mass-market applications based on PBE can be cast as instances of D4. Our evaluation includes reimplementation of some prior works, which in FlashMeta become more efficient, maintainable, and extensible. As a result, FlashMeta-based PBE tools are deployed in several industrial products, including Microsoft PowerShell 3.0 for Windows 10, Azure Operational Management Suite, and Microsoft Cortana digital assistant.

  • User Interaction Models for Disambiguation in Programming by Example   [UIST 2015]
    Mikaël Mayer, Gustavo Soares, Maxim Grechkin, Vu Le, Mark Marron, Oleksandr Polozov, Rishabh Singh, Ben Zorn, Sumit Gulwani
    [abstract]    [pdf]   

    Programming by Examples (PBE) has the potential to revolutionize end-user programming by enabling end users, most of whom are non-programmers, to create small scripts for automating repetitive tasks. However, examples, though easy to provide, are an ambiguous specification of the user's intent. Because of that, a key impedance in adoption of PBE systems is the lack of user confidence in the correctness of the program that was synthesized by the system.

    We present two novel user interaction models that communicate actionable information to the user to help resolve ambiguity in the examples. One of these models allows the user to effectively navigate between the huge set of programs that are consistent with the examples provided by the user. The other model uses active learning to ask directed example-based questions to the user on the test input data over which the user intends to run the synthesized program.

    Our user studies show that each of these models significantly reduces the number of errors in the performed task without any difference in completion time. Moreover, both models are perceived as useful, and the proactive active-learning based model has a slightly higher preference regarding the users' confidence in the result.

  • Personalized Mathematical Word Problem Generation   [IJCAI 2015]
    Oleksandr Polozov, Eleanor O'Rourke, Adam M. Smith, Luke Zettlemoyer, Sumit Gulwani, Zoran Popović
    [abstract]    [project info]    [pdf]    [slides]

    Word problems are an established technique for teaching mathematical modeling skills in K-12 education. However, many students find word problems unconnected to their lives, artificial, and uninteresting. Most students find them much more difficult than the corresponding symbolic representations. To account for this phenomenon, an ideal pedagogy might involve an individually crafted progression of unique word problems that form a personalized plot.

    We propose a novel technique for automatic generation of personalized word problems. In our system, word problems are generated from general specifications using answer-set programming (ASP). The specifications include tutor requirements (properties of a mathematical model), and student requirements (personalization, characters, setting). Our system takes a logical encoding of the specification, synthesizes a word problem narrative and its mathematical model as a labeled logical plot graph, and realizes the problem in natural language. Human judges found our problems as solvable as the textbook problems, with a slightly more artificial language.

  • LaSEWeb: Automating Search Strategies over Semi-structured Web Data   [KDD 2014]
    Oleksandr Polozov, Sumit Gulwani
    [abstract]    [project info]    [pdf]    [slides]

    We show how to programmatically model processes that humans use when extracting answers to queries (e.g., "Who invented typewriter?", "List of Washington national parks") from semi-structured Web pages returned by a search engine. This modeling enables various applications including automating repetitive search tasks, and helping search engine developers design micro-segments of factoid questions.

    We describe the design and implementation of a domain-specific language that enables extracting data from a webpage based on its structure, visual layout, and linguistic patterns. We also describe an algorithm to rank multiple answers extracted from multiple webpages.

    On 100,000+ queries (across 7 micro-segments) obtained from Bing logs, our system LaSEWeb answered queries with an average recall of 71%. Also, the desired answer(s) were present in top-3 suggestions for 95%+ cases.