LARA

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Next revision Both sides next revision
projects [2016/11/09 14:39]
vkuncak
projects [2019/12/20 19:37]
vkuncak
Line 1: Line 1:
 ====== Available Student Projects in LARA - EPFL Internal Page ====== ====== Available Student Projects in LARA - EPFL Internal Page ======
  
-**(c) 2016 Laboratory for Automated Reasoning and Analysis. All Rights Reserved. Any reproduction of the parts of this page in printed or electronic form requires a prior written permission from Viktor ​Kuncak.**+**(c) 2016-2019 Laboratory for Automated Reasoning and Analysis. All Rights Reserved. Any reproduction of the parts of this page in printed or electronic form requires a prior written permission from Viktor ​Kunčak.**
  
-These projects are suitable for both **BSc** and **MSc** projects (including ​ **master'​s thesis** projects) for students in the **IC school** of **EPFL**, and are suitable for both **computer science** and **communication systems** students. Each project is for **one student** in a given semester. Students report directly to  [[http://​lara.epfl.ch/​~kuncak|Prof. Viktor ​Kuncak]], even if PhD students are involved to help with various aspects of the project. The information on whether the project is available or not is also maintained only by Viktor Kuncak, so please contact him if you are interested in the project. ​+These projects are suitable for both **BSc** and **MSc** projects (including ​ **master'​s thesis** projects) for students in the **IC school** of **EPFL**, and are suitable for both **computer science** and **communication systems** students. Each project is for **one student** in a given semester. Students report directly to  [[http://​lara.epfl.ch/​~kuncak|Prof. Viktor ​Kunčak]], even if PhD students ​or post-doctoral researchers ​are involved to help with various aspects of the project.
  
-===== You are the right candidate =====+==== Nature of these projects ​====
  
-Laboratory for Automated Reasoning and Analysis has BSc and MSc projects ​available (for PhD program see http://​phd.epfl.ch/​edic ​). We are looking for students with high grades that are excellent in advanced functional programming and ideally strong in some of the core mathematical methods ​such as discrete mathematics,​ numerical analysis, logic, and probabilistic reasoning. ​ Are you inspired by building amazing programs such as automatic theorem provers, tools that automatically synthesize code, process natural language, perform tasks that seemingly require creativity, or find bugs in programs? If these are some of your passions, there may be a project for you.+Below we list several ​projects ​reflecting the current interests of our research groupAll projects can be adapted to fit the scope of a semester project (bachelor & masteror master'​s projectTypically we expect ​the knowledge of courses ​such as [[cc|Computer Language Processing]].
  
-===== There is an interesting project ​for you =====+**Licence ​for code:** Please note that we release software under open source licenses and we request our contributors to agree that EPFL LARA holds copyright of the code that you contribute.
  
-Below we list several example projectsAll projects ​can be adapted ​to fit the scope of a semester project (bachelor & master) or master'​s ​project.+**Disclaimer:​** If you have your own idea, feel free to check with us, but please note that we some times cannot accommodate your own project suggestions even if the student is brilliant and the project is excellentMoreover, even if you select one of the projects ​listed below, we reserve the right to not offer you the project. By deciding to talk to us you agree that you will not require detailed justification on why we made our negative decision.
  
-Many of these project are based on the [[http://​lara.epfl.ch/​w/​leon|Leon system]] for synthesis and verification. We suggest that you get the basic idea of Leon in any case. Please note that we release software under open source licenses and we request our contributors to agree that EPFL LARA holds copyright of the code that you contribute.+Several example projects follow.
  
-=== Easily editable ​and exportable hand-writing presentation videos ​===+==== Parsing ​and Printing Combinators ====
  
-Do you know Wandida videos? These famous videos explain concepts using natural drawing, pictures, voiceAlthough very popular, it is hard to produce themTheir greatest limitation is that they are hard to modify and long to producewith a lot of post-processing. +LARA has developed the ScaLL1on parsing framework: https://​github.com/​epfl-lara/​scallion/​ that is used in several course as well as in our researchWe have several projects related ​to advancing this frameworkincluding: 
-To simplify this process, we will adopt a vectorized format more suitable to integration with other presentation elements ​(inline quizztextual slidestext-to-speech,​ embedding, hypertext linking, etc.) +  * Implementing new parsing algorithms ​(such as SLRLALRCYK or Earley'​s algorithm) and benchmarking their performance
-The student will have define this format, create sample explanation videos, and possibly adapt leon.epfl.ch ​to let users produce ​such videos very quickly. +  * Offering a unified API for parsing and lexingsuch that both the parser ​and lexer can be described in a single step. 
-As a long-term result this project will enable the creation of much more such videos which are very appreciated by students.+  * Working on the pretty printing capabilities,​ implementing strategies ​to produce ​nicely indented and spaced pretty printed code.
  
-=== Reversing Embedded Domain Specific ​Languages ​(DSL) ===+==== Relating Natural ​Languages ​to Programming and Specification Languages ====
  
-To make programming easier, people use languages adapted to a particular domain, in order to write less. For example, when creating web pages from Scala code, one can start to define DSL so that instead of writing `Element("​div",​ List(TextElement("​hello "​+"​world"​)))`,​ one can simplify write `<​.div("​hello "​+"​world"​)`. +=== Grammatical Framework Controlled Natural Language for Method Return Statements ===
-We recently wrote an engine which, given a modification of text *on the web page* (ex: hello to bonjour), is able to modify the string from the original expression, whatever the way it was built <​.div("​bonjour "​+"​world"​),​ even asking questions in the case of ambiguities. +
-The student would push this feature towards having full HTML tree edition, but at the same time reconstructing the smallest DSL expression that can support the change in the output. +
-As an immediate result one will be able to create and modify presentation using reveal.js.+
  
-=== Database filling ​in Programming by Example ===+Use [[https://​www.grammaticalframework.org/​|Grammatical Framework]] to design a Controlled Natural Language (a subset of English) that is suitable for describing conditions of multiple return statements of Java methods ​in a single sentence. The data will be taken from [[http://​leclair.tech/​data/​funcom/​|FunCom]] and [[https://​github.com/​github/​CodeSearchNet#​data|CodeSearchNet]] parallel corpora of method-comment pairs. The designed grammar should generate (simplified) sentences such as "​@return changes or null if the input is empty" for cases like one below (FunCom entry 1339885):
  
-We recently designed ​model for modifying program-generated webpages so that it modifies the original program, yielding a very useful programming environnement where both the program ​and the output can be modified. +<code java> 
-However, we would like to extract data from the program (e.g; text, svg data, animations, audio...) so that the parsing becomes faster and the data is stored in a structured way. +/** 
-For that, the student will have to come up with creative ways to 1) Store such data in the server 2) Retrieve it from a program 3) Modify this data when the output changes 4) Modify the *structureof the data if the program change+     * Draw line and add the changes ​to the undo queue 
-As an immediate result one will be able to record audio in html slides, or add SVG animations.+     ​@param line 
 +     @return ​the changes
 +     */
  
-=== Verified Parameterized-Length Bitvectors in Leon ===+public ChangeSet draw(Collection<​LineSegment>​ line) { 
 +        if (line.isEmpty()) { 
 +            return null; 
 +        } 
 +        ChangeSet changes ​new ChangeSet(this);​ 
 +        for (LineSegment ls : line) { 
 +            addSegment(ls,​ changes); 
 +        } 
 +        this.changes.add(changes);​ 
 +        return changes; 
 +    } 
 +</​code>​
  
-Define and prove correct a library ​for performing arithmetic (+, -, *, /, %) on modular arithmetic with large integers. Such operations are analogous to Int in that they perform modular arithmetic (unlike the unbounded BigInt), but the number of bits can be user-specified (unlike Int).+=== Grammatical Framework Controlled Natural Language ​for Data Structure Operations ===
  
-=== Verifying Eliptic Curves using Leon ===+Use [[https://​www.grammaticalframework.org/​|Grammatical Framework]] to design a Controlled Natural Language (a subset of English) that is suitable for writing JavaDoc-like descriptions of operations on data structure implementations (such as lists, stacks, queues, hash tables, etc.). For example, the grammar should accept (simplified versions of) descriptions such as this one ([[https://​github.com/​google/​guava/​blob/​master/​guava/​src/​com/​google/​common/​collect/​MinMaxPriorityQueue.java|from MinMaxPriorityQueue of Guava]]: 
 +<code java> 
 +/** 
 +   * Adds the given element to this queue. If this queue has a maximum size, after adding {@code 
 +   * element} the queue will automatically evict its greatest element (according to its comparator),​ 
 +   * which may be {@code element} itself. 
 +   */ 
 +  @CanIgnoreReturnValue 
 +  @Override 
 +  public boolean offer(E element)  
 +</​code>​
  
-Verify correctness ​of implementing group addition operations on elements on elliptic and Edwards curves over certain finite fields ​(such as those over the prime field of size 2^255-19). +Note that we would like to represent the structure ​of natural language sentences ​(such as "​adds ​the given element to this queue"), in addition to method signature, modifiers and @code elements.
-See +
-  * [[wp>​Elliptic_curve]] +
-  * [[wp>​Edwards_curve]] +
-  * [[wp>​Elliptic_curve_cryptography]]+
  
-This project likely requires the previous project on parameterized-length bitvectors.+==== Further Projects ====
  
-=== Verify Electronic Contract Code ===+=== Verifying JavaScript ​===
  
-Verify correctness ​of electronic contract codeSeefor example: +Design a translation ​of a clean subset of JavaScript including higher-order functions and (mostly immutable) objects into StainlessMap numbers into Double (apply overflow checks as needed)and enable annotations or special function calls to describe types and contracts.
-  * [[wp>​Ethereum]] +
-  * http://​www.wired.com/​2014/​01/​ethereum/ ​+
  
-=== Verify Security Protocol ===+Note that `(f:​TObj) ​=> e` can be encoded as `TObj(x ​=> e)`.
  
-Propose a way to model security protocols ​in Leon and perform a case study to verify example protocols.+=== Verified Parameterized-Length Bitvectors ​in Stainless ===
  
-=== Forgiving Parser ​for Leon ===+Define and prove correct a library ​for performing arithmetic (+, -, *, /, %) on modular arithmetic with large integers. Such operations are analogous to Int in that they perform modular arithmetic (unlike the unbounded BigInt), but the number of bits can be user-specified (unlike Int).
  
-Develop a stand-alone parser for Leon programs, which aims to accept almost any sequence of characters as input and interpret it to the best of its knowledge as a Leon program. Optionally also allow syntax supported in other programming languages. A requirement is that a valid Leon program should be interpreted in the same way as in the current front end.+=== Verify Security Protocol ===
  
-More information:​ +Propose a way to model security protocols in Stainless and perform a case study to verify example protocols.
-  * [[http://​lara.epfl.ch/​~kuncak/​papers/​GveroKuncak15SynthesizingJavaExpressionsFreeFormQueries.pdf|Synthesizing Java Expressions from Free-Form Queries]]+
  
-=== Libraries and Applications in Leon ===+=== Libraries and Applications in Stainless ​===
  
-Develop and specify additional applications and/or libraries for Leon system, along with their specifications. Examples of applications include parsing algorithms, compilers, email clients, web applications (e.g. a news portal). Examples of libraries include data structures, linear algebra libraries, user interface libraries, as well as sound and graphics processing libraries.+Develop and specify additional applications and/or libraries for Stainless ​system, along with their specifications. Examples of applications include parsing algorithms, compilers, email clients, web applications (e.g. a news portal). Examples of libraries include data structures, linear algebra libraries, user interface libraries, as well as sound and graphics processing libraries.
  
-Among possible directions is to use http://​www.scala-js.org/​ to generate Javascript from Leon. To make such code useful, explore building ​Leon libraries that interface to Javascript APIs (focusing to the extent possible on stateless APIs) and build Leon specifications and stubs that approximate the behavior of these APIs to enable verification and synthesis.+Among possible directions is to use http://​www.scala-js.org/​ to generate Javascript from Stainless. To make such code useful, explore building ​Stainless ​libraries that interface to Javascript APIs (focusing to the extent possible on stateless APIs) and build Leon specifications and stubs that approximate the behavior of these APIs to enable verification and synthesis.
  
-=== Improve Leon Web-Based IDE ===+=== Low-Level Back-End for a Stainless Subset ​===
  
-Develop new IDE functionality and interactive program transformations for Leon. The transformations should subsume refactorings present in e.g. Eclipse IDE, but should also make use of Leon's ability to check correctness conditions to ensure soundness ​of transformations.+Port the **GenC** back end from Leon to StainlessThis generates C code from a subset ​of Leon, and evaluate it on case studies for embedded code, IoT code, or fragments ​of OS code functinality.
  
-=== Statistically Biased Code Generation ​===+=== Embedding Scala into Isabelle ​===
  
-Develop ​statistical models for a class of useful Leon programs and use them to build code completion functionality as well as to improve deductive synthesis of Leon. +Develop ​an embedding ​of a subset of Scala into Isabelle ​and enables its verification using Isabelle/​HOL, as in the following work: 
- +  ​* ​https://www.isa-afp.org/entries/Isabelle_C.html
-For more information,​ please check the notion of a Java language model from the report [[http://​infoscience.epfl.ch/​record/​201606/​files/​main.pdf|On Synthesizing Code from Free-Form Queries]], and also consult the notion of statistical language model from natural language translation literature (e.g. http://​www.cs.colorado.edu/​~martin/​slp.html ). +
- +
-=== C Back-End for a Leon Subset === +
- +
-Advance our **GenC** back for Leon, which generates C code from a subset of Leon, and evaluate it on case studies for embedded codeIoT code, or fragments of OS code functinality. +
- +
-=== Improve Our General-Purpose Grammar Parsing Library and grammar.epfl.ch === +
- +
-For teaching and research we are developing a library that can parse arbitrary context-free grammars, used in computer language processing course for exercises and the project, and in the future to be used in research. It is currently primarily based on generalizations of CYK algorithm ​as well as LL(1). Your task would be to introduce GLL, GLR, and Packrat parsing algorithms into this framework. Thanks to deep embedding, our library enables pre-analysis of grammars including the computation of first and follow sets. +
- +
-Another project ​in this scope is to further advance ​the grammar.epfl.ch web service. +
- +
-A version of Scala combinator library interface built on top of this library is also of interest, see e.g. (https://github.com/scala/scala-parser-combinators ).  +
- +
-=== THERE IS MORE === +
- +
-If a project you would like to do is not listed, but you are a student with high grades of substantial experience and have other ideas or interests, let us talk and explore if we can find a project that will interest you.+
  
 ===== Contact us ===== ===== Contact us =====
  
-If one of the projects lists sounds interesting, or if you have other interests that you would like to pursue, do not hesitate to contact ​Viktor Kuncak, http://​lara.epfl.ch/​~kuncak via email to make an appointment. In your email, please include your CV, the list of your grades, and a description of projects that you did at EPFL or outside of EPFL. +If one of the projects lists sounds interesting ​and you believe ​you are a suitable candidateplease ​do not hesitate to contact ​us. In your email, please include your CV, the list of your grades, and a description of projects that you did at EPFL or outside of EPFL.
- +
-Feel free to also talk to the other members of LARA group about their work and project suggestions:​ http://​lara.epfl.ch ​+