A Course on Correctness by Construction

Motivation and Goals

Software is becoming increasingly complex and responsible for critical tasks. Any technology aimed at ensuring the reliability and quality of software will be increasingly relevant, if not utterly necessary.

Only rigorous (e.g., mathematically sound) approaches can certify software with the highest possible assurance. These approaches include, among others, the use of specification languages, high-level programming languages (including equational, functional, and logic languages), the use of model checking and deductive verification, language-based approaches often interacting with theorem provers.

In this course we will give a hands-on introduction to rigorous software development methods that follow a “correctness-by-construction” approach.

While the course is not heavy in theory, everyone is expected to have a good understanding of first-order logic and programming experience. We will explore several methodologies that have approaches and underlying technical bases, but which share a common overarching goal: develop programs while making sure that non-trivial properties, expressing high-level design requirements regarding correctness, fairness and sometimes efficiency, are continuously respected.

To follow this course, you should have programming experience (three years or more), familiarity with formal (first-order) logic, formal proofs, logic programming, and concurrent programming. If you think you do not meet these requirements, please get in touch ASAP with one instructor to recommend you reading material.

Course Log and Pointers to Class Material

Click on the dates to expand the lecture’s content.

10/2/2021: General Introduction (Click on me!)

We introduced ourselves and made a general introduction to the problems associated with complex, modern software. We motivated how a shift in the way software is developed can bring substantial improvements to both software quality and the cost associated with its development. We used these slides and recorded a video. The password to access will be sent separately. Unfortunately I forgot to restart video recordings immediately after the breaks, but I believe that for this particular session that won’t be critical.

First Part: System Modelling using Refinement and Interactive Theorem Proving

17/02/2021: Models of systems.

Reactive software and the environment. Pitfalls of decomposition. Refinement. Discrete models and programs. Event B: events, operational semantics, searching in a vector. Constants, variables, invariants, axioms, events. Zero is a natural number! Defining and implementing division. Slides so far and recording of the lecture.

24/02/2021: Calculus for logic and its application.

Invariants and invariant preservation. Proofs. Gentzen-style single-conclusioned sequent calculus. Inference rules. Applying inference rules to invariant preservation proofs. Inductive and non-inductive variants. Rationale for proof by contradiction. Slides and recorded lecture.

3/3/2021: Rodin.

Using Rodin. The model and proofs for integer division (also in PDF). Sequential programs: preconditions and postconditions. Mapping to an Event-B model. Locating an item in an array: first model and refinement. Jointly edition of the model, up to linear search (also in PDF). Slides and recording.

17/3/2021: Additional POs.

FIS and WD POs. Search in sorted arrays. Invariants. Refinement proposal and model. POs. Refinement: guard strengthening and simulation. Pending POs: strategy and how to discharge them in Rodin. Theorems and WD in specific types of theorems. Slides and recording. And the model with which we finished.

17/3/2021: One-way bridge (1)

A concurrent, reactive system that interacts with the environment: controller for a one-way bridge. For reference, these are the first and second models with which we finished, but please try to finish them on your own. Slides and recording.

24/3/2021: First homework: solutions – SHORT SESSION

This lecture was ony 1 hour long because I had to attend an overlapping meeting. Therefore I only had time to review the solutions to the first homework. The current plan is to look for a 2-hour slot to recover these 2 lost hours, as the available time in the course is limited.

In the lecture we only could review the solutions for the first homework and clarify what the relative deadlock freedom formulas w.r.t. the regular DLF formula. Probably the highlight was the clarification of the OR-L rule: why it is necessary to open branches for each component of a hypothesis containing and OR. I think that the following picture summarizes well what we discussed:

26/3/2021: One-way bridge (2) – EXTRA LECTURE

We reviewed the second and third refinements. Traffic lights were introduced in the first refinement, and several new events were necessary, as well as invariants to (a) make sure several requirements were respected and (b) as a technical help to make proving several properties possible. From some point on in the lecture, and to make it possible to finish the material, we stopped entering the model in everyone’s Rodin development environment. Even if the full model is provided below, you really should try to finish the model writing it in Rodin following the slides’ guidance.

The complete slides for this section are available, as well as the recording. We have also available the model with which we started (refinement 1) and the model of the full system (refinement 3). If you attended the session, please continue with the model where we left it (or start again with refinement 1). If you did not attend the session, start with the model we had after the 17/3/2021 session, or with refinement 1, and follow the slides to finish the model. The final refinement 3 is provided as a reference for study.

Refinements 0 to 3 need about 240 proof obligations. Only a few (2 or 3, IIRC) need light manual intervention. Most of the others need, at most, using explicitly the PP prover.

31/3/2021: NO LECTURE – EASTER HOLIDAYS.

7/4/2021: NO LECTURE – HOLIDAYS COMPENSATION.

This day will have the lectures of a Monday at UPM.

14/4/2021: Brush up your theory and synchronize a distributed system

We reviewed some concepts of logic, sets, relations, functions. We also (in the same set of slides) showed how to represent dynamic data structures (lists and trees) either finite or infinite. From the loop-absence condition of lists we derived a scheme for (strong) induction on lists that can be generalized to other pointer-based data structures.

We also started a new topic: modeling a distributed algorithm for the synchronization of processes that have restrictions on their communication. We started the models and finished the session in the middle of the first refinement.

As usual, the recording is available.

21/4/2021: Finishing the synchronization of a distributed system

We finished the correctness proofs of the synchronization of distributed system with some restrictions on the communication among processes. The recording is in the usual place, and the whole model with all proofs is also available. Two of the final proofs are more involved than what we had seen usually, and while they are sketched in the slides, I made videos of the two proofs. For one of them I made two versions, one adding a theorem and the other one doing more work with the interactive theorem prover.

Second Part: Program Development with Logic-Based Languages and Program Analysis

28/4/2021: From automated deduction to programming with logic

After some introductions we go through a motivational presentation of the overall objectives of this part. We then review unification and resolution and finally take some first steps in logic programming. These are the recordings for the introductions and the first session. These are the slides and code for the motivation and the slides and code for the introduction to pure logic programming. Please try to develop/run the examples on your own. You can find here some slides on the basic use of a logic and constraint programming system.

5/5/2021: Pure Logic Programming with Ciao

We review the concepts and use of terms as data structures and logic variables as declarative pointers. We then explain different data types of increasing complexity (naturals, lists, trees, graphs) seeing first how the types are defined and how they can be used as generators and checkers. These are followed by examples of programs that operate on these types. We also cover functions and the uses of functional syntax. This is the recording of this second session and these are the slides and code for this part. Please try to develop/run the examples on your own. Remember you can find here some slides on the basic use of a logic and constraint programming system.

12/5/2021: Semantics & some advanced features and techniques

We start by reviewing the model and fixpoint semantics of logic programs and define program correctness and completeness. We also do a quick overview of some more advanced logic programming techniques using the Prolog language within Ciao. In particular: structure inspection, higher-order, pruning, aggregates, dynamic program modification, meta-interpreters, incomplete data structures and grammars. This is the recording of this third session and these are the slides for the semantics part and the slides and code for the Prolog part. Please try to develop/run the examples on your own.

19/5/2021: CLP & program verification via abstract interpretation

The first part of this class is a very brief introduction to CLP as an extension of LP. The second part is an overview of the basic concepts of abstract interpretation and its application in program verification. We then apply abstract interpretation to logic program using both the bottom-up semantics and the top-down (i.e., resolution-based) semantics. We also see tat analyzers for logic (Horn-clause) programs can be applied to analyzing different programming languages since their semantics can be translated easily to Horn clauses as intermediate representation. We conclude the course with an introduction to the CiaoPP abstract interpreter and verifier and use it on different examples. This is the recording of this fourth session. These are the slides and code for the CLP part, and the slides for the abstract interpretation part.

Project Presentation Sessions

We plan a minimum of two presentation sessions. We may have to use additional sessions depending on the number of project teams.

26/5/2021: NO LECTURE

2/6/2021: Project Presentations I
  • HMS [16:05-16:30]
    • Hoang Nguyen Phuoc Bao
    • Matías Carlander-Reuterfelt Gallo
    • Sajid Saleem
  • AVR [16:30-16:55]
    • Roland Coeurjoly Lechuga
    • Alejandro Hernández Cerezo
    • Víctor Pérez Carrasco
  • MAD [17:05-17:30]
    • Ana Marija Ereš
    • Dominik Kos
    • Martin Sršen
  • DAD [17:30-17:55]
    • Daniel Jurjo
    • David Munuera
    • Andoni Rodríguez
  • FPI [18:05-18:30]
    • Iñaki Landa Sainz.
    • Francisco Morcillo Vizuete
    • Pablo Pérez Rodríguez
9/6/2021: Project Presentations II

  • MBR [16:05-16:30]
    • Carlos Bosch
    • Andrew Mitchell
    • Alvaro Rubio
  • F(A)Q [16:30-16:55]
    • Xiangyu Fan
    • Yuchen Qin
  • FÓÓ [17:05-17:30]
    • Óscar Font
    • Óscar Méndez
    • Francisco del Real Escudero


Location, Schedule, Administrivia

Time

During the academic year 2020-2021 this course will take place remotely via Zoom. Download the Zoom desktop client for maximum flexibility. If that is a problem for you, please let the coordinator know to seek a solution.

Place

The class will meet on Wednesdays from 4pm to 7pm (see the course log for exceptions).

Teachers

Manuel Carro (coordinator)

Office 035 at the IMDEA Software Institute – mcarro |at| fi DOT upm DOT es.

Manuel Hermenegildo

Office 386 at the IMDEA Software Institute (under appointment) – herme |at| fi DOT upm DOT es.

Communication

The mailing list archives are at https://software.imdea.org/mailman/private/cbc/ .

For security reasons, you cannot subscribe to the mailing list by yourself. You should have been subscribed by some instructor, and you should have received a welcome message with the initial subscription. Note that you can only post to the list from the mail address that has been subscribed. If you want to change it, please let an instructor know. In normal situations, all important classroom announcements will go through the course mailing list, so please be sure to read the subscribed address regularly.


Course Policy

To keep this landing page short, the course policy appears in a separate page. This does not mean it is less important. Please make sure to read it.


Academic Resources

Please have a look as well at the Assorted Resources. It contains not-strictly-academic (but interesting) material!

Logic

  • Lawrence Paulson’s Logic and Proof are the course notes of the author for a Logic course in Cambridge. Highly recommended, as they are both rigorous and very concise. They provide very good background material for both parts of the course.
  • A very good book on the use of logic in computer science is Logic in Computer Science, by Huth and Ryan. It seems to be out of print, but the Computer Science School should have several copies. You may also consider locating an electronic copy on the Internet, if possible of the second edition.
  • Mathematical Logic for Computer Science. Mordechai Ben-Ari. There should be copies in the School’s library.
  • Sweet Reason: a Field Guide to Modern Logic.  James M. Henle, Jay L. Garfield, Thomas Tymoczko. This book explains several topics on logic and logic reasoning with many entertaining non-technical examples from many sources. It does not focus on logic and computation, however.
  • Class notes on Gentzen systems and single-conclusioned Gentzen systems and refinement logic (the sequent calculus we use in the lectures) from the Spring 2009 CS 4860 (Applied Logic) course in Cornell. Syntax node: these classnotes sometimes use “⊃” to denote implication, when we (and many others) use “⇒”.

Formal Development

  • Faultless Systems: Yes, we Can! is a short article by Jean-Raymond Abrial, the creator of Event-B (among other systems and proposals for rigorous software development) that explains the ideas behind the Event-B methodology.

Event B Reference

  • The definitive reference for Event B is Modeling in Event-B: System and Software Engineering, by Jean-Raymond Abrial.
  • The richest information point for Event B is the Event B wiki. A summary of the inference and equality rules, axioms, proof obligations, and syntax of Event B can be found in these slides.
  • This reference card has a (very useful) summary of the Event B notation.
  • The mathematical toolkit of Event B is explained in a report.
  • An introduction to the Event-B method with a description of its phases.

Tools

Rodin

The essential tool to perform development with Event B is Rodin, an Eclipse-based tool. It includes an editor for the components of an Event B project that keeps track of the pending proof obligations and tries to discharge them on the fly. It has many plugins (installable directly from Rodin) that provide advanced theorem proving capabilities (to discharge proof obligations on demand and, hopefully, with only a button press), model checkers, animation, printout generation, and much more. You need to install it, as it will be use extensively during the course.

Please read this quick guide to installing Rodin. It includes some tips and instructions to perform several common tasks. This will save you time! At point, you will need to interact with the theorem provers. A page with tips for proving will be handy!

  • Homepage of the Rodin versions. Please make sure to download the latest version.
  • Installation instructions for RODIN.
  • The handbook for Rodin. It does not correspond the latest tool version: some details differ, but the basic ideas remain.
  • The Atelier B Provers plugin is necessary for any non-trivial development. Install it by going to Help ⇒ Install new software ⇒ select Atelier B Provers ⇒ Select in box ⇒ Click Next ⇒ Follow instructions. If you do not install these provers, many course examples will not work.
  • Relevant sections of the manuals:
    • How to set up a Rodin project (we will see it during the lectures).
    • Hints on discharging proofs using RODIN. Read it: it contains many hints and information on how to use the built-in and external theorem provers.
    • An explanation of the proving perspective from the user interface point of view.
    • A catalog of the proof obligations generated by RODIN and their meaning.
  • A list of the inference rules and rewriting rules in the default Event B prover (extracted from the Event B website).

Ciao

During the second part of the course you will run the examples and exercises discussed in class on Ciao, a programming system in the Prolog family which allows using pure logic programming, standard Prolog, functions, constraints, and several other paradigms and language extensions.

The following are useful resources for Ciao:

This technical paper provides a good overall overview of the system as well as the motivations behind Ciao’s design.