Types, Classes

(Maybe even typeclasses?)

— … or … —

Design by Contract and the Liskov Substitution Principle

Oliver Wong (olivwong@) — Amazon 2014

Goals of the presentation:

  • (Convince you that) static type checking is awesome.
  • Show how to use the type system of languages like Java to your advantage.
  • Maybe expose you to ideas from lots of different programming languages?

Mars Climate Orbiter

Launched on December 11th, 1998

Disintegrated in upper atmosphere

$125 million

“People sometimes make errors. The problem here was not the error, it was the failure of NASA's systems engineering, and the checks and balances in our processes to detect the error. That's why we lost the spacecraft.”
Dr. Edward Weiler, NASA's Associate Administrator for Space Science

The Mars Climate Orbiter Bug

(Reenactment)

def getImpulse(): double {
  //...
}

def updateThrusters(impulse: double) {
  //...
}

updateThrusters(getImpulse());
		

/**
 * ...
 * @return the impulse in pound * seconds.
 */
def getImpulse(): double {
  //...
}

/**
 * ...
 * @param impulse the impulse in newton * seconds.
 */
def updateThrusters(impulse: double) {
  //...
}

updateThrusters(getImpulse());
		

/**
 * ...
 * @return the impulse in pound * seconds.
 */
def getImpulse(): PoundSecond {
  //...
}

/**
 * ...
 * @param impulse the impulse in newton * seconds.
 */
def updateThrusters(impulse: NewtonSecond) {
  //...
}

updateThrusters(getImpulse()); //Compile error
		

def getImpulse(): Impulse {
  //...
  return new PoundSecond();
}

def updateThrusters(impulse: Impulse) {
  var x = Impulse.getNewtonSecond();
  //...
}

updateThrusters(getImpulse()); //Autoconversion via polymorphism
		
Fortress:

v : ℝ m/s = (3 meters + 4 meters )/5 seconds (* OK *)
v : ℝ m/s = (3 meters + 4 seconds)/5 seconds (* type error *)
v : ℝ m/s = (3 meters + 4 meters )/5         (* type error *)
		
http://www.eecis.udel.edu/~cavazos/cisc879-spring2008/papers/fortress.pdf

Units Specification JSR-108 & JSR-275

Claim:

Bugs happen when a programmer expects a piece of code to do something, but the code does something else.

“People sometimes make errors. The problem here was not the error, it was the failure of NASA's systems engineering, and the checks and balances in our processes to detect the error. That's why we lost the spacecraft.”
Dr. Edward Weiler, NASA's Associate Administrator for Space Science

Automate the check and balances in our process:

Have the computer check the program

Code:

public ; class void } 3<null>
		
 

Not grammatical

No idea what you're trying to say.

Code:

int foo = (int) "Hello";
		
 

Type error

I know what you're trying to do, but I'm not gonna let you do it because it looks like a bug.

What are type systems?

A type system is a collection of rules that assign a property called a type to the various constructs […] a computer program is composed of. The main purpose of a type system is to reduce bugs in computer programs by defining interfaces between different parts of a computer program, and then checking that the parts have been connected in a consistent way.
http://en.wikipedia.org/wiki/Type_system

I expect the code to do something:


def executeAction(user) {
  //...
}
val user = User.findById(userId);
executeAction(user);
	

But the code does something else:


/**
 * user: the id of user making the request.
 */
def executeAction(user: Int) {
  //...
}
val user = User.findById(userId);
executeAction(user); //BUG
	

What are type systems?

A type system is a set of types […] along with the rules that govern whether or not a program is legal with respect to [those] types.
http://www-plan.cs.colorado.edu/diwan/3155-f04/TypeChecking.pdf
The rules in a given type system allows some programs and disallows other programs. For example, unlike assembly language, one cannot write a program in Java that grabs a random word from memory and treats it as if it were an integer.
http://www-plan.cs.colorado.edu/diwan/3155-f04/TypeChecking.pdf
Thus, every type system represents a tradeoff between expressiveness and safety.
http://www-plan.cs.colorado.edu/diwan/3155-f04/TypeChecking.pdf
  • Reject all programs! (e.g. HQ9+)
  • Guaranteed no bugs ever in production (unless there's a bug in the compiler)
  • Don't trust the (human) programmers
  • Accept all programs! (e.g. machine code)
  • Lets you do things the compiler is unsure is possible
  • Fully trust the (human) programmers

Less extreme example — Java:

  • Guarantees right number of parameters are passed in a method invocation
  • Don't trust the programmers to perform pointer arithmetic
  • Condition of an if statement must be boolean
  • Checked exceptions must be caught or declared as thrown.

Choosing a language is implicitly choosing a type system.

(We'll get back to this later).
A type system is a collection of rules that assign a property called a type
http://en.wikipedia.org/wiki/Type_system
A type system is a set of types […] along with the rules…
http://www-plan.cs.colorado.edu/diwan/3155-f04/TypeChecking.pdf
Definition: a type is a set of values.
http://www-plan.cs.colorado.edu/diwan/3155-f04/TypeChecking.pdf
A type is a set of values… and the set of operations allowed on those values.
me
(We'll also get back to this later).

Intuition:


class Foo {
}

class Bar extends Foo {
  public void frobulate();
}
		
Foo f = new Foo();
Bar b = new Bar();
Foo Bar f b

Static type checking vs Static analysis?

The distinction is not important for this talk.

Types are sets of values, ∴ if you completely specify the types of your program, you completely specify the allowed values that variables can have, that methods can take as input, etc.

In other words, you are completely specifying all possible states the program can have while it is running, and thus all possible behavior it can exhibit.

Criticisms of Static Type Checking

Static typing fanatics try to make us believe that “well-typed programs cannot go wrong”. While this certainly sounds impressive, it is a rather vacuous statement. […] [Static type checking] is necessarily only partially sound and incomplete. This means that programs can still go wrong because of properties that are not tracked by the type-checker.
Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

Gödel's (First) Incompleteness Theorem

Any effectively generated theory capable of expressing elementary arithmetic cannot be both consistent and complete.
http://en.wikipedia.org/wiki/G%C3%B6del's_incompleteness_theorems

(Also: Halting Problem, Rice's Theorem, etc.)

When programmers say “I need […] static typing”, they really mean […] “I want contracts”

Static typing provides a false sense of safety, […] even if a program does not contain any static type-errors, this does not imply that you will not get any unwanted runtime errors.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed
Current type-checkers basically only track the types of expressions and make sure you do not assign a variable with a value of an incompatible type, that the arguments to primitive operations are of the right type (but not that the actual operation succeeds), and that the receiver of a method call can be resolved statically (but not that the actual call succeeds).

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed (No date on PDF, but includes citations from 2004)

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

Possible via Dependent Types

A dependent type is a type that depends on a value.

[…]

Two common examples of dependent types are dependent functions and dependent pairs. […] A dependent pair may have a second value that depends on the first. It can be used to encode a pair of integers where the second one is greater than the first.

http://en.wikipedia.org/wiki/Dependent_type

Dependent Types ⇒ Agda, Coq, ... Java?!

You can encode the natural numbers in the Java type system

Thanks, Peano!
Real, honest to goodness, Java:

abstract class Nat<Me extends Nat> {
  Succ<Me> next() { return new Succ<Me>(); }
}
class Zero extends Nat<Zero> { }
class Succ<Prev extends Nat> extends Nat<Succ<Prev>> { }
		

Note that you “pass in” the “current” value to Nat, but the "previous" value to Succ

  • Five ⊆ Nat<Five>
  • Five ⊆ Succ<Four> ⊆ Nat<Succ<Four>>
Real, honest to goodness, Java:

static <N extends Nat> void acceptsConsecutive(N first, Succ<N> second) {
  System.out.println("Yup, they're consecutive allright.");
}

//...

Zero zero = new Zero();
Succ<Zero> one = zero.next();
Succ<Succ<Zero>> two = one.next();
Succ<Succ<Succ<Zero>>> three = two.next();

//acceptsConsecutive(zero, zero); //compile error
acceptsConsecutive(zero, one);
//acceptsConsecutive(zero, two); //compile error
//acceptsConsecutive(zero, three); //compile error

//acceptsConsecutive(one, zero); //compile error
//acceptsConsecutive(one, one); //compile error
acceptsConsecutive(one, two);
//acceptsConsecutive(one, three); //compile error

//acceptsConsecutive(two, zero); //compile error
//acceptsConsecutive(two, one); //compile error
//acceptsConsecutive(two, two); //compile error
acceptsConsecutive(two, three);
		

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

“An argument to a method call is within a certain range”

Foo Bar

Presumably this range is a subset of some larger type (e.g. only integers between 10 and 20).

Therefore, you can use subtyping to express this.

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

What kind of conditions?

That the output is larger than the input (or vice versa?)


abstract class Stack<E, N extends Nat<N>> {
  public Stack<E, Succ<N>> push(E element) {
    return new StackNode<E, N>(element, this);
  }
}
class EmptyStack<E> extends Stack<E, Zero> { }
class StackNode<E, N extends Nat<N>> extends Stack<E, Succ<N>> {
  private final E head;
  private final Stack<E, N> next;
  public StackNode(E head, Stack<E, N> next) {
    this.head = head; this.next = next;
  }
  public Stack<E, N> pop() { return next; }
  public E get() { return head; }
}
	

class StackOperations {
  static <E, N extends Nat<N>> Stack<E, N> pop(Stack<E, Succ<N>> stack) {
    assert stack instanceof StackNode;
    return ((StackNode)stack).pop();
  }
  static <E, N extends Nat<N>> E get(Stack<E, Succ<N>> stack) {
    assert stack instanceof StackNode;
    return ((StackNode<E, N>)stack).get();
  }
}

class Test2 {
  public static void main(String[] args) {
    Stack<String,Zero> zero = new EmptyStack();
    //StackOperations.pop(zero); //Compile Error
    //StackOperations.get(zero); //Compile Error
    
    Stack<String,Succ<Zero>> one = zero.push("A");
    StackOperations.get(one);
    Stack<String,Zero> zeroAgain = StackOperations.pop(one);
    
    //StackOperations.get(zeroAgain); //Compile Error
    //StackOperations.pop(zeroAgain); //Compile Error
  }
}
	

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

What kind of condition?

That the output lies within a certain range?

Foo Bar

Again, use subtyping to express this

Choosing a language is implicitly choosing a type system.


Succ<Succ<Succ<Succ<Succ<Succ<Succ<N>>>>>>> seven = ...
	

Writing dependent-type code in a Java sucks.

It's a little bit better in Scala, but still not great; see https://github.com/milessabin/shapeless
  • 😻 Preventing bugs via the type systems is awesome.
  • 😿 Fighting an unexpressive type system sucks.
  • 😾 What do we do if we're stuck with an unexpressive type system? (Besides rewriting everything from scratch in Scala / Haskell / Agda / Hipster-language-of-the-month?)

However in general programmers want to express more advanced contracts about their code. For example, an invariant that the value of one variable is always less than the value of another, a precondition that an argument to a method call is within a certain range, or a postcondition that the result of a method call satisfies some condition.

Erik Meijer and Peter Drayton — Static Typing Where Possible, Dynamic Typing When Needed

Bertrand Meyer

Created Design by Contract in 1986-1997, trademarked in 2003-2004.

Design by contract

  • Preconditions
  • Postconditions
  • Invariants
What it looks like in Eiffel:

	put (x: ELEMENT; key: STRING) is
	  require
	    count <= capacity
	    not key.empty
	  do
	    -- Implementation of method here.
	  ensure
	    contains(x)
	    get(key) = x
	    count = old count + 1
	  end
	  

In Scala, use require and ensuring


	def put(x: Element, key: String) {
	  require(count <= capacity)
	  require(! key.empty)
	  //implementation of method here
	} ensuring { (result) =>
	  result.contains(x) &&
	  result.get(key) == x
	}
		

Various DbC frameworks for Java:

  • Contracts for Java
  • iContract2/JContracts
  • Contract4J
  • jContractor
  • C4J
  • CodePro Analytix
  • STclass
  • Jass preprocessor
  • OVal with AspectJ
  • JML
  • Jtest
  • SpringContracts
  • Modern
  • Custos using AspectJ
  • JavaDbC
  • JavaTESK
  • chex4j using javassist
  • java-on-contracts

In-line unit tests in D


class Sum {
  int add(int x, int y) {
    return x + y;
  }

  unittest {
    Sum sum = new Sum;
    assert(sum.add(3,4) == 7);
    assert(sum.add(-2,0) == -2);
  }
}
		
http://dlang.org/unittest.html

Doctest in Python


def list_to_0_index(lst):
  """
  Given a list, lst, say for each element the 0-index where it appears for
  the first time. For example:
  
  >>> x = [0, 1, 4, 2, 4, 1, 0, 2]
  >>> list_to_0_index(x)
  [0, 1, 2, 3, 2, 1, 0, 3]
  >>>
  """
  return [lst.index(i) for i in lst]
		
http://en.wikipedia.org/wiki/Doctest
Contracts are declarative

/**
 * @return the sum of x and y, mod Integer.MAX_VALUE.
 */
public int add(int x, int y) {
	
Unit tests are specific examples.

assertEquals(7, add(3, 4));
	
  • Static Type systems are verified at compile time.
  • Unit tests are verified at (pseudo-)compile time.
  • "Real" contracts are verified at runtime, require special tooling.
  • Javadoc contracts are only verified when a human reads them and thinks about them.

Barbara Liskov

Liskov Substitution Principle

Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S where S is a subtype of T.

In other words, think of types as sets of values, and think of "is a subtype of" as meaning "is a subset of".

Subtyping as Subsets

Foo Bar

Any property that is true about any instance of Foo should also be true about any instance of Bar, or else Bar is not a subtype/subset of Foo!

Lots of people get Liskov Substitution Principle wrong.

I've seen LSP-violations in blogs posts, web tutorials, and published books on OO design.

I was part of a thread in java-dev@amazon.com just one week ago, where someone got LSP wrong.

Examples of LSP Violations

Rectangle & Square

A square is a (IS-A) rectangle, Wikipedia says so!


public class Rectangle {/*...*/}
public class Square extends Rectangle {/*...*/}

public void methodExpectingRectangle(Rectangle r) {
  r.setWidth(3);
  r.setHeight(7);
  assertThat(r.getWidth(), is(3));
  assertThat(r.getHeight(), is(7));
}

methodExpectingRectangle(new Square(/*...*/));
	

Examples of LSP Violations

Rectangle & Square

A rectangle is just a square, with a few other fields and methods, right?


public class Square {/*...*/}
public class Rectangle extends Square {/*...*/}

public void methodExpectedSquare(Square s) {
  assert "The diagonals of s bisect each other at right angles", //...
}

methodExpectedSquare(new Rectangle(/*...*/));
	

Examples of LSP Violations

A mutable list is just an immutable list, but with setters, right?


public class ImmutableList { /*...*/}
public class MutableList extends ImmutableList {
	public void add(E element) {/*...*/}
}
	

An immutable list is just a mutable list, whose setters throw UnsupportedOperationException, right?


	//see http://docs.oracle.com/javase/7/docs/api/java/util/Collection.html
	

(To be fair, this last one isn't an LSP violation, just a terrible API design.)

How do we fix LSP violations?

Let q(x) be a property provable about objects x of type T. Then q(y) should be provable for objects y of type S where S is a subtype of T.

Two techniques:

  1. Don't have q(x) be a property about objects x of type T.
  2. Don't have S be a subtype of T.
Note: It's okay to have properties q(y) that are true for instances of the subtype which are not true for the supertype!

Preconditions are contravariant, post-conditions are covariant

In other words:

  • A subtype's preconditions can be weaker than the supertype's.
  • A subtype's postcondition can be stronger than the supertype's.

Examples

IntAdder: must be given two ints as input. Guarantees output is an int.

  • RoundingAdder: must be given two reals as input. Guarantees output is an int. OK.
  • RealAdder: must be given two reals as input. Guarantees output is a real. WRONG.
  • PositiveAdder: must be given two positive ints as input. Guarantees output is a positive int. WRONG.
  • AbsoluteAdder: must be given two ints as input. Guarantees output is a positive int. OK.
  • NonsenseAdder: must be given two objects as input. Guarantees output is always -1. OK.

Don't have q(x) be a property about objects x of type T

  • Don't expose API in a supertype, if that API cannot be implemented by all subtypes.
  • Don't claim something in the contract of a supertype, if that claim is not true in subtypes.

Don't have S be a subtype of T

Example:


/**
 * Instances might be mutable or immutable
 */
abstract class Data { public int get() { /*...*/ } }
/**
 * New contract: these instances are guaranteed to be immutable.
 */
class ImmutableData { /* No new API defined */ }
/**
 * New contract: these instances are guaranteed to be mutable.
 */
class MutableData { public void set(int newValue) { /*...*/ } }
	

LSP cannot be verified by static analysis!

(Proof: let the property q(x) be "always terminates", reduces to halting problem.)

We (humans) must manually verify LSP.

We (humans) need to distinguish between implementation details and intended contract.

(e.g. "there is no setter" vs "the contract states that instances of this class are immutable")

Write the contract in the javadoc!

Immutability does not make you immune to LSP


interface ImmutableCircle {
	float diameter();
	float area();
}
interface ImmutableEllipse extends Circle {
	float otherDiameter();
}

ImmutableCircle notReallyACircle = new ImmutableEllipse();
assert notReallyACircle.area() == π * (notReallyACircle.diameter() / 2) ^ 2
		

Subtyping as Subsets

Rectangles Squares

Wait, why is this diagram wrong?

Types are not just sets of values; they are sets of values, plus the operations that are allowed on those values.

Liskov is all about the properties of the elements, not the elements themselves. I.e. Liskov is all about the contract.

Takeaway points

  • Bugs happen when the code doesn't do what you expect.
  • Use the type system to force the code to do what you expect.
  • Use the type system to force you to expect what the code does.
  • If the type system can't (easily) express your contract, use contracts.
  • Use LSP to make sure nobody's expectations are being violated.
  • No static analysis tool can verify LSP for you.
  • You need manual, human verification.
Images used with permission:
  • Mars Climate Orbiter: Public Domain, by NASA
  • Mars Climate Orbiter Trajectory: CC BY 3.0, by Xession
  • Challenge Accepted: Public Domain
  • Bertrand Meyer: CC BY-SA 3.0, by David.Monniaux
  • Barbara Liskov: CC BY 2.0, by Dennis Hamilton