Friday, January 6, 2012

A source of inconvenience removed

Unqualified variable names in Frege source code are looked up in the innermost lexical scope. If the name is not found there, the next higher scope is tried, until the search arrives at the top level name space.

The same holds when the variable appears in an expression inside the where clause of an instance, class or data declaration, except that the name space of that instance, class or type is searched before the top level.
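A minimal sketch of this lookup order follows. The module ScopeSketch, the type Box and the names scale and scaled are hypothetical and serve only to illustrate the rule described above.

```frege
module ScopeSketch where

-- A name in the top level name space (hypothetical).
scale = 10

data Box = Box Int where
    -- A name in the name space of Box (hypothetical).
    scale = 2
    -- The unqualified `scale` is looked up in the name space of Box
    -- before the top level, so it means Box.scale here.
    scaled (Box n) = n * scale

-- Hypothetical entry point: from the top level, the member must be qualified.
main _ = println (Box.scaled (Box 21))
```

Outside the where clause, an unqualified scale refers to the top level definition, which is why main has to write Box.scaled.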

This simple, logical approach that does not require special code for special cases has some unfortunate consequences, though. Consider:

data T = T Int

instance Show T where
    show (T i) = "T " ++ show i

Here, the show on the right hand side resolves to the show just being defined, and this takes arguments of type T, not Int.  This results in a type error:

E type error in expression i
    type is   Int
    used as   T

But this is not the worst case yet. It can also bite when one defines something completely different, as in the following (admittedly slightly contrived) example:

data T a = T a where
   isNegative :: T a  -> Bool
   isNegative (T x) = (show x).charAt 0 == '-'

derive Eq    T a
derive Show  T a

This gives the following seemingly unjustified errors from the type checker:

E type `T t3496` is not as polymorphic as suggested 
    in the annotation where just `a` is announced.
E type error in expression x
    type is   a
    used as   T t3496
E type error in expression
    charAt (show x) 0
    type is   Char
    used as   T t3495
E type error in expression
    type is   Char
    used as   T t3495
E inferred type is more constrained than expected type
    inferred:  (Eq t3495,Show t3497) => T a -> Bool
    expected:  T a -> Bool

The reason is that the derive declarations silently populate the name space T with definitions for show and == among others. Hence the local x must be of type (T b), due to appearing as argument of T's show. Likewise, the (==) insists on taking two T's.

The workaround is to qualify all class operations that should not resolve to the type specific incarnations introduced by instance declarations (note that derive declarations are just expanded to instance declarations). Hence, correct ways to write the definition in the first example are:

show (T i) = "T " ++ Show.show i

show (T i) = "T " ++ i.show

or even

show (T i) = "T " ++ Int.show i

Experience has shown that this is an error one commits again and again. From user feedback I know that it's not just me. Everybody trained in Haskell will most likely fall into that trap.

Therefore, in order to make it more convenient and more Haskell compatible, and because I am tired of explaining that it works as designed, the following rule has recently been implemented for resolving simple identifiers:

If a simple name resolves to an implementation of a type class operation in some instance and the same name, if used on the top level, would resolve to a type class operation, then this name resolves to the said type class operation, thereby skipping the lexical scope of the enclosing data or instance where clause.

With this rule, the examples above are automatically interpreted as if all class operations were qualified with the name of the class that introduced them. This makes the program acceptable to the type checker.
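As a minimal sketch of the effect, assuming a compiler version that implements the rule, the first example can be kept in its unqualified form. The module name RuleSketch and the main function are assumptions added for illustration and are not part of the original example.

```frege
module RuleSketch where

data T = T Int

instance Show T where
    -- Under the new rule, the unqualified `show` on the right hand side
    -- resolves to the class operation Show.show instead of the instance
    -- member being defined, so `show i` is typed at Int.
    show (T i) = "T " ++ show i

-- Hypothetical entry point, added only to exercise the instance.
main _ = println (show (T 42))
```

With such a compiler, this is expected to print T 42. With an older compiler, the same source should produce the type error shown for the first example.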

Note that this change does not break existing code. Such code will have employed one of the three workarounds shown above, and the qualified notation of course remains valid and is unaffected by the new rule. Existing code is just slightly more explicit and verbose than needed, that's all.

The changed compiler is available in versions greater than 3.18.108.