A Pretty Printer in F#
Introduction
F# features some tools for pretty printing. StructuredFormat is a
customizable layout engine. A type that wants to control how it is
displayed should implement the IFormattable interface. This
interface requires a method (GetLayout) that returns a Layout
value. The StructuredFormat library provides several functions and
operators that make layouts easier to build.
To print a value using this engine, you can use one of the following
functions:
- print_any ('a -> unit)
- prerr_any ('a -> unit)
- output_any (out_channel -> 'a -> unit)
- any_to_string ('a -> string)
StructuredFormat is also the engine used by the interactive mode.
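For readers on a current F# toolchain, the functions listed above may not be available; as far as I know they were later superseded by the %A format specifier, which also goes through structured formatting. The sketch below is written under that assumption, and anyToString is a hypothetical stand-in for any_to_string, not a library function.

```fsharp
// Assumption: a current F# compiler, where "%A" performs structured formatting.
// anyToString is a hypothetical stand-in for the any_to_string function above.
let anyToString (x: 'a) : string = sprintf "%A" x

let listText = anyToString [ 1; 2; 3 ]
let tupleText = anyToString (1, "one")

printfn "%s" listText    // [1; 2; 3]
printfn "%s" tupleText   // (1, "one")
```

The same specifier works directly in printfn, for example printfn "%A" [1; 2; 3].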
A layout describes how tokens fit next to each other and where
breaks, spaces and indentation will (or may) go. Objects such as
numbers and strings can be left unformatted, which lets the engine
decide how to print them (for example, in a culture-specific
format). The engine itself can also be customized: print width,
depth, culture and floating-point number formatting...
Functions
Have a look at the sformat documentation for the list of available
functions. To convert an atom to a layout, you can use the following
functions:
- objL builds a layout for most values (int, float formatted
according to the culture...). Since this function requires an
object, you might need to box the value first.
- wordL, sepL, leftL and rightL convert a string into a Layout.
wordL is a string leaf that is separated from its neighbours by
spaces. leftL and rightL are used when the string behaves like a
left (or right) parenthesis: no space is added on the right (or the
left). sepL is used for separators: no spaces are needed on either
side.
- listL, spaceListL, commaListL... are convenient functions for
printing lists.
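The spacing rules for string leaves can be modelled in a few lines. This is only a toy sketch, not the StructuredFormat implementation: the Leaf record and the word, sep, left and right helpers are hypothetical counterparts of wordL, sepL, leftL and rightL, and it assumes that a left-parenthesis leaf takes no space on its right and a right-parenthesis leaf takes none on its left.

```fsharp
// Toy model of the spacing rules only; not the StructuredFormat implementation.
type Leaf =
    { Text: string
      NoSpaceLeft: bool     // true: glue to whatever comes before
      NoSpaceRight: bool }  // true: glue to whatever comes after

// Hypothetical counterparts of wordL, sepL, leftL and rightL.
let word s = { Text = s; NoSpaceLeft = false; NoSpaceRight = false }
let sep s = { Text = s; NoSpaceLeft = true; NoSpaceRight = true }
let left s = { Text = s; NoSpaceLeft = false; NoSpaceRight = true }
let right s = { Text = s; NoSpaceLeft = true; NoSpaceRight = false }

// Join leaves on one line, inserting a space between two neighbours
// unless either of them asks to be glued on that side.
let render (leaves: Leaf list) : string =
    match leaves with
    | [] -> ""
    | first :: rest ->
        let step (text: string, prev: Leaf) (next: Leaf) =
            let gap = if prev.NoSpaceRight || next.NoSpaceLeft then "" else " "
            (text + gap + next.Text, next)
        List.fold step (first.Text, first) rest |> fst

let call =
    render [ word "f"; left "("; word "x"; sep ","; word "y"; right ")"; word "z" ]

printfn "%s" call   // f (x,y) z
```

Only the words keep spaces around them; the parentheses and the comma stick to their neighbours.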
Several operators are available to combine layouts. You can choose
whether the joint between two layouts is unbreakable (never a line
break), breakable (the engine decides whether to break) or broken
(always starts a new line). You can also choose how far the
right-hand layout is indented when a line break occurs.
- $$ unbreakable
- ++ breakable (no indent)
- -- breakable (indent=1)
- --- breakable (indent=2)
- @@ broken (no indent)
- @@- broken (indent=1)
- @@-- broken (indent=2)
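To make the three kinds of joints concrete, here is a toy model. It is a sketch under stated assumptions, not the real engine: the Joint and Layout types and the flat and render functions are hypothetical, leaves are always separated by one space, indentation is counted in spaces, and the breaking strategy (keep a node on one line when it fits, otherwise break at its own joint) is a simplification.

```fsharp
// Toy model for illustration; the real engine is more elaborate.
type Joint =
    | Unbreakable        // stands in for $$
    | Breakable of int   // stands in for ++ (0), -- (1), --- (2)
    | Broken of int      // stands in for @@ (0), @@- (1), @@-- (2)

type Layout =
    | Leaf of string
    | Node of Layout * Joint * Layout

// Single-line rendering, or None when a Broken joint forces a new line.
let rec flat layout =
    match layout with
    | Leaf s -> Some s
    | Node (_, Broken _, _) -> None
    | Node (l, _, r) ->
        match flat l, flat r with
        | Some a, Some b -> Some (a + " " + b)
        | _ -> None

// Render to a list of lines for a given width.
let rec render width layout =
    match flat layout with
    | Some s when s.Length <= width -> [ s ]
    | _ ->
        match layout with
        | Leaf s -> [ s ]
        | Node (l, Unbreakable, r) ->
            // Never break here: glue the last line of l to the first line of r.
            let ls = render width l
            let rs = render width r
            let init = List.take (List.length ls - 1) ls
            init @ [ List.last ls + " " + List.head rs ] @ List.tail rs
        | Node (l, (Breakable n | Broken n), r) ->
            // Break here and indent the right-hand side by n spaces.
            let pad = String.replicate n " "
            render width l @ List.map (fun line -> pad + line) (render (width - n) r)

let letBinding = Node (Leaf "let x =", Breakable 2, Leaf "some_long_expression")
let block = Node (Leaf "begin", Broken 1, Leaf "body")

let wide = render 40 letBinding     // fits: one line
let narrow = render 20 letBinding   // too wide: breaks, indent 2
let forced = render 80 block        // always breaks, indent 1

narrow |> List.iter (printfn "%s")
```

With a width of 40 the breakable joint stays on one line; with a width of 20 it breaks and indents the right-hand side by two spaces. The broken joint splits the line whatever the width.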
Pay attention to operator precedence. For example, ++ binds more
tightly than $$, so a $$ b ++ c is read as a $$ (b ++ c), which may
be confusing. Here is an example:
let get_layout (env: #IEnvironment) (e: #IFormattable) = e.GetLayout(env)

type ast =
    | Val of int
    | Var of string
    | BinOp of binop * ast * ast
    | UnOp of unop * ast
    with
        interface StructuredFormat.IFormattable with
            member x.GetLayout(env) =
                match x with
                | Val i -> objL (box i)
                | Var s -> wordL s
                | BinOp (Custom (s, _), t1, t2) ->
                    wordL s ++ get_layout env t1 ++ get_layout env t2
                | BinOp (b, t1, t2) ->
                    (leftL "(" $$ get_layout env t1) ++ get_layout env b ++
                    (get_layout env t2 $$ rightL ")")
                | UnOp (b, t1) ->
                    get_layout env b $$ get_layout env t1
        end
...
The get_layout function defined above can be removed, but you
might then need to add type annotations in the code. To test how
your printer behaves, I suggest trying it in interactive mode. You
can change the value of fsi.PrintWidth to see how line breaks and
indentation behave.
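The operator precedence issue mentioned earlier can be checked without the library. In the sketch below the three operators are hypothetical stand-ins defined on strings: each one simply records how its operands were grouped. F# derives precedence and associativity from an operator's leading characters, so the grouping should match what the layout operators with the same names get.

```fsharp
// Hypothetical stand-ins: each operator records how its operands were grouped.
let ( $$ ) (l: string) (r: string) = "(" + l + " $$ " + r + ")"
let ( ++ ) (l: string) (r: string) = "(" + l + " ++ " + r + ")"
let ( @@ ) (l: string) (r: string) = "(" + l + " @@ " + r + ")"

let g1 = "a" $$ "b" ++ "c"   // ++ binds tighter than $$
let g2 = "a" ++ "b" $$ "c"
let g3 = "a" @@ "b" ++ "c"   // ++ also binds tighter than @@
let g4 = "a" @@ "b" @@ "c"   // @@ groups to the right

[ g1; g2; g3; g4 ] |> List.iter (printfn "%s")
// (a $$ (b ++ c))
// ((a ++ b) $$ c)
// (a @@ (b ++ c))
// (a @@ (b @@ c))
```

When in doubt, adding explicit parentheses around each group is the safest option.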
Indentation
Here are two examples showing how to format an if expression in a
language.
Source code
| If (cond, exp1, exp2) ->
    ((wordL "if" $$ get_layout env cond $$ wordL "then")
     @@- (get_layout env exp1))
    @@ wordL "else"
    @@- (get_layout env exp2)
Output
if exp then
 1
else
 2
Source code
| If (cond, exp1, exp2) ->
    (wordL "if" $$ get_layout env cond) --
    aboveL
        (wordL "then" -- (get_layout env exp1))
        (wordL "else" -- (get_layout env exp2))
Output
if exp then 1
else 2
Full example
I've written a full example using a pretty printer. The example
also features operator overloading, operator redefinition and AST
manipulation. The code contains a tree describing arithmetic
expressions (with variables). The AST can be printed and simplified
using arithmetic rules (x * 0 = 0, etc.).
#light
open StructuredFormat
open StructuredFormat.LayoutOps

let get_layout (env: #IEnvironment) (e: #IFormattable) = e.GetLayout(env)

type ast =
    | Val of int
    | Var of string
    | BinOp of binop * ast * ast
    | UnOp of unop * ast
    with
        static member (+)(a, b) = BinOp (Plus, a, b)
        static member (-)(a, b) = BinOp (Minus, a, b)
        static member ( *)(a, b) = BinOp (Times, a, b)
        static member (/)(a, b) = BinOp (Div, a, b)
        static member (~+)(a) = UnOp (UPlus, a)
        static member (~-)(a) = UnOp (UMinus, a)
        interface StructuredFormat.IFormattable with
            member x.GetLayout(env) =
                match x with
                | Val i -> objL (box i)
                | Var s -> objL (box s)
                | BinOp (Custom (s, _), t1, t2) ->
                    wordL s ++ get_layout env t1 ++ get_layout env t2
                | BinOp (b, t1, t2) ->
                    (get_layout env t1 $$ get_layout env b) -- get_layout env t2
                    |> bracketL
                | UnOp (b, t1) ->
                    get_layout env b $$ get_layout env t1
        end
    end
and binop =
    | Plus
    | Minus
    | Times
    | Div
    | Custom of string * (int -> int -> int)
    with
        interface StructuredFormat.IFormattable with
            member x.GetLayout(env) =
                match x with
                | Plus -> wordL "+"
                | Minus -> wordL "-"
                | Times -> wordL "*"
                | Div -> wordL "/"
                | Custom (s, _) -> wordL s
        end
    end
and unop = UPlus | UMinus
    with
        interface StructuredFormat.IFormattable with
            member x.GetLayout(env) =
                match x with
                | UPlus -> leftL "+"
                | UMinus -> leftL "-"
        end
    end
let rec map f = function
    | Val _ | Var _ as a -> f a
    | BinOp (op, a, b) -> f (BinOp (op, map f a, map f b))
    | UnOp (op, a) -> f (UnOp (op, map f a))

let simplify =
    let op_to_fct = function
        | Plus -> (+) | Minus -> (-) | Times -> ( * ) | Div -> (/)
        | Custom (_, a) -> a
    in map (function
        | BinOp (op, Val a, Val b) -> Val ((op_to_fct op) a b)
        | UnOp (UMinus, Val a) -> Val (-a)
        | BinOp (Times, Val 0, _) | BinOp (Times, _, Val 0) -> Val 0
        | BinOp (Times, Val 1, a) | BinOp (Times, a, Val 1)
        | BinOp (Minus, a, Val 0) | BinOp (Div, a, Val 1)
        | UnOp (UPlus, a) | UnOp (_, UnOp (_, a))
        | a -> a)

let (!) a = Val a
let (!?) a = Var a
let min' a b = BinOp(Custom ("min", min), a, b)

let () =
    let ast = - (- !?"x") * (!3 + (min' !5 !0) * !?"x") + !?"y"
    in print_any (ast |> simplify)