The OCaml Format module

Honestly, the OCaml Format module is a royal PITA to use. The only documentation apart from the reference manual is this document here. Don't get me wrong: I think it's a very nice piece of software and absolutely worth having in the stdlib, but it is simply not intuitive (at least for me) at first glance. I'll write down a couple of examples. Hopefully this will help me, and others, the next time I need to use it.

I'm going to use the Format.fprintf function quite a lot. It takes format strings similar to those of the more widely used Printf.fprintf; the Format module page has all the details. Let's start easy and print a string. We write a pretty printer function pp_cell that takes a formatter and an element. This is my favourite way of writing printing functions, as I can daisy-chain them together in a single printf call using the "%a" conversion. If the formatter is Format.std_formatter, the string will be printed on stdout.

let pp_cell fmt cell = Format.fprintf fmt "%s" cell
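To illustrate the "%a" chaining, here is a minimal sketch. The pp_pair printer is a hypothetical helper, not part of the code in this post, and Format.asprintf is used only to capture the output in a string instead of printing it.

```ocaml
let pp_cell fmt cell = Format.fprintf fmt "%s" cell

(* pp_pair is a hypothetical helper: it reuses pp_cell twice through "%a" *)
let pp_pair fmt (key, value) =
  Format.fprintf fmt "%a=%a" pp_cell key pp_cell value

(* asprintf renders into a string rather than writing to a channel *)
let rendered = Format.asprintf "%a" pp_pair ("key", "value")
```

Any printer of type formatter -> 'a -> unit can be plugged into "%a" in the same way, which is what makes this style composable.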
Next we examine a simple function to pretty print a list of elements. The signature of this function is quite similar to the previous one, but this time we also pass an optional separator and a pretty printer for the elements of the list.
let rec pp_list ?(sep="") pp_element fmt = function
  |[h] -> Format.fprintf fmt "%a" pp_element h
  |h::t ->
      Format.fprintf fmt "%a%s@,%a"
      pp_element h sep (pp_list ~sep pp_element) t
  |[] -> ()
The function takes care of printing the separator after all elements but the last one.
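For comparison, recent versions of the standard library provide Format.pp_print_list, which covers the same ground as pp_list. The sketch below assumes that function is available in your OCaml version; pp_comma and pp_cells are hypothetical names chosen for this example.

```ocaml
let pp_cell fmt cell = Format.fprintf fmt "%s" cell

(* pp_comma is a hypothetical separator printer: a comma and a cut break hint *)
let pp_comma fmt () = Format.fprintf fmt ",@,"

(* pp_print_list prints the separator between elements, never after the last *)
let pp_cells = Format.pp_print_list ~pp_sep:pp_comma pp_cell

(* the horizontal box keeps everything on a single line *)
let rendered = Format.asprintf "@[<h>%a@]" pp_cells ["aa"; "bb"; "cc"]
```

The separator here is itself a printer rather than a string, so it can carry its own break hints.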

Let's start playing with the boxes. Formatting boxes are the main reason why I use the Format module, and they are very handy if you want to pretty print nested structures easily.

If we use the std_formatter and the list pretty printer without a formatting box, we obtain this output:

# let fmt = Format.std_formatter ;;
# (pp_list ~sep:"," pp_cell) fmt ["aa";"bb";"cc"];;
aa,bb,
cc- : unit = ()
#
That is the same as:
# Format.fprintf fmt "%a" (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
aa,bb,
cc- : unit = ()
To be frank, I don't quite get yet why the formatter decides to add a new line after the last comma... but moving on. If I now use a formatting box, the result is different. To print the list on one line, I can use an hbox. If I want a vertical list, I can use a vbox. This gives, respectively:
# Format.fprintf fmt "@[<h>%a@]@." (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
aa,bb,cc
# Format.fprintf fmt "@[<v>%a@]@." (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
aa,
bb,
cc
If we want to print the list with one character of indentation, this can easily be done as:
# Format.fprintf fmt "@[<v 1>@,%a@]@." (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
 aa,
 bb,
 cc
The idea is that by changing the type of formatting box, the cut break hint @, is interpreted differently by the formatter: as a newline in a vbox, and as nothing at all in an hbox (the break hint that inserts a space is written "@ "). Moreover, by adding an indentation, the formatter will take care of adding an offset to every line started within that box. And this is a winner when pretty printing nested structures.
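Here is a small sketch of how boxes nest, reusing pp_cell and pp_list from above. The pp_inner printer and the "rows:" label are hypothetical, and Format.asprintf is used so that the results can be inspected as strings.

```ocaml
let pp_cell fmt cell = Format.fprintf fmt "%s" cell

let rec pp_list ?(sep = "") pp_element fmt = function
  | [h] -> Format.fprintf fmt "%a" pp_element h
  | h :: t ->
      Format.fprintf fmt "%a%s@,%a" pp_element h sep (pp_list ~sep pp_element) t
  | [] -> ()

(* same printer, two boxes: the cut hint is a no-op in <h>, a newline in <v> *)
let horizontal =
  Format.asprintf "@[<h>%a@]" (pp_list ~sep:"," pp_cell) ["aa"; "bb"; "cc"]
let vertical =
  Format.asprintf "@[<v>%a@]" (pp_list ~sep:"," pp_cell) ["aa"; "bb"; "cc"]

(* pp_inner is a hypothetical printer: each inner list sits in its own hbox *)
let pp_inner fmt cells =
  Format.fprintf fmt "@[<h>[%a]@]" (pp_list ~sep:"," pp_cell) cells

(* the outer vbox indents every new line by two columns *)
let nested =
  Format.asprintf "@[<v 2>rows:@,%a@]"
    (pp_list ~sep:";" pp_inner) [["a"; "b"]; ["c"; "d"]]
```

The inner printer knows nothing about the indentation: the enclosing vbox applies it to every line that starts inside it.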

Let's now delve a bit deeper and try to format a table... I didn't find any tutorial on the net about this, only bits and pieces of code buried in different projects. A table for me is a pair made of a header (a string array) and a two-dimensional array of strings (a string array array). The point here is to format the table so that each column is as wide as its longest element. First we need two support pretty printers, one for the header and one for each row of the table. In order to set the tabulation marks of the table, we need to find, for each column, the longest string in that column. The result of this computation (shown below in pp_tables) is an array of integer widths. When we print the header of the table, we mark the start of each column with the Format.pp_set_tab function. The magic of the Format module will take care of the rest. The second function, which prints each row, is pretty straightforward.

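let pp_header widths fmt header =
  Array.iteri (fun j width ->
    (* mark the current position as the tab stop of column j *)
    Format.pp_set_tab fmt ();
    (* pad the header cell to the column width plus one separating space;
     * strings are immutable in current OCaml, so we pad instead of mutating *)
    let padding = String.make (max 0 (width + 1 - String.length header.(j))) ' ' in
    Format.fprintf fmt "%s%s" header.(j) padding
  ) widths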

let pp_row pp_cell fmt row =
  Array.iter (fun cell ->
    (* jump to the next tab stop set by pp_header, then print the cell *)
    Format.pp_print_tab fmt ();
    Format.fprintf fmt "%a" pp_cell cell
  ) row

The pretty printer for the table is pretty easy now. First we compute the width of each column, then we open the table box, print the header, iterate over the rows and close the box. Tadaaaa :)

let pp_tables pp_row fmt (header,table) =
  (* we compute the largest length of each column over the
   * table and the header *)

  let widths = Array.make (Array.length header) 0 in
  Array.iter (fun row ->
    Array.iteri (fun j cell ->
      widths.(j) <- max (String.length cell) widths.(j)
    ) row
  ) table;
  Array.iteri (fun j cell ->
    widths.(j) <- max (String.length cell) widths.(j)
  ) header;

  (* open the table box *)
  Format.pp_open_tbox fmt ();

  (* print the header *)
  Format.fprintf fmt "%a@\n" (pp_header widths) header;
  (* print the table *)
  Array.iter (pp_row fmt) table;

  (* close the box *)
  Format.pp_close_tbox fmt ()
For example, this is what we get:
let a = Array.make_matrix 3 4 "aaaaaaaa" in
let h = Array.make 4 "dddiiiiiiiiiiiiiiiii" in
let fmt = Format.std_formatter in
Format.fprintf fmt "%a" (pp_tables (pp_row pp_cell)) (h,a);;
dddiiiiiiiiiiiiiiiii          dddiiiiiiiiiiiiiiiii          dddiiiiiiiiiiiiiiiii           dddiiiiiiiiiiiiiiiii
aaaaaaaa             aaaaaaaa             aaaaaaaa             aaaaaaaa
aaaaaaaa             aaaaaaaa             aaaaaaaa             aaaaaaaa
aaaaaaaa             aaaaaaaa             aaaaaaaa             aaaaaaaa
Well... more or less. On the terminal you will notice that everything is well aligned. Of course this only scratches the surface. There are still a few things I don't really understand, and many functions that I didn't consider at all. Maybe I'll write a second chapter one day.
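If the tabulation boxes feel too magic, a similar alignment can be approximated by padding each cell by hand. This is a different technique from the one used above, sketched under the assumption that the table is rectangular; column_widths, pad, pp_padded_row and pp_padded_table are hypothetical names.

```ocaml
(* longest string of each column, over the header and all rows *)
let column_widths header table =
  let widths = Array.map String.length header in
  Array.iter (fun row ->
    Array.iteri (fun j cell ->
      widths.(j) <- max widths.(j) (String.length cell)
    ) row
  ) table;
  widths

(* right-pad s with spaces up to the given width *)
let pad width s = s ^ String.make (max 0 (width - String.length s)) ' '

(* print one row, cells padded to their column width, separated by a space *)
let pp_padded_row widths fmt row =
  Array.iteri (fun j cell ->
    if j > 0 then Format.pp_print_string fmt " ";
    Format.pp_print_string fmt (pad widths.(j) cell)
  ) row

(* a vertical box puts the header and every row on its own line *)
let pp_padded_table fmt (header, table) =
  let widths = column_widths header table in
  Format.fprintf fmt "@[<v>%a" (pp_padded_row widths) header;
  Array.iter (fun row ->
    Format.fprintf fmt "@,%a" (pp_padded_row widths) row
  ) table;
  Format.fprintf fmt "@]"

let rendered =
  Format.asprintf "%a" pp_padded_table
    ([| "id"; "name" |], [| [| "1"; "alice" |]; [| "22"; "bob" |] |])
```

Unlike the tab-based version, this one pads the last column too, so every line ends with the same width.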


Comments

A lighter alternative to the Format module

As a simpler alternative to the Format module I use PPrint, by François Pottier (available on his home page). It displays terms, inserting line breaks and indentation according to the rules you give, and it works quite nicely. I'm not sure it can handle horizontal alignment across different lines (as in your tab example), but for simpler things it's worth a try.

Unfortunately, it's not very well documented (well, there is the reference to the original Haskell paper, and the code is actually quite easy to read).

Answer to "why the formatter decide to add a new line"

I have just come across the same problem and decided to understand what was going on.

I think it is related to boxes. If you don't open your own toplevel box, the Format module just doesn't know what to do.

Compare:

# Format.fprintf fmt "%a" (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
aa,bb,
cc- : unit = ()

where the last "cc" is misplaced, to

# Format.fprintf fmt "@[%a@]" (pp_list ~sep:"," pp_cell) ["aa";"bb";"cc"];;
aa,bb,cc- : unit = ()

where you get what you expect.

To my mind, you can solve the problem by opening your own toplevel box (i.e. @[ ... @]).
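The suggestion can be packaged as a small wrapper. This is only a sketch: pp_boxed is a hypothetical helper, and Format.asprintf is used so that the result does not depend on whatever the toplevel has already printed on std_formatter.

```ocaml
let pp_cell fmt cell = Format.fprintf fmt "%s" cell

let rec pp_list ?(sep = "") pp_element fmt = function
  | [h] -> Format.fprintf fmt "%a" pp_element h
  | h :: t ->
      Format.fprintf fmt "%a%s@,%a" pp_element h sep (pp_list ~sep pp_element) t
  | [] -> ()

(* pp_boxed is a hypothetical wrapper: it opens a box around any printer *)
let pp_boxed pp fmt x = Format.fprintf fmt "@[%a@]" pp x

(* the short list fits within the default margin, so no break is taken *)
let rendered =
  Format.asprintf "%a" (pp_boxed (pp_list ~sep:"," pp_cell)) ["aa"; "bb"; "cc"]
```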