This is the second post about distcheck. I want to give a quick overview of the differences between edos-distcheck and the new version. First, despite using the same SAT solver and the same encoding of the problem, distcheck has been rewritten from scratch. Dose2 has several architectural problems and is not very well documented. Adding new features had become too difficult and error-prone, so this was a natural choice (at least for me). Hopefully Dose3 will survive the Mancoosi project and provide a base for dependency reasoning. The framework is well documented and the architecture is pretty modular. It is written in OCaml, so sadly I don't expect many people to join the development team, but we'll be very open to it.
These are the main differences with edos-debcheck.
distcheck is about two times faster than edos-debcheck (from dose2), but it is a "bit" slower than debcheck (that is, the original debcheck), the tool written by Jerome Vouillon that was later superseded in Debian by edos-debcheck. The original debcheck was an all-in-one tool that did the parsing, encoding and solving without converting the problem to any intermediate format. distcheck trades a bit of speed for generality. Since it is based on CUDF, it can handle different formats and can easily be adapted to a range of situations just by changing the encoding of the original problem to CUDF.
Below are a couple of tests I've performed on my machine (Debian unstable). The numbers speak for themselves.
The second big difference is about input formats. At the moment we have two different tools in Debian, edos-debcheck and edos-rpmcheck. Despite using the same underlying library, these two tools have different code bases. distcheck is basically a multiplexer that converts different inputs to a common format and then uses it (agnostically) to solve the installation problem. It can be called in different ways (via symlinks) to behave similarly to its predecessors.
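To make the symlink trick concrete, here is a minimal sketch in Python (not the actual OCaml code in Dose3) of how a single program can choose its default input format from the name it was invoked as. The function name `input_format`, the format identifiers and the `cudf` fallback are all assumptions made for illustration.

```python
import os

# Hypothetical mapping from invocation name to default input format.
# The names and format identifiers used by the real tool may differ.
DEFAULT_FORMAT = {
    "debcheck": "deb",
    "rpmcheck": "rpm",
}

def input_format(argv0, fallback="cudf"):
    """Pick an input format from the name the program was invoked as."""
    name = os.path.basename(argv0)
    # Accept plain names ("debcheck") and prefixed ones ("edos-debcheck").
    for suffix, fmt in DEFAULT_FORMAT.items():
        if name.endswith(suffix):
            return fmt
    # Assumed default when the name does not match a known predecessor.
    return fallback
```

A script would typically call this as `input_format(sys.argv[0])`, so a symlink named `edos-debcheck` pointing to the same file would select the Debian format without any extra option.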
At the moment we are able to handle five different formats.
distcheck handles gz and bz2 compressed files transparently. However, if you care about performance, you should decompress your input file first and then parse it with distcheck, since it often takes more time to decompress the file on the fly than to run the installability test itself. There is also an experimental database backend that is not compiled by default at the moment.
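For readers who want the same convenience in their own scripts around distcheck, transparent decompression can be sketched in a few lines of Python. This only illustrates the idea and is not how distcheck implements it; the helper name `open_maybe_compressed` is hypothetical, and choosing the codec by file extension is an assumption.

```python
import bz2
import gzip

def open_maybe_compressed(path):
    """Open a text file, decompressing on the fly based on its extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt", encoding="utf-8")
    # Anything else is assumed to be plain, uncompressed text.
    return open(path, "r", encoding="utf-8")
```

The caller reads from the returned file object in the same way in all three cases, which is what makes the compression "transparent". The decompression cost is still paid on every read, which is why decompressing once up front pays off for repeated runs.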
Regarding the output, I've already explained the main differences in an old post. As a quick reminder, the old edos-debcheck had two output options. The first was a human-readable, unstructured output that was a handy source of information when running the tool interactively. The second was an XML-based format (without a DTD or a schema, I believe) that was used for batch processing.
distcheck has only one output type, in the YAML format, that aims to be both human and machine readable. Hopefully this will cater for both needs. Moreover, just recently I've added to the output of distcheck a summary of who is breaking what. The output of edos-debcheck was basically a map from packages to the reasons for the breakage. In addition to this information, distcheck also gives a map from each reason (a missing dependency or a conflict) to the list of packages that are broken by this problem. This additional info is off by default, but I think it can be nice to know which missing dependency is responsible for the majority of problems in a distribution...
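The relation between the two views is just a map inversion. The following Python sketch shows the idea on made-up data; it is not the code used in distcheck, and the function names, the package names and the representation of a reason as a `(kind, detail)` tuple are all hypothetical.

```python
from collections import defaultdict

def summarize(broken):
    """Invert {package: [reasons]} into {reason: sorted list of packages}."""
    by_reason = defaultdict(set)
    for package, reasons in broken.items():
        for reason in reasons:
            by_reason[reason].add(package)
    return {reason: sorted(pkgs) for reason, pkgs in by_reason.items()}

def worst_offenders(summary):
    """Order reasons by how many packages each one breaks, largest first."""
    return sorted(summary, key=lambda r: (-len(summary[r]), r))

# Made-up example data: each broken package with the reasons it is broken.
broken = {
    "pkg-a": [("missing", "libfoo (>= 1.0)")],
    "pkg-b": [("missing", "libfoo (>= 1.0)"), ("conflict", "libbar")],
    "pkg-c": [("conflict", "libbar")],
    "pkg-d": [("missing", "libfoo (>= 1.0)")],
}

summary = summarize(broken)
ranking = worst_offenders(summary)
```

With this data, the first element of `ranking` is the missing `libfoo` dependency, since it breaks three packages while the conflict on `libbar` breaks two.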
For example, calling distcheck with --summary:
Below I give a small example of the edos-debcheck output compared to the new YAML-based output.
And an extract from the distcheck output (the order is different, as I cut and pasted parts of the output here).
bash (<= 2.7) to check all versions of bash in the universe with version less than or equal to 2.7.