Thought for the day

Posted on 19 January 2010 by cmsmcq

[19 January 2010]

Not quite heard in conversation: “You will never gain full mastery of any tool until you have misused it.”

Binary adder in ACL2 and Alloy

Posted on 11 January 2010 by cmsmcq

[11 January 2010]

Not too long ago, I wrote about cross training in programming languages, and the pleasure of rewriting things from one language into another.

Recently I had another occasion to think about that. I’ve been doing some work lately with two systems which I think of as similar, though they have different goals and emphases. ACL2 (‘a computational logic for applicative Common Lisp’) is an industrial-strength theorem prover, with a history of use for proofs of correctness for chip designs, programming languages, and mathematical propositions like the fundamental theorem of calculus. It’s a lot like a high-performance car, both in its power and in the fact that it spends a lot of time in the shop. Er, uh, I mean, it requires close, intelligent attention from the user. And its learning curve looks a lot like the approach to Mesa Verde. Alloy, by contrast, is an instance of what its designer calls “light-weight formal methods”; the language is kept simple for analysability’s sake, and no theorem prover is provided: instead, exhaustive searches through all the models of a system up to a given size are made, to find counter-examples to an assertion. It’s got a learning curve, too, but in my experience it’s pretty manageable.

I’m interested in both ACL2 and Alloy because both seem to me to offer help in making designs and specifications correct, and keeping them that way when changes are made. It would be nice to have a better sense of when one tool is more appropriate to a problem, and when the other. One conjecture I’ve entertained: Alloy for initial playing around with a design, ACL2 for fuller proofs of correctness (when real proofs are desired) after the design has settled down a bit. Solving the same problem using both tools seems like an obvious way to test that conjecture and get a sense of how the two compare.

So in a spirit of exploration, I recently took an example from the ACL2 textbook* and experimented with making an Alloy model of the same thing, which I finished the other day and have now published on the Alloy community site. The example covers a binary adder constructed from simple sub-units which add single-bit numbers. (I found it enlightening to read the Wikipedia article on adders; if you’re not an electrical engineer, you may, too.)

* Matt Kaufmann, Panagiotis Manolios, and J Strother Moore, Computer-Aided Reasoning: An Approach ([Austin]: [n.p.], 2002), section 10.5.1.

The experience poses a bit of a challenge to the conjecture just mentioned: although ACL2 does require some help proving that the relevant function terminates and that it’s correct, in some ways the ACL2 description seems simpler than the Alloy description. Certainly it’s more compact. To ensure that models are analysable, Alloy forbids recursive functions; finding an Alloy equivalent to the recursive ACL2 specification of the adder took me some head-scratching, and the result uses a lot of machinery that feels kind of ad hoc to me. But in a way, the fact that a non-recursive formulation can be found also feels like a demonstration of Alloy’s expressive power. (Short version: recursion can be changed to iteration, right? So think about using sequences.)

A really good comparative sense of the two systems will require a lot more than one exercise of this kind; ideally, I’d like a whole series of ACL2-to-Alloy and Alloy-to-ACL2 translations. Currently, my Alloy seems to be up to translating at least some ACL2 exercises into Alloy; I am not certain my ACL2 skills are up to finding ACL2 equivalents for standard Alloy exercises. If you are interested in formal methods (and who else will still be reading at this point?), all I can say is: stay tuned. And if you do any cross-translation exercises of this kind, think about sharing them!

Bare-bones TEI

Posted on 6 January 2010 by cmsmcq

[6 January 2010]

The markup vocabulary defined by the Text Encoding Initiative is, for good and sound reasons, rather large and in some ways rather complicated. From time to time, however, it’s useful to have a radically cut-down version of the TEI vocabulary. For people just learning TEI markup (to name just one instance), a cut-down version can simplify initial training and make the TEI feel a little less intimidating.

Many people use TEI Lite, which is much smaller than full TEI. But for some work I’m doing with Yves Marcoux and Claus Huitfeldt, we felt it would be handy to have an even smaller subset of TEI to work with. Years ago (1994), I defined a profile of TEI called ‘Bare Bones TEI’, intended not so much for serious use as for training and for thought experiments.

I had long regarded the full details of bare-bones TEI as lost to history: the documentation has been preserved by Robin Cover at the XML Cover Pages, but it didn’t include the DTD modification file showing the precise changes from the then-current TEI DTD. I had tried a few times, searching both the Web and my hard disk, to find the original data, but had had no luck. But the other day, for reasons I don’t think I can explain, an attempt at one more search eventually found an old copy of the documentation, the modification files, and the full DTD for bare-bones TEI.

I’ve now translated the original documentation to XML, added updated links to the various DTD files, and added parameter entities to the DTD to make it work both for SGML contexts and for XML. The documentation for Bare bones TEI is now available on this site, as is the modified DTD.

The current version of the bare-bones DTD is based on TEI P3, and the translation to XML loses a small amount of information involving the pb (page break) element. Eventually, perhaps, I’ll apply the bare-bones customization to TEI P5 and produce an updated version of the schema and documentation.

Note: in searching for Bare-bones TEI on the TEI Consortium site just now, I discovered that someone has produced a similar profile, called ‘Bare TEI’. [Further research shows that although the page on customizations does not identify the authors, the work was done originally by Laurent Romary and later edited by Syd Bauman, Lou Burnard, and possibly also Sebastian Rahtz.] This may be worth exploring, for those seeking a minimal profile of TEI. Unfortunately, I haven’t found any usable documentation for Bare TEI, so I don’t know the design principles that govern it, and the DTD is full of undefined ghost elements (or ‘zombies’), which renders it unfortunately cumbersome in a syntax-directed editor. So for now I’m sticking with bare-bones TEI.

Xerophily, a parser for XSD regular expressions

Posted on 30 December 2009 by cmsmcq

[30 December 2009]

A while back, in connection with the work of the W3C XML Schema working group, I wrote a parser, in Prolog, for the regular-expression language defined by the XSD spec. This has been on the Web for some time, at http://www.w3.org/XML/2008/03/xsdl-regex/.

Not long ago, though, I received an inquiry from Jan Wielemaker, the maintainer of SWI Prolog (and ipso facto someone to whom I owe a great debt, along with every other user of that well maintained system), asking if it wouldn’t be possible to re-issue the code under the Lesser Gnu Public License, so that he could use parts of it in some work he’s doing on support for the simple types of XSD. (Strictly speaking, he could do that under the W3C license, but that would mean he has to manage not one license but two for the same library: too complicated.)

Fine with me, but since I wrote it while working at W3C, I don’t own the copyright. To make a long story short, after some cavilling and objections and assorted mutual misunderstandings, agreement was reached, and W3C gave their blessing.

The LGPL boilerplate assumes that the package being issued has a name, so I have given the XSD regex parser the name Xerophily.

[“Xerophily? What on earth is that?” asked Enrique. The word, I explained to him, denotes the property possessed by plants well adapted to growing in dry, especially hot and dry, conditions. “Well, that makes it apt for code produced in New Mexico, I guess,” he allowed. “And your code, like those plants, is a little scraggly and scrawny. But what name are you going to use for your next program?” “It’s also one of the few words I could find containing X (for XSD), R (for regular expression) and P (for parser, and for Prolog).” “Oh.”]

A word of caution: Xerophily was written to support some very dry work on the grammar of the regex language; it doesn’t really have a user interface; and it doesn’t actually perform regex matching against strings. It just parses the regex into an abstract syntax tree (AST). I do have code, or I think I did some years back, to do the matching, but the shape of the AST has changed since then. In other words, Jan Wielemaker may be the only person on earth who could have any plausible reason for wanting to use or read this code.

But for what it’s worth, as of a few minutes ago, it’s on the Web, under the LGPL, at the address http://www.w3.org/XML/2008/03/xsdl-regex/LGPL/. The W3C-licensed version remains available, at the address given above.

Philosophers make quick keyboardists

Posted on 17 December 2009 by cmsmcq

[17 December 2009]

A mnemonic for ACL2 induction

Lately I’ve been spending time working through a lot of exercises in ACL2. As a way of helping the user internalize the requirements for successful induction, several exercises ask for an explicit reformulation of a problem in terms of the ACL2 induction principle.

Don’t worry: I don’t want to try to explain the ACL2 induction principle here. It suffices for present purposes to observe that a fully explicit application of the ACL2 induction principle requires that you write down a number of things; you, dear reader, don’t need to understand what they are, only that they exist and need to be specified:

φ, the formula being proved
m, the measure to be used when computing the ‘size’ of a particular instance of the formula
q_i (for 1 ≤ i ≤ k), the conditions which determine the different induction steps: one induction step for each q_i
k, the number of induction steps (and thus of conditions)
σ_i,j (for 1 ≤ i ≤ k and 1 ≤ j ≤ h_i), the substitutions applicable to condition q_i; each condition q_i may have up to h_i hypotheses and corresponding substitutions
h_i, number of induction hypotheses for each induction step q_i
the measure conjectures for the case: (a) that the measure given always produces an ordinal value, and (b) that the measure decreases on each recursive call (i.e. in each induction hypothesis)

After a while, my evil twin Enrique got tired of watching me flip back and forth between the statement of the problem I was trying to solve and the page that showed all the things that needed to be written down; he said “Haven’t you memorized that list yet?” “No,” I said. “It’s not that simple a list, is it?”

“Sure it is,” he said. “Just use a mnemonic to remember it. The full list, with all subscripts, is

φ m, q_{i ≤ k}, k, σ_{i, j}, h_i, m c

“So just remember

Philosophers make quick keyboardists; strength in judgment helps improve mental capacity.

“Or if you can remember the subscripts by yourself, and just need (φ, m, q, σ):

Philosophy multiplies quizzical subtleties.

“Easy, see?”

Sometimes I think Enrique has too much time on his hands.

Messages in a Bottle

CMSMcQ's klog

Thought for the day

Binary adder in ACL2 and Alloy

Bare-bones TEI

Xerophily, a parser for XSD regular expressions

Philosophers make quick keyboardists