MuPAD-Combinat is an open-source algebraic combinatorics package for the computer algebra system MuPAD 2.0.0 and higher. The main purpose of this package is to provide an extensible toolbox for computer exploration. The development started in spring 2001, and the package currently contains functions to deal with usual combinatorial classes (partitions, tableaux, decomposable classes, ...), Schubert polynomials, characters of the symmetric group, and weighted automata. It supplies the user with tools for constructing new combinatorial classes and combinatorial algebras and, as an application, provides some well-known combinatorial algebras like the algebra of symmetric functions and various generalizations. Most of the code derives from computer exploration while doing research on the following topics: invariant theory of permutation groups Thiery.IAGR,Thiery.AIG.2000,Thiery.CMGS.2001, Steenrod algebras and symmetric functions Hivert_Thiery.SA.2002, binary tree algebras such as the Loday-Ronco algebra and renormalization in quantum electrodynamics Hivert_Novelli_Thibon.PBT.2002,Hivert_Novelli_Thibon.PBT.2003, representation theory, quantum groups, and homological computations Duchamp_Hivert_Thibon.2002, as well as peaks and Hecke-Clifford Algebras Bergeron_Hivert_Thibon.2003. This represents about 65000 lines of MuPAD and C++ code together with 450 pages of documentation, written by 3 main developers and altogether about 20 contributers. The core of the package is integrated in the official library of MuPAD since version 2.5.0.
The purpose of this paper is to present the package for novice and advanced users, for potential contributers, as well as for developers who do not know about MuPAD but want to compare the package with other similar packages.
After a presentation of our motivations for writing such a package, we propose a guided tour through its features. Though this tour assumes some familiarity with the MuPAD computer algebra system, the first part which describes the combinatorial feature of MuPAD-Combinat is intended for novice users, and does not assume strong programming knowledge. The second part of the tour which is devoted to the building of new combinatorial algebras, is a little bit more involved.
After that, the paper goes on with some design notes. This third part of the paper deals much more with programming techniques, and may be of interest for people wanting to understand the underlying mechanisms of the package, for example to compare it with similar packages. It assumes some strong knowledge about programming but not necessarily about the MuPAD language itself. In particular we discuss the advantage of using typing mechanism and object oriented features rather than just manipulating expressions which is the usual mechanism of similar packages. As such, it may interest people wanting to have comparable features in a different language.
1.1. A need for a toolbox for computer exploration in algebraic combinatorics
While doing research in (algebraic) combinatorics, computer exploration can be of great help. In its simplest form, when looking for the generating series of a combinatorial class, one can try to compute its first terms; those may give a hint on a recurrence relation or a general formula; at least, they can be sent to the Online Encyclopedia of Integer Sequences Sloane for comparison with well known sequences. In general, using a computer allows one to study large scale examples (in combinatorics, the size of examples usually grows very quickly!). This can help to suggest conjectures, check them for likeliness, or find counter-examples.
The first author is interested in symmetric functions and their generalizations in connection with representation theory. The problem is basically to find interesting bases together with product and change of basis rules (analogues of Littlewood-Richardson rules). His results were mainly obtained by computer exploration, using some routines extending the package Duchamp_Hivert_Thibon.2002. Similarly, the second author had developed a library for computing within invariant rings of permutation groups, in order to study certain invariant rings related to graph theory. The common point of those tools was that they essentially consisted of basic combinatorial routines together with mechanics to compute within certain combinatorial algebras.
By a combinatorial algebra we mean a vector space with a basis indexed by combinatorial objects and endowed with a product that obeys some combinatorial rule. Think of the group algebra of the -th symmetric group: the basis is indexed by permutations, while the product is given by the usual product of permutations. Such combinatorial algebras appear in many situations (see e.g. MacDonald.SF.1995,Gelfand_Krob_Lascoux_Leclerc_Retakh_Thibon.1995.NCSF1,Connes_Kreimer.1998,Stembridge.peaks.1997). One often needs to run computations within such algebras, for example for finding generators, idempotents, and in general better understanding their algebraic structure.
A typical problem is to find the elements in a given combinatorial algebra satisfying certain properties (say ):
Provide the product rule on basis elements (unless the algebra is already implemented);
Produce a system of equations characterizing those elements by running appropriate computations in the algebra;
Solve this system of equations;
Interpret the result.
This simple example highlights what a platform for computer exploration should essentially provide: in step 1, a comprehensive toolbox of basic combinatorial routines may help to implement the product rule; in step 2, the system should take care of all the linear bookkeeping to allow one to easily manipulate elements of the algebra; step 3 requires all the usual computer algebra tools (linear algebra, Gröbner bases, integration, solvers, ...). We do not plan to automatize step 4.
1.2. Review of preexisting software
We now review some available software tools for doing algebraic combinatorics and we comment from our experience and expectations why or why not, or to what extent, they fit our needs. We do not seek completeness, but rather want to present the background that motivated the definition of the specifications for MuPAD-Combinat. For a comprehensive list of related software, we refer to http://www.mat.univie.ac.at/ slc/divers/software.html.
One of the first and widely known packages for algebraic combinatorics is J. Stembridge's library for which is designed to compute with symmetric functions.
A more ambitious package for , called , was developed in Marne-la-Vallée, mostly by S. Veigneau. It provides a wide range of combinatorial routines and implements several classical combinatorial algebras (symmetric functions, quasi-symmetric function, non commutative symmetric functions, Schubert polynomials, ...) using state of the art algorithms. Being a library for a computer algebra system that is widely used in the community helped it to spread (there are about 100 known users), and allowed one to combine it with the many other existing combinatorics package for the same system. On the other hand it suffered from the poor programming language of , and the notorious incompatibilities with the new versions of made its maintenance tricky. Experience showed that the overall design made it difficult to extend, in particular for defining new combinatorial algebras. Altogether, the development essentially stalled in 1999 when S. Veigneau left for industry after finishing his PhD thesis.
, also developed in Marne-la-Vallée, by V. Prosper, was an attempt to translate for the computer algebra system MuPAD. The goal was mainly to test whether the MuPAD programing language was more adapted to the needs, and in particular to incorporate (see below) into the system, via a dynamic module. However, it suffered from the same design and development model limitations as and the development also stalled when V. Prosper left for industry after finishing his PhD thesis in 2000.
A. Kohnert leads the development of , a collection of C routines to compute with symmetric functions and Schubert polynomials, ordinary, modular, and projective representations of the symmetric group, and Hecke algebras of type A. The underlying programming language permits very substantial speed improvements compared to equivalent algorithms written, say, in . The object oriented design definitely helps for maintaining and extending it. On the other hand, it does not provide support for an easy definition of new combinatorial algebras, and can't be straightforwardly combined with other computer algebra tools. The remaining drawbacks, coming from the programming language, are partly matters of personal taste. There is a steep learning curve for casual programmers, which makes it difficult to attract new users. We also find the development cycle to be too long in a low-level programing language. Finally, not having an interpreter makes it quite unpractical for interactive computer exploration (this could be circumvented by using a C interpreter like ).
B. Weybourne also wrote an interactive program called for calculating properties of Lie groups and symmetric functions, with a view toward physics. As for , it can't be easily combined with other computer algebra tools, and does not provide support for easy definition of new combinatorial algebras.
To some extent, the systems and allow the user to define new combinatorial (Lie) algebras, and provide a wide set of tools from group theory and algebra that are useful for algebraic combinatorics. We discuss them further later on in the choice of the underlying system.
Finally, one should mention the library for Gröbner basis computations by F. Chyzak which allows one to easily implement those combinatorial algebras that fit within the more specific framework of Ore-algebras.
As argued above our goal is to have a flexible and easy to use toolbox for computer exploration in algebraic combinatorics. This includes two clearly distinguished but closely interfaced parts: one for combinatorics and the other for algebra.
The combinatorial part should provide basic routines to deal with various combinatorial objects. As such, it is to become a large collection of relatively small functions to count, list, manipulate combinatorial objects. Most of the required combinatorial utilities are quite common (say, list all the partitions of a given integer, ...) but there are so many potential utilities that a combinatorial package will never be able to provide all of them, or at least not in an optimized form. Instead, the aim should be to make it easy for users to write such utilities as need for them arises. This includes providing versatile tools for defining and manipulating new combinatorial classes.
The algebraic part should be a sort of mecano built on top of the combinatorial part. In other packages like or , the aim is to provide polished implementations of some specific algebras like symmetric functions. Instead, we aim at providing an unspecialized, very flexible and extensible toolbox to build new algebras, with the standard algebras being implemented as mere examples of applications of the general framework. In short, the package should try to take care of the trivial but tedious parts of the computations, letting the writer concentrate on the specific parts of his problem which are, most of the time in our experience, of combinatorial nature.
Moreover, the system should allow for a computation of the answers of natural questions such as “what is the rank of this set of vectors?” or “which elements in this combinatorial algebra are idempotents?”. In such computations, one often builds large systems of equations, linear or not. Solving such systems, requires versatile general purpose computer algebra tools, like linear algebra, integration, Gröbner basis, and so on. In many occasions, the systems may become very large, requiring specialized high performance software like, for example, http://fgbrs.lip6.fr/jcf/index.htmlFGB/RS, http://www-sop.inria.fr/galaad/logiciels/synaps/SYNAPS, or http://www.linalg.org/LinBox. Interfacing with such tools should be as seamless as possible. This speaks for using a general purpose computer algebra system, that can be easily interfaced with external specialized tools.
1.3.2. A short development cycle
Another aspect of writing software for daily computer exploration is that the development cycle ought to be short. To stress on this, here is the schedule of one of our (ideal) research days:
Conjecture A is on the board;
We have a first idea on how to test it on a computer
The code is written; the laptop starts computing
The first results are there; oops, a counter-example; the conjecture is indeed incorrect as it is;
A patched conjecture A' is formulated;
The laptop starts running the adapted code;
The first results are there; ok, maybe this conjecture is not so stupid after all; let us try to prove it;
More results are there; but memory is running low on my laptop. Hey, see? All coefficients are integers. This gives a good hint for the proof. But this also suggests a much better algorithm for running the computation.
The new code is written. Good, it runs 10 times faster, and uses far less memory. Let us start it on the server, and go for lunch.
Back from lunch; many more examples have been checked in the meantime.
Conjecture A' is proved; problem B(A') is on the board. the server is computing some examples overnight.
The code written during the day is analyzed; the bits that can be reused are cleaned up and integrated into MuPAD-Combinat together with documentation and test files; everything is instantly available on the anonymous CVS server on mupad-combinat.sf.net.
Here, genericity, flexibility, rapid prototyping, and speed of development are at a premium. Of course, efficiency is desirable but constant time factors are not necessarily so important (anyway, most of the time, the size of the studied objects grows exponentially). Optimizations are usually only really required in very specific parts (underlying linear algebra, ...); only those parts need to be optimized, after a careful analysis with a profiler. Ideally, the code should be written in such a way as to leave room for such specializations and optimizations. All of this speaks for a high-level language that allows one to write code that sticks as much as possible to the mathematical way of thinking. Of course, this does not preclude the use of external modules written in a low-level language like C for the critical sections, when there is a clear need for it.
1.3.3. A package designed and developed by users, for users
The package should allow different levels of use:
Occasional usage, as a mere calculator using only predefined utilities.
Regular usage, programming of little utilities, definition of simple new combinatorial classes and algebras;
Intensive usage, programming of complete libraries for new combinatorial classes and algebras;
Core hacking, implementation of generic algorithms, writing of optimized external modules (say in C), ...
For the first two levels of use, being integrated in a well-known and widely available system helps so that the user can work in his usual computing environment. This is crucial to attract new users.
As stressed above, by the very nature of the application field, any non-trivial usage involves extending the package with new utilities. To avoid duplicate work, it is essential for users to share their code. Ideally, the package should essentially end up acting as a repository of user developed routines. To this end, it needs to define a well designed framework where new utilities can be easily and quickly integrated in a natural and easy to find place. Then, defining this framework, and coordinating the developments to ensure a large scale coherency would be the main role of the core developers. Of course, the platform and the development model should encourage contributions and foster collaborations.
This aspect is particularly important for us, core developers, as our ultimate goal is to do research, not to write software. It happens that, to do this research, we need appropriate tools, and we are currently investing a lot of energy to launch this project to fulfill our own needs as well as, hopefully, other's. It is our personal hope for the near future that sharing those tools and the associated development time with others will actually save us time for more research.
1.4. Structure of this document
Apart from the preceding introduction this paper is divided into two sections. The first section is a guided tour through MuPAD-Combinat. After a general example of usage, we describe step by step the structure of a combinatorial class, together with two generic tools to deal with constrained list of integers and decomposable objects described by a recursive grammar (this is similar to the description of languages by context-free grammar, though here the grammar are not required to be context free). We provide in particular some examples of how to define new combinatorial classes. The next two subsection are devoted to combinatorial algebras, both predefined and new. This first part ends with a summary of the current and soon-to-be features of the package. Note that the version of this document included in the MuPAD-Combinat documentation provides exercises throughout this guided tour.
The second section is devoted to the design of the package. First we discuss the choice of the platform and of the development model. We explain some very basic conventions such as naming. Then we proceed with combinatorial objects and how they can be represented in a computer algebra system. We describe the design of a unified interface for various combinatorial classes on the top of which algebraic objects can be built. There, inheritance is essential to standardize and reuse code. Finally, we deal with combinatorial algebras. We describe the advantage of typing objects rather than using expressions. Then, we concentrate on the description of several implementations of free module and how the system takes care of linearity. We end up by a description of the interface for combinatorial algebras with different bases and of the mechanism for conversions between these different bases.
The suggested order of reading is to browse quickly through the guided tour (Section 4), and the design notes (Section 14, essentially the beginning of subsections Representing combinatorial objects and classes and Representing combinatorial algebras), and then to read these two sections in detail with a computer under hand to experiment with the examples.