PyPy

Some ideas about implementation of Logic Programming

The problem

Basics

Logic and constraint programming as envisioned in PyPy draw heavily on the Computation Space concept present in the Oz language.

Computation spaces provide an infrastructure and API that allows seamless integration of concurrent constraint and logic programming in the language.

Currently, there is an implementation of computation spaces that is sufficient for constraint solving. It uses:

ask() clone() commit()

in straightforward ways. This space is merely a constraint store for finite domain constraint variables. It does not really support execution of arbitrary computations 'inside' the space; only constraint propagation, i.e. a fixed interpreter-level algorithm, is triggered on commit() calls. [note: don't look at the current code ...]

Choice points

For logic programming we need more. Basically a 'logic program' is a standard Python program augmented with the 'choice' operator. The program's entry point must be a zero-arity procedure. Any program whose execution path contains a 'choice' is a logic, or relational (in Oz parlance), program.

For instance:

def foo():
    choice:
        return bar()
    or:
        from math import sqrt
        return sqrt(4)

def bar():
    choice:
        return 0
    or:
        return 1

def entry_point():
    a = foo()
    if a < 2:
        fail()
    return a

What should happen when we encounter a choice? One of the choice points ought to be chosen 'non-deterministically'. That means that the decision does not belong to the program itself (it is not an 'if clause' depending on the internal state of the program) but to some external entity that is interested in the program outcomes produced by the various choices that can be made in a choice-littered program.

Such an entity is typically a 'solver' exploring the space of the program outcomes. The solver can use a variety of search strategies, for instance depth-first, breadth-first, discrepancy search, best-first search, A* ... It can provide just one solution, some of them, or all of them.

Thus the program and the methods to extract information from it are truly independent, unlike Prolog programs, which are hard-wired for depth-first exploration (and do not support concurrency, at least not easily).

Choice, continued

The above program contains two choice points, one in foo and one nested in bar. If given to a depth-first search solver, it will produce three spaces: the first two will be failed and thus discarded, the third will be a solution space, ready to be merged later. We leave the semantics of space merging for another day.

Pardon the silly ascii art:

entry_point -> foo : choice
                      /\
                     /  \
                    /    2 (solution)
           bar : choice
                  /\
                 /  \
                /    \
               0      1
          (failure)(failure)
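The tree above can be reproduced in plain Python with a small sketch. It is not the mechanism proposed in this document: instead of cloning spaces, it simply re-executes the program from the start along a recorded path of choices, which is enough to show that the search strategy lives outside the program. All names (Replay, NeedChoice, Failure, solve) and the explicit `space` argument passed to the program are assumptions made for this illustration; `choose(n)` stands in for the 'choice' syntax and returns a branch number between 1 and n.

```python
from collections import deque
from math import sqrt


class Failure(Exception):
    """Raised by fail() to mark the current branch as failed."""


class NeedChoice(Exception):
    """Raised by choose() when the recorded path has run out."""

    def __init__(self, n):
        super().__init__(n)
        self.n = n


class Replay:
    """Hypothetical stand-in for a space: replays a fixed path of choices."""

    def __init__(self, path):
        self.path = path
        self.pos = 0

    def choose(self, n):
        if self.pos == len(self.path):
            raise NeedChoice(n)  # new choice point: hand control to the solver
        branch = self.path[self.pos]
        self.pos += 1
        return branch


def fail():
    raise Failure()


def bar(space):
    return 0 if space.choose(2) == 1 else 1


def foo(space):
    if space.choose(2) == 1:
        return bar(space)
    return sqrt(4)


def entry_point(space):
    a = foo(space)
    if a < 2:
        fail()
    return a


def solve(program, depth_first=True):
    """Yield (path, value) for every successful leaf of the choice tree."""
    frontier = deque([()])
    while frontier:
        # A stack gives depth-first search, a queue gives breadth-first search.
        path = frontier.pop() if depth_first else frontier.popleft()
        try:
            value = program(Replay(path))
        except Failure:
            continue  # failed leaf: discard it
        except NeedChoice as need:
            branches = [path + (i,) for i in range(1, need.n + 1)]
            # Reversed on the stack so that branch 1 is explored first.
            frontier.extend(reversed(branches) if depth_first else branches)
            continue
        yield path, value
```

Calling `list(solve(entry_point))` explores the leaves 0, 1 and 2 in that order and keeps only the last one; passing `depth_first=False` visits the same tree level by level without touching the program. Re-execution is only safe here because the program has no side effects, which is precisely the limitation that cloning is meant to lift.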

Search and spaces

To allow this decoupling between program execution and search strategy, the computation space comes in handy. Basically, choices are made, and program branches are taken, in speculative computations which are embedded, or encapsulated, in the so-called computation space. They thus affect neither the whole program nor each other until some outcome (failure, values, updated speculative world) is considered and eventually merged back into the parent world (which could be the 'top-level' space, aka the normal Python world, or another speculative space).

For this we need another space method:

choose()

The choice operator can be written straightforwardly in terms of choose:

def foo():
    choice = choose(3)
    if choice == 1:
        return 1
    elif choice == 2:
        from math import sqrt
        return sqrt(4)
    else: # choice == 3
        return 3

Choose is more general than choice since the number of branches can be determined at runtime. Conversely, choice is a special case of choose where the number of branches is statically determined. It is thus possible to provide syntactic sugar for it.

It is important to see the relationship between ask(), choose() and commit(). Ask and commit are used by the search engine running in the top-level space, in its own thread, as follows:

ask() waits for the space to be 'stable', which means that the program running inside is blocked on a choose() call. It returns precisely the value passed to choose(), thus notifying the search engine of the number of available choice points. The search engine can then, depending on its strategy, commit() the space to one of its branches, thus unblocking the choose() call and giving it a return value which represents the branch to be taken. The program encapsulated in the space can thus run until:

  • it encounters another choice point,
  • it fails (by explicitly calling the fail() operator, or by raising an uncaught exception up to its entry point),
  • it terminates properly, thus being a candidate 'satisfying' solution to the problem it stands for.

Commit and choose allow a disciplined communication between the world of the search engine (the normal Python world) and the speculative computation embedded in the space. Of course the search and the embedded computation run in different threads. We need at least one thread for the search engine and one per space for this scheme to work.
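The handshake between ask(), choose() and commit() can be sketched with ordinary threads and queues. This is a minimal illustration under stated assumptions, not the PyPy implementation: ToySpace, run_path and the demo program are hypothetical names, the space is passed explicitly to the program, every choose(n) is assumed to have n >= 2 so that its value cannot be confused with the statuses 0 (failed) and 1 (entailed), and clone() is deliberately left out.

```python
import queue
import threading


class Failed(Exception):
    """Raised inside the space by fail()."""


class ToySpace:
    """Hypothetical space: one speculative computation in its own thread."""

    def __init__(self, program):
        self._status = queue.Queue()   # space -> search engine
        self._commits = queue.Queue()  # search engine -> space
        self.result = None
        worker = threading.Thread(target=self._run, args=(program,), daemon=True)
        worker.start()

    def _run(self, program):
        try:
            self.result = program(self)
        except Exception:
            self._status.put(0)  # explicit fail() or any uncaught exception
        else:
            self._status.put(1)  # proper termination: candidate solution

    # search engine side
    def ask(self):
        """Block until the space is stable, then return 0, 1 or n."""
        return self._status.get()

    def commit(self, choice):
        """Unblock the pending choose() call with the selected branch."""
        self._commits.put(choice)

    # program side
    def choose(self, n):
        """Announce n alternatives, then block until commit() is called."""
        self._status.put(n)
        return self._commits.get()

    def fail(self):
        raise Failed()


def run_path(program, commits):
    """Drive a fresh space along one predetermined sequence of commits."""
    space = ToySpace(program)
    pending = iter(commits)
    while True:
        status = space.ask()
        if status == 0:
            return "failed", None
        if status == 1:
            return "solved", space.result
        space.commit(next(pending))


def program(space):
    if space.choose(2) == 1:
        a = 0 if space.choose(2) == 1 else 1
    else:
        a = 2
    if a < 2:
        space.fail()
    return a
```

Here `run_path(program, [1, 1])` ends in a failed space and `run_path(program, [2])` in a solved one. Because this sketch cannot snapshot a blocked thread, trying another branch means starting a new space from scratch.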

Cloning

The problematic thing for us is what happens before we commit to a specific choice point. Since there are many of them, the solver typically needs to take a snapshot of the current computation space so as to be able to later try the other choice points (possibly all of them if it has been asked to enumerate all solutions of a relational or constraint program).

Taking a snapshot, aka cloning, is easy when the space is merely a constraint store. It is more involved when it comes to snapshotting a whole tree of threads. It is akin to saving and making copies of the current continuation (up to some thread frame).
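For the easy case, where the space is merely a constraint store, cloning can be sketched with a deep copy. The StoreSpace class below is a hypothetical toy: variables are identified by name, constraints are plain predicates checked only once every variable is determined (no propagation), and the distribution strategy is a fixed binary split on the smallest value. The solver is a depth-first search written only in terms of ask(), clone(), commit() and merge().

```python
import copy


class StoreSpace:
    """Hypothetical toy space holding only finite domains and constraints."""

    def __init__(self):
        self.domains = {}      # variable name -> set of candidate values
        self.constraints = []  # predicates over a complete assignment

    def var(self, domain, name):
        self.domains[name] = set(domain)
        return name

    def tell(self, constraint):
        self.constraints.append(constraint)

    def _undetermined(self):
        """Name of the first variable with more than one value, or None."""
        return next((n for n, d in self.domains.items() if len(d) > 1), None)

    def ask(self):
        if any(not d for d in self.domains.values()):
            return 0  # an empty domain: failed
        if self._undetermined() is not None:
            return 2  # distributable: always a binary split in this toy
        values = self.merge()
        return 1 if all(c(values) for c in self.constraints) else 0

    def commit(self, choice):
        name = self._undetermined()
        smallest = min(self.domains[name])
        if choice == 1:
            self.domains[name] = {smallest}       # try the smallest value
        else:
            self.domains[name].discard(smallest)  # or exclude it

    def clone(self):
        # Easy here: the whole state is plain data, no threads to snapshot.
        return copy.deepcopy(self)

    def merge(self):
        return {n: next(iter(d)) for n, d in self.domains.items()}


def solve_all(space):
    """Depth-first enumeration of all solutions."""
    status = space.ask()
    if status == 0:
        return []
    if status == 1:
        return [space.merge()]
    solutions = []
    for choice in range(1, status + 1):
        branch = space.clone()  # snapshot before committing
        branch.commit(choice)
        solutions.extend(solve_all(branch))
    return solutions


space = StoreSpace()
space.var({1, 2, 3}, "x")
space.var({1, 2, 3}, "y")
space.tell(lambda v: v["x"] + v["y"] == 4)
space.tell(lambda v: v["x"] < v["y"])
solutions = solve_all(space)
```

The original space is never modified by the search, since every commit happens on a clone. The deep copy works because the store is passive data; a space that embeds running threads offers no such shortcut.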

Solutions, or ideas towards them

Copying collector

In Mozart, computation space cloning is implemented by leveraging the copying garbage collector. There is basically a switch on the GC entry point which tells whether it is asked to really do GC or just clone a space.

Thread cloning

It is expected that PyPy coroutines/greenlets/microthreads be at some point picklable. One could implement clone() using the thread pickling machinery.

Such is the abstract interface of computation spaces (the inherit-from-threads part is a bit speculative):

class MicroThread:
     "provided by PyPy, stackless team"

class CompSpace(MicroThread):

     """solver API"""

     def ask(self):
         """
         blocks until the space is stable, which means that
         the embedded computation has reached a fixpoint
         returns int in [0,1,n]
            0 : space is failed (further calls to the
                                 space raise an exception)
            1 : space is entailed (solution found)
            n : space is distributable (n choice points)
         """

     def commit(self, choice):
         """tells the space which choice point to pick"""

     def clone(self):
         """returns a cloned computation space"""

     def merge(self):
         """
         extract solution values from an entailed space
         further calls to space methods will raise
         an exception
         """

     """distributor API"""

     def choose(self, n):
         """
         announces n choice points and blocks until
         commit is called
         returns the choice value that
         was given to the commit call
         """

     def fail(self):
         """
         mark the space as failed
         """

     """
     constraint stuff : a space acts as/contains a constraint
     store which holds : logic variables, constraint variables
     (logic vars augmented with finite domains), constraints
     """

     def var(self, domain, name=None):
         """
         creates a possibly named constraint variable inside
         the space, returns the variable object
         """

     def tell(self, constraint):
         """register a constraint"""

Programs running in a computation space should not be allowed to alter the top-level space (i.e. interact with the global program state and the outside world).