Boost logo

Boost Users :

From: Stephen Nuchia (snuchia_at_[hidden])
Date: 2008-02-23 02:02:04


> I'm talking about a schedule generating program for an undergraduate
> student. It generates a schedule for a semester for a student, given
> some constraints (time of day, courses done already and so on) and the

I think you'll find the number of feasible schedules to be much larger
than you have time to enumerate in many cases.

Whether for build scheduling, course scheduling, or analysis of
protocols and distributed algorithms, the basic topo sort setup can be
applied to finding feasible parallel schedules (without resource
constraints) by converting all non-atomic events into begin < end pairs.
There may be a trick for extending it to resource constraints (e.g.
number of CPUs) but I can't think of one right now.

The variations on this theme are well studied; scheduling is an ancient
application of automatic data processing machinery. I suggest we leave
off discussing applications (interesting though they are) and think
about whether and how the existing library might be extended or adapted
to the task of enumerating feasible orderings of a poset.

> For the time being, all I want is to be able to generate all the
> different schedules (i.e., all various topo sorts). Indeed, something
> like std::next_permutation() would make a lot of sense. It would be
> fast and flexible. My next task would then be to find an algorithm to
> quickly estimate the number of different orderings. :)

You seem to be convinced that your scheduling problem can be represented
as a topological sorting problem. I am skeptical of this but, as
mentioned above, the application is not really the point.

Heuristics probably exist, I can think of a few that might have some
value. There are N! permutations on N elements and every independent
edge cuts the space in half. If you can find a set of k independent
single-edge subgraphs you know there are no more than N!/2^k feasible
permutations. Wherever you have mutually incomparable chains you can
compute the number of feasible interleavings. I'm not sure but I
suspect computing the number exactly will require the same machinery as
enumerating them. If we're interested in truncated permutations (or
general projections of permutations) counting gets much harder.

>> For an input small enough that the
>> set of feasible orderings will fit in memory I would simply generate
all
>> permutations and screen for feasibility rather than attempting to
create
>> an enumeration algorithm.

> I don't follow. Can I change the ordering of vertices without ruining
> the graph's integrity? How? Would this affect the toposort output?

I'd have to look at the representation details and I haven't yet. In
general, the abstract structure of the input to any algorithm is encoded
in some physical way and that physical encoding may contain information
beyond that in the abstract data structure it represents.

The example to which I was alluding is from conventional sorting. When
the key fields of two input records are equivalent under the rules of a
sort, the algorithm has to choose which will be appear first in the
output. The choice may be arbitrary -- either random or effectively
random -- or it may be based on the input in some direct way. One
choice is to use non-key fields to extend the key. Another is to use
the positions of the records in the input stream to extend the keys. A
sort that implements this last policy is said to be "stable" because it
keeps records with equivalent keys in their original order. I don't
know if it is standard terminology but I would call a topological sort
stable if it kept incomparable elements in their original order --
assuming that has meaning.

The input to a topological sort has to identify the elements of the
poset by some runtime representation: an object pointer, a number, a
string, whatever. And the data specifying the poset has to be organized
in the computer's storage somehow; there is usually some flexibility in
this arrangement. If you inspect the core of the topological sort
algorithm you will find that manipulating these abstractly irrelevant
details of the physical representation of the input will influence the
"free" choices it makes when the output is not fully determined by the
abstract input.

>> Create a comparability matrix and use it to
>> prune the permutation tree, perhaps? N^2 bits doesn't sound
>> prohibitive.

> I don't follow...

That's probably not standard terminology. Create a square matrix with
one bit per entry, indexed on rows and columns by poset element
identifiers. Mark a 1 in Mij if Vi <= Vj (say) is in the input. Then
iterate adding the transitive implications of the input constraints
until M contains an explicit representation of the partial order
relation. Details left as an exercise involving bitwise-or and looking
it up in a book.

Now if you use the standard permutation generator you can devise an
algorithm to screen its output for consistency with the ordering
relation using the matrix. But you could also pass the matrix in to the
permutation generator and it could save a lot of time by not generating
BACD... and BADC... if it knows that A<=B is in the relation.

> depth_first_search(). For what I understand, depth_first_search()
> works on a set of start vertices (which have no incoming edges), and
> for each of these the algorith dives in to the depth of the vertex's

The abstract algorithm works on a set. Any concrete realization of the
algorithm operates on representations of sets. Control the input and
you control the output :-)

> I personally think the topological_sort() algorithm begs for an
> algorithm that would allow iterating over all unique orderings. As I

My biases say that in interesting cases you don't have time to iterate
over them. But that just reveals a difference in what we find
interesting. Any set you can enumerate before the heat death of the
universe is uninteresting. ;-) In contrast, the set of "traces"
(feasible interleavings of atomic events) for a distributed reactive
system is quite interesting. The set of 4-threaded build schedules for
a software system with 15000 source files is fairly interesting.

A variation on the permutation enumerator that takes some kind of
feasibility information and uses it to prune the tree (the logical tree
of outputs that it is virtually traversing) makes sense to me. A
partial order specification would be the obvious parameter and might be
logically equivalent to any other reasonable parameterization but I'd
need to think about that. It might be feasible to design the
parameterization interface a bit more generally and admit things like
resource constraints that are hard to encode in the poset abstraction
alone. And if we're going to implement tree pruning we should provide
an interface for a figure-of-merit function so we can also prune out
feasible but unprofitable branches.

If all you want is a solution to your problem as stated, code it in
Prolog. The interactive interpreter will keep on generating sequences
as long as you tell it "no" in response to each one. The only reason to
add this to the library would be if it would be used to advantage in a
variety of C++ programs. I can think of a whole raft of good uses for a
constrained combination/permutation generator with general tree pruning.
On the other hand, a whole lot of scheduling and optimization code
already exists and it would be wise to inventory that before we start
hacking.

-steve


Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net