|
Boost Users : |
Subject: Re: [Boost-users] [Graph] Stable topological_sort? And topological_sort woes in general.
From: Dominique Devienne (ddevienne_at_[hidden])
Date: 2013-09-30 08:41:41
On Fri, Sep 27, 2013 at 9:54 PM, Jeremiah Willcock <jewillco_at_[hidden]>wrote:
> On Fri, 27 Sep 2013, Dominique Devienne wrote:
>
>> For my own education, why "singleton" vertices and duplicate edges
>> make topological_sort break?
>>
>
> What behavior do you get from them? What is broken in those two cases
If I comment out this code:
// Boost.Graph does not like duplicate edges, so remove them.
std::sort(edges.begin(), edges.end());
std::vector<IndexPair>::iterator new_edges_end =
std::unique(edges.begin(), edges.end());
edges.erase(new_edges_end, edges.end());
then I get
Running 46 test cases...
unknown location(0): fatal error in "test_get_schema_tables_dup_edges_bug":
C:\Program Files (x86)\Microsoft Visual Studio 10
.0\VC\include\vector(932) : Assertion failed: vector subscript out of range
..\..\..\..\..\..\..\src\tests\core\utils\Relational\main.cpp(2105): last
checkpoint
../../../../../../../src/lib/pdgm/core/utils/Relational.cpp(3243): fatal
error in "test_deterministic": Assertion failed (0 <
= vrtx_index && vrtx_index < count), in function class std::vector<class
pdgm::core::relational::Table,class std::allocator<c
lass pdgm::core::relational::Table> > __cdecl
pdgm::core::relational::PhysicalSchema::tables(bool) const
*** 2 failures detected in test suite "Master Test Suite"
*** failure detail:
(1) :
pdgm_core_utils_RelationalTests/test_get_schema_tables_dup_edges_bug
failed. C:\Program Files (x86)\Microsoft Visua
l Studio 10.0\VC\include\vector(932) : Assertion failed: vector subscript
out of range
(2) : pdgm_core_utils_RelationalTests/test_deterministic failed.
If I comment out the if (info.isGraphVertex()) test in the code below (and
do the code inside unconditionally):
// Boost.Graph does not like "holes" (missing vertex indices),
// so assign vertex indices only to tables participating in graph to
sort,
// and recording mapping back to TopoSortInfo given the vertex index.
std::vector<TopoSortInfo*> vrtxIndex2Info;
BOOST_REVERSE_FOREACH (TopoSortInfoMap::value_type& entry, infoMap) {
TopoSortInfo& info = entry.second;
if (info.isGraphVertex()) {
info.vertex_index_ = vrtxIndex2Info.size();
vrtxIndex2Info.push_back(&info);
}
}
const size_t vrtx_count = vrtxIndex2Info.size();
with
bool isGraphVertex() const { return indegree_ + outdegree_ > 0; }
I get plenty of failures (28). Many of them is simply the case where all
tables are standalone, so I end up with no edges... Since edges is the only
input to topo-sort, it cannot return those nodes I now realize. Which
forces me to preprocess the input to topo-sort.
I also had to
// ignore self-referential FKs, which do not influence
topo-sort,
// (and seem to confuse Boost-Graph too...)
if (cnstr.isFK() && (cnstr.parentTableName() != table.name())) {
Since the graph would not be a DAG then.
In the end, by using topological_sort(), it almost feel like I've written
more code to fit the input it needs, than if I had used that simple algo
that iterates on 0-outdegree nodes (the roots of the graph), remove those
as dependencies to other nodes (and update their indegree/outdegree) and
goes on until all nodes are consumed (or there's a cycle).
> But what brings me here today is to ask about a "stable" topo-sort. When
>> I have several vertices which are all
>> "unrelated" and thus sort at the same "level", topological_sort does not
>> preserve the relative order of those
>> "sibling" vertices, and I'd like something fully deterministic that I can
>> count on.
>>
>
> It is probably deterministic (looking at the code), assuming your graph
> data structure preserves edge order.
Yes it does.
> What are my options? Does BGL provide something like this?
>>
>> As a concrete example, given tables A, B, C, with A depending on both B,
>> C, topological_sort gives me the C, B, A
>> order (or adding the "levels" (C, B), A), while I'd like B, C, A, i.e.
>> preserving the relative order of B, C.
>>
>
> Is the order of outgoing edges always backwards from what you want (I
> suspect it is)? If so, reversing the order when putting the edges into the
> graph would be an easy fix to the issue.
It appears that way. So I've simply switched some of my loops to
using BOOST_REVERSE_FOREACH to have them alpha-sorted. Thanks.
> Thanks for any insight. --DD
>>
>> PS: What about getting a topo-sort, but which also gives you the
>> "levels", so that you can schedule those
>> independent tasks (on each level) in parallel for example? Does BGL
>> provide something like this? I've done it in the
>> past with the help of the Sedgewick book, but I see a single topo-sort in
>> BGL.
>>
>
> The right way to do that, most likely, is to use a proper multi-processor
> scheduling algorithm (not straight topological sort, which only works
> ideally for one processor) to assign tables to processors. Work-stealing
> is likely to behave well, also, is simple, and can handle the case where
> task times are unpredictable.
The algo I describe above "feels" like work-stealing in a way, but I'm no
expert.
Thank you for your input. I don't think there's anything wrong
with topological_sort(), it's just that it doesn't fit my use case that
much after all, that's all.
Thanks, --DD
Boost-users list run by williamkempf at hotmail.com, kalb at libertysoft.com, bjorn.karlsson at readsoft.com, gregod at cs.rpi.edu, wekempf at cox.net