Boost logo

Boost :

From: Calum Grant (calum_at_[hidden])
Date: 2005-10-07 14:13:52


Arkadiy Vertleyb wrote
> "Arkadiy Vertleyb" <vertleyb_at_[hidden]> wrote
>
> > Another example -- you have table of cities, and you need
> to get all
> > the pairs, such as the distance between them is less than certain
> > number. How would one solve the problem with the conventional SQL
> > approach? This is a self-join. But how to make this query fast?
> >
> > With RTL approach, we could create an index on this self-join.
>
> Here is a simple example that illustrates this. The attached
> program does the following:
>
> 1) creates table of one column;
> 2) self-joins the table
> 3) selects records where columns are equal.
>
> This by itself doesn't make any sence, but it closely
> approximates the task of finding close cities, stars,
> billiard balls, monsters and soldiers, etc.

Here is the equivalent in RML. Forget 3000 rows, what about adding
1000000 rows, then adding another 1000000 to it? What time does the
equivalent RTL take?

I don't think self-joins come up that often in real life however.

Regards, Calum

#include <ctime>
#include <relational.hpp>
using namespace relational;

RM_DEFINE_ROW_1(MyRow, unique<int>, x);

int main(int argc, char*argv[])
{
        int total=0;
        clock_t t0 = clock();
        {
                table<MyRow> my_table;

                const int N1=1000000, N2=1000000;

                for(int i=0; i<N1; ++i)
                        my_table.insert(MyRow(i));

                std::cout << select( (my_table, my_table), col<0,0>() ==
col<1,0>() ).size() << std::endl;

                // Print only 10 results
                output_results(limit(select( (my_table, my_table),
col<0,0>() == col<1,0>() ), 10));

                for(int i=1; i<=N2; ++i)
                        my_table.insert(MyRow(-i));

                std::cout << select( (my_table, my_table), col<0,0>() ==
col<1,0>() ).size() << std::endl;
        }
        clock_t t1 = clock();

        std::cout << "That took just " << (1000*(t1-t0))/CLOCKS_PER_SEC
<< "ms\n";

        return 0;
}

Here is the output on my 1.7GHz laptop:

1000000
((0),(0))
((1),(1))
((2),(2))
((3),(3))
((4),(4))
((5),(5))
((6),(6))
((7),(7))
((8),(8))
((9),(9))
Total 10 results
2000000
That took just 1687ms
Press any key to continue

Arkadiy Vertleyb wrote
> "Arkadiy Vertleyb" <vertleyb_at_[hidden]> wrote
>
> > Another example -- you have table of cities, and you need
> to get all
> > the pairs, such as the distance between them is less than certain
> > number. How would one solve the problem with the conventional SQL
> > approach? This is a self-join. But how to make this query fast?
> >
> > With RTL approach, we could create an index on this self-join.
>
> Here is a simple example that illustrates this. The attached
> program does the following:
>
> 1) creates table of one column;
> 2) self-joins the table
> 3) selects records where columns are equal.
>
> This by itself doesn't make any sence, but it closely
> approximates the task of finding close cities, stars,
> billiard balls, monsters and soldiers, etc.
>
> Here is the text (Boost CVS head was used to be able to use typeof):
>
> /////////////////////
> #include <table_delta.hpp>
> #include <utils.hpp>
> #include <selection_delta.hpp>
> #include <key_index_delta.hpp>
> #include <crossproduct_delta.hpp>
> #include <rename_delta.hpp>
> #include <boost/lambda/lambda.hpp>
> #include <boost/typeof/typeof.hpp>
> #include <expression_registry.hpp>
> #include <transaction.hpp>
> #include <cstdlib>
>
> using namespace boost;
> using namespace boost::lambda;
>
> BOOST_RTL_DEFINE_COLUMN(int, c1);
>
> struct my_info : rel::table_info<
> mpl::vector<c1>
> >{};
>
> typedef rel::table<my_info> my_table;
> typedef my_table::value_type my_tuple;
>
> struct a;
> typedef rel::alias<c1, a> c2;
>
> main()
> {
> // populate table
>
> my_table t;
>
> for (int i = 0; i < 3000; ++i)
> t.insert(my_tuple(i + 1));
>
> // create and print the view
>
> BOOST_AUTO(t2, rel::auto_rename<a>(t));
>
> BOOST_AUTO(xp, rel::cross_product(t, t2));
>
> BOOST_AUTO(pred,
> _1[c1()] == _1[c2()]
> );
>
> BOOST_AUTO(sel,
> rel::selection(xp, pred)
> );
>
> std::cout << "building index" << std::endl;
>
> BOOST_AUTO(idx,
> (rel::key_index<mpl::vector<c1, c2> >(sel))
> );
>
> std::cout << "printing index" << std::endl;
>
> rel::print(idx);
>
> // update table (index doesn't get fully rebuilt)
>
> rel::expression_registry exprs;
> exprs.add(idx);
>
> rel::transaction tr;
>
> for (i = 0; i < 10; ++i)
> tr.insert(t, my_tuple(-i));
>
> tr.commit(exprs);
>
> // print the view once again
>
> std::cout << "printing updated index" << std::endl;
> rel::print(idx);
>
> return 0;
> }
> /////////////////
>
> Regards,
> Arkadiy
>
>
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/bo> ost
>


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk