
From: Rene Rivera (grafik.list_at_[hidden])
Date: 2005-12-20 21:52:01


Rene Rivera wrote:
> Peter Dimov wrote:
>>Rene Rivera wrote:

>>>Did you try to use the library?
>>>
>>>No.
>>
>>It'd probably be helpful if you create a simple throughput test and post the
>>results so that there is a clear target for Asio to match or exceed. This
>>will also validate your objections as they currently seem to be based on
>>intuition. :-)
>
> Working on it...

OK, I have now tried it. I ran into two problems: compile errors with
CW-8.3, and compile errors if any Windows headers are included before
the asio headers.

The test is a single-process, multi-thread, uni-directional UDP timing of
100,000 messages of 1K size on the local loop (127.0.0.1). The small
message size and the local loop are meant to flood the server with as many
messages as possible, as opposed to as much data as possible, since it's
the individual message handling that I'm testing. Each test cycle is run
6 times within the process, and the first and last samples are removed.
The test process is run 3 times, and the median is what I'm reporting
here. Within the process, two basically identical variants are tested. A
single sync client, which is reused, sends the messages to the current
server thread. Two types of servers are tested: an async server using
callbacks to handle the messages, and a sync server doing the reads
directly and dispatching manually. Internally these represent almost
identical code paths, except that the async server does the extra
allocations for the deferred dispatch.

[code and output attached]

I ran the same 100,000*1K*6*2*3 tests with both debug- and release-compiled
code. As can be seen from the attached output, in the best case, the release
code, there is a 5.6% "overhead" of the async case over the sync case. For
the debug code the difference is a more dramatic 25.2%.
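
For reference, the DIFF figures below appear to be the relative throughput
difference, (sync average - async average) / async average:

  release: (43492.9 - 41178) / 41178 ~ 5.6%
  debug:   (41793.1 - 33369.5) / 33369.5 ~ 25.2%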

In my continued reading and use of the code I concluded that the basic
design flaw is that the underlying *socket_service classes are not
available as a public interface. It's nice that thought and care have
gone into designing a handling and dispatch pattern that can serve many
needs. But as it's currently structured, and as was pointed out and/or
alluded to in the sync/demuxer threads, not having access to use or
customize this lower level forces a single use case. Another restriction
this causes is that the various OS handling models are not available as
individual handling types. Hence it is not possible, for example, to
choose at runtime whether one should use epoll, kqueue, or select.
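
Purely as a hypothetical sketch (this is not asio's actual interface, and
the names below are made up), the kind of lower-level access I mean would
make the demultiplexing backend an ordinary runtime choice along these
lines:

#include <memory>
#include <string>

// Hypothetical illustration only -- not asio's API. The point is that the
// demultiplexing strategy becomes a polymorphic service chosen at runtime.
struct socket_service
{
        virtual ~socket_service() {}
        virtual void run() = 0; // demultiplex and dispatch ready operations
};

struct select_socket_service : socket_service { void run() { /* select() loop */ } };
struct epoll_socket_service : socket_service { void run() { /* epoll_wait() loop */ } };

std::auto_ptr<socket_service> make_socket_service(const std::string & which)
{
        if (which == "epoll")
                return std::auto_ptr<socket_service>(new epoll_socket_service);
        return std::auto_ptr<socket_service>(new select_socket_service);
}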

I should mention that just because I voted no doesn't mean I won't be
using this library myself. The performance is sufficient for my current
application. But I'm hoping that further improvements come along to make
those optimal uses possible.

Thanks to all for reading ;-)

And happy _insert_your_preferred_holiday_here_.

PS. The test was on my AMD Athlon 3200+ 2.2GHz, 1GB RAM, Win2K SP4 setup.

-- 
-- Grafik - Don't Assume Anything
-- Redshift Software, Inc. - http://redshift-software.com
-- rrivera/acm.org - grafik/redshift-software.com
-- 102708583/icq - grafikrobot/aim - Grafik/jabber.org

>>>> DEBUG
====== BEGIN OUTPUT ======
--- ASYNC...
### TIME: total = 2.97428; iterations = 100000; iteration = 2.97428e-005; iterations/second = 33621.6
### TIME: total = 3.01433; iterations = 100000; iteration = 3.01433e-005; iterations/second = 33174.8
### TIME: total = 2.98429; iterations = 100000; iteration = 2.98429e-005; iterations/second = 33508.8
### TIME: total = 3.00432; iterations = 100000; iteration = 3.00432e-005; iterations/second = 33285.4
### TIME: total = 2.98429; iterations = 100000; iteration = 2.98429e-005; iterations/second = 33508.8
### TIME: total = 3.05439; iterations = 100000; iteration = 3.05439e-005; iterations/second = 32739.7
-- ...ASYNC: average iterations/second = 33369.5
--- SYNC...
### TIME: total = 2.40346; iterations = 100000; iteration = 2.40346e-005; iterations/second = 41606.8
### TIME: total = 2.41347; iterations = 100000; iteration = 2.41347e-005; iterations/second = 41434.1
### TIME: total = 2.41347; iterations = 100000; iteration = 2.41347e-005; iterations/second = 41434.1
### TIME: total = 2.40346; iterations = 100000; iteration = 2.40346e-005; iterations/second = 41606.8
### TIME: total = 2.42348; iterations = 100000; iteration = 2.42348e-005; iterations/second = 41262.9
### TIME: total = 2.31333; iterations = 100000; iteration = 2.31333e-005; iterations/second = 43227.8
### TIME: total = 0; iterations = 100000; iteration = 0; iterations/second = 1.#INF
-- ...SYNC: average iterations/second = 41793.1
 
EXIT STATUS: 1
====== END OUTPUT ======
>>>> DIFF = 25.2%

>>>> RELEASE
====== BEGIN OUTPUT ======
--- ASYNC...
### TIME: total = 2.46354; iterations = 100000; iteration = 2.46354e-005; iterations/second = 40592
### TIME: total = 2.42348; iterations = 100000; iteration = 2.42348e-005; iterations/second = 41262.9
### TIME: total = 2.42348; iterations = 100000; iteration = 2.42348e-005; iterations/second = 41262.9
### TIME: total = 2.4335; iterations = 100000; iteration = 2.4335e-005; iterations/second = 41093.1
### TIME: total = 2.4335; iterations = 100000; iteration = 2.4335e-005; iterations/second = 41093.1
### TIME: total = 2.37341; iterations = 100000; iteration = 2.37341e-005; iterations/second = 42133.4
-- ...ASYNC: average iterations/second = 41178
--- SYNC...
### TIME: total = 2.2933; iterations = 100000; iteration = 2.2933e-005; iterations/second = 43605.3
### TIME: total = 2.27327; iterations = 100000; iteration = 2.27327e-005; iterations/second = 43989.5
### TIME: total = 2.30331; iterations = 100000; iteration = 2.30331e-005; iterations/second = 43415.7
### TIME: total = 2.31333; iterations = 100000; iteration = 2.31333e-005; iterations/second = 43227.8
### TIME: total = 2.30331; iterations = 100000; iteration = 2.30331e-005; iterations/second = 43415.7
### TIME: total = 2.30331; iterations = 100000; iteration = 2.30331e-005; iterations/second = 43415.7
### TIME: total = 0; iterations = 100000; iteration = 0; iterations/second = 1.#INF
-- ...SYNC: average iterations/second = 43492.9
 
EXIT STATUS: 1
====== END OUTPUT ======
>>>> DIFF = 5.6%


#include <cstdio>
#include <iostream>
#include <vector>
#include <valarray>
#include <boost/shared_ptr.hpp>
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <boost/asio.hpp>

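// The /**/ .. /*/ .. /**/ comments select between the two candidate values;
// as written, message_size is 1K and message_iterations is 100000.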
const std::size_t message_size = /**/ 1*1024 /*/ 64*1024 /**/;
const int port = 9999;
const std::size_t message_iterations = /** 100 /*/ 100000 /**/;

namespace asio = ::boost::asio;

namespace detail_test
{
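        // Scope timer: on destruction it records the elapsed wall-clock time
        // for n iterations and appends iterations/second to the results vector.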
        struct timed_scope
        {
                boost::xtime t0;
                boost::xtime t1;
                std::size_t n;
                
                std::vector<double> & results;
                
                inline timed_scope(std::vector<double> & r, std::size_t iterations = 1)
                        : n(iterations), results(r)
                {
                        boost::xtime_get(&t0,boost::TIME_UTC);
                }
                
                inline ~timed_scope()
                {
                        boost::xtime_get(&t1,boost::TIME_UTC);
                        double t = double(t1.sec)+double(t1.nsec)/double(1000000000);
                        t -= double(t0.sec)+double(t0.nsec)/double(1000000000);
                        std::cerr << "### TIME"
                                << ": total = " << t
                                << "; iterations = " << n
                                << "; iteration = " << t/double(n)
                                << "; iterations/second = " << double(n)/t
                                << '\n';
                        results.push_back(double(n)/t);
                }
        };
        
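        // Writes the average of the collected results, dropping the first and
        // last samples.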
        template <typename out>
        out & result_summary(out & o, const std::vector<double> & results)
        {
                std::valarray<double> r(&results[1],results.size()-2);
                o << r.sum()/r.size();
                return o;
        }
        
        void sleep_for_secs(int n)
        {
                boost::xtime t;
                boost::xtime_get(&t,boost::TIME_UTC);
                t.sec += n;
                boost::thread::sleep(t);
        }
}

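// Async server: arms an async_receive in the constructor, re-arms it from the
// completion handler, and runs the demuxer on a separate thread.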
struct async_server
{
        char io_buffer[message_size];
        boost::asio::demuxer demuxer;
        boost::asio::datagram_socket socket;
        std::auto_ptr<detail_test::timed_scope> timer;
        std::size_t message_count;
        std::size_t message_recount;
        std::auto_ptr<boost::thread> runner;
        std::vector<double> results;
        
        async_server()
                : socket(this->demuxer,boost::asio::ipv4::udp::endpoint(port))
                , message_count(0)
                , message_recount(0)
        {
                socket.async_receive(
                        boost::asio::buffer(io_buffer, message_size),
                        0,
                        boost::bind(&async_server::handle_receive_from,
                                this,
                                boost::asio::placeholders::error,
                                boost::asio::placeholders::bytes_transferred ) );
        }
        
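        // Completion handler: counts the message, rolls the timer over after
        // each full batch, and posts the next async_receive.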
        void handle_receive_from(
                const boost::asio::error& error,
                size_t /*bytes_transferred*/)
        {
                if (!error || error == boost::asio::error::message_size)
                {
                        if (++message_count == this->timer->n)
                        {
                                this->clear_timer();
                                message_count = 0;
                                this->reset_timer(message_recount);
                        }
                        socket.async_receive(
                                boost::asio::buffer(io_buffer, message_size),
                                0,
                                boost::bind(&async_server::handle_receive_from,
                                        this, boost::asio::placeholders::error,
                                        boost::asio::placeholders::bytes_transferred ) );
                }
        }
        
        void start()
        {
                this->runner.reset(new boost::thread(
                        boost::bind(&boost::asio::demuxer::run,&this->demuxer) ));
        }
        
        void stop()
        {
                this->demuxer.interrupt();
                this->runner->join();
                this->clear_timer();
        }
        
        void reset_timer(std::size_t i = 1)
        {
                this->message_recount = i;
                this->timer.reset(new detail_test::timed_scope(this->results,i));
        }
        
        void clear_timer()
        {
                this->timer.reset();
        }
};

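// Sync server: a dedicated thread blocks in receive() and calls an equivalent
// handler directly, with no deferred dispatch.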
struct sync_server
{
        char io_buffer[message_size];
        boost::asio::demuxer demuxer;
        boost::asio::datagram_socket socket;
        std::auto_ptr<detail_test::timed_scope> timer;
        std::size_t message_count;
        std::size_t message_recount;
        std::auto_ptr<boost::thread> runner;
        std::vector<double> results;
        volatile bool running;
        
        sync_server()
                : socket(this->demuxer,boost::asio::ipv4::udp::endpoint(port))
                , message_count(0)
                , message_recount(0)
                , running(false)
        {
        }
        
        void handle_receive_from(
                const boost::asio::error& error,
                size_t /*bytes_transferred*/)
        {
                if (!error || error == boost::asio::error::message_size)
                {
                        if (++message_count == this->timer->n)
                        {
                                this->clear_timer();
                                message_count = 0;
                                this->reset_timer(message_recount);
                        }
                }
        }
        
        void start()
        {
                this->runner.reset(new boost::thread(
                        boost::bind(&sync_server::run,this) ));
        }
        
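        // Blocking receive loop; stop() closes the socket so the pending
        // receive() returns an error and the loop exits.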
        void run()
        {
                this->running = true;
                while (this->running)
                {
                        boost::asio::error error;
                        std::size_t bytes_transferred =
                                socket.receive(
                                        boost::asio::buffer(io_buffer, message_size),
                                        0,
                                        boost::asio::assign_error(error));
                        this->handle_receive_from(error,bytes_transferred);
                        if (error && error != boost::asio::error::message_size) break;
                }
                this->running = false;
        }
        
        void stop()
        {
                this->running = false;
                this->socket.close();
                this->runner->join();
                this->clear_timer();
        }
        
        void reset_timer(std::size_t i = 1)
        {
                this->message_recount = i;
                this->timer.reset(new detail_test::timed_scope(this->results,i));
        }
        
        void clear_timer()
        {
                this->timer.reset();
        }
};

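// Synchronous UDP client, reused for both servers; sends fixed-size messages
// to the loopback endpoint.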
struct sync_client
{
        char io_buffer[message_size];
        boost::asio::demuxer demuxer;
        boost::asio::ipv4::host_resolver host_resolver;
        boost::asio::ipv4::host host;
        boost::asio::ipv4::udp::endpoint receiver_endpoint;
        boost::asio::datagram_socket socket;
        
        sync_client()
                : host_resolver(this->demuxer)
                , receiver_endpoint(port)
                , socket(this->demuxer,boost::asio::ipv4::udp::endpoint(0))
        {
                host_resolver.get_host_by_name(host, "127.0.0.1");
                receiver_endpoint.address(host.address(0));
        }
        
        void send()
        {
                socket.send_to(
                        boost::asio::buffer(io_buffer, message_size),
                        0, receiver_endpoint);
        }
};

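// Test driver: floods the async server and then the sync server with
// message_iterations*(1+4+1) messages each; the summary drops the first and
// last timed samples.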
int main()
{
        sync_client c0;
        {
                async_server s0;
                s0.start();
                detail_test::sleep_for_secs(2);
                std::cerr << "--- ASYNC...\n";
                s0.reset_timer(message_iterations);
                for (std::size_t m = 0; m < message_iterations*(1+4+1); ++m)
                {
                        c0.send();
                }
                s0.stop();
                detail_test::result_summary(
                        std::cerr << "-- ...ASYNC: average iterations/second = ",
                        s0.results) << "\n";
        }
        detail_test::sleep_for_secs(2);
        {
                sync_server s0;
                s0.start();
                detail_test::sleep_for_secs(2);
                std::cerr << "--- SYNC...\n";
                s0.reset_timer(message_iterations);
                for (std::size_t m = 0; m < message_iterations*(1+4+1); ++m)
                {
                        c0.send();
                }
                s0.stop();
                detail_test::result_summary(
                        std::cerr << "-- ...SYNC: average iterations/second = ",
                        s0.results) << "\n";
        }
        return 1;
}

