From: David Abrahams (dave_at_[hidden])
Date: 2005-02-03 10:13:20


"Jeff Garland" <jeff_at_[hidden]> writes:

> On Wed, 02 Feb 2005 17:20:41 -0500, David Abrahams wrote
>
>> > How come we didn't test it before changing? Seems like we have the
>> > same level of process in this new selection...
>>
>> No, the process is being discussed here first in part so we can head
>> off problems.
>
> Great. From the tone of your first response it didn't seem like it,

Check the subject line. The question mark there was intentional.

> since all of my suggestions were simply dismissed out of hand (a bad
> test or FUD) -- or at least that's how it seemed to me.

IIRC you made only one suggestion, which after some consideration I
thought was not a good test. The rest were just fears with no
proposal for action, again IIRC.

> But I'm over it now...

Good :^)

> My whole point in bringing up the mailing list move issues was as
> an analogy to the current CVS situation. While SVN+OSL might appear
> to be the superior solution at first blush, it might have
> significant unknown downsides and it might not even solve the
> problem(s). We should do whatever we can practically do to figure
> out if there are downsides because, as we both agree, there isn't
> any going back once we jump.

Well, we can certainly do some serious testing without jumping. And,
in fact, there is going back. It would be very painful, but we could
generate a script from the SVN repository and ask the SF guys to run
it. But I'd rather avoid that.

>> >> > Maybe we need to have a coordinated test where we all agree to
>> >> > pull down the whole repository at a certain time -- just to make
>> >> > sure the machines/network/software is really up to the task?
>> >>
>> >> That's a bad test IMO. If we all hit CVS at the same time, how good
>> >> do you think the response time will be?
>> >
>> > I don't know what the response would be with CVS and it doesn't
>> > matter.
>>
>> Sure it does, if you want to prove that SVN will be worse.
>
> I'm not trying to prove that SVN will be worse. First of all, the
> proposal at hand changes 2 variables: the hosting provider and the
> configuration management system. The stated goal is to improve the
> performance and scalability of the boost repository. So SVN might be
> really scalable, but the machine/network that hosts it might not be
> up to the task

From what I've been told, OSL has a fast connection, and loads of money
for equipment and good support, which is why I feel confident about
that part of it. Andy, please correct me if I'm wrong.

> -- or vice versa.

?? The task might not be up to the machine/network?

> So my proposed test was just brainstorming -- I was trying to think
> of ways that we could validate that OSL+SVN combo will be
> performant.

I think deductive reasoning is probably the best approach on this one,
because I don't think we're going to be able to create a sufficiently
wide range of realistic usage patterns. Not least because we don't
know how many people are using the CVS, and how.

> Frankly, my guess is that SVN will be fine, but never having hosted
> an SVN server I have no idea what kind of machine / network
> connection it takes to do it well.

Again, deductive reasoning is probably a good tool here: find out what
gets sent across the network; do some research into the relative
demands of CVS and SVN on resources.

> I have no idea what kind of machine OSL will host on, what sort of
> network they have, and what other apps will be running on the
> server.

I think those details are easy to discover, but even more so I think
OSL will do what's necessary to be sure the resources are sufficient.

> We basically don't have any idea of the current average and
> peak load on the boost CVS repository. Basically it's a shot in the
> dark from my view -- I have no idea if making this move will be
> better or worse.
>
> Of course just setting up an experimental boost repository in SVN at
> OSL won't put any load on the SVN server. Since it will just be for
> experimentation there won't be users doing updates, developers will
> just be tinkering, etc. It won't validate that the goal of the
> effort will be met -- only serve as a way to get developers familiar
> with SVN. So that's where the idea of intentionally loading the
> server comes in. It ensures that we at least give it some sort of
> tryout...

I like it in principle, but without the information you mentioned
earlier about the CVS repository loads, how will we draw any
conclusions from the results?

>> > My point was that CVS/Sourceforge has been mostly handling whatever
>> > the concurrent load is today. Yes, there's a stuck lock, slow
>> > updates
>>
>> and commits, and logs, ...
>
> As I said, I see more than adequate CVS performance and a couple of
> others seem reasonably satisfied as well. I don't see constant
> complaints on the list about CVS. So from my view there's no rush
> for this. But I recognize that others interacting with CVS
> might have a different view.
>
>> The slowness we're seeing in CVS would be there regardless of how
>> much server support we had.
>> ...slight rearrangement of points...
>> Much of the slowness when you update occurs entirely on your local machine.
>
> I have a smoking machine on my desk -- I assure you it is faster
> than the CVS server ;-) Seriously though, this assertion is
> irrelevant and unprovable.

Not completely. As the first step in updating, CVS churns through all
the files in your working copy.

> The performance is impacted by lots of factors: server load, network
> speeds, etc.
>
>> CVS just wasn't designed to scale to projects the size of Boost.
>
> In terms of files, simultaneous users,

Yes.

> what's the 'unit of size'? If you mean number of files, well, I'll
> just disagree b/c I've worked on bigger projects effectively with
> CVS (closed fast LAN, dedicated fast server, fast clients).

That doesn't mean it was designed for the purpose.

> But probably more relevant, there are current open source projects
> much larger than Boost that use CVS (try KDE for example). Maybe
> they just don't care about performance -- I don't know. But anyway,
> we need to stop focusing on what CVS is supposedly designed to do or
> not do and get back to why we want to convert, what the benefits
> will be, and how we make sure we get the benefits without stepping
> on a landmine.

Well, maybe we don't want to convert. If I'm the only one with
complaints, it probably isn't a good idea.

>> > If we knew what the average and peak load on the repository
>> > was we could easily test whether subversion/OSL would meet our
>> > needs. But since we don't know I was just suggesting a test that
>> > would give us information. If 10 people can't operate on the
>> > repository at the same time then I'd be worried about converting.
>>
>> When you say "we all agree to pull down the whole repository at a
>> certain time" I didn't think you were talking about only 10 people,
>> doing the typical things that they might do.
>
> Sorry, I was a little loose with my test plan definition -- 'all'
> wasn't a wise word. As I said above, just putting the repository
> out there doesn't test the performance -- so that's what I was
> trying to define. A test to make sure the new performance will be
> good. 10 users doing checkout against the repository would probably
> be a good test. 100 users would be a stress test.

Okay, that sounds reasonable. Although I doubt cvs checkout is ever
done by more than a few people at once. More likely update.
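
A rough illustration of the 10-user test discussed above, in Python. It
is only a sketch under stated assumptions, not something proposed in the
thread: the helper names (timed_run, simulate_clients, summarize,
checkout_command) are hypothetical, and it assumes an "svn" command-line
client is installed on the machine running it. It starts N client
commands at once and reports failures and wall-clock times.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(command):
    """Run one command (an argv list); return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.perf_counter() - start


def simulate_clients(make_command, clients):
    """Launch `clients` commands concurrently.

    make_command(i) must return the argv list for the i-th simulated user.
    """
    commands = [make_command(i) for i in range(clients)]
    with ThreadPoolExecutor(max_workers=clients) as pool:
        return list(pool.map(timed_run, commands))


def summarize(results):
    """Reduce (exit_code, seconds) pairs to a few headline numbers."""
    times = sorted(elapsed for _, elapsed in results)
    return {
        "clients": len(results),
        "failures": sum(1 for code, _ in results if code != 0),
        "middle_s": times[len(times) // 2],  # upper median of elapsed times
        "worst_s": times[-1],
    }


def checkout_command(repo_url, index):
    """Hypothetical per-user command: a fresh checkout into its own directory."""
    return ["svn", "checkout", "--quiet", repo_url, f"wc-{index}"]
```

With a placeholder URL such as https://svn.example.org/boost/trunk (an
assumption, not a real address), a 10-user run would look like
summarize(simulate_clients(lambda i: checkout_command(url, i), 10)).
Swapping checkout_command for one that runs "svn update" in existing
working copies would be closer to the typical usage mentioned above.
Without comparable numbers for the current CVS server, the output is
only a sanity check, not proof of an improvement.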

>> > -- changing all the boost web pages that talk about CVS
>>
>> Premature, IMO.
>
> Agree.
>
>> > -- telling all the users to download and learn subversion clients
>>
>> Might be a good idea anyway.
>
> Agree.
>
>> > -- creating SSL accounts (or whatever) at OSL for all boost
>> > developers
>>
>> Premature, IMO.
>
> How are we going to test the repository at OSL without this? Won't
> SSL be required for development access?

Probably. But we don't need access for "all" boost developers, unless
by "all" you mean ten of us again.

> Running checkouts thru SSL is likely to be slower and put more load
> on the server than without encrypting.

I'm not sure whether encryption is actually used for anything other
than authentication, but your point is taken nonetheless.

>> > -- set up anonymous access to the repository
>>
>> Premature, IMO
>
> Needed for performance testing.

You don't think we can do the tests with 10 developers' https://
access? Not that I object to setting up anonymous access, or
anything.

>> > -- converting the repository
>>
>> Might be a good idea anyway.
>
> A foundational test in my mind.

Yep.

>> > -- getting regression testing switched over
>
> I brought this list up to enumerate some of the costs of converting.
> There is non-trivial cost and I would hope there would be enough
> benefit to justify the costs. Personally I'm not sure I see it, but
> I'm willing to be persuaded.

I don't have much to say other than that some of us spend a lot
of time waiting for CVS, there are constant stale locks, our
occasional file moves are painful, and the anonymous pserver image
lags the real one by an hour or more. Oh, and SF's connection for
getting the CVS tarball keeps dropping, so it's hard for anyone to do
automated backups.

But if you aren't having any trouble, there may not be much anyone can
do to persuade.

-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com
