On Mon, Dec 14, 2015 at 1:26 PM, Niall Douglas <nialldouglas14@gmail.com> wrote:
Please correct me where I’m wrong, but there seems to be some agreement on these points.

  • CI is the proper long term goal for Boost testing.
  • The work required to achieve this is underway, but is “not small.”
  • We want quick turnaround on results across all our supported configurations (OS and compiler).
  • We want to control our own destiny (we don't want to be vulnerable to a policy change by another organization).

I don’t mean to imply that this leads to a specific policy. I just want to point out that although you aren’t necessarily on the same page, it seems like you are singing from the same songbook.

For the last point, I struggle to agree with the chosen wording. We haven't done the best job of keeping our own infrastructure maintained and up to date (e.g. Trac, which we have still failed to upgrade despite the known security vulnerabilities). In my opinion, outsourcing Boost infrastructure, and paying someone to do the updates, in exchange for some loss of "sovereignty" is a discussion very much worth having.

Niall has raised two points:
  1. This particular configuration may not be optimal, and
  2. Purchased machines in general are not the way to go (as opposed to online CI services).

Perhaps item #2 is for the Steering Committee to decide. I think Niall has made a case, but I also think there is a case for an interim solution that can relieve some current pain.

But I really don’t think that any of us want to see the committee hashing out issue #1. There isn’t any reason to believe that the committee members have any particular insight into this area (or even a complete understanding of the requirements). I know that I personally have nothing to contribute here and no basis to make an intelligent judgment.

What I would like to see is a hardware configuration that you could all get behind.

If you read back through the discussion, I think there was universal agreement that a $2000 machine would be enormously more valuable to us long term than a $550 machine. If $2000-2500 is provisionally okayed, I am extremely sure we can all agree on a suitable spec; indeed, I'd be glad to discuss it here or off-list.

I don't actually agree with this. I think that four of the $550 machines would be *far* more efficient (in terms of tests per dollar) than one $2000+ machine. Previously you made some comments about needing ECC RAM because of corruption issues; I haven't run into that myself, and consumer-level gear will generally get you much better bang for the buck. We don't have stringent uptime requirements, so investing in server-grade hardware doesn't seem to make sense for our use case.

As for my previous statement, I should have written "the highest-value way to get more of the existing testing"; there would obviously be cheaper solutions, but they wouldn't add much. As I mentioned above, at today's hardware prices I think a 4-core/8-thread CPU + 16 GB RAM + 240 GB SSD (one with ~500 MB/s writes) is the sweet spot for running our existing regression set, running a single configuration at a time across all the cores to minimize the latency of the results. If we want to scale, then I believe multiple boxes along these lines are a better value than a beefier box shared across VMs or Docker containers.

As I indicated, this is for the current tests. The future per-commit tests would be different.

There's also the issue of space and noise in my house. I could easily fit a few mini-ITX machines next to my desk, but I don't have anywhere to put a big Supermicro server. Additionally, having multiple machines makes the cooling load less dense and therefore quieter. Going with a colo would cost quite a bit more, but would be a much more professional solution.

Niall, I know that you question that direction, but you also seem to have a lot of valuable experience on this issue.

Tom, you indicated you thought your proposal was “the cheapest way to get more of the existing testing.”

Here is the priority I’d ask you to consider: rather than the least use of Boost money, aim for the best use of Boost money. We are looking for reliability and capacity. (I think we all know that chasing performance will put us in a poor place on the cost curve and may not be as reliable. Reliability is important. I think this is your first request, Tom, so I’ll share my experience: the committee generally says “yes,” but the process is tortuous. We don’t want to make another request if we can avoid it.)

Tom, Niall, Anthony, you all had specific ideas about the appropriate configuration. Can you converge on something?

I want to thank you all for your help with the big job of testing. It sounds like we are moving in a good direction, and I applaud and support you in that. I’d also like to see the committee discuss any proposals that you have for interim solutions, but I want any such proposal to be as solid and uncontentious as possible.

Overall, I would like to revise my proposal to the following five points, which I hope incorporate the ideas of the others who have contributed to this thread:

1. Purchase three machines matching my original configuration (approx. $550 each). I will initially purchase one, get it up and running, and verify that it performs well, then purchase the other two. I will supply electricity, network access, and my time for setup and maintenance. These machines would run the current regression tests using the same scripts I currently use to control other runs on my machines, in either Windows on KVM or Linux in Docker (see the container sketch after this list).

2. Provide Niall with a server-class machine (approx. $2500), or a similar amount of Amazon AWS credit for spot instances, which he can use to perform advanced testing (valgrind, fuzzing, etc.) on libraries that wish to take advantage of it (assuming he wants this and has the time to make use of it).

3. Purchase a Travis CI plan with 5 concurrent jobs for $3000/yr (we should see if we can get an open-source discount) and an AppVeyor plan with 4 concurrent jobs for $795/yr (which already includes a 50% open-source discount), and encourage library authors to set up their libraries to take advantage of these resources (a rough .travis.yml sketch follows this list).

4. Investigate, as a community, our own Travis/AppVeyor/Jenkins setup, so that in the long term we can phase out our reliance on those service providers and offer a similar service across our extremely diverse configuration sets (we currently have Windows, Mac, Linux, FreeBSD, IBM, Sun, and Android(!), with Visual Studio 8.0-14.0, GCC 4.6-5.2, Clang 3.0-3.7, Intel, and IBM compilers, and lots of different standards and compiler flags).

5. Make inquiries (through the Conservancy?) to see if any of the big hosting companies (Amazon, Microsoft, Google, Rackspace, etc.) or development companies (Apple, Microsoft, Sun, IBM, etc.) would be interested in hosting our hardware for running these tests. (I'd especially look at MS; they've been active in the community recently, trying to make Visual Studio C++ development better, so they might be interested in making sure their compilers at least get lots of testing... and they have the Azure service, which would make it easy.)
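
To make the Docker option in point 1 concrete, here is a minimal sketch of what a single containerized regression runner could look like. Treat it as an assumption-laden illustration rather than a finished setup: the image name, volume layout, runner ID, and toolset are placeholders, and the run.py options are from memory of the regression tooling's usual instructions.

    # docker-compose.yml -- hypothetical single regression runner
    # (the image is assumed to have Python and the toolchains preinstalled)
    regression:
      image: boost-regression-env        # placeholder image name
      volumes:
        # host checkout of the Boost regression scripts, mounted into the container
        - ./boost-regression:/regression
      working_dir: /regression
      # one configuration at a time, using all cores, to minimize result latency
      command: >
        python run.py
        --runner=my-runner-id
        --toolsets=gcc-4.9
        --tag=develop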
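
Likewise, for point 3, here is the rough shape of a .travis.yml an individual library might start from. The library name ("mylib"), branch, and b2 invocation are placeholders for each maintainer to adapt, and a real setup would also enumerate specific compiler versions in a matrix, along the lines of the configurations listed in point 4.

    # .travis.yml -- hypothetical starting point for one Boost library
    language: cpp
    os: linux
    compiler:
      - gcc
      - clang
    install:
      # test against the develop superproject, with this library dropped in
      # ('mylib' is a placeholder for the actual library name)
      - git clone -b develop --depth 1 https://github.com/boostorg/boost.git boost-root
      - cd boost-root
      - git submodule update --init --depth 1
      - rm -rf libs/mylib && cp -r $TRAVIS_BUILD_DIR libs/mylib
      - ./bootstrap.sh
      - ./b2 headers
    script:
      # run just this library's tests with the toolset Travis selected
      - ./b2 libs/mylib/test toolset=$CC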

Is this something we can agree to from the technical/operational side?
Tom