Subject: Re: [boost] date_time -> serialization (Was: spirtit -> serialization)
From: Julian Gonggrijp (j.gonggrijp_at_[hidden])
Date: 2014-06-16 15:02:41


Andrey Semashev wrote:

> On Monday 16 June 2014 11:42:58 Julian Gonggrijp wrote:
>> Where do you store the metadata if not within the module? I like the
>> idea of taking automatically generated data out of the versioning
>> system, but it should be minimally invasive.
>
> I'm specifically not restricting this part, other than that metadata has to be
> available without downloading the submodule. As a simple example, an ftp
> server should be enough to set up a Boost distribution repository. The
> metadata is stored in a (compressed) text file or several files (better in one
> file, though, to speed up downloading). The tool is able to download this
> file and build the dependency graph upon user's request before installing
> anything. If you feel git or another VCS suits better for this metadata, you
> can use it instead, but I don't see much value in version controlling for this
> data, and VCSs seem to add quite some overhead.

There is a very obvious value in version control: dependencies may
change from one Boost commit to the next. The dependency handler
should work not only for end users, but also for maintainers and
testers who check out any point in Git history.

Given that dependency information is directly tied to a specific
commit, I think storing the information within the commit isn't such a
strange idea. Maybe the full information shouldn't be exposed to the
user, but I don't think taking the file completely out of the
repository is the right approach. Perhaps it could be stored in a git
note [1].

It is friendly to tell end users in advance what dependencies will be
installed, but that can be solved by other means. A very simple
solution would be to list the dependencies on the Boost website. A
slightly more advanced solution would be to have the handler download
only the dependency file associated with the release tag using git
archive [2], before cloning the entire module (that might not work
with git notes, though). For releases, the dependency information
could also simply be aggregated in the superproject archive.
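
To make the git archive idea concrete, here is a rough sketch of how a handler might request a single file at a release tag without cloning the module. The file name dependencies.txt, the repository URL and the tag are all made up for illustration, and not every server accepts remote archive requests, so take it as an assumption-laden outline rather than a working design.

```python
import io
import subprocess
import tarfile


def archive_command(repo_url, tag, dep_file="dependencies.txt"):
    """Build a `git archive` call that requests one file at one tag.

    `dep_file` is a hypothetical file name; nothing in Boost defines it.
    """
    # --remote asks the server to produce the archive, so nothing is cloned.
    return ["git", "archive", "--remote=" + repo_url, tag, dep_file]


def read_member(tar_bytes, name):
    """Return the text of one file stored in an in-memory tar archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes)) as tar:
        return tar.extractfile(name).read().decode("utf-8")


def fetch_dependency_file(repo_url, tag, dep_file="dependencies.txt"):
    """Run git and return the contents of the dependency file at `tag`."""
    # git archive writes a tar stream to stdout by default.
    result = subprocess.run(
        archive_command(repo_url, tag, dep_file),
        check=True,
        capture_output=True,
    )
    return read_member(result.stdout, dep_file)
```

A handler could call fetch_dependency_file("git://example.org/date_time.git", "boost-1.57.0") first, show the user what would be installed, and only then clone.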

The advantage of just storing a plain file in the module directory is
that it certainly works, even if you download an archive without git
history, and without a need to set up a new FTP server or other web
service. I would prefer to start there and investigate prettier
solutions later.
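
As a minimal sketch of what "a plain file in the module directory" could look like in practice: the "module: dep dep" line format below is purely an assumption of mine, as are the function names and the example dependencies. The point is only that such a file is trivial to read and to expand into the full set of modules to fetch.

```python
def parse_dependencies(text):
    """Parse lines of the (hypothetical) form 'module: dep1 dep2'."""
    graph = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # allow '#' comments
        if not line:
            continue
        module, _, deps = line.partition(":")
        graph[module.strip()] = deps.split()
    return graph


def required_modules(graph, requested):
    """Return the requested module plus everything it depends on."""
    seen = set()
    stack = [requested]
    while stack:
        module = stack.pop()
        if module in seen:
            continue  # already handled; also guards against cycles
        seen.add(module)
        stack.extend(graph.get(module, []))
    return sorted(seen)


# Illustrative content only, not the real Boost dependency data.
example = """
date_time: serialization
serialization: config
config:
"""
print(required_modules(parse_dependencies(example), "date_time"))
```

The same file works whether it comes from a git checkout or from an archive without history, which is the property I care about here.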

>
> Another reason I want to separate the metadata from git repos - and I'm
> fantasizing now - is that I can see this tool being used without git at all -
> to download source packages and install Boost on the user's machine. For
> example, if I want to install a subset of Boost 1.57 on my machine, I'd like
> to be able to do that easily, without dealing with git submodules and without
> downloading the whole git history. The tool will just resolve dependencies,
> download and extract a set of archives for me.

If I'm right, a module can also be downloaded without all of the
history, for example with a shallow clone.

>
>>> The metadata should be
>>> automatically updated when the packages are uploaded (i.e. official
>>> snapshots are uploaded or a referred git tag is added).
>>
>> Do you envision this in the current situation where "packages" are
>> loaded as sub-modules of the boost super-project, or in a new
>> situation where the boost super-project is taken away and "packages"
>> are standalone (but with dependencies)?
>
> What I meant is that there should be some kind of "official" Boost repository
> which will be used by the packaging tool (let's call it boost-pkg for
> brevity). That repository will serve the metadata for boost-pkg. The metadata
> will be updated when a certain new release is published into it, whether that
> is a new library release through a git tag or the whole Boost release. I'm not
> sure if updating the metadata upon a tag creation can be automated, but at
> least for major Boost releases this should be doable.

This seems to confirm that you are not interested in dependency
handling for maintainers and testers (yet).

>
> Continuing my fantasy, the repository may contain standalone packages for
> library and Boost releases, which may be useful for Boost users (in the above
> example, I would be downloading archives of Boost 1.57 from this repository).
> It may also contain references to git repositories - tags or branches. boost-
> pkg would offer a unified interface so that it is possible to use either of
> them - e.g. download official Boost 1.57 and a newer release of that one
> library X, which fixes a critical bug for me.

This is getting more and more ambitious. Don't get me wrong, I like what
you describe, but I think it will be easier to get there if we take it
one step at a time.

>
> Potentially, boost-pkg could replace the superproject and be used to checkout
> the whole Boost from git on develop or master branch, which would be useful
> for developers and testers.

I think you have now almost reinvented Ryppl, though I might be mistaken.

> For testers this would help to checkout the tested
> library from develop and everything else from master. Although it is also
> possible to do with plain git, when you have checked out everything.
>
>> In a nutshell, both for users and maintainers:
>> 1. Clone the superproject non-recursively.
>> 2. Request specific modules to be installed by Boost.Build;
>> dependencies are tracked by the handler tool which uses git to clone
>> more modules.
>
> That would require at least Boost.Build to be checked out as well.

Ah, right.

>
>> In addition, for maintainers:
>> 3. Create/update the configuration file by running the handler tool
>> before pushing to the public repo (this could be a git hook).
>
> This would be a blocker, for me at least. I'm sure I will forget doing that
> and will be very much annoyed. A git hook that scans the headers on every push
> doesn't sound very good.

But you do agree that caching is a good idea, right? You seem to
believe that caches should only be created for releases. I think a
dependency handler would have at least as much value to maintainers
and testers, if it works for any commit.
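
To illustrate what I mean by a cache that works for any commit, here is a small sketch. The class name and the scan callback are hypothetical; the scan stands in for whatever expensive header scanning the handler would do, and a real tool would presumably persist the entries instead of keeping them in memory.

```python
class DependencyCache:
    """Remember scan results per commit so each commit is scanned once."""

    def __init__(self, scan):
        self._scan = scan      # expensive function: commit id -> dependency list
        self._entries = {}     # commit id -> cached dependency list

    def dependencies(self, commit):
        # Only scan on a cache miss; later lookups for the same commit are free.
        if commit not in self._entries:
            self._entries[commit] = self._scan(commit)
        return self._entries[commit]


calls = []


def fake_scan(commit):
    """Stand-in for a header scan; records how often it is invoked."""
    calls.append(commit)
    return ["serialization"]


cache = DependencyCache(fake_scan)
cache.dependencies("abc123")
cache.dependencies("abc123")  # served from the cache, no second scan
```

Since dependency information is tied to a commit, the commit id is the natural key, and nothing restricts the cache to release tags.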

-Julian

____________

[1] https://www.kernel.org/pub/software/scm/git/docs/git-notes.html
[2] https://www.kernel.org/pub/software/scm/git/docs/git-archive.html

