
Boost-Build :

From: Alexey Pakhunov (alexeypa_at_[hidden])
Date: 2005-09-02 11:11:48


Joao Abecasis wrote:
>>1. Loading of >100MB file into the memory is not a good idea;
> We could add a default 1MB (?) limit and change the signature to,
> rule CAT ( file : max_bytes ? )

Sooner or later we'll need to raise this limit. Even a limit as high as
2GB or 4GB is questionable. Of course CAT'ing a 4GB file looks strange,
but imagine the following scenario: a user wants to detect some
signature at the end of a big file. 'CAT' could allow that if an offset
can be passed, i.e.:

rule CAT ( file : offset ? : bytes ? ) ;

> Well, I assumed allocation would fail and an empty list would be
> returned.

This is potentially a big hole. What if the file size is 0x1000000001?
Stored in a 32-bit variable, the value will be truncated to 0x1.

> Of course another question entirely is if we should care about files
> that large. I didn't notice support for large files (>2GB) elsewhere in
> bjam (of course I may have overlooked it).

I don't know either whether bjam supports >2GB files. But if it doesn't,
then we have to add it step by step.

> Do you think adding a default or even a hard-coded limit for the number
> of bytes read would fix these issues or are you suggesting that the
> approach is flawed from the beginning?

I think the limit will not solve all the problems. I think some kind of
streaming support should be implemented instead. For example, each time
'CAT' is called it would read only a single block, line, or block of lines.
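
As a rough illustration of the streaming idea, here is a C sketch in
which each call returns only the next block, so memory use stays bounded
regardless of file size. The function names and the 4096-byte block size
are assumptions made for the example, not anything taken from bjam:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the next block of at most block_size bytes
   from f into buf. Returns the number of bytes read; 0 means end of
   file or a read error. */
size_t read_next_block(FILE *f, char *buf, size_t block_size)
{
    assert(f != NULL && buf != NULL);
    return fread(buf, 1, block_size, f);
}

/* Example consumer: counts the bytes in a stream while holding no more
   than one block in memory at a time. */
unsigned long long count_bytes_streaming(FILE *f)
{
    char buf[4096];
    unsigned long long total = 0;
    size_t n;

    while ((n = read_next_block(f, buf, sizeof buf)) > 0)
        total += n;
    return total;
}
```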

Other features that, I guess, could be useful:

- Passing an offset and block size to read;
- Support of negative offsets - to be able to read from the tail of a file;
- Support of line-by-line reading.
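
The offset features listed above could look roughly like the following
C sketch. `read_at` is a hypothetical helper, not an existing bjam
function; a negative offset is interpreted as counting back from the end
of the file:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to max_bytes starting at offset.
   A negative offset counts back from the end of the file.
   Returns the number of bytes read, or 0 if the seek fails.
   Note: fseek takes a long, so on platforms where long is 32 bits
   this cannot address offsets beyond 2GB. */
size_t read_at(FILE *f, long offset, char *buf, size_t max_bytes)
{
    int whence;

    assert(f != NULL && buf != NULL);
    whence = (offset < 0) ? SEEK_END : SEEK_SET;
    if (fseek(f, offset, whence) != 0)
        return 0;
    return fread(buf, 1, max_bytes, f);
}
```

Checking for a 4-byte signature at the tail of a file would then be
`read_at(f, -4, buf, 4)`, without loading the rest of the file.
Line-by-line reading could be built on the same stream with `fgets`.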

Best regards/Venlig hilsen,
Alexey Pakhunov.

 


Boost-Build list run by bdawes at acm.org, david.abrahams at rcn.com, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk