Subject: Re: [boost] [convert] Performance
From: Thijs van den Berg (thijs_at_[hidden])
Date: 2014-06-12 05:04:44


Sitmo Consultancy B.V.
Financial Modelling & Data Science
+ 31 6 24110061
thijs_at_[hidden]
P.O. Box 1059, 2600BB, Delft, The Netherlands

On 12 Jun 2014, at 10:41, Joel de Guzman <djowel_at_[hidden]> wrote:

> On 6/12/14, 2:45 PM, Thijs (M.A.) van den Berg wrote:
>>
>>
>> On Jun 12, 2014, at 2:30 AM, Joel de Guzman <djowel_at_[hidden]> wrote:
>>>
>>> I do not think a random distribution of number of digits is a
>>> good representation of what's happening in the real world. In
>>> the real world, especially with human generated numbers(*), shorter
>>> strings are of course more common.
>>>
>>
>> A well-known real-world property is Benford's law, often used in fraud detection to check whether numbers are fake or "natural".
>>
>> If you draw random numbers uniformly on a logarithmic scale, you get that scale-invariant property. I think that also leads to a random number of digits?
>>
>> http://en.m.wikipedia.org/wiki/Benford's_law#Mathematical_statement
>
> That one is for the first digit only and not for the number of digits.
> Is it just a conjecture that single digits, for example, occur more
> frequently than say 1,000,000 digits? If that conjecture does not hold,
> then we should probably be using big nums all over! It's also a known
> *fact* that varint encoding gives the best performance compared to
> uniform encoding when transferring data over networks!
>
> I'm not sure if there's a study of the probability of the occurrence of
> N digits, is there? Anyway, here's one:
>
> http://mathematicalmulticore.wordpress.com/2011/02/04/which-numbers-are-the-most-common/
>
> Perhaps the math guys should set me straight and I would not be surprised
> if the answer is 42 again! :-)
>
You’re right. There are indeed many distributions that give rise to Benford’s law. Maybe someone should write a script that scrapes all the numbers in the Boost source files.
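
A rough sketch of both ideas, in Python rather than anything Boost-specific. It draws integers whose log10 is uniform, tallies leading digits and digit counts, and pulls decimal integers out of a piece of text with a naive regex. All function names, the six-digit cap and the regex are assumptions made for illustration; a real scraper would have to deal with hex literals, floats, suffixes and comments properly.

```python
import math
import random
import re
from collections import Counter


def log_uniform_ints(n, max_digits=6, seed=0):
    """Draw n integers whose log10 is uniform on [0, max_digits)."""
    rng = random.Random(seed)
    return [int(10 ** rng.uniform(0, max_digits)) for _ in range(n)]


def first_digit_freq(numbers):
    """Relative frequency of each leading digit 1..9 (zeros are skipped)."""
    counts = Counter(int(str(n)[0]) for n in numbers if n > 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


def digit_count_freq(numbers):
    """Relative frequency of each decimal length (zeros are skipped)."""
    counts = Counter(len(str(n)) for n in numbers if n > 0)
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


def benford(d):
    """Benford's law: P(first digit = d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / d)


def integers_in_text(text):
    """Naive extraction of standalone decimal integers from text.

    "3.14" yields 3 and 14; "0x1F" yields nothing, because the digits
    are glued to letters and the word-boundary test fails.
    """
    return [int(m) for m in re.findall(r"\b\d+\b", text)]


if __name__ == "__main__":
    sample = log_uniform_ints(100_000)
    freq = first_digit_freq(sample)
    for d in range(1, 10):
        print(d, round(freq[d], 3), round(benford(d), 3))
    print(digit_count_freq(sample))
```

Under these assumptions the leading digits of the log-uniform sample follow Benford's law closely, while the number of digits comes out roughly uniform over 1..6. So this particular distribution does not favour short numbers. To try it on real sources, one could feed the contents of each file to integers_in_text and pass the combined list to the two tally functions.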


Boost list run by bdawes at acm.org, gregod at cs.rpi.edu, cpdaniel at pacbell.net, john at johnmaddock.co.uk