Subject: Re: [boost] [convert] Performance
From: Joel de Guzman (djowel_at_[hidden])
Date: 2014-06-12 04:41:21


On 6/12/14, 2:45 PM, Thijs (M.A.) van den Berg wrote:
>
>
> On Jun 12, 2014, at 2:30 AM, Joel de Guzman <djowel_at_[hidden]> wrote:
>>
>> I do not think a random distribution of number of digits is a
>> good representation of what's happening in the real world. In
>> the real world, especially with human generated numbers(*), shorter
>> strings are of course more common.
>>
>
> A well-known real-world property is Benford's law, often used in fraud detection to check whether numbers are fake or "natural".
>
> If you draw random numbers uniformly from the logarithmic scale then you'll get that scale-invariant property. I think that leads to a random number of digits?
>
> http://en.m.wikipedia.org/wiki/Benford's_law#Mathematical_statement
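
For what it's worth, that sampling scheme is easy to try out. Below is a
minimal sketch, not anything taken from the convert benchmarks: the names
digit_count and digit_histogram are hypothetical, and the cap of 15 digits
is an assumption that keeps the values within double precision. Each decade
[10^(d-1), 10^d) has the same width on the log scale, so with a log-uniform
draw over [1, 10^k) every digit count from 1 to k should come out about
equally likely. Whether real-world data follows that model is a separate
question.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Number of decimal digits in n (0 counts as one digit).
inline int digit_count(std::uint64_t n)
{
    int digits = 1;
    while (n >= 10) {
        n /= 10;
        ++digits;
    }
    return digits;
}

// Draw `samples` values log-uniformly from [1, 10^max_digits) and return a
// histogram in which hist[d] counts the values with d decimal digits
// (hist[0] is unused). Hypothetical helper, for illustration only.
inline std::vector<std::size_t>
digit_histogram(int max_digits, std::size_t samples, unsigned seed)
{
    // Assumption: at most 15 digits, so every value fits a double exactly enough.
    assert(max_digits >= 1 && max_digits <= 15);

    std::mt19937_64 rng(seed);
    // Uniform exponent in [0, max_digits) means 10^exponent is log-uniform.
    std::uniform_real_distribution<double> exponent(0.0, static_cast<double>(max_digits));

    std::vector<std::size_t> hist(static_cast<std::size_t>(max_digits) + 1, std::size_t(0));
    for (std::size_t i = 0; i < samples; ++i) {
        double x = std::pow(10.0, exponent(rng));
        int d = digit_count(static_cast<std::uint64_t>(x)); // truncate to an integer
        if (d > max_digits) {
            d = max_digits; // guard against pow() rounding up to 10^max_digits
        }
        ++hist[static_cast<std::size_t>(d)];
    }
    return hist;
}
```

Calling digit_histogram(6, 600000, 42u) and printing hist[1] through hist[6]
should show six buckets of roughly equal size.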

That one is for the first digit only and not for the number of digits.
Is it just a conjecture that single-digit numbers, for example, occur more
frequently than, say, numbers with 1,000,000 digits? If that conjecture does
not hold, then we should probably be using bignums all over! It's also a
known *fact* that varint encoding generally beats fixed-width encoding
when transferring data over networks, precisely because small values
dominate!
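
To make the varint point concrete, here is a minimal sketch. It assumes the
common LEB128-style scheme (7 payload bits per byte, high bit set when more
bytes follow), which is the base-128 varint used by Protocol Buffers; the
function names are hypothetical. A value below 128 takes one byte instead of
the eight a fixed-width 64-bit field would need, while the largest 64-bit
values take ten, so the scheme only wins when short numbers are the common
case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` to `out` as a LEB128-style varint: 7 payload bits per byte,
// least significant group first, high bit set when more bytes follow.
inline void encode_varint(std::uint64_t value, std::vector<std::uint8_t>& out)
{
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Decode one varint starting at in[pos] and advance `pos` past it.
inline std::uint64_t decode_varint(const std::vector<std::uint8_t>& in, std::size_t& pos)
{
    std::uint64_t value = 0;
    int shift = 0;
    for (;;) {
        // Assumption: the input is well formed (not truncated, at most 10 bytes).
        assert(pos < in.size() && shift < 64);
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return value;
        }
        shift += 7;
    }
}

// Encoded size in bytes of `value`.
inline std::size_t varint_size(std::uint64_t value)
{
    std::vector<std::uint8_t> buf;
    encode_varint(value, buf);
    return buf.size();
}
```

With this scheme varint_size(7) is 1 and varint_size(1000000) is 3, against
8 bytes each for a fixed-width uint64_t.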

I'm not sure whether there's a study of the probability of an N-digit
number occurring. Is there? Anyway, here's one attempt:

   http://mathematicalmulticore.wordpress.com/2011/02/04/which-numbers-are-the-most-common/

Perhaps the math guys should set me straight, and I would not be surprised
if the answer is 42 again! :-)

Regards,

-- 
Joel de Guzman
http://www.ciere.com
http://boost-spirit.com
http://www.cycfi.com/
