Numerical Recipes Forum  

  #1  
11-23-2009, 01:23 PM
MPD78
Registered User
 
Join Date: Dec 2008
Location: Pittsburgh, PA
Posts: 196
Built in function rand() and srand()

Hello all,

In chapter 7 of the NR3 book there is a list of "traps to watch out for." One of the traps is: avoid using the standard/built-in C++ routines rand() and srand(), because they are flawed and have no standard implementation. The book goes on to state that many scientific papers that have used these functions are inaccurate/incorrect.

With the above statement made by some highly experienced users of the C++ language, why are these routines still in the standard C++ library?

Also, in Bjarne Stroustrup's book "Programming Principles and Practice Using C++", published in 2009 (Bjarne Stroustrup is the original author of the C++ language), the function rand() is taught, and it is stated that if it is repeatedly called it will give the same sequence every time the program is run. It goes on to state that this is good from a debugging standpoint.

It seems that beginner/intermediate users could be led down a very bad path by using these functions, rand() and srand(), and not really know they are heading for trouble. Therefore, why are they not removed from the standard library?

Thanks
Matt
  #2  
11-26-2009, 01:25 PM
davekw7x
Registered User
 
Join Date: Jan 2008
Posts: 453
Quote:
Originally Posted by MPD78
...why...
Since no one from the C or C++ Standards committee has responded, I will give my outsider's opinion. Like all of my ramblings, what follows is just an opinion. My opinion.

It is freely given (like free beer and like free speech) and it's worth exactly what you want it to be worth and exactly what you are paying for it.

Quote:
Originally Posted by MPD78
...Avoid using the standard/built in C++ routines, rand() and srand()
I think that condemnation of the use of rand() is appropriate in the context of introducing programmers and would-be programmers to the use of pseudo-random number generation in significant research applications.

That doesn't mean (to me, at least) that rand() is never useful and that people who use it for anything at all should be sentenced to public impalement or other appropriate punishment meted out by the Secret Society of rand() Commandos. (Yes; they are Out There. Someday they may come for you if you violate this sacred commandment.)

What it means to me is that if you are going to make significant use of pseudo-random number sequences in any project, you had better make damn-sure that you know that your generators create data sets that have suitable properties for this particular application.

Over the years, vendor-supplied library functions such as rand() have varied greatly in quality, but as far as I can tell, the supplied versions of rand() have just about always been tuned for speed. The main strike against them is that the C (and by inheritance C++) standards documents have deliberately left the details of the implementation subject to compiler and library vendors' whims.

The Good News about that is that, as a particular vendor finds a "better way" the functions can be improved for later versions.

The Bad News (in addition to the fact that we usually have no way of guessing about the quality of the function) is that I can create a test program that gives one result with a particular version of compiler, but you may not get the exact same results unless you have the exact same compiler and the exact same release version of the compiler's standard library as I have.

My conclusion: To keep my interest, researchers absolutely must supply me with enough information to reproduce, exactly, their results, and that means that they can't use any pseudo-random sequence functions for which source is not available. Then, I can use their results as a starting point for whatever programs for enhancements or improvements (or rebuttal) that I want to implement.


Now, if I want to show a beginner how to create a program to, say, calculate the sample variance of a set of pseudo-randomly generated data points, I can supply a program that uses the C standard library functions srand() and rand() in the program to generate the data set.

The other guy's results may not be exactly the same as mine, but maybe the principle can be illustrated adequately. Anyone with a C or C++ compiler can try the program and maybe learn a few things with no external functions required.

Also, given a function that supplies pseudo-random integer values from zero to some maximum value, I can show how to create a pseudo-random number sequence in a certain range of integer or floating point values, how to generate a sequence that approximates samples from a power distribution or a Poisson distribution, etc.

Then when the grasshopper has graduated to the "real world," he/she can still use all of the programming techniques that were learned, but somewhere along the way, someone should have supplied the supplicant with the important clue that something other than rand() and srand() will (probably) be more suitable for generating the deviates.

I have no way of testing or quantifying the assertion that many supposedly respectable "scientific" studies have been published that are flawed by poor choice of a pseudo-random number generator, but I won't deny that such things have happened, and, presumably, continue to happen.

[/begin editorial comment]
I personally think that ignorance or deliberate mis-application of the mathematics and principles of statistics (and improperly assuming the appropriateness of inference of cause/effect gleaned from statistical correlation) is a bigger cultural threat. (But maybe that's just me; I'm funny that way.) See Footnote [1].
[/end editorial comment]

Quote:
Originally Posted by MPD78
...rand() is taught and it is stated that if it is repeatedly called it will give the same sequence every time the program is run. It goes on to state that this is good from a debugging standpoint.
Of course you want to be able to reproduce, exactly, the same sequence every time so that you can see how the output is changed when you change program statements. I have never seen a pseudo-random number function that doesn't have a way to supply a "seed" that will guarantee the same sequence every time. After debugging, when you want to generate a different sequence every time the program is run, then you, somehow, try to make it use a different seed each time.

Supply srand() with a seed from the user, or a changing value based on time of day, or maybe with a value supplied from a true random generator (such as /dev/random, if that is available on your system). An interesting test might be to see if there is any detectable correlation between sequences (created by a particular pseudo-random generator) that were seeded by numbers that differed by 1 (or some other small value). (This would happen if you used the seconds value from the C standard library function time() with program runs a second or so apart.)

The NR book gives the following as a reference:
Seminumerical Algorithms 3rd Edition Volume 2 of The Art of Computer Programming
---Donald E. Knuth

Good stuff. If you are serious about this stuff, I can't think of a better resource. See Footnote [2].


If you want to see some different generators and see about testing, then you might check out the open source "dieharder" distribution. http://www.phy.duke.edu/~rgb/General/dieharder.php

In addition to excellent documentation about what goes into testing for, and what is meant by, "how random is my random generator," there are a number of examples of generator functions in the source code. You can also test such things as the stuff from NR ran.h and compare speeds, etc.


Regards,

Dave


Footnotes:

[1] "People commonly use statistics like a drunk uses a lamp post;
for support rather than illumination"
---Mark Twain


[2] "...random numbers should not be generated with
a method chosen at random. Some theory should be used."
---Donald Knuth
The Art of Computer Programming
Volume 2: Seminumerical Algorithms
  #3  
12-02-2009, 06:42 PM
MPD78
Registered User
 
Join Date: Dec 2008
Location: Pittsburgh, PA
Posts: 196
Dave,

Thanks for the in-depth response and web links.

I have purchased a copy of Donald Knuth's book but it hasn't arrived yet. I hope to read it over my week of vacation for Christmas and New Year's.

Until I purchased NR3, I only remotely knew of random number generators and their use, from a couple of physicists I know and from my engineering studies, where we touched on the field of stochastic vibrations and spectral density. (Although, ANSYS is all I have ever seen used to produce a power spectral analysis.)

I find this "random" mathematics/programming very interesting.

Thanks
Matt

All times are GMT -5.