![]() |
|
#1
|
|||
|
|||
|
Chapter 6 - incgammabeta.h : binomialdist:: cdf (bug?)
Hello all,
I've translated to Java. So I'm not sure if it's a bug. But the binomialdist.cdf only 'worked' after I've replaced the line: return 1. - betai( (Doub)k, n - k + 1., pe); by return betai(n - (Doub)k, k + 1., 1. - pe); Is this difference related to k (since it's discrete), a bug or bad java translation? regards, Carlos |
|
#2
|
|||
|
|||
|
Hmmm...I hadn't tested that function by itself.
So, I wrote a little program for p = 1.0/6.0 (probability of a "one" turning up when you toss a fair six-sided die). Here's the program Code:
#include "../code/nr3.h"
#include "../code/gamma.h"
//
// uncomment the following to use the original version
#include "../code/incgammabeta.h"
//
// uncomment the following to use the modified version
//#include "./incgammabeta1.h"
int main()
{
int n;
double p;
// Change #if 0 to #if 1 to let the user enter the number of trials and
// the probability of a desired outcome.
#if 0
cout << "Enter n and p: ";
cin >> n >> p;
if (!cin || n < 1 || p <= 0 || p >= 1) {
cout << "Invalid input." << endl;
exit(EXIT_FAILURE);
}
#else
// Probability for a given face of a six-sided die
n = 20; p = 1.0/6.0;
#endif
cout << "CDF for binomial distribution with n = " << n
<< " and p = " << p << endl << endl;
Binomialdist bd(n, p);
cout << fixed;
cout << scientific;
cout << setprecision(11);
cout << " x Pr(X <= x)" << endl;
for (int k = 0; k < n; k++) {
cout << setw(3) << k << " " << setw(20) << bd.cdf(k) << endl;
}
return 0;
}
Code:
CDF for binomial distribution with n = 20 and p = 0.166667 x Pr(X <= x) 0 0.00000000000e+00 1 2.60840533046e-02 2 1.30420266523e-01 3 3.28659071638e-01 4 5.66545637776e-01 5 7.68749218993e-01 6 8.98159510972e-01 7 9.62864656961e-01 8 9.88746715357e-01 9 9.97158384336e-01 . . . Similarly for the other values: It seems to me that for the usual definition of CDF the function uses a value of argument that's off by 1. I haven't tested to see how this affects any other functions that use this function, but here's a test of cdf() with a slight modification: Instead of changing the return statement (which would require changing the conditionals that test special cases), I just inserted the following line at the beginning of Binomialdist::cdf() in incgammabeta.h: Code:
++k; Code:
CDF for binomial distribution with n = 20 and p = 0.166667 x Pr(X <= x) 0 2.60840533046e-02 1 1.30420266523e-01 2 3.28659071638e-01 3 5.66545637776e-01 4 7.68749218993e-01 5 8.98159510972e-01 6 9.62864656961e-01 7 9.88746715357e-01 8 9.97158384336e-01 9 9.99401496063e-01 . . . Regards, Dave Footnote: I don't "do" java, but as a quick cross check, I ran the following in GNU octave: Code:
# GNU Octave m-file to check binomial cdf
n = 20;
p = 1.0/6.0;
printf('CDF for binomial distribution with n = %d and p = %f\n\n', n, p);
printf(' x Pr(X <= x)\n\n')
for x=0:n
printf('%3d %.11e\n', x, binocdf(x, n, p))
end
Code:
CDF for binomial distribution with n = 20 and p = 0.166667 x Pr(X <= x) 0 2.60840533046e-02 1 1.30420266523e-01 2 3.28659071638e-01 3 5.66545637776e-01 4 7.68749218993e-01 5 8.98159510972e-01 6 9.62864656961e-01 7 9.88746715357e-01 8 9.97158384336e-01 9 9.99401496063e-01 . . . |
|
#3
|
|||
|
|||
|
Hi! Dave
thank you for your fast reply. adding k++ makes cdf looks "right". I've read Poisson explanation 6.14.13. So, it might be the same case. However, for x = 1; k>20 and pe > 0.9 the cdf goes to 0.0 below 1.0E-16 adding ++k, k++ or not. could you test this, please? regards, Carlos |
|
#4
|
|||
|
|||
|
Quote:
Let's look at it for n = 25 and pe = 0.95. With my modified incgammabeta.h, cdf(0) returns zero. Not nearly zero. Exactly zero. My little sanity check says that for k = 0, the result should be 0.05 to the power 25. That's something like 2.98...e-33. That's pretty small, but it is not beyond representation as a double precision floating point number. So, what happened? Well... For k = 0, the cdf function calls betai with the values: a = 1, b = 25, and c = 0.95 Inside betai, you can confirm that for these values it is doing the last exit with a calculated return value equal to 1.0-bt*betacf(b,a,1.0-x)/b This turns out to be something like 1.0 - 2.9802322387696186e-33 Since there are only 16 or 17 significant decimal digits in a C++ double precision number, the returned value is 1.0 Then the cdf function returns 1.0 - the betai() value, so it becomes zero. See what happened? 1.0 - (1.0 - very_small_number) should be equal to the very_small_number, but because of roundoff error it becomes 0.0. Associative and distributive "laws" don't always hold for finite-precision floating point numbers. Bummer! Bottom line: perhaps the cdf function can be rewritten so that it calculates the smaller terms directly instead of using the the betai stuff, but for now, at least there is my explanation of why everything smaller than approximately 1.0e-16 becomes zero. Understanding the results can give insight. Regards, Dave Footnote: My little GNU octave program gives pretty much the same results (everything smaller than 1.0e-16 becomes zero), so they apparently use some similar calculations. On the other hand... When I used the GNU Scientific Library function gsl_cdf_binomial_P(), I got more precision. Here's the C program: Code:
#include <stdio.h>
#include <gsl/gsl_cdf.h>
int main()
{
int n = 25;
double p = 0.95;
int k;
printf("CDF of binomial distribution with n = %d, and p = %f\n\n", n, p);
printf(" x Pr(X <= x)\n\n");
for (k = 0; k <= n; k++) {
printf("%3d %18.6e\n", k, gsl_cdf_binomial_P(k, p, n));
}
return 0;
}
Code:
CDF of binomial distribution with n = 25, and p = 0.950000 x Pr(X <= x) 0 2.980232e-33 1 1.418591e-30 2 3.241777e-28 3 4.733943e-26 4 4.960433e-24 5 3.970253e-22 6 2.522780e-20 7 1.305786e-18 8 5.604966e-17 9 2.020747e-15 10 6.174753e-14 11 1.609214e-12 12 3.591139e-11 13 6.876528e-10 14 1.130173e-08 15 1.591912e-07 16 1.915378e-06 17 1.958055e-05 18 1.687532e-04 19 1.212961e-03 20 7.164948e-03 21 3.409060e-02 22 1.271065e-01 23 3.576241e-01 24 7.226104e-01 25 1.000000e+00 "There's more than one way to skin a cat." My dear old granny used to say that, but I don't know where she got it. I mean, no one in my family ever actually skinned a cat. (At least as far as I know, they didn't. I mean she said that we were eating squirrel. But why would she mention something about cats? Oh, crap!) ---davekw7x |
|
#5
|
|||
|
|||
|
Dave,
Now, after your explanation, it's clear to me. I've gotten the same GNU Scientific Library results using Colt-1.2.0, SSJ-2.4 java libraries. This doubt has arisen when I was trying to find a good invcdf function, but for cdf=1.418591e-30, n=25 and p 0.95 the result is 8.0 instead 1., for example. Thank you very much, Carlos |
![]() |
| Thread Tools | |
| Display Modes | Rate This Thread |
|
|