Accuracy: C++11's binomial_distribution<int> does not coincide with what R returns


I need to generate samples in C++ that follow a hypergeometric distribution. In my case, however, I can approximate it by a binomial distribution without any problem.

Thus I'd like to use the std implementation in C++11. If I generate many samples and calculate the probabilities, I get values different from the ones R tells me. What is more, the difference does not get smaller when I increase the number of samples. The parameters are the same in R and C++.

Thus the question: why do I not get the same results, and what can I do / which one should I trust?

See below for the R and the C++ code. The C++ program calculates the difference from the R values. Even if I let the program run for quite a while, the numbers don't get smaller but wiggle around the e-5, e-6, e-7 magnitude.

R:

dbinom(0:2, 2, 0.48645948945615974379) #0.26372385596962805154 0.49963330914842424280 0.23664283488194759464 

C++:

#include <iostream>
#include <iomanip>
#include <random>

using namespace std;

// Wraps a randomly seeded Mersenne Twister and a binomial(2, p) distribution.
class generator {
public:
    generator();
    virtual ~generator();
    int binom();
private:
    std::random_device randev;
    std::mt19937_64 gen;
    std::binomial_distribution<int> dist;
};

generator::generator() : randev(), gen(randev()), dist(2, 0.48645948945615974379) { }
generator::~generator() {}
int generator::binom() { return dist(gen); }

int main() {
    generator rd;
    const double nrolls = 10000000; // number of experiments per outer iteration
    double p[3] = {};               // cumulative counts of the outcomes 0, 1, 2
    for (int k = 1; k < 100; ++k) {
        for (int i = 0; i < nrolls; ++i) {
            int number = rd.binom();
            ++p[number];
        }
        // difference between the empirical frequency and the value reported by R
        cout << "samples=" << setw(8) << nrolls*k <<
            "   dp(0)=" << setw(13) << p[0]/(nrolls*k) - 0.26372385596962805154 <<
            "   dp(1)=" << setw(13) << p[1]/(nrolls*k) - 0.49963330914842424280 <<
            "   dp(2)=" << setw(13) << p[2]/(nrolls*k) - 0.23664283488194759464 << endl;
    }
    cout << "end";
    return 0;
}

Selected output:

samples=   1e+07   dp(0)=  -2.0056e-05   dp(1)=  9.49909e-05   dp(2)= -7.49349e-05
samples=   1e+08   dp(0)=   1.5064e-05   dp(1)=  3.43609e-05   dp(2)= -4.94249e-05
samples= 9.9e+08   dp(0)= -2.06449e-05   dp(1)=  5.93429e-06   dp(2)=  1.47106e-05

This should really be a comment.

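I don't see anything wrong with the numbers. You are doing about 10**9 repetitions. The statistical error of an empirical frequency shrinks like 1/sqrt(N), so by the central limit theorem you should see an accuracy of around 10**(-4.5). That is indeed what you are seeing. The signs of dp(0) and dp(2) fluctuate, which is a good sign. If you run the program multiple times, do the signs on the last line always show the same pattern? If not, that is another good sign.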

By the way, R is giving way too many digits in my opinion. Doubles only have about 15 digits of accuracy.

