I can try and salvage what information I have from the randomizer theory work we did, if you want. (I also have his original charts for analyzing the history-type randomizers.)
i think colour_thief is still calculating something wrong... about the possibility of lacking I in a range of a specified length... a program to calculate this: (works for GNU C++) Code: #include <cstdio> #include <cstdlib> const int maxlen=1048576,maxrng=14; const char * files[]={ "bag7.txt", "roll4.txt", "roll6.txt", "random.txt", "test.txt" }; const int filec=5; int apptbl[256]={0}; bool acptbl[256]={0}; int seqlen; unsigned char seq[maxlen],inchar; int pfxsum[maxlen]={0}; FILE *fi; void init() { acptbl['I']=true; acptbl['T']=true; acptbl['O']=true; acptbl['S']=true; acptbl['Z']=true; acptbl['J']=true; acptbl['L']=true; //use file instead of stdout freopen("result.txt","w",stdout); } int main() { init(); for(int xhf=0;xhf<filec;xhf++) { fi=fopen(files[xhf],"r"); seqlen=0; //import seq from txt while(true) { fscanf(fi,"%c",&inchar); if(feof(fi)) break; apptbl[inchar]++; if(acptbl[inchar]) { seq[seqlen++]=inchar; } } printf("importing file %s, length=%d\n",files[xhf],seqlen); //sum of chars? for(int xh=0;xh<256;xh++) { if(apptbl[xh]!=0) printf("char %d appears %d\n",xh,apptbl[xh]); apptbl[xh]=0; } //build prefix sum for 'I' for(int xhp=0;xhp<seqlen;xhp++) { pfxsum[xhp+1]=pfxsum[xhp]; if(seq[xhp]=='I') pfxsum[xhp+1]+=1; } //for lengths of range for(int xhl=1;xhl<=maxrng;xhl++) { if(xhl>seqlen) break; int noi=0,br=seqlen-xhl+1; for(int xho=0;xho<br;xho++) if(pfxsum[xho+xhl]==pfxsum[xho]) noi++; printf("length %d, psb of no-I:\t%8d/%8d\t=%.13f\n",xhl,noi,br,double(noi)/double(br)); } //finish printf("---\n\n"); fclose(fi); } printf("finished: %d files\n",filec); //system("pause"); return 0; } i used 5 files. 4 of them generated by nullpomino sequencer, with seed of 0, length of 1,000,000. first line of comment removed. Code: bag7 roll4 roll6 memoryless random the rest one is a test of correctness: Code: ITOSZJL here is the result: Code: importing file bag7.txt, length=1000000 char 10 appears 16666 char 73 appears 142857 char 74 appears 142857 char 76 appears 142857 char 79 appears 142858 char 83 appears 142857 char 84 appears 142857 char 90 appears 142857 length 1, psb of no-I: 857143/ 1000000 =0.8571430000000 length 2, psb of no-I: 717157/ 999999 =0.7171577171577 length 3, psb of no-I: 582880/ 999998 =0.5828811657623 length 4, psb of no-I: 457305/ 999997 =0.4573063719191 length 5, psb of no-I: 343392/ 999996 =0.3433933735735 length 6, psb of no-I: 244231/ 999995 =0.2442322211611 length 7, psb of no-I: 162762/ 999994 =0.1627629765779 length 8, psb of no-I: 101634/ 999993 =0.1016347114430 length 9, psb of no-I: 57921/ 999992 =0.0579214633717 length 10, psb of no-I: 28973/ 999991 =0.0289732607593 length 11, psb of no-I: 11666/ 999990 =0.0116661166612 length 12, psb of no-I: 2889/ 999989 =0.0028890317793 length 13, psb of no-I: 0/ 999988 =0.0000000000000 length 14, psb of no-I: 0/ 999987 =0.0000000000000 length 15, psb of no-I: 0/ 999986 =0.0000000000000 length 16, psb of no-I: 0/ 999985 =0.0000000000000 length 17, psb of no-I: 0/ 999984 =0.0000000000000 length 18, psb of no-I: 0/ 999983 =0.0000000000000 length 19, psb of no-I: 0/ 999982 =0.0000000000000 length 20, psb of no-I: 0/ 999981 =0.0000000000000 --- importing file roll4.txt, length=1000000 char 10 appears 16666 char 73 appears 143096 char 74 appears 143140 char 76 appears 142108 char 79 appears 143109 char 83 appears 142367 char 84 appears 143350 char 90 appears 142830 length 1, psb of no-I: 856904/ 1000000 =0.8569040000000 length 2, psb of no-I: 717339/ 999999 =0.7173397173397 length 3, psb of no-I: 581314/ 999998 =0.5813151626303 length 4, psb of no-I: 448661/ 999997 =0.4486623459870 length 5, psb of no-I: 319328/ 999996 =0.3193292773171 length 6, psb of no-I: 227508/ 999995 =0.2275091375457 length 7, psb of no-I: 162205/ 999994 =0.1622059732358 length 8, psb of no-I: 115648/ 999993 =0.1156488095417 length 9, psb of no-I: 82431/ 999992 =0.0824316594533 length 10, psb of no-I: 58782/ 999991 =0.0587825290428 length 11, psb of no-I: 41930/ 999990 =0.0419304193042 length 12, psb of no-I: 29882/ 999989 =0.0298823287056 length 13, psb of no-I: 21299/ 999988 =0.0212992555911 length 14, psb of no-I: 15232/ 999987 =0.0152321980186 length 15, psb of no-I: 10907/ 999986 =0.0109071527001 length 16, psb of no-I: 7805/ 999985 =0.0078051170768 length 17, psb of no-I: 5548/ 999984 =0.0055480887694 length 18, psb of no-I: 3969/ 999983 =0.0039690674741 length 19, psb of no-I: 2829/ 999982 =0.0028290509229 length 20, psb of no-I: 2001/ 999981 =0.0020010380197 --- importing file roll6.txt, length=1000000 char 10 appears 16666 char 73 appears 142960 char 74 appears 143041 char 76 appears 142474 char 79 appears 142946 char 83 appears 142642 char 84 appears 143139 char 90 appears 142798 length 1, psb of no-I: 857040/ 1000000 =0.8570400000000 length 2, psb of no-I: 715278/ 999999 =0.7152787152787 length 3, psb of no-I: 574777/ 999998 =0.5747781495563 length 4, psb of no-I: 435467/ 999997 =0.4354683064049 length 5, psb of no-I: 297309/ 999996 =0.2973101892408 length 6, psb of no-I: 203058/ 999995 =0.2030590152951 length 7, psb of no-I: 138798/ 999994 =0.1387988327930 length 8, psb of no-I: 94911/ 999993 =0.0949116643817 length 9, psb of no-I: 64935/ 999992 =0.0649355194842 length 10, psb of no-I: 44467/ 999991 =0.0444674002066 length 11, psb of no-I: 30402/ 999990 =0.0304023040230 length 12, psb of no-I: 20830/ 999989 =0.0208302291325 length 13, psb of no-I: 14266/ 999988 =0.0142661711941 length 14, psb of no-I: 9737/ 999987 =0.0097371265826 length 15, psb of no-I: 6620/ 999986 =0.0066200926813 length 16, psb of no-I: 4487/ 999985 =0.0044870673060 length 17, psb of no-I: 3054/ 999984 =0.0030540488648 length 18, psb of no-I: 2041/ 999983 =0.0020410346976 length 19, psb of no-I: 1380/ 999982 =0.0013800248404 length 20, psb of no-I: 933/ 999981 =0.0009330177273 --- importing file random.txt, length=1000000 char 10 appears 16666 char 73 appears 142990 char 74 appears 143828 char 76 appears 142021 char 79 appears 142852 char 83 appears 142569 char 84 appears 143103 char 90 appears 142637 length 1, psb of no-I: 857010/ 1000000 =0.8570100000000 length 2, psb of no-I: 734544/ 999999 =0.7345447345447 length 3, psb of no-I: 629579/ 999998 =0.6295802591605 length 4, psb of no-I: 539564/ 999997 =0.5395656186969 length 5, psb of no-I: 462392/ 999996 =0.4623938495754 length 6, psb of no-I: 396304/ 999995 =0.3963059815299 length 7, psb of no-I: 339588/ 999994 =0.3395900375402 length 8, psb of no-I: 290940/ 999993 =0.2909420365943 length 9, psb of no-I: 249266/ 999992 =0.2492679941440 length 10, psb of no-I: 213529/ 999991 =0.2135309217783 length 11, psb of no-I: 182815/ 999990 =0.1828168281683 length 12, psb of no-I: 156550/ 999989 =0.1565517220689 length 13, psb of no-I: 134076/ 999988 =0.1340776089313 length 14, psb of no-I: 114790/ 999987 =0.1147914922894 length 15, psb of no-I: 98294/ 999986 =0.0982953761353 length 16, psb of no-I: 84212/ 999985 =0.0842132631989 length 17, psb of no-I: 72141/ 999984 =0.0721421542745 length 18, psb of no-I: 61777/ 999983 =0.0617780502269 length 19, psb of no-I: 52909/ 999982 =0.0529099523791 length 20, psb of no-I: 45335/ 999981 =0.0453358613814 --- importing file test.txt, length=7 char 73 appears 1 char 74 appears 1 char 76 appears 1 char 79 appears 1 char 83 appears 1 char 84 appears 1 char 90 appears 1 length 1, psb of no-I: 6/ 7 =0.8571428571429 length 2, psb of no-I: 5/ 6 =0.8333333333333 length 3, psb of no-I: 4/ 5 =0.8000000000000 length 4, psb of no-I: 3/ 4 =0.7500000000000 length 5, psb of no-I: 2/ 3 =0.6666666666667 length 6, psb of no-I: 1/ 2 =0.5000000000000 length 7, psb of no-I: 0/ 1 =0.0000000000000 --- finished: 5 files for bag7, it is about 0.289%, far from 1/(7*7), but quite close to 1/(7*7*7). for roll4, it turns to be 2.988%. about 10 times of bag7........ and for memoryless, it remains 11% for length of 14... nothing seems to be wrong in the correctness test. idea?
That's more than a jump... your probability is adding up to more than 100% because of it! I'm not familiar with Java but I would expect those variables are getting corrupted or something. More than likely your data is fine, there's just a program issue. My curves aren't based on data, just theoretical probability calculations, but if I can help you with anything just let me know. Thanks for the vote of confidence. You are calculating something different than me. Hopefully once I explain the difference you will see how my calculation is more useful. You are testing n consecutive pieces for if they are I. So for n=1, you are looking at every piece in the entire sequence and checking if it is I. For any normal randomizer, there is no I 6/7 = ~85.7% of the time. Look at your data: Every single randomizer is showing you ~85.7% for n=1! I am testing for a complete interval of size n. So for example, you would need an I, then n-1 pieces without an I, then a second I on the nth piece. Do this and I am certain your data will show curves like mine. But note also that I am not just testing for I, I am testing for these intervals with all pieces. If you took my probabilities and divide them by 7, you would get the drought probabilities for one specific piece (eg. I). This brings me to your observation about the 12 droughts. Because this is the maximum possible 7-bag drought, your program doesn't over-count it the way it does with smaller droughts. For this reason, it lines up with my probabilities. But since you are only considering I (1/7 of the pieces) you have to take my probability and divide by 7. That's why I have 1/7^2 and you have 1/7^3. tldr; You answer the question: "What is the probability that there will be no I piece when examining n consecutive pieces?" I answer the question: "What is the probability of not obtaining a piece until n pieces later?"
mm. i thought you had left? But yeah. probability calculations are often non-intuitive. That's why casinos stay in business. For memoryless randomizer, the odds of a I drought of X length is very easy to calculate. it's (6/7)^n where n is the drought length. For bag, we have to handle multiple cases. first we have the case of starting with a full bag. you can go up to 6 pieces without an I. The probabilities are 6/7, 5/7, 4/7, 3/7, 2/7, 1/7, and 0/7 for droughts of 1 through 7. This is very intuitive. There are twelve other possible cases to consider for bag, all equally likely. All can be calculated using a more complicated, but still simple method, and then you can work out the general case for each drought length by taking the mean of all 13 possible probabilities for each drought i think. (add all 13 up, divide by 13) Note that drought length greater than 12 is impossible. For strict history, the same calculations can be made, but there are only five cases. no i in history, and one i in each position. The same calculation method used for geenral case of bag works. For 4 and 6 roll history... well... that's a royal pain to work out. Strict history will be close, but not nearly exact. drought possibiliy is greater when we allow history repeats, because repeats make a drought more likely to continue by giving you less pieces in the history. For this case, i believe monte carlo analysis is most useful.
You are wrong. It's (1/7)*(6/7)^(n-1). Dude this was the simplest one. Wrong again. For a 7n-bag randomizer, there are exactly n(7n choose n)^2 cases to consider. For 7-bag (n=1) this means 49 cases. 7-bag is super easy to work out, drought intervals 1 through 13 have probabilities 1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1 out of 49 respectively. For n>1 it loses symmetry and gets really tedious in a hurry. For 63-bag (n=9) there are 5,041,435,870,611,492,608,025 cases to consider, and I've worked out the fractions. Wrong. Wtf Monte Carlo man. Do you even know what that is? (rhetorical question) I've worked it out for reals, and there was no need to randomly sample and approximate. Correct! Perhaps the resident misinformer should not speak with authority on this matter.
This is a simple case of semantics. If you define N as the number of pieces dealt including the N, and define the drought length as the number of non I pieces before the I is dealt, then your formula is correct. Since i was working on different definitions i came up wit a different, and equally correct answer. But you can't go 13 pieces without an I in 1 bag. do you consider a drought length of 13 to be 12 without an i? I agree my calculation method was probably more complex than needed, but your answer is flat out wrong. Under the assumption that you don't know where in the bag you are, and you don't know if you've drawn the I or not, then the odds of your next piece being an I are.. guess what? 1 in 7! for a drought length of zero. The probability that you get the i second given those assumptions, is in fact ((48/49)/7) The probability that you get it 3rd is ((46/49)/7) The probability that you get it 4th is ((43/49)/7) The probability that you get it 5th is ((39/49)/7) The probability that you get it 6th is ((34/49)/7) The probability that you get it 7th is ((28/49)/7) The probability that you get it 8th is ((21/49)/7) The probability that you get it 9th is ((15/49)/7) The probability that you get it 10th is ((10/49)/7) The probability that you get it 11th is ((6/49)/7) The probability that you get it 12th is ((3/49)/7) The probability that you get it 13th is ((1/49)/7) Your drought distribution for bag couldn't be more wrong. How I did this. wrote out all possible I positions in double 7 bags. Count each drought length from the start of the line. record it delete the first chaacter from each line repeat the above. Do this for the cases of 1-6 draws. the 7 draw case is equivalent to 0 draw case, so further cases are skipped. Once we have the drought probabilities for 0 through 6 pieces drawn, we add them up for each length, and divide by 7 (number of cases calculated). This is the true drought probabilities for 7 bag. And yes, these probabilities add up to 1. Nice job showing the distribution of possible sums of 2d7, though. Well, congrats on working it out for reals, although if your bag calculation is wrong, i'm not sure your tgm calculation is right either. If so i can't correct it though. I would approximate with a large sample at this point because i don't know how to work it otherwise. Also a large monte carlo sample is a good way to check your calculations, and would have revealed the fallacy in the 7 bag calculation you mentioned above (generate a truckload of bags, and test each starting position in the sequence). Who's the misinformer?
PS: farter's monte carlo agrees with my bag calculations. I think i see what ct was trying to get at, but i disagree with his assertion that his calculation is more useful For example at the start of the game you haven't recieved an I. So the calculation is useless. That said, this case has a much simpler calculation. With memoryless, whether you just got an I or not is irrelevant, so it doesn't affect the odds. What if a tetris game decided to start with a random number of pieces already dealt from the bag, to shake thigns up a little a little? Then the calculation I made is much more useful. And if you do know the status of the bag, it's much more useful calculating based on all your knowledge, instead of simply the fact that you got an I. If you get an I at the start of the bag, his calculation gives you a 57% chance of getting another I within 7 pieces, when in fact, the odds are only 1 in 7. In a history based randomizer, I can see the usefulness of this sort of calculation. as for the statement above. This is where the error is. His numbers are different because he does not assume that the last piece dealt was an I. Dividing your numbers by seven will not give his numbers, but will instead give something adding up to 1/7th. This "i'm considering all 7, and you are only considering I" business is simply not how it goes. My long calculation above duplicates his results, which are definitely not symmetric. To sum up The probability of getting n pieces without any Is in memoryless is (6/7)^2. this tells you how unlikely you getting screwed and never getting that I was. The probability of getting your first i on the nth piece in memoryles is (1/7)*(6/7)^(n-1) The probabilities of having n pieces between the 2 Is in two bags is ct's table, with n being 0 for the first entry and 12 for the last. The probabilities for getting an I on the nth piece in bag without ANY prior knowledge to go on are my numbers. If you know your position in the bag, and if the I piece has been dealt, you can use simple calculations to work out the math. there will be seven possible positions with equal odds, from n+1 to n+7, where n is the number remaining in the bag, if the I has been dealt. Otherwise, there will be seven or less possible positions of equal odds, from 1 to n where n is the remaining pieces in the bag. ct's calculation for history based randomizers is easier than the one to do without knowledge of history, yet both are useful. I'm not up to doubling the number of cases and weighting them properly. Dividing ct's result by 7 does not give an accurate probability except for a small subset of the cases handled.
Zaphod, your terrible math is just reinforcing your title of resident misinformer. For pure random, my objection is not just semantics. Semantics would be objecting to calling an interval of 1 a drought of 0. I would never take issue with that. But no matter what you want to call it you have to calculate it my way. If you don't have that (1/7) in there you're not measuring an interval of length n, you're measuring an interval of length n or greater. It's the same issue farter was having just a few posts up. For 7-bag, you have to consider 2 consecutive bags. There are 7 positions a piece could take in the left bag, and 7 in the right bag. 7*7 = 49 possibilities for a pair of positions. Some quick notation: (L,R) would be the index of the piece in the left and right bags. There's exactly 1 scenario where you get an interval of 1: (7,1) = 1/49 And exactly 2 scenarios where you get an interval of 2: (6,1) & (7,2) = 2/49 And exactly 3 scenarios where you get an interval of 3: (5,1) & (6,2) & (7,3) = 3/49 etc. Hopefully you can see the pattern with that. It is a very simple pattern after all. The probabilities are definitely the values 1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1 divided by 49, and you are just wrong. Don't mistake my calculations for guesswork. They are verified against actual data, and for simple randomizers like 7-bag and pure random tons of people have worked out the probabilities before and these are non-controversial, agreed upon probabilities.
Yes, because you are both making the same mistake. You with probability calculations and him with reading the data.
I dunno exactly what you meant by "I drought" dudes but: Assuming "memoryless" randomizer: the probability that I doesn't appear during N specified trials is (6/7)^N. The probability the first I appears at the trial N is (1/7)*(6/7)^(N-1). Accurate language make things easier (bad english doesn't but i'm working on it ) For the bag randomizer: how exactly does it works (i don't know the rule of these games)
With memoryless, you are 100% correct. Thank you for the voice of reason. 7-bag is: 1 of every piece is put in a bag. Randomly take pieces from the bag 1 at a time. When the bag is empty refill it with 1 of every piece.
ok.. you mean calculating like this? for every starting position, calculate how long it will starve you, and possibility[length]=times_of_being_starved_for[length]/number_of_starting_positions (not quite code) new code: Code: #include <cstdio> #include <cstring> #include <cstdlib> const int maxlen=1048576,maxrng=20; const char * files[]={ "bag7.txt", "roll4.txt", "roll6.txt", "random.txt", "test.txt" }; const int filec=5; int apptbl[256]={0}; bool acptbl[256]={0}; int noirec[maxlen]; int seqlen; unsigned char seq[maxlen],inchar; int pfxsum[maxlen]={0}; FILE *fi; void init() { acptbl['I']=true; acptbl['T']=true; acptbl['O']=true; acptbl['S']=true; acptbl['Z']=true; acptbl['J']=true; acptbl['L']=true; //use file instead of stdout freopen("result_new.txt","w",stdout); } int main() { init(); for(int xhf=0;xhf<filec;xhf++) { fi=fopen(files[xhf],"r"); seqlen=0; //import seq from txt while(true) { fscanf(fi,"%c",&inchar); if(feof(fi)) break; apptbl[inchar]++; if(acptbl[inchar]) { seq[seqlen++]=inchar; } } printf("importing file %s, length=%d\n",files[xhf],seqlen); //sum of chars? for(int xh=0;xh<256;xh++) { if(apptbl[xh]!=0) apptbl[xh]=0; } //build prefix sum for 'I' for(int xhp=0;xhp<seqlen;xhp++) { pfxsum[xhp+1]=pfxsum[xhp]; if(seq[xhp]=='I') pfxsum[xhp+1]+=1; } /* //wrong way...? //for lengths of range for(int xhl=1;xhl<=maxrng;xhl++) { if(xhl>seqlen) break; int noi=0,br=seqlen-xhl+1; for(int xho=0;xho<br;xho++) if(pfxsum[xho+xhl]==pfxsum[xho]) noi++; printf("length %2d, psb of no-I: %8d/%8d\t=%.13f\n",xhl,noi,br,double(noi)/double(br)); //printf("%.13f\n",double(noi)/double(br)); } */ memset(noirec,0,sizeof(noirec)); for(int xho=0;xho<seqlen;xho++) { int bl=xho,br=seqlen,imid; //binary search for the length of longest range starting from xho while(bl<br) { imid=(bl+br)>>1; if(pfxsum[imid]==pfxsum[xho]) bl=imid+1; else br=imid; } if(pfxsum[br]!=pfxsum[xho]) br--; noirec[br-xho]++; } for(int xhl=0;xhl<=seqlen;xhl++) //print { if(noirec[xhl]!=0) printf("length %2d, psb: %8d/%8d = %.13f\n", xhl,noirec[xhl],seqlen,double(noirec[xhl])/double(seqlen)); } //finish printf("---\n\n"); fclose(fi); } printf("finished: %d files\n",filec); //system("pause"); return 0; } result: Code: importing file bag7.txt, length=1000000 length 0, psb: 142857/ 1000000 = 0.1428570000000 length 1, psb: 139986/ 1000000 = 0.1399860000000 length 2, psb: 134277/ 1000000 = 0.1342770000000 length 3, psb: 125575/ 1000000 = 0.1255750000000 length 4, psb: 113913/ 1000000 = 0.1139130000000 length 5, psb: 99161/ 1000000 = 0.0991610000000 length 6, psb: 81469/ 1000000 = 0.0814690000000 length 7, psb: 61128/ 1000000 = 0.0611280000000 length 8, psb: 43713/ 1000000 = 0.0437130000000 length 9, psb: 28948/ 1000000 = 0.0289480000000 length 10, psb: 17307/ 1000000 = 0.0173070000000 length 11, psb: 8777/ 1000000 = 0.0087770000000 length 12, psb: 2889/ 1000000 = 0.0028890000000 --- importing file roll4.txt, length=1000000 length 0, psb: 143096/ 1000000 = 0.1430960000000 length 1, psb: 139565/ 1000000 = 0.1395650000000 length 2, psb: 136025/ 1000000 = 0.1360250000000 length 3, psb: 132653/ 1000000 = 0.1326530000000 length 4, psb: 129333/ 1000000 = 0.1293330000000 length 5, psb: 91820/ 1000000 = 0.0918200000000 length 6, psb: 65303/ 1000000 = 0.0653030000000 length 7, psb: 46557/ 1000000 = 0.0465570000000 length 8, psb: 33217/ 1000000 = 0.0332170000000 length 9, psb: 23649/ 1000000 = 0.0236490000000 length 10, psb: 16852/ 1000000 = 0.0168520000000 length 11, psb: 12048/ 1000000 = 0.0120480000000 length 12, psb: 8583/ 1000000 = 0.0085830000000 length 13, psb: 6067/ 1000000 = 0.0060670000000 length 14, psb: 4325/ 1000000 = 0.0043250000000 length 15, psb: 3102/ 1000000 = 0.0031020000000 length 16, psb: 2257/ 1000000 = 0.0022570000000 length 17, psb: 1579/ 1000000 = 0.0015790000000 length 18, psb: 1140/ 1000000 = 0.0011400000000 length 19, psb: 828/ 1000000 = 0.0008280000000 length 20, psb: 589/ 1000000 = 0.0005890000000 length 21, psb: 424/ 1000000 = 0.0004240000000 length 22, psb: 309/ 1000000 = 0.0003090000000 length 23, psb: 212/ 1000000 = 0.0002120000000 length 24, psb: 138/ 1000000 = 0.0001380000000 length 25, psb: 98/ 1000000 = 0.0000980000000 length 26, psb: 67/ 1000000 = 0.0000670000000 length 27, psb: 47/ 1000000 = 0.0000470000000 length 28, psb: 35/ 1000000 = 0.0000350000000 length 29, psb: 27/ 1000000 = 0.0000270000000 length 30, psb: 20/ 1000000 = 0.0000200000000 length 31, psb: 10/ 1000000 = 0.0000100000000 length 32, psb: 7/ 1000000 = 0.0000070000000 length 33, psb: 6/ 1000000 = 0.0000060000000 length 34, psb: 3/ 1000000 = 0.0000030000000 length 35, psb: 3/ 1000000 = 0.0000030000000 length 36, psb: 2/ 1000000 = 0.0000020000000 length 37, psb: 1/ 1000000 = 0.0000010000000 length 38, psb: 1/ 1000000 = 0.0000010000000 length 39, psb: 1/ 1000000 = 0.0000010000000 length 40, psb: 1/ 1000000 = 0.0000010000000 --- importing file roll6.txt, length=1000000 length 0, psb: 142960/ 1000000 = 0.1429600000000 length 1, psb: 141762/ 1000000 = 0.1417620000000 length 2, psb: 140501/ 1000000 = 0.1405010000000 length 3, psb: 139310/ 1000000 = 0.1393100000000 length 4, psb: 138158/ 1000000 = 0.1381580000000 length 5, psb: 94251/ 1000000 = 0.0942510000000 length 6, psb: 64260/ 1000000 = 0.0642600000000 length 7, psb: 43887/ 1000000 = 0.0438870000000 length 8, psb: 29976/ 1000000 = 0.0299760000000 length 9, psb: 20468/ 1000000 = 0.0204680000000 length 10, psb: 14065/ 1000000 = 0.0140650000000 length 11, psb: 9572/ 1000000 = 0.0095720000000 length 12, psb: 6564/ 1000000 = 0.0065640000000 length 13, psb: 4529/ 1000000 = 0.0045290000000 length 14, psb: 3117/ 1000000 = 0.0031170000000 length 15, psb: 2133/ 1000000 = 0.0021330000000 length 16, psb: 1433/ 1000000 = 0.0014330000000 length 17, psb: 1013/ 1000000 = 0.0010130000000 length 18, psb: 661/ 1000000 = 0.0006610000000 length 19, psb: 447/ 1000000 = 0.0004470000000 length 20, psb: 307/ 1000000 = 0.0003070000000 length 21, psb: 220/ 1000000 = 0.0002200000000 length 22, psb: 149/ 1000000 = 0.0001490000000 length 23, psb: 98/ 1000000 = 0.0000980000000 length 24, psb: 61/ 1000000 = 0.0000610000000 length 25, psb: 42/ 1000000 = 0.0000420000000 length 26, psb: 24/ 1000000 = 0.0000240000000 length 27, psb: 14/ 1000000 = 0.0000140000000 length 28, psb: 7/ 1000000 = 0.0000070000000 length 29, psb: 3/ 1000000 = 0.0000030000000 length 30, psb: 2/ 1000000 = 0.0000020000000 length 31, psb: 2/ 1000000 = 0.0000020000000 length 32, psb: 1/ 1000000 = 0.0000010000000 length 33, psb: 1/ 1000000 = 0.0000010000000 length 34, psb: 1/ 1000000 = 0.0000010000000 length 35, psb: 1/ 1000000 = 0.0000010000000 --- importing file random.txt, length=1000000 length 0, psb: 142990/ 1000000 = 0.1429900000000 length 1, psb: 122466/ 1000000 = 0.1224660000000 length 2, psb: 104965/ 1000000 = 0.1049650000000 length 3, psb: 90015/ 1000000 = 0.0900150000000 length 4, psb: 77172/ 1000000 = 0.0771720000000 length 5, psb: 66088/ 1000000 = 0.0660880000000 length 6, psb: 56716/ 1000000 = 0.0567160000000 length 7, psb: 48648/ 1000000 = 0.0486480000000 length 8, psb: 41674/ 1000000 = 0.0416740000000 length 9, psb: 35737/ 1000000 = 0.0357370000000 length 10, psb: 30714/ 1000000 = 0.0307140000000 length 11, psb: 26265/ 1000000 = 0.0262650000000 length 12, psb: 22474/ 1000000 = 0.0224740000000 length 13, psb: 19286/ 1000000 = 0.0192860000000 length 14, psb: 16496/ 1000000 = 0.0164960000000 length 15, psb: 14082/ 1000000 = 0.0140820000000 length 16, psb: 12071/ 1000000 = 0.0120710000000 length 17, psb: 10364/ 1000000 = 0.0103640000000 length 18, psb: 8868/ 1000000 = 0.0088680000000 length 19, psb: 7574/ 1000000 = 0.0075740000000 length 20, psb: 6426/ 1000000 = 0.0064260000000 length 21, psb: 5488/ 1000000 = 0.0054880000000 length 22, psb: 4709/ 1000000 = 0.0047090000000 length 23, psb: 4057/ 1000000 = 0.0040570000000 length 24, psb: 3473/ 1000000 = 0.0034730000000 length 25, psb: 2982/ 1000000 = 0.0029820000000 length 26, psb: 2563/ 1000000 = 0.0025630000000 length 27, psb: 2198/ 1000000 = 0.0021980000000 length 28, psb: 1901/ 1000000 = 0.0019010000000 length 29, psb: 1623/ 1000000 = 0.0016230000000 length 30, psb: 1402/ 1000000 = 0.0014020000000 length 31, psb: 1235/ 1000000 = 0.0012350000000 length 32, psb: 1052/ 1000000 = 0.0010520000000 length 33, psb: 897/ 1000000 = 0.0008970000000 length 34, psb: 778/ 1000000 = 0.0007780000000 length 35, psb: 655/ 1000000 = 0.0006550000000 length 36, psb: 554/ 1000000 = 0.0005540000000 length 37, psb: 468/ 1000000 = 0.0004680000000 length 38, psb: 396/ 1000000 = 0.0003960000000 length 39, psb: 344/ 1000000 = 0.0003440000000 length 40, psb: 298/ 1000000 = 0.0002980000000 length 41, psb: 257/ 1000000 = 0.0002570000000 length 42, psb: 224/ 1000000 = 0.0002240000000 length 43, psb: 195/ 1000000 = 0.0001950000000 length 44, psb: 171/ 1000000 = 0.0001710000000 length 45, psb: 150/ 1000000 = 0.0001500000000 length 46, psb: 123/ 1000000 = 0.0001230000000 length 47, psb: 108/ 1000000 = 0.0001080000000 length 48, psb: 91/ 1000000 = 0.0000910000000 length 49, psb: 79/ 1000000 = 0.0000790000000 length 50, psb: 66/ 1000000 = 0.0000660000000 length 51, psb: 55/ 1000000 = 0.0000550000000 length 52, psb: 47/ 1000000 = 0.0000470000000 length 53, psb: 35/ 1000000 = 0.0000350000000 length 54, psb: 31/ 1000000 = 0.0000310000000 length 55, psb: 28/ 1000000 = 0.0000280000000 length 56, psb: 23/ 1000000 = 0.0000230000000 length 57, psb: 20/ 1000000 = 0.0000200000000 length 58, psb: 17/ 1000000 = 0.0000170000000 length 59, psb: 16/ 1000000 = 0.0000160000000 length 60, psb: 13/ 1000000 = 0.0000130000000 length 61, psb: 12/ 1000000 = 0.0000120000000 length 62, psb: 11/ 1000000 = 0.0000110000000 length 63, psb: 9/ 1000000 = 0.0000090000000 length 64, psb: 6/ 1000000 = 0.0000060000000 length 65, psb: 3/ 1000000 = 0.0000030000000 length 66, psb: 3/ 1000000 = 0.0000030000000 length 67, psb: 3/ 1000000 = 0.0000030000000 length 68, psb: 3/ 1000000 = 0.0000030000000 length 69, psb: 2/ 1000000 = 0.0000020000000 length 70, psb: 2/ 1000000 = 0.0000020000000 length 71, psb: 2/ 1000000 = 0.0000020000000 length 72, psb: 2/ 1000000 = 0.0000020000000 length 73, psb: 2/ 1000000 = 0.0000020000000 length 74, psb: 2/ 1000000 = 0.0000020000000 length 75, psb: 2/ 1000000 = 0.0000020000000 length 76, psb: 2/ 1000000 = 0.0000020000000 length 77, psb: 2/ 1000000 = 0.0000020000000 length 78, psb: 2/ 1000000 = 0.0000020000000 length 79, psb: 2/ 1000000 = 0.0000020000000 length 80, psb: 2/ 1000000 = 0.0000020000000 length 81, psb: 2/ 1000000 = 0.0000020000000 length 82, psb: 2/ 1000000 = 0.0000020000000 length 83, psb: 2/ 1000000 = 0.0000020000000 length 84, psb: 1/ 1000000 = 0.0000010000000 length 85, psb: 1/ 1000000 = 0.0000010000000 length 86, psb: 1/ 1000000 = 0.0000010000000 length 87, psb: 1/ 1000000 = 0.0000010000000 length 88, psb: 1/ 1000000 = 0.0000010000000 length 89, psb: 1/ 1000000 = 0.0000010000000 length 90, psb: 1/ 1000000 = 0.0000010000000 --- importing file test.txt, length=7 length 0, psb: 1/ 7 = 0.1428571428571 length 1, psb: 1/ 7 = 0.1428571428571 length 2, psb: 1/ 7 = 0.1428571428571 length 3, psb: 1/ 7 = 0.1428571428571 length 4, psb: 1/ 7 = 0.1428571428571 length 5, psb: 1/ 7 = 0.1428571428571 length 6, psb: 1/ 7 = 0.1428571428571 --- finished: 5 files suppose that there is an I right after the given sequence..
I took some time to think through it. Even though I am awful with probabilities, I have been able to come up with what seems to be a reasonable model for calculating probabilistic values for starvation length (or maybe not ). I don't know how much my values agree with CT, since I have only done it for a much more simple case. My method is lengthy and perhaps some parts of it may be superfluous? ... I don't know really. In any case at any given time in the game, what we want to do is calculate the probability distribution for various drought lengths (for any given piece, say I). That is at any given time: P(0) - I is the very next piece P(1) - The very next I comes after a delay of exactly 1 P(2) - The very next I comes after a delay of exactly 2 ... and so on. Obviously the delay is only calculated in terms of pieces and hence is just a discrete number. First to get some clarity lets see that for a 7n-bag in general, the maximum drought length is 12n. In other words at any given point in the game P(x)=0 for all x greater than 12n. While in general there will be a probability distribution for values of x between 0 and 12n. For example, for a 7-bag randomizer no matter what place you are in a game, you can confidently say that you won't be starved for an I (or any other piece for that matter) any more than 6 other pieces. Similarly for a 14-bag this number would be 12 and so on. Since bags are completely symmetric, I think that any calculation you do for one particular piece should also apply for all others. In general for example, it is perfectly reasonable that maximum drought lengths might be different for different pieces in a given randomizer. But this post isn't about those randomizers. For a memoryless randomizer there is no upper bound on the drought length. That is the probability values never fall to 0 after a given interval. This is in contrast to bag randomizer. Now coming to states, given 7 pieces, a 7-bag randomizer has a total of 127 states (=2^7-1). In general any 7n-bag randomizer (with 7 pieces) has a total of (n+1)^7-1 states. Now as a simple model lets take a randomizer with just 3 pieces 1,2 and 3. First lets draw the transition diagram for a randomizer that works like a 3-bag for these pieces. Firstly the number of states in the diagram are just 7 (=2^3-1). The states are: first level: (blank) second level: (1), (2), (3) third level: (12), (13), (23) We make the transition diagram by first writing down the states (each level we go from left to right). So we connect the first level with the second level using arcs/transitions and marking the pieces on the transitions. Similarly we connect the second level to the third level and finally the third level back to the first level. To read a sequence of pieces just follow a path on the transition diagram while noting down the pieces marked on the transitions. The probabilities for the transitions are simply 1/(multiplicity of the state). I am assuming there is no preference for one particular transition over the other. First we calculate the normalized probabilities for the states. I just arrived at the method by trial and error so I don't have a rigorous justification . All I can say is that it makes sense in terms of the frequency of the states in the long run (no matter which state one starts from). SP just stands for "stable probabilities" in the sense that how would frequencies would be expected to settle in the long run. SP(blank) = SP(12) + SP(13) + SP(23) SP(1) = (1/3)*SP(blank) SP(2) = (1/3)*SP(blank) SP(3) = (1/3)*SP(blank) SP(12) = (1/2)*SP(1) + (1/2)*SP(2) SP(13) = (1/2)*SP(1) + (1/2)*SP(3) SP(23) = (1/2)*SP(2) + (1/2)*SP(3) Adding all these probabilities and normalizing them to 1 we would get SP(blank) as 1/3 and all the other probabilities as 1/9. Now glancing at the posts, Zaphod also has a point with respect to how in a specific game at a specific point we can say much more about the probabilities of a drought. However, I think CT's analysis was directed towards more or less a generic result without assuming anything (we can roughly say long term equilibrium). Now lets assume 1 to be the I-piece. Note that the maximum drought length here is 4. Here are the calculated results that can be deduced from the complete transition diagram: For (blank) state: P(0) = 1/3; P(1) = 1/3; P(2) = 1/3; P(3) = 0; P(4) = 0 For (1): P(0) = 0; P(1) = 0; P(2) = 1/3; P(3) = 1/3; P(4) = 1/3 For (2) and (3): P(0) = 1/2; P(1) = 1/2; P(2) = 0; P(3) = 0; P(4) = 0 For (12) and (13): P(0) = 0; P(1) = 1/3; P(2) = 1/3; P(3) = 1/3; P(4) = 0 For (23): P(0) = 1; P(1) = 0; P(2) = 0; P(3) = 0; P(4) = 0 If we combine these results with the "stable probabilities" SP we calculated above we get the generic values for P(0), P(1), P(2), P(3), P(4) without any assumption about the state we are in. For me these values turn out to be: P(0) = 9/27; P(1) = 8/27; P(2) = 6/27; P(3) = 3/27; P(4) = 1/27 Now if we try to calculate the probability that, in the long run, at any arbitrary time what would be the chance (without assuming anything about our position in the randomizer) that the very next piece will be 1. Given the symmetry of the bag, one would expect this value to be exactly 1/3, which is reflected in the value of P(0) above. Now it would be completely impossible to carry out this analysis by hand for a 7-bag, which is true. But if we informally use the idea of composite states, it might be possible to carry this out. I carried the above analysis using composite states informally (reducing the states to 5 and altering the probabilities) and got the same results. In 7-bag of course, the state reduction would be enormous, although it still might be too tedious. Using nCk to calculate subsets of given size can also help to reduce the tedium. I haven't done it though. edit: corrections
Let D be the distance between 2 successive I pieces. Just to make it absolutely clear: Code: I I |<->| D=1 I O I |<----->| D=2 I J Z I |<--------->| D=3 etc. For the 7-bag random generator, the probability mass function for D is given as: Code: P(D=d) = { d/49, for d= 1, 2, ... , 7, (14-d)/49, for d= 8, 9, ... , 13. That's it; we're done. This is the most useful thing to know about 7-bag drought probabilities, and is exactly what colour_thief said it was. Qualitatively, the pmf tells us that the probabilities decrease linearly and symmetrically about the expected value. Even at a glance, we have learnt something profound about the character of the randomizer. Taking a look at the only thing that Zaphod77 has written that is actually correct, and isn't merely a restatement of something previously written by colour_thief: I haven't really looked at his code, but I think farter was trying to do something very similar. What the above probabilities are effectively telling us is that, if you picked a piece at random from a 7-bag sequence, then the probability that it is an I is 1/7. That's hardly surprising, considering the 7-bag is guaranteed to give out an I a seventh of the time, even over the relatively short term. Further, if you picked n consecutive pieces at random from the sequence, then the probability that only the nth piece is an I decreases as n increases. In other words, the larger the n, the less likely you are to get a long string of n-1 pieces without a single I. Again, this is intuitively obvious, and will generally hold regardless of choice of randomizer. Qualitatively speaking, these figures tell us nothing that we could not have intuited for ourselves. It is also worth noting that the probability distribution Zaphod77 has given can be more concisely defined as: Code: P(X=x)= (1/7) * P(D>=x), for x= 1, 2, ... , 13 Where the I that you're waiting for is the Xth piece you receive. For example: The probability that you get it 11th is: Code: P(X=11)= (1/7) * P(D>=11) = (1/7) * ( P(D=11) + P(D=12) + P(D=13) ) = (1/7) * ( (3/49) + (2/49) + (1/49) ) = (1/7) * (6/49) (compare this with the figure Zaphod77 gave, quoted above). You will have noticed that the useful quantitative information contained in Zaphod77's distribution is a direct result of the pmf for D. How he managed to get those figures after so emphatically insisting colour_thief's drought distribution "couldn't be more wrong", is beyond me...
Okay this made it clearer for me at what CT was calculating. I thought it was some other quantity before. Anyway, interestingly, I did one more calculation for the 3-piece bag style randomizer I analyzed above. The probability values that a 1 will occur right after a 1 has occurred just now are as follows: P(0) = 1/9; P(1) = 2/9; P(2) = 3/9; P(3) = 2/9; P(4) = 1/9 The derivations is based on the fact that using stable probability values I calculated above (1/3 for (blank) and 1/9 for all others) as soon as a 1 occurs, the new state probability distribution becomes (note that these values are obtained after normalization): SP(blank) = 1/3 SP(1) = 1/3 SP(12) = 1/6 SP(13) = 1/6 The values are 0 for the other states since there is no incoming 1 transition into them.