Randomizer Theory

Thread in 'Research & Development' started by colour_thief, 12 Jan 2007.

  1. I can try and salvage what information I have from the randomizer theory work we did, if you want. (I also have his original charts for analyzing the history-type randomizers.)
     
  2. i think colour_thief is still calculating something wrong...
    about the possibility of lacking I in a range of a specified length...

    a program to calculate this: (works for GNU C++)
    Code:
    #include <cstdio>
    #include <cstdlib>
    const int maxlen=1048576,maxrng=14;
    const char * files[]={
    	"bag7.txt",
    	"roll4.txt",
    	"roll6.txt",
    	"random.txt",
    	"test.txt"
    };
    const int filec=5;
    int apptbl[256]={0};
    bool acptbl[256]={0};
    int seqlen;
    unsigned char seq[maxlen],inchar;
    int pfxsum[maxlen]={0};
    FILE *fi;
    
    void init()
    {
    	acptbl['I']=true;
    	acptbl['T']=true;
    	acptbl['O']=true;
    	acptbl['S']=true;
    	acptbl['Z']=true;
    	acptbl['J']=true;
    	acptbl['L']=true;
    	//use file instead of stdout
    	freopen("result.txt","w",stdout);
    }
    int main()
    {
    	init();
    	for(int xhf=0;xhf<filec;xhf++)
    	{
    		fi=fopen(files[xhf],"r");
    		seqlen=0;
    		//import seq from txt
    		while(true)
    		{
    			fscanf(fi,"%c",&inchar);
    			if(feof(fi))
    				break;
    			apptbl[inchar]++;
    			if(acptbl[inchar])
    			{
    				seq[seqlen++]=inchar;
    			}
    		}
    		printf("importing file %s, length=%d\n",files[xhf],seqlen);
    		//sum of chars?
    		for(int xh=0;xh<256;xh++)
    		{
    			if(apptbl[xh]!=0)
    				printf("char %d appears %d\n",xh,apptbl[xh]);
    			apptbl[xh]=0;
    		}
    		//build prefix sum for 'I'
    		for(int xhp=0;xhp<seqlen;xhp++)
    		{
    			pfxsum[xhp+1]=pfxsum[xhp];
    			if(seq[xhp]=='I')
    				pfxsum[xhp+1]+=1;
    		}
    		//for lengths of range
    		for(int xhl=1;xhl<=maxrng;xhl++)
    		{
    			if(xhl>seqlen)
    				break;
    			int noi=0,br=seqlen-xhl+1;
    			for(int xho=0;xho<br;xho++)
    				if(pfxsum[xho+xhl]==pfxsum[xho])
    					noi++;
    			printf("length %d, psb of no-I:\t%8d/%8d\t=%.13f\n",xhl,noi,br,double(noi)/double(br));
    		}
    		//finish
    		printf("---\n\n");
    		fclose(fi);
    	}
    	printf("finished: %d files\n",filec);
    	//system("pause");
    	return 0;
    }
    
    i used 5 files.
    4 of them generated by nullpomino sequencer, with seed of 0, length of 1,000,000.
    first line of comment removed.
    Code:
    bag7
    roll4
    roll6
    memoryless random
    
    the rest one is a test of correctness:
    Code:
    ITOSZJL
    here is the result:
    Code:
    importing file bag7.txt, length=1000000
    char 10 appears 16666
    char 73 appears 142857
    char 74 appears 142857
    char 76 appears 142857
    char 79 appears 142858
    char 83 appears 142857
    char 84 appears 142857
    char 90 appears 142857
    length 1, psb of no-I:	  857143/ 1000000	=0.8571430000000
    length 2, psb of no-I:	  717157/  999999	=0.7171577171577
    length 3, psb of no-I:	  582880/  999998	=0.5828811657623
    length 4, psb of no-I:	  457305/  999997	=0.4573063719191
    length 5, psb of no-I:	  343392/  999996	=0.3433933735735
    length 6, psb of no-I:	  244231/  999995	=0.2442322211611
    length 7, psb of no-I:	  162762/  999994	=0.1627629765779
    length 8, psb of no-I:	  101634/  999993	=0.1016347114430
    length 9, psb of no-I:	   57921/  999992	=0.0579214633717
    length 10, psb of no-I:	   28973/  999991	=0.0289732607593
    length 11, psb of no-I:	   11666/  999990	=0.0116661166612
    length 12, psb of no-I:	    2889/  999989	=0.0028890317793
    length 13, psb of no-I:	       0/  999988	=0.0000000000000
    length 14, psb of no-I:	       0/  999987	=0.0000000000000
    length 15, psb of no-I:	       0/  999986	=0.0000000000000
    length 16, psb of no-I:	       0/  999985	=0.0000000000000
    length 17, psb of no-I:	       0/  999984	=0.0000000000000
    length 18, psb of no-I:	       0/  999983	=0.0000000000000
    length 19, psb of no-I:	       0/  999982	=0.0000000000000
    length 20, psb of no-I:	       0/  999981	=0.0000000000000
    ---
    
    importing file roll4.txt, length=1000000
    char 10 appears 16666
    char 73 appears 143096
    char 74 appears 143140
    char 76 appears 142108
    char 79 appears 143109
    char 83 appears 142367
    char 84 appears 143350
    char 90 appears 142830
    length 1, psb of no-I:	  856904/ 1000000	=0.8569040000000
    length 2, psb of no-I:	  717339/  999999	=0.7173397173397
    length 3, psb of no-I:	  581314/  999998	=0.5813151626303
    length 4, psb of no-I:	  448661/  999997	=0.4486623459870
    length 5, psb of no-I:	  319328/  999996	=0.3193292773171
    length 6, psb of no-I:	  227508/  999995	=0.2275091375457
    length 7, psb of no-I:	  162205/  999994	=0.1622059732358
    length 8, psb of no-I:	  115648/  999993	=0.1156488095417
    length 9, psb of no-I:	   82431/  999992	=0.0824316594533
    length 10, psb of no-I:	   58782/  999991	=0.0587825290428
    length 11, psb of no-I:	   41930/  999990	=0.0419304193042
    length 12, psb of no-I:	   29882/  999989	=0.0298823287056
    length 13, psb of no-I:	   21299/  999988	=0.0212992555911
    length 14, psb of no-I:	   15232/  999987	=0.0152321980186
    length 15, psb of no-I:	   10907/  999986	=0.0109071527001
    length 16, psb of no-I:	    7805/  999985	=0.0078051170768
    length 17, psb of no-I:	    5548/  999984	=0.0055480887694
    length 18, psb of no-I:	    3969/  999983	=0.0039690674741
    length 19, psb of no-I:	    2829/  999982	=0.0028290509229
    length 20, psb of no-I:	    2001/  999981	=0.0020010380197
    ---
    
    importing file roll6.txt, length=1000000
    char 10 appears 16666
    char 73 appears 142960
    char 74 appears 143041
    char 76 appears 142474
    char 79 appears 142946
    char 83 appears 142642
    char 84 appears 143139
    char 90 appears 142798
    length 1, psb of no-I:	  857040/ 1000000	=0.8570400000000
    length 2, psb of no-I:	  715278/  999999	=0.7152787152787
    length 3, psb of no-I:	  574777/  999998	=0.5747781495563
    length 4, psb of no-I:	  435467/  999997	=0.4354683064049
    length 5, psb of no-I:	  297309/  999996	=0.2973101892408
    length 6, psb of no-I:	  203058/  999995	=0.2030590152951
    length 7, psb of no-I:	  138798/  999994	=0.1387988327930
    length 8, psb of no-I:	   94911/  999993	=0.0949116643817
    length 9, psb of no-I:	   64935/  999992	=0.0649355194842
    length 10, psb of no-I:	   44467/  999991	=0.0444674002066
    length 11, psb of no-I:	   30402/  999990	=0.0304023040230
    length 12, psb of no-I:	   20830/  999989	=0.0208302291325
    length 13, psb of no-I:	   14266/  999988	=0.0142661711941
    length 14, psb of no-I:	    9737/  999987	=0.0097371265826
    length 15, psb of no-I:	    6620/  999986	=0.0066200926813
    length 16, psb of no-I:	    4487/  999985	=0.0044870673060
    length 17, psb of no-I:	    3054/  999984	=0.0030540488648
    length 18, psb of no-I:	    2041/  999983	=0.0020410346976
    length 19, psb of no-I:	    1380/  999982	=0.0013800248404
    length 20, psb of no-I:	     933/  999981	=0.0009330177273
    ---
    
    importing file random.txt, length=1000000
    char 10 appears 16666
    char 73 appears 142990
    char 74 appears 143828
    char 76 appears 142021
    char 79 appears 142852
    char 83 appears 142569
    char 84 appears 143103
    char 90 appears 142637
    length 1, psb of no-I:	  857010/ 1000000	=0.8570100000000
    length 2, psb of no-I:	  734544/  999999	=0.7345447345447
    length 3, psb of no-I:	  629579/  999998	=0.6295802591605
    length 4, psb of no-I:	  539564/  999997	=0.5395656186969
    length 5, psb of no-I:	  462392/  999996	=0.4623938495754
    length 6, psb of no-I:	  396304/  999995	=0.3963059815299
    length 7, psb of no-I:	  339588/  999994	=0.3395900375402
    length 8, psb of no-I:	  290940/  999993	=0.2909420365943
    length 9, psb of no-I:	  249266/  999992	=0.2492679941440
    length 10, psb of no-I:	  213529/  999991	=0.2135309217783
    length 11, psb of no-I:	  182815/  999990	=0.1828168281683
    length 12, psb of no-I:	  156550/  999989	=0.1565517220689
    length 13, psb of no-I:	  134076/  999988	=0.1340776089313
    length 14, psb of no-I:	  114790/  999987	=0.1147914922894
    length 15, psb of no-I:	   98294/  999986	=0.0982953761353
    length 16, psb of no-I:	   84212/  999985	=0.0842132631989
    length 17, psb of no-I:	   72141/  999984	=0.0721421542745
    length 18, psb of no-I:	   61777/  999983	=0.0617780502269
    length 19, psb of no-I:	   52909/  999982	=0.0529099523791
    length 20, psb of no-I:	   45335/  999981	=0.0453358613814
    ---
    
    importing file test.txt, length=7
    char 73 appears 1
    char 74 appears 1
    char 76 appears 1
    char 79 appears 1
    char 83 appears 1
    char 84 appears 1
    char 90 appears 1
    length 1, psb of no-I:	       6/       7	=0.8571428571429
    length 2, psb of no-I:	       5/       6	=0.8333333333333
    length 3, psb of no-I:	       4/       5	=0.8000000000000
    length 4, psb of no-I:	       3/       4	=0.7500000000000
    length 5, psb of no-I:	       2/       3	=0.6666666666667
    length 6, psb of no-I:	       1/       2	=0.5000000000000
    length 7, psb of no-I:	       0/       1	=0.0000000000000
    ---
    
    finished: 5 files
    
    [​IMG]
    for bag7, it is about 0.289%, far from 1/(7*7), but quite close to 1/(7*7*7).
    for roll4, it turns to be 2.988%.
    about 10 times of bag7........
    and for memoryless, it remains 11% for length of 14...
    nothing seems to be wrong in the correctness test.

    idea? :confused:
     
    Last edited: 22 Nov 2012
  3. Yes, that would be cool.
     
  4. That's more than a jump... your probability is adding up to more than 100% because of it! I'm not familiar with Java but I would expect those variables are getting corrupted or something. More than likely your data is fine, there's just a program issue.

    My curves aren't based on data, just theoretical probability calculations, but if I can help you with anything just let me know.

    Thanks for the vote of confidence. :p

    You are calculating something different than me. Hopefully once I explain the difference you will see how my calculation is more useful.

    You are testing n consecutive pieces for if they are I. So for n=1, you are looking at every piece in the entire sequence and checking if it is I. For any normal randomizer, there is no I 6/7 = ~85.7% of the time. Look at your data: Every single randomizer is showing you ~85.7% for n=1!

    I am testing for a complete interval of size n. So for example, you would need an I, then n-1 pieces without an I, then a second I on the nth piece. Do this and I am certain your data will show curves like mine. But note also that I am not just testing for I, I am testing for these intervals with all pieces. If you took my probabilities and divide them by 7, you would get the drought probabilities for one specific piece (eg. I).

    This brings me to your observation about the 12 droughts. Because this is the maximum possible 7-bag drought, your program doesn't over-count it the way it does with smaller droughts. For this reason, it lines up with my probabilities. But since you are only considering I (1/7 of the pieces) you have to take my probability and divide by 7. That's why I have 1/7^2 and you have 1/7^3.

    tldr;

    You answer the question:
    "What is the probability that there will be no I piece when examining n consecutive pieces?"

    I answer the question:
    "What is the probability of not obtaining a piece until n pieces later?"
     
    Last edited: 2 Dec 2012
  5. Zaphod77

    Zaphod77 Resident Misinformer

    mm. i thought you had left? :)

    But yeah. probability calculations are often non-intuitive. That's why casinos stay in business. :)

    For memoryless randomizer, the odds of a I drought of X length is very easy to calculate. it's (6/7)^n where n is the drought length.

    For bag, we have to handle multiple cases. first we have the case of starting with a full bag. you can go up to 6 pieces without an I. The probabilities are

    6/7, 5/7, 4/7, 3/7, 2/7, 1/7, and 0/7 for droughts of 1 through 7. This is very intuitive.

    There are twelve other possible cases to consider for bag, all equally likely. All can be calculated using a more complicated, but still simple method, and then you can work out the general case for each drought length by taking the mean of all 13 possible probabilities for each drought i think. (add all 13 up, divide by 13) Note that drought length greater than 12 is impossible.

    For strict history, the same calculations can be made, but there are only five cases. no i in history, and one i in each position. The same calculation method used for geenral case of bag works.

    For 4 and 6 roll history... well... that's a royal pain to work out. :) Strict history will be close, but not nearly exact. drought possibiliy is greater when we allow history repeats, because repeats make a drought more likely to continue by giving you less pieces in the history. For this case, i believe monte carlo analysis is most useful.
     
  6. You are wrong. It's (1/7)*(6/7)^(n-1). Dude this was the simplest one.

    Wrong again. For a 7n-bag randomizer, there are exactly n(7n choose n)^2 cases to consider. For 7-bag (n=1) this means 49 cases. 7-bag is super easy to work out, drought intervals 1 through 13 have probabilities 1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1 out of 49 respectively. For n>1 it loses symmetry and gets really tedious in a hurry. For 63-bag (n=9) there are 5,041,435,870,611,492,608,025 cases to consider, and I've worked out the fractions.

    Wrong. Wtf Monte Carlo man. Do you even know what that is? (rhetorical question)
    I've worked it out for reals, and there was no need to randomly sample and approximate.

    Correct! Perhaps the resident misinformer should not speak with authority on this matter. ;)
     
  7. Zaphod77

    Zaphod77 Resident Misinformer

    O_O
    This is a simple case of semantics.
    If you define N as the number of pieces dealt including the N, and define the drought length as the number of non I pieces before the I is dealt, then your formula is correct. Since i was working on different definitions i came up wit a different, and equally correct answer.

    But you can't go 13 pieces without an I in 1 bag. do you consider a drought length of 13 to be 12 without an i?

    I agree my calculation method was probably more complex than needed, but your answer is flat out wrong.
    Under the assumption that you don't know where in the bag you are, and you don't know if you've drawn the I or not, then the odds of your next piece being an I are.. guess what? 1 in 7! for a drought length of zero.

    The probability that you get the i second given those assumptions, is in fact ((48/49)/7)
    The probability that you get it 3rd is ((46/49)/7)
    The probability that you get it 4th is ((43/49)/7)
    The probability that you get it 5th is ((39/49)/7)
    The probability that you get it 6th is ((34/49)/7)
    The probability that you get it 7th is ((28/49)/7)
    The probability that you get it 8th is ((21/49)/7)
    The probability that you get it 9th is ((15/49)/7)
    The probability that you get it 10th is ((10/49)/7)
    The probability that you get it 11th is ((6/49)/7)
    The probability that you get it 12th is ((3/49)/7)
    The probability that you get it 13th is ((1/49)/7)

    Your drought distribution for bag couldn't be more wrong. :)

    How I did this.

    wrote out all possible I positions in double 7 bags.

    Count each drought length from the start of the line. record it

    delete the first chaacter from each line

    repeat the above.

    Do this for the cases of 1-6 draws. the 7 draw case is equivalent to 0 draw case, so further cases are skipped.

    Once we have the drought probabilities for 0 through 6 pieces drawn, we add them up for each length, and divide by 7 (number of cases calculated).

    This is the true drought probabilities for 7 bag.

    And yes, these probabilities add up to 1.

    Nice job showing the distribution of possible sums of 2d7, though. :)

    Well, congrats on working it out for reals, although if your bag calculation is wrong, i'm not sure your tgm calculation is right either. If so i can't correct it though. I would approximate with a large sample at this point because i don't know how to work it otherwise. Also a large monte carlo sample is a good way to check your calculations, and would have revealed the fallacy in the 7 bag calculation you mentioned above (generate a truckload of bags, and test each starting position in the sequence).
    Who's the misinformer? :)
     
    Last edited: 2 Dec 2012
  8. Muf

    Muf

    [​IMG]
     
  9. I like how you quoted the only parts of the post that you actually understand :p
     
  10. Zaphod77

    Zaphod77 Resident Misinformer

    PS: farter's monte carlo agrees with my bag calculations.

    I think i see what ct was trying to get at, but i disagree with his assertion that his calculation is more useful

    For example at the start of the game you haven't recieved an I. So the calculation is useless. That said, this case has a much simpler calculation.

    With memoryless, whether you just got an I or not is irrelevant, so it doesn't affect the odds.

    What if a tetris game decided to start with a random number of pieces already dealt from the bag, to shake thigns up a little a little? Then the calculation I made is much more useful.

    And if you do know the status of the bag, it's much more useful calculating based on all your knowledge, instead of simply the fact that you got an I. If you get an I at the start of the bag, his calculation gives you a 57% chance of getting another I within 7 pieces, when in fact, the odds are only 1 in 7.

    In a history based randomizer, I can see the usefulness of this sort of calculation.

    as for the statement above.

    This is where the error is.

    His numbers are different because he does not assume that the last piece dealt was an I. Dividing your numbers by seven will not give his numbers, but will instead give something adding up to 1/7th. This "i'm considering all 7, and you are only considering I" business is simply not how it goes. My long calculation above duplicates his results, which are definitely not symmetric.

    To sum up

    The probability of getting n pieces without any Is in memoryless is (6/7)^2. this tells you how unlikely you getting screwed and never getting that I was.

    The probability of getting your first i on the nth piece in memoryles is
    (1/7)*(6/7)^(n-1)

    The probabilities of having n pieces between the 2 Is in two bags is ct's table, with n being 0 for the first entry and 12 for the last.

    The probabilities for getting an I on the nth piece in bag without ANY prior knowledge to go on are my numbers.

    If you know your position in the bag, and if the I piece has been dealt, you can use simple calculations to work out the math. there will be seven possible positions with equal odds, from n+1 to n+7, where n is the number remaining in the bag, if the I has been dealt. Otherwise, there will be seven or less possible positions of equal odds, from 1 to n where n is the remaining pieces in the bag.

    ct's calculation for history based randomizers is easier than the one to do without knowledge of history, yet both are useful. I'm not up to doubling the number of cases and weighting them properly.

    Dividing ct's result by 7 does not give an accurate probability except for a small subset of the cases handled.
     
    Last edited: 2 Dec 2012
  11. Zaphod, your terrible math is just reinforcing your title of resident misinformer.

    For pure random, my objection is not just semantics. Semantics would be objecting to calling an interval of 1 a drought of 0. I would never take issue with that. But no matter what you want to call it you have to calculate it my way. If you don't have that (1/7) in there you're not measuring an interval of length n, you're measuring an interval of length n or greater. It's the same issue farter was having just a few posts up.

    For 7-bag, you have to consider 2 consecutive bags. There are 7 positions a piece could take in the left bag, and 7 in the right bag. 7*7 = 49 possibilities for a pair of positions. Some quick notation: (L,R) would be the index of the piece in the left and right bags.

    There's exactly 1 scenario where you get an interval of 1:
    (7,1) = 1/49

    And exactly 2 scenarios where you get an interval of 2:
    (6,1) & (7,2) = 2/49

    And exactly 3 scenarios where you get an interval of 3:
    (5,1) & (6,2) & (7,3) = 3/49

    etc.

    Hopefully you can see the pattern with that. It is a very simple pattern after all. The probabilities are definitely the values 1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1 divided by 49, and you are just wrong.

    Don't mistake my calculations for guesswork. They are verified against actual data, and for simple randomizers like 7-bag and pure random tons of people have worked out the probabilities before and these are non-controversial, agreed upon probabilities.
     
  12. Yes, because you are both making the same mistake. You with probability calculations and him with reading the data.
     
  13. COL

    COL

    I dunno exactly what you meant by "I drought" dudes but:
    Assuming "memoryless" randomizer:
    the probability that I doesn't appear during N specified trials is (6/7)^N.
    The probability the first I appears at the trial N is (1/7)*(6/7)^(N-1).
    Accurate language make things easier (bad english doesn't but i'm working on it ;))
    For the bag randomizer: how exactly does it works (i don't know the rule of these games)
     
  14. With memoryless, you are 100% correct. Thank you for the voice of reason.

    7-bag is: 1 of every piece is put in a bag. Randomly take pieces from the bag 1 at a time. When the bag is empty refill it with 1 of every piece.
     
  15. Nice to see you around here again! :)
     
  16. ok.. you mean calculating like this?
    for every starting position, calculate how long it will starve you, and
    possibility[length]=times_of_being_starved_for[length]/number_of_starting_positions
    :facepalm:(not quite code)

    new code:
    Code:
    #include <cstdio>
    #include <cstring>
    #include <cstdlib>
    const int maxlen=1048576,maxrng=20;
    const char * files[]={
    	"bag7.txt",
    	"roll4.txt",
    	"roll6.txt",
    	"random.txt",
    	"test.txt"
    };
    const int filec=5;
    int apptbl[256]={0};
    bool acptbl[256]={0};
    int noirec[maxlen];
    int seqlen;
    unsigned char seq[maxlen],inchar;
    int pfxsum[maxlen]={0};
    FILE *fi;
    
    void init()
    {
    	acptbl['I']=true;
    	acptbl['T']=true;
    	acptbl['O']=true;
    	acptbl['S']=true;
    	acptbl['Z']=true;
    	acptbl['J']=true;
    	acptbl['L']=true;
    	//use file instead of stdout
    	freopen("result_new.txt","w",stdout);
    }
    int main()
    {
    	init();
    	for(int xhf=0;xhf<filec;xhf++)
    	{
    		fi=fopen(files[xhf],"r");
    		seqlen=0;
    		//import seq from txt
    		while(true)
    		{
    			fscanf(fi,"%c",&inchar);
    			if(feof(fi))
    				break;
    			apptbl[inchar]++;
    			if(acptbl[inchar])
    			{
    				seq[seqlen++]=inchar;
    			}
    		}
    		printf("importing file %s, length=%d\n",files[xhf],seqlen);
    		//sum of chars?
    		for(int xh=0;xh<256;xh++)
    		{
    			if(apptbl[xh]!=0)
    			apptbl[xh]=0;
    		}
    		//build prefix sum for 'I'
    		for(int xhp=0;xhp<seqlen;xhp++)
    		{
    			pfxsum[xhp+1]=pfxsum[xhp];
    			if(seq[xhp]=='I')
    				pfxsum[xhp+1]+=1;
    		}
    		/* //wrong way...?
    		//for lengths of range
    		for(int xhl=1;xhl<=maxrng;xhl++)
    		{
    			if(xhl>seqlen)
    				break;
    			int noi=0,br=seqlen-xhl+1;
    			for(int xho=0;xho<br;xho++)
    				if(pfxsum[xho+xhl]==pfxsum[xho])
    					noi++;
    			printf("length %2d, psb of no-I: %8d/%8d\t=%.13f\n",xhl,noi,br,double(noi)/double(br));
    			//printf("%.13f\n",double(noi)/double(br));
    		}
    		*/
    		memset(noirec,0,sizeof(noirec));
    		for(int xho=0;xho<seqlen;xho++)
    		{
    			int bl=xho,br=seqlen,imid;
    			//binary search for the length of longest range starting from xho 
    			while(bl<br)
    			{
    				imid=(bl+br)>>1;
    				if(pfxsum[imid]==pfxsum[xho])
    					bl=imid+1;
    				else
    					br=imid;
    			}
    			if(pfxsum[br]!=pfxsum[xho])
    				br--;
    			noirec[br-xho]++;
    		}
    		for(int xhl=0;xhl<=seqlen;xhl++) //print
    		{
    			if(noirec[xhl]!=0)
    				printf("length %2d, psb: %8d/%8d = %.13f\n",
    					xhl,noirec[xhl],seqlen,double(noirec[xhl])/double(seqlen));
    		}
    		//finish
    		printf("---\n\n");
    		fclose(fi);
    	}
    	printf("finished: %d files\n",filec);
    	//system("pause");
    	return 0;
    }
    
    result:
    Code:
    importing file bag7.txt, length=1000000
    length  0, psb:   142857/ 1000000 = 0.1428570000000
    length  1, psb:   139986/ 1000000 = 0.1399860000000
    length  2, psb:   134277/ 1000000 = 0.1342770000000
    length  3, psb:   125575/ 1000000 = 0.1255750000000
    length  4, psb:   113913/ 1000000 = 0.1139130000000
    length  5, psb:    99161/ 1000000 = 0.0991610000000
    length  6, psb:    81469/ 1000000 = 0.0814690000000
    length  7, psb:    61128/ 1000000 = 0.0611280000000
    length  8, psb:    43713/ 1000000 = 0.0437130000000
    length  9, psb:    28948/ 1000000 = 0.0289480000000
    length 10, psb:    17307/ 1000000 = 0.0173070000000
    length 11, psb:     8777/ 1000000 = 0.0087770000000
    length 12, psb:     2889/ 1000000 = 0.0028890000000
    ---
    
    importing file roll4.txt, length=1000000
    length  0, psb:   143096/ 1000000 = 0.1430960000000
    length  1, psb:   139565/ 1000000 = 0.1395650000000
    length  2, psb:   136025/ 1000000 = 0.1360250000000
    length  3, psb:   132653/ 1000000 = 0.1326530000000
    length  4, psb:   129333/ 1000000 = 0.1293330000000
    length  5, psb:    91820/ 1000000 = 0.0918200000000
    length  6, psb:    65303/ 1000000 = 0.0653030000000
    length  7, psb:    46557/ 1000000 = 0.0465570000000
    length  8, psb:    33217/ 1000000 = 0.0332170000000
    length  9, psb:    23649/ 1000000 = 0.0236490000000
    length 10, psb:    16852/ 1000000 = 0.0168520000000
    length 11, psb:    12048/ 1000000 = 0.0120480000000
    length 12, psb:     8583/ 1000000 = 0.0085830000000
    length 13, psb:     6067/ 1000000 = 0.0060670000000
    length 14, psb:     4325/ 1000000 = 0.0043250000000
    length 15, psb:     3102/ 1000000 = 0.0031020000000
    length 16, psb:     2257/ 1000000 = 0.0022570000000
    length 17, psb:     1579/ 1000000 = 0.0015790000000
    length 18, psb:     1140/ 1000000 = 0.0011400000000
    length 19, psb:      828/ 1000000 = 0.0008280000000
    length 20, psb:      589/ 1000000 = 0.0005890000000
    length 21, psb:      424/ 1000000 = 0.0004240000000
    length 22, psb:      309/ 1000000 = 0.0003090000000
    length 23, psb:      212/ 1000000 = 0.0002120000000
    length 24, psb:      138/ 1000000 = 0.0001380000000
    length 25, psb:       98/ 1000000 = 0.0000980000000
    length 26, psb:       67/ 1000000 = 0.0000670000000
    length 27, psb:       47/ 1000000 = 0.0000470000000
    length 28, psb:       35/ 1000000 = 0.0000350000000
    length 29, psb:       27/ 1000000 = 0.0000270000000
    length 30, psb:       20/ 1000000 = 0.0000200000000
    length 31, psb:       10/ 1000000 = 0.0000100000000
    length 32, psb:        7/ 1000000 = 0.0000070000000
    length 33, psb:        6/ 1000000 = 0.0000060000000
    length 34, psb:        3/ 1000000 = 0.0000030000000
    length 35, psb:        3/ 1000000 = 0.0000030000000
    length 36, psb:        2/ 1000000 = 0.0000020000000
    length 37, psb:        1/ 1000000 = 0.0000010000000
    length 38, psb:        1/ 1000000 = 0.0000010000000
    length 39, psb:        1/ 1000000 = 0.0000010000000
    length 40, psb:        1/ 1000000 = 0.0000010000000
    ---
    
    importing file roll6.txt, length=1000000
    length  0, psb:   142960/ 1000000 = 0.1429600000000
    length  1, psb:   141762/ 1000000 = 0.1417620000000
    length  2, psb:   140501/ 1000000 = 0.1405010000000
    length  3, psb:   139310/ 1000000 = 0.1393100000000
    length  4, psb:   138158/ 1000000 = 0.1381580000000
    length  5, psb:    94251/ 1000000 = 0.0942510000000
    length  6, psb:    64260/ 1000000 = 0.0642600000000
    length  7, psb:    43887/ 1000000 = 0.0438870000000
    length  8, psb:    29976/ 1000000 = 0.0299760000000
    length  9, psb:    20468/ 1000000 = 0.0204680000000
    length 10, psb:    14065/ 1000000 = 0.0140650000000
    length 11, psb:     9572/ 1000000 = 0.0095720000000
    length 12, psb:     6564/ 1000000 = 0.0065640000000
    length 13, psb:     4529/ 1000000 = 0.0045290000000
    length 14, psb:     3117/ 1000000 = 0.0031170000000
    length 15, psb:     2133/ 1000000 = 0.0021330000000
    length 16, psb:     1433/ 1000000 = 0.0014330000000
    length 17, psb:     1013/ 1000000 = 0.0010130000000
    length 18, psb:      661/ 1000000 = 0.0006610000000
    length 19, psb:      447/ 1000000 = 0.0004470000000
    length 20, psb:      307/ 1000000 = 0.0003070000000
    length 21, psb:      220/ 1000000 = 0.0002200000000
    length 22, psb:      149/ 1000000 = 0.0001490000000
    length 23, psb:       98/ 1000000 = 0.0000980000000
    length 24, psb:       61/ 1000000 = 0.0000610000000
    length 25, psb:       42/ 1000000 = 0.0000420000000
    length 26, psb:       24/ 1000000 = 0.0000240000000
    length 27, psb:       14/ 1000000 = 0.0000140000000
    length 28, psb:        7/ 1000000 = 0.0000070000000
    length 29, psb:        3/ 1000000 = 0.0000030000000
    length 30, psb:        2/ 1000000 = 0.0000020000000
    length 31, psb:        2/ 1000000 = 0.0000020000000
    length 32, psb:        1/ 1000000 = 0.0000010000000
    length 33, psb:        1/ 1000000 = 0.0000010000000
    length 34, psb:        1/ 1000000 = 0.0000010000000
    length 35, psb:        1/ 1000000 = 0.0000010000000
    ---
    
    importing file random.txt, length=1000000
    length  0, psb:   142990/ 1000000 = 0.1429900000000
    length  1, psb:   122466/ 1000000 = 0.1224660000000
    length  2, psb:   104965/ 1000000 = 0.1049650000000
    length  3, psb:    90015/ 1000000 = 0.0900150000000
    length  4, psb:    77172/ 1000000 = 0.0771720000000
    length  5, psb:    66088/ 1000000 = 0.0660880000000
    length  6, psb:    56716/ 1000000 = 0.0567160000000
    length  7, psb:    48648/ 1000000 = 0.0486480000000
    length  8, psb:    41674/ 1000000 = 0.0416740000000
    length  9, psb:    35737/ 1000000 = 0.0357370000000
    length 10, psb:    30714/ 1000000 = 0.0307140000000
    length 11, psb:    26265/ 1000000 = 0.0262650000000
    length 12, psb:    22474/ 1000000 = 0.0224740000000
    length 13, psb:    19286/ 1000000 = 0.0192860000000
    length 14, psb:    16496/ 1000000 = 0.0164960000000
    length 15, psb:    14082/ 1000000 = 0.0140820000000
    length 16, psb:    12071/ 1000000 = 0.0120710000000
    length 17, psb:    10364/ 1000000 = 0.0103640000000
    length 18, psb:     8868/ 1000000 = 0.0088680000000
    length 19, psb:     7574/ 1000000 = 0.0075740000000
    length 20, psb:     6426/ 1000000 = 0.0064260000000
    length 21, psb:     5488/ 1000000 = 0.0054880000000
    length 22, psb:     4709/ 1000000 = 0.0047090000000
    length 23, psb:     4057/ 1000000 = 0.0040570000000
    length 24, psb:     3473/ 1000000 = 0.0034730000000
    length 25, psb:     2982/ 1000000 = 0.0029820000000
    length 26, psb:     2563/ 1000000 = 0.0025630000000
    length 27, psb:     2198/ 1000000 = 0.0021980000000
    length 28, psb:     1901/ 1000000 = 0.0019010000000
    length 29, psb:     1623/ 1000000 = 0.0016230000000
    length 30, psb:     1402/ 1000000 = 0.0014020000000
    length 31, psb:     1235/ 1000000 = 0.0012350000000
    length 32, psb:     1052/ 1000000 = 0.0010520000000
    length 33, psb:      897/ 1000000 = 0.0008970000000
    length 34, psb:      778/ 1000000 = 0.0007780000000
    length 35, psb:      655/ 1000000 = 0.0006550000000
    length 36, psb:      554/ 1000000 = 0.0005540000000
    length 37, psb:      468/ 1000000 = 0.0004680000000
    length 38, psb:      396/ 1000000 = 0.0003960000000
    length 39, psb:      344/ 1000000 = 0.0003440000000
    length 40, psb:      298/ 1000000 = 0.0002980000000
    length 41, psb:      257/ 1000000 = 0.0002570000000
    length 42, psb:      224/ 1000000 = 0.0002240000000
    length 43, psb:      195/ 1000000 = 0.0001950000000
    length 44, psb:      171/ 1000000 = 0.0001710000000
    length 45, psb:      150/ 1000000 = 0.0001500000000
    length 46, psb:      123/ 1000000 = 0.0001230000000
    length 47, psb:      108/ 1000000 = 0.0001080000000
    length 48, psb:       91/ 1000000 = 0.0000910000000
    length 49, psb:       79/ 1000000 = 0.0000790000000
    length 50, psb:       66/ 1000000 = 0.0000660000000
    length 51, psb:       55/ 1000000 = 0.0000550000000
    length 52, psb:       47/ 1000000 = 0.0000470000000
    length 53, psb:       35/ 1000000 = 0.0000350000000
    length 54, psb:       31/ 1000000 = 0.0000310000000
    length 55, psb:       28/ 1000000 = 0.0000280000000
    length 56, psb:       23/ 1000000 = 0.0000230000000
    length 57, psb:       20/ 1000000 = 0.0000200000000
    length 58, psb:       17/ 1000000 = 0.0000170000000
    length 59, psb:       16/ 1000000 = 0.0000160000000
    length 60, psb:       13/ 1000000 = 0.0000130000000
    length 61, psb:       12/ 1000000 = 0.0000120000000
    length 62, psb:       11/ 1000000 = 0.0000110000000
    length 63, psb:        9/ 1000000 = 0.0000090000000
    length 64, psb:        6/ 1000000 = 0.0000060000000
    length 65, psb:        3/ 1000000 = 0.0000030000000
    length 66, psb:        3/ 1000000 = 0.0000030000000
    length 67, psb:        3/ 1000000 = 0.0000030000000
    length 68, psb:        3/ 1000000 = 0.0000030000000
    length 69, psb:        2/ 1000000 = 0.0000020000000
    length 70, psb:        2/ 1000000 = 0.0000020000000
    length 71, psb:        2/ 1000000 = 0.0000020000000
    length 72, psb:        2/ 1000000 = 0.0000020000000
    length 73, psb:        2/ 1000000 = 0.0000020000000
    length 74, psb:        2/ 1000000 = 0.0000020000000
    length 75, psb:        2/ 1000000 = 0.0000020000000
    length 76, psb:        2/ 1000000 = 0.0000020000000
    length 77, psb:        2/ 1000000 = 0.0000020000000
    length 78, psb:        2/ 1000000 = 0.0000020000000
    length 79, psb:        2/ 1000000 = 0.0000020000000
    length 80, psb:        2/ 1000000 = 0.0000020000000
    length 81, psb:        2/ 1000000 = 0.0000020000000
    length 82, psb:        2/ 1000000 = 0.0000020000000
    length 83, psb:        2/ 1000000 = 0.0000020000000
    length 84, psb:        1/ 1000000 = 0.0000010000000
    length 85, psb:        1/ 1000000 = 0.0000010000000
    length 86, psb:        1/ 1000000 = 0.0000010000000
    length 87, psb:        1/ 1000000 = 0.0000010000000
    length 88, psb:        1/ 1000000 = 0.0000010000000
    length 89, psb:        1/ 1000000 = 0.0000010000000
    length 90, psb:        1/ 1000000 = 0.0000010000000
    ---
    
    importing file test.txt, length=7
    length  0, psb:        1/       7 = 0.1428571428571
    length  1, psb:        1/       7 = 0.1428571428571
    length  2, psb:        1/       7 = 0.1428571428571
    length  3, psb:        1/       7 = 0.1428571428571
    length  4, psb:        1/       7 = 0.1428571428571
    length  5, psb:        1/       7 = 0.1428571428571
    length  6, psb:        1/       7 = 0.1428571428571
    ---
    
    finished: 5 files
    
    suppose that there is an I right after the given sequence..
     
  17. I took some time to think through it. Even though I am awful with probabilities, I have been able to come up with what seems to be a reasonable model for calculating probabilistic values for starvation length (or maybe not ^^).

    I don't know how much my values agree with CT, since I have only done it for a much more simple case. My method is lengthy and perhaps some parts of it may be superfluous? ... I don't know really.

    In any case at any given time in the game, what we want to do is calculate the probability distribution for various drought lengths (for any given piece, say I). That is at any given time:
    P(0) - I is the very next piece
    P(1) - The very next I comes after a delay of exactly 1
    P(2) - The very next I comes after a delay of exactly 2
    ... and so on. Obviously the delay is only calculated in terms of pieces and hence is just a discrete number.

    First to get some clarity lets see that for a 7n-bag in general, the maximum drought length is 12n. In other words at any given point in the game P(x)=0 for all x greater than 12n. While in general there will be a probability distribution for values of x between 0 and 12n.

    For example, for a 7-bag randomizer no matter what place you are in a game, you can confidently say that you won't be starved for an I (or any other piece for that matter) any more than 6 other pieces. Similarly for a 14-bag this number would be 12 and so on.

    Since bags are completely symmetric, I think that any calculation you do for one particular piece should also apply for all others. In general for example, it
    is perfectly reasonable that maximum drought lengths might be different for different pieces in a given randomizer. But this post isn't about those randomizers.

    For a memoryless randomizer there is no upper bound on the drought length. That is the probability values never fall to 0 after a given interval. This is in contrast to bag randomizer.

    Now coming to states, given 7 pieces, a 7-bag randomizer has a total of 127 states (=2^7-1). In general any 7n-bag randomizer (with 7 pieces) has a total of (n+1)^7-1 states.

    Now as a simple model lets take a randomizer with just 3 pieces 1,2 and 3. First lets draw the transition diagram for a randomizer that works like a 3-bag for these pieces. Firstly the number of states in the diagram are just 7 (=2^3-1). The states are:
    first level: (blank)
    second level: (1), (2), (3)
    third level: (12), (13), (23)
    We make the transition diagram by first writing down the states (each level we go from left to right). So we connect the first level with the second level using arcs/transitions and marking the pieces on the transitions. Similarly we connect the second level to the third level and finally the third level back to the first level.

    To read a sequence of pieces just follow a path on the transition diagram while noting down the pieces marked on the transitions.

    The probabilities for the transitions are simply 1/(multiplicity of the state). I am assuming there is no preference for one particular transition over the other.

    First we calculate the normalized probabilities for the states. I just arrived at the method by trial and error so I don't have a rigorous justification :). All I can say is that it makes sense in terms of the frequency of the states in the long run (no matter which state one starts from). SP just stands for "stable probabilities" in the sense that how would frequencies would be expected to settle in the long run.

    SP(blank) = SP(12) + SP(13) + SP(23)
    SP(1) = (1/3)*SP(blank)
    SP(2) = (1/3)*SP(blank)
    SP(3) = (1/3)*SP(blank)
    SP(12) = (1/2)*SP(1) + (1/2)*SP(2)
    SP(13) = (1/2)*SP(1) + (1/2)*SP(3)
    SP(23) = (1/2)*SP(2) + (1/2)*SP(3)

    Adding all these probabilities and normalizing them to 1 we would get SP(blank) as 1/3 and all the other probabilities as 1/9.

    Now glancing at the posts, Zaphod also has a point with respect to how in a specific game at a specific point we can say much more about the probabilities of a drought. However, I think CT's analysis was directed towards more or less a generic result without assuming anything (we can roughly say long term equilibrium).

    Now lets assume 1 to be the I-piece. Note that the maximum drought length here is 4. Here are the calculated results that can be deduced from the complete transition diagram:
    For (blank) state:
    P(0) = 1/3; P(1) = 1/3; P(2) = 1/3; P(3) = 0; P(4) = 0

    For (1):
    P(0) = 0; P(1) = 0; P(2) = 1/3; P(3) = 1/3; P(4) = 1/3

    For (2) and (3):
    P(0) = 1/2; P(1) = 1/2; P(2) = 0; P(3) = 0; P(4) = 0

    For (12) and (13):
    P(0) = 0; P(1) = 1/3; P(2) = 1/3; P(3) = 1/3; P(4) = 0

    For (23):
    P(0) = 1; P(1) = 0; P(2) = 0; P(3) = 0; P(4) = 0

    If we combine these results with the "stable probabilities" SP we calculated above we get the generic values for P(0), P(1), P(2), P(3), P(4) without any assumption about the state we are in. For me these values turn out to be:
    P(0) = 9/27; P(1) = 8/27; P(2) = 6/27; P(3) = 3/27; P(4) = 1/27

    Now if we try to calculate the probability that, in the long run, at any arbitrary time what would be the chance (without assuming anything about our position in the randomizer) that the very next piece will be 1. Given the symmetry of the bag, one would expect this value to be exactly 1/3, which is reflected in the value of P(0) above.

    Now it would be completely impossible to carry out this analysis by hand for a 7-bag, which is true. But if we informally use the idea of composite states, it might be possible to carry this out. I carried the above analysis using composite states informally (reducing the states to 5 and altering the probabilities) and got the same results. In 7-bag of course, the state reduction would be enormous, although it still might be too tedious. Using nCk to calculate subsets of given size can also help to reduce the tedium. I haven't done it though.

    edit: corrections
     
    Last edited: 3 Dec 2012
  18. K

    K

    Wohou ! CT is back !! :awe:
     
  19. Edo

    Edo a.k.a. FSY

    Let D be the distance between 2 successive I pieces. Just to make it absolutely clear:
    Code:
    I   I
    |<->|    D=1
    
    I   O   I
    |<----->|    D=2
    
    I   J   Z   I
    |<--------->|    D=3
    
    etc.

    For the 7-bag random generator, the probability mass function for D is given as:
    Code:
    P(D=d) = {  d/49,         for d= 1, 2, ... , 7,
               (14-d)/49,     for d= 8, 9, ... , 13.

    That's it; we're done. This is the most useful thing to know about 7-bag drought probabilities, and is exactly what colour_thief said it was. Qualitatively, the pmf tells us that the probabilities decrease linearly and symmetrically about the expected value. Even at a glance, we have learnt something profound about the character of the randomizer.

    Taking a look at the only thing that Zaphod77 has written that is actually correct, and isn't merely a restatement of something previously written by colour_thief:
    I haven't really looked at his code, but I think farter was trying to do something very similar. What the above probabilities are effectively telling us is that, if you picked a piece at random from a 7-bag sequence, then the probability that it is an I is 1/7. That's hardly surprising, considering the 7-bag is guaranteed to give out an I a seventh of the time, even over the relatively short term. Further, if you picked n consecutive pieces at random from the sequence, then the probability that only the nth piece is an I decreases as n increases. In other words, the larger the n, the less likely you are to get a long string of n-1 pieces without a single I. Again, this is intuitively obvious, and will generally hold regardless of choice of randomizer. Qualitatively speaking, these figures tell us nothing that we could not have intuited for ourselves.

    It is also worth noting that the probability distribution Zaphod77 has given can be more concisely defined as:
    Code:
    P(X=x)= (1/7) * P(D>=x),    for x= 1, 2, ... , 13 
    
    Where the I that you're waiting for is the Xth piece you receive. For example:
    The probability that you get it 11th is:
    Code:
    P(X=11)= (1/7) * P(D>=11)
           = (1/7) * ( P(D=11) + P(D=12) + P(D=13) )
           = (1/7) * ( (3/49) + (2/49) + (1/49) )
           = (1/7) * (6/49)
    (compare this with the figure Zaphod77 gave, quoted above).

    You will have noticed that the useful quantitative information contained in Zaphod77's distribution is a direct result of the pmf for D. How he managed to get those figures after so emphatically insisting colour_thief's drought distribution "couldn't be more wrong", is beyond me...
     
  20. Okay this made it clearer for me at what CT was calculating. I thought it was some other quantity before.

    Anyway, interestingly, I did one more calculation for the 3-piece bag style randomizer I analyzed above. The probability values that a 1 will occur right after a 1 has occurred just now are as follows:
    P(0) = 1/9; P(1) = 2/9; P(2) = 3/9; P(3) = 2/9; P(4) = 1/9

    The derivations is based on the fact that using stable probability values I calculated above (1/3 for (blank) and 1/9 for all others) as soon as a 1 occurs, the new state probability distribution becomes (note that these values are obtained after normalization):
    SP(blank) = 1/3
    SP(1) = 1/3
    SP(12) = 1/6
    SP(13) = 1/6
    The values are 0 for the other states since there is no incoming 1 transition into them.
     
    Last edited: 3 Dec 2012

Share This Page