# Randomizer Theory

Thread in 'Research & Development' started by colour_thief, 12 Jan 2007.

1. ### colour_thief

I'm still a little confused. I chose NES specifically because of its 1-history nature. Regardless of piece history context, the branching forward through the "sequence possibility space" has a constant structure. As in it's always covering pure random equally at every decision point. This is unlike eg. TGM where the diversity of pieces in the history changes the branching structure.

So in my mind the flawed calculation is not using anything that isn't 100% known before you started. Am I missing something?

2. ### Arcorann

The distinction here is between "known" and "unknown" information. What I was getting at was that your 25/75 split is a hidden variable - which branch the game selects cannot be determined from preceding pieces in the piece sequence, and therefore you can't split it out when doing the calculation. By contrast, the history can be determined from preceding pieces, so it's possible to condition on it when determining the expected information content of a piece.

3. ### colour_thief

Hey, so I've been trying to figure out a way to measure randomizer entropy using only data (ie. a transcribed or generated sequence). The obvious motivation is that if you don't know the algorithm or if the math is unreasonable (cough TGM3 cough) then you can get some easy entropy stats.

Our randomizer definition of entropy is:
By comparison, the information theory definition of entropy is:
This has a couple problems:

1. All the randomizers we study generate pieces in more or less equal proportions, so by the law of large numbers we'll get entropy of log_7(1/7) = 1 for every randomizer. Not terribly enlightening.

2. We don't know anything about internal randomizer states, so we can't partition the proportions into the appropriate groups to take the average the way we do with our randomizer entropy calculation.

My thought was that what I actually want is the conditional entropy.
t_0 = the current trial / piece outcome
t_-n = the nth previous piece

So from data you can calculate:
H(t_0 | t_-1)
H(t_0 | t_-1, t_-2)
H(t_0 | t_-1, t_-2, t_-3)
etc

You don't know where to stop in advance (you don't know the history size) but in theory I would expect it should stabilize after you've gone back far enough to cover all the information needed (history size) to shape an outcome.

I think the data requirements grow exponentially for each new piece history term, so there's a hard limit to what you can reasonably compute. But it's still very practical:

1.If a randomizer has internal information (like the state of TGM3's dynamic piece ratios) then calculating conditional entropy from data ensures that only public information is used. It saves the hassle of calculating implied internal states which a theoretical calculation of entropy given only public information would require.

2. It's not like a human can actually remember / take advantage of long term history information anyways. If history has long term effects, the extent to which this affects entropy is more of an esoteric mathematical truth than something usefully descriptive. To illustrate, consider 63-bag (~91%) and NES (~97%). On paper 63-bag has much lower theoretical entropy, but if you instead compared 1-history conditional entropy (I expect) it would be higher than NES. One must ask the philosophical question "what is the practical entropy of 63-bag?". I'm not saying where we should draw the line per se, but a plot of the conditional entropy (y-axis = conditional entropy, x-axis = given history size) would be insightful.

SO

My question to people here:

Do you think that conditional entropy is getting at the same thing as our randomizer entropy definition? I am specifically asking if you think that given adequate data, and given random enough PRNG powering the randomizer, the "stable" conditional entropy value (ie. conditional entropy given a sufficiently large history size) should be the same as our randomizer entropy value.

Qlex likes this.
4. ### colour_thief

So I tested this and it works! It's really interesting actually. This is the result of 100,000,000 trials of each randomizer:

Green = Theory and computation match
Grey = Missing theoretical calculation or computation does not extent to a long enough history to even possibly achieve a match
Red = Theory and computation differ

Some interesting results:

1. Cardcaptor Sakura: Eternal Heart appears to be the lowest entropy randomizer Arika has ever made! The low entropy kicks in later though, so it's harder to take advantage of. (Well, technically I'm sure TGM3 would get lower eventually if the entropy calculation were extended to a longer history.)

2. TGM3 has more entropy than TGM2 (if you only consider the recent history). Makes sense when you think about it, but I hadn't thought about it going in.

3. It's hard to make out in the chart, but GB Tetris has sub-1 entropy with a history of 0. This is only possible if the piece ratios are skewed (which they are in this game due to the bug in the algorithm). I actually only implemented the bitwise or behaviour, not the weaksauce deterministic rerolling. I'm curious how much entropy drops further when I capture more of the poor design.

4. My approach failed to duplicate theoretical 7-bag entropy. This is because my approach only considered the history and not also the bag index. I should run it again using index + history as information, I'm sure it would then match. The current method would eventually converge on the same entropy, but it would take A LOT more history. Regardless, it's pretty cool that you can separate the 2 information elements apart like that.

5. 63-bag (tnt64) is basically indistinguishable from pure random. Granted, just like 7-bag, its over-estimating entropy because I don't have the bag index in there. But let's be honest, who is going to remember their index in such a huge bag?

Arcorann, what sort of algorithm does DTET use? Of all the entropies you've calculated, it's the one randomizer I wouldn't know how to implement.

Last edited: 26 Feb 2016
Qlex likes this.
5. ### Qlex

Amazing job, c_t! Do you know how the ccs randomizer works? Also curious what toj and bag2x represent

6. ### colour_thief

To quote a 3 year old post buried in this thread, Cardcaptor Sakura works like this:

It's really weird. Interval of 3 is impossible. Interval of 4 is only possible if the same piece is in index 4 and 6 of the history. I can post an interval plot when I get home if you like.

Bag2x is just 14-bag and toj is 8-bag. I couldn't give the randomizer functions a name starting with a number (programming constraint) so I had to get creative.

Qlex likes this.
7. ### Qlex

Sure thing, I'd love to see what the ccs interval plot looks like

8. ### colour_thief

Oops, took a while, but here it is.

Note some of the entropy dip at history depth 7 is due to thin data, not algorithmic properties.

MaryHadALittle, Jayce and Qlex like this.
9. ### Jayce

I have nothing to add, I'd just like to say this is interesting, good work

10. ### Arcorann

Those are some pretty interesting plots. Conditional entropy never occurred to me, so nice work. Did you consider the possibility of calculating theoretical values?

I never did post the DTET algorithm. Although it hasn't been verified, the algorithm should be the same as that used in this demonstration. The algorithm there is as follows:

* Sort the seven pieces from most recent to least recent
* Assign weights 0, 1, 2, 3, 4, 5, 6 and draw a piece

This would therefore be a randomiser that converges to an asymptote in conditional entropy, albeit much faster than bags. I suspect that Cultris II uses a similar randomiser with different weights (something along the lines of 1, 4.3, 5.7, 6.5, 6.5, 6.5, 6.5 but I haven't formally derived estimated values).

Also, I think there was something about multiply-recursive bags I claimed a long time ago that was subtly wrong. Originally I said that a triply-recursive bag system and a doubly-recursive bag system had the same drought probabilities; this is technically not the case since the probability of two adjacent bags being the same depends on whether the bags are in a double-metabag or single-metabag for the same reasons as the drought probabilities between doubly-recursive and singly-recursive. The difference here though is on the order of 1/5040!, so we can still ignore it.

Also, I recently discovered that NES randomiser is also biased. Specifically, the reroll code (pick from 7) is dependent on the "spawn orientation" of the previous piece, which is not uniform mod 7.

Qlex and colour_thief like this.
11. ### colour_thief

Theoretical conditional entropy calculations are appealing to me, but I can't quite wrap my head around how to go about it. Simulations came more easily to mind. If you wanted to work some out and show your work I would be really grateful.

Nice DTET find. I'm not much of a web dev guy, is the code visible somehow? Or did you just harvest a long sequence and do analysis?

That NES randomizer stuff is news to me as well. Paging @Kitaru
It's an excellent and thorough description, but that person's analysis failed to consider something. If I understand it correctly, the bias shown in the frequency table loops back and affects the probability of the different columns in the 3 bit + spawn matrix. Which in turn affects the frequency table etc. So the final probability table shown on that page is not correct. The true bias would only be revealed with the rest state of the Markov chain representing this feedback loop.

It's interesting. The way it works, not only are the ratios biased, but the 2-gram pairs have weird patterns too. These pieces have a distinct "most likely to follow it" piece:

T -> Z
J -> T
Z -> J
L -> T
I -> S

O and S are special cases. Their favours themselves. This bias is weaker than the larger effect of the reroll mechanic, so all non-dupes should be equally most probable. However compared to the other pieces, dupes should be more common.

Last edited: 1 Apr 2016
12. ### Kitaru

Qlex likes this.
13. ### colour_thief

Thanks @Kitaru!
Was also curious if you'd realized this NES randomizer bias that we're talking about.

I took a stab at calculating the Markov chain equilibrium and I came up with this:

Code:
        _______________________________________________________
|                           To                        |
-------------------------------------------------------
|       J      I      Z      L      O      S      T   |
---------------------------------------------------------------
|       | J |  3.1%  15.6%  15.6%  15.6%  15.6%  15.6%  18.8% |
|       | I | 15.6%   3.1%  15.6%  15.6%  15.6%  18.8%  15.6% |
|       | Z | 18.8   15.6%   3.1%  15.6%  15.6%  15.6%  15.6% |
| from  | L | 15.6%  15.6%  15.6%   3.1%  15.6%  15.6%  18.8% |
|       | O | 15.6%  15.6%  15.6%  15.6%   6.3%  15.6%  15.6% |
|       | S | 15.6%  15.6%  15.6%  15.6%  15.6%   6.3%  15.6% |
|       | T | 15.6%  15.6%  18.8%  15.6%  15.6%  15.6%   3.1% |
---------------------------------------------------------------
________________________________________
|  Exact Probability  |  Approximate % |
--------------------------------------------
| T |    49284/335916     |     14.67%     |
| J |    47989/335916     |     14.29%     |
| Z |    48024/335916     |     14.30%     |
| O |    47988/335916     |     14.29%     |
| S |    49321/335916     |     14.68%     |
| L |    46655/335916     |     13.89%     |
| I |    46655/335916     |     13.89%     |
--------------------------------------------
Curiously, O still comes out to exactly 1/7 probability.
The end result is that that article is overplaying the deviation from pure random. Entropy (given no history) ideally being 100%, the article's proportions come out to entropy of ~99.9857% whereas it is actually just ~99.9888%. Splitting hairs, I know, lol.

Bottom line in plain English for NES players:
1. OO or SS duplicates happen more often (1/16 of the time compared to 1/32 for other pieces). So you get to legitimately curse the game whenever these sequences mess you up.
2. I and L come up the least often on average. If you care about triples this slightly favours a hole on the left. If you haven't got these pieces in a while, you can legitimately curse the game.
3. J tends to lead to T tends to lead to Z tends to lead to J... That's a loop! So JTZ, TZJ, ZJT are the most common 3 pieces runs. Might be useful for strategy planning, not sure.

Last edited: 1 Apr 2016
Qlex likes this.
14. ### colour_thief

I've been thinking about NES some more. I believe there are additional bias elements that were not mentioned in the article.

The following state information determines the outcome of the randomizer:
• 2^8 states for the high byte of the LSFR (note the high bit tap is part of this)
• 2 states for the low bit tap
• 7 states for the previous piece
• 8 states for the piece counter mod 8, ie. "the bag index"
So instead of the nice tidy 7x7 transition matrix like in my previous post, the real* answer a 2^9 * 7 * 8 = 28672x28672 transition matrix.
Overwhelming if you're trying to work it out manually, but still very computable.

With all of this information, the second roll is fully deterministic from the first roll. There are strange biases that enter:
1. The article mentions there are just 32,767 LSFR states. It's impossible to cover the 9 bits we care about equally, which introduces bias in the first roll. Not sure how bad it is without studying the LSFR sequence.
2. The lower 2 bits of the pre-modulus reroll are not very random (though the relationship is very context sensitive). eg. If the piece count mod 7 is 0 and the first roll's high byte ends in bits 111, then the reroll's lower 2 bits will be 1s.
Similar to GB Tetris, these biases are weird spaghetti relationships that are hard to guess without doing the math.

*One last assumption remains: that the number of frames elapsed between pieces is good enough for the LSFR to select one of the LSFR's 32,767 states in a statistically independent way. This is very possibly false (the next LSFR state during piece generation will usually be within say 100-200 iterations from the previous piece generation state) but mapping out these biases if they exist is a really complicated context-sensitive problem. Who knows, maybe at a certain speed level and stack height and abstaining from pressing down, the stars align and bias gets crazy. But I'm going to call this innocent until proven guilty and save this onion layer for another day.

Qlex likes this.
15. ### Qlex

That... Is absolute genius.

Now I can yell at NEStris AND feel justified about that! That also means that we could theoretically make the absolute the perfect TAS for NEStris.

Thanks guys, amazing work as always!

16. ### Muf

NEStris has already been broken in awful ways due to the fact it harvests entropy from the controller, meaning you can get any piece sequence you want just by mashing buttons.

Qlex likes this.
17. ### colour_thief

'Harvests entropy' sounds all fancy. But it's not looking at everything... Mashing buttons should not do much. It cares about the number of frames between pieces, and the previous piece (not just for rerolling).

The part that cares about the previous piece is quite measurably decreasing entropy!

18. ### Qlex

Does that mean that the article about inputting left and right giving you more chances of having a line piece was absolute confirmation bias? Would that affect the random in any way aside from waiting for the next piece?

19. ### Muf

Come again?

lol

colour_thief likes this.
20. ### Qlex

Quoting this submission, I think what is done here is a combination of inputs that delay the apparition of the next piece (either pause or left+right+down). The inputs in themselves do not seem to be the actual cause, but the delay is

Omio9999, colour_thief and Kitaru like this.