# Randomizer Theory

Thread in 'Research & Development' started by colour_thief, 12 Jan 2007.

1. ### Taratang

Simple but effective, I like it. I would set the drought limit at 11 just to equal or better SRS

I would tweak the start though. Just out of personal preference I would disallow O, S, and Z as the first piece (like TAP) and use an initial history of OOSZ.

2. ### tepplesLockjaw developer

What did the TGM developers not like about O as the first piece? If I get O J T first, my field looks like this:

Code:
|     |
| JJ  |
| JOO T |
| JOOTTT |
----------'


3. ### colour_thief

Because if your first piece is an O you can't fit your next piece if it's either S or Z.

4. ### colour_thief

Here is the revised randomizer graph. Changes are:

-Addition of TGM's randomizer.

-TGM and TAP/Ti use proportional weighting to compensate for the fact that the history doesn't always have 4 distinct pieces.

That second part barely affected TAP/Ti... Honestly I'm not even sure it made a pixel's difference on the graph. The story for TGM was quite different, however.

4 Distinct: 0.86

3 Distinct: 0.14

2 Distinct: 0.001

1 Distinct: 0.0000008

The lower prevailance of a 4 distinct history made the need for weighting especially important. In terms of probability, this affected results 0.1% or something which noticeably shifts the graph a small amount.

One thing that struck me was how little the probability was affected by adding 2 rerolls. It affects the dampening of floods, but very little else. If you strictly forbid dealing a piece from the history, the general curve would stay the same.

As for calculating other things...

14-bag is currently giving me headaches. SRS's 7-bag was pretty trivial to do, but 14-bag is really... mathematically confusing to me. If anyone else wants to try and work it out, I think I've had my fill.

5-history Arika style randomizers would entirely be possible to do, but it would be hell to calculate the weighting. I could do it, it's straightforward how to approach the problem, but it would take more time than I care to invest. It wouldn't be hard to cook up something that wasn't weighted, but this wouldn't be accurate unless the rolls were sufficiently high. If there are too few, then it will rarely have a distinct 5-history and the graph will be way off.

Anyhow, I've got some ideas for new randomizers now and I don't think they'd remotely fit in Lockjaw's standardized system. That thing is a cool concept, but I think you need considerable freedom when designing a randomizer. Right now it's only good for SRS/TGM variants, and giving it more power pretty much negates the point... Syntax would get so complicated that C code would be simpler to read.

5. ### colour_thief

I've got a riddle for you guys:

The 7-bag randomizer is not fully understood. What is missing?

And, to clarify, there is still something subtle at the algorithmic level that is left undescribed. Something that is implied in the current model, but not overtly stated. It is not an implementation-specific minor detail, it's at the algorithmic level. And it would slightly affect the probability curve above.

...The currently implied model is almost certainly correct, but not necessarily so.

6. ### tepplesLockjaw developer

Are all 5,040 bags equally probable to within epsilon?

7. ### colour_thief

You're generally on the right track, but no that's not what I'm thinking of. If you think about it, "equally probable" is quite vague. For example, in all the randomizers in the above chart, each of the 7 tetris pieces is "equally probable". (That's actually a big hint.)

To be technical, your description does fit the riddle... Though I don't think the guidelines would randomly forbid bag #1337 or anything. Anyhow, can you think of something else that could differ given that every bag is equally probable?

8. ### makalaka

The same bag won't appear twice in a row.

9. ### kiwibongaUnregistered

Are there bag bags?

10. ### Pineapple

In theory it could, as there's currently no rule to provent it. It's just bloody unlikely. The expected wait for 2 bags in a row that are the same is 5041 bags, which means about 35000 pieces.

11. ### colour_thief

Yeah, you guys have got it now. Though Lard, the current model implies the selection of bags is pure random, so you wouldn't need that many pieces.

Now to blow your mind... Or at least mine...

Assuming everything works in a bag-like fashion:

7! = 5040 bags of pieces

5040! = (7!)! bags of bags

((7!)!)! bags of bag bags

(((7!)!)!)!.... you get the idea.

Infinitely recursive bags! Just trying to think about calculating the exact curve for a theoretic "infinitely recursive bag randomizer" makes my head explode. Though the results would change only slightly from the current curve, I'm sure.

But yeah I doubt anything like this happens in the games. Maybe there could be at least a second layers of bags. Maybe. But the smart money is on pure random bag selection.

12. ### colour_thief

So I've been thinking about randomizers more recently, and have measured them up in some new ways: Predictability. I've been considering both the average predictability, and the standard deviation in predictability. I'd be happy to explain the calculations if anyone's interested, but I'm pretty sure most people just want to look at numbers:

RND

Average: 14.29%

Std. Deviation: 0%

Not very predictable, but constant in it's unpredictability.

TGM1

Average: 31.77%

Std. Deviation: 2.70%

Fairly predictable, and fairly consistently predictable.

TAP

Average: 32.76%

Std. Deviation: 1.78%

Marginally more predicatble than TGM1 and marginally more consistently so. It shows the limit of the TGM randomizer design: Increasing the the number of rerolls will only bring it closer and closer to 1/3 predicatbility with 0 deviation.

SRS

Average: 37.04%

Std. Deviation: 28.07%

The most predictable randomizer by a fair amount, but also very inconsistent.

deleted...

...

15. ### colour_thief

My thoughts: Your system is very powerful, but ultimately maybe unnecessary. I am reminded of your powerful adaptive music engine in Jagoris, which was very much under-utilised. The main difference though is that adaptive music is cool, and a crazy randomizer is... well... umm......

Why do we want this? If you've got some good ideas to use it, excellent, but I can't think of anything. Certainly nothing worth learn well the scripting system.

16. ### Pineapple

I can't find the original post where someone demonstrated the chance for 3 snakes in a row, but I've been thinking about it. (Found it eventually, and it was tepples proving that four snakes in a row happens roughly every 30 sprints)

I was drawn to thinking about it after I heard of reports on 2ch saying that Tetris Online (Japan) will sometimes give 3 pieces the same in a row, and that someone else had suggested tht the reason was because it was using a 14-bag (where each of the 7 pieces is in the bag twice, and the bag is emptied before reloading).

For 3 pieces in a row all being the same to happen in 14-bag, it has to happen across a bag seam; either 2 before and one after (XX/X) or 1 before and 2 after (X/XX). Both are equally likely, so we will just consider the case that is easier to compute, which is the first one.

There are 14C2 (=91) different posibilities for the last 2 pieces of a bag. 7 of these possibilities will have a pair of pieces (1 for each shape), so the probability that a bag will end XX is 1/13. In the next bag, 2 of the 14 pieces will match the end of the last bag, so the chance for XX/X is 1/13 * 2/14 = 1/91

As I said earlier, XX/X and X/XX are equally likely to occur, so the chance is 2/91. However, this counts XX/XX twice, so we must also work out the probability of that and subtract it.

There are 91 possible endings to a bag (considering only the last 2 pieces, regardless of order), and 91 openings for the next bag. For each of these 8281 combinations, only 7 result in XX/XX. This means that the overall chance of 3 or more pieces to happen in a row at a bag seam is 2/91 - 7/8281 = 25/1183 (2.11%)

Since the average player (and I would say that most of us aren't average) can't track bag seams, this will be seen to occur once every 14*1183/25 (=662.4 pieces. Since the average sprint game (40 lines) is somewhere above 100 pieces (tepples used 105 so I'll use the same), it means that you will expect to see 3 in a row roughly once in every 6.3 sprints.

Scary, huh?

17. ### Pineapple

I'm looking at this post now, and I'm thinking "no, it doesn't..."

18. ### colour_thief

Well, it depends what you mean. Even with the same algorithm, a different hashing function would yield a different sequence. And with enough data, I'm sure that could be worked out.

In terms of a practical difference though, 28-bag would be indistinguishable from memoryless. 14-bag is already halfway between 7-bag and memoryless... So 28 would be very close indeed.

19. ### colour_thief

TOJ's new randomizer has been reversed -- here's how!

First someone made this youtube video of CPU level 13. The thought was that the CPU blazes through so many pieces that we'd have enough data to have a crack at reversing it.

But we were wrong!

Lardarse took a stab at transcribing the piece sequences. It became readily apparent that... The CPU was using 7-bag while at the same time the player was definitely not using 7-bag. Sort of not fair eh? So there goes the goldmine of data. Nevertheless, Lardarse did transcribe the human's sequence. Some 568 pieces spread over 17 games.

Code:
otisjzololztisjlztjisolzjzsilostisjojzlttlzjsosilsijitzotlijsojzliszljototzlsoijitozsjlzjoilttzsszjlilot
osjtilzltojltiszijjolztssloitzjlisszlo
sztjlitojtozislllsjoz
tojlijsztzlisjoilsztijjooisjjztljislozst
ilzjiotsjtiljoszjtsloziliozstijlltiozo
szjotzlizjotslisio
zistlojisolzsjitzjltsjioz
tooljszijtlsozilsoliljtzliossz
otiljsjzslztjjioitoslztjtljz
tiszolzjisotjlj
joislztziljzsoltltozizjsizootjlsl
jiszolotzlsotjiiojtszliosjlzioittslotijzzi
tzsjijollojsstizoitszjlji
ljijzotstjsozjlij
stljsiozositjzlsisjloztljsizotlzzot
slitozzjtsjsozliosllzjitt
jzzioltsjizliotstlzjstoilizojtstio`
I then took this data, and measured the observed piece intervals.

The piece ratios seems approximately equal, so there is no piece bias. I then plotted the interval data alongside the theoretical flood/drought curves of known algorithms:

You'll note that the data clearly looks like it's in the general family of bag-like algorithms, and that it is more or less halfway between 7-bag and 14-bag. Jokes about a 10.5-bag aside, this was the first hint to the algorithm's structure.

The next thing to note was that the largest observed interval was 15 pieces. Now we don't have a lot of data, so it's a bit of a jump to assume that this is the theoretical maximum, but for argument's sake I did so. 7-bag has intervals up to 13 pieces: X6|6X. An interval of 15 could be explained by an extra piece in the bag: X7|7X. Could it be that the algorithm starts with standard 7-bag and then simply adds an extra piece to the bag? With this In mind, I checked the data, and... It matched up! TOJ uses an "8-bag" randomizer.

As a final step, it was necessary to see how random the extra 8th piece was. Could it be that it is selected by a secondary 7-bag method? In this respect, we got lucky. Despite a lack of data, the 4th game shows a convincing example that the 8th piece is generated in a memoryless fashion. The sequence of 8th pieces starts JIJJ.

A bit of number crunching later and I had the data for the theoretical TOJ flood/drought curve. How close was the data?

Unlike previous graphs I've shown, this one is non-cummulative. So this time, the X-axis represents getting something in exactly X pieces, not within X pieces or less. You can see that, generally, the TOJ data contains too many small intervals and too few large intervals, relative to the ideal expected amounts. This bias in sampling is to be expected with such short sample sequences. With this in mind, things look pretty good!

Finally, here's the latest flood/drought curve with all known algorithms on it. 14-bag is also included even though it has not been used in an official Tetris game. Special thanks to cgwg for helping me work out the curve for 14-bag many months ago. Without his help I would not have been able to work out TOJ's curve so easily.

...a little more to follow later.

20. ### Pineapple

Well... while CT is preparing his "little more to folllow", I'd like to thank Digital for proviing me with a vdeo that we used to confirm 8 bag, and were going to use to provide more raw data, if CT hadn't got it so quickly...

It's still an interesting video to watch, though.