Randomizer Theory

Thread in 'Research & Development' started by colour_thief, 12 Jan 2007.

  1. "Drought" is a bit fuzzy (e.g., Should we use the properties of 7 Bag as an indicator and consider the minimum condition for an I-piece drought to be >12 pieces before seeing that piece dealt? Should we consider anything beyond the expected value of 1/7?), but "sequence" I think we can get a better handle on.

    The length of the game before reaching the more-or-less unsustainable Level 29 is 230~290 lines depending on starting level. At 2.5 pieces per line, that makes the bare minimum piece count 575~725. If we allow for some play into Level 29+, you could start seeing piece counts around 750~850, but we have not seen much more than that. "A sequence" long enough to constitute an entire game for a human at current standards of play is certainly less than 1000, so we can use that as a very generous upper-bound.
     
  2. Droughts are always a bit context sensitive, aren't they? I don't know what metric they use at the NES tournaments, but certainly anything longer than 12 pieces starts to feel like a proper drought. We may as well throw down a (somewhat) arbitrary marker there. When I have time to go back thru this whole thread, I can probably work out the solution myself, but I was hoping someone already knew the answer, even if it's just intuitively from a deeper well of experience.
     
  3. zid

    zid

    Why arbitrarily rebase them onto an offset from 12?

    Just the number of pieces since you last saw that piece is a perfectly acceptable metric. It'll probably be about 7, 12 is slightly unusual, more is painful.

    It's more like an "x hours since coffee pot was changed" sign than a metric you want to keep as low as possible, so maybe it's just the word 'drought' that needs fixing.
     
  4. Because the question I'm asking is "How many droughts will I have in an average maxout run of NES Tetris?"

    That's a question whose answer has direct implications for stacking strategy.
     
  5. In NES Tetris, the sum of the probability of droughts size 1 through 12 is slightly less than 86%. So if you consider 1000 pieces, your expected I-drought count is roughly (1 - 0.86) * (1000 / 7) = 20. Or put differently, once every 50 pieces.
     
    JBroms, GyRo and Kitaru like this.
  6. Actually, for this purpose, maybe >10 pieces might be more applicable?
    Theoretically (like if you can make the pieces fit), if at least every 10th piece is an I-piece, you'll be able to keep your stack stable infinitely without the need to "burn" lines (ie. clear lines without making a Tetris) for a "perfect transition".

    Of course, if your stack is low enough, you can afford to wait 20+ pieces for an I-piece without burning, but once that happens you probably can't afford the same wait again, so on average you probably still want around one in every 10?
     
    Kitaru likes this.
  7. I now have yet another metric by which we can evaluate randomisers, through considering correlations between consecutive pairs of droughts. For example, due to the design of the bag randomiser, pairs of droughts have a high negative correlation - if the interval between two appearances of an I is very short, the next appearance will be long. In fact, calculating the correlation coefficient gives -0.5, which is a significant negative relationship. Conversely, in memoryless there is no correlation at all. Most randomisers will fall between one of these two extremes.
     
    colour_thief likes this.
  8. I had a similar thought, but instead of looking at correlation I've been looking at entropy. While high correlations will have low entropy, hypothetically some low correlations could as well. So it can reveal when things are going on that might otherwise be hidden.

    Just to clarify your idea. Are you looking at the sequence of droughts for a certain piece type? Or a sequence of droughts in general (eg. as the randomizer sequence is progressing)? Because they are both different in distribution and I haven't convinced myself one is more important to consider than the other.
     
  9. I'm looking at consecutive droughts for the same piece (not sure how a sequence of droughts in general would work). I was inspired by someone complaining about how bag "forces droughts" - if we have two Is in a row we know that the next I must be at least 7 pieces after that. I've already written code for recording the empirical joint distribution in my current simulation code, and added in calculations for the expected drought given the last drought interval.

    I was also considering attempting to calculate the auto-correlation function, but I'm not sure how useful the extra info would be.
     
  10. I've come up with a new randomizer that behaves much like Ti's when used to generate tetrominoes, but is usable for any number of piece types, such as with pentominoes. It has no manually enforced restrictions on its behavior (such as a fixed bag size), yet it seems to have "TGM randomizer" behavior, rarely behaving badly, but usually producing sequences that are appropriate for Tetris games.

    Python 3 version:
    Code:
    import random
    
    # Generic implementation, without special handling for the first piece.
    def Randomizer(pieceTypes):
      droughtLengths = [1 for piece in range(pieceTypes)]
      while True:
        i = random.randrange(sum(droughtLengths))
        end = 0
        for genPiece in range(pieceTypes):
          start = end
          end += droughtLengths[genPiece]
          if i >= start and i < end:
            for checkPiece in range(pieceTypes):
              if checkPiece != genPiece:
                droughtLengths[checkPiece] += 1
              else:
                droughtLengths[checkPiece] = 1
            yield genPiece
            break
    
    if __name__ == '__main__':
      r = Randomizer(7)
      pieceNames = { 0 : 'j', 1 : 'i', 2 : 'z', 3 : 'l', 4 : 'o', 5 : 't', 6 : 's' }
      print(''.join(pieceNames[next(r)] for i in range(70)))
    
    The attached archive contains a Windows program, C source code for that Windows program (the source is portable C, with nothing Windows-specific), the Python 3 source code shown above, and a readme describing how to use the Windows program. If you're doing tests of really large sequences, I'd suggest using the Windows program or the C source code.

    From what little analysis that I've done, the maximum drought length seems to scale proportionally to the number of piece types, and as the generated sequence gets longer the maximum drought length slowly grows.

    I've also found that the average drought length might be about 5 (exactly 5, as the sequence length approaches infinity?) when generating tetrominoes, so maybe the average drought length is some constant for each number of piece types.
     

    Attached Files:

    colour_thief likes this.
  11. The maximum drought size is technically unbounded with that algorithm, and the average drought size has to equal to 7 (or however many distinct piece types there are).

    I'm not sure who made the first version of that but I've seen it before. A natural extension is to weight pieces by the drought size raised to some floating point exponent greater than 1.
     
  12. NullpoMino has linear distance weight, quadratic distance weight and exponential distance weight already implemented. Having played around with it a bit, I've obtained interesting results from functions such as (1.2^d - 0.8), which eliminates both droughts and floods quite nicely.
     
    Last edited: 1 Nov 2019
    colour_thief likes this.
  13. I think that's where I saw it! Thanks for sharing.
     
  14. Last edited: 30 Mar 2021
    Arcorann, Kitaru and colour_thief like this.
  15. Glad you're alive and still care about TNT64 gilly!
     
    Arcorann and Kitaru like this.
  16. The N64 has a CPU count register that increments at a period of 46.875 MHz, with 1 cycle equal to 64/3 nanoseconds. This cycle count is written to and read from the cart's sram, converted to nanoseconds, and used to seed the randomizer.
    The cycle to nsec conversion has a period of 3 * 2^26 (0x0C000000), because of the 64/3 multiplication step. Thus, out of a possible 2^32 (~4.3 billion), only ~200 million games are allowable.
    For example, a cycle count of 1 would yield the same game as a cycle count of 0x0C000001.

    In 2018, we found the memory address where this cycle count is stored after it is read from the cart's sram. We can therefore control which game we receive by editing the contents of this memory address. For example, to play game 0x0BFF234A, use the following code in GameShark Pro or mupencheat.txt
    For USA rom
    8125E464 0BFF
    8125E466 234A

    For European rom
    81250F04 0BFF
    81250F06 234A

    Then, in 2021, we reverse-engineered the piece generation algorithm. We used this algorithm to generate all 201326592 allowable games. This gave us a database whose size is 15 GB on disk to search from.
    Some applications:
    - We transcribed piece sequences from videos found on youtube, searched our database, found the "game id", and replayed it ourselves on actual hardware using gameshark to set the found game id.

    - We unexpectedly discovered an alleged fake video on youtube. The sequence shown in the following video is not found in our database. We even searched the other 4.1 billion non-allowable sequences.


    - We searched the database for all the games that satisfy a set of constraints for a possible perfect ultra game. We found one such example and handed it off to a TAS maker (IcySpeedruns):
    ).
     
    Last edited: 31 Mar 2021
    colour_thief, Arcorann and Kitaru like this.
  17. I can confirm that this is a rediscovery, sorry. I'm glad to see someone else find it, though, as it's a somewhat obvious idea that I've not seen people talk about much. The "distance" randomziers are an interesting family. (Minor nit: I needed to include limits.h with your C version for UINT_MAX.)
    Pretty sure that the first version of this is because of me, heh. If I remember correctly, Zircean added these to Nullpomino when I shared the idea with him.

    One of the biggest problems with just using Distance by itself is that it doesn't always do enough to reign in droughts. It's a lot better than some other randomizer families that basically look like memoryless at the long end, but when you're starting to get deep in a drought, 15 (or worse) to get that piece you really need against best case 21 (I suspect more likely 25 on average) is a coinflip-ish that you can very easily lose five or more times in a row...

    It's interesting that you mention trying other various functions. The approach that I've experimented with previously is to use the distance number in a lookup table of weights to use, which allows you to shape very carefully and particularly the weights to your need, so you can make your floods and your droughts as kind or as mean as you want, and independently from each other. Also, a lookup table is faster than invoking floating point calculations on each piece. Just make sure that you can't fall off past the end of the table!

    One fun thing that you can do is use a Garden of Eden state to handle initial history and first piece. My code has {3, 3, 3, 1, 0, 3, 1} (pieces in IJLOSTZ order), but if you code your algorithm to ignore negative weights, then {1, 1, 1, -2, 0, 1, -2} would be very close to TAP's initial conditions.
    It took me quite a while to work out what this was actually calculating, especially after verifying that the actual piece data output (after formatting changes) matches my implementation. It was only when I had it output all three numbers in the calculation, and saw that it was doing 2464/(7*70) for a very short sequence, that I figured it out. If you remove dividing by the number of pieces, you're calculating the average of the sum of the distances after each piece.

    Maybe the better way to think about this number, then, is not as an average of nearly 5, but an average of 34.464something (just did two long runs to check data, and got 34.465039453 and 34.464739360). The lowest this sum can be (assuming we're not escaping a Garden of Eden) is 28 (1+2+3+4+5+6+7), and is likely to be somewhat above that. I don't think I've ever looked into what this average should be, but 34.464something smells about right.

    While I'm here, I'll also talk about a related randomizer that I discovered around the same time. It's basically the same idea as Distance, but it uses the relative order that the pieces came out in instead, effectively always using the weights {1, 2, 3, 4, 5, 6, 7}. "Ordist" is less good at reigning in drought pieces, as the chance to deal it is always 1/4.

    And if you've been paying attention to this thread, then you'll have probably noticed that that's very similar to how DTET works. And in fact, if you change that {1, 2, 3, 4, 5, 6, 7} to {0, 1, 2, 3, 4, 5, 6}, then you have exactly DTET. Right under my bloody nose. And I missed it, because I never thought to change the weight table in that way...

    Oh well. Anyway, hi?
     
    colour_thief likes this.
  18. According to the source code of Techmino the following is the algorithm for the Cultris II randomiser (discovery of which is attributed to zxc):

    Code:
    Initialise starting "weights" to 0 for each piece
    
    To generate a piece:
        For each piece, divide weight by 2, then add a random number uniformly distributed between 0 and 1
        Select the piece with the largest weight, divide its weight by 3.5, and return it as the next piece
    
    I generated a sample drought distribution and it was close enough to that of the 10 million piece sequence that was posted on the Hard Drop forums that I am confident that this algorithm is correct.
     
    colour_thief likes this.
  19. That's so cool! Thanks for sharing nice to have that mystery solved.
     

Share This Page