My team is currently leading the gocoder bomberland competition. I wrote a very long thread in their Discord (help channel) about all the RL AI approaches I tried, how they failed, and what finally worked. (Some people are participating as part of their university studies, so they asked for help getting started.)
Even for a seemingly "simple" game like Bomberman, tabular reinforcement learning isn't going to work. I tried it with a huge table with 1000+ states and 1 million+ transitions, but it still couldn't capture the complexity of the game. Plus, you can show mathematically that the value estimates aren't going to converge, due to exploding variance.
In short, I believe you'll need serious research to go from the current "state of the art" in RL AI to something that is remotely tolerable in a AAA video game. But that sounds like an interesting idea, so maybe you should get your feet wet by building a small RL AI for Bomberman yourself, so that you know how things work. I have replays and instructions for that in the Discord too; search for the keywords "gocoder-bomberland-dataset" and "behavioral cloning".
Current game AI seems to love its decision trees, and I'm thinking more along the lines of how you can mix a few bandits into that tree, or how you can manually discretize the environment into a small table (not 1000+ states). I don't think games are going to give up their decision trees, but they might be able to mix in some simple machine learning techniques that make the AI somewhat adaptive.
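To make the "bandits in a decision tree" idea a bit more concrete, here is a rough sketch, not something taken from the competition code. It assumes an epsilon-greedy bandit (one of several possible bandit strategies), and the state names, actions, thresholds, and the `discretize` helper are all made up for illustration. The tree keeps its hard-coded branches, and only a couple of leaves are allowed to learn which action pays off.

```python
import random


class EpsilonGreedyBandit:
    """Tracks a running mean reward per arm and usually picks the best arm."""

    def __init__(self, arms, epsilon=0.1, rng=None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()
        self.counts = {arm: 0 for arm in self.arms}
        self.values = {arm: 0.0 for arm in self.arms}

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.values[arm])

    def update(self, arm, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def discretize(enemy_distance, in_blast_zone):
    """Hypothetical hand-made discretization into three coarse states."""
    if in_blast_zone:
        return "danger"
    return "enemy_near" if enemy_distance <= 2 else "enemy_far"


class AdaptiveAgent:
    """A tiny decision tree whose non-critical leaves are bandits."""

    def __init__(self, rng=None):
        # Assumed action names; a real game would plug in its own behaviors.
        self.bandits = {
            "enemy_near": EpsilonGreedyBandit(["place_bomb", "retreat"], rng=rng),
            "enemy_far": EpsilonGreedyBandit(["approach", "farm_crates"], rng=rng),
        }

    def decide(self, enemy_distance, in_blast_zone):
        state = discretize(enemy_distance, in_blast_zone)
        if state == "danger":
            return state, "flee"  # hard-coded branch, never learned
        return state, self.bandits[state].select()

    def learn(self, state, action, reward):
        # Only the bandit leaves adapt; the scripted branch is left alone.
        if state in self.bandits:
            self.bandits[state].update(action, reward)
```

The designer still controls the overall behavior through the tree, and the learned part is small enough (two states, two actions each) that the estimates can actually settle within a realistic number of games.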
https://discord.gg/NkfgvRN