User blog:Dai4lyf/Creating an AI to learn to play Wii Sports Resort: Swordplay: Showdown

If you want some background, see the YouTube videos by Carykh or Code Bullet (Evolution Simulator).

So here are the basics:

Each generation has 3 batches of 70 players, with some aesthetic changes so that the players resemble "races" or "groups" or whatever. There are eight different groups (plus a ninth that can form later), which we'll talk about in just a second. Every one of these groups has its own stats. There are four different stats. We don't say "very low" or "moderate"; each stat is a number between -1 and 1, with -1 being the worst and +1 being the best.

The first stat we look at is Health. If this stat has a value of -1, the player will have two hearts. 0 is three hearts, and 1 is four hearts.

The second stat is Movement Speed. Movement Speed is just how fast or slow the player runs, and does not affect their arms. A value of -1 means the player moves at a leisurely pace, while a value of +1 means the player moves about as fast as Usain Bolt racing a medium-breed dog and trying to stay level with it.

The third stat is Attack Speed, which affects how fast the players can swing their swords. Think of it like weight: a value of -1 means the sword is heavy and will be visible for more than 5 frames, as the normal swords from Swordplay are, and a value of +1 means the sword is almost weightless and the player's arms finish a swing in about 2 frames, making the swing harder to block.

The last stat is Stability. It is the amount of time spent tipped over if you get parried. A value of -1 means you will be tipped over for around 5 seconds, and a value of +1 means just half a second.
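To make the stat scale concrete, here is a minimal Python sketch of how a stat could be turned into in-game values. The heart table only holds the three Health values given above. The `stun_seconds` function is an assumption: the post only gives the two endpoints for Stability (5 seconds at -1, half a second at +1), so the straight-line interpolation in between is a guess, and both names are made up for illustration.

```python
# Hearts for the three Health values the post spells out.
HEARTS_BY_HEALTH = {-1: 2, 0: 3, 1: 4}


def stun_seconds(stability: float) -> float:
    """Seconds spent tipped over after being parried.

    ASSUMPTION: linear interpolation between the two endpoints the post
    gives (-1 -> 5 s, +1 -> 0.5 s). The post does not say how values in
    between behave.
    """
    if not -1 <= stability <= 1:
        raise ValueError("this sketch only covers the normal range -1..1")
    worst, best = 5.0, 0.5
    return worst + (stability + 1) / 2 * (best - worst)


print(HEARTS_BY_HEALTH[0])   # 3 hearts at Health 0
print(stun_seconds(-1))      # 5.0
print(stun_seconds(1))       # 0.5
```

Under this assumption a Stability of 0 lands halfway between the endpoints, at 2.75 seconds.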

The first group is called "Player". Their stats are HP 0, MS 0.75, AS -0.25, ST 0.15

The second group is called "A.I.". Their stats are HP -1, MS 0, AS 0.75, ST -0.35

The third group is called "CPU". Their stats are HP 0, MS 1, AS -0.5, ST 0.8

The fourth group is called "Bot". Their stats are HP 1, MS 0.5, AS 1, ST -0.75

The last four groups are a little different. They are mutation groups. A mutation group can appear when a group drops below 25 players across the three batches (about eight players from that group in a batch). When that happens, there is a high chance that a mutation group will form and replace four of those eight players, plus 26 other random players from a different group. Mutation group stats have a wider range: they can go as high as 1.25 and as low as -2.
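The mutation rule can be sketched in Python roughly like this. Several things here are assumptions: the post only says "a high chance", so the `chance` parameter and its default of 0.8 are made up; I read "26 other random players in a different group" as all 26 coming from one other group; and the function name and the dict of population counts are hypothetical.

```python
import random

MUTATION_THRESHOLD = 25   # group population across the three batches
TAKEN_FROM_SHRINKING = 4  # players replaced in the shrinking group
TAKEN_FROM_OTHER = 26     # players replaced in one other group


def try_form_mutation_group(counts, new_name, rng, chance=0.8):
    """Return updated population counts, or None if no mutation group forms.

    counts maps group name -> players across all three batches.
    ASSUMPTION: chance=0.8 stands in for the post's "high chance".
    """
    shrinking = [g for g, n in counts.items()
                 if TAKEN_FROM_SHRINKING <= n < MUTATION_THRESHOLD]
    if not shrinking or rng.random() >= chance:
        return None
    victim = min(shrinking, key=counts.get)
    donors = [g for g, n in counts.items()
              if g != victim and n >= TAKEN_FROM_OTHER]
    if not donors:
        return None
    donor = rng.choice(donors)
    updated = dict(counts)
    updated[victim] -= TAKEN_FROM_SHRINKING
    updated[donor] -= TAKEN_FROM_OTHER
    updated[new_name] = TAKEN_FROM_SHRINKING + TAKEN_FROM_OTHER
    return updated


counts = {"Player": 90, "A.I.": 20, "CPU": 60, "Bot": 40}
print(try_form_mutation_group(counts, "Program", random.Random(0)))
```

With these numbers the new group would start with 30 players and the total population stays at 210.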

The fifth group is called "Program". Their stats are HP 1, MS -1.5, AS 1.1, ST 0.95

The sixth group is called "Internet". Their stats are HP 0, MS 1.15, AS 1.25, ST -1.25

The seventh group is called "Knowledge". Their stats are HP 1, MS 1.25, AS 0.75, ST 0

The eighth group is called "Search". Their stats are HP -1, MS 0, AS 1.2, ST 1.25

The last group will form when three groups (only two mutation groups and one normal group) start to go under 6 players a batch. It will replace the two smallest of those groups and leave the largest of them (the one with the most players at or under 6). If there is a tie between groups at or under 6 players, the RNN will pick the group whose name comes last alphabetically and replace it. It will also pick a random number between 4 and 9 and replace that many players from the group with the most players (it cannot choose a mutation group unless that group's population is 52 or higher). The class name for groups like this is "Expert", and its stats can reach as high as 2 and as low as 0.5. If Health is at 1.5, the player will have 5 hearts. If it is at 2, it will break the pattern and have six hearts, BUT one of them is a blue heart that absorbs one extra hit, so basically 7 health but 6 hearts.
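Here is one way to read the "which two groups get replaced" rule, as a small Python sketch. This is my interpretation, not a confirmed rule: the two groups with the fewest players per batch are replaced, and when two groups are tied, the one whose name comes last alphabetically is replaced first. The function name is hypothetical.

```python
def groups_replaced_by_expert(small_groups):
    """Pick the two groups that an Expert group replaces.

    small_groups maps group name -> players per batch for the three
    groups that dropped to 6 or fewer.
    ASSUMPTION: ties go against the name that sorts last alphabetically.
    """
    if len(small_groups) != 3:
        raise ValueError("the rule is described for exactly three groups")
    # Sort names Z..A first, then by population. Python's sort is stable,
    # so tied groups stay in Z..A order and the later name is replaced.
    ranked = sorted(small_groups, reverse=True)
    ranked.sort(key=lambda name: small_groups[name])
    return ranked[:2]


# Two mutation groups and one normal group, as the rule requires.
print(groups_replaced_by_expert({"Program": 5, "Search": 3, "Bot": 5}))
# -> ['Search', 'Program']
```

In the example, "Program" and "Bot" are tied at 5, so "Program" (later in the alphabet) is replaced and "Bot" is the one left standing.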

The ninth and last group is called "Hack". Their stats are HP 2, MS 2, AS 1.5, ST 2
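For reference, all nine groups can be written down as one table and checked against the stat range of their class. The stats and ranges below are copied from the post; the `Group` class, the `kind` labels, and the `out_of_range` helper are names I made up for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    name: str
    kind: str   # "normal", "mutation", or "expert"
    hp: float   # Health
    ms: float   # Movement Speed
    as_: float  # Attack Speed ("as" is a Python keyword)
    st: float   # Stability


# Allowed (low, high) stat range for each class of group.
STAT_RANGE = {"normal": (-1.0, 1.0), "mutation": (-2.0, 1.25), "expert": (0.5, 2.0)}

GROUPS = [
    Group("Player", "normal", 0, 0.75, -0.25, 0.15),
    Group("A.I.", "normal", -1, 0, 0.75, -0.35),
    Group("CPU", "normal", 0, 1, -0.5, 0.8),
    Group("Bot", "normal", 1, 0.5, 1, -0.75),
    Group("Program", "mutation", 1, -1.5, 1.1, 0.95),
    Group("Internet", "mutation", 0, 1.15, 1.25, -1.25),
    Group("Knowledge", "mutation", 1, 1.25, 0.75, 0),
    Group("Search", "mutation", -1, 0, 1.2, 1.25),
    Group("Hack", "expert", 2, 2, 1.5, 2),
]


def out_of_range(group):
    """Return the labels of any stats outside the range for the group's class."""
    low, high = STAT_RANGE[group.kind]
    stats = {"HP": group.hp, "MS": group.ms, "AS": group.as_, "ST": group.st}
    return [label for label, value in stats.items() if not low <= value <= high]


for g in GROUPS:
    print(g.name, out_of_range(g))
```

Every group listed in the post passes this check, so each line prints an empty list.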

If the Expert class falls under 2 players, the evolution ends. Otherwise, there is a gradient that kills off the 75 worst players, and the survivors reproduce to replace the 75 that were killed. The results aren't based on time or distance. They are based on things such as how many rivals the player killed, how many hearts the player lost after each stage, and how many stages the player cleared. One generation is three batches of 70 players, which is equal to 210 players a generation. The whole generation will be at least 150 minutes long, and after the 150 minutes, the remaining players do a quick go-through to save time. If this rule didn't exist, we'd have at least 2 players not done. To cut down on time further, we can select certain players in certain groups to see their progress with the stages. Once each player in a batch is done, the next batch starts, then the last. Afterwards, the 75 worst players are killed off and 75 new players are reproduced.
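As a rough sketch of how a player's result could be scored, here is a tiny Python example. It uses the measure the post gives for the line graph (the mean of rivals killed, stages completed, and hearts left). Treating that same mean as the fitness used for killing off players is an assumption, and so are the class and field names. Note that a plain mean is dominated by the rival count, since that number can reach 1665.

```python
from dataclasses import dataclass

BATCHES_PER_GENERATION = 3
PLAYERS_PER_BATCH = 70
PLAYERS_PER_GENERATION = BATCHES_PER_GENERATION * PLAYERS_PER_BATCH  # 210


@dataclass
class RunResult:
    rivals_killed: int
    stages_cleared: int
    hearts_left: int

    def fitness(self) -> float:
        # ASSUMPTION: unweighted mean of the three numbers, with no
        # rescaling, so rivals_killed carries most of the weight.
        return (self.rivals_killed + self.stages_cleared + self.hearts_left) / 3


print(PLAYERS_PER_GENERATION)        # 210
print(RunResult(30, 3, 3).fitness())  # 12.0
```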

The main screen will show 4 buttons and 3 graphs. The first button is "do a step-by-step gen.", which shows you the full 150 minutes of a generation so you can see the progress. The second button is "do a quick generation", which skips the movies and goes straight to the "kill off 75 worst players" step. The third button is "Do gens ASAP": you press it and it skips the whole process and goes through one generation. The final button says "Do gens ALAP", which means you click the button and generations will keep completing until you click to stop.

The first graph is a line graph. It shows the 100th percentile (the best player's progress), the 1st percentile (the worst player's progress, which means all other players performed better than this one), and the median or 50th percentile, plus other lines for the individual percentiles in increments of 10. Vertically, progress is measured as the mean of rivals killed, stages completed, and hearts left. Rivals killed means rivals the player killed before the rivals killed them. Because there are 1665 rivals in all, the players who killed all 1665 are the best, while players that kill 1 rival are the worst. Horizontally, progress is measured in generations. There are different options for this graph. You can change the vertical axis to time completed for a better visual of who the best player is, or to hearts left for a visual of how many skill points each player would have. You can also change the horizontal axis to batches to get a better visual of which batch was the best.

The second graph is a population graph. Each simulation starts with 210 random players from the four normal class groups, which are displayed as C + the first and second letter of their group + the number of their group. For example, the group "Player" is displayed as CPL1. The label can be changed to show the color of the group (explained later), the full group name, or the short group name. Let's use the group CPU as an example. It is labelled as CCP3, White, Normal Computer Piloted User Group 3, or CPU. Generally, when the population of a group falls below 7, its label will not show; however, the group is not "wiped off". "Wiped off" is a term used when a group does not exist anymore, but used to. If a group has a higher population than all the others, its infobox will be outlined in black.

The third graph is a Common Reports Chart, which is not a chart you could use on paper because it shows a video. This graph displays what the best, median, and worst players did. There are also some text stats that show the Group, Fitness (also later), and ID of the player. I'm not going to go into how IDs are displayed. Carykh's evolution simulators can help you.
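The percentile lines on the first graph could be computed once per generation with something like the Python sketch below. It follows the post in treating the 1st percentile as the worst player and the 100th as the best. How the lines in between are picked is not specified, so the index formula is an assumption, and `percentile_lines` is a made-up name.

```python
def percentile_lines(fitnesses):
    """Return {percentile: fitness} for one generation's line-graph points."""
    if not fitnesses:
        raise ValueError("need at least one player")
    ranked = sorted(fitnesses)  # worst first, best last
    n = len(ranked)
    lines = {1: ranked[0]}      # the post treats the 1st percentile as the worst player
    for p in range(10, 101, 10):
        # ASSUMPTION: pick the player at position p% of the way up the ranking.
        lines[p] = ranked[(p * (n - 1)) // 100]
    return lines


generation = list(range(210))   # stand-in fitness values for 210 players
lines = percentile_lines(generation)
print(lines[1], lines[50], lines[100])  # 0 104 209
```

Calling this once per generation and appending the results gives one point per line per generation, which is what the horizontal axis needs.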

Now let's talk about the gradient of killing the 75 worst players. A few lucky unfit players will survive (the worst player never survives) and a few unlucky fit players will die (the best player will never die; however, if the best player is the same after 5 generations, it is sure to die). We kill off 75 of the worst players because there's always a chance of a mutation, and I'll explain some common things that can happen to make players worse or better.
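One way to get a "gradient" like this is rank-weighted random culling, sketched below in Python. This is a sketch under my own assumptions, not the confirmed method: the worst player is always killed, the best player is always spared, and everyone else is drawn at random with worse-ranked players being more likely to be picked. The `power` parameter and the function name are hypothetical, and the "same best player for 5 generations" rule is not modelled here.

```python
import random


def cull(fitnesses, n_kill=75, rng=None, power=3.0):
    """Return the set of player indices to kill off.

    ASSUMPTION: the chance of dying grows with (rank ** power), where
    rank counts up from the best player. A higher power makes the
    gradient steeper, so fewer fit players die by bad luck.
    """
    rng = rng or random.Random()
    n = len(fitnesses)
    if not 1 <= n_kill < n:
        raise ValueError("n_kill must be between 1 and len(fitnesses) - 1")
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)  # best first
    worst = order[-1]
    middle = order[1:-1]            # the best player is never a candidate
    weight = {i: (rank + 1) ** power for rank, i in enumerate(middle)}
    killed = {worst}                # the worst player never survives
    while len(killed) < n_kill:
        pool = [i for i in middle if i not in killed]
        pick = rng.choices(pool, weights=[weight[i] for i in pool], k=1)[0]
        killed.add(pick)
    return killed


fitnesses = [float(i) for i in range(210)]
killed = cull(fitnesses, rng=random.Random(1))
print(len(killed))  # 75
```

The 135 survivors would then produce 75 children to bring the generation back up to 210.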

Let's talk about actions that players may do in their first through tenth generation. Common things that happen are: they never unblock their sword and therefore get a low rival kill count; they never attack and die; they keep wildly swinging forever (usually killing a lot of rivals until the red armor stage, when counterattacks become more common); or they combine never blocking with wild swinging, which makes them highly vulnerable.

We will also use a type of Q-learning that Code Bullet explains in his Snake video, only the players themselves have to learn that one or two of the things they do are causing them to get bad scores, so they eventually grow out of it. The brains of certain players will be copied and merged, and the parent that did the worst will make sure the brain of their child skips this problem. So in generation 3, for example, one parent is in the 91st percentile and the other is in the 37th; they have a child, and the child learns that excessive swinging isn't helpful and that you should only do it against weak enemies.

Then in about 8 or so generations, players know that they need to attack rivals when they can reach them. Then another problem emerges when the players face Miis with greater health. They look back at their ancestors who swung rapidly, so they do that when they come into contact with rivals like that. Then in 2 generations they may say "forget rapid swinging, they keep blocking", so they just attack once. Then after 4 generations or less they learn how to swing to hit a blocking enemy.

Then when the rookie players reach the reverse levels in 3 generations or so, they may lose some intelligence or retain it. Then the brown armor rivals start getting to them, and usually at stage 12 the players start dying, so they usually all tie for a place. That lasts until, 4 generations later, they learn that after stage 10 the brown rivals get more aggressive, so they try to hit as soon as the player and rival are within a swing's reach. The next obstacle is levels like stage 13, 14, 19, or 20 with groups of black armor rivals, where they learn that they should block more often and that when a rival attacks and gets blocked, it's time to start attacking. Then we skip some problems and go 11 generations in (these are all example numbers, which means this is not a spoiler) to the final stage, where Matt comes in. Usually all the players manage to get their total kill count into the 15-16 hundreds; the 85th percentile can get to the crowd of only black armor and the 96th percentile can get to Matt, until 1 generation later they can finally kill Matt and win.
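For readers who have not seen Q-learning before, here is the standard tabular update rule as a short Python sketch. I can't say this is exactly what the Snake video does; it is just the textbook version. The state names, the three actions, and the reward of -1 for getting parried are all made up for illustration.

```python
from collections import defaultdict

ACTIONS = ("block", "swing", "wait")  # hypothetical action set


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning step:
    Q(s, a) += alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state][a] for a in ACTIONS)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


# Every unseen state starts with a value of 0 for each action.
q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

# A player swings at a blocking rival, gets parried, and is punished for it.
q_update(q, "rival_blocking", "swing", reward=-1.0, next_state="parried", alpha=0.5)
print(q["rival_blocking"]["swing"])  # -0.5
```

After enough updates like this, swinging at a blocking rival ends up with a lower value than the other actions, which is the "they eventually grow out of it" behavior described above.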

Let's discuss what color is about. Color represents what group the players are in. The group "Player" wears Pink armor. "A.I." wears Purple, "CPU" wears White, "Bot" wears Light Blue, "Program" wears Brown with a hint of Blue, "Internet" wears Blue, "Knowledge" wears Chartreuse, "Search" wears Yellow, and "Hack" wears Grey. This is also their color on the Population Chart.

The C in front of the group name (example: CHA9 and CBO4 for Hack and Bot) means Category. So for CCP3, you would read it as Category of Computer Piloted User in Group 3. There may be nine groups, but there will always be one category: players (not the group name Player, or CPL1). The term "player" means the character that is playing the game. So for example, each batch has 70 players in it, which means 70 Miis who are not controlled by a real-life person are playing to get a good score.
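The label rule (C, then the first two letters of the group name, then the group number) is simple enough to write as a one-line Python function. The function name is made up. Skipping the dots in "A.I." so that it becomes CAI2 is my assumption, since the post does not show that label.

```python
def group_label(name: str, number: int) -> str:
    """Build a population-graph label such as CPL1 or CCP3."""
    # ASSUMPTION: punctuation is ignored, so "A.I." contributes "AI".
    letters = [ch for ch in name if ch.isalpha()]
    return "C" + "".join(letters[:2]).upper() + str(number)


print(group_label("Player", 1))  # CPL1
print(group_label("CPU", 3))     # CCP3
print(group_label("Bot", 4))     # CBO4
print(group_label("Hack", 9))    # CHA9
```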

Let's talk about a vocabulary term for our subject. A House (of generations) is ten generations. There will usually be two or three best players in a house, and 10 worst players in a house.

I will explain anything else if you request it.