Captain's Mistress

Place your projects here
cicciocb
Site Admin
Posts: 2195
Joined: Mon Feb 03, 2020 1:15 pm
Location: Toulouse
Has thanked: 470 times
Been thanked: 1461 times

Re: Captain's Mistress

Post by cicciocb »

No problem for me, I can borrow the single neuron that nature gave me :roll:

The problem is that it's not really used to making synapses, as it's always alone inside my brain :lol:
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

cicciocb wrote: ↑Tue Apr 09, 2024 8:21 am
No problem for me, I can borrow the single neuron that nature gave me :roll:

The problem is that it's not really used to making synapses, as it's always alone inside my brain :lol:
I have a bunch of them but they refuse to get on with each other.
The great thing about dementia is that you forget that you're ill :?

I've made some progress with this.
By treating each output neuron individually and doing a separate backprop for each error, it learns much quicker.
Also, when pondering how to handle "don't care" (e.g. full column) outputs, I found that the above method can be used to simply ignore those outputs in the backprop, which saves a lot of processing. That can also be expanded to ignore outputs when they are "good enough".
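Sketched in Python rather than Annex (the function name and values are just for illustration), the per-output idea looks like this: each output gets its own delta, and masked "don't care" outputs are skipped entirely.

```python
def output_deltas(outputs, targets, care):
    """Per-output error deltas for sigmoid units with squared-error loss.
    Outputs flagged "don't care" (e.g. a full column) are skipped entirely,
    so they contribute nothing to the weight updates and cost no backprop
    work. The same mask could also flag "good enough" outputs."""
    deltas = []
    for o, t, c in zip(outputs, targets, care):
        if not c:                                   # don't-care output: ignore
            deltas.append(0.0)
        else:
            deltas.append((t - o) * o * (1.0 - o))  # dE/dz for a sigmoid unit
    return deltas

# Example: column 2 is full, so its output is masked out of the backprop.
outs    = [0.7, 0.4, 0.9, 0.1]
targets = [1.0, 0.0, 0.0, 0.0]
care    = [True, True, False, True]
deltas  = output_deltas(outs, targets, care)
```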

I'm liking the idea of some sort of competition. Maybe a separate ESP playing over ESPnow. No connections, just load the code and go.
Would only need a basic program for the host to display the game (or any other game that comes to mind) as it progresses.
Anyone up for it?
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

A competition could be interesting. We would need to consider how to make it fair. I'm sure that there are no standard "Marquess of Queensberry" rules for this sort of thing. We wouldn't want to pit apples against oranges, so to speak. Would we just each take our neural net, with randomized weights, and let them start playing against each other and learning from each other as they go?
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

I think we need to keep it as simple as possible. In particular, so there is minimal work on the part of the host.
To be metaphorical with the sport of F1 racing: they stipulate fuel, engine size, weight, how the software controls the clutch, how much the fins can flex, a requirement to use at least two different tyre compounds, and a whole lot more, even what can/cannot be said over the radio. Using this analogy, I think we should just have a simple "fits this size" and "no pushing allowed".
For us, that would be "written entirely in Annex RDS", "Fits within an ESP module" and "no external influence."
All the host would need to check is that there is no internet connection getting instructions from an outside source.
Whether or not we need to add anything else, such as whether it must be purely neural-net or may include other functions such as block-obvious and do-obvious, can be debated. (Again, those would incur extra work for the host to check.)
I think a simple comms protocol can be done. Nothing more than, "start new game", "I play this column", "see that you played that column" and "game result". Forfeit the game if trying to play a full column or not responding within a set (to be determined) time.
No limits on NN structure, but have a time limit in which to make a move to avoid overly large NN and slow game play.
Hardware structure would be 3-off ESPs. The host and two players.
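To make the protocol concrete, here is a rough Python sketch (not Annex, and the message strings are my own invention, not an agreed standard) of the host-side referee for one move, covering the forfeit rules above:

```python
# Hypothetical message vocabulary for the three-ESP setup (host + two players).
NEW_GAME = "NEW"          # host -> both players: start new game
RESULT   = "END:{r}"      # host -> both: result (W/L/D, or F for forfeit)
OPPONENT = "OPP:{c}"      # host -> other player: "they played that column"
COLS, ROWS = 7, 6

def referee(move_msg, heights, deadline_missed):
    """Host-side check of one "PLAY:<col>" message. Forfeit on a malformed
    message, an out-of-range or full column, or a blown time limit."""
    if deadline_missed or not move_msg.startswith("PLAY:"):
        return RESULT.format(r="F")
    try:
        col = int(move_msg.split(":", 1)[1])
    except ValueError:
        return RESULT.format(r="F")
    if not 0 <= col < COLS or heights[col] >= ROWS:
        return RESULT.format(r="F")      # playing a full column forfeits
    heights[col] += 1
    return OPPONENT.format(c=col)        # relay the move to the other player
```

All the host keeps is a 7-entry list of column heights, so the check stays cheap.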

Not sure that learning "on the fly" is feasible with an ESP running Annex. It would simply take too long between moves and/or games.
(Early tests suggest an overnight run to learn around 10 states of play; I think several thousand will be required.)
Perhaps just submit a pre-learned weights file, which can easily be updated for subsequent rounds.
Or maybe a separate program that can run on a separate ESP, playing itself overnight. Then use those weights for the next round.

We also need to think about the definition of best/winner.
The most games won out of 1000 played. The quickest to make a move. The quickest to learn. The lowest number of neurons. All of the above.
Just a few thoughts, open to any suggestions.
Maybe no rules. Just the one who wins the most games (out of X) by whatever means (other than an outside source).
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

After thinking more about it, I am not interested in a competition. I would rather have a cooperation in which we experiment at our own pace and share ideas and results and program code. I really appreciate having your excellent Connect 4 game play and scoring and display code to experiment with neural net ideas. I would be glad to share the code that I wrote if you want to run comparisons with yours.

My most recent experiment was to train the net by playing games against a round-robin sequence of three opponents: AImove_R, AImove_TW, and AImove_TL. Without backpropagation, after 500 games, the ratio of net wins to opponent wins was about 0.8. With backpropagation, at 260 games, the ratio was 2.87, but continuing to 500 games, it gradually dropped to near 2.0. I reran it to 260 games and captured the weights to a file. When I reloaded the weights from the file and ran 500 games without backpropagation, it settled to a ratio of about 2.8. So, with the captured weights, the net wins at about 3.5 times the ratio for random weights with no backpropagation.
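The shape of that experiment, sketched in Python (the `play_game` stub is a stand-in for the real Connect 4 loop, and the opponent names are just labels here):

```python
import random

def play_game(net, opponent):
    """Stand-in for the full Connect 4 game loop; a biased coin here
    just so the sketch runs on its own."""
    return random.choice(["net", "net", "opponent", "draw"])

def train_round_robin(net, opponents, games=500):
    """Cycle the opponent each game (round robin) and track the figure
    quoted above: net wins / opponent wins. Draws count for neither side."""
    net_wins = opp_wins = 0
    for g in range(games):
        opponent = opponents[g % len(opponents)]   # R, TW, TL, R, TW, ...
        winner = play_game(net, opponent)
        if winner == "net":
            net_wins += 1
        elif winner == "opponent":
            opp_wins += 1
    return net_wins / max(opp_wins, 1)

random.seed(1)
ratio = train_round_robin(None, ["AImove_R", "AImove_TW", "AImove_TL"], 300)
```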
Last edited by botman on Sun Apr 14, 2024 2:33 pm, edited 1 time in total.
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

botman wrote: ↑Fri Apr 12, 2024 10:11 pm
After thinking more about it, I am not interested in a competition. I would rather have a cooperation in which we experiment at our own pace and share ideas and results and program code. I really appreciate having your excellent Connect 4 game play and scoring and display code to experiment with neural net ideas. I would be glad to share the code that I wrote if you want to run comparisons with yours.

My most recent experiment was to train the net by playing games against a round-robin sequence of three opponents: AImove_R, AImove_W, and AImove_L. Without backpropagation, after 500 games, the ratio of net wins to opponent wins was about 0.8. With backpropagation, at 260 games, the ratio was 2.87, but continuing to 500 games, it gradually dropped to near 2.0. I reran it to 260 games and captured the weights to a file. When I reloaded the weights from the file and ran 500 games without backpropagation, it settled to a ratio of about 2.8. So, with the captured weights, the net wins at about 3.5 times the ratio for random weights with no backpropagation.
I understand where you are coming from. The idea of a competition was to spur myself into "getting on with it".
I'm more than happy to work in cooperation with you (or others) and share code.
I have already started on some of your suggestions. In particular, your observation that many of the inputs are zero.
Not sure it does a lot for feedforward, because the derivatives need to be calculated anyway, but it is a simple test and avoids many calculations during backprop.
I'm also exploring skipping backprop calcs when the loss is within a certain tolerance. That helps stop conflicting weight updates from upsetting previously learned patterns.
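Both shortcuts fit in one update loop. A Python sketch (not Annex; here the tolerance test is applied to the per-output delta rather than the raw loss, and `tol=0.05` is an arbitrary illustration):

```python
def update_layer(weights, inputs, deltas, lr=0.5, tol=0.05):
    """Weight update exploiting the two observations above: a zero input
    contributes lr*delta*0, so its multiply-accumulate is skipped, and an
    output whose error is already within tol is left alone so learning a
    new pattern doesn't disturb weights that earlier patterns rely on."""
    skipped = 0
    for j, d in enumerate(deltas):
        if abs(d) <= tol:           # "good enough": don't touch these weights
            skipped += 1
            continue
        for i, x in enumerate(inputs):
            if x == 0.0:            # zero input: the update would be zero
                skipped += 1
                continue
            weights[j][i] += lr * d * x
    return skipped
```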

One area where I've shot myself in the foot is the neuron structure.
I chose to make each neuron two-dimensional, neuron(x,y), with x=0 for the bias and 1 onwards representing the neuron number.
The y dimension holds the partial derivatives, as these can be calculated during feedforward, thus avoiding the sigmoid calc during the backprop loops.
This structure works very well for feedforward but has proved to be an absolute nightmare when trying to nest the loops for backprop.
I also chose to make the weights single-dimension to help with saving and loading weights files, but this too is adding complexity to indexing in both forward and backward prop. I'm considering going back to two-dimensional weights. Any thoughts?
I'm also still trying to keep everything configurable so that it's just a simple variable that defines the number of neurons in each layer and also the game-board dimensions. ie a "general purpose" net structure.
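For what it's worth, the 1-D vs 2-D question is just a fixed index mapping, so you can have both: 2-D arrays for readable backprop loops and a flat list only when saving or loading the weights file. A Python sketch (not Annex) of the round trip:

```python
def flatten(w2d):
    """2-D weights [neuron][input] -> flat list for a weights file."""
    return [w for row in w2d for w in row]

def unflatten(w1d, n_neurons, n_in):
    """Flat list -> 2-D weights. n_in counts the bias as input 0, so the
    flat index of weight (neuron j, input i) is j * n_in + i."""
    return [w1d[j * n_in:(j + 1) * n_in] for j in range(n_neurons)]
```

The same mapping works for any layer size, so it stays compatible with a configurable "general purpose" structure.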

EDIT:
The learn rate can also be included in the derivative calculation during feedforward.
That reduces the calculations for the deeper-layered backprop and hence speeds things up quite a bit during learning, at the expense of a slight slow-down for normal play.
A solution would be one version of feedforward for game play and another for learning.
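The learning-mode version would look something like this (Python sketch, not Annex; LR=0.5 is just an illustrative value):

```python
import math

LR = 0.5   # learn rate, folded into the stored derivative during feedforward

def feedforward_learning(z_values):
    """Learning-mode feedforward for one sigmoid layer: alongside each
    activation, store lr * a * (1 - a), the learn-rate-scaled sigmoid
    derivative, so the backprop loops never recompute the sigmoid. The
    game-play version would return only `acts` and skip the second list."""
    acts, scaled_derivs = [], []
    for z in z_values:
        a = 1.0 / (1.0 + math.exp(-z))
        acts.append(a)
        scaled_derivs.append(LR * a * (1.0 - a))
    return acts, scaled_derivs
```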
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

cicciocb wrote: ↑Thu Mar 21, 2024 7:38 am .....
I recognize that the "bot" doesn't seem very smart and you can beat it easily.

....
There's my new competition ;)
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

The two-dimensional neurons with x=0 for the bias are fine by me.
Two-dimensional weight indexing would make it easier to be sure that the right weight gets updated.
I think that I strained one or two of my own neurons checking that the one dimensional indexing was correct in backpropagation.
I'm still not confident that I have it completely right, but the net seems to be learning.
Training the net by playing against the random "monkey" is satisfying, but it doesn't make it very smart.
That's why I chose to train against a round-robin sequence of three competitors, two of which are a little bit smarter about recognizing a winning play and taking it, or blocking an impending loss.
My net tends to keep playing the same column until it "overflows" and an unfilled column with the next-best score is played.
I think that training against a variety of more competent and only slightly random competitors could diversify the net's choices.
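A "slightly random" trainer could be as simple as this Python sketch (not Annex; the function name and eps=0.1 are my own choices): usually take the best-scored legal column, occasionally a random one, so the net sees varied positions.

```python
import random

def noisy_opponent(scored_moves, eps=0.1):
    """Mostly-competent trainer: with probability eps, pick a random legal
    column; otherwise play the best-scored one. scored_moves is a dict
    {column: score} covering only legal (unfilled) columns."""
    cols = list(scored_moves)
    if random.random() < eps:
        return random.choice(cols)          # occasional diversifying move
    return max(cols, key=lambda c: scored_moves[c])
```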
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

I'm getting similar results. Once it's picked a column, it likes to stick with that column.
Interestingly, I've found quite a few mistakes with my own version of backprop, yet despite those errors, it still appears to learn.
I've also done a few test versions (more to check the weight indexing than anything else) and it does quite a good job of learning even with just one layer being updated. Makes me wonder about dropping the hidden layer altogether.
I'll map something out using two-dimensional weight indexes to see if it saves a few of my own neurons being strained. Might just move the problem somewhere else though.
The training data does seem to be quite crucial. The paper you posted suggested that their NAIVE4 algorithm did not teach very well.
My AImove_TWL algorithm is almost identical to theirs. Maybe the teaching method needs to be improved.
I think a two-move look-ahead would be possible, but it might be a bit slow unless you can think of a way of avoiding going down all the blind alleys.
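Most of the blind alleys can be pruned by checking for an immediate win before recursing and abandoning any of our moves as soon as one opponent reply loses for us. A generic Python sketch (the four callables are stand-ins for the existing Connect 4 code, and opponent replies are approximated by the same legal-column list):

```python
def two_ply(legal, wins_now, opp_wins_after, score):
    """Two-move look-ahead with early cuts. legal: candidate columns.
    wins_now(c): our move in c wins at once. opp_wins_after(c, r): after
    our c, the opponent's reply r wins for them. score(c): static score
    used to pick between the surviving safe moves."""
    best, best_score = None, float("-inf")
    for c in legal:
        if wins_now(c):
            return c                    # immediate win: stop looking
        if any(opp_wins_after(c, r) for r in legal):
            continue                    # blind alley: opponent wins next move
        if score(c) > best_score:
            best, best_score = c, score(c)
    return best if best is not None else legal[0]   # every move loses
```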
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

This is an interesting article:
https://www.quantamagazine.org/how-do-m ... -20240412/
I don't think that my net "groks" Connect 4 yet.