Captain's Mistress

BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Post by BeanieBots »

Although I've called this Captain's Mistress, the real project is actually an exercise in neural networks.
I started this on a classic ESP32 with a 2.8" TFT several years ago, which probably indicates that I'll never actually get around to finishing it :roll:
One of the basics of a neural network is having a problem for it to solve.
The classic X-OR gate is too boring.
Noughts & Crosses (aka Tic-Tac-Toe) is too simple.
Chess is beyond my own brain processing power (though it should be possible with an ESP-S3).
I felt Backgammon would be too tricky for the graphics (though with VGA and pages it is now very possible).
That left me with Captain's Mistress, aka Connect-4.

I've got as far as creating a few simple algorithms to teach the neural network, and have created the network structure and feedforward.
I've yet to implement a workable back-propagation. Or at least, one that can run on an ESP and not require overnight processing for each move learned.
However, as part of that journey, I've come up with several versions of the teaching routine which in their own right play a fairly good game of Connect-4.
I've posted the code here for those who have a VGA_LCD and want to have a little fun challenging it to a game.
Hope it might inspire someone to have a go at something similar.
As a little extra, line 32 (the BoardSize setting) can be changed to select different board sizes.
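For anyone wanting to experiment away from the hardware first, the scoring idea behind the TW/TL routines in the listing below (a random base score, plus 5 for a move that wins, plus 2 for a move that blocks, then pick the highest) can be sketched on a PC. This is a minimal Python sketch rather than Annex code. The board representation and the names drop, undo, cell, wins and pick_move are assumptions made for the example, and the centre bias from AddBias is left out.

```python
import random

COLS, ROWS = 7, 6  # classic board size (BoardSize = 1 in the listing)

def drop(board, col, player):
    """Place a token for player in col; board is a list of column lists."""
    board[col].append(player)

def undo(board, col):
    """Remove the top token from col (used after a trial move)."""
    board[col].pop()

def cell(board, col, row):
    """Return the token at (col, row), or None if empty or off the board."""
    if 0 <= col < COLS and 0 <= row < len(board[col]):
        return board[col][row]
    return None

def wins(board, col, player):
    """True if the top token in col completes a line of four for player."""
    row = len(board[col]) - 1
    # Horizontal, vertical and the two diagonals
    for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):
        run = 1
        for sign in (1, -1):  # walk both ways from the new token
            c, r = col + sign * dc, row + sign * dr
            while cell(board, c, r) == player:
                run += 1
                c, r = c + sign * dc, r + sign * dr
        if run >= 4:
            return True
    return False

def pick_move(board, me, opponent, rng=random):
    """Score every legal column and return the index of the best one."""
    best_col, best_score = None, -1.0
    for col in range(COLS):
        if len(board[col]) >= ROWS:
            continue  # column is full, not a valid move
        score = rng.random()  # random base score in [0, 1)
        # +5 if the move wins for me, +2 if it would have won for the opponent
        for player, bonus in ((me, 5), (opponent, 2)):
            drop(board, col, player)
            if wins(board, col, player):
                score += bonus
            undo(board, col)
        if score > best_score:
            best_col, best_score = col, score
    return best_col
```

Because a win adds more than a block, the sketch always prefers finishing its own line over stopping the opponent's, which matches the 5 versus 2 weighting in the listing.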
C4.jpg

Code:

'**********************************************************************************
'*                                                                                *
'*  Captain's Mistress. Copyright BeanieBots      20/01/2021  V101                *
'*                                                16/02/2021  V102                *
'*  RAM 178404    FLASH 184320                                                    *
'*  480 * 320 version (Colours are inverted)      06/03/2021                      *
'*  Redone to suit 240*320 screen                 25/08/2022                      *
'*  Code clean & reduction                        29/08/2022  V103                *
'*  Upgrade to 1.47.2                             25/10/2022  V104                *
'*  Removed exit/Do + re-structure of if/then/else due to bugs in 1.47.2          *
'*  Added routine to put game status into inputs  29/10/2022 V105                 *
'*  Used GUI for text                             31/10/2022                      *
'*  Added NN code                                 01/11/2022 V110                 *
'*  Ported to VGA LCD                             20/02/2024 V120                 *
'*  Improved layout                               22/02/2024 V121                 *
'*  Added comments. Lest I should forget!         24/02/2024 V122                 *
'*  Removed redundant/redacted code               25/02/2024 V123                 *
'*  Re-organised integration for button response. 28/02/2024 V124                 *
'*                                                                                *
'**********************************************************************************
Gosub Init_LCD
vga.fill 0
vga.text.draw "Captain's Mistress", 400,30,13
vga.text.draw "Initialising", 400,110,13
vga.text.draw "Please wait" ,400,190,13
vga.show
'OPTION.CPUFREQ 80                '80|160|240
SCNx=640  'TFT Screen x size
SCNy=480  'TFT Screen y size
N = rnd(-1) 'Fix random seed for consistent results
True=1:False=0
BoardSize = 1         'Pick a value 1-6 as described below
Animation = True      'Set to False to skip animated play.
SelfPlay = False      'If true, will use AI for human move.
PlayAgain = False     'Set true to start a new game.

Select case BoardSize 'Cs = number of Columns. Rs = number of rows
  Case 1:Cs= 7:Rs= 6  'Classic size
  Case 2:Cs= 8:Rs= 7  'Bigger       
  Case 3:Cs= 9:Rs= 8  'Large. Time for AI move might be high after here
  Case 4:Cs=10:Rs= 9  'Extra large
  Case 5:Cs=11:Rs=10  'Huge
  Case 6:Cs=12:Rs=11  'Massive
  case else:Wlog "Error"
End Select

S=(SCNy-20)/(Rs+1)     'Scaling factor
Move=0                 'The column number to be played
MoveScore = 0          'Value assigned to the move by the AI
P1wins=0               'number of times human has won
P2wins=0               'number of times the AI has won
Draws=0                'Number of games resulting in a draw
Busy=False             'Flag to prevent interrupting AI calculations
Sig = 0                'count of sigmoid iterations (used to show progress)
'------------------------------------------------------------------------------
numInputs=Rs*Cs*2                                 'Number of inputs
numInputNeurons=9                                 'Number of input neurons
numHiddenNeurons=9                                'Number of hidden neurons
numOutputNeurons=Cs                               'Number of output neurons
numSigs = (numInputNeurons+1)+(numHiddenNeurons+1)+(numOutputNeurons+1)
numTrainingData=10                                'Training sets held in RAM
LearnRate = 0.001                                 'The NN learn rate

Dim InputData(numInputs)                          'The network inputs
Dim InputWeight(numInputs*numInputNeurons)        'The input weights
Dim InputSum(numInputNeurons)                     'The input neuron sums
Dim InputResult(numInputNeurons,4)                'The input neuron outputs

Dim HiddenWeight(numInputNeurons*numHiddenNeurons)'The hidden weights
Dim HiddenSum(numHiddenNeurons)                   'inputs to the hidden neurons
Dim HiddenResult(numHiddenNeurons,4)              'Result from the hidden neurons

Dim OutputWeight(numHiddenNeurons*NumOutputNeurons)
Dim OutputSum(numOutputNeurons)
Dim OutputResult(numOutputNeurons,4)

Dim TrainingInputs(numInputs,numTrainingData)
Dim TrainingTargets(numOutputNeurons,numTrainingData)

Gosub Initialise_NN            'Load NN weights
Player = 1                     'Human goes first after a reset/power up.
Dim Board(1,Cs,Rs)             'Holds the state of play independently for each player.
Dim H(Cs)                      'Played Height of each column.
Dim L(1,Cs,5)                  'MovePlayed. (Player,Column,Move_Index) Used by AI.
Dim P(Cs)                      'AI optional move
Dim Pcolour(1)                 'Player colours

Pcolour(0)=tft.rgb(255,32,0)   'Player 1 [Human]
Pcolour(1)=tft.rgb(8,128,8)    'Player 2 [AI]
GridCol=tft.rgb(0,0,0)         'Main colour of grid
BlankCol=tft.rgb(16,16,16)     'Colour of non-played grid slots
ForeCol=tft.rgb(255,255,255)   'Text colour
BackCol=tft.rgb(0,0,64)        'Background colour
CoreCol=tft.rgb(32,32,32)      'Colour of space between grid and tokens.
WinCol=tft.rgb(0,0,0)          'Colour used to highlight a winning line.
vga.fill BackCol               'Clear the screen
vga.text.color ForeCol,BackCol 'Set text & background colours
vga.text.font 10               'Set to suit screen used.

PosX = 580:Wy = 170:H=45
Prg1 = vgaGUI.ProgressBar(PosX, 170, Wy, H, 0 ,0 )
vgaGui.SetColor Prg1, tft.rgb(0,128,0) , tft.rgb(240,0,0) ,tft.rgb(0,255,0)
vgagui.setvalue Prg1,100

txtPlayer = vgagui.textline(PosX,20,Wy,H," --- ",11,tft.rgb(255,255,255),tft.rgb(0,0,0),tft.rgb(0,255,0),4,0)
txtResult = vgagui.textline(PosX,70,Wy,H," --- ",11,tft.rgb(255,255,255),tft.rgb(0,0,0),tft.rgb(0,255,0),4,0)
txtProgress = vgagui.textline(PosX,120,Wy,H,"Progress",10,tft.rgb(255,255,255),tft.rgb(0,0,0),tft.rgb(0,255,0),4,0)

btn1 = VGAGUI.Button(Posx,220,Wy,H,"Play again",10,12,0,0,white,blue,green,white)
btn2 = VGAGUI.Button(Posx,270,Wy,H,"Animation OFF",10,11,0,0,white,blue,black,white)
btn3 = VGAGUI.Button(Posx,320,Wy,H,"Self-Play ON",10,11,0,0,white,blue,black,white)

vgagui.setevent btn1,1,btnPlay
vgagui.setevent btn2,1,btnAni
vgagui.setevent btn3,1,btnSelf

Posx = posx + 50
vga.text.align 1
vga.text.draw "Player 1",Posx,380,10
vga.text.draw "Player 2",Posx,402,10
vga.text.draw "   Draws",Posx,424,10
vga.text.draw "   Ratio",Posx,446,10
Posx = PosX + 60:H=22
txtP1wins = vgagui.textline(PosX,380,60,H,"0",10,tft.rgb(255,255,255),BackCol,BackCol,2,0)'P1
txtP2wins = vgagui.textline(PosX,402,60,H,"0",10,tft.rgb(255,255,255),BackCol,BackCol,2,0)'P2
txtDraws = vgagui.textline(PosX,424,60,H,"0",10,tft.rgb(255,255,255),BackCol,BackCol,2,0)'Draws
txtRatio = vgagui.textline(Posx,446,60,H,"-",10,tft.rgb(255,255,255),BackCol,BackCol,2,0)'Ratio

vgagui.refresh 1, touch.x, touch.y, touch.z
vga.show

vgaGui.SetText txtPlayer, " --- "
'vgagui.autorefresh 50 '#debug
vga.text.font 11
vgagui.refresh 1, touch.x, touch.y, touch.z'#debug
vga.show

Timer0 200, RefreshVGA 'Remove when autorefresh is implemented
'##################################################################################
Do  
  Gosub NewGame
  '-----------------------------------------------------
  Do   
    If Player = 0 then 'Either Human or selfplay as player 0
      If SelfPlay = False then
        WaitForHuman = True
        PlayAgain = False
        Gosub HumanMove
        Else
        PlayAgain = True
      'Uncomment the AI version for Player 0 (Red)[selfplay]
      '-----------------------------------------------------
      Gosub AI_Random 'required
      Gosub AddBias   'optional
      'Gosub AImoveNN 'Use the neural network
      'Gosub AImove_R 'A purely random move
      'Gosub AImove_TW 'Will play an obvious winning line
      Gosub AImove_TL 'Will block an obvious winning line
      'Gosub AImove_TWL 'Will do both TW & TL
      '-----------------------------------------------------
      EndIf      
      vga.show
    Else                     'AI plays as player 1
      vgagui.setvalue Prg1,0
      vgagui.refresh 1, touch.x, touch.y, touch.z
      'Uncomment the AI version for Player 1 (Green)
      '---------------------------------------------
      Gosub AI_Random 'required
      Gosub AddBias   'optional
      'Gosub AImoveNN 'Use the neural network.
      'Gosub AImove_R 'A purely random move
      'Gosub AImove_TW 'Will play an obvious winning line
      'Gosub AImove_TL 'Will block an obvious winning line
      Gosub AImove_TWL 'Will do both TW & TL
      '---------------------------------------------
      vgagui.setvalue Prg1,100
      vga.show'#debug     
    EndIf
    Pause 1
    vga.show
    vgagui.refresh 1, touch.x, touch.y, touch.z
  Loop Until Result > 0  'Stop playing after win/lose/draw
  
  Gosub ShowWin 
  
  Do
    vgagui.refresh 1, touch.x, touch.y, touch.z 
    vga.show      
  Loop until playagain = true 'wait until button pressed if not selfplay
  
  Pause 250 'Gives time to see result during selfplay
  
Loop

vga.fill red'Indicate that we've had an oops!

'------------------------------------------------------
wait
'##################################################################################
NewGame:
  Busy = true
  vgagui.settext txtResult,""
  Column = 0:Row = 0:Result = 0      'Clear the play positions
  SP = 0                             'Start point used by test for win
  vga.rect s/2-2,SCNy-(S*(Rs+0.5))-2, S*Cs+5,(S*Rs+5),GridCol,1,4  'Clear the grid
  VGA.TEXT.COLOR white ,BackCol
  vga.text.font 11
  vga.text.align 4
  vga.text.padding 25
  for Col=1 to Cs                                          'for each column
    vga.text.draw str$(Col), (Col*S-Rs+3),((SCNy-(S+2)*(Rs+1))+11),11
    For Row = Rs to 1 step-1                               'For each row
      X=Col*S:Y=SCNy-(Row * S)                             'Calculate X & Y coordinates
      vga.rect (X-(s/2)+2),(Y-(s/2)+2),S-3,S-3,CoreCol,1,7 'draw each play position
      vga.circle X,Y,(S/3),BlankCol,1                      'set to non-played
    Next Row
  Next Col
  vga.show
  Gosub NextPlayer
  For C=0 to Cs
    H(C)=0
    For R = 0 to Rs
      Board(0,C,R)=0:Board(1,C,R)=0
    Next R
  Next C
  Busy = False
Return
'**********************************************************************************
HumanMove:
  Pause 1
  Touch.read
  vgagui.refresh 1, touch.x, touch.y, touch.z
  vga.show '#debug
  If WaitForHuman=False then
    Return
  EndIf
  Touch.Read
  If touch.z=1 then
    X=(touch.x+12)/S
    Y=(SCNy-touch.y+9)/S
    Column=int(X)
    Row=int(Y)
    If Column<1 or Row<1 then goto HumanMove
    if Column > Cs or Row > (Rs+1) then goto HumanMove
    If H(Column)>=Rs then goto HumanMove
    Do
      vgagui.refresh 1, touch.x, touch.y, touch.z
      Touch.read
      Pause 50
    Loop Until touch.z=0 'wait until touch is released
    WaitForHuman=False   'Disable any input from touching 
    Gosub DoMove
    Gosub ShowMove
    Gosub TestForWin
    Gosub NextPlayer
  endif
  
return
'*********************************************************************************
RefreshVGA:
  If Busy = False then
   touch.read
   vgagui.refresh 1, touch.x, touch.y, touch.z
   vga.show
  EndIf
Return
'*********************************************************************************
btnPlay:
  PlayAgain = True
  vgagui.refresh 1, touch.x, touch.y, touch.z
  vga.show
Return
'*********************************************************************************
btnAni:
  If Animation = True then
    Animation = False
    vgagui.settext btn2,"Animation ON"
    else
    vgagui.settext btn2,"Animation OFF"
    Animation = True
  EndIf
  vgagui.refresh 1, touch.x, touch.y, touch.z
  vga.show
Return
'*********************************************************************************
btnSelf:
  If SelfPlay = True then
    SelfPlay = False
    vgagui.settext btn3,"Self-Play ON"
    else
    vgagui.settext btn3,"Self-Play OFF"
    SelfPlay = True
  EndIf
  vgagui.refresh 1, touch.x, touch.y, touch.z
  vga.show
Return
'*********************************************************************************
NextPlayer:
  Player=1-Player
  Select case Player      
    Case 0'Human
      vgagui.settext txtPlayer,"YOUR MOVE"
      vgagui.settext txtResult,"WAITING"    
    Case 1'AI
      sig = 0 'Reset the sigmoid iteration counter
      vgagui.settext txtPlayer,"MY MOVE"
      vgagui.settext txtResult,"THINKING"
  end select
  vga.show
  vgagui.refresh 1, touch.x, touch.y, touch.z
Return
'*********************************************************************************
AImove_R:                'A purely random move selection 
  Gosub PickBest         'Select the highest
  Column = Move          'Assign it
  Gosub DoMove           'Make the move
  Gosub ShowMove         'Update the display (inc. animation if applicable)
  Gosub TestForWin       'Check for a win
  Gosub NextPlayer       'Next player if no play result 
Return
'*********************************************************************************
AImove_TW:
  For M = 1 to Cs
    If P(M) > 0 then 'ignore invalid moves already identified
      Column = M
      Gosub DoMove
      Gosub TestForWin
      Gosub GetMoveScore
      If MoveScore = 2 then
        P(M)=P(M)+5  'Add high value (5) for a win
      EndIf 
      Gosub UndoMove
    EndIf
  Next M   
  Gosub PickBest         'Select the highest
  Column = Move          'Assign it
  Gosub DoMove           'Make the move
  Gosub ShowMove         'Update the display (inc. animation if applicable)
  Gosub TestForWin       'Check for a win
  Gosub NextPlayer       'Next player if no play result 
Return
'*********************************************************************************
AImove_TL:
  Player = 1 - Player
  For M = 1 to Cs
    If P(M) > 0 then 'ignore invalid moves already identified
      Column = M
      Gosub DoMove
      Gosub TestForWin
      Gosub GetMoveScore
      If MoveScore = 2 then
        P(M)=P(M)+2  'Add smaller value (2) for a lose
      EndIf
      Gosub UndoMove
    EndIf
  Next M  
  Player = 1 - Player  
  Gosub PickBest         'Select move with the highest score
  Column = Move          'Assign it
  Gosub DoMove           'Make the move
  Gosub ShowMove         'Update the display (inc. animation if applicable)
  Gosub TestForWin       'Check for a win
  Gosub NextPlayer       'Next player if no play result 
Return
'*********************************************************************************
AImove_TWL:
  Progress = 0      'Starting progress value
  Div = Cs * 2 + 1  'Divisor for progress bar  
  'Busy = True       'Speeds things up a little
  For M = 1 to Cs
    If P(M) > 0 then 'ignore invalid moves already identified
      Column = M
      Gosub DoMove         'try the position
      Gosub TestForWin     'check the outcome
      Gosub GetMoveScore   'give it a score
      If MoveScore = 2 then
        P(M)=P(M)+5  'Add high value for a win
      EndIf 
      Gosub UndoMove       'undo the tested move
    EndIf
     Progress = Progress + 1
     Vgagui.setvalue Prg1,Progress/Div * 100
     vgagui.refresh 1, touch.x, touch.y, touch.z   
  Next M  
  Player = 1 - Player    'Now try for the other player
  For M = 1 to Cs
    If P(M) > 0 then     'ignore invalid moves already identified
      Column = M
      Gosub DoMove
      Gosub TestForWin
      Gosub GetMoveScore
      If MoveScore = 2 then
        P(M)=P(M)+2      'Add smaller value for a lose
      EndIf
      Gosub UndoMove
    EndIf
     Progress = Progress + 1
     Vgagui.setvalue Prg1,Progress/Div * 100
     vgagui.refresh 1, touch.x, touch.y, touch.z
  Next M  
  Player = 1 - Player    'Put back to original player
  Gosub PickBest         'Select the highest
  Column = Move          'Assign it
  Gosub DoMove           'Make the move
  Vgagui.setvalue Prg1,100
  vgagui.refresh 1, touch.x, touch.y, touch.z
  Gosub ShowMove         'Update the display (inc. animation if applicable)
  Gosub TestForWin       'Check for a win
  Gosub NextPlayer       'Next player if no play result 
  Busy = False           're-enable other events
Return
'*********************************************************************************
AI_Random:
  For N = 1 to Cs
   If H(N) < Rs then
     P(N)=RND(1)        'Random value (0-1) if valid move
     else
     P(N) = -10         ' -10 if move invalid
   EndIf
  Next N
Return
'*********************************************************************************
AddBias:
  BiasDelta = 0.63      'Variance for center bias value (typ 0.63)
                        '0 will apply no bias
                        '0.003 is about the lowest value of any impact.
                        'Values >1 may cause problems with AI choice.
                        
  Bias = 0              'Clear accumulated bias value
  For N = 1 to Cs
     If N < int(Cs/2+1.5) then
       Bias = Bias + (BiasDelta/Cs)'Inc up to the middle
     EndIf 
     If N > int(Cs/2+1.0) then
       Bias = Bias - (BiasDelta/Cs)'Dec from middle onwards
     EndIf 
   P(N) = P(N) + sqr(Bias) 'using SQR reduces impact of spread
  Next N
'  for n=1 to cs '#debug causes crash after many iterations
'  wlog p(n)
'  next n
'  wlog ""
Return
'*********************************************************************************
AImoveNN:
  Busy = True 'speeds it up a bit but buttons less responsive.
  Gosub FeedForward
  Score = 0
  Move=1

  For j = 1 to numOutputNeurons
    If H(j) >= Rs then
     OutputResult(j,0)=0  'Reject if column is already full
    EndIf
  Next j

  For j = 1 to numOutputNeurons
    If Score < OutputResult(j,0) then
       Score = OutputResult(j,0) 'Use output with highest value
       Move = j   
    EndIf
  Next j
 
  Column = Move
  Gosub DoMove
  Gosub ShowMove
  Gosub TestForWin
  Gosub NextPlayer
  Busy = False
Return
'*********************************************************************************
GetMoveScore:           'Convert Result into play score
  MoveScore = 0
  Select Case Result
    Case 0              'Nothing changed, so do nothing
      MoveScore = 0
    Case 1 to 4         'There was a win
      MoveScore = 2
    Case 5              'It was a draw
      MoveScore = 1
  End Select
Return
'*********************************************************************************
PickBest:
  Temp = -10
  For j = 1 to Cs
    If P(j) > Temp then
      Temp = P(j)
      Move = j
    EndIf
  Next j
Return
'*********************************************************************************
DoMove:
  H(Column)=H(Column)+1                                'Increment column height
  H(0) = H(0) + 1                                      'Increment Total move count
  Row=H(Column)                                        'Get the row number
  Board(Player,Column,Row)=1                           'Set position as played
  Board(Player,0,Row) = Board(Player,0,Row) + 1        'Increment row count
  Board(Player,Column,0) = Board(Player,Column,0) + 1  'Increment column count
Return
'*********************************************************************************
UndoMove:
  H(0) = H(0) - 1
  Board(Player,Column,Row)=0
  Board(Player,0,Row) = Board(Player,0,Row) - 1
  Board(Player,Column,0) = Board(Player,Column,0) - 1
  H(Column)=H(Column)-1
  Row=H(Column)
  Result = 0 'Clear any test result that may have been made (move this to testmove)
Return
'*********************************************************************************
TestForWin:
  T = millis
  R=H(Column)

  For N = 1 to (Cs-3)  'Check the played Row (Horizontal)
    Acc = 0
    For C = 0 to 3
      Acc = Acc + Board(Player,(C+N),R)
    Next C
    If Acc = 4 then
      Result = 3
      SP=N
      Exit For
    Else
      Acc = 0
    EndIf
  Next N

  If Result <> 0 then
    Goto TestDone
  EndIf

  'Check the played column
  'If H(Column)<4 then goto Vnotpossible to speed up testing.
  For R = 1 to (Rs-3)
    Acc = 0
    For SP = 0 to 3
      Acc = Acc + Board(Player,Column,(R+SP))
    Next SP
    If Acc = 4 then
      Result = 4
      Exit For
    Else
      Acc = 0
    EndIf
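  Next R

  If Result <> 0 then   'Vertical win found, so skip the remaining tests
    Goto TestDone       '(otherwise a win on the final move is reported as a draw)
  EndIf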

  'Check the diagonals
  Acc = 0
  If Column >= Row then
    C = Column - Row + 1
    R = 1
  Else
    R = Row - Column + 1
    C = 1
  EndIf

  SP=0
  Do
    If Board(Player,(C+SP),(R+SP))=1 then
      Acc = Acc + Board(Player,(C+SP),(R+SP))
    else
      Acc = 0
    EndIf
    SP = SP + 1
  Loop Until (((R+SP) > Rs) or ((C+SP)> Cs) or (Acc=4))

  If Acc = 4 then
    Result = 1
    Goto TestDone 'skip all other tests
  EndIf
  'Up and Left
  Acc=0
  If (Column+Row) <= Cs then
    C = Column + Row - 1
    R = 1
  Else
    C = Cs
    R = Row  - (Cs - Column)
  EndIf

  SP=0
  Do
    If Board(Player,(C-SP),(R+SP))=1 then
      Acc = Acc + Board(Player,(C-SP),(R+SP))
    else
      Acc = 0
    EndIf
    SP = SP + 1
  Loop Until (((R+SP) > Rs) or ((C-SP)< 1) or (Acc=4))

  If Acc = 4 then
    Result = 2
    Goto TestDone
  EndIf

  If H(0)>=(Cs*Rs) then
    Result=5
  EndIf

TestDone: 'Label to jump to for avoiding tests once result is known.
  T=millis-t
  'wlog "Test_Time ";T
  vga.show
Return
'*********************************************************************************
ShowMove:
  'Busy = true
  If Animation = False Then
    Px=S*int(Column)
    Py=SCNy-(S*H(Column))
    vga.circle Px,Py,(S/3),Pcolour(Player),1
    vga.show
  else
    Px=S*int(Column)
    For N=Rs to H(Column) step-1 'Drop down the column
      Py=SCNy-(S*N)
      vga.circle Px,Py,(S/3),Pcolour(Player),1
      vga.show
      Pause 80
      vga.circle Px,Py,(S/3),BlankCol,1
      vga.show
    Next N
    vga.circle Px,Py,(S/3),Pcolour(Player),1
    For N = 1 to 4                'Flash at the resting place
      Pause 50
      vga.circle Px,Py,(S/3),BlankCol,1
      vga.show
      Pause 70
      vga.circle Px,Py,(S/3),Pcolour(Player),1
      vga.show
    Next N
  EndIf
  
    for N=1 to Cs
      If H(N)>=Rs then
        vga.text.color red,BackCol
        vga.text.padding 25
        vga.text.draw "X",(N*S-Rs+3),((SCNy-(S+2)*(Rs+1))+11)
      endif
    next N
  vga.show
  Busy = false
Return
'*********************************************************************************
ShowWin:
  'The player & their last move are known.
  'and so is the starting point (SP) of the winning line.
  'N=SP 'Only need to preserve this for diagonals

  Select Case Result
    
    Case 1'wlog "Diagonal Right"
      For W = 1 to 4
        Px=S*(C+SP-W)
        Py=SCNy-(S*(R+SP-W))
        vga.circle Px,Py,(S/6),WinCol,1
      Next W
      vga.LINE px+s*3, py-s*3, px, py, tft.rgb(255,255,255),7
    '#######################################
    Case 2'wlog "Diagonal Left"
      For W = 1 to 4
        Px=S*(C-SP+W)
        Py=SCNy-(S*(R+SP-W))
        vga.circle Px,Py,(S/6),WinCol,1
      Next W
      vga.LINE px-s*3, py-s*3, px, py, tft.rgb(255,255,255),7
    '#######################################
    Case 3'Wlog "Horizontal"
      Py=SCNy-(S*Row)
      For W = 0 to 3
        Px=S*(SP+W)
        vga.circle Px,Py,(S/6),WinCol,1
      Next W
      vga.LINE px, py, px-s*3, py, tft.rgb(255,255,255),7
    
    Case 4'Wlog "Vertical"
      Px=S*Column
      For W = 0 to 3
        Py=SCNy-(S*(Row-W))
        vga.circle Px,Py,(S/6),WinCol,1
      Next W
      vga.LINE px, py, px, py-s*3, tft.rgb(255,255,255),7
    
    Case 5'Wlog "Draw"
      vgagui.settext txtResult,"DRAW"
      Draws = Draws + 1
    
    Case else
      wlog "Error"
    
  End Select

  vgagui.settext txtDraws, str$(Draws)
  vgagui.settext txtPlayer,"RESULT"

  If Result <> 5 then 'ie. not a draw 
      
    If Player = 1 then
      P1wins = P1wins + 1
      vgagui.settext txtP1wins, str$(P1wins)
      vgagui.settext txtResult,"YOU WIN"
    Else
      P2wins = P2wins + 1
      vgagui.settext txtP2wins, str$(P2wins)
      vgagui.settext txtResult,"I WIN"
    EndIf
    
  EndIf
  
  If P1wins > 0 then 'avoid divide by zero
    vgagui.settext txtRatio, STR$((P2wins)/(P1wins), "%2.2f")
  EndIf
 
  vga.show '#debug
Return
'*********************************************************************************
Initialise_NN:
  T=millis        'Initialise timer
  D=0.7            'Random value divisor
  For N=0 to (numInputs*numInputNeurons)          'Initialise the input weights
    InputWeight(N)=((RND(1)-0.5)/D)
  Next N
  For N=0 to (numInputNeurons*numHiddenNeurons)   'Initialise the hidden layer weights
    HiddenWeight(N)=(RND(1)-0.5)/D
  Next N
  For N=0 to (numHiddenNeurons*numOutputNeurons)  'Initialise the output weights
    OutputWeight(N)=(RND(1)-0.5)/D
  Next N
  wlog "Init = ";(millis-T) 'Note time taken to initialise
Return
'**********************************************************************************
Sigmoid:
  Sig = Sig+1 'Running total of sigmoid iterations
  Sigmoid = 1 / (1 + Exp(-DataValue))
  Rate = Sigmoid * (1 - Sigmoid)       'While we're here, do the derivative as well
  Pause 15                             'REMOVE ME LATER
  vgagui.setvalue Prg1,100*sig/numSigs
  vgagui.refresh 1, touch.x, touch.y, touch.z  '#debug
  vga.show        '#debug
Return
'**********************************************************************************
GetGamePlay: 'Get the board state and convert to neuron inputs
  InputData(0) = H(0) 'The total move count (might have an effect? TBD)
  For i = 1 to Cs
    For j = 1 to Rs
      N = (i-1)*Rs + j                    'Unique input index per cell (1..Cs*Rs)
     InputData(N)=Board(0,i,j)            'Player0 status
     InputData(N + (Cs*Rs))=Board(1,i,j)  'Player1 status
    Next j
  Next i
Return
'**********************************************************************************
FeedForward:
  T = millis 'Start the routine timer
  'Clear the input neuron sums
  For N=0 to numInputNeurons
    InputSum(N)=0
  Next N
  Gosub GetGamePlay 'Populate the inputs with the current state of play
  'Calculate the sums for the input neurons
  For I = 0 to numInputNeurons                            'for each input sum
    For N=0 to numInputs                      'point to each input
      Src=(I*numInputs)                           'calculate the source index
      InputSum(I)=InputSum(I) + ((InputData(N)*InputWeight(Src)))'get the sum
    Next N
  Next I
  'get the output value of each input neuron
  For I = 0 to numInputNeurons
    DataValue = InputSum(I)
    Gosub Sigmoid
    InputResult(I,0)=Sigmoid
    InputResult(I,1)=Rate
    Pause 2 'remove after speed tests complete
  Next I
  'Clear the hidden sums
  For N=0 to numHiddenNeurons
    HiddenSum(N)=0
  Next N
  'Calculate the sums for the hidden neurons
  For I = 0 to numHiddenNeurons
    For N = 0 to numInputNeurons
      Src=(I*numInputNeurons)                            'calculate the source index
      HiddenSum(I)=HiddenSum(I) + ((InputResult(N,0)*HiddenWeight(Src)))'get the sum
    Next N
    Pause 2 'remove after speed tests complete
  Next I
  'Calculate the output value for each hidden neuron
  For I = 0 to numHiddenNeurons
    DataValue = HiddenSum(I)
    Gosub Sigmoid
    HiddenResult(I,0)=Sigmoid
    HiddenResult(I,1)=Rate
    Pause 2 'remove after speed tests complete
  Next I
  'Clear the output sums
  For N = 0 to numOutputNeurons
    OutputSum(N)=0
  Next N
  'Calculate the sums for the output neurons
  For I = 0 to numOutputNeurons
    For N = 0 to numHiddenNeurons
      Src=(I*numHiddenNeurons)
      OutputSum(I)=OutputSum(I) + ((HiddenResult(N,0)*OutputWeight(Src)))
    Next N
    Pause 2 'remove after speed tests complete
  Next I
'Calculate output values for output neurons
  For I = 0 to numOutputNeurons
    DataValue = OutputSum(I)
    Gosub Sigmoid
    OutputResult(I,0)=Sigmoid
    OutputResult(I,1)=Rate
    Pause 2 'remove after speed tests complete
  Next I
  For N=0 to numOutputNeurons
    wlog N," ",str$(OutputResult(N,0), "%3.17f")," ",str$(OutputResult(N,1),"%1.17f")
  Next N
  wlog "FeedForward = ";(millis - T) 'Note time taken to run
Return
'*********************************************************************************

'#################################################################################
Init_LCD:
Num_Pages = 2   'Reserve two pages.
'Note: Config options module must be set to custom - No TFT
'SPI Pins: MOSI=11, MISO=13, SCLK=12
'TFT pins all set to -1 SDcard CS=10
'I2S pins BCLK=0, WSEL=18, DOUT=17
  vga.pinout 14, 21, 47, 48, 45, 9, 46, 3, 8, 16, 1, 15, 7, 6, 5, 4, 39, 40, 41, 42
  vga.delete 'Reclaim memory
  option.touch 1 'setup capacitive touch
  i2c.setup 19, 20
  touch.init 'initialise the capacitive touch
  vga.init 12, Num_Pages
  pwm.setup 2, 7, 128, 200,8  'set backlight brightness 0-255
  tft.loadfont "/fonts/FreeMono9pt7b.bin", 10
  tft.loadfont "/fonts/FreeSans12pt7b.bin", 11
  tft.loadfont "/fonts/FreeSans18pt7b.bin", 12
  'tft.loadfont "/fonts/Cast_Iron24pt7b.bin", 13 
  tft.loadfont "/fonts/FreeSans24pt7b.bin", 13
  vgagui.init 15  'Reserve space for gui objects
Return
'###############################################################################


cicciocb
Site Admin
Posts: 2195
Joined: Mon Feb 03, 2020 1:15 pm
Location: Toulouse
Has thanked: 470 times
Been thanked: 1461 times

Re: Captain's Mistress

Post by cicciocb »

Thanks BeanieBots,
I'm happy to finally see an interesting project published, in particular one that involves the VGA output.

I can't wait to test it myself.

Re: Captain's Mistress

Post by BeanieBots »

I'm glad you find it interesting.
However, if you have already tried it out, you are probably disappointed, because I'm sure that the bit you are interested in is the bit that is still outstanding.
Namely, back-propagation.
As you can see from the version log, I've been messing with the Neural-Network for quite some time.
The problem is that 60+ years ago I only dedicated a small cluster of brain cells to school lesson attention and dementia has now destroyed those cells!
I am struggling with implementing the "chain rule" for partial differentiation in the back-prop algorithm for updating the weights.
Also, BASIC does not support matrix multiplication, so it has to be done old-school long hand.
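Long hand here just means a pair of nested loops over a flat weight array. The following is a minimal Python sketch, used only so the idea can be checked on a PC. The name layer_sums and the row-major layout (neuron * num_inputs + input) are assumptions for the example and are not taken from the listing, whose FeedForward computes Src from I alone.

```python
def layer_sums(inputs, weights, num_neurons):
    """Weighted sum for each neuron, using one flat list of weights."""
    num_inputs = len(inputs)
    sums = [0.0] * num_neurons
    for i in range(num_neurons):          # for each neuron
        for n in range(num_inputs):       # for each input feeding it
            src = i * num_inputs + n      # assumed row-major weight index
            sums[i] += inputs[n] * weights[src]
    return sums
```

With this layout every connection has its own weight, and the flat array needs exactly num_neurons * num_inputs entries when both loops start at zero.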
The forward propagation is working fine and thoroughly tested. In particular, the sigmoid function and its derivative (calculated at the same time) are working well. The 'only' bit left to do is sort out the partial derivatives and apply them to update the weights.
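As a rough guide to what the chain rule boils down to, here is a minimal Python sketch for a network with one hidden layer, written with plain loops rather than matrix operations. It is not the Annex code: the listing has three weight layers, whereas this sketch has two, no bias terms, and assumes a squared-error loss of 0.5 * sum((y - t)^2). All function names are made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_h, w_o):
    """Return hidden outputs h and network outputs y for input x."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_h]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in w_o]
    return h, y

def loss(x, t, w_h, w_o):
    """Assumed loss: half the sum of squared output errors."""
    _, y = forward(x, w_h, w_o)
    return 0.5 * sum((yk - tk) ** 2 for yk, tk in zip(y, t))

def gradients(x, t, w_h, w_o):
    """Partial derivatives of the loss with respect to every weight."""
    h, y = forward(x, w_h, w_o)
    # Output layer: error times the sigmoid derivative y * (1 - y)
    delta_o = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, t)]
    # Hidden layer: pass each output delta back through its weight,
    # then multiply by the hidden neuron's own sigmoid derivative
    delta_h = []
    for j, hj in enumerate(h):
        back = sum(delta_o[k] * w_o[k][j] for k in range(len(w_o)))
        delta_h.append(back * hj * (1.0 - hj))
    # A weight's gradient is the delta it feeds times the signal it carries
    g_o = [[d * hj for hj in h] for d in delta_o]
    g_h = [[d * xi for xi in x] for d in delta_h]
    return g_h, g_o
```

Each weight is then updated as weight = weight - LearnRate * gradient. A useful sanity check before porting is to nudge one weight by a tiny amount and confirm that the change in loss matches the computed gradient.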
I've done some preliminary speed tests. Feedforward takes about 1.5 s per iteration. Therefore, backprop should take a comparable amount of time. The problem is that it needs to iterate over several thousand passes for each new pattern to learn effectively. I have read that it is possible to apply 'momentum' to the learning. This might speed things up.
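As a rough illustration of the 'momentum' idea: each weight keeps a running "velocity" that blends the previous change with the new gradient, so repeated steps in the same direction build up speed. This is a generic Python sketch, not code from the project; the function name and the default `rate` and `momentum` values are assumptions, and how much it helps depends on the problem.

```python
def momentum_update(weights, gradients, velocity, rate=0.5, momentum=0.9):
    """Update weights in place using gradient descent with momentum."""
    for i in range(len(weights)):
        # Keep a fraction of the previous change, then add the new gradient step
        velocity[i] = momentum * velocity[i] - rate * gradients[i]
        weights[i] += velocity[i]

# Usage: two steps with the same gradient; the second step is larger.
weights = [1.0]
velocity = [0.0]
momentum_update(weights, [2.0], velocity, rate=0.1)
print(weights, velocity)
momentum_update(weights, [2.0], velocity, rate=0.1)
print(weights, velocity)
```

The cost is one extra stored value per weight, which matters on a RAM-limited ESP32.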
Meanwhile, I'm considering adding a second ESP-S3 via serial comms to run the backprop as an independent background task.

Nonetheless, as it stands it does play a respectable game, even if the algorithm is a basic brute-force method.
The VGA_LCD has made it a much more pleasant experience over using a 2.8" display with stylus. So, thanks for that.
That's what made me pick it up again after a long period of dormancy.
Please make any suggestions for improving the code. In particular, you may notice a few GOTOs in there.
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

I knocked up a quick neural net in VB6 to solve XOR with loads of text output to see what's going on.
I now have a better understanding of where things have gone wrong with back-prop.
The good news is that it can easily be fixed and implemented in Annex.
The bad news: it took over 50,000 epochs to solve! Annex takes ~1.4 s per epoch.
That's over 19 hours to learn each new move.
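The run-time estimate is simple arithmetic on the two figures quoted above. A quick check in Python, taking exactly 50,000 epochs and exactly 1.4 s per epoch as assumptions (the post says "over" and "~"):

```python
epochs = 50_000           # epochs the VB6 XOR test needed (from the post)
seconds_per_epoch = 1.4   # Annex time per epoch (from the post)

total_seconds = epochs * seconds_per_epoch
total_hours = total_seconds / 3600
print(f"{total_seconds:.0f} s = {total_hours:.1f} hours")
# prints: 70000 s = 19.4 hours
```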
Things can be improved by cutting down on the number of neurons and using a learn rate of 1 to reduce the number of calculations.
Will also explore a linear transfer function to replace the time-consuming sigmoid.
Not giving up just yet.
cicciocb
Site Admin
Posts: 2195
Joined: Mon Feb 03, 2020 1:15 pm
Location: Toulouse
Has thanked: 470 times
Been thanked: 1461 times

Re: Captain's Mistress

Post by cicciocb »

Hi, I tested your "first release" and I must say that it is nice, a very good example of what is possible to do with Annex.
I notice that the "bot" doesn't seem very smart and you can beat it easily.

I admit that I'm not good at neural networks, but I do know that there are functions available through the ESP-NN and the ESP-DSP libraries.

Let me know if this would be helpful for you; I can try to integrate some of these functions.
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

Thanks for the offer, but for now I would like to continue with the "old-school" method even if just to prove it can (or cannot) be done.
Also, this was originally aimed at the classic ESP32 with a 2.8" TFT so that a larger audience would be able to try it out.
I don't think many have the specific hardware required.
I've had a few breakthroughs using VB6 to cut down on the amount of processing by using alternative ways of doing the math for learning so it's not dead yet. (still like the idea of a second ESP working in the background and sending updates via serial).
I'll have a closer look at those links because they might be of interest for other projects rather than this specific one.
I might also look at adding another "brute force" level to the existing algorithm to make it a bit smarter. Need to be careful though because Captain's Mistress (just like tic-tac-toe) is a solvable problem and nobody wants to play a game that is impossible to beat.
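To give an idea of what one "brute force" level can look like, here is a hypothetical one-move look-ahead in Python: take a winning move if there is one, otherwise block the opponent's winning move, otherwise play as close to the centre as possible. This is not the algorithm used in the posted program; the board layout, the `EMPTY` marker and every function name are assumptions made for the sketch.

```python
ROWS, COLS = 6, 7
EMPTY = -1   # players are numbered 0 and 1

def new_board():
    return [[EMPTY] * COLS for _ in range(ROWS)]

def drop(board, col, player):
    """Drop a counter into col; return the row it lands in, or None if full."""
    for row in range(ROWS - 1, -1, -1):      # row ROWS-1 is the bottom
        if board[row][col] == EMPTY:
            board[row][col] = player
            return row
    return None

def wins(board, row, col):
    """True if the counter at (row, col) completes a line of four."""
    player = board[row][col]
    for d_row, d_col in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):                 # walk both ways along the line
            r, c = row + sign * d_row, col + sign * d_col
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * d_row, c + sign * d_col
        if count >= 4:
            return True
    return False

def winning_columns(board, player):
    """Columns in which player would win immediately."""
    found = []
    for col in range(COLS):
        row = drop(board, col, player)
        if row is not None:
            if wins(board, row, col):
                found.append(col)
            board[row][col] = EMPTY          # undo the trial move
    return found

def choose_move(board, me, opponent):
    """Win if possible, else block, else the free column nearest the centre."""
    for candidates in (winning_columns(board, me), winning_columns(board, opponent)):
        if candidates:
            return candidates[0]
    for col in sorted(range(COLS), key=lambda c: abs(c - COLS // 2)):
        if board[0][col] == EMPTY:
            return col
    return None                              # board is full

# Usage: player 0 has three in a row along the bottom, so player 1 must block.
board = new_board()
for col in (0, 1, 2):
    drop(board, col, 0)
print(choose_move(board, me=1, opponent=0))
```

Each extra level repeats the same trial-and-undo loop one move deeper, so the work grows by roughly a factor of the board width per level.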
I'm also thinking about Backgammon as an option now that VGA can be used to swap pages. The graphics should be quite easy to implement.
Thanks again for producing such a fun platform to play with.
cicciocb
Site Admin
Posts: 2195
Joined: Mon Feb 03, 2020 1:15 pm
Location: Toulouse
Has thanked: 470 times
Been thanked: 1461 times

Re: Captain's Mistress

Post by cicciocb »

BeanieBots wrote: Fri Mar 22, 2024 8:43 am
Thanks for the offer, but for now I would like to continue with the "old-school" method even if just to prove it can (or cannot) be done.
Also, this was originally aimed at the classic ESP32 with a 2.8" TFT so that a larger audience would be able to try it out.
I don't think many have the specific hardware required.
I've had a few breakthroughs using VB6 to cut down on the amount of processing by using alternative ways of doing the math for learning so it's not dead yet. (still like the idea of a second ESP working in the background and sending updates via serial).
I'll have a closer look at those links because they might be of interest for other projects rather than this specific one.
I might also look at adding another "brute force" level to the existing algorithm to make it a bit smarter. Need to be careful though because Captain's Mistress (just like tic-tac-toe) is a solvable problem and nobody wants to play a game that is impossible to beat.
I'm also thinking about Backgammon as an option now that VGA can be used to swap pages. The graphics should be quite easy to implement.
Thanks again for producing such a fun platform to play with.

Thanks to you, I'm really glad to know that you can have fun with Annex.
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

The mention of "old-school" and AI reminded me of this picture. I made the machine using relays and lights from old pinball machines. That is me standing on the left.
1962 Science Fair.jpg
botman
Posts: 91
Joined: Thu Apr 01, 2021 3:04 pm
Has thanked: 10 times
Been thanked: 39 times

Re: Captain's Mistress

Post by botman »

I converted your program to run on my 3.5 inch CYD variant display module with capacitive touch.
It would run OK until one of the columns reached the top, and then it would display text on top of the board with each move.
By changing line 638 from >= to > , that small problem seems to be fixed.

I ran selfplay for more than 1000 games just as a first test:
20240325_065817.jpg
Player 2 seems to be, on average, more than 2.5 times as likely to win as player 1.
I am now trying to understand the play algorithms that you implemented.

Thank you for sharing your work on this.
BeanieBots
Posts: 361
Joined: Tue Jun 21, 2022 2:17 pm
Location: South coast UK
Has thanked: 197 times
Been thanked: 115 times

Re: Captain's Mistress

Post by BeanieBots »

Thanks for trying it out and giving feedback.
All line 638 does is check to see if the column is full and puts a red X on top instead of the column number.
It needs to be ">=" to work correctly. You can comment out that entire for/next loop without affecting the game.
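To show why the comparison matters, here is a hypothetical Python version of a column-full check. Line 638 itself is not reproduced here, so the `heights` list (counters per column) and the function names are assumptions; the point is only that a column holding exactly `rows` counters is full, which ">" would miss.

```python
def full_columns(heights, rows):
    """Columns whose counter count has reached the number of rows."""
    return [col for col, height in enumerate(heights) if height >= rows]

def full_columns_strict(heights, rows):
    """Same check with '>' instead of '>=', shown for comparison."""
    return [col for col, height in enumerate(heights) if height > rows]

# Usage: column 0 holds 6 counters on a 6-row board, so it is full.
heights = [6, 5, 0, 0, 0, 0, 0]
print(full_columns(heights, 6))          # prints: [0]
print(full_columns_strict(heights, 6))   # prints: []
```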
What is probably wrong is the scaling when I converted from 320*240 to 800*480.
Have you changed lines 28 and 29 to suit your display? Try slightly smaller than your actual resolution.
Lines 169 to 174 determine which algorithm it will use for the second player (player1; you are player0).
Please note AImoveNN is not yet fully implemented. (it will always try the same move)
At the moment, the best it can do is about four times better than a totally random move.

I'm trying out different activation functions at the moment. ReLU is MUCH quicker than the classic sigmoid but can result in dead neurons. My brain already has a few of those and it really does hamper progress :?
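A small Python sketch of the trade-off: ReLU needs only a comparison, while the sigmoid needs an exponential, but the ReLU derivative is zero for any negative input, so a neuron whose input stays negative stops learning. The "leaky" variant shown alongside is a common workaround that is not mentioned in this thread; the 0.01 slope and the function names are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))    # one exp() call per neuron

def relu(x):
    return x if x > 0 else 0.0           # comparison only, no exp()

def relu_derivative(x):
    return 1.0 if x > 0 else 0.0         # zero gradient for negative input

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x     # small output for negative input

def leaky_relu_derivative(x, slope=0.01):
    return 1.0 if x > 0 else slope       # small but non-zero gradient

# Usage: with a negative net input, plain ReLU passes back no gradient at all.
net = -2.0
print(relu(net), relu_derivative(net))
print(leaky_relu(net), leaky_relu_derivative(net))
```

Because every weight update is multiplied by the derivative, a zero there means the weights feeding that neuron never change, which is what "dead" means in practice.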