forked from SamsungSAILMontreal/TinyRecursiveModels
Original
  Training
    Init: steps = 0, halted = 1
    For halted puzzles:
      steps = 0
      reset z_L, z_H
      replace data
    Run model => new (z_L, z_H, logits, q_halt_logits) => outputs
    steps += 1
    halted = (steps >= 16) or (q_halt_logits > 0) or (exploration)
    return new (z_L, z_H, steps, halted, data, outputs)
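A minimal sketch of the training step above, in plain Python lists with one entry per puzzle. `Carry`, `toy_model`, `train_step` and `explore_prob` are hypothetical names for illustration, not taken from the repository. `toy_model` is a stand-in for the real network, and exploration is modeled as a random per-puzzle flag OR-ed into `halted`, following the note literally; the actual code may implement exploration differently.

```python
import random
from dataclasses import dataclass

MAX_STEPS = 16  # step cap from the notes


@dataclass
class Carry:
    z_L: list     # per-puzzle low-level latent (placeholder floats)
    z_H: list     # per-puzzle high-level latent (placeholder floats)
    steps: list   # per-puzzle step counter
    halted: list  # per-puzzle halt flag
    data: list    # per-puzzle current input


def toy_model(z_L, z_H, data):
    """Hypothetical stand-in for the real model: one float per puzzle."""
    new_z_L = [z + x for z, x in zip(z_L, data)]
    new_z_H = [h + z for h, z in zip(z_H, new_z_L)]
    logits = list(new_z_H)
    q_halt_logits = [h - 3.0 for h in new_z_H]  # > 0 means "halt"
    return new_z_L, new_z_H, logits, q_halt_logits


def train_step(carry, new_data, explore_prob=0.0, rng=random):
    h_prev = carry.halted
    # Halted puzzles restart: zero the counter, reset latents, load new data.
    steps = [0 if h else s for s, h in zip(carry.steps, h_prev)]
    z_L = [0.0 if h else z for z, h in zip(carry.z_L, h_prev)]
    z_H = [0.0 if h else z for z, h in zip(carry.z_H, h_prev)]
    data = [n if h else d for d, n, h in zip(carry.data, new_data, h_prev)]

    # Run the model on every puzzle, then advance every counter.
    z_L, z_H, logits, q_halt_logits = toy_model(z_L, z_H, data)
    steps = [s + 1 for s in steps]

    # Halt on the step cap, a positive halt logit, or (assumed) random exploration.
    halted = [
        s >= MAX_STEPS or q > 0 or rng.random() < explore_prob
        for s, q in zip(steps, q_halt_logits)
    ]
    outputs = {"logits": logits, "q_halt_logits": q_halt_logits}
    return Carry(z_L, z_H, steps, halted, data), outputs
```

Starting from `halted = [True, True]` makes the first call load fresh data for every puzzle, which matches `Init: steps = 0, halted = 1`.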
  Evaluation
    Init: steps = 0, halted = 1
    For halted puzzles:
      steps = 0
      reset z_L, z_H
      replace data
    Run model => new (z_L, z_H, logits, q_halt_logits) => outputs
    steps += 1
    halted = (steps >= 16)
    return new (z_L, z_H, steps, halted, data, outputs)
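The evaluation halting rule above depends only on the step cap, so `q_halt_logits` has no effect on when a puzzle stops. A small sketch of that consequence; `eval_halted` and `steps_until_halt` are hypothetical helper names, not repository functions.

```python
MAX_STEPS = 16  # step cap from the notes


def eval_halted(steps, q_halt_logits):
    """Original evaluation rule: only the step cap halts a puzzle."""
    # q_halt_logits is accepted for symmetry with training but ignored here.
    return [s >= MAX_STEPS for s in steps]


def steps_until_halt(q_halt_per_step):
    """Count model passes for one puzzle given its halt logit at each pass."""
    steps = 0
    for q in q_halt_per_step:
        steps += 1
        if eval_halted([steps], [q])[0]:
            break
    return steps
```

Under these assumptions, a puzzle whose halt logit is positive from the first pass still runs all 16 steps.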
New
  Training
    Init: steps = 0, halted = 1
    For halted puzzles:
      steps = 0
      reset z_L, z_H
      replace data
    Run model => new (z_L, z_H, logits, q_halt_logits) => outputs
    steps += 1
    halted = (steps >= 16) or (q_halt_logits > 0) or (exploration)
    return new (z_L, z_H, steps, halted, data, outputs)
  Evaluation
    Init: steps = 0, halted = 1
    Run model => new (z_L, z_H, logits, q_halt_logits) => outputs (wrong)
    steps += 1 where not halted
    halted = (steps >= 16) or (q_halt_logits > 0)
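A sketch of the counter and halt update in the new evaluation, read literally from the three lines above. `eval_update` is a hypothetical name. The sketch covers only `steps` and `halted`; it does not model the outputs or try to explain the `(wrong)` remark, which the notes leave unspecified.

```python
MAX_STEPS = 16  # step cap from the notes


def eval_update(steps, halted, q_halt_logits):
    """One pass of the new evaluation bookkeeping, per puzzle."""
    # steps += 1 where not halted: puzzles flagged as halted keep their count.
    new_steps = [s if h else s + 1 for s, h in zip(steps, halted)]
    # halted is recomputed from the step cap and the current halt logit.
    new_halted = [
        s >= MAX_STEPS or q > 0 for s, q in zip(new_steps, q_halt_logits)
    ]
    return new_steps, new_halted
```

Two things follow from this literal reading and may or may not be intended: with `halted = 1` at init, no counter advances on the first pass, and `halted` is not sticky, since a puzzle whose halt logit later drops to zero or below is marked as running again.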