So the question is: how can we get computers to know which branches are OK to prune, and which aren't?
Sigh. The reason human experts are no longer competitive is that they prune where Deep Ply fears to trust static analysis. Pitted against a relentless algorithm that resists intuitive pruning, grand-master pruning leaks a full pawn or two per game.
It's damn amazing how well grand-master-level pruning actually works, but don't mistake that for flawless chess. Beautiful? Maybe. Flawless? Not even close.
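For the curious, the one kind of pruning an engine does trust looks like this: alpha-beta cutoffs, where a branch is dropped only once the search already holds a proof that it can't change the result. A minimal sketch, played out on toy take-1-2-3 Nim so it's self-contained; a real engine swaps in chess move generation and a static evaluator at the leaves:

    # Negamax with alpha-beta cutoffs: no intuition, no guesswork --
    # a branch is discarded only after the search has proof that the
    # opponent would never allow the line anyway.

    def moves(n):
        """Legal moves: remove 1-3 stones from a pile of n."""
        return [m for m in (1, 2, 3) if m <= n]

    def negamax(n, alpha=-1.0, beta=1.0):
        if n == 0:
            return -1.0                  # side to move has no move: loss
        best = -1.0
        for m in moves(n):
            score = -negamax(n - m, -beta, -alpha)
            if score >= beta:
                return score             # cutoff: provably irrelevant branch
            best = max(best, score)
            alpha = max(alpha, score)    # best score secured so far
        return best

    print(negamax(12))  # -1.0: multiples of 4 are lost for the side to move
    print(negamax(10))  #  1.0: everything else is won

The human grand master prunes far more aggressively than this, on intuition alone, and mostly gets away with it. Mostly.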
Back when man versus machine was still somewhat competitive, the human players would think they were pressing an overwhelming advantage, only to find themselves mired in tiny, unanticipated tactical disadvantages move after move after move after move. "The damn thing keeps finding these fiddling resources!" If you weren't careful, you could easily lose from what had initially looked like a won position (and it probably would have been one, against a human opponent blind to all those fiddling resources).
The trick for the competitive chess programmer was to strike the right balance in the static evaluator, so that tangible material gains didn't consistently outweigh the less tangible advantages of tempo. Matthew Lai, in his paper, does not seem to grasp this essential trajectory of computer chess. He seems to think it's remarkable that his Oldsmobile displays more rigidity on the impact sled than the lunar lander, when it's pretty clear to everyone else involved that no Oldsmobile ever made was going to win the space race. The ply-based chess engines had their static evaluators hand-tuned by experts over many decades, within clock-cycle budgets as tight as a space program's gram budget.
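To make that balancing act concrete, here's a toy evaluator in the same spirit. Every weight below is invented for illustration, not lifted from any real engine; the real numbers are exactly what took decades to hand-tune:

    # Tangible material vs. intangible tempo/mobility, folded into one
    # centipawn score. All weights are made up for illustration.

    def evaluate(material_diff_cp, mobility_diff, has_move=True):
        """Crude linear evaluation from the mover's point of view.

        material_diff_cp -- material balance in centipawns (mine minus theirs)
        mobility_diff    -- difference in legal-move counts, a tempo proxy
        """
        score = material_diff_cp
        score += 4 * mobility_diff       # ~4 cp per extra legal move (made up)
        if has_move:
            score += 15                  # small bonus for having the move (made up)
        return score

    # Weight mobility too low and the engine grabs pawns and gets squeezed;
    # too high and it sheds material for activity that never cashes out.
    print(evaluate(-100, 30))  # down a pawn but far more active: +35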
Until he actually defeats all these programs on existing commodity hardware at existing tournament time controls, he's comparing watermelons to kiln-dried coconut flakes.
It's the same problem with new technology. It isn't enough to merely be better in some personally favoured dimension of merit. Your immature new thing has to be better enough to actually pass the mature old thing on its own terms.
Got a better substrate than silicon? Yeah? What's your defect density when you're cranking out 10,000 wafers per month? Oh, you haven't actually developed all that quality-control infrastructure yet, but you figure you can do it at half the price once you work out the final kink in your strained fullerene crystal lattice?
Awesome progress, pal, but I think I'll invest my own Bitcoin elsewhere.
For the record, I've long believed that the trade-off moving from depth to sophistication wouldn't prove particularly steep (for the right sophistication). But any gradient that's a net loss (no matter how small) provides pretty much no immediate competitive incentive for anyone to invest any real effort hoeing that row.
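Back-of-envelope, with assumed numbers, here's why that gradient is so unforgiving:

    # With good move ordering, alpha-beta visits roughly b**d nodes to reach
    # depth d, so an evaluator that is k times slower per node costs about
    # log_b(k) plies of depth. Both numbers below are assumptions.

    import math

    b = 2.0    # assumed effective branching factor after move ordering
    k = 100.0  # assumed slowdown of the "sophisticated" evaluator

    plies_lost = math.log(k, b)
    print(f"plies of depth given up: {plies_lost:.1f}")  # ~6.6

The sophistication has to buy back the equivalent of six or seven plies of tactics before it even breaks even. That's the row nobody wants to hoe.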
The great thing about neural networks is that they don't actually require much real effort. The machine itself does most of the work in 72 hours. And then what have you got? A RISC chip that never actually kills x86 (because those idiots were busy touting microcosmic instruction set efficiency long after the real game had shifted to streamlining the cache hierarchy, where there's no low-hanging ideological shortcut to help you overcome the first-mover fat-payroll advantage).
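To be clear about what "72 hours of machine effort" buys, here's the idea shrunk to a toy: fit evaluation weights to game outcomes by gradient descent instead of hand-tuning them. (Lai's Giraffe trains a deep network with TD-Leaf self-play; plain logistic regression on two fabricated features stands in for it here. Everything below is invented.)

    import math, random

    random.seed(0)

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Fabricated positions: (material diff in pawns, scaled mobility diff),
    # labeled win/loss from a hidden "true" weighting so the fit can recover it.
    TRUE_MATERIAL, TRUE_MOBILITY = 1.0, 0.5
    data = []
    for _ in range(2000):
        mat, mob = random.uniform(-3, 3), random.uniform(-2, 2)
        p_win = sigmoid(TRUE_MATERIAL * mat + TRUE_MOBILITY * mob)
        data.append((mat, mob, 1 if random.random() < p_win else 0))

    # Batch gradient descent on log-loss: the "72 hours", shrunk to a second.
    w_mat = w_mob = 0.0
    lr = 0.5
    for _ in range(300):
        g_mat = g_mob = 0.0
        for mat, mob, won in data:
            err = sigmoid(w_mat * mat + w_mob * mob) - won
            g_mat += err * mat
            g_mob += err * mob
        w_mat -= lr * g_mat / len(data)
        w_mob -= lr * g_mob / len(data)

    print(f"learned: material={w_mat:.2f}, mobility={w_mob:.2f}")
    # Lands roughly on the hidden (1.0, 0.5) -- no grandmasters consulted.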
I have seen something else under the sun: The race is not to the swift or the battle to the strong, nor does food come to the wise or wealth to the brilliant or favor to the learned; but sunk cost and legacy happen to them all.