Deep Analysis of "The Divine Move"

On March 10, 2016, in Game 2 of the match between AlphaGo and Lee Sedol, AlphaGo played Move 37: a "shoulder hit" on the fifth line in the upper right area.

This move came to be known as "The Divine Move." It not only helped AlphaGo win the game but also changed humanity's understanding of Go.

This article analyzes the move from several perspectives: the game context, traditional Go theory, expert reactions, the AI's evaluation, and its long-term impact on Go theory.

Game Position Review

The Opening of Game 2

After losing Game 1, Lee Sedol made adjustments in Game 2. Colors alternated between games, so he played White (moving second), which gave him a chance to observe AlphaGo's opening tendencies before settling on a strategy.

Opening phase:

Black 1: Star point in the upper right corner
White 2: Star point in the lower left corner
Black 3-White 4: Each side occupies a corner

Up to Move 36, the game developed normally. AlphaGo played Black and engaged in a local battle in the upper right corner. White (Lee Sedol) had built influence on the right side, while Black had some territorial potential on the top.

Position After Move 36

Let's look at the board state after Move 36:

	D	K	P	Q
19
18
17	○			●
16	+	+		+
15				●
14			○		White's influence
13
12
11
10	+	+		+
9
8
7
6
5
4	+	+		+
3	○			●
2
1

Simplified diagram; the actual position was more complex

Key observations:

White has outside influence on the right
Black has territorial potential on the top
The battle in the upper right corner has paused

It was Black's (AlphaGo's) turn to play.

Traditional Move Analysis

Professional Players' Expectations

Before Move 37, the professional players in the commentary room were in lively discussion. They generally expected Black to choose one of the following moves:

Option A: Approach in the lower right corner

This was the most "normal" choice. Black could:

Claim the last big point (lower right corner)
Maintain balance in the game
Follow the traditional value of "corner-side-center"

Option B: Enclose the top

Black could also extend two or three spaces along the top to solidify its sphere of influence. This would:

Convert the potential on top into territory
Limit White's development space

Option C: Center invasion

Some players thought Black might play in the center to constrain White's right-side influence. While not the most common choice, it was strategically justifiable.

The Unexpected Choice

However, AlphaGo chose a position almost no one anticipated:

K15 (fifth-line shoulder hit)

This move was placed on the right half of the board, near the center, a "shoulder hit" against White's right-side influence.

Move 37: The Fifth-Line Shoulder Hit

Where Is This Move?

	D	K	P	Q
19
18
17	○			●
16	+	+		+
15		37		●	Move 37
14			○
13
12

Move 37 was played at K15 (or J5, depending on the coordinate system).

What Is a "Shoulder Hit"?

A "shoulder hit" is a Go technique: a stone played diagonally adjacent to an opponent's stone, one line closer to the center, pressing down on it from above. Its characteristics are:

No direct contact: The stone sits on the diagonal, so it does not take away any of the opponent stone's liberties
Disrupts structure: Throws off the opponent's expected development
Difficult to respond: Any response from the opponent comes with some cost

Traditionally, shoulder hits are played on the third or fourth line. A fifth-line shoulder hit is extremely rare because:

Position too high: The fifth line is close to the center, traditionally considered inefficient
Easy to attack: Isolated stones can become targets
Unclear value: Unlike corners and sides, it lacks clear territorial value
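
The geometry of a shoulder hit can be sketched in a few lines of Python. This is only an illustration, taking a shoulder hit to mean a stone diagonally adjacent to an opponent's stone and one line higher, with 1-indexed (column, row) coordinates on a 19x19 board. The helper names line_of and is_shoulder_hit are hypothetical and do not come from any Go library.

```python
BOARD_SIZE = 19

def line_of(point, size=BOARD_SIZE):
    """Return the line a point is on, counted from the nearest edge (1 = edge)."""
    col, row = point
    return min(col, row, size + 1 - col, size + 1 - row)

def is_shoulder_hit(move, stone, size=BOARD_SIZE):
    """True if `move` is diagonally adjacent to `stone` and one line higher."""
    diagonal = abs(move[0] - stone[0]) == 1 and abs(move[1] - stone[1]) == 1
    return diagonal and line_of(move, size) == line_of(stone, size) + 1

# Hypothetical example: a stone on the fourth line and a move diagonally above it.
white_stone = (16, 10)   # fourth line from the right edge
black_move = (15, 11)    # fifth line from the right edge
print(line_of(black_move), is_shoulder_hit(black_move, white_stone))
```

A check like this only captures the shape of the move. Whether a fifth-line shoulder hit is actually good depends on the whole-board position.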

Expert Reactions in Real-Time

Shock in the Commentary Room

The moment Move 37 was played, the commentary room fell into brief silence.

Korean commentary (Kim Seong-ryong 9p):

"This... what is this? A move on the fifth line? I don't understand. This must be a mistake, right?"

Chinese commentary (Gu Li 9p):

"I can't understand this move. If one of my students played this, I would scold them severely."

American commentary (Michael Redmond 9p):

"Very unusual move. I don't think any human would play this."

Real-Time Comments from Professional Players

On various live streaming platforms, professional players were commenting:

Ke Jie (World No. 1 at the time):

"I cannot understand the intention of this move. If AlphaGo wins, I will study it carefully."

Park Junghwan (Top Korean player):

"This move is too strange. Is there a bug in the program?"

Mi Yuting (Chinese World Champion):

"Fifth-line shoulder hit? I've never seen this kind of move."

"One in Ten Thousand Probability"

After the match, the DeepMind team revealed a stunning statistic:

"According to our analysis, if a professional player faced the same position, the probability of choosing Move 37's position would be about one in ten thousand."

In other words, in the human Go knowledge system, this move was virtually a "non-existent" option.

The AI Perspective

Policy Network Probability Distribution

Let's see how AlphaGo's Policy Network evaluated this position:

[Interactive chart: Policy Network probability for each board position after Move 36]

The chart above shows AlphaGo's probability assessment for each position.

Key observations:

Move 37's position: About 8% probability, not the highest
Traditional choices (like lower right corner): About 12% probability
Other candidate positions: Scattered across different areas

Interestingly, Move 37 was not the highest probability choice in the Policy Network's evaluation. So why did AlphaGo choose it?

MCTS Deep Evaluation

The answer lies in Monte Carlo Tree Search (MCTS).

The Policy Network only provides "intuition"; the real decision comes from MCTS's deep simulation. AlphaGo simulates thousands of possible futures before making a decision.

For Move 37, the MCTS evaluation process was:

Position K15 (Move 37):
├── Simulation 1: Black wins (+0.3)
├── Simulation 2: Black wins (+0.5)
├── Simulation 3: Black wins (+0.2)
├── ...
└── Average win rate: 58%

Position R3 (Lower right approach):
├── Simulation 1: Black wins (+0.1)
├── Simulation 2: White wins (-0.2)
├── Simulation 3: Black wins (+0.2)
├── ...
└── Average win rate: 52%

Although the lower right corner had higher "intuitive probability," after deep simulation, Move 37's expected win rate was higher.
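
The comparison above comes down to aggregating many simulation outcomes per candidate move. The sketch below shows that aggregation in its simplest form: the share of simulations that Black wins. The outcome values are made up for illustration and are not AlphaGo's actual simulation results, and the function name win_rate is hypothetical.

```python
def win_rate(outcomes):
    """Fraction of simulations won by Black; a positive outcome means Black wins."""
    wins = sum(1 for outcome in outcomes if outcome > 0)
    return wins / len(outcomes)

# Illustrative outcomes only (assumed values, four simulations per move).
simulations = {
    "K15": [0.3, 0.5, 0.2, -0.1],
    "R3": [0.1, -0.2, 0.2, -0.3],
}
rates = {move: win_rate(results) for move, results in simulations.items()}
best = max(rates, key=rates.get)
print(rates, best)
```

A real search does not evaluate candidates in isolation like this. It spends more simulations on the moves that look promising, which is what the tree search adds on top of simple averaging.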

Value Network Global Assessment

The Value Network assessed Move 37's value from a global perspective:

Win rate before Move 37: About 52% (Black slightly ahead)

Win rate after Move 37: About 58% (Black with clear advantage)

This means Move 37 increased AlphaGo's expected win rate by 6 percentage points.

This improvement is quite significant in Go. Usually, a good move brings only 2-3% win rate improvement.

Go Theory Analysis: Why the Fifth-Line Shoulder Hit?

From a Local Perspective

On the surface, Move 37 seems inefficient:

Position too high: Fifth line is closer to the center than fourth or third
No territory: Unlike corners and sides, it doesn't directly claim territory
Vulnerable to attack: Isolated stones could be targeted by White

But if we analyze carefully, this move has several subtle benefits:

Disrupts White's influence: White had planned to develop on the right; Move 37 disrupted this plan
Establishes presence: Though not enclosing territory, it establishes presence in the center
Increases complexity: Creates a complex position, favoring the side with stronger calculation

From a Global Perspective

The true value of this move needs to be understood globally:

The Thickness vs. Territory Trade-off

Traditional Go theory holds that "corners are gold, sides are silver, center is grass" — corners are most valuable, center least valuable. But Move 37 challenged this notion.

AlphaGo's evaluation showed: in this specific position, central influence was more valuable than corner territory.

This is because:

Black already had sufficient territorial foundation
White's right-side influence would be powerful if allowed to develop
Constraining White was more important than expanding oneself

The Value of "Sente"

Move 37 had another underestimated benefit: it maintained "sente" (initiative).

In Go, "sente" means controlling the initiative. After Move 37, White had to respond, allowing Black to continue directing the game's flow.

If Black had chosen a "normal" approach in the lower right corner, both sides might have engaged in joseki, and the position would have balanced. But Move 37 broke this balance, filling the game with uncertainty — exactly what AlphaGo excelled at.

Lee Sedol's Dilemma

After Move 37, Lee Sedol thought for a long time. His dilemma was:

If he responds directly (like jumping or flying):

It acknowledges Move 37's value
Black achieves the goal of disrupting White's influence

If he ignores it:

Black might further develop the center
White's right-side influence would struggle to become territory

In the end, Lee Sedol chose to respond. But regardless of his choice, Move 37 had already achieved its purpose.

Subsequent Development: From Move 37 to Victory

Middle Game Evolution

After Move 37, the game entered a complex middle game battle.

Key developments:

Moves 40-50: Both sides engaged in fierce contact fighting on the right
Moves 50-70: AlphaGo leveraged the influence established by Move 37 to gain advantage in the center
Moves 70-100: Black gradually converted the advantage into territory

By around Move 100, AlphaGo's lead was quite clear. Although Lee Sedol tried to fight back, he couldn't turn the situation around.

Final Result

AlphaGo wins by resignation

This game's victory was largely due to Move 37. Post-game analysis showed that without Move 37, the position would have been much closer, and White might even have had the advantage.

Impact on Go Theory

Birth of New Joseki

Move 37 triggered a reconsideration of the "shoulder hit" technique in the Go world.

Traditional view:

Shoulder hits should be on the third or fourth line
Fifth-line shoulder hits are too inefficient
Isolated stones are vulnerable to attack

After AlphaGo:

Fifth-line shoulder hits are the best choice in certain positions
Position "height" matters less than "effect"
Each move's value needs to be evaluated from a global perspective

Human Players Learning

After Move 37, many professional players began trying similar moves:

Ke Jie used fifth-line shoulder hits successfully in several games in 2017:

"AlphaGo taught me that many moves we thought were 'bad' are simply moves we didn't understand."

Park Junghwan also incorporated this way of thinking into his games:

"The important thing isn't remembering the specific position of Move 37, but learning to see the board with new eyes."

Implications for Go AI Training

Move 37 also had far-reaching implications for Go AI research:

Reflection on Policy Network:

Why did the Policy Network give Move 37 a lower probability? Because it learned from human game records, and humans rarely play such moves.

This shows: supervised learning alone (learning from humans) is not enough. AI needs to explore on its own to discover good moves unknown to humans.

This was one reason why AlphaGo Zero later adopted pure self-play training.

Affirmation of MCTS:

Move 37 proved the value of deep MCTS search. Even when intuition (Policy Network) doesn't favor a move, deep analysis can discover its potential value.

The same combination of learned evaluation and tree search was later applied beyond Go, for example to chess and shogi in AlphaZero.

Technical Details: Recreating Move 37's Decision Process

Policy Network Input Features

After Move 36, the Policy Network's input included:

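Feature	Planes	Description
Stone colour	3	Player stones, opponent stones, empty points
Ones	1	Constant plane filled with 1
Turns since	8	How many turns ago a stone was played
Liberties	8	Number of liberties
Capture size	8	How many opponent stones would be captured
Self-atari size	8	How many of the player's own stones would be captured
Liberties after move	8	Number of liberties after the move is played
Ladder capture	1	Whether the move is a successful ladder capture
Ladder escape	1	Whether the move is a successful ladder escape
Sensibleness	1	Whether the move is legal and does not fill the player's own eyes
Zeros	1	Constant plane filled with 0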

Total of 48 feature planes of 19x19, forming the input tensor.
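
Stacking feature planes into an input tensor can be sketched as follows. This is a simplified illustration, not AlphaGo's actual encoder: it fills only three occupancy planes (own stones, opponent stones, empty points) and leaves the remaining planes as zeros. The plane ordering, the (row, col) indexing, and the function name encode_position are assumptions made for the example.

```python
BOARD_SIZE = 19
NUM_PLANES = 48

def empty_plane():
    """One 19x19 plane of zeros."""
    return [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]

def encode_position(own_stones, opponent_stones):
    """Build a NUM_PLANES x 19 x 19 nested list from two sets of (row, col) points."""
    planes = [empty_plane() for _ in range(NUM_PLANES)]
    for row in range(BOARD_SIZE):
        for col in range(BOARD_SIZE):
            if (row, col) in own_stones:
                planes[0][row][col] = 1   # assumed plane 0: stones of the player to move
            elif (row, col) in opponent_stones:
                planes[1][row][col] = 1   # assumed plane 1: opponent stones
            else:
                planes[2][row][col] = 1   # assumed plane 2: empty points
    return planes

# Hypothetical stone positions, zero-indexed (row, col).
tensor = encode_position({(3, 15), (2, 16)}, {(3, 3), (5, 13)})
print(len(tensor), len(tensor[0]), len(tensor[0][0]))
```

In practice this would be an array in a numerical library rather than nested lists. Plain lists keep the sketch free of dependencies.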

Policy Network Output

The Policy Network outputs a probability distribution over the 19x19 = 361 board points.

For Move 37's position:

# Top 5 candidate positions (simplified, illustrative values)
policy_top5 = {
    "R3": 0.12,   # Lower right approach
    "Q17": 0.10,  # Upper right corner
    "C10": 0.09,  # Left side big point
    "K15": 0.08,  # Move 37's position
    "D16": 0.07,  # Upper left corner
    # ... the other 356 positions share the remaining probability
}
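
A distribution like this is produced by applying a softmax to the network's raw scores, one per board point. The sketch below shows that step with the standard library only. The logits are hypothetical placeholders, not values from AlphaGo, and top_k is an illustrative helper.

```python
import math

BOARD_POINTS = 19 * 19  # 361 intersections

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probs, k):
    """Return the k (index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits: every point scores 0.0 except two favoured points.
logits = [0.0] * BOARD_POINTS
logits[42] = 3.0
logits[100] = 2.5
probs = softmax(logits)
print(top_k(probs, 2))
```

Because the probabilities must sum to 1 across all 361 points, even the top candidate usually gets a modest share, which is why the leading moves sit around ten percent rather than near certainty.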

MCTS Exploration Process

AlphaGo uses the PUCT formula to balance exploration and exploitation:

U(s,a) = Q(s,a) + c_puct × P(s,a) × sqrt(sum_b N(s,b)) / (1 + N(s,a))

Where:

Q(s,a): Average value of the simulations that played move a from position s
P(s,a): Prior probability of move a given by the Policy Network
N(s,a): Number of times move a has been explored from position s
c_puct: Exploration constant

For Move 37, although the initial probability P was low, after multiple simulations, the Q value kept increasing, eventually surpassing other candidates.
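
The selection rule can be sketched directly from the formula. In the snippet below, the statistics for each move are invented for illustration, c_puct = 5.0 is an assumed setting, and the function names are hypothetical. It shows the prior dominating while visit counts are small and the average value Q taking over once both moves have been explored many times.

```python
import math

def puct_score(q, prior, parent_visits, visits, c_puct=5.0):
    """Q(s,a) plus the bonus c_puct * P(s,a) * sqrt(sum_b N(s,b)) / (1 + N(s,a))."""
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + visits)

def select_move(stats, c_puct=5.0):
    """Pick the move with the highest score; stats maps move -> (Q, P, N)."""
    parent_visits = sum(n for _, _, n in stats.values())

    def score(move):
        q, prior, visits = stats[move]
        return puct_score(q, prior, parent_visits, visits, c_puct)

    return max(stats, key=score)

# Illustrative statistics: (average value Q, prior P, visit count N).
early = {"R3": (0.52, 0.12, 1), "K15": (0.50, 0.08, 1)}
late = {"R3": (0.52, 0.12, 600), "K15": (0.58, 0.08, 400)}
print(select_move(early), select_move(late))
```

With the early statistics the higher prior of R3 wins the comparison. With the late statistics the exploration bonus has shrunk for both moves and the higher Q of K15 decides.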

Impact of Simulation Count

The DeepMind team's later analysis indicated that "discovering" Move 37 required a sufficient number of simulations:

Simulation Count	Best Choice
100	R3 (lower right)
1,000	Q17 (upper right)
10,000	K15 (Move 37)
100,000	K15 (more certain)

This shows: deep search can discover good moves that shallow search cannot find.
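
A toy experiment can illustrate why the preferred move changes with search depth. The sketch below runs a PUCT-style loop over just two root moves and reports the most-visited one. Everything in it is an assumption made for illustration: the priors and "true" values, c_puct = 5.0, the use of sqrt(total + 1) so that the first simulation still receives a bonus, and the simplification that every simulation of a move returns that move's fixed value. It is not a reconstruction of AlphaGo's search.

```python
import math

def most_visited_after(num_simulations, priors, true_values, c_puct=5.0):
    """Run a toy PUCT loop over root moves and return the most-visited move."""
    visits = {move: 0 for move in priors}
    value_sum = {move: 0.0 for move in priors}

    for _ in range(num_simulations):
        total = sum(visits.values())

        def score(move):
            q = value_sum[move] / visits[move] if visits[move] else 0.0
            bonus = c_puct * priors[move] * math.sqrt(total + 1) / (1 + visits[move])
            return q + bonus

        chosen = max(priors, key=score)
        # Toy assumption: each simulation returns the move's fixed "true" value.
        visits[chosen] += 1
        value_sum[chosen] += true_values[chosen]

    return max(visits, key=visits.get)

priors = {"R3": 0.12, "K15": 0.08}        # assumed Policy Network priors
true_values = {"R3": 0.52, "K15": 0.58}   # assumed long-run win rates
for n in (4, 1000):
    print(n, most_visited_after(n, priors, true_values))
```

With only a handful of simulations the higher-prior move collects all the visits. With many simulations the move with the higher value ends up visited most often, which mirrors the pattern in the table.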

Philosophical Reflections: Cognitive Differences Between Humans and AI

Why Couldn't Humans Think of Move 37?

This is a profound question. Possible reasons include:

1. Limitations of Experience

Human players' knowledge comes from studying predecessors' game records. If predecessors never played a certain move, we won't consider it.

2. Bias of Intuition

Human intuition is useful but limited. Our intuition makes us "blind" to certain options.

3. Difference in Computational Ability

Move 37's value required deep calculation to discover. Human computational ability is limited; we can't simulate thousands of possibilities like AI.

What Is Machine "Intuition"?

Does AlphaGo have "intuition"?

In a sense, the Policy Network is AlphaGo's "intuition" — it can evaluate each position's potential in milliseconds.

But this "intuition" differs from human intuition:

Human intuition: Comes from experience and pattern recognition
AI intuition: Comes from statistical learning on massive data

Interestingly, Move 37 showed that the AI's "intuition" can be corrected by MCTS. In this sense, the AI can "reflect" on its own intuition and find better choices.

What Can Humans Learn from AI?

The biggest insight from Move 37 for human players may be:

Don't let experience become shackles

Many "bad" moves may simply be moves we don't understand. Opening our minds and being willing to try unconventional moves may reveal new possibilities.

This insight applies not just to Go, but to many areas of life.

Animation Reference

Core concepts in this article and their animation numbers:

Number	Concept	Physics/Math Correspondence
C3	Traditional Go value judgment	Heuristic function
C5	Geometric properties of shoulder hit	Spatial relations
C7	Gap between expert intuition and AI evaluation	Prediction error
C9	Policy Network output distribution	Softmax probability
C11	How MCTS corrects Policy Network	Bayesian update
C13	Value Network incremental evaluation	Value function
C15	Global value function calculation	Integral approximation
C17	Forced choice in game theory	Dominant strategy
C19	How one move changes the entire game	Bifurcation point
C21	How AI expands human cognitive boundaries	Search space expansion
C23	Importance of feature engineering in Go AI	Representation learning
C25	How PUCT formula discovers non-intuitive good moves	Exploration-exploitation tradeoff
C27	Cognitive bias and AI transcendence	Unbiased estimation

