Luca — AI, Coffee & Structural Thinking

Tag: value function

The Critic and the Actor

Jan 22, 2026, by Luca in AI Works

“I’ll do it this way.” The actor speaks. “That’s not great.” The critic responds. One of the most successful structures in reinforcement learning separates action from evaluation. Actor-Critic combines the efficiency of value-based methods with the flexibility of policy-based methods, and it is the foundation of A2C, A3C, PPO, SAC, and the RLHF training behind ChatGPT.
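
To make the split between actor and critic concrete, here is a minimal sketch of a single one-step actor-critic update in tabular form. It is an illustration under stated assumptions, not the implementation used by any of the algorithms named above: the softmax policy, the dictionary-based tables, the function name `actor_critic_step`, and the step sizes are all hypothetical choices made for this example.

```python
import math


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, V, s, a, r, s_next, done,
                      alpha_actor=0.1, alpha_critic=0.1, gamma=0.99):
    """Apply one actor-critic update for the transition (s, a, r, s_next).

    theta: dict mapping state -> list of action preferences (the actor)
    V:     dict mapping state -> estimated value (the critic)
    Returns the TD error, which is the critic's verdict on the action.
    """
    # Critic: compare what happened with what was expected.
    target = r + (0.0 if done else gamma * V[s_next])
    delta = target - V[s]
    V[s] += alpha_critic * delta

    # Actor: for a softmax policy, d/d theta[s][b] of log pi(a|s)
    # is 1[a == b] - pi(b|s). Scale that gradient by the critic's verdict.
    probs = softmax(theta[s])
    for b in range(len(theta[s])):
        grad = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += alpha_actor * delta * grad
    return delta


# Hypothetical one-state problem with two actions.
theta = {0: [0.0, 0.0]}
V = {0: 0.0}
delta = actor_critic_step(theta, V, s=0, a=1, r=1.0, s_next=0, done=True)
```

A positive TD error means the outcome was better than the critic predicted, so the actor raises its preference for the chosen action. A negative one lowers it.
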
The Number Called Value

Jan 20, 2026, by Luca in AI Works

10,000 won tomorrow or 9,000 won today: which is more valuable? This question sits at the heart of reinforcement learning. Value functions compress uncertain futures into present numbers. The Bellman equation, TD learning, and Q-learning are the mathematics of foresight.
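
The won comparison can be worked through with a discount factor. The sketch below assumes the standard exponential discounting used in reinforcement learning, where a reward arriving `delay` steps from now is worth `gamma ** delay` times its face value today. The function names and the example values of `gamma` and `alpha` are illustrative assumptions, not taken from the post.

```python
def present_value(reward: float, gamma: float, delay: int) -> float:
    """Value today of a reward received `delay` steps in the future."""
    return (gamma ** delay) * reward


def breakeven_gamma(later: float, now: float) -> float:
    """Discount factor at which `later` (one step away) equals `now`."""
    return now / later


def td0_update(V, s, r, s_next, alpha=0.5, gamma=0.9):
    """One TD(0) step: nudge V[s] toward r + gamma * V[s_next]."""
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error


# 10,000 won tomorrow versus 9,000 won today.
g_star = breakeven_gamma(10_000, 9_000)     # indifferent at gamma = 0.9
patient = present_value(10_000, 0.95, 1)    # above 9,000, so wait
impatient = present_value(10_000, 0.80, 1)  # below 9,000, so take it now

# A single TD(0) update on a hypothetical two-state table.
V = {"a": 0.0, "b": 10.0}
err = td0_update(V, "a", r=1.0, s_next="b")
```

With a discount factor above 0.9 the later payment wins, and below 0.9 the immediate one wins. The TD(0) step applies the same idea to learned estimates, moving the value of one state toward the reward plus the discounted value of the next.
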

© Luca. All rights reserved.