AI Works - Luca — AI, Coffee & Structural Thinking

Integration of Senses

Mar 2, 2026

—

by

A robot picks up a hot cup. Eyes locate it. Hands feel the heat. Ears hear water pouring. All senses work together—this is Embodied AI. From Tesla Optimus to Boston Dynamics Atlas, humanoid robots are fusing vision, touch, and proprioception. The final chapter: how multimodal AI understands the world like humans do.

Text Guides Image

Feb 11, 2026

—

by

Luca

in AI Works

Noise has no direction. Without text, it stays noise. “A cat flying through space”—this sentence guides the generation. The image asks: what should I become? Text answers through cross-attention. How Stable Diffusion uses Query, Key, Value to turn prompts into pixels.
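
To make the Query, Key, Value idea concrete, here is a minimal sketch of a single cross-attention step in plain Python. It is not Stable Diffusion's implementation: the real model computes queries, keys, and values through learned projection layers, which are left out here, and every number below is made up for illustration. The function name `cross_attention` and the toy inputs are assumptions.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query (image side) takes a softmax-weighted average of the values (text side)."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Scaled dot product between one image query and every text key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Mix the text values according to how well each key matched the query.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Toy, made-up numbers: 2 image-latent positions, 3 text tokens, dimension 2.
image_queries = [[1.0, 0.0], [0.0, 1.0]]
text_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
attended = cross_attention(image_queries, text_keys, text_values)
```

Each image position asks its question as a query, and the answer it gets back is a blend of text values weighted by how closely each text key matches.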

Language Models That Read Images

Feb 9, 2026

—

by

Luca

in AI Works

Language models process text. Images are pixels. How can GPT-4V ‘understand’ photos? The answer: three components. A vision encoder converts images to tokens, a projection layer bridges dimensions, and an LLM reasons over both. The architecture behind Vision-Language Models—and why they still hallucinate.

Contrast Creates Meaning

Feb 1, 2026

—

by

Luca

in AI Works

Labels aren’t necessary. ImageNet needed 25,000 workers to label 14 million images. But the internet already has the answers—400 million image-text pairs. CLIP learned without labels and classifies things it’s never seen. How contrastive learning aligned images and text into one space.
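
As a rough illustration of what contrastive learning optimizes, the sketch below computes a symmetric cross-entropy loss over a small batch of image and text embeddings: matching pairs sit on the diagonal of the similarity matrix and should score higher than every mismatched pair. This is a simplified stand-in, not CLIP's training code. The function names and the temperature value of 0.07 are assumptions chosen for the example.

```python
import math

def normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross_entropy_row(logits, target):
    # Negative log-softmax of the target entry, computed stably.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric loss: pair k of images should match pair k of texts, in both directions."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = similarity of image i and text j, sharpened by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts] for i in imgs]
    img_to_txt = sum(cross_entropy_row(logits[k], k) for k in range(n)) / n
    txt_to_img = sum(cross_entropy_row([logits[r][k] for r in range(n)], k)
                     for k in range(n)) / n
    return (img_to_txt + txt_to_img) / 2

# Toy embeddings: the aligned batch pairs each image with its own caption.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss(images, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = contrastive_loss(images, [[0.0, 1.0], [1.0, 0.0]])
```

The aligned batch yields a loss near zero, while the shuffled batch is penalized heavily, which is the pressure that pulls matching images and captions toward the same place.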

Into a Shared Space

Jan 27, 2026

—

by

Luca

in AI Works

2012. CNNs conquered images. Five years later, the Transformer conquered text. But each lived in a separate world, with vectors that couldn’t be compared. What if a cat photo and the word “cat” existed at the same location? A shared embedding space makes this possible. How CLIP and ImageBind unified different senses into one language.

Learning from Human Feedback

Jan 22, 2026

—

by

Luca

in AI Works

November 2022. ChatGPT launched. 100 million users in 2 months. But GPT-3 had existed since 2020, with 175 billion parameters. Why wasn’t it ChatGPT? The answer: RLHF. Reinforcement Learning from Human Feedback turned a language model into an assistant. How human preferences became the reward function.

The Critic and the Actor

Jan 22, 2026

—

by

Luca

in AI Works

“I’ll do it this way.” The actor speaks. “That’s not great.” The critic responds. The most successful structure in reinforcement learning separates action and evaluation. Actor-Critic combines value-based efficiency with policy-based flexibility—the foundation of A2C, A3C, PPO, SAC, and ChatGPT’s RLHF.

Learning the Policy Directly

Jan 20, 2026

—

by

Luca

in AI Works

Don’t calculate value. Just act. Like a basketball player who shoots without computing probabilities, Policy Gradient learns actions directly—no value function required. REINFORCE, continuous action spaces, and why both Physical AI and ChatGPT’s RLHF depend on this approach.
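
For a feel of how a policy can improve without any value function, here is a minimal REINFORCE sketch on a toy multi-armed bandit. The setup is an assumption made for illustration: a softmax policy over action preferences, fixed rewards per arm, no baseline, and hypothetical names such as `reinforce_bandit`.

```python
import math
import random

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """Learn action probabilities directly from sampled rewards."""
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # policy parameters, one per action
    for _ in range(steps):
        probs = softmax(prefs)
        # Act by sampling from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Gradient of log pi(action) w.r.t. prefs[a] is 1[a == action] - probs[a].
        for a in range(len(prefs)):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += lr * reward * grad
    return softmax(prefs)

# Toy problem: arm 1 always pays 1, arm 0 pays nothing.
final_probs = reinforce_bandit([0.0, 1.0])
```

The update simply makes rewarded actions more likely. Nothing in the loop estimates how good a state or action is, only which choices paid off.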

The Number Called Value

Jan 20, 2026

—

by

Luca

in AI Works

10,000 won tomorrow or 9,000 won today—which is more valuable? This question sits at the heart of reinforcement learning. Value functions compress uncertain futures into present numbers. The Bellman equation, TD learning, and Q-learning: the mathematics of foresight.
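
The won question can be worked through with a discount factor. The sketch below assumes a discount factor named `gamma` (a value the excerpt does not specify) and uses the recursive form of the return, where each step's value is its reward plus the discounted value of what follows.

```python
def present_value(reward, gamma, delay):
    """Value today of a reward received `delay` steps in the future."""
    return (gamma ** delay) * reward

def discounted_return(rewards, gamma):
    """Fold a reward sequence from the end: G_t = r_t + gamma * G_{t+1}."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# With an assumed gamma of 0.9, 10,000 won tomorrow is worth 9,000 won today: a tie.
tie = present_value(10_000, 0.9, 1)
# With an assumed gamma of 0.95, waiting until tomorrow is the better deal.
patient = present_value(10_000, 0.95, 1)
# The same comparison expressed as a two-step reward sequence.
sequence_value = discounted_return([0, 10_000], 0.9)
```

Which option wins depends entirely on the discount factor: a more patient agent (higher `gamma`) prefers to wait, a less patient one takes the 9,000 won now.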

Reward Shapes Behavior

Jan 18, 2026

—

by

Luca

in AI Works

2016. A robot arm tries to pick up a cup. First attempt: miss. Hundredth attempt: drops it. Thousandth attempt: success. No one said “grasp like this.” Just a signal: pick up = +1, miss = 0. Reward shaped the behavior. This is reinforcement learning—learning without correct answers.

Category: AI Works