Featured Research

Kairos

Models can speak Korean. They don't understand Korean social expectations.

AI Evaluation, Model Behavior, Cross-Cultural Research

The Problem

When an AI model tells someone "ultimately, it's your decision," that sounds like neutral advice. But not every culture makes decisions that way. Some weigh family. Some weigh community. Some weigh hierarchy that never appears in training data. The model doesn't account for any of it.

I saw this firsthand at Scale AI, evaluating 45+ frontier models. Benchmarks measure capability. They don't measure whether the model's behavior fits the person it's talking to. The scores go up, but the gap between what the model does and what real users expect keeps growing.

The deeper issue is that AI systems have no way to calibrate social boundaries. They struggle to determine not only what to say, but whether to say anything at all, how much authority to carry, or how close to get. Human interactions depend on these invisible guardrails. Current models don't have them.

The entire pipeline has one culture's assumptions built in. Training data is 89-95% English. RLHF annotation relies on roughly 40 raters, primarily from the US and the Philippines. Reward modeling optimizes for one behavioral norm. The output looks neutral. The reasoning isn't.

The Approach

I tested 4 frontier models (GPT, Gemini, Claude, DeepSeek) across 3 languages (English, Korean, Hindi) and 6 everyday social scenarios, such as a boss asking you to work the weekend, a parent questioning your career choice, and a friend asking for money.

I then compared model responses against what 50+ real users in Korea, India, and the US actually expect. The comparison looked past what the model says to how it reasons: what it prioritizes, what it weighs, what framework it defaults to.
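The design above can be sketched as a small evaluation grid plus a per-cell gap score. This is only an illustration under stated assumptions, not the Kairos implementation: the scenario identifiers, the framework labels (such as `autonomy` or `family`), and the `alignment_gap` function are all hypothetical, and only the three scenarios named in the text are listed.

```python
from collections import Counter
from itertools import product

MODELS = ["GPT", "Gemini", "Claude", "DeepSeek"]
LANGUAGES = ["English", "Korean", "Hindi"]
# Hypothetical identifiers for the three scenarios named in the text;
# the remaining scenarios are not specified, so they are not listed here.
SCENARIOS = ["boss_weekend_work", "parent_career_choice", "friend_asks_money"]


def build_grid(models, languages, scenarios):
    """Return every (model, language, scenario) cell to evaluate."""
    return list(product(models, languages, scenarios))


def alignment_gap(model_framework, user_expectations):
    """Share of users whose expected framework differs from the model's.

    model_framework: hypothetical label for what the response prioritizes.
    user_expectations: labels collected from users for the same cell.
    """
    if not user_expectations:
        raise ValueError("need at least one user expectation")
    matches = Counter(user_expectations)[model_framework]
    return 1 - matches / len(user_expectations)


grid = build_grid(MODELS, LANGUAGES, SCENARIOS)
print(len(grid))  # 4 models * 3 languages * 3 named scenarios = 36

# Made-up labels for illustration only, not study data.
gap = alignment_gap("autonomy", ["family", "family", "autonomy", "hierarchy"])
print(gap)  # 0.75
```

A single mismatch rate like this is deliberately crude. It treats each framework as one label, which is exactly the kind of oversimplification a real instrument would need to move beyond.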

The patterns were consistent. When asked about a parent questioning a career choice, models in all three languages acknowledged the cultural weight of family expectations. But their actual recommendation defaulted to the same conclusion: prioritize your own autonomy. The acknowledgment was there. The reasoning wasn't.

This isn't a model that lacks cultural knowledge. It's a model that knows the facts but frames the advice wrong. Regardless of language, the underlying logic defaults to one set of values and treats it as universal.

Kairos is not about building a socially intelligent model. It's about designing systems that help existing models adapt their behavior as social context changes. The goal is a framework for measuring and closing this behavioral alignment gap across cultures.

Where It Stands

4 frontier models tested. 3 languages. 6 everyday scenarios. 50+ cross-cultural user evaluations comparing model behavior against real expectations in Korea, India, and the US. The core finding is consistent: models don't lack cultural knowledge. They lack cultural reasoning.

The framework is in active development at Harvard's IDEP program. I'm expanding the scenario coverage, building evaluation instruments with cultural experts, and testing whether prompt-level interventions can shift model behavior without retraining.

Related writing: "Neutral Isn't Neutral" on Substack

What I Learned

This project connects every question I've been asking for eight years. At LG Fashion, I wondered if we were helping users or just moving them. At Toss, whether growth metrics reflected real value. At Scale AI, whether benchmarks measured what matters. Kairos is where those questions converge: who decides what "helpful" means, and what happens when the answer depends on who you're talking to?

The hardest part isn't the research. It's resisting the urge to oversimplify. Culture isn't a label. Users aren't categories. The framework has to be flexible enough to capture the diversity of how people actually make decisions, while structured enough to be measurable. I don't have a clean answer yet. But the question itself is the most honest thing I've worked on.