
Context


There is a limit on how much one can hold at once.

This week I started to figure out what mine is. And realised I had been thinking about this problem for years - just in different languages.

Week 9 of life post-MAS (past weeks in my newsletter).

The familiar. Collaborating on a research paper on Agentic AI risk management, and suddenly realising a paper they cited was written by someone I know well. An ex-colleague whose brain I wanted to pick on some AI training ideas. She wondered aloud why it took me so long to understand what she had been trying to tell me weeks ago. Someone I met earlier, between jobs like me, figuring out what’s next. But in no hurry, like me.

A venture studio I met earlier that wanted to see if I would be interested in AI governance training for board members. Someone steeped in compliance who I met last year while moderating a panel. Someone from a credit ratings agency building AI agents, who I met at the start of this adventure. The chat with her connected so many dots for me.

The unfamiliar. Consultants I may collaborate with in my domain - AI risk management. Governance, technology, and HR folks from a bank who I met to discuss AI training ideas for finance. I was expecting to meet two of them; the whole team joined. An AI governance head at a big pharmaceutical company who coincidentally wanted to pick my brain on Agentic AI risk management. Someone working in a standards organisation who just wanted my perspective on life.

While the chats were interesting, I started thinking about what could fit in my limited window. Which again led me to another AI concept.

Some people have asked how I keep finding connections between life and AI concepts. Honestly, I don’t know either. But here’s another one.

The Context Window

If you’ve used large language models seriously, you know what a context window is.

It’s the maximum amount of text a model can hold in mind at one time. Everything the model knows about your conversation, the instructions, the history, what you said three messages ago, lives inside this window. Memory in these models isn’t separate. It’s just context that was retrieved and loaded into the window. When the window fills up, something gives. Some models, like Claude, have around 200K tokens for their window; some, like Gemini, have 2 million. Last I checked. But even with 2 million tokens, Gemini can miss things if you stuff it with too much. Even with a large context window, attention degrades.
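Here’s a toy sketch of what that trade-off looks like mechanically. None of this is a real model API: the token counter is a crude word-count stand-in for a real tokenizer, and dropping the oldest messages first is just one common truncation strategy.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one word, one token.
    return len(text.split())

def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    # Keep the most recent messages that fit the budget; drop the oldest.
    kept, used = [], 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break  # the window is full: everything older falls out
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    "system: you are helpful",
    "user: long setup " * 50,  # one bloated message
    "user: the question that actually matters",
]
# With a small budget, the bloated message evicts everything before it,
# including the system prompt (which real systems usually pin separately).
print(fit_to_window(history, max_tokens=60))
```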

Other Places I’ve Seen This Before

The context window problem isn’t unique to LLMs. I’ve encountered versions of it across everything I’ve worked on.

In time series modelling, which is where my PhD research lives, the equivalent is the lookback window. How far back should the model look to make a prediction? Too short, and it misses the pattern. Too long, and it loads historical regimes that no longer apply. Markets change structure. Relationships between variables shift. The right window isn’t the longest one. It’s the most relevant one.
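A toy illustration of the point, my own sketch rather than anything from my research: a series with a regime change, forecast with a plain moving average over two different lookbacks.

```python
import numpy as np

rng = np.random.default_rng(0)
old_regime = rng.normal(10.0, 1.0, 200)  # history that no longer applies
new_regime = rng.normal(2.0, 1.0, 50)    # the current regime
series = np.concatenate([old_regime, new_regime])

def moving_average_forecast(x: np.ndarray, lookback: int) -> float:
    # Forecast the next value as the mean of the last `lookback` observations.
    return float(np.mean(x[-lookback:]))

truth = 2.0  # mean of the current regime
for lookback in (10, 250):
    pred = moving_average_forecast(series, lookback)
    print(f"lookback={lookback:3d} -> forecast={pred:5.2f}, "
          f"error={abs(pred - truth):4.2f}")
# The long lookback drags in the old regime and misses badly.
# The short, relevant one lands close to the truth.
```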

And then there are graph neural networks, which I also work with a lot. In a graph, context is defined by hops. One hop, your immediate neighbours. Two hops, their neighbours. Three hops, the wider network. Deeper context, richer information. But too much context here turns everything into a muddled average. It’s called over-smoothing. Go too many hops, and every node starts to look the same. The signal drowns in aggregated neighbourhood noise. You lose the very distinctiveness you were trying to capture.
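You can watch over-smoothing happen with almost no machinery. This is a bare-bones sketch, not a real GNN: just repeated mean aggregation over neighbours on a small path graph, with no learned weights.

```python
import numpy as np

# Adjacency of a 5-node path graph, with self-loops added.
A = np.array([[1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1]], dtype=float)
A = A / A.sum(axis=1, keepdims=True)  # row-normalise: mean over neighbours

# Start with distinctive features at the two end nodes.
x = np.array([[1.0], [0.0], [0.0], [0.0], [5.0]])

for hop in range(1, 17):
    x = A @ x  # one more hop of neighbourhood averaging
    if hop in (1, 2, 4, 8, 16):
        print(f"hop {hop:2d}: feature spread = {x.max() - x.min():.4f}")
# The spread shrinks towards zero: every node converges to the same value,
# and the distinctiveness you started with is gone.
```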

The same phenomenon, three times over. Past a point, more context degrades the signal. The right context isn’t the broadest context. It’s the most relevant context.

What I Noticed

I started the week with a clear list.

By the end, the shape of the week had changed. Things kept arriving. A few weeks ago, I would have felt the pull of each one equally. Now, I realised that walking away was surprisingly easy. And I noticed I had developed filters I couldn’t quite articulate a month ago.

Is this interesting work? Are these people I think I will enjoy working with? Does this constrain me in ways that feel like a job? Is this AI done properly, or is it the shallow end of the pool dressed up as depth? I have spent too long watching organisations conflate prompt engineering with AI literacy. I don’t want to spend my context window there.

Not everything belongs in the window. I am slowly learning to tell the difference.

Still Learning the Lookback

In time series, the right lookback is regime-aware: you avoid history that no longer predicts. In graphs, you stop at the hop before signal turns to noise. Or you weight what matters properly.

I think the same applies here.

9 weeks out. Busier than I expected. Less scary than I feared.

But the context window is still finite. That used to feel like a constraint. Increasingly, it feels like the point.

#AI #AIRiskManagement #Transitions #Reflections #Context