Notes / 2026-06-28

The pattern recognition enhancement

Instead of feeding AI the entire sugarcane farm, we can process the cane ourselves and provide cleaner sugar.

My weekday shift in what I typically call my private practice usually ends just before 12 PM MNL. A little later, around 12ish PM, it is time for chow, because apparently lunch needs its own soft launch.

One thing we always have during lunchtime is a TV playing in the background. It is a few yards away from the dining area, so we may not always see what is happening, but we can definitely hear it.

Yes, we can hear It’s Showtime, one of the longest-running noontime shows in the Philippines.

While we were having lunch, I heard Baby Dolls performing Oohh Lala Baby.

I will admit it: the song made me dance.

And yes, I briefly paused eating just to see how the girl group performed it. They were great!

This weekend, I tried to dig deeper into who wrote the song. That small investigation made me think about how pattern recognition is wired into humans whenever we consume or produce something.

The song was composed by Christian Martinez.

Upon further research, I learned that Christian also composed Kikay by Viva Hot Babes.

No wonder Oohh Lala Baby made me dance.

It carried the energy of another phenomenal song composed by someone who already made us dance when I was in 8th grade. Good thing we did not have smartphones recording everything back then. Dios mio, evidence permanently unavailable.

Kudos to Christian for making me dance again in 2026!

Does Martinez have a bias toward writing songs that make Filipinos dance?

I do not know the exact answer. I do not have access to the repository inside Martinez’s head. The repo is private: no clone access, and git pull is definitely out of the question.

Christian has been an artist for quite some time, and I know that every musical work requires effort and talent. But based on the outputs I have heard, one thing appears repeatedly: energy is always present in the music.

That is not proof that Martinez follows a formula. It may simply be Martinez’s artistic signature.

Humans recognize patterns like this naturally. We encounter one output, remember another, and eventually connect them. The more evidence we accumulate, the easier it becomes to recognize a recurring style.

But while we may not know the exact methodology individuals use when creating their products, we now have tools that can help us preserve patterns, maintain consistency, and remain factual about our own output.

After the first quarter of 2026, I started using Codex by OpenAI.

I already had experience using coding agents built around LLMs, so Codex became another opportunity to explore a new tool. It had already been around for roughly a year, but it only captured my interest this year when I started using it to accelerate personal dev projects.

I also wanted to separate those projects from what I use in my private practice. Separation of concerns, hehe.

ChatGPT on the web or mobile app can use persistent memory when the feature is enabled. I opted in.

It helps, but it does not always preserve the exact patterns I want it to remember. Sometimes the output feels bland. Sometimes details drift. In the worst cases, the model confidently produces something that does not match the record I remember.

That does not necessarily mean something broke inside an OpenAI data center.

Hallucinations can happen for several reasons: insufficient context, ambiguous prompts, incorrect sources, faulty retrieval, or the probabilistic nature of language models.

But in my particular use case, I realized that the memories and patterns I wanted the model to recognize were not structured well enough.

So I went to Codex and decided to work on that part myself.

Codex has an archiving option for its threads. Archived session transcripts are stored under:

$CODEX_HOME/archived_sessions

On Windows, the default location is typically:

C:\Users\<username>\.codex\archived_sessions
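
A minimal sketch of resolving that directory from a script. It assumes that when CODEX_HOME is not set, the fallback is a .codex folder under the user's home directory, which matches the Windows default above. The function name archived_sessions_dir is made up for illustration.

```python
import os
from pathlib import Path


def archived_sessions_dir(env=None):
    """Return the archived-sessions directory, honoring CODEX_HOME when set."""
    env = os.environ if env is None else env
    codex_home = env.get("CODEX_HOME")
    # Assumption: without CODEX_HOME, fall back to ~/.codex
    base = Path(codex_home) if codex_home else Path.home() / ".codex"
    return base / "archived_sessions"


if __name__ == "__main__":
    print(archived_sessions_dir())
```

Passing the environment in as a parameter keeps the function easy to test without touching the real CODEX_HOME.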

Inside that directory are newline-delimited JSON files, commonly called .jsonl files, where each line is one JSON record. These can serve as records of the conversations, instructions, tool calls, and outputs produced during Codex sessions.

They are Codex session transcripts and not automatic copies of ordinary conversations from ChatGPT web or mobile.
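
Because each line is its own JSON value, a transcript can be read one record at a time instead of loading the whole file. The sketch below assumes nothing about the fields inside each record. The helper name iter_jsonl is hypothetical, and silently skipping malformed lines is a design choice, not something Codex prescribes.

```python
import json


def iter_jsonl(lines):
    """Yield one parsed value per non-empty line; skip lines that are not valid JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # A truncated or corrupted line should not stop the whole read.
            continue


# Usage with a real file (the path is whatever rollout file you pick):
# with open(path, encoding="utf-8") as handle:
#     for record in iter_jsonl(handle):
#         ...
```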

This is where we can improve continuity and pattern recognition.

An AGENTS.md file can provide persistent guidance about how we want Codex to behave: communication style, working agreements, directory conventions, legal checks, and other instructions that should apply consistently.

But archived .jsonl files do not automatically become an additional index simply because AGENTS.md exists.

Codex still needs to be explicitly instructed to read or search them. Better yet, we can build a retrieval process that extracts only the parts relevant to the current conversation.
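
A retrieval step does not have to be fancy to be useful. Here is a rough sketch that ranks records by how many words of the current topic they mention. It assumes each record is a dict with a "text" field, which is a placeholder and not the confirmed Codex rollout schema. The find_relevant name and the plain keyword scoring are stand-ins for whatever retrieval method ends up fitting.

```python
def find_relevant(records, query, limit=5):
    """Return up to `limit` records, ranked by how many query words they contain."""
    terms = [term.lower() for term in query.split()]
    scored = []
    for record in records:
        # "text" is a hypothetical field name.
        text = str(record.get("text", "")).lower()
        score = sum(1 for term in terms if term in text)
        if score:
            scored.append((score, record))
    # Stable sort: ties keep their original (chronological) order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:limit]]
```

Only the records returned here would be handed to the agent, rather than the whole archive.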

The problem is that the instruction cannot simply be: "Hey, archived sessions, please resync everything that happened. Once you are ready, we will start this new day."

I mean, why make AI do all the harvesting?

Why give it the entire sugarcane farm and ask it to cut the cane, operate the mill, remove the waste, refine the sugar, and serve the final product?

That is where human involvement should begin.

We can trim the properties that are not necessary for memory retrieval. We can remove repeated system instructions, tool noise, expired authentication data, and unrelated events. We can retain useful timestamps, user messages, final responses, decisions, and shift recaps.
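
The trimming described above could look something like this. Every label here is an assumption: the "type", "timestamp", and "text" field names and the values in KEEP_TYPES are placeholders until the real anatomy of the rollout files is mapped out.

```python
# Hypothetical record labels worth keeping for memory retrieval.
KEEP_TYPES = {"user_message", "final_response", "decision", "shift_recap"}
# Hypothetical field names to retain on each kept record.
KEEP_FIELDS = ("timestamp", "type", "text")


def trim(records):
    """Drop unwanted record types and strip each kept record down to KEEP_FIELDS."""
    cleaned = []
    for record in records:
        if record.get("type") not in KEEP_TYPES:
            continue  # system instructions, tool noise, auth data, unrelated events
        cleaned.append({key: record[key] for key in KEEP_FIELDS if key in record})
    return cleaned
```

An allowlist is used on purpose: anything not explicitly kept, including expired authentication data, is left out by default.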

Instead of feeding AI the entire sugarcane farm, we can process the cane ourselves and provide cleaner sugar.

Then AI can handle the final refinement. I am guilty of not doing this yet.

At the moment, I am still working on an automation or script that can capture the information that is genuinely useful to my coding agent. A rollout .jsonl file contains several record types that can be filtered, normalized, or discarded depending on the use case.
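
Before deciding what to filter, it helps to see which record types a file actually contains. The sketch below tallies them, assuming each record is a dict with a "type" key. That key name is a guess, so it is left as a parameter to swap once the real schema is known.

```python
from collections import Counter


def count_record_types(records, key="type"):
    """Tally records by the value of `key`; records lacking it count as '<missing>'."""
    return Counter(
        str(record.get(key, "<missing>"))
        for record in records
        if isinstance(record, dict)
    )
```

Running this over one rollout file gives a quick inventory of what is there and how much of it is noise.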

For remembering the context of one session, a single .jsonl file may be manageable.

But what happens when we are dealing with a month or an entire quarter of archived rollout files?

Loading everything consumes more tokens. It introduces more noise. And more context does not automatically produce better understanding.
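
To get a feel for the cost, a back-of-the-envelope estimate is enough. The sketch below assumes roughly four characters per token, which is only a common rule of thumb for English text and not an exact tokenizer count. The rough_tokens name is made up for illustration.

```python
import json


def rough_tokens(records, chars_per_token=4):
    """Very rough token estimate from the serialized size of the records."""
    chars = sum(len(json.dumps(record, ensure_ascii=False)) for record in records)
    return chars // chars_per_token
```

Comparing the estimate for raw records against the cleaned ones gives a quick sense of how much a cleaning pass saves across a month or a quarter of files.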

Sometimes better pattern recognition begins with humans doing the cleaning first.

Christian Martinez may not need any of this. Martinez already has the artistic ability to recognize rhythm, energy, and resonance and turn them into songs that make people move. Maybe Martinez uses AI today; maybe not. That is not something I should invent.

One thing is certain: if Christian also uses AI to enhance the craft, I will probably dance again when the next phenomenal song arrives.

Oh, btw, Christian also composed Totoy Bibo, Don Romantiko, and, yes, Kendeng Kendeng by Willie Revillame.

Oh, ang galing-galing mong sumayaw!

Stay tuned for the automation scripts once I figure out the anatomy of these .jsonl files.

Less noise, fewer tokens, better continuity.

Better craft and output, just like the songs Martinez creates.

Gosh! Christian, you really made us dance with your products.
