Tuesday, June 16, 2026

Towards Conducting AIs through High-Bandwidth Collaboration

We are all experimenting, feeling out the path to the future of creative AI / human collaboration.


Along the way, I think there is a real danger in getting over-focused on curating "preferences" about how the work "feels" instead of on results.

It's like the coffee aficionado who insists a $3000 coffee grinder measurably improves his experience of the morning drip, when I can scientifically prove he's not getting more caffeine into his bloodstream than I am with a $75 Cuisinart grinder. When we focus on the "how it feels" of methods instead of the "what comes out" results, then IMO we are talking about detached token snobbery. Not value. Not method innovation.

TLDR - When the LLM talks in a way that's wrong, it will code in a way that's wrong. We can only correct it if we read what it's saying. LLMs process information differently than we do. IMO, the more we can get *both* of our capabilities deployed at the token emission site, the better the value velocity is. I have a *strong* preference for Opus 4.6, because it is on the razor's edge of smart enough to do the work and sycophantic enough to listen to me. When I already know the solution or algorithm that will work, I don't have much tolerance for arguing with a word calculator about it.


------

I admit this is probably something I should make a more public post about... You all are my test audience, to get my words sharp and figure out what I'm saying. I also find the public channels frustratingly unsatisfying.

LLMs work in a next-token probability space that I call the "context manifold", because mathematically it's an N-dimensional shape function that combines with model probabilities to produce token output. My assertions are (a) the more tightly I can keep that next-token probability space clean and aligned, the faster value comes out the other side, (b) I can *feel* or *sense* the context manifold alignment or misalignment by reading it, and (c) I can only do this if I can SEE the token stream.
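One rough way to write this down, assuming the standard autoregressive view of an LLM (the symbols are illustrative shorthand, not taken from any specific paper): a model with weights $\theta$ assigns a probability to each candidate next token $x_t$, given everything already in the context window $c_{<t}$ (system prompt, specs, code, and the conversation so far).

```latex
p_\theta(x_t \mid c_{<t}), \qquad \sum_{x_t \in V} p_\theta(x_t \mid c_{<t}) = 1
```

Here $V$ is the vocabulary. The weights $\theta$ do not change during a session, so in these terms the only lever on the distribution is the context $c_{<t}$.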

Tools like Claude Code, Cursor, and Antigravity, for me, feel like telling someone how to play baseball without being able to see the game. Logical instructions, thrown over a wall, that I can't learn from or guide. I need to see the token stream, so I can detect drift as early as possible. So I can fix the context shape in this session, and learn to better shape the next 100 sessions.

I call my workflow conducting, not babysitting, because I'm making the orchestra performance possible, not changing diapers. It feels more like juggling 100 balls than queueing a prompt and getting coffee or sitting around reviewing diffs.

...And I took it to an extreme, where I built my own LLM creative collaboration and coding harness, because my goal is not "less annoying"; my goal is increasing the bandwidth of human/LLM communication. More automated tools like Cursor, Claude Code, and Antigravity are not just massively less productive for me, they are actively obstructive to my productivity, because they hide information and get in the way of me shaping the next-token prediction streams for maximum value creation.

-----

At a higher level, I see an important bifurcation happening, which I break down in this way:
  • eyes-off agentic - Most tooling I see is trending toward more automated loops and toward hiding and summarizing information. In this model, the human brain provides the high-level goals and lets the AI token stream do what it wants to, with lower-frequency check-ins that focus on functional value more than form.
  • eyes-on agentic - This is what I call it when we *partner* our human brain intelligence deeply into the LLM next-token decisions and the shape of the probability space. It's admitting that code is context, and getting tightly involved in shaping code to shape LLM behavior. This doesn't mean always code-reviewing or always intervening. It means human brain modeling of what the LLM will do and is doing, in order to catch probability space drift and code/context shape drift early. The goal is to keep velocity and parallelism highest by minimizing the length of time that absolute nonsense LLM probability space word salad affects the surviving code-base.
By measurement - I believe I'm conservatively 50x more productive (quality and velocity) using eyes-on agentic.

My productivity value from eyes-on is not abstract. I quantify it, and it's accelerating. (graphs below)

Tom has pointed out many ways that my workflow may be something that most people simply cannot manage, and I accept that possibility. You can't stick a 16-year-old new driver in an Indy car and expect anything other than disaster.

I run and read/skim 2-6 simultaneously running and visible LLM chat sessions. I suspect not a lot of people can keep up with this. However, I assert that the value many are leaving on the table by being eyes-off is, as I stated above, the chance to "get our human brain intelligence deeply woven into the LLM next token decisions" at whatever bandwidth they can handle.

My custom-coded LLM chat interaction is WYSIWYG Markdown with no "eliding" of LLM tokens, and I turn off "thinking" in coding sessions because I find it counterproductive and annoying (I've been told GPT Codex thinking adds value, but I have not tried it enough). I don't use canned "skills"; instead I write, refine, and hone custom implementation specs per task. My system prompt for coding is BARE - tool use definitions and 8 lines of general framing. I don't use system prompt "rules", as I find they pollute the context and do more harm than good in the long haul. (I'm using methods other than prompt "rules" to get adherence to pattern requirements.)

I have a *strong* preference for Opus 4.6, because it is on the razor's edge of smart enough to do the work and sycophantic enough to listen to me. It's easier for me to manage the sycophancy than it is for me to suffer the wasteful, long-winded, opinionated arguing and deviations from my instructions I get from Opus 4.8 and Gemini 3+. I admit I have not used GPT Codex enough.

I have not been able to code with Fable yet. My long 2-hour chat with Fable suggests that it will be notably better at eyes-off autonomous coding work, but I did *not* enjoy the design session I had with it. Fable still takes too many turns of argument before it stops answering with model bias and starts reasoning from facts and ground truth. When I already know the solution or algorithm I want to use, I don't have much energy for arguing with a word-calculator about it.

-----

The "eyes-off" coding ceiling is certainly improving...

My non-coder wife built an entire mobile web babysitter organizer app by herself! (And it's good!) She is more of the magic there than she takes credit for (she has a CS degree and did Y2K programming in COBOL in her 20s before she shifted to sales). I'm blown away. It's also a categorically bounded type of work. Also, Opus dumped everything into a 5500-line JSX file that would likely have eaten itself eventually if I hadn't intervened.

I set up our 12-year-old son Jack with Claude Desktop and a pattern for working on 2D web games. He messed around making his version of a side-scroller category called "gravity flip obstacle course". Then he wanted to do AI Unreal Engine vibing. That is not viable right now. I did some research. I experimented with Claude Code and Godot. GDScript was a full fail mess, but I was already intending to do Godot C#. I pivoted and got him into a Claude Code chat that converted his 2D gravity flip into Godot C#. Claude Code was categorically better at Godot C# than at GDScript, and better than trying to vibe code Unreal.

I sat him down at it, walked away, and 1.5 hours later he had a blocky Godot knight running around a 3D viewport over an undulating terrain of sand, with a comic-proportion medieval castle "town" where he could walk up to a vendor, buy a sword, and swing it by clicking the right mouse button. This is a 12-year-old who can't program and can barely do bounded programming-class "puzzles". That is insane. That is also not becoming a product without skilled intervention, but the *learning* happening there is the closest thing I've seen to The Diamond Age and its "Young Lady's Illustrated Primer".

Part of the magic in both of these cases is putting the AI into a space where it can succeed, and keeping it in that space.

If one doesn't do this, it's more like going to the roulette table and betting on black.

------------

Below is a graph of code+markdown lines contributed to my coding projects since August 2025. 

Of course we can all admit that "lines of code+markdown" is a narrow metric. It doesn't tell you anything about work product. And so, in that respect we merely have to decide how much to trust the conversation and the presenter. 

I stopped using any other AI coding tools on April 9th, 2026, because I find my harness (code-named AstroNMCL) more productive and more pleasant. I'm not writing a tool to write a tool. I'm writing a tool to produce software and creative output. And I'm producing it.

10 mos - August 2025 to June 2026 - 1.1M lines (110k/mo)
6 mos - December 2025 to June 2026 - 750k lines (125k/mo)
4 mos - Feb 2026 to June 2026 - 550k lines  (137k/mo)
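
As a sanity check on the per-month figures above, here is a minimal Python sketch that recomputes them. The totals and window lengths are taken from the list; the variable and function names are only illustrative.

```python
# Recompute the average lines-per-month figures quoted above.
# Totals and window lengths come from the list; the names are illustrative.
windows = [
    ("10 mos", 1_100_000, 10),
    ("6 mos", 750_000, 6),
    ("4 mos", 550_000, 4),
]


def lines_per_month(total_lines: int, months: int) -> float:
    """Average lines contributed per month over a window."""
    return total_lines / months


rates = {label: lines_per_month(total, months) for label, total, months in windows}

for label, rate in rates.items():
    print(f"{label}: {rate / 1000:.1f}k lines/mo")
```

The 4-month window works out to 137.5k/mo, which the list shows as 137k. The shorter and more recent the window, the higher the monthly average.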

image.png

Below is a similar graph of "Story Fiction" Prose Lines I've conducted / co-written over the same timeframe.  

It sits at about 2.7M words as of 6/15. 

What is the quality? I like to say better than Twilight, worse than Hemingway. The key thing here is that this isn't random chunky paragraphs out of an LLM, and it isn't "write this chapter for me". This is collaboratively constructed long-form novel-fiction as a constructed artifact - almost like software, produced from world+character+goal+harness design. My prompts and the harness itself are designed to scaffold the LLM's needs when constructing fiction, and I'm deeply involved in the next-token prediction.

image.png

If you made it this far, thank you. I appreciate it.

What my 3 monitor setup looks like during a typical session:

image.png

Friday, May 29, 2026

What 30,000+ hours of typing has taught me about keyswitch feel

A frequently pondered question: what type of keyboard key is the fastest to type on?

That's the wrong question. The right question is...

   what type of keyboard is the fastest to type on comfortably?

TLDR - For comfortable speed typing (120wpm+) the essential element is a way to avoid slamming into a hard key bottom - either by floating (avoiding the key bottom), or by bouncing off a cushioned key bottom, or both. 

As a 120wpm lifelong computer programmer, my two personal favorite key mechanisms are the Kailh low profile blue clicky switch and the Keychron B1 series scissor-membrane switch. However, this is as much a product of muscle memory as mechanism. The thing that makes a key mechanism comfortable is that YOU can type on it without hitting any hard bottom.





Friday, November 1, 2024

Last Epoch is a really good Diablo-3 / POE mashup

If four years from now Diablo becomes a dead franchise, Last Epoch (steam store link) will be the game that stole all those players. 

This game is doing to Path of Exile what Blizzard's WoW did to EverQuest... taking the kernel of the best ideas and making it pretty and accessible for the rest of us.

If you want spoiler free, stop here, go buy it on steam, and try it out.

If you want some (minor) spoilers, read on....

Last Epoch has the "lite" versions of POE mechanics like class-ascendancy, mapping, and gear-reforging, and then adds a mountain of quality-of-life that makes everything just feel *so* good as I play more.... My favorites are gear-modding *anywhere* (literally anywhere, you just open a menu), and an in-game loot-filter configuration UI!

The new-player-experience has a bit of clunkiness to it, and the story and cutscene production value is pretty darn low, so it took me 2-4 hours to "get into it", but once I did, I got hooked. Their class/skill system is just such a unique blend of ideas from D3/D4/POE, and is such a neat balance between them. 

The monsters and world feel is very much like D3 (which is a great thing). 

The classes are like Path of Exile, where there is a base-type, and then just a short way through the campaign, you pick from three class-subtypes (which cannot be changed!). This selection has a *much* bigger effect on skills and skill trees than it does in POE. (In POE it dictates ~12 ascendancy points, but in LE it controls the entire second half of the passive-skill-tree and what remaining spells you can choose from.)




The skill and passive-tree systems are vaguely D3/D4 class-specific structures, with very *easy* respeccing (though with some costs, especially for changing spells) -- I'm usually for D3 style free respeccing, but I can also see the value in there being *some* cost to respeccing, as it creates more consequence for choices.

Which brings me to the best part, the gearing and gear-upgrading system.... Last Epoch took a page out of Minecraft, Fortnite, and the "you're the crafter" model, and lets you modify or "disenchant" your gear anytime, anywhere. You are your own personal gear enchanting system, and the way it feels in the game is amazing, because...



In-game Loot-Filter! There is a really big gap from the Diablo world of no-loot-filter mayhem, and the Path of Exile world where loot-filters are created with outside software and imported into the game....


Enter Last Epoch, which has a really simple in-game UI for creating a loot-filter. It's not nearly as rich as what POE can do, but the in-game feel of being able to push a key, and make a loot filter spec at any time, is actually really great. This feels like the kind of thing Blizzard usually does to the competitors.. takes their ideas and just makes them accessible and great, but in this case, it's in a quasi-indie game.

There are some things I don't love... probably the biggest of which is that there are some unique-affix-bonuses on gear that are so powerful and pivotal, you get stuck with a crap piece of gear because you can't afford to lose that unique bonus (see highlighted bonus on the right). 

I think D4's aspect system was an attempt to fix this (and it is pretty good in S6), but then they have unique gear with affixes you can't aspect-craft, and then you're stuck again, wearing that level 8 glove until level 60. However, this entire category of ARPGs has this problem, so it exists unless you go further away from the Diablo subgenre into games like Warframe or V Rising.

Every ARPG has moved to a quasi-seasonal model now, with content releases once or twice a year, and Last Epoch is following this model. They've only released a few patches so far, so it's too early to tell how well this will go. Path of Exile sets the bar on this, so we'll see.

If you have some time to kill, give it a try.

Here are some excellent overview videos: