Where does discussion on gut training occur? All I know is that you need a 5:4 ratio of glucose to fructose, and that when you train you use the gels; the more you do it, the more capable your gut gets at absorbing carbs without distress.
AFAIK 5:4 is just the lowest ratio they've tested. Personally I use table sugar (1:1) and can sustain rates above 100 g/h. I haven't hit the ceiling yet, and I don't feel the need to find out where it is: exceeding your absorption rate risks diarrhoea, which is bad at any time but especially in the middle of a training session when who knows where the nearest toilet is.
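To make the ratio talk concrete, here's a quick sketch of the arithmetic. Sucrose (table sugar) splits 1:1 into glucose and fructose, so the numbers below follow directly from the ratios discussed above:

```python
# Split an hourly carb target into glucose and fructose grams
# for a given glucose:fructose ratio.

def split_carbs(total_g_per_h: float, glucose_parts: float, fructose_parts: float):
    """Return (glucose g/h, fructose g/h) for a ratio like 5:4 or 1:1."""
    parts = glucose_parts + fructose_parts
    glucose = total_g_per_h * glucose_parts / parts
    fructose = total_g_per_h * fructose_parts / parts
    return glucose, fructose

# Table sugar at 100 g/h: 1:1 ratio
print(split_carbs(100, 1, 1))  # (50.0, 50.0)

# A 5:4 commercial mix at 90 g/h
print(split_carbs(90, 5, 4))   # (50.0, 40.0)
```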
Gut training is consuming large amounts of carbohydrate (preferably in the same form you intend to use when racing), yes.
Eating the same amount of table sugar or of a commercial gel should have pretty much the same effect on performance.
However, for many people eating so much of a very sweet food becomes very unpleasant.
It is very easy and cheap to make a gel at home by microwaving corn starch mixed with fructose in water for a few minutes. Swallowing such a gel should feel much less sweet than the same amount of a sugar solution.
As far as I know, the only difference between such a homemade gel and the commercial gels for athletes is that in the latter the starch is pre-digested with bacterial enzymes, so that the long starch molecules are broken down into shorter dextrins and maltose.
This processing shortens the time until absorption in the gut, but I am not sure that is an advantage in all cases. Slower absorption maintains an elevated blood glucose level for longer after ingestion, which may be preferable if you feed periodically, because it avoids wide fluctuations in glucose level. Faster absorption might be more useful for immediate recovery when the glucose level has been severely depleted by a long gap between feeds.
Yes but the science is actually achieving that and finding the limits. It used to be thought that 60g carbs/hour was the limit, then 100g, now it’s thought to be 120g.
It’s also about the methods of achieving that under stress without spewing it all back up. Ironman athletes would stuff their faces on the bike under the assumption that this volume of carb absorption wasn’t possible while running.
Some of the challenge in research will come from competitors not wanting to publish results so they can maintain an edge. That's mitigated by the visibility of racing (you can see athletes pounding carbs), as well as by the nutrition companies wanting to sell more product, which gives them an incentive to publish enough to convince us amateurs to quadruple our purchase volume ;-)
I agree. Gemini models are held back by their segmentation of usage between multiple products, combined with their awful harnesses and tooling. Gemini cli, antigravity, Gemini code assist, Jules.... The list goes on. Each of these products has only a small limit and they must share usage.
It gets worse than that though. Most harnesses built to handle Codex and Claude cannot handle Gemini 3.1 correctly. Google has trained Gemini 3.1 to return different JSON keys than most harnesses expect, resulting in poor results and outright failures. (Based on me perusing multiple harness GitHub issues after Gemini 3.1 came out.)
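As a sketch of what "different JSON keys" means in practice, here's the kind of normalization shim a harness ends up needing. The key names below are invented for illustration, not Gemini's actual schema:

```python
# Hypothetical shim: map a model's tool-call JSON keys onto the
# canonical names a harness expects. All key names here are made up.

KEY_ALIASES = {
    "function_call": "tool_call",
    "functionCall": "tool_call",
    "args": "arguments",
    "parameters": "arguments",
}

def normalize(obj):
    """Recursively rename aliased keys to the harness's canonical names."""
    if isinstance(obj, dict):
        return {KEY_ALIASES.get(k, k): normalize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize(v) for v in obj]
    return obj

raw = {"functionCall": {"name": "read_file", "args": {"path": "main.py"}}}
print(normalize(raw))
# {'tool_call': {'name': 'read_file', 'arguments': {'path': 'main.py'}}}
```

Without a shim like this, the harness simply never sees a tool call it recognizes, which matches the failure mode described in those issues.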
They likely already have. You can use all caps and yell at Claude and it'll react normally, while doing so with ChatGPT scares it, resulting in timid answers.
This is something inherently hard to avoid with a prompt. The model is instruction-tuned and trained to interpret anything sent under the user role as an instruction, not necessarily in a straightforward way. Even if you train it to refuse or dodge some inputs (which they do), that training is going to affect the model's responses, often in subtle ways, especially in a multiturn convo. Anthropic themselves call this character drift.
For me GPT always seems to get stuck in a particular state where it responds with one short sentence per paragraph and becomes weirdly philosophical. This eventually happens in every session. I wish I knew what triggers it, because it's annoying and sharply reduces its usefulness.
Usually the whole session is sent as context, up to the token limit, each time inference is performed. Are you keeping each session to one subject? Have you set up personalizations? Do you add lots of data?
It would be interesting if you posted a couple of sessions so we can see what 'philosophical' things it's arriving at and what precedes them.
I think my next steps are:
1) try out openai $20/month. I've heard they're much more generous.
2) try out open router free models. I don't need geniuses, so long as I can see the thinking (something that Claude code obfuscates by default) I should be good. I've heard good things about the CLIO harness and want to try openrouter+clio
Word on the street is that Opus is much much larger of a model than GPT-5.4 and that’s why the rate limits on Codex are so much more generous. But I guess you could also just switch to Sonnet or Haiku in Claude Code?
I'm taking a bet on local models to do the non genius work. Gemma 4 (released yesterday) has been designed to run on laptops / edge devices....and so far is running pretty well for me.
Edge models are good for their purpose, but putting them in an agentic flow with current Ollama quants on a Mac Mini, I see a high tool-use error rate and output hallucination.
For JSON-to-text formatting it works well on a one-round basis. So realistically you should have an evaluation ready to go that you can run against these models. I currently judge them myself, but people often use a smarter LLM as a judge.
These days, writing an eval harness with Claude is a five-minute job. Do it yourself so you can keep exploring as the quants of Gemma get better.
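A minimal version of such a harness can be sketched like this. Here `run_model` and `judge` are stubs standing in for real calls (e.g. a local Gemma via Ollama and a bigger judge model), and the test cases are made up:

```python
# Minimal one-round eval harness sketch. Both the model call and the
# judge are stubs; swap them for real API calls and an LLM judge.

CASES = [
    {"prompt": 'Format as text: {"name": "Ada", "age": 36}',
     "must_contain": ["Ada", "36"]},
    {"prompt": 'Format as text: {"city": "Oslo"}',
     "must_contain": ["Oslo"]},
]

def run_model(prompt: str) -> str:
    # Stub: replace with a real call to the model under test.
    return prompt.split("Format as text: ")[-1]

def judge(output: str, must_contain: list[str]) -> bool:
    # Cheap rule-based judge; swap in an LLM-as-judge for fuzzier criteria.
    return all(needle in output for needle in must_contain)

def run_eval():
    passed = sum(judge(run_model(c["prompt"]), c["must_contain"]) for c in CASES)
    return passed, len(CASES)

print(run_eval())  # (2, 2) with the stub above
```

The point is to keep the case list and scoring stable so you can re-run it as new quants land and compare like with like.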
OpenAI has the better coding model anyway. You will be pleasantly surprised by Codex: the TUI is less buggy and runs faster, and the model is more careful and less error-prone. It's not as "creative", but it's more intelligent.
On top of that their $20 plan has much higher usage limits than Anthropic's $20 plan and they allow its use in e.g. opencode. So you can set up opencode to use both OpenAI's codex plan plus one of the more intelligent Chinese models so you can maximize your usage. Have it fully plan things out using GPT 5.4, write code using e.g. Qwen 3.6, then switch back to GPT 5.4 for review
You can charge $10 on the account and get unlimited requests. I abused this last week with Nemotron Super to test out some stuff, made probably over 10,000 requests over a couple of days, and didn't get blocked or anything, apart from some 5xx errors and slowdowns.
I've been pretty satisfied using oh-my-openagent (omo) on opencode with both opus-4.6 and gpt-5.4 lately.
The author of omo suggests different prompting strategies for different models and goes into some detail here.
https://github.com/code-yeongyu/oh-my-openagent/blob/dev/doc...
For each agent they define, they change the prompt depending on which model is being used to fit it.
I wonder how much of the "x did worse than y for the same prompt" tests could be improved if the prompts were actually tailored to what the model is good at.
I also wonder if any of this matters or if it's all a crock of bologna..
i think it may matter a good bit. i definitely have to write in different styles with different models (and catch myself doing so unintentionally) now that you mention it...
Fwiw I run this eval every week on a set of known prompts, and I believe the within-model differences are bigger than the between-model ones.
That is, I get more variance between opus 4.6 and itself across runs than I do between the SOTA models.
I don't have the budget for statistical significance, but I'm convinced that people claiming broad differences are just vibing, or that there are cases where agent features make a big difference.
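The within-vs-between comparison above can be illustrated with a toy calculation; the scores here are invented, not real benchmark numbers:

```python
# Toy illustration of within-model vs between-model variance on
# repeated eval runs. Scores are synthetic, not real benchmark data.
from statistics import mean, pstdev

runs = {
    "model_a": [0.71, 0.78, 0.69, 0.80, 0.74],  # one model, five runs
    "model_b": [0.73, 0.76, 0.70, 0.79, 0.75],
}

# Spread of each model against itself across repeated runs.
within = {name: pstdev(scores) for name, scores in runs.items()}
# Spread of the models' mean scores against each other.
between = pstdev([mean(scores) for scores in runs.values()])

# With these numbers, each model's run-to-run spread is larger than
# the gap between the two models' averages.
print(within, between)
```

If your real scores look like this, a single head-to-head run tells you almost nothing; you need repeated runs before attributing a gap to the model rather than to noise.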
Is that all the science to it?