Where does discussion on gut training occur? All I know is that you need a 5:4 ratio of glucose to fructose, and that when you train you use the gels; the more you do it, the more capable your gut gets at absorbing carbs without distress.
AFAIK 5:4 is just the lowest ratio they've tested. Personally I use table sugar (1:1) and can sustain rates above 100 g/h. I haven't hit the ceiling yet, and I don't feel the need to find out where it is: exceeding your absorption rate risks diarrhoea, which is bad at any time but especially in the middle of a training session when who knows where the nearest toilet is.
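To make the ratio talk concrete, here's a quick sketch of the arithmetic. Sucrose (table sugar) splits 1:1 into glucose and fructose, so the numbers below follow directly from the ratios discussed above:

```python
# Split an hourly carb target into glucose and fructose grams
# for a given glucose:fructose ratio.

def split_carbs(total_g_per_h: float, glucose_parts: float, fructose_parts: float):
    """Return (glucose g/h, fructose g/h) for a ratio like 5:4 or 1:1."""
    parts = glucose_parts + fructose_parts
    glucose = total_g_per_h * glucose_parts / parts
    fructose = total_g_per_h * fructose_parts / parts
    return glucose, fructose

# Table sugar at 100 g/h: 1:1 ratio
print(split_carbs(100, 1, 1))  # (50.0, 50.0)

# A 5:4 commercial mix at 90 g/h
print(split_carbs(90, 5, 4))   # (50.0, 40.0)
```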
Gut training is consuming large amounts of carbohydrate (preferably in the same form you intend to use when racing), yes.
Eating the same amount of table sugar or of a commercial gel should have pretty much the same effect on performance.
However, for many people eating so much of a very sweet food becomes very unpleasant.
It is very easy and cheap to make a gel at home by microwaving corn starch mixed with fructose in water for a few minutes. Swallowing such a gel should feel much less sweet than the same amount of a sugar solution.
As far as I know, the only difference between such a homemade gel and the commercial gels for athletes is that in the latter the starch is pre-digested with bacterial enzymes, so that the long starch molecules are broken down into shorter dextrins and maltose.
This processing shortens the time until absorption in the gut, but I am not sure that is an advantage in all cases. Slower absorption maintains an elevated blood glucose level for longer after ingestion, which may be preferable if you feed periodically, because it avoids wide fluctuations in glucose level. Faster absorption might be more useful for immediate recovery when the glucose level has been severely depleted by a long gap between feeds.
Yes but the science is actually achieving that and finding the limits. It used to be thought that 60g carbs/hour was the limit, then 100g, now it’s thought to be 120g.
It’s also about the methods of achieving that under stress without spewing it all back up. Ironman athletes would stuff their faces on the bike under the assumption that this volume of carb absorption wasn’t possible while running.
Some of the challenge in research will come from competitors not wanting to publish results so they can maintain an edge. That's mitigated by the visibility of racing (you can see athletes pounding carbs), as well as by the nutrition companies wanting to sell more product, which gives them an incentive to publish enough to convince us amateurs to quadruple our purchase volume ;-)
I agree. Gemini models are held back by their segmentation of usage between multiple products, combined with their awful harnesses and tooling. Gemini cli, antigravity, Gemini code assist, Jules.... The list goes on. Each of these products has only a small limit and they must share usage.
It gets worse than that though. Most harnesses built to handle Codex and Claude cannot handle Gemini 3.1 correctly. Google has trained Gemini 3.1 to return different JSON keys than most harnesses expect, resulting in poor results and outright failures. (Based on me perusing multiple harness GitHub issues after Gemini 3.1 came out.)
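As a sketch of what "different JSON keys" means in practice, here's the kind of normalization shim a harness ends up needing. The key names below are invented for illustration, not Gemini's actual schema:

```python
# Hypothetical shim: map a model's tool-call JSON keys onto the
# canonical names a harness expects. All key names here are made up.

KEY_ALIASES = {
    "function_call": "tool_call",
    "functionCall": "tool_call",
    "args": "arguments",
    "parameters": "arguments",
}

def normalize(obj):
    """Recursively rename aliased keys to the harness's canonical names."""
    if isinstance(obj, dict):
        return {KEY_ALIASES.get(k, k): normalize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [normalize(v) for v in obj]
    return obj

raw = {"functionCall": {"name": "read_file", "args": {"path": "main.py"}}}
print(normalize(raw))
# {'tool_call': {'name': 'read_file', 'arguments': {'path': 'main.py'}}}
```

Without a shim like this, the harness simply never sees a tool call it recognizes, which matches the failure mode described in those issues.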
They likely already have. You can use all caps and yell at Claude and it'll react normally, while doing so with ChatGPT scares it, resulting in timid answers.
This is something inherently hard to avoid with a prompt. The model is instruction-tuned and trained to interpret anything sent under the user role as an instruction, not necessarily in a straightforward way. Even if you train it to refuse or dodge some inputs (which they do), that training is going to affect the model's responses, often in subtle ways, especially in a multiturn convo. Anthropic themselves call this character drift.
For me GPT always seems to get stuck in a particular state where it responds with one short sentence per paragraph and becomes weirdly philosophical. This eventually happens in every session. I wish I knew what triggers it, because it's annoying and sharply reduces its usefulness.
Usually the whole session is sent as context, up to the token limit, each time inference is performed. Are you keeping each session to one subject? Have you set up personalizations? Do you add lots of data?
It would be interesting if you posted a couple of sessions so we can see what 'philosophical' things it's arriving at and what precedes them.
I think my next steps are:
1) try out openai $20/month. I've heard they're much more generous.
2) try out open router free models. I don't need geniuses, so long as I can see the thinking (something that Claude code obfuscates by default) I should be good. I've heard good things about the CLIO harness and want to try openrouter+clio
Word on the street is that Opus is much much larger of a model than GPT-5.4 and that’s why the rate limits on Codex are so much more generous. But I guess you could also just switch to Sonnet or Haiku in Claude Code?
I'm taking a bet on local models to do the non genius work. Gemma 4 (released yesterday) has been designed to run on laptops / edge devices....and so far is running pretty well for me.
Edge models are good for their purpose, but putting them in an agentic flow with current Ollama quants on a Mac Mini, I see a high tool-use error rate and output hallucination.
For JSON-to-text formatting it works well on a one-round basis. So realistically you should have an evaluation ready to go that you can run against these models. I currently judge them myself, but people often use a smarter LLM as a judge.
These days, writing an eval harness with Claude is a five-minute job. Do it yourself so you can keep exploring as the quants of Gemma get better.
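A minimal version of such a harness can be sketched like this. Here `run_model` and `judge` are stubs standing in for real calls (e.g. a local Gemma via Ollama and a bigger judge model), and the test cases are made up:

```python
# Minimal one-round eval harness sketch. Both the model call and the
# judge are stubs; swap them for real API calls and an LLM judge.

CASES = [
    {"prompt": 'Format as text: {"name": "Ada", "age": 36}',
     "must_contain": ["Ada", "36"]},
    {"prompt": 'Format as text: {"city": "Oslo"}',
     "must_contain": ["Oslo"]},
]

def run_model(prompt: str) -> str:
    # Stub: replace with a real call to the model under test.
    return prompt.split("Format as text: ")[-1]

def judge(output: str, must_contain: list[str]) -> bool:
    # Cheap rule-based judge; swap in an LLM-as-judge for fuzzier criteria.
    return all(needle in output for needle in must_contain)

def run_eval():
    passed = sum(judge(run_model(c["prompt"]), c["must_contain"]) for c in CASES)
    return passed, len(CASES)

print(run_eval())  # (2, 2) with the stub above
```

The point is to keep the case list and scoring stable so you can re-run it as new quants land and compare like with like.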
OpenAI has the better coding model anyway. You will be pleasantly surprised by Codex: the TUI is less buggy and runs faster, and the model is more careful and less error-prone. It's not as "creative", but it's more intelligent.
On top of that their $20 plan has much higher usage limits than Anthropic's $20 plan and they allow its use in e.g. opencode. So you can set up opencode to use both OpenAI's codex plan plus one of the more intelligent Chinese models so you can maximize your usage. Have it fully plan things out using GPT 5.4, write code using e.g. Qwen 3.6, then switch back to GPT 5.4 for review
You can charge $10 on the account and get unlimited requests. I abused this last week with Nemotron Super to test out some stuff, made probably over 10,000 requests over a couple of days, and didn't get blocked or anything, apart from some 5xx errors and slowdowns.
I've been pretty satisfied using oh-my-openagent (omo) on opencode with both opus-4.6 and gpt-5.4 lately.
The author of omo suggests different prompting strategies for different models and goes into some detail here.
https://github.com/code-yeongyu/oh-my-openagent/blob/dev/doc...
For each agent they define, they change the prompt depending on which model is being used to fit it.
I wonder how much of the "x did worse than y for the same prompt" tests could be improved if the prompts were actually tailored to what the model is good at.
I also wonder if any of this matters or if it's all a crock of bologna..
i think it may matter a good bit. i definitely have to write in different styles with different models (and catch myself doing so unintentionally) now that you mention it...
Fwiw I run this eval every week on a set of known prompts, and I believe the within-model differences are bigger than the between-model ones.
That is, I get more variance between opus 4.6 and itself across runs than I do between the SOTA models.
I don't have the budget for statistical significance, but I'm convinced that people claiming broad differences are just vibing, or that there are cases where agent features make a big difference.
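The within-vs-between comparison above can be illustrated with a toy calculation; the scores here are invented, not real benchmark numbers:

```python
# Toy illustration of within-model vs between-model variance on
# repeated eval runs. Scores are synthetic, not real benchmark data.
from statistics import mean, pstdev

runs = {
    "model_a": [0.71, 0.78, 0.69, 0.80, 0.74],  # one model, five runs
    "model_b": [0.73, 0.76, 0.70, 0.79, 0.75],
}

# Spread of each model against itself across repeated runs.
within = {name: pstdev(scores) for name, scores in runs.items()}
# Spread of the models' mean scores against each other.
between = pstdev([mean(scores) for scores in runs.values()])

# With these numbers, each model's run-to-run spread is larger than
# the gap between the two models' averages.
print(within, between)
```

If your real scores look like this, a single head-to-head run tells you almost nothing; you need repeated runs before attributing a gap to the model rather than to noise.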
Is that all the science to it?