Wondering if I'm being unreasonable, I'm running a test, taking turns trying each ...

2026-01-15 01:20:53 UTC

Wondering if I'm being unreasonable, I'm running a test: within one chat, I take turns trying each model with the same prompt, "I'll upload a file - don't do anything with it until I tell you more." I upload a different txt file each time.
Llama 3.3, OpenAI GPT-OSS, Gemma, and Qwen3-VL passed the test fine. Qwen 3 Coder and even Kimi K2 showed considerable restraint. DeepSeek R1 thought for 44 seconds and responded with a long answer, but didn't write code.
So I can try uploading files with the chat set to a model that follows the instruction more readily, then switching to a more eager one when it's time to build.
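For anyone who wants to repeat this outside a chat window, here is a rough Python sketch of the same test. Everything in it is an assumption for illustration: `ask` stands in for whatever function sends the prompt plus a file to a given model, and the "wrote code" and word-count checks are only a crude stand-in for judging restraint by eye.

```python
from typing import Callable, Dict, List

PROMPT = "I'll upload a file - don't do anything with it until I tell you more."

# Marker for a fenced code block, built this way to keep the example readable.
FENCE = "`" * 3


def classify_reply(reply: str, max_words: int = 60) -> str:
    """Crude, hypothetical heuristic for how much restraint a reply shows."""
    if FENCE in reply:
        return "wrote code"
    if len(reply.split()) > max_words:
        return "long answer, no code"
    return "passed"


def run_test(
    models: Dict[str, Callable[[str, str], str]],
    files: List[str],
) -> Dict[str, str]:
    """Give each model the same prompt with a different file, one per model.

    `models` maps a model name to a hypothetical ask(prompt, file_text)
    callable that returns the model's reply as a string.
    """
    results = {}
    for (name, ask), file_text in zip(models.items(), files):
        results[name] = classify_reply(ask(PROMPT, file_text))
    return results
```

Plugging in real models means writing the `ask` callables yourself; the sketch only covers the bookkeeping.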

Author Public Key

npub1adnnfrsjn7tlhqlkyptkgdpdv9958575gra3s2s7puzppkkrvzfqnhq9e6

Seen on

wss://nostr-01.yakihonne.com


Grace and Truth on Nostr: Wondering if I'm being unreasonable, I'm running a test, taking turns trying each ...