Exploring AI – Prompts: Contracts or Conversations?

Ever downloaded a free AI model to run privately on your own laptop—no cloud, no fees—only to find it acting oddly stubborn, like it only heard your first message and ignored everything after? I dug in to get to the root of the issue.

The odd fixation

Running a local AI sounds appealing: total privacy, works offline, and you can tinker without limits. Tools like LM Studio make it surprisingly easy—just download the app, pick a small model, and start chatting. At first, it feels magical. You ask for a recipe, a story idea, or help debugging code, and it responds intelligently.

Then you try to refine it. You say, “Actually, make it vegetarian” or “No, I meant Python 3, not JavaScript” or even switch topics entirely—“Forget the recipe, help me plan a weekend hike instead.” The response? It politely acknowledges you… and then circles right back to the original idea, as if the new input barely registered. It’s not rude, just oddly fixated, like talking to someone wearing noise-canceling headphones tuned only to your opening sentence.

Why the disconnect?

The disconnect boils down to how these models were built and trained.

Many smaller or older local models are plain “completion engines,” trained only to predict what text comes next. You give them a block of text (the prompt), and they continue it. That’s it. They treat your input like a contract: every detail, constraint, and goal needs to be spelled out up front, because follow-up messages don’t override or deeply integrate with what came before. New information gets tacked on, but the original direction still dominates.
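
You can watch this firsthand by scripting the model through LM Studio’s built-in local server, which mimics the OpenAI API. A minimal sketch in Python, assuming the server is running on its default port (1234) and using a placeholder model name for whatever you’ve loaded:

```python
from openai import OpenAI

# LM Studio's local server mimics the OpenAI API (default port 1234);
# it ignores the API key, so any string works.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# A completion model has no concept of turns: it just continues this text.
response = client.completions.create(
    model="local-model",  # placeholder for whatever model you've loaded
    prompt="Give me a quick dinner recipe.",
    max_tokens=300,
)
print(response.choices[0].text)
```

Notice there are no turns at all: if you want a “follow-up,” you have to splice it into the prompt text yourself, and the opening instruction still anchors whatever comes next.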

Newer models—especially the “instruct” or “chat” variants released in the last couple of years—receive extra training on thousands of back-and-forth conversations. They learn to track context better, adjust on the fly, and treat prompts more like an ongoing discussion. Size helps too: models around 7–13 billion parameters (like recent Llama, Gemma, Phi, or Mistral releases) with proper chat tuning usually feel noticeably more flexible than tiny 1–3 billion ones.
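
The chat endpoint shows the other style. Every turn re-sends the entire conversation, and instruction tuning teaches the model to treat the newest user message as the one that wins. A sketch under the same assumptions (local LM Studio server, placeholder model name):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Every turn re-sends the full history; a chat-tuned model is trained to
# let the latest user message override what came before.
reply = client.chat.completions.create(
    model="local-model",  # placeholder for the chat-tuned model you've loaded
    messages=[
        {"role": "user", "content": "Give me a quick dinner recipe."},
        {"role": "assistant", "content": "Here's a chicken stir-fry: ..."},
        {"role": "user", "content": "Actually, make it vegetarian."},
    ],
)
print(reply.choices[0].message.content)
```

Feed a base model that same transcript as flat text and it will often wander back to the chicken; a chat-tuned model usually pivots cleanly to vegetarian.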

Think of it this way:

Contract style (classic local/small model)  

– All terms laid out in one clear document.  

– Reliable and consistent if you get the wording right.  

– No room for mid-stream negotiation.

Conversation style (modern chat-tuned model)  

– Builds naturally, turn by turn.  

– Adapts to new info and corrections.  

– Feels more human… when the model is up to it.

The fix

If you’re using a small or older local model—or even a cloud one that’s feeling rigid—the safest bet is to write “contract” prompts: pack everything important into a single, self-contained message.

Example:

Instead of:  

First message: “Write a 3-day workout plan.”  

Follow-up: “Actually no running, I hate it. Use swimming.”

Try:  

“Create a 3-day beginner workout plan focused on swimming and bodyweight strength exercises. No running at all. Include warm-ups, sets/reps, and rest recommendations.”

You’ll get a cleaner, more accurate result with far less drift.
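
The same rule applies if you script the model rather than use the chat window: send the whole contract as one message. A minimal sketch, same assumptions as the earlier ones:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Contract style: every constraint lives in one self-contained message,
# so there is no earlier instruction for the model to drift back toward.
contract_prompt = (
    "Create a 3-day beginner workout plan focused on swimming and bodyweight "
    "strength exercises. No running at all. Include warm-ups, sets/reps, and "
    "rest recommendations."
)
reply = client.chat.completions.create(
    model="local-model",  # placeholder for whatever model you've loaded
    messages=[{"role": "user", "content": contract_prompt}],
)
print(reply.choices[0].message.content)
```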

Want the conversational feel? Grab LM Studio (free, simple interface at lmstudio.ai), search for a recent chat-tuned model like “Llama 3.1 8B Instruct,” “Gemma 2 9B,” or “Ministral 8B,” and load it up. Most modern laptops can run these at reasonable speed, and you’ll immediately notice how follow-ups actually stick.

Either way, understanding the difference turns a frustrating quirk into a predictable tool. Match your prompting style to what the model can handle, and you’ll waste a lot less time talking past each other.

Aegisyx

Copyright © Aegisyx