NodeLLM 1.5.0: Putting Security in the Driver's Seat
Shaiju Edakulangara (@eshaiju) • 3 min read

Connect an LLM to real tools, and you quickly realize the risks aren't just theoretical. They look like hallucination loops, drained credit cards, or a single hung request taking down your Node event loop.
NodeLLM 1.5.0 is built to handle those specific failures. We moved safety from "documentation best practices" directly into the runtime.
Many agent frameworks optimize for autonomy. NodeLLM 1.5.0 optimizes for control.
Here’s what changed.
The 30-Second Baseline (Request Timeouts)
Hanging requests kill Node.js apps. If an LLM provider stalls for 3 minutes, it ties up resources your other users need.
In 1.5.0, everything has a timeout. The default is 30 seconds. Whether you’re generating text, creating images, or transcribing audio, the library enforces a limit so your app doesn't hang.
// Global baseline
NodeLLM.configure({ requestTimeout: 15000 });
// Or per-request for heavy tasks
await chat.ask("Detailed analysis...", { requestTimeout: 60000 });
Stopping the Loop (Tool Guard)
The "Tool Calling Loop" is powerful, but dangerous. Sometimes a model gets confused and tries to call a tool 50 times in a row.
We added maxToolCalls. This is a hard limit on the number of turns an LLM can take in a single request. By default, it's set to 5. If the model hasn't solved the problem by then, we cut the execution.
This stops runaway loops and prevents accidental DoS scenarios where a model repeatedly hammers an expensive external API.
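Here's a minimal sketch of tuning the cap. It assumes maxToolCalls is accepted both globally and per-request, mirroring the requestTimeout pattern shown above; check the docs for the exact placement.
// Assumed to mirror the requestTimeout pattern: global default...
NodeLLM.configure({ maxToolCalls: 3 });
// ...with a per-request override for genuinely agentic tasks
await chat.ask("Plan and book the trip...", { maxToolCalls: 10 });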
Human-in-the-loop (Execution Policies)
Sometimes you don't want the AI to just "do it." If an agent tries to delete a record or transfer funds, you need a sanity check.
We introduced Tool Execution Policies. You can set your chat to confirm mode, exposing a hook called onConfirmToolCall.
It acts like a pause button. When the LLM decides to use a tool, execution stops and yields control to you. You can inspect the arguments and decide whether to proceed.
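As a rough sketch, enabling the policy might look like the snippet below. The chat factory and the toolExecutionPolicy option name are assumptions based on the description above, not the library's confirmed API.
// Hypothetical setup: factory and option name are assumptions.
const agent = NodeLLM.chat({
  toolExecutionPolicy: 'confirm', // pause before every tool call
});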
Real-world use cases:
- Admin UI: Auto-approve "Get Status", but wait for a human click before running "Refund Order".
- Slack Approvals: The agent pauses and messages a channel. It only resumes once someone clicks "Approve."
Safe vs. Dangerous checks:
agent.onConfirmToolCall(async (call) => {
  const args = JSON.parse(call.function.arguments);
  // Auto-approve read-only database queries
  if (args.action === 'read') return true;
  // Wait for manual approval for anything else
  return await askHumanForApproval(call);
});
While paused, the agent consumes no extra tokens. It turns the AI from a black box into a collaborator that asks for permission.
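The askHumanForApproval helper is yours to write. One possible shape, purely illustrative, is an in-memory registry of pending approvals that a Slack button handler or admin UI resolves later; every name below is a placeholder.
import { randomUUID } from 'node:crypto';

// Illustrative only: pending approvals keyed by id, resolved by your own UI.
const pending = new Map();

// Replace with a real Slack message, email, or dashboard entry.
function notifyApprovers(request) {
  console.log('Approval needed:', request);
}

function askHumanForApproval(call) {
  return new Promise((resolve) => {
    const id = randomUUID();
    pending.set(id, resolve);
    notifyApprovers({ id, tool: call.function.name, args: call.function.arguments });
  });
}

// Wire this to your "Approve" / "Reject" button handler.
function resolveApproval(id, approved) {
  pending.get(id)?.(approved);
  pending.delete(id);
}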
Cost Protection (Global maxTokens)
Prompt injection isn't just about data theft; it's about wallet theft. A malicious user—or a buggy loop—can trick a model into generating massive amounts of text.
You can now set a Global maxTokens limit. This acts as a circuit breaker for your budget. Even if you forget to limit a specific request in your code, the global config ensures you won't wake up to a $500 bill from a single runaway chat session.
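A minimal sketch, assuming the global cap lives in the same configure call as requestTimeout (the exact placement is an assumption; the value is illustrative):
// Assumed to sit alongside requestTimeout in the global config.
NodeLLM.configure({
  requestTimeout: 15000,
  maxTokens: 2048, // budget circuit breaker: no single response exceeds this
});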
Infrastructure, Not Just Magic
LLMs should be treated as infrastructure. Good infrastructure is predictable, boring, and secure.
NodeLLM 1.5.0 doesn’t make your AI smarter. It makes it safe enough to run in production without constant supervision.
You can grab the update at @node-llm/core@1.5.0. Check out the security docs or the runnable examples to test it out.
Building with NodeLLM? Let me know on GitHub.