LM-Kit .NET SDK now supports tool calling for building AI agents in C#. Create on-device agents that discover, invoke, and chain tools with structured JSON schemas, safety policies, and human-in-the-loop controls, all running locally with full privacy. Works with thousands of local models from Mistral, LLaMA, Qwen, Granite, GPT-OSS, and more. Supports all tool calling modes: simple function, multiple function, parallel function, and parallel multiple function. No cloud dependencies, no API costs, complete control over your agent workflows.

Tool Calling for Local AI Agents in C#

2025/10/22 14:07

Tools are one of the fundamental building blocks of agentic AI.

While language models excel at understanding and generating text, tools extend their abilities by letting them interact with the real world: searching the web for current information, executing code for calculations, accessing databases, reading files, or connecting to external services through APIs. Think of tools as the hands and eyes of an AI agent. They transform a conversational system into an agent that can accomplish tasks by bridging the gap between reasoning and action. When an agent needs to check the weather, analyze a spreadsheet, or send an email, it invokes the appropriate tool, receives the result, and incorporates that information into its response. This moves AI beyond pure text generation toward practical, real-world problem solving.

Interested in how agents retain and use context over time? Explore our deep dive on agent memory.

Why Local Agents Have Been Hard

Building AI agents that can actually do things locally has been surprisingly hard. You need:

  • Models that understand when and how to call external functions
  • Privacy without sending data to the cloud
  • A runtime that can parse tool calls, validate arguments, and inject results
  • Model-specific flows because each model has different tool calling formats and interaction patterns, requiring custom logic for interception, result injection, and action ordering
  • Safety controls to prevent infinite loops and runaway costs
  • Clear observability so you know what your agent is doing

Until now, most agentic frameworks forced a choice: powerful cloud-based agents with latency and privacy concerns, or limited local models without proper tool support. Today, that changes.

Why Tool Calling Changes Everything

With LM-Kit's new tool calling capabilities, your local agents can:

  • Ground answers in real data. No more hallucinated weather forecasts or exchange rates. Agents fetch actual API responses and can cite sources.

  • Chain complex workflows. For example: check the weather, convert temperature to the user's preferred units, then suggest activities. All in one conversational turn.

  • Maintain full privacy. Everything runs on-device. Your users' queries, tool arguments, and results never leave their machines.

  • Stay deterministic and safe. Typed schemas, validated inputs, policy controls, and approval hooks prevent agents from going rogue.

  • Scale with your domain. Add business APIs, internal databases, or external MCP catalogs as tools. The model learns to use them from descriptions and schemas alone.

What's New at a Glance

  • State-of-the-art tool calling, right in chatbot flows. Models decide when to call tools, pass structured JSON args, and use results to answer users accurately.
  • Dedicated flow support across model families like Mistral, GPT-OSS, Qwen, Granite, LLaMA, and more, all via one runtime.
  • Three ways to add tools:
      • Implement ITool
      • Annotate methods with [LMFunction]
      • Import catalogs from MCP servers
  • Unified API that runs local SLMs with per-turn policy, guardrails, and events for human-in-the-loop and observability at every stage.
  • All function calling modes supported: Simple Function, Multiple Function, Parallel Function, and Parallel Multiple Function. Choose strict sequencing or safe parallelism per turn.
  • Model-aware tool call flow. Modern SLMs emit structured tool calls. LM-Kit parses calls, routes them to your tools, and feeds results back with correlation and clear result types for a reliable inference path.

How It Works: Getting Started

Here's a complete working example in under 20 lines:

using LMKit.Model;
using LMKit.TextGeneration;
using LMKit.Agents.Tools;
using System.Text.Json;

// 1) Load a local model from the catalog
var model = LM.LoadFromModelID("gptoss:20b"); // OpenAI GPT-OSS 20B

// Optional: confirm tool-calling capability
if (!model.HasToolCalls) { /* choose a different model or fallback */ }

// 2) Create a multi-turn conversation
using var chat = new MultiTurnConversation(model);

// 3) Register tools (see three options below)
chat.Tools.Register(new WeatherTool());

// 4) Shape the behavior per turn
chat.ToolPolicy.Choice = ToolChoice.Auto; // let the model decide
chat.ToolPolicy.MaxCallsPerTurn = 3;      // guard against loops

// 5) Ask a question
var reply = chat.Submit("Plan my weekend and check the weather in Toulouse.");
Console.WriteLine(reply.Content);

The model catalog includes GPT-OSS and many other families. LM.LoadFromModelID lets you pull a named card like gptoss:20b. You can also check HasToolCalls before you rely on tools.

See the Model Catalog documentation for details.

Try it now: GitHub sample

A production-ready console sample demonstrates multi-turn chat with tool calling (currency, weather, unit conversion), per-turn policies, progress feedback, and special commands. Jump to:

Create Multi-Turn Chatbot with Tools in .NET Applications

Three Ways to Add Tools

1) Implement ITool (Full Control)

Best when you need clear contracts and custom validation.

This snippet demonstrates implementing the ITool interface so an LLM can call your tool directly. It declares the tool contract (Name, Description, InputSchema), parses JSON args, runs your logic, and returns structured JSON to the model.

public sealed class WeatherTool : ITool
{
    public string Name => "get_weather";

    public string Description =>
        "Get current weather for a city. Returns temperature, conditions, and optional hourly forecast.";

    // JSON Schema defines expected arguments
    public string InputSchema => """
    {
      "type": "object",
      "properties": {
        "city": {"type": "string", "description": "City name (e.g., 'Paris' or 'New York')"}
      },
      "required": ["city"]
    }
    """;

    public async Task<string> InvokeAsync(string arguments, CancellationToken ct = default)
    {
        // Parse the model's JSON arguments
        var city = JsonDocument.Parse(arguments).RootElement.GetProperty("city").GetString();

        // Call your weather API
        var weatherData = await FetchWeatherAsync(city);

        // Return structured JSON the model can understand
        var result = new { city, temp_c = weatherData.Temp, conditions = weatherData.Conditions };
        return JsonSerializer.Serialize(result);
    }
}

// Register it
chat.Tools.Register(new WeatherTool());

Why use ITool? Complete control over validation, async execution, error handling, and result formatting.

2) Annotate Methods with [LMFunction] (Quick Binding)

Best for rapid prototyping and simple synchronous tools.

What it does: Add [LMFunction(name, description)] to public instance methods. LM-Kit discovers them and exposes each as an ITool, generating a JSON schema from method parameters.

How it's wired: Reflect and bind with LMFunctionToolBinder.FromType<MyDomainTools>() (or FromInstance/FromAssembly), then register the resulting tools via chat.Tools.Register(...).

public sealed class MyDomainTools
{
    [LMFunction("search_docs", "Search internal documentation by keyword. Returns top 5 matches.")]
    public string SearchDocs(string query)
    {
        var results = _documentIndex.Search(query).Take(5);
        return JsonSerializer.Serialize(new { hits = results });
    }

    [LMFunction("get_user_info", "Retrieve user profile and preferences.")]
    public string GetUserInfo(int userId)
    {
        var user = _database.GetUser(userId);
        return JsonSerializer.Serialize(user);
    }
}

// Automatically scan and register all annotated methods
var tools = LMFunctionToolBinder.FromType<MyDomainTools>();
chat.Tools.Register(tools);

Why use [LMFunction]? Less boilerplate. The binder generates schemas from parameter types and registers everything in one line.

3) Import MCP Catalogs (External Services)

Best for connecting to third-party tool ecosystems via the Model Context Protocol.

What it does: Uses McpClient to establish a JSON-RPC session with an MCP server, fetch its tool catalog, and adapt those tools so your agent can call them.

How it's wired: Create new McpClient(uri, httpClient) (optionally set a bearer token), then chat.Tools.Register(mcp, overwrite: false) to import the catalog; LM-Kit manages tools/list, tools/call, retries, and session persistence.

using LMKit.Mcp.Client;

// Connect to an MCP server
var mcp = new McpClient(
    new Uri("https://mcp.example.com/api"),
    new HttpClient());

// Import all available tools from the server
int toolCount = chat.Tools.Register(mcp, overwrite: false);
Console.WriteLine($"Imported {toolCount} tools from MCP server");

Why use MCP? Instant access to curated tool catalogs. The server handles tools/list and tools/call over JSON-RPC; LM-Kit validates schemas locally.
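For context, the Model Context Protocol runs over JSON-RPC 2.0. Below is a rough sketch of the envelopes McpClient exchanges on your behalf; you never build these by hand, and the tool name and arguments shown are illustrative:

```csharp
using System;
using System.Text.Json;

// JSON-RPC 2.0 request to enumerate the server's tool catalog.
var listRequest = JsonSerializer.Serialize(new
{
    jsonrpc = "2.0",
    id = 1,
    method = "tools/list"
});

// JSON-RPC 2.0 request to invoke one tool from that catalog.
// "params.arguments" must match the tool's declared input schema.
var callRequest = JsonSerializer.Serialize(new
{
    jsonrpc = "2.0",
    id = 2,
    method = "tools/call",
    @params = new
    {
        name = "get_weather",                 // illustrative tool name
        arguments = new { city = "Toulouse" } // illustrative arguments
    }
});

Console.WriteLine(listRequest);
Console.WriteLine(callRequest);
```

Knowing the wire shape is mainly useful for debugging proxies or server logs; in application code you stay at the `chat.Tools.Register(mcp)` level.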

See McpClient documentation.

Execution Modes That Match Your Workflow

Choose the right policy for each conversational turn:

Simple Function

One tool, one answer.

chat.ToolPolicy.MaxCallsPerTurn = 1;
chat.ToolPolicy.Choice = ToolChoice.Required; // force at least one call

Example: "What is the weather in Tokyo?" calls get_weather once and answers.

Multiple Function

Chain tools sequentially.

chat.ToolPolicy.MaxCallsPerTurn = 5;
chat.ToolPolicy.Choice = ToolChoice.Auto;

Example: "Convert 75°F to Celsius, then tell me if I need a jacket."

  1. Calls convert_temperature(75, "F", "C") and gets 23.9°C
  2. Calls get_weather("current_location") and gets conditions
  3. Synthesizes answer: "It is 24°C and sunny. A light jacket should be fine."
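The conversion step in a chain like this is exactly the kind of deterministic logic worth pushing into a tool. Here is a minimal, self-contained sketch of what a hypothetical convert_temperature tool might compute internally; the function name and supported units are illustrative, not LM-Kit API:

```csharp
using System;

// Pure, deterministic conversion logic the model never has to compute itself.
static double ConvertTemperature(double value, string from, string to) =>
    (from.ToUpperInvariant(), to.ToUpperInvariant()) switch
    {
        ("F", "C") => (value - 32) * 5.0 / 9.0,
        ("C", "F") => value * 9.0 / 5.0 + 32,
        ("C", "K") => value + 273.15,
        ("K", "C") => value - 273.15,
        _ => throw new ArgumentException($"Unsupported conversion: {from} -> {to}")
    };

Console.WriteLine($"{ConvertTemperature(75, "F", "C"):F1}"); // ≈ 23.9 °C
```

Offloading arithmetic like this keeps answers exact and lets the model focus on synthesis ("do I need a jacket?") rather than computation.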

Parallel Function

Execute multiple tools concurrently.

chat.ToolPolicy.AllowParallelCalls = true;
chat.ToolPolicy.MaxCallsPerTurn = 10;

Example: "Compare weather in Paris, London, and Berlin."

  • Calls get_weather("Paris"), get_weather("London"), get_weather("Berlin") simultaneously
  • Waits for all results, compares, and answers

Only enable if your tools are idempotent and thread-safe.

Parallel Multiple Function

Combine chaining and parallelism.

Example: "Check weather in 3 cities, convert all temps to Fahrenheit, and recommend which to visit."

  1. Parallel: fetches weather for 3 cities
  2. Parallel: converts all temperatures
  3. Sequential: recommends based on results
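Assuming the same ToolPolicy properties shown in the earlier modes, this combined behavior could be configured as:

```csharp
chat.ToolPolicy.AllowParallelCalls = true; // fan out the independent fetches
chat.ToolPolicy.MaxCallsPerTurn = 10;      // headroom for all chained calls
chat.ToolPolicy.Choice = ToolChoice.Auto;  // the model sequences the stages
```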

See ToolCallPolicy documentation for all options including ToolChoice.Specific and ForcedToolName. Defaults are conservative: parallel off, max calls capped.

Safety, Control, and Observability

Policy Controls

Configure safe defaults and per-turn limits. See ToolCallPolicy documentation.

chat.ToolPolicy = new ToolCallPolicy
{
    Choice = ToolChoice.Auto,   // let model decide
    MaxCallsPerTurn = 3,        // prevent runaway loops
    AllowParallelCalls = false, // safe default: sequential only

    // Optional: force a specific tool first
    // Choice = ToolChoice.Specific,
    // ForcedToolName = "authenticate_user"
};

Human in the Loop

Review, approve, or block tool execution. Hooks: BeforeToolInvocation, AfterToolInvocation, BeforeTokenSampling, MemoryRecall.

// Approve tool calls before execution
chat.BeforeToolInvocation += (sender, e) =>
{
    Console.WriteLine($"About to call: {e.ToolCall.Name}");
    Console.WriteLine($"  Arguments: {e.ToolCall.ArgumentsJson}");

    // Block sensitive operations
    if (e.ToolCall.Name == "delete_user" && !UserHasApproved())
    {
        e.Cancel = true;
        Console.WriteLine("  Blocked by policy");
    }
};

// Audit results after execution
chat.AfterToolInvocation += (sender, e) =>
{
    var result = e.ToolCallResult;
    Console.WriteLine($"{result.ToolName} completed");
    Console.WriteLine($"  Status: {result.Type}");
    Console.WriteLine($"  Result: {result.ResultJson}");
    _telemetry.LogToolCall(result); // send to monitoring
};

// Override token sampling in real time
chat.BeforeTokenSampling += (sender, e) =>
{
    if (_needsDeterministicOutput)
        e.Sampling.Temperature = 0.1f;
};

// Control memory injection
chat.MemoryRecall += (sender, e) =>
{
    Console.WriteLine($"Injecting memory: {e.Text.Substring(0, 50)}...");
    // e.Cancel = true; // optionally cancel
};

Structured Data Flow

Every call flows through a typed pipeline for reproducibility and clear logs.

  • Incoming: ToolCall with stable Id and ArgumentsJson.
  • Outgoing: ToolCallResult with ToolCallId, ToolName, ResultJson, and Type (Success or Error).
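To make the correlation concrete, here is a self-contained sketch using stand-in shapes rather than LM-Kit's actual type definitions: the call's Id is echoed back as the result's ToolCallId, which is what lets you pair requests with responses in logs.

```csharp
using System;
using System.Collections.Generic;

// Pending calls keyed by the model-issued Id.
var pending = new Dictionary<string, (string Name, string ArgumentsJson)>();

// Incoming: the model emits a call with a stable Id and JSON arguments.
pending["call_001"] = ("get_weather", """{"city":"Toulouse"}""");

// Outgoing: the tool returns a result correlated by ToolCallId.
var result = (ToolCallId: "call_001", ToolName: "get_weather",
              ResultJson: """{"temp_c":21}""", Type: "Success");

// Correlate for reproducible, auditable logs.
bool matched = pending.Remove(result.ToolCallId, out var origin);
Console.WriteLine($"{origin.Name} -> {result.Type}: {result.ResultJson}");
```

The same pairing discipline is what makes parallel calls safe to audit: each result carries enough identity to find its originating call regardless of completion order.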

Try It: Multi-Turn Chat Sample

Create Multi-Turn Chatbot with Tools in .NET Applications

Purpose: Demonstrates LM-Kit.NET's agentic tool-calling: during a conversation, the model can decide to call one or multiple tools to fetch data or run computations, pass JSON arguments that match each tool's InputSchema, and use each tool's JSON result to produce a grounded reply while preserving full multi-turn context. Tools implement ITool and are managed by a registry; per-turn behavior is shaped via ToolChoice.

Why tools in chatbots?

  • Reliable, source-backed answers (weather, FX, conversions, business APIs).
  • Agentic chaining: call several tools in one turn and combine results.
  • Determinism and safety: typed schemas, clear failure modes, policy control.
  • Extensibility: implement ITool for domain logic; keep code auditable.
  • Efficiency: offload math/lookup to tools; keep the model focused on reasoning.

Target audience: Product and platform teams; DevOps and internal tools; B2B apps; educators and demos.

Problem solved: Actionable answers, deterministic conversions/quotes, multi-turn memory, easy extensibility.

Sample app:

  • Lets you choose a local model (or a custom URI)
  • Registers three tools (currency, weather, unit conversion)
  • Runs a multi-turn chat where the model decides when to call tools
  • Prints generation stats (tokens, stop reason, speed, context usage)

Key features:

  • Tool calling via JSON arguments
  • Full dialogue memory
  • Progress feedback (download/load bars)
  • Special commands: /reset, /continue, /regenerate
  • Multiple tool calls per turn (and across turns)

Built-in Tools

| Tool name | Purpose | Online? | Notes |
|----|----|----|----|
| convert_currency | ECB rates via Frankfurter (latest or historical) plus optional trend | Yes | No API key; business days; rounding and date support |
| get_weather | Open-Meteo current weather plus optional short hourly forecast | Yes | No API key; geocoding plus metric/us/si |
| convert_units | Offline conversions (length, mass, temperature, speed, etc.) | No | Temperature is non-linear; can list supported units |

Tools implement ITool: Name, Description, InputSchema (JSON Schema), and InvokeAsync(string json) returning JSON.

Extend with your own tool:

chat.Tools.Register(new MyCustomTool()); // implements ITool

Use unique, stable, lowercase snake_case names.
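A quick way to enforce that convention is a guard before registration. IsValidToolName below is a hypothetical helper, not part of the SDK:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical guard: lowercase snake_case, starting with a letter.
static bool IsValidToolName(string name) =>
    Regex.IsMatch(name, "^[a-z][a-z0-9_]*$");

Console.WriteLine(IsValidToolName("get_weather")); // True
Console.WriteLine(IsValidToolName("GetWeather"));  // False
Console.WriteLine(IsValidToolName("get weather")); // False
```

Stable names matter because the model learns tool usage from the name, description, and schema; renaming a tool mid-deployment silently invalidates everything the prompt history taught it.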

Supported Models (Pick per Hardware)

  • Mistral Nemo 2407 12.2B (around 7.7 GB VRAM)
  • Meta Llama 3.1 8B (around 6 GB VRAM)
  • Google Gemma 3 4B Medium (around 4 GB VRAM)
  • Microsoft Phi-4 Mini 3.82B (around 3.3 GB VRAM)
  • Alibaba Qwen-3 8B (around 5.6 GB VRAM)
  • Microsoft Phi-4 14.7B (around 11 GB VRAM)
  • IBM Granite 4 7B (around 6 GB VRAM)
  • OpenAI GPT-OSS 20B (around 16 GB VRAM)
  • Or provide a custom model URI (GGUF/LMK)

Commands

  • /reset - clear conversation
  • /continue - continue last assistant message
  • /regenerate - new answer for last user input

Example Prompts

  • "Convert 125 USD to EUR and show a 7-day trend."
  • "Weather in Toulouse next 6 hours (metric)."
  • "Convert 65 mph to km/h." / "List pressure units."
  • "Now 75 °F to °C, then 2 km to miles."

Behavior and Policies (Quick Reference)

  • Tool selection policy: By default the sample lets the model decide (ToolChoice.Auto). You can Require / Forbid / Force a specific tool per turn.
  • Multiple tool calls: Supports several tool invocations per turn; outputs are injected back into context.
  • Schemas matter: Precise InputSchema plus concise Description improve argument construction.
  • Networking: Currency and weather require internet; unit conversion is offline.
  • Errors: Clear exceptions for invalid inputs (units, dates, locations).

Getting Started

Prerequisites: .NET Framework 4.6.2 or .NET 6.0

Download:

git clone https://github.com/LM-Kit/lm-kit-net-samples.git
cd lm-kit-net-samples/console_net/multi_turn_chat_with_tools

Run:

dotnet build
dotnet run

Then pick a model or paste a custom URI. Chat naturally; the assistant will call one or multiple tools as needed. Use /reset, /continue, /regenerate anytime.

Project link: GitHub Repository

Complete Example: All Three Integration Paths

// Load a capable local model
var model = LM.LoadFromModelID("gptoss:20b");
using var chat = new MultiTurnConversation(model);

// 1) ITool implementation
chat.Tools.Register(new WeatherTool());

// 2) LMFunctionAttribute methods
var tools = LMFunctionToolBinder.FromType<MyDomainTools>();
chat.Tools.Register(tools);

// 3) MCP import
var mcp = new McpClient(new Uri("https://mcp.example/api"), new HttpClient());
chat.Tools.Register(mcp);

// Safety and behavior
chat.ToolPolicy = new ToolCallPolicy
{
    Choice = ToolChoice.Auto,
    MaxCallsPerTurn = 3,
    // AllowParallelCalls = true // enable only if tools are idempotent
};

// Human-in-the-loop
chat.BeforeToolInvocation += (_, e) => { /* approve or cancel */ };
chat.AfterToolInvocation += (_, e) => { /* log results */ };

// Run
var answer = chat.Submit("Find 3 relevant docs for 'safety policy' and summarize.");
Console.WriteLine(answer.Content);

Why Go Local with LM-Kit?

vs. Cloud Agent Frameworks

  • Zero API costs: No per-token charges. Run unlimited conversations.
  • Complete privacy: User data never leaves the device. GDPR/HIPAA friendly.
  • Sub-100ms latency: Local inference eliminates network roundtrips entirely.
  • Works offline: Agents function without internet connectivity.
  • No rate limits: Scale to millions of requests without throttling.
  • Full control: Own the stack. No vendor lock-in or API deprecations.

vs. Basic Prompt Engineering

  • Type-safe schemas: JSON Schema validation catches bad arguments before execution.
  • Deterministic results: Clear success/error states, not fragile regex parsing.
  • Parallel execution: Run multiple tools concurrently when safe.
  • Full observability: Structured events at every stage, not log archaeology.
  • Testable contracts: Mock tools, inject results, replay conversations.
  • Error boundaries: Graceful failures with retry logic and fallbacks.
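As an illustration of the retry-with-fallback pattern (this wrapper is a sketch, not an LM-Kit API): bounded attempts with exponential backoff, then a structured error payload the model can still reason about instead of an unhandled exception.

```csharp
using System;
using System.Threading.Tasks;

// Wrap any tool invocation delegate in bounded retries with backoff.
static async Task<string> InvokeWithRetryAsync(
    Func<Task<string>> invoke, int maxAttempts = 3, int baseDelayMs = 200)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            return await invoke();
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            // Back off: 200 ms, 400 ms, 800 ms, ...
            await Task.Delay(baseDelayMs * (1 << (attempt - 1)));
        }
        catch (Exception ex)
        {
            // Final failure: return structured JSON, not an exception.
            return $$"""{"error": "{{ex.Message}}", "attempts": {{attempt}}}""";
        }
    }
}

// Usage sketch: two simulated failures, then success on the third attempt.
int calls = 0;
var result = await InvokeWithRetryAsync(() =>
{
    calls++;
    if (calls < 3) throw new TimeoutException("upstream timeout");
    return Task.FromResult("""{"temp_c": 21}""");
});
Console.WriteLine($"{result} after {calls} attempts");
```

Returning an error object rather than throwing keeps the turn alive: the model sees the failure as data and can apologize, retry with different arguments, or pick another tool.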

vs. Manual Function Calling

  • Model decides: Agent autonomously picks tools and arguments, no brittle if/else chains.
  • Auto-chaining: Multiple tool calls per turn, results fed back automatically.
  • 90% less boilerplate: Register tools once, not per-model or per-prompt.
  • Built-in safety: Loop prevention, max-calls limits, approval hooks out of the box.
  • Model-agnostic API: Same code works across Mistral, LLaMA, Qwen, Granite, GPT-OSS.
  • Progressive enhancement: Add tools without refactoring conversation logic.

Performance and Limitations

Performance Expectations

  • Tool invocation overhead: Around 2 to 5 ms per call (parsing plus validation)
  • Network tools: 50 to 500 ms depending on API
  • Local tools: Less than 1 ms
  • Model inference remains the primary latency factor.

Requirements

  • Models must support tool calling (check HasToolCalls).
  • Network-dependent tools require internet connectivity.
  • Parallel execution requires thread-safe, idempotent tools.
  • Recommended GPU memory: 6 to 16 GB VRAM depending on model size.

Known Limitations

  • Tool selection quality depends on clear descriptions and schemas.
  • Complex nested objects in arguments may confuse smaller models.
  • Very long tool chains (more than 10 calls) may exceed context windows.

Ready to Build?

  1. Clone the sample

git clone https://github.com/LM-Kit/lm-kit-net-samples.git
cd lm-kit-net-samples/console_net/multi_turn_chat_with_tools

  2. Pick your integration approach
  • Need full control? Use ITool
  • Prototyping quickly? Use [LMFunction]
  • Using external catalogs? Use McpClient
  3. Add your domain logic. Replace demo tools with your APIs, databases, or business logic.
  4. Set policies that fit your use case
  • Simple lookups: MaxCallsPerTurn = 1
  • Complex workflows: MaxCallsPerTurn = 10 with approval hooks
  5. Ship agents that actually work: on-device, private, reliable, observable.

Start building agentic workflows that respect user privacy, run anywhere, and stay under your control.

