Building RAG Agents with the AI Agent Manager

Overview

This article demonstrates a no-code approach to creating a RAG-based MCP Server for querying documents such as user manuals and troubleshooting guides, and to integrating that MCP Server with the AI Agent Manager.

We use the Pinecone Assistant for the demonstration. The Pinecone Assistant is a service that allows you to build production-grade chat and agent-based applications quickly.

It provides an MCP Server out of the box.

Introduction

Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model (LLM) so that it references a knowledge base outside its training data before generating a response. RAG extends the already powerful capabilities of LLMs to specific domains or an organization’s internal knowledge base without retraining the model.
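To make the retrieve-then-generate idea concrete, here is a minimal sketch in Python. It is a toy: the documents are invented, and a simple word-overlap score stands in for the vector search that a service like the Pinecone Assistant performs. It only shows the two steps RAG adds before the LLM call: finding relevant snippets and placing them in the prompt.

```python
import re

# Toy knowledge base standing in for indexed manuals (hypothetical content).
DOCUMENTS = [
    "Error 6040 indicates motor overheating. Check the cooling fan and air filter.",
    "Error 5010 indicates low hydraulic pressure. Inspect the pump and hoses.",
    "Replace the air filter every 500 operating hours.",
]

def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Rank documents by the number of tokens they share with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap, then keep the best-scoring ones.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_prompt(query, snippets):
    """Place the retrieved snippets ahead of the question for the LLM."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

question = "What does error 6040 mean?"
print(build_prompt(question, retrieve(question, DOCUMENTS, top_k=1)))
```

The resulting prompt is what would be sent to the LLM; the model itself is never retrained.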

Use cases of RAG in IoT

RAG enables users to integrate manuals and troubleshooting documentation into their agents. The agents can combine the relevant information retrieved from these documents with Cumulocity data such as Measurements, Events, Alarms, and Streaming Analytics data, and then use the LLM to interpret the data and present the results.

An example where this is useful is predictive maintenance. When alarms are received in Cumulocity, the AI agent can also retrieve information from the troubleshooting documents and manuals. These documents can contain remedial measures for the error codes in the alarms, as well as the steps to purchase spare parts when components have reached end of life due to issues like motor overheating.

This is very useful when it comes to proactively maintaining machines to keep them running optimally.
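As a rough sketch of this use case, the Python snippet below combines an alarm with retrieved manual snippets into a single prompt. The alarm is shaped like a Cumulocity alarm object (type, text, severity, source), but its values, the snippet text, and both helper functions are hypothetical and only illustrate the idea.

```python
def alarm_to_query(alarm):
    """Build a manual-search query from the alarm's type and text."""
    return f"{alarm['type']} {alarm['text']}"

def build_maintenance_prompt(alarm, snippets):
    """Combine alarm details and retrieved manual snippets into one prompt."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return (
        f"Alarm on device {alarm['source']['id']} "
        f"(severity {alarm['severity']}): {alarm['text']}\n\n"
        f"Relevant documentation:\n{context}\n\n"
        "Suggest remedial measures and any spare parts to order."
    )

# Hypothetical alarm, shaped like a Cumulocity alarm object.
alarm = {
    "type": "c8y_MotorOverheating",
    "text": "Error code 6040: motor temperature above limit",
    "severity": "MAJOR",
    "source": {"id": "12345"},
}

# In a real agent these snippets would come from the RAG tool.
snippets = ["Error 6040: check the cooling fan and replace the air filter."]

prompt = build_maintenance_prompt(alarm, snippets)
print(prompt)
```

In the Agent Manager this wiring is handled by the agent itself; the sketch only makes visible what information ends up in front of the LLM.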

Step-by-Step guide

Before we dive into the details of creating the agent, here is an introduction to the AI Agent Manager.

Introduction to the Agent Manager

This guide assumes that the Cumulocity Agent Manager is available in your Cumulocity tenant. The Agent Manager enables users to create AI agents with a no-code approach and to connect external tools via MCP Servers. It is currently in private preview; an introduction can be found in the latest What's New webinar, starting at minute 13.
If you want to give the Agent Manager a try, please contact Rahul Talreja from the Cumulocity Product team with details on your use case and your Cumulocity tenant.

Now let’s walk through the steps to create the agent.

Create a Pinecone Assistant

Let’s see how we can use the Pinecone Assistant to create an assistant and use the MCP Server:

Step 1: Sign up for the free Starter tier of Pinecone and log in.
Step 2: Navigate to the Pinecone Assistant.

Step 3: Click on Create an assistant

Enter the name of the assistant. Let's call it demo-assistant.

Now the assistant is successfully created.

Step 4: Add a few documents to the assistant. For this demo, we add the documents via the UI, but documents can also be uploaded via the assistant's Python or JavaScript SDKs or with a curl command (Assistant SDK).

Once the upload is complete, the files are visible in the right-hand pane.

Step 5: The next step is to get the MCP Server URL from the assistant. Click on the Settings icon and extract the MCP URL

Click on the Copy Button next to the MCP URL. In this case the URL is https://prod-1-data.ke.pinecone.io/mcp/assistants/demo-assistant

Step 6: Before integrating the assistant into the AI Agent Manager, we need a Bearer token. A Pinecone API key serves as this token.

Click on the Pinecone icon on the top left. Then click on API Keys

Click on Create API Key and create an API Key.

Integrating with the AI Agent Manager

Let’s now add the MCP Server to the AI Agent Manager and use it in an agent.

Use the following POST call to add the MCP Server to the AI Agent Manager. Add /sse to the Assistant MCP URL:

curl --location '<tenant url>/service/ai/mcp/servers' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic <tenant credentials>' \
--data '{
    "url": "<MCP server url>/sse",
    "name": "Pinecone-mcp",
    "headers": {
        "Authorization": "Bearer <Pinecone API Key>"
    }
}'

Important: Replace <Pinecone API Key> with your Pinecone API key and <MCP server url> with the Pinecone MCP Server URL. Replace <tenant url> and <tenant credentials> with the URL and credentials of the tenant where the AI Agent Manager is running.
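If you prefer to script this call, the sketch below builds the same POST request with Python's standard library. The endpoint path and JSON body are taken from the curl command above; the function name and its parameters are our own, and we assume the Basic credentials are the usual base64-encoded username:password pair.

```python
import base64
import json
import urllib.request

def build_register_request(tenant_url, mcp_url, pinecone_api_key,
                           username, password, name="Pinecone-mcp"):
    """Build (but do not send) the POST request that registers the MCP Server."""
    # Assumption: HTTP Basic auth with base64("username:password").
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    body = {
        # The Assistant MCP URL needs the /sse suffix.
        "url": mcp_url.rstrip("/") + "/sse",
        "name": name,
        "headers": {"Authorization": f"Bearer {pinecone_api_key}"},
    }
    return urllib.request.Request(
        tenant_url.rstrip("/") + "/service/ai/mcp/servers",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )

# Placeholder values; substitute your own tenant, credentials and API key.
request = build_register_request(
    tenant_url="https://example.cumulocity.com",
    mcp_url="https://prod-1-data.ke.pinecone.io/mcp/assistants/demo-assistant",
    pinecone_api_key="<Pinecone API Key>",
    username="<username>",
    password="<password>",
)
print(request.get_method(), request.full_url)
# To send it (needs network access and valid credentials):
# urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything reaches the tenant.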

Now the get_context tool of the Pinecone-mcp server is visible in the AI Agent Manager.

We can test the tool:

In this example, we retrieve the information associated with error code 6040 of a machine. The information consists of common causes, troubleshooting steps, and preventative measures.
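For readers curious about what such a tool test looks like on the wire, MCP tools are invoked with a JSON-RPC 2.0 message using the tools/call method. The sketch below only builds that message. The argument names query and top_k are assumptions on our part; the tool's actual input schema is what the AI Agent Manager shows for get_context.

```python
import json

def build_tool_call(tool_name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 'tools/call' request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# 'query' and 'top_k' are assumed argument names; check the tool's input schema.
tool_request = build_tool_call(
    "get_context",
    {"query": "error code 6040 troubleshooting", "top_k": 3},
)
print(json.dumps(tool_request, indent=2))
```

The AI Agent Manager sends messages of this kind on your behalf, so none of this has to be written by hand when testing in the UI.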

Now that the tool is tested, we can use it in an agent. Create the new agent as follows:

Click on Add agent

Give the agent an appropriate name and a System Prompt. The name must be in lowercase and must not contain spaces. The System Prompt gives the agent its instructions for the task it performs.
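If agent names are generated by a script, the naming rule (lowercase, no spaces) can be checked up front. The helpers below are a small sketch of that check. Replacing whitespace with hyphens in normalize_agent_name is our own assumption, since the source only states the lowercase and no-spaces requirements.

```python
import re

def is_valid_agent_name(name):
    """Return True if the name is non-empty, lowercase and free of whitespace."""
    return bool(name) and name == name.lower() and not re.search(r"\s", name)

def normalize_agent_name(name):
    """Lowercase the name and replace whitespace runs with hyphens (assumed style)."""
    return re.sub(r"\s+", "-", name.strip().lower())

for candidate in ["Troubleshooting Agent", "troubleshooting-agent"]:
    print(candidate, "->", is_valid_agent_name(candidate), normalize_agent_name(candidate))
```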

When the user enters a prompt, the agent uses the System Prompt and the tools that the user has enabled for it to perform the task.

Let's now add the get_context tool to the agent:

We have now selected the get_context tool.

We can now test the agent before saving.

Testing allows you to tune the System Prompt until the agent produces the desired output.

Once the agent testing is done, click on Save.

We have now successfully created an agent that can be used in your applications.

Summary

In this post, we have demonstrated how users can quickly create a RAG AI agent without any coding and use such agents to interface with their data.

References: