Important! Steps to Building an Agent
Last updated: 2024-07-01

GPTBots is an AI application (Agent) development platform where you can create any Agent tailored to your needs without any coding.

GPTBots is simple and easy to use, so the biggest challenge for users lies elsewhere: how to create a usable Agent that meets their needs.

That is why the thought process behind the design is crucial.

This article will guide you step-by-step to think through and build your own Agent.

Thought Process

The approach to designing an Agent is generally similar to designing a product.

However, we recommend treating the Agent as a human. This helps us better project real-life problems onto the Agent for it to solve.

Scenario

First, you need to determine the scenario, i.e., what problems do you want this Agent to solve?

We usually use "5W1H" to describe a scenario:

Where, When, Who, Why, How, What.

For example:

At home (Where), while shopping online (When), the buyer (Who), because they don't know how to return a product (Why), wants to make a call (How), to consult the online store's customer service (What).

This is a typical user scenario for an e-commerce customer service Agent.
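To keep scenarios comparable when you list several of them, it can help to write each one down in the same six fields. The following is a minimal sketch, assuming you track scenarios outside the platform; the `Scenario` class and its field names are hypothetical and are not part of GPTBots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One 5W1H user scenario (hypothetical helper, not a GPTBots schema)."""
    where: str
    when: str
    who: str
    why: str
    how: str
    what: str

    def describe(self) -> str:
        # Join the six answers into one sentence, in the order used in the example.
        return (f"{self.where}, {self.when}, {self.who}, because {self.why}, "
                f"wants to {self.how}, to {self.what}.")


return_scenario = Scenario(
    where="At home",
    when="while shopping online",
    who="the buyer",
    why="they don't know how to return a product",
    how="make a call",
    what="consult the online store's customer service",
)
print(return_scenario.describe())
```

If one of the six fields is hard to fill in, the scenario is probably not yet specific enough.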

We can use this as a reference to list multiple scenarios for the Agent, such as:

  • Not knowing how to return a product
  • Not knowing how to get a refund
  • Not knowing how to exchange a product
  • Not knowing how to file a complaint
  • Not knowing how to use the product
  • Looking for other ways to contact the platform
  • ...

Generally, we do not recommend having one Agent cover too broad a range of issues. The more vertical, or focused, the Agent's scenarios are, the more effectively it can solve problems.

Like humans, it needs to "specialize in a particular field."

In the above example, the listed scenarios are all related to "e-commerce customer service," which helps to clearly establish the image of an e-commerce customer service representative.

At the same time, it is best not to include scenarios unrelated to "e-commerce customer service" in this Agent, such as loan recommendations or product recommendations.

Positioning

After listing the scenarios, we can position the Agent clearly.

In the above example, it is an "e-commerce customer service" Agent that can answer some questions about the e-commerce platform for users and handle tasks such as returns and exchanges.

Resources

Once the positioning is clear, we need to prepare the resources the Agent will use. They fall into two main categories: knowledge and tools.

Knowledge

Knowledge, as the name implies, is what the Agent needs to learn and understand, usually in the form of documents.

Even humans need to learn the corresponding business knowledge before handling work tasks. The Agent is the same.

The Agent will respond to user questions based on the knowledge it has.

Take the return and exchange process of an e-commerce platform as an example. If this knowledge is not provided to the Agent, the Agent will not know how to respond. In extreme cases, it may even produce "hallucinations" and give random answers.

You need to prepare these knowledge documents based on the Agent's scenarios and positioning.

For example, if you want the Agent to be able to answer questions about returns and exchanges on the e-commerce platform, then you need to prepare relevant documents on the rules and regulations of returns and exchanges for the e-commerce platform.

Tools

Tools are the instruments an Agent needs to execute tasks.

For an Agent, tools are essentially "APIs." By providing APIs to the Agent as tools, the Agent can call these APIs at appropriate times to perform certain tasks.

For example, if a user on an e-commerce platform wants to return a product, the Agent developer can provide the return API to the Agent. The Agent can then call this API when the user wants to return a product, directly completing the return process for the user.

You need to prepare these tools (APIs) based on the Agent's scenario and positioning, such as return APIs, exchange APIs, ticket submission APIs, etc.
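To make the idea of "an API provided as a tool" more concrete, here is a minimal sketch, assuming a simple in-process registry. The `return_order` function, the `TOOLS` registry, and `call_tool` are all hypothetical names for illustration; this is not how GPTBots defines or invokes tools, and a real tool would send an HTTP request to your return API instead of returning a fixed dictionary.

```python
from typing import Any, Callable, Dict


def return_order(order_id: str, reason: str) -> Dict[str, Any]:
    """Stand-in for a real return API (hypothetical; no network call is made)."""
    return {"order_id": order_id, "status": "return_requested", "reason": reason}


# Hypothetical registry: the description tells the model when the tool applies,
# and the parameter types let us validate the arguments the model chooses.
TOOLS: Dict[str, Dict[str, Any]] = {
    "return_order": {
        "description": "Start a return for an order the user wants to send back.",
        "parameters": {"order_id": str, "reason": str},
        "handler": return_order,
    },
}


def call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Check the arguments against the declared parameters, then run the tool."""
    tool = TOOLS[name]
    for param, expected_type in tool["parameters"].items():
        if not isinstance(arguments.get(param), expected_type):
            raise ValueError(f"missing or invalid argument: {param}")
    handler: Callable[..., Dict[str, Any]] = tool["handler"]
    return handler(**arguments)


result = call_tool("return_order", {"order_id": "A1001", "reason": "wrong size"})
print(result["status"])
```

The point of the sketch is the shape of the preparation work: each API needs a clear name, a description of when to use it, and well-defined parameters before it can be handed to an Agent.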

Design

After clarifying the ideas and preparing the resources, we can proceed to the formal design phase of the Agent.

Identity Prompt: Positioning the Agent

Writing a reasonable identity prompt is the first step in creating an excellent Agent.

In the identity prompt, we need to define the Agent's role, skills, tasks, constraints, etc., so that it clearly understands its responsibilities and performs better.
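As an illustration of those four parts, the sketch below assembles a prompt from a role, skills, tasks, and constraints. The function name, the section titles, and the sample wording are assumptions made for this example; GPTBots does not require this layout, and you would normally type the prompt directly into the platform.

```python
from typing import List


def build_identity_prompt(role: str, skills: List[str],
                          tasks: List[str], constraints: List[str]) -> str:
    """Join the four parts into one prompt (section titles are illustrative)."""
    def bullets(items: List[str]) -> str:
        return "\n".join(f"- {item}" for item in items)

    sections = [
        ("Role", role),
        ("Skills", bullets(skills)),
        ("Tasks", bullets(tasks)),
        ("Constraints", bullets(constraints)),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


identity_prompt = build_identity_prompt(
    role="You are a customer service representative for an e-commerce platform.",
    skills=["Explain the return and exchange rules", "Start a return for the user"],
    tasks=["Answer questions about returns, refunds, exchanges, and complaints"],
    constraints=["Do not give loan or product recommendations"],
)
print(identity_prompt)
```

Note how the constraints mirror the scoping decision made earlier: scenarios excluded from the Agent's positioning are stated explicitly.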

Read More: How to Write Effective and Powerful Identity Prompts

Context Allocation: Allowing Agents to Handle Appropriate Information Within Limited Space

The Agent is based on a Large Language Model (LLM), and the context length of an LLM represents the amount of information it can handle, which is limited.

People differ in the same way when reading articles: some can read many articles in a short time and still understand and process the content, while others can read only a few in the same time and may not fully process even those.

The context length of an LLM is similar to the number of articles a person can read and process in a short time.

Therefore, within a limited context, providing appropriate information to the LLM and minimizing the provision of irrelevant information can help the LLM produce better results.
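The budgeting idea can be sketched as follows. This is a simplified illustration under stated assumptions: tokens are estimated as whitespace-separated words (real tokenizers count differently), the limit of 40 tokens is deliberately tiny, and the knowledge snippets are made-up sample text. None of the names below come from GPTBots.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def fit_to_budget(chunks: List[str], budget: int) -> List[str]:
    """Keep chunks in priority order until the next one would exceed the budget."""
    kept: List[str] = []
    used = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


CONTEXT_LIMIT = 40        # hypothetical model limit, in estimated tokens
RESERVED_FOR_REPLY = 10   # space left free for the model's answer

identity_prompt = "You are an e-commerce customer service agent."
# Made-up snippets, already ordered from most to least relevant.
ranked_chunks = [
    "Returns are accepted within 30 days of delivery.",
    "Refunds are issued to the original payment method.",
    "Our company was founded in 2010 and has offices in several countries.",
]

budget = CONTEXT_LIMIT - RESERVED_FOR_REPLY - estimate_tokens(identity_prompt)
kept = fit_to_budget(ranked_chunks, budget)
print(budget, len(kept))
```

With these numbers the two relevant snippets fit and the unrelated company history is dropped, which is the outcome you want: the limited space goes to the information most likely to help.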

Read More: How to Reasonably Allocate LLM Context

Knowledge Base: Teaching the Agent Knowledge

Upload your prepared knowledge documents into the knowledge base. When a user asks a question, the Agent first uses the query to search the knowledge base and find document content that is semantically related to the question. The LLM then uses that content as a reference to summarize and provide an answer to the user.
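The lookup step described above can be sketched in a few lines. Note the substitution: real knowledge bases typically compare embedding vectors to find semantically related content, whereas this sketch uses plain word-count vectors and cosine similarity as a stand-in, so it only matches on shared words. The function names and the sample snippets are made up for the example and do not reflect GPTBots internals.

```python
import math
import re
from collections import Counter
from typing import List


def vectorize(text: str) -> Counter:
    # Word counts stand in for an embedding vector in this sketch.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k chunks most similar to the query."""
    query_vector = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, vectorize(c)),
                    reverse=True)
    return ranked[:top_k]


chunks = [
    "To return a product, open your order and choose Request Return.",
    "Shipping usually takes three to five business days.",
    "To file a complaint, contact support through the help center.",
]
best = retrieve("How do I return a product?", chunks)
print(best[0])
```

The retrieved snippet would then be placed in the LLM's context alongside the user's question, so the quality of the answer depends directly on how well the documents are written and split.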

However, simply uploading knowledge documents into the Agent's knowledge base is not always the best way to use that knowledge; this depends on how the RAG (retrieval-augmented generation) framework and the LLM handle tasks. The following article can help you design the method that best suits your scenario:

Read More: How to Operate the Knowledge Base

Tools: Enabling Agents to Perform Tasks

In the Tools module, you can package your prepared APIs as Tools and add them to the Agent, enabling the Agent to execute these APIs.

Read More: How to Create Tools

GPTBots also provides a large number of ready-to-use OpenTools, which you can add directly to the Agent according to your needs, with no additional development required.

Flow: Suitable for Agents with Complex Processes

If the Agent logic you need to design is complex, involving numerous steps, processes, or conditional judgments, then Flow is an excellent choice. It allows you to freely configure the Agent's workflow on a canvas by dragging and dropping.

Read More: About Flow

Memory: Enabling the Agent to Continuously Handle Tasks

In a conversation between people, we need to remember what was said earlier in order to continue the topic and keep the conversation going.

The memory of an Agent works similarly.

You can determine whether to use memory and how to use it based on the Agent's service scenario. For example:

  • If the Agent involves many rounds of conversation to solve a problem (such as in-depth discussions), you can enable "long-term memory."
  • If the Agent can solve the problem in just a few rounds of conversation (such as online consultations), you can enable only "short-term memory" without the need for "long-term memory."
  • If the Agent's main task can be completed in one go (such as writing an email), then there is no need to enable any memory.
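One common way to think about short-term memory is as a sliding window that keeps only the most recent rounds of conversation. The sketch below illustrates that idea; it is an assumption for teaching purposes, not a description of how GPTBots implements its memory settings, and the `ShortTermMemory` class is hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class ShortTermMemory:
    """Keeps only the most recent rounds of a conversation (illustrative)."""

    def __init__(self, max_rounds: int) -> None:
        # A deque with maxlen silently drops the oldest round when full.
        self._rounds: Deque[Tuple[str, str]] = deque(maxlen=max_rounds)

    def add_round(self, user_message: str, agent_reply: str) -> None:
        self._rounds.append((user_message, agent_reply))

    def as_context(self) -> List[str]:
        """Render the remembered rounds as lines to place in the LLM context."""
        lines: List[str] = []
        for user_message, agent_reply in self._rounds:
            lines.append(f"User: {user_message}")
            lines.append(f"Agent: {agent_reply}")
        return lines


memory = ShortTermMemory(max_rounds=2)
memory.add_round("I want to return my shoes.", "Sure, what is the order number?")
memory.add_round("A1001.", "Thanks, why are you returning them?")
memory.add_round("Wrong size.", "Understood, the return is requested.")
print(memory.as_context())
```

After the third round, the first round has been dropped from the window. Setting `max_rounds=0` models the "no memory" case, where every request is handled on its own.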

At the same time, you can also provide user attributes to the Agent as memory, allowing the Agent to understand some information about the user to provide better personalized services.

For example, consider a hotel's room service Agent. By the time it interacts with a guest, it has already obtained the guest's name, room number, and other details through user attributes. If the guest then requests meal delivery, the Agent does not need to ask for the name and room number again. It only needs to ask what the guest would like, because it already knows which room to deliver the meal to.
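The hotel example boils down to asking only for details that are not already known. A minimal sketch of that check follows; the attribute names, the sample guest data, and the `missing_details` helper are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def missing_details(required: List[str], known: Dict[str, str]) -> List[str]:
    """List the required details that user attributes do not already supply."""
    return [field for field in required if not known.get(field)]


# Made-up user attributes supplied to the Agent before the conversation starts.
user_attributes = {"name": "Guest A", "room_number": "1204"}

# Details a meal delivery request needs (an assumption for this example).
required_for_meal_delivery = ["name", "room_number", "meal"]

to_ask = missing_details(required_for_meal_delivery, user_attributes)
print(to_ask)  # ['meal']
```

Without user attributes, all three details would have to be collected in the conversation, which makes the interaction longer and less personal.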

Read More: About Memory and User Attributes

Release Usage: Where Agents Are Used

Based on your specific use case, decide where the Agent needs to be integrated in order to serve your users.

Integrate it wherever the users are.

Read More: Agent Integration

Examples

If you still don't quite understand how to design an Agent after reading the above content, we have also prepared a wealth of examples for you.

Simply select any template when creating an Agent, and you will get a working Agent. You can study the parameters already set in the template to better understand how to design an Agent. This is a very good way to learn.