Updated: 2026-01-07
[2026-01-07]

Thoughts on Chatbot AI-Agents

I've been thinking a lot about the current era of large language model (LLM) "AI" systems. The companies behind these systems (ChatGPT etc.) are making big promises that their models will soon be intelligent enough to operate as agents: capable of taking general instructions, then going out into the real world and completing tasks on our behalf.

These are big promises, and such agents might actually be useful. Our modern world lets us complete so many tasks easily through a screen, but the process is not as frictionless as speaking a sentence. For example, it is always a slight hassle to buy something from a website you don't use often. You have to log in, navigate the UI, and sometimes re-enter personal information and payment details. Fundamentally this is wasted time that would be nice to hand off to a machine.

The first critical component is smoothly evaluating human input and providing a conversational way to interact with the "AI" agent. We've had similar technology, like Amazon Alexa, but it is clunky because you must speak in a manner the computer can understand. It cannot be as relaxed as a human conversation. For that reason, I have always avoided such systems, as they can be painfully slow to interact with. This interaction system is critical to mass adoption in our everyday life. It must be:

  1. Able to parse instructions given in natural conversation.
  2. Able to provide a smooth conversational system for confirming instructions with the user and gathering required information. (Similar to ordering food over the phone.)
  3. Trusted to accurately carry out the instructions without oversight.

Right now these LLM chatbots appear to be pretty good at requirement #1, but I have not seen a lot of discussion of chatbot user experience, which is critical for #2. Conversation must be seen as a UI/UX paradigm if these systems want to become the "AI for everything" they claim to be. If it is clunky, it will still be as tedious as a conversation with Siri or Alexa.

Requirement #3 is mostly hype at the moment. These AI systems are not yet smart enough to actually connect to the real world in a reliable way. That may change, but I foresee teaching LLMs to interact consistently with every person and interface on earth to be a monumental challenge.

Future Developments I Hope to See

Conversation is UI

I want to see conversation as user interface taken more seriously as a topic of research and experimentation. What current-generation chatbots can do is impressive, but it is an imprecise way of interacting. If speech becomes the predominant way people interact with computers, then the interaction itself needs to be taken seriously.

Development of New API Standards

For chatbot systems to communicate with the outside world we need to give them infrastructure and guardrails. I would like to see further discussion of standardized APIs for interactions between chatbots and services. Ideally those APIs should be general-purpose enough that we can hook them into any AI or any future software system.

Imagine having a standard API language that can be used for ordering food, reserving a hotel, signing up for a class, et cetera. Systems that accept these API requests can provide guardrails by limiting what inputs they accept and restricting agents who send malicious requests. Developing standards and getting them adopted is difficult, but so would be training a GPT model to interact with every piece of software on earth. Our resources can be better spent by limiting the scope of what an AI needs to learn and what it can interact with.

{
  "customer": "magicIDToken",
  "delivery": "yes",
  "items": [
    {
      "type": "pizza",
      "name": "Pepperoni",
      "size": "medium",
      "crust": "regular",
      "toppings": ["pepperoni"],
      "quantity": 2
    }
  ]
}
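
As a rough illustration of the guardrail idea, here is a minimal Python sketch of how a receiving service might check a request like the one above before acting on it. Everything in it is an assumption made for the example: the allow-lists, the quantity limit, and the function names (is_choice, validate_item, validate_order) are hypothetical and not part of any existing standard.

```python
import json

# Hypothetical allow-lists and limits; a real service would publish its own.
ALLOWED_TYPES = {"pizza"}
ALLOWED_SIZES = {"small", "medium", "large"}
ALLOWED_CRUSTS = {"regular", "thin"}
ALLOWED_TOPPINGS = {"pepperoni", "mushroom", "olive"}
MAX_QUANTITY = 10

ORDER_FIELDS = {"customer", "delivery", "items"}
ITEM_FIELDS = {"type", "name", "size", "crust", "toppings", "quantity"}

# The request from the post, as it might arrive over the wire.
SAMPLE_REQUEST = """
{"customer": "magicIDToken", "delivery": "yes",
 "items": [{"type": "pizza", "name": "Pepperoni", "size": "medium",
            "crust": "regular", "toppings": ["pepperoni"], "quantity": 2}]}
"""


def is_choice(value, allowed):
    """True only for a string that appears in the allow-list."""
    return isinstance(value, str) and value in allowed


def validate_item(index, item):
    """Return a list of problems found in one entry of "items"."""
    # Reject fields outside the known schema instead of silently ignoring them.
    if not isinstance(item, dict) or set(item) != ITEM_FIELDS:
        return [f"item {index}: missing or unexpected fields"]
    errors = []
    if not is_choice(item["type"], ALLOWED_TYPES):
        errors.append(f"item {index}: unknown type")
    if not isinstance(item["name"], str) or not item["name"]:
        errors.append(f"item {index}: name must be a non-empty string")
    if not is_choice(item["size"], ALLOWED_SIZES):
        errors.append(f"item {index}: unknown size")
    if not is_choice(item["crust"], ALLOWED_CRUSTS):
        errors.append(f"item {index}: unknown crust")
    toppings = item["toppings"]
    if not isinstance(toppings, list) or not all(
        is_choice(t, ALLOWED_TOPPINGS) for t in toppings
    ):
        errors.append(f"item {index}: unknown toppings")
    quantity = item["quantity"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if (
        isinstance(quantity, bool)
        or not isinstance(quantity, int)
        or not 1 <= quantity <= MAX_QUANTITY
    ):
        errors.append(
            f"item {index}: quantity must be an integer from 1 to {MAX_QUANTITY}"
        )
    return errors


def validate_order(order):
    """Return a list of problems; an empty list means the order is acceptable."""
    if not isinstance(order, dict) or set(order) != ORDER_FIELDS:
        return ["order: missing or unexpected fields"]
    errors = []
    if not isinstance(order["customer"], str) or not order["customer"]:
        errors.append("order: customer must be a non-empty string")
    if order["delivery"] not in ("yes", "no"):
        errors.append("order: delivery must be 'yes' or 'no'")
    items = order["items"]
    if not isinstance(items, list) or not items:
        errors.append("order: items must be a non-empty list")
    else:
        for index, item in enumerate(items):
            errors.extend(validate_item(index, item))
    return errors


if __name__ == "__main__":
    print(validate_order(json.loads(SAMPLE_REQUEST)))
```

For the sample request the function returns an empty list, meaning nothing was rejected. The point of the design is that the service, not the agent, decides what is acceptable: an agent that asks for 500 pizzas or adds a field the service has never heard of gets an error back instead of an order.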

I'm reminded of autonomous vehicles, where we've spent millions of dollars in development and have yet to see them rolled out nationwide. I believe one day they will be, but meanwhile other nations (e.g. South Korea and China) have built out comprehensive transit systems using old-fashioned rails. Real-world solutions may require leveraging existing technologies. Perhaps APIs can be the rails that supercharge AI's ability to interact with the real world.

Free and Open-Source GPT-Models and Agents

I am not particularly interested in adopting these tools into my workflow until I control the software and my data. My understanding is that open-source tools are developing quickly. At some point we will have high-quality open-source models that can be run on a common laptop or smartphone. If we push for common API standards, these open-source models could be as powerful as those managed by Google and OpenAI. That is the AI future I hope for.

Other Concerns

I want to limit this post to discussion of AI agents. The scope of the current AI discussion is large. The water is further muddied by leaders like Sam Altman claiming we are close to reaching an AI superintelligence. I would prefer to focus on what concrete improvements could be made to make a simple GPT chatbot agent more useful.

Given the importance of this topic in our current moment I plan to discuss my additional concerns in more depth in the future. I still have concerns with AI regarding:

  • Environmental impacts.
  • Impact on art, creativity, and learning.
  • Incorrect information, AI hallucinations, and harmful advice.
  • Violation of copyright and copyleft license terms.

For those reasons I generally avoid these tools except for occasional text questions when online searching does not yield good results. Unfortunately, poor search results usually mean there are few good sources on the topic, so the chatbots often give incorrect answers as well. Technology is not magic; it is what it is.

Comments

Email comments to comment@taingram.org.