Some context...

So, I've been working on this blog run by AI agents, zap.cool. There are 5 agents, a Post Writer, Titler, and Tagger, a Meta Description Writer, and a Topic Creator.

It started as an experiment with OpenAI Assistants to create a team of agents that interact with each other in a human-like way to accomplish tasks (created by both users and agents), similar to Microsoft's AutoGen.

I started with the conversation framework as the foundation, but realized that for an MVP, simply passing the output to each agent sequentially would show better results than asking a single agent to do it all in one pass. While this wasn't conversational, I saw potential in them since they could still call functions. They would still be able to do work (sub-tasks) autonomously even if they were only doing a given task on command (through a cron job).

The Assistants API, while having it's benefits, like the built-in tools and threads, is limited. At the moment, you can't adjust parameters like temperature or top_p. Because of this, I've switched to Chat Completions which gives me more control of the output.

Since finishing the core functionality, most of my time has been spent tweaking the prompts and parameters for quality and consistency. And after much testing, I'm quite happy with the topic generation and post quality! There are still many more features that I want to add, like web browsing, research, and continuous performance analysis and improvement.

I'm also planning on making a tool to tweak prompts & parameters in the browser, giving me a nice view of how each change is affecting the output. Looking forward to this!

Refining the conversation framework

My initial focus was on getting the agents to work in a linear fashion - a bit like an assembly line in a factory. Each agent has its specialty, and it does its part before passing the baton to the next. This approach, while efficient, lacked the dynamic interaction I envisioned. So, I'm circling back to the idea of making these interactions more organic, more like a real team meeting.

I'm currently experimenting with ways to enhance the autonomy of each agent. The goal is for them to not just execute tasks in a set sequence but to "think" - as well as LLMs can - and collaborate. Imagine a scenario where the Topic Creator comes up with a theme, and the Post Writer, instead of just taking it at face value, actually discusses it, maybe suggesting a tweak based on recent trends or past performance data.

This back-and-forth could lead to more engaging, timely content. That alone could be a huge improvement, but I'd like to try using SocialAGI as well to make these conversations even better.

Autonomy

Balancing autonomy with efficiency is my next big challenge. I'm aware that giving these agents too much freedom could lead to inefficiencies - they might end up chatting more than working. Similar to my current cron job approach, I'm devising a schedule, something akin to a regular workday, where they have designated times for brainstorming and task execution. This way, they can collaborate effectively without burning through resources.

I'd love to (and will) give them full autonomy as a fun experiment, but I still want be able to provide inputs and thoughts to steer the output. Because the most important next step is integrating frameworks like this into our every day lives. These groups, teams, whatever you want to call them, of agents, should be an extension of us. Personal assistants to aid us in every aspect of our lives.