The purpose of this blog post is to share my experience and findings while implementing my own TypeScript version of Auto-GPT. I strongly believe in foundational knowledge, so I wanted to build this project from scratch to truly comprehend the underlying ideas. Most of the concepts and ideas are the same (and a lot of the structure is heavily inspired by Auto-GPT); the main difference between my implementation (at the time I started this project) and the original Auto-GPT codebase is that I took a more OOP approach for maintainability, with a bit of DDD inspiration for modularity.
In this post, I'll provide an overview of the architecture I used, discuss my observations of the system, share hacks I found that significantly improved performance, and suggest potential improvements and further ideas. I hope that by sharing my journey and technical insights, others might find it interesting or useful.
In my implementation, I took a fairly standard approach to the main control loop. I built an `Agent` class and a `Memory` class, which together handle coming up with a plan of action and saving those thoughts as memories to be later used as part of the prompt. The acting system was fairly unique in that it uses a `commandBus` to which commands can be registered and which the `Agent` can then invoke.
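A minimal sketch of what such a command bus could look like. The names here (`CommandBus`, `register`, `invoke`) are illustrative, not the actual API from my codebase:

```typescript
// A handler receives the arguments the model chose and returns a result string
// that gets fed back into the agent's next prompt.
type CommandHandler = (args: Record<string, string>) => Promise<string>;

class CommandBus {
  private handlers = new Map<string, CommandHandler>();

  // Commands register themselves by name at startup.
  register(name: string, handler: CommandHandler): void {
    this.handlers.set(name, handler);
  }

  // The Agent invokes whichever command the model picked.
  async invoke(name: string, args: Record<string, string>): Promise<string> {
    const handler = this.handlers.get(name);
    if (!handler) return `Unknown command: ${name}`;
    return handler(args);
  }
}
```

The nice property of this shape is that the `Agent` never needs to know what commands exist; adding a capability is just another `register` call.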
Overview of the architecture of the system.
I won't go into much detail here as I think there wasn't much difference between my implementation and that of Auto-GPT.
Similarities with the original Auto-GPT
The agent I built can think and act, much like the Auto-GPT system. It uses a similar structure for generating prompts and interacting with the GPT-3.5 language model.
Memory Module
The memory module was fairly standard: it saved all the thoughts generated by the agent and structured them when injecting them into the prompt. I spent a lot of time trying to come up with different ways of building out the memory. I have a vector DB implementation using LangChain and Pinecone, but it was never actually needed for the use cases I was working with.
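An append-only memory of this kind can be sketched in a few lines. The `Thought` fields and formatting below are my own illustrative choices, not the exact shape used in the project:

```typescript
// One saved thought from an agent iteration.
interface Thought {
  reasoning: string;
  plan: string;
  timestamp: number;
}

class Memory {
  private thoughts: Thought[] = [];

  // Every agent cycle appends its thought here.
  save(thought: Thought): void {
    this.thoughts.push(thought);
  }

  // Structure the most recent thoughts for injection into the prompt,
  // capped so the context window isn't blown out.
  toPromptContext(limit = 10): string {
    return this.thoughts
      .slice(-limit)
      .map((t, i) => `${i + 1}. Reasoning: ${t.reasoning} | Plan: ${t.plan}`)
      .join("\n");
  }
}
```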
Type validation
One thing worth mentioning is that, similar to Auto-GPT, I found that if a return value from GPT-3.5 didn't match the required JSON format, simply re-requesting it would often result in the correct format. My implementation has a wrapper for this in an `OpenAiManager` class.
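The retry idea can be sketched as a small generic wrapper. Here `complete` stands in for the real OpenAI call; the function name and signature are illustrative, not the actual `OpenAiManager` API:

```typescript
// Abstraction over the LLM call so the retry logic is testable in isolation.
type CompleteFn = (prompt: string) => Promise<string>;

// Ask the model for JSON; if the reply doesn't parse, just ask again
// up to maxRetries times before giving up.
async function completeAsJson<T>(
  complete: CompleteFn,
  prompt: string,
  maxRetries = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const raw = await complete(prompt);
    try {
      return JSON.parse(raw) as T;
    } catch (err) {
      lastError = err; // invalid JSON: retry with the same prompt
    }
  }
  throw new Error(`Model never returned valid JSON: ${lastError}`);
}
```

In practice you might also append the parse error to the retry prompt so the model can see what it got wrong, but even a plain re-request succeeds surprisingly often.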
Docker Container to Execute code
I was a little uneasy about just letting Auto-GPT arbitrarily run code and commands on my device, so all code- and command-level interactions are run within a Docker container.
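A sketch of what sandboxed execution can look like from Node: shell out to `docker run` with a throwaway container. The image name and flags below are illustrative choices, not my exact setup:

```typescript
import { execFile } from "node:child_process";

// Build the docker CLI arguments for running one shell command in a sandbox.
function buildDockerArgs(command: string): string[] {
  return [
    "run",
    "--rm",              // delete the container when the command exits
    "--network", "none", // no network access from inside the sandbox
    "node:20-slim",      // any base image with the tools the agent needs
    "sh", "-c", command,
  ];
}

// Run the command and resolve with its stdout, or reject on failure.
function runInContainer(command: string): Promise<string> {
  return new Promise((resolve, reject) => {
    execFile("docker", buildDockerArgs(command), (err, stdout, stderr) => {
      if (err) reject(new Error(stderr || err.message));
      else resolve(stdout);
    });
  });
}
```

The `--rm` and `--network none` flags mean each command gets a fresh, offline container, so the worst a rogue command can do is trash its own filesystem.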