Generative AI — Current state and the emerging state of Autonomous AI Agents
Unless you have been living under a rock, you have probably heard of ChatGPT by OpenAI, which is currently at version 4. Natural Language Processing has come a long way thanks to the falling cost and growing scale of available compute, and the exponential rise in the number of parameters in these models. As a reference, the list below shows the increase in parameter counts for models released within the past two years.
- GPT-4 reportedly has over a trillion parameters, though this figure comes from unofficial sources and has not been confirmed by OpenAI
- GPT-3 has about 175 billion parameters
- BLOOM, from the BigScience project coordinated by Hugging Face, has about 176 billion parameters
- ESMFold from Meta AI has about 15 billion parameters
- LaMDA (Language Model for Dialogue Applications) from Google has 137 billion parameters
- MT-NLG (Megatron-Turing Natural Language Generation) from NVIDIA and Microsoft has about 530 billion parameters
With the increase in the number of parameters, the generated language feels more natural and human-like, enabling organic conversations. This has many applications in building realistic conversational AIs, compared to the scripted chatbots that have been around for the last several years.
I ran a simple comparison between ChatGPT and a model hosted on Hugging Face using a few queries; a sample is shown below. You will notice that ChatGPT is fairly forgiving of imperfectly framed queries compared to the Hugging Face model. The snapshots below show the output from the two options.
Hugging Face output #1, with a query that is not framed perfectly:
ChatGPT output with the same query gives a much better response:
You will notice that even though I didn’t frame my question clearly, ChatGPT was able to infer the context and generate a useful output. When I improved the query to sound more like a “question”, the outcome was much more desirable; ChatGPT’s response was essentially identical with the improved query.
Hugging Face output #2
Hugging Face did better when the question was framed more clearly.
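If you want to reproduce this kind of comparison programmatically rather than through the web interfaces, the sketch below shows one way to send the same query to an OpenAI chat model and to a Hugging Face model. It is a minimal illustration, not the exact setup used for the screenshots above: the query string and model names are illustrative choices, and it assumes the openai (pre-1.0) and transformers packages are installed with an OPENAI_API_KEY set in the environment.

```python
# Minimal sketch: send the same query to an OpenAI chat model and a
# Hugging Face text-generation model, then compare the outputs side by side.
# The query and model names below are illustrative, not the ones from the article.
import os

import openai
from transformers import pipeline

openai.api_key = os.environ["OPENAI_API_KEY"]

query = "benefits electric cars vs gas"  # deliberately loosely framed query

# ChatGPT-style completion via the OpenAI Chat Completions API
chat_response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": query}],
)
print("OpenAI:", chat_response.choices[0].message["content"])

# A Hugging Face model queried locally through the transformers pipeline
generator = pipeline("text-generation", model="bigscience/bloom-560m")
hf_response = generator(query, max_new_tokens=100)
print("Hugging Face:", hf_response[0]["generated_text"])
```

Running both with the same loosely framed query makes the difference in "forgiveness" easy to see in one terminal session.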
The ability to generate novel text content opens up a whole array of opportunities: conversational AI interfaces, marketing and static content for websites, email and social media campaigns, and more.
Emergence of Agents
As amazing as the ChatGPT platform is, one limiting factor is that we can run only one query at a time. One compelling tool built on top of ChatGPT is Auto-GPT. Released on March 30, 2023, it is already at 125K stars on GitHub, compared to fewer than 70K stars for PyTorch; for reference, TensorFlow is at about 174K at the time of this writing.
One of the reasons for this kind of growth is Auto-GPT's ability to let us create "Agents". They can be thought of as a mechanism to create a pipeline of goals or a checklist for us (up to 5 goals at this time in the case of Auto-GPT), as sketched in the code below. Although this is not the first "Agent" built using AI technologies, Auto-GPT is by far the most robust one yet; BabyAGI, for example, has fewer than 14K stars on GitHub.
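To make the idea concrete, here is a highly simplified sketch of that agent pattern: a short list of goals, plus a loop in which the model proposes the next step given what has already been done. This is not Auto-GPT's actual implementation, just an illustration under the assumption that an OpenAI chat model and an OPENAI_API_KEY environment variable are available; the goals and the step cap are made up for the example.

```python
# Simplified "agent" loop: keep a small list of goals (Auto-GPT caps this at 5)
# and repeatedly ask the model for the next action until it declares the goals met.
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

goals = [  # illustrative goals, up to 5 in the Auto-GPT pattern
    "Research the top three open-source LLM agent frameworks",
    "Summarize the strengths and weaknesses of each",
    "Write the summary into a short report",
]

history = []
for step in range(10):  # hard cap on iterations to bound API cost
    prompt = (
        "You are an autonomous agent working toward these goals:\n"
        + "\n".join(f"- {g}" for g in goals)
        + "\n\nSteps taken so far:\n"
        + ("\n".join(history) if history else "(none)")
        + "\n\nDescribe the single next action to take, or reply DONE if the goals are met."
    )
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    action = response.choices[0].message["content"].strip()
    if action.upper().startswith("DONE"):
        break
    # In a real agent, the proposed action would be executed (search, file write,
    # etc.) and its result fed back into the next prompt.
    history.append(action)
    print(f"Step {step + 1}: {action}")
```

The key design idea is the feedback loop: each iteration's output becomes context for the next, which is what turns a single-query model into something that can chip away at a checklist of goals.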
The illustration below shows Auto-GPT and BabyAGI among the top three trending repositories on GitHub. It is no mere coincidence that these AI projects are trending and growing exponentially.
These agents, in their current state of infancy, are not ready for mass consumption, for two reasons: the cost of running them and the limited interpretability of the underlying AI. Still, it will be interesting to watch their progress as they mature.