In the past two weeks, I've dedicated significant time to exploring the latest advancements in large language models (LLMs).
Since early April, a new trend has emerged: autonomous agents.
In this post, I would like to share what I've recently learned about them and how I plan to apply this knowledge in the data engineering context.
Autonomous agents
Two projects are leading the way around autonomous agents: Auto-GPT and BabyAGI.
The core concept behind these projects is to combine multiple GPT sessions, called agents, interacting with each other to accomplish a specific task.
Typically, these projects are structured like this:
One agent generates tasks: the "Task Creation Agent"
Another agent performs the tasks and sends the results to others: the "Task Execution Agent"
A third agent prioritizes the tasks: the "Task Prioritization Agent"
Each agent calls GPT-4 in the background to accomplish its specific objective.
The task creation, prioritization, and execution processes occur recursively in an infinite loop.
To maintain a consistent state across loop iterations, shared memory is added, typically a vector database such as Pinecone.
The idea is simply to simulate a trial-and-error process within GPT (or any language model) until a desired goal is reached. This way, the language model can correct its own mistakes without a human in the feedback loop.
It’s kind of an organizational management layer above GPT "workers".
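To make the pattern concrete, here is a minimal sketch of such a loop. This is my own illustration, not code from either project: call_gpt is a hypothetical helper standing in for a real chat-completion call, and a plain Python list stands in for the vector store.

from collections import deque

def call_gpt(prompt: str) -> str:
    # Hypothetical helper: send the prompt to GPT-4 and return its reply.
    raise NotImplementedError

def run_agent_loop(objective: str, max_iterations: int = 10) -> None:
    tasks = deque(["Draft an initial task list for the objective."])
    memory = []  # stands in for the shared vector store (e.g. Pinecone)

    for _ in range(max_iterations):
        if not tasks:
            break

        # Task Execution Agent: perform the highest-priority task
        task = tasks.popleft()
        result = call_gpt(f"Objective: {objective}\nTask: {task}\nContext: {memory}")
        memory.append((task, result))

        # Task Creation Agent: derive follow-up tasks from the last result
        created = call_gpt(
            f"Objective: {objective}\nLast result: {result}\n"
            "List new tasks, one per line."
        )
        tasks.extend(t for t in created.splitlines() if t.strip())

        # Task Prioritization Agent: reorder the remaining tasks
        reordered = call_gpt(
            f"Objective: {objective}\nTasks: {list(tasks)}\n"
            "Reorder these tasks by priority, one per line."
        )
        tasks = deque(t for t in reordered.splitlines() if t.strip())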
The framework
The community builds these systems using a framework called Langchain.
This framework simplifies interactions between language models, such as GPT, and other data sources like APIs, websites, and databases.
It essentially provides the interfaces you need to easily build complex solutions around language models.
The framework provides wrappers and abstractions around the following objects:
Model: An abstraction over language models like GPT
Prompt: An abstraction for easy reuse and filling of prompt templates
Memory: A mechanism to persist data across calls to models
Index: A feature that makes a model aware of your own data
Chain: A sequence of calls to a language model
Agent: A standard interface for building the concept of recursive agents mentioned earlier
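To make these objects concrete, here is a minimal sketch combining a Model, a Prompt, and a Chain (the prompt text is just an illustration):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Model: a wrapper around a language model
llm = OpenAI(temperature=0.0)

# Prompt: a reusable template with a variable to fill in
prompt = PromptTemplate(
    input_variables=["source"],
    template="Describe, in one line, a data pipeline that ingests {source}.",
)

# Chain: ties the prompt and the model into one callable step
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(source="the Spotify API"))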
Let's examine an example relevant to data engineers, where the autonomous agent is tasked with independently querying an API.
In this use case from Langchain's documentation, the agent needs to answer the following request:
make me a playlist with the first song from kind of blue. call it machine blues.
with only the Spotify API definition and access tokens provided.
With Langchain, this task can be coded in a few lines:
from langchain.llms.openai import OpenAI
from langchain.agents.agent_toolkits.openapi import planner

# spotify_api_spec and requests_wrapper are explained below
llm = OpenAI(model_name="gpt-4", temperature=0.0)
spotify_agent = planner.create_openapi_agent(spotify_api_spec, requests_wrapper, llm)

user_query = "make me a playlist with the first song from kind of blue. call it machine blues."
spotify_agent.run(user_query)
spotify_api_spec is an object containing the raw API documentation (in OpenAPI format), and requests_wrapper is a helper that adds the API access token to the API calls.
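For completeness, here is roughly how those two objects can be built, following the pattern in Langchain's documentation (the spec file name and the token environment variable are placeholders):

import os
import yaml
from langchain.agents.agent_toolkits.openapi.spec import reduce_openapi_spec
from langchain.requests import RequestsWrapper

# Load the raw Spotify OpenAPI definition and trim it down for the agent
with open("spotify_openapi.yaml") as f:
    raw_spec = yaml.safe_load(f)
spotify_api_spec = reduce_openapi_spec(raw_spec)

# Wrap HTTP calls so every request carries the access token
access_token = os.environ["SPOTIFY_ACCESS_TOKEN"]
requests_wrapper = RequestsWrapper(headers={"Authorization": f"Bearer {access_token}"})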
The results are really impressive:
The agent successfully comprehended the query, formulated a plan consisting of five queries to obtain the answer, and autonomously executed these queries.
Examples of applications
A lot of nice applications have popped up in the last few weeks:
A coding agent that can execute Python code and correct it recursively.
A coding agent following Test-Driven Development: you write the test, and the agent builds the code.
https://twitter.com/adamcohenhillel/status/1644836492294905856
A market research agent.
and 1000 more …
My next step
I'm enthusiastic about applying this trend in the data engineering field.
As previously demonstrated, agents can now read API specifications and execute code.
This week, I plan to conduct a small experiment focused on constructing a data pipeline entirely through an autonomous agent.
I will grant the agent access to my AWS account and give it the following goal: get data from an API at a certain frequency into a bucket. Let’s see if it can create the infrastructure and application code by itself …
If successful, this experiment holds incredible potential: the agent could source any available open data and integrate it into a data share, such as Snowflake, making it easily accessible to any company.
I will provide updates in the next newsletter post :)
Thanks for reading,
-Ju
I would be grateful if you could help me improve this newsletter. Don’t hesitate to share what you liked or disliked and the topics you would like me to tackle.
P.S. You can reply to this email; it will get to me.