TLDR: Like most API products on the market, the Assistants API lets us focus on building the core of our product.

A lot of amazing stuff was announced at OpenAI DevDay, such as GPT-4 Turbo with 128k context, GPT-4 with Vision, and lower pricing. The one I am fondest of is the new Assistants API.

I had already been experimenting with chatbot apps for a while before the launch of the Assistants API. If you are going to build a chatbot with your own knowledge base, you will most likely stumble upon the concept of RAG (retrieval-augmented generation). It is fairly easy to get started and build something that actually works. LangChain has a great tutorial on that.

However, there are actually a lot of steps and decisions to take care of behind the scenes.

  • Splitting the document
    • Because of the limited context length, we have to split a document into multiple chunks.
  • Calculating document embeddings and storing them in a vector DB
  • Deciding the number of documents to retrieve
    • Retrieving too many documents leads to irrelevant results
    • Retrieving too few documents might miss important information
  • Avoiding duplicated documents
  • Managing chat history
    • We have to truncate the chat in case it grows too large
    • We have to decide how to truncate so that the context is retained
  • Document citation
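
To give a feel for two of these steps, here is a minimal sketch in plain Python of splitting a document into overlapping chunks and trimming chat history to a budget. The function names, the character-based sizes and the "keep the system message plus the newest messages" policy are my own assumptions for illustration; a real pipeline would more likely count tokens and use a library splitter.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def truncate_history(messages: list[dict], max_chars: int) -> list[dict]:
    """Keep the first system message plus the newest messages that fit max_chars.

    Each message is a dict with "role" and "content" keys (an assumed shape).
    """
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]
    budget = max_chars - sum(len(m["content"]) for m in system)
    kept = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = len(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + kept[::-1]  # restore chronological order
```

Even in this toy version you can see the decisions piling up: how big a chunk should be, how much overlap to keep, and what to drop from the chat first.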

All of this is taken care of by the Assistants API, which reduces the steps to just uploading the documents. The downsides are that the number of documents is limited to 20 as of now and the overall cost might be higher compared to doing it all yourself.


If you are still interested, I will share my experience building Wander with the Assistants API, focusing on function calling. I was trying to replicate the Wanderlust demo shown at DevDay.

Wander Demo:


Check out the Wander GitHub repo if you are interested. I enhanced it with real-time data using SerpApi's Google Maps API.

The first time I saw the demo, I thought this is probably what the future of UI/UX is going to look like. Even though showing markers on a map is a relatively small use case, I believe people will find ways to apply the idea effectively in their own applications. Here is a great article that explains the concept I want to talk about.

I believe you can build all this with the Chat Completions API alone, but the Assistants API just makes everything easier. For example, the flow of the chat is managed: it tells you when it is time to handle a function call, and you submit the output of that function call back to the API to get the next step.
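
Here is a rough sketch of the "handle the function call" half of that flow. When a run needs a function result, its status becomes `requires_action` and it lists tool calls, each carrying an id, a function name and JSON-encoded arguments; you answer with a list of `tool_call_id`/`output` pairs. To keep the example runnable without an API key, the tool calls are represented as plain dicts (the SDK returns objects), and `get_places` is a hypothetical stand-in tool, not the actual Wander implementation.

```python
import json


def get_places(query: str) -> list[str]:
    """Hypothetical tool: a real app would call a maps or search API here."""
    return [f"{query} #1", f"{query} #2"]


# Registry mapping the function names declared on the assistant to handlers.
TOOLS = {"get_places": get_places}


def build_tool_outputs(tool_calls: list[dict]) -> list[dict]:
    """Run every requested function and collect results for submission.

    Each tool call is assumed to look like:
    {"id": "...", "function": {"name": "...", "arguments": "<JSON string>"}}
    """
    outputs = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        handler = TOOLS.get(name)
        if handler is None:
            # Report the problem to the model instead of crashing the run.
            result = {"error": f"unknown function: {name}"}
        else:
            result = handler(**args)
        # Outputs are sent back as strings, so serialize the result to JSON.
        outputs.append({"tool_call_id": call["id"], "output": json.dumps(result)})
    return outputs
```

With the official Python SDK, the returned list is what you would pass as `tool_outputs` when submitting tool outputs for the run, after which you keep checking the run status until it completes or asks for another action.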

Currently, execution takes too long (50s+), especially for runs that include function calls, and streaming is not supported yet. Fixing these two things would be the most important improvements toward making it usable in production.

I haven't used the Assistants API at scale or in production, so the information I can share is limited. If you have deployed it to production, do let me know!