- text extraction - process complex document formats (PDF, CSV, etc.)
- text splitting - create meaningful chunks out of long pages of text
- embedding - vectorize chunks (extract semantic meaning)
- store - insert chunks into a specialized database
- embedding - vectorize user query
- search - retrieve the document chunks most similar to the user query
Building blocks
Let’s break down how each step can be implemented with the Agent Stack API. But first, make sure you have the Platform API extension enabled in your agent and the accepted file content types declared in the default_input_modes parameter of the agent decorator.
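As a minimal sketch of that setup, the agent skeleton might look like the following; the decorator name, import wiring, and content types are assumptions here, not values prescribed by this section:

```python
# Hedged sketch: the decorator and content types are illustrative;
# enable the Platform API extension per your SDK's setup instructions.
@server.agent(default_input_modes=["text/plain", "application/pdf"])
async def rag_agent(message):
    ...
```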
First, let’s build a set of functions to process the documents; we will then use these functions in the agent.
Text Extraction
To extract text from a File uploaded to the Platform API, simply use file.create_extraction() and wait for the result. After the extraction completes, the extraction object will contain extracted_files, a list of the extracted files in different formats.
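A minimal sketch of this step, assuming file is a platform file object obtained via the Platform API extension (only create_extraction and extracted_files are named above; the awaiting semantics are an assumption):

```python
# Kick off text extraction for an uploaded platform file and wait
# for the result (whether this call blocks until completion or needs
# a polling loop depends on the actual API).
extraction = await file.create_extraction()

# Once completed, the extraction lists its outputs in the
# available formats.
for extracted in extraction.extracted_files:
    print(extracted)
```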
Extraction Formats
Text extraction produces two formats, and you can request a subset by passing formats to create_extraction (e.g., ["markdown"] if you only need plain text):
- markdown: The extracted text formatted as Markdown (file.load_text_content())
- vendor_specific_json: The Docling-specific JSON format containing document structure (file.load_json_content())
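For instance, requesting only the Markdown output and reading it back could look like this (the formats parameter and both loader methods are named above; everything else is a sketch):

```python
# Request only the Markdown output via the `formats` parameter.
extraction = await file.create_extraction(formats=["markdown"])

# Load the extracted text as Markdown.
markdown_text = await file.load_text_content()

# For structure-aware processing, the Docling JSON can be loaded instead
# (not produced for plain text or Markdown inputs; see the warning below):
# docling_document = await file.load_json_content()
```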
WARNING:
The vendor_specific_json format is not generated for plain text or markdown files, as Docling does not support these formats as input.
Text Splitting
In this example we will use MarkdownTextSplitter from the langchain-text-splitters package. This will split a long document into reasonably sized chunks based on the Markdown header structure.
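A minimal usage sketch (the chunk sizes are illustrative defaults, not values prescribed here):

```python
from langchain_text_splitters import MarkdownTextSplitter

# Split the extracted Markdown into reasonably sized, slightly
# overlapping chunks.
splitter = MarkdownTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(markdown_text)
```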
Embedding
Now we need to embed each chunk using the embedding service. As with LLMs, Agent Stack implements an OpenAI-compatible embedding API. You can use any client you prefer; in this example we will use the embedding extension to create an AsyncOpenAI client:
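A hedged sketch of the client setup and an embedding helper; how the embedding extension exposes its base URL, API key, and model id is an assumption here (embedding_config is a hypothetical object standing in for whatever the extension provides):

```python
from openai import AsyncOpenAI

# `embedding_config` is hypothetical; its attribute names stand in
# for the connection details supplied by the embedding extension.
client = AsyncOpenAI(
    base_url=embedding_config.base_url,
    api_key=embedding_config.api_key,
)


async def embed(texts: list[str]) -> list[list[float]]:
    # Standard OpenAI-compatible embeddings call; the model is the one
    # chosen by the embedding extension.
    response = await client.embeddings.create(
        model=embedding_config.model_id,
        input=texts,
    )
    return [item.embedding for item in response.data]
```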
Store
Finally, to insert the prepared items, we need a function to create a vector store. For this we need to know the dimension of the embeddings and the model_id. Because the model is chosen by the embedding extension and we don’t know it in advance, we will make a test embedding request to calculate the dimension:
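A sketch of the dimension probe and store creation; platform.create_vector_store is a hypothetical name for the Platform API call, while the probe trick follows the text above:

```python
async def create_vector_store(name: str):
    # Probe request: embed a short string to discover the embedding
    # dimension, since the model is chosen by the embedding extension.
    probe = await embed(["dimension probe"])
    dimension = len(probe[0])

    # Hypothetical Platform API call to create the store.
    return await platform.create_vector_store(
        name=name,
        dimension=dimension,
        model_id=embedding_config.model_id,
    )
```

Note that we have not yet shown what items are passed to vector_store.add_documents; this will become clear in the final example.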
Query vector store
Assuming we have our knowledge base of documents prepared, we can now easily search the store according to the user query. The following function will retrieve the five document chunks most similar to the query embedding:
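A sketch under the same assumptions; vector_store.search is a hypothetical method name, while the top-five retrieval follows the text above:

```python
async def search_documents(vector_store, query: str):
    # Embed the user query and retrieve the five closest chunks.
    query_embedding = (await embed([query]))[0]
    return await vector_store.search(embedding=query_embedding, limit=5)
```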
Putting it all together
Having all the pieces in place, we can now build the agent.
Simple agent
This is a simplified agent that expects a message with one or more files attached as FilePart and a user query as TextPart. A new vector store is created for each message.
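A hedged end-to-end sketch tying the helpers above together; FilePart, TextPart, and the decorator reflect the description in this section, but the import paths, attribute names, and response handling are assumptions:

```python
from langchain_text_splitters import MarkdownTextSplitter

# Uses the embed(), create_vector_store(), and search_documents()
# helpers sketched above.


@server.agent(default_input_modes=["text/plain", "application/pdf"])
async def rag_agent(message):
    # Separate the attached files from the user query.
    files = [p for p in message.parts if isinstance(p, FilePart)]
    query = next(p.text for p in message.parts if isinstance(p, TextPart))

    # A new vector store is created for each message.
    vector_store = await create_vector_store(name="per-message-store")

    for file_part in files:
        # Extract Markdown, split it, embed the chunks, and store them.
        await file_part.file.create_extraction(formats=["markdown"])
        markdown_text = await file_part.file.load_text_content()
        chunks = MarkdownTextSplitter(
            chunk_size=1000, chunk_overlap=200
        ).split_text(markdown_text)
        embeddings = await embed(chunks)
        await vector_store.add_documents(list(zip(chunks, embeddings)))

    # Retrieve the chunks most relevant to the query and answer from them.
    relevant = await search_documents(vector_store, query)
    ...
```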