TypeScript LangChain add field to document metadata – Typescript

by
Ali Hasan
langchain react-typescript

The Problem:

How can I add a custom field to the metadata of a document using the LangChain JavaScript SDK?

The Solutions:

Solution 1: Custom metadata can be added through the 2nd parameter of createDocuments

The second argument of the `createDocuments` method of `CharacterTextSplitter` is an array of metadata objects, one per input text. The properties of each object are copied into the metadata of every document produced from the corresponding text.

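For example:

// `splitter` is a CharacterTextSplitter instance; `text` and `chunkHeader` are strings defined elsewhere.
const myMetaData = { url: "https://www.google.com" };
// One metadata object per input text; the third argument holds the chunk header options.
const documents = await splitter.createDocuments([text], [myMetaData],
  { chunkHeader, appendChunkOverlapHeader: true });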

After this, `documents` is an array in which each element is an object with `pageContent` and `metadata` properties. The `metadata` object contains the properties from `myMetaData` above, and `pageContent` has the text of `chunkHeader` prepended.

{
  pageContent: <chunkHeader plus the chunk>,
  metadata: <all properties of myMetaData plus loc (text line numbers of chunk)>
}
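
To make the pairing between texts and metadata objects concrete, here is a simplified, dependency-free sketch. It is not LangChain's implementation: `Doc`, `pairMetadata`, and the fixed-size chunking are hypothetical stand-ins introduced only for illustration. The one idea it is meant to show is that the metadata object at index i is copied onto every chunk produced from the text at index i.

```typescript
// Hypothetical stand-in type mirroring the { pageContent, metadata } shape.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Simplified illustration, NOT the library's code: split each text into
// fixed-size chunks and copy metadatas[i] onto every chunk of texts[i].
function pairMetadata(
  texts: string[],
  metadatas: Record<string, unknown>[],
  chunkSize: number
): Doc[] {
  const docs: Doc[] = [];
  texts.forEach((text, i) => {
    for (let start = 0; start < text.length; start += chunkSize) {
      docs.push({
        pageContent: text.slice(start, start + chunkSize),
        // Spread into a fresh object so chunks do not share one metadata object.
        metadata: { ...(metadatas[i] ?? {}) },
      });
    }
  });
  return docs;
}

const paired = pairMetadata(
  ["aaaabbbb", "cccc"],
  [{ url: "https://www.google.com" }, { url: "https://example.com" }],
  4
);
console.log(paired.map((d) => `${d.pageContent} -> ${String(d.metadata["url"])}`));
```

In this sketch the first text yields two chunks that both carry the first URL, while the single chunk of the second text carries the second URL.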

Solution 2: Adding Field to Metadata

To add a field to the metadata of a document in LangChain, you can iterate through the documents and modify their metadata directly. Here’s an example with an additional field named “doc_id”:

// `docs` is the array of documents; `doc_id` holds the value to attach to each one.
for (const _doc of docs) {
  _doc.metadata['doc_id'] = doc_id;
}
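
If mutating the documents in place is undesirable, a non-mutating variant is possible. The sketch below is a minimal illustration under stated assumptions: `DocLike` is a hypothetical local type that mirrors only the `pageContent` and `metadata` shape, and `withMetadata` and the `"doc-42"` value are names made up for this example, not part of LangChain.

```typescript
// Hypothetical stand-in type mirroring the { pageContent, metadata } shape.
interface DocLike {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Return new objects with the extra fields merged into metadata;
// the input documents are left untouched.
function withMetadata(
  docs: DocLike[],
  extra: Record<string, unknown>
): DocLike[] {
  return docs.map((doc) => ({
    ...doc,
    // Fields in `extra` win over existing keys with the same name.
    metadata: { ...doc.metadata, ...extra },
  }));
}

const originalDocs: DocLike[] = [
  { pageContent: "first chunk", metadata: { loc: { lines: { from: 1, to: 1 } } } },
  { pageContent: "second chunk", metadata: {} },
];
const taggedDocs = withMetadata(originalDocs, { doc_id: "doc-42" });
console.log(taggedDocs[0].metadata);
```

Note that object spread produces plain objects, so if the surrounding code needs class instances, the results would have to be wrapped again.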

Solution 3: Use `Document` class with `splitDocuments` method

To add a field to the metadata of a LangChain `Document` while using the `CharacterTextSplitter`, follow these steps:

  1. Create a new instance of the `Document` class, passing a single object with the `pageContent` and the desired `metadata` fields.
  2. Call the splitter's `splitDocuments()` method with an array containing the newly created `Document` instance. The resulting documents carry the metadata of the document they were split from.

// `Document` is LangChain's document class; `splitter` and `text` are defined elsewhere.
const docOutput = await splitter.splitDocuments([
  new Document({ pageContent: text, metadata: { someField: "someValue" } })
]);

Q&A

How should I add a field to the metadata of LangChain's documents?

Pass an array of metadata objects as the second argument of `createDocuments`. The properties of each object are assigned to the metadata of the documents in the returned array.
