3.4 Switching the Vector Database with LangChain (Chroma ➡️ Pinecone)

  • With LangChain, switching to a different vector database is straightforward

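  • Taking the Chroma usage guide in the official LangChain documentation as the baseline, essentially one line changes: the call that builds the vector store (together with its import)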

# import
from langchain_pinecone import PineconeVectorStore  # replaces: from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_text_splitters import CharacterTextSplitter

# load the document
loader = TextLoader("../../how_to/state_of_the_union.txt")
documents = loader.load()

# split it into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

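# load it into Pinecone
# NOTE: "langchain-demo" is a placeholder index name, not taken from the original guide.
# The index must already exist (all-MiniLM-L6-v2 produces 384-dimensional vectors)
# and the PINECONE_API_KEY environment variable must be set.
index_name = "langchain-demo"
db = PineconeVectorStore.from_documents(docs, embedding_function, index_name=index_name)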


# query it
query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)

# print results
print(docs[0].page_content)
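
Why the swap is so small can be illustrated with a dependency-free sketch. It is not LangChain's actual implementation: `ToyVectorStore`, `ToyLocalStore`, `ToyHostedStore`, `toy_embed`, and the index name `demo-index` are all hypothetical names made up for this illustration, and the "embedding" is just a bag of words. The point it tries to show is that when every store exposes the same `from_documents` and `similarity_search` methods, only the construction line differs between backends.

```python
from dataclasses import dataclass


@dataclass
class Document:
    page_content: str


def toy_embed(text):
    # Stand-in for a real embedding model: a set of lowercase words.
    return set(text.lower().split())


class ToyVectorStore:
    """Hypothetical base class playing the role of a shared vector store interface."""

    def __init__(self, docs, embed):
        self._embed = embed
        self._items = [(embed(d.page_content), d) for d in docs]

    @classmethod
    def from_documents(cls, docs, embed, **kwargs):
        # Extra keyword arguments are passed to the concrete store's constructor.
        return cls(docs, embed, **kwargs)

    def similarity_search(self, query, k=1):
        # Rank documents by word overlap with the query (a toy similarity measure).
        q = self._embed(query)
        ranked = sorted(self._items, key=lambda item: len(q & item[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


class ToyLocalStore(ToyVectorStore):
    """Stands in for a local store such as Chroma."""


class ToyHostedStore(ToyVectorStore):
    """Stands in for a hosted store such as Pinecone, which needs an index name."""

    def __init__(self, docs, embed, index_name):
        super().__init__(docs, embed)
        self.index_name = index_name


docs = [
    Document("The president praised Ketanji Brown Jackson"),
    Document("Inflation and the economy were also discussed"),
]

# Swapping backends changes only this one construction line.
db = ToyLocalStore.from_documents(docs, toy_embed)
# db = ToyHostedStore.from_documents(docs, toy_embed, index_name="demo-index")

# The query code is identical for both stores.
results = db.similarity_search("What was said about Ketanji Brown Jackson")
print(results[0].page_content)
```

Uncommenting the second `db = ...` line and commenting out the first leaves the query and print statements untouched, which mirrors the Chroma to Pinecone change in the listing above.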
