Save embeddings

Classes

ParquetWriter

class ParquetWriter(EmbeddingStore):
    path: Path

An EmbeddingStore that saves batches of concepts to a Parquet file.

Parameter | Type | Description
path | Path | The path to which embedding batches are saved

Methods

save
save(embeddings: list[EmbeddedConcept]) -> None:

Saves a batch of embeddings to the writer’s path. Each batch is written with an extra timestamp column, which lets the writer append batches to the same path: the partition_by behaviour stores each batch as a separate dataset instead of overwriting earlier ones. You can still read the result as a single dataframe and drop the timestamp column when using it.

Parameter | Type | Description
embeddings | list[EmbeddedConcept] | A list of concepts with embeddings
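
The append-by-timestamp idea can be sketched without Parquet at all. The following is a minimal standard-library illustration, not the ParquetWriter implementation. It assumes hypothetical EmbeddedConcept fields (concept_id, embedding) and hypothetical helpers save_batch and read_all, and it writes JSON files under timestamp=... directories in place of Parquet partitions.

```python
import json
import tempfile
import time
import uuid
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class EmbeddedConcept:
    # Hypothetical fields; the real EmbeddedConcept may look different.
    concept_id: int
    embedding: list[float]


def save_batch(root: Path, batch: list[EmbeddedConcept]) -> None:
    """Write one batch into its own timestamp partition (hypothetical helper)."""
    stamp = time.time_ns()
    part_dir = root / f"timestamp={stamp}"
    part_dir.mkdir(parents=True, exist_ok=True)
    # Every row carries the timestamp of the batch it was saved with.
    rows = [asdict(concept) | {"timestamp": stamp} for concept in batch]
    # A unique file name keeps batches apart even if two share a timestamp.
    (part_dir / f"{uuid.uuid4().hex}.json").write_text(json.dumps(rows))


def read_all(root: Path) -> list[dict]:
    """Read every partition as one table and drop the timestamp column."""
    rows = []
    for file in sorted(root.glob("timestamp=*/*.json")):
        for row in json.loads(file.read_text()):
            row.pop("timestamp")
            rows.append(row)
    return rows


root = Path(tempfile.mkdtemp())
save_batch(root, [EmbeddedConcept(1, [0.1, 0.2])])
save_batch(root, [EmbeddedConcept(2, [0.3, 0.4]), EmbeddedConcept(3, [0.5, 0.6])])
table = read_all(root)
```

Each call adds a new partition rather than rewriting existing data, and the reader sees one table with the timestamp column removed, which mirrors the behaviour described for save.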

PostgresWriter

class PostgresWriter(EmbeddingStore):
    db_connector: PGConnector

An EmbeddingStore that loads batches of concepts into a PostgreSQL database.

Parameter | Type | Description
db_connector | PGConnector | A configured connection

Methods

save
save(embeddings: list[EmbeddedConcept]) -> None:

Saves batches of embeddings to the configured database.

Parameter | Type | Description
embeddings | list[EmbeddedConcept] | A list of concepts with embeddings
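
Because ParquetWriter and PostgresWriter are both EmbeddingStore subclasses with the same save signature, calling code can be written against the base class. The sketch below illustrates that pattern with stand-ins only. The EmbeddingStore base class shown here models just the documented save method, and EmbeddedConcept's fields, InMemoryStore, batched, and save_in_batches are all hypothetical names introduced for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterator


@dataclass
class EmbeddedConcept:
    # Hypothetical fields; the real EmbeddedConcept may look different.
    concept_id: int
    embedding: list[float]


class EmbeddingStore(ABC):
    """Stand-in for the library's base class; only save is modelled."""

    @abstractmethod
    def save(self, embeddings: list[EmbeddedConcept]) -> None: ...


class InMemoryStore(EmbeddingStore):
    """Hypothetical store used in place of ParquetWriter or PostgresWriter."""

    def __init__(self) -> None:
        self.batches: list[list[EmbeddedConcept]] = []

    def save(self, embeddings: list[EmbeddedConcept]) -> None:
        self.batches.append(list(embeddings))


def batched(items: list[EmbeddedConcept], size: int) -> Iterator[list[EmbeddedConcept]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def save_in_batches(store: EmbeddingStore, items: list[EmbeddedConcept], size: int) -> None:
    """Depends only on the save method, so any EmbeddingStore can be passed in."""
    for batch in batched(items, size):
        store.save(batch)


concepts = [EmbeddedConcept(i, [float(i), 0.0]) for i in range(5)]
store = InMemoryStore()
save_in_batches(store, concepts, size=2)
batch_sizes = [len(batch) for batch in store.batches]
```

Under these assumptions, swapping the store for a different EmbeddingStore changes where the batches end up without touching save_in_batches.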