omop.omop_models

source

The omop_models.py file defines SQLAlchemy ORM models for interacting with the OMOP standard tables and the additional embeddings table in a PostgreSQL database. These models map to the database tables used by lettuce, enabling programmatic access OMOP concepts.

The models use the sqlalchemy library for ORM functionality and pgvector for vector-based operations, supporting semantic search capabilities.

This page documents each model, its attributes, and its role in the system. The models are defined using SQLAlchemy’s declarative_base() and are organized as follows:

Concept

class Concept(Base):
    __tablename__ = "concept"
    __table_args__ = {"schema": DB_SCHEMA}
 
    concept_id = Column(Integer, primary_key=True)
    concept_name = Column(String)
    vocabulary_id = Column(String)
    concept_code = Column(String)
    standard_concept = Column(String)

This class represents an ORM mapping to the OMOP concept table, representing standardised medical concepts (e.g., drugs, conditions).

Attributes

ColumnTypeDescriptionConstraints
concept_idIntegerUnique identifier for the concept.Primary Key
concept_nameStringName of the concept (e.g., “Acetaminophen”).
vocabulary_idStringSource vocabulary (e.g., “RxNorm”).
concept_codeStringCode for the concept in the vocabulary.
standard_conceptStringIndicates if it’s a standard concept (e.g., “S”).

ConceptSynonym

class ConceptSynonym(Base):
    __tablename__ = "concept_synonym"
    __table_args__ = {"schema": DB_SCHEMA}
 
    concept_id = Column(Integer, primary_key=True)
    concept_synonym_name = Column(String)
    language_concept_id = Column(Integer)

This class represents an ORM mapping to the OMOP concept_synonym table, storing alternative names or synonyms for concepts.

Attributes

ColumnTypeDescriptionConstraints
concept_idIntegerID of the associated concept.Primary Key
concept_synonym_nameStringSynonym name (e.g., “Tylenol”).
language_concept_idIntegerID of the language for the synonym.

ConceptRelationship

class ConceptRelationship(Base):
    __tablename__ = "concept_relationship"
    __table_args__ = {"schema": DB_SCHEMA}
 
    concept_id_1 = Column(Integer)
    concept_id_2 = Column(Integer)
    relationship_id = Column(String)
    valid_start_date = Column(Date)
    valid_end_date = Column(Date)
    invalid_reason = Column(String)
    dummy_primary = Column(Integer, primary_key=True)

Maps to the concept_relationship table, defining relationships between concepts (e.g., “Maps to”, “Is a”).

Attributes

ColumnTypeDescriptionConstraints
concept_id_1IntegerID of the first concept.
concept_id_2IntegerID of the second concept.
relationship_idStringType of relationship (e.g., “Maps to”).
valid_start_dateDateStart date of the relationship’s validity.
valid_end_dateDateEnd date of the relationship’s validity.
invalid_reasonStringReason if the relationship is invalid.
dummy_primaryIntegerDummy primary key for ORM compatibility.Primary Key

ConceptAncestor

class ConceptAncestor(Base):
    __tablename__ = "concept_ancestor"
    __table_args__ = {"schema": DB_SCHEMA}
 
    ancestor_concept_id = Column(Integer)
    descendant_concept_id = Column(Integer)
    min_levels_of_separation = Column(Integer)
    max_levels_of_separation = Column(Integer)
    dummy_primary = Column(Integer, primary_key=True)

Maps to the concept_ancestor table, storing hierarchical relationships between ancestor and descendant concepts. Enables querying of concept hierarchies, such as finding broader categories (e.g., “Analgesics” for “Acetaminophen”).

Attributes

ColumnTypeDescriptionConstraints
ancestor_concept_idIntegerID of the ancestor concept.
descendant_concept_idIntegerID of the descendant concept.
min_levels_of_separationIntegerMinimum levels of separation in hierarchy.
max_levels_of_separationIntegerMaximum levels of separation in hierarchy.
dummy_primaryIntegerDummy primary key for ORM compatibility.Primary Key

Embedding

class Embedding(Base):
    __tablename__ = DB_VECTABLE
    __table_args__ = {"schema": DB_SCHEMA}
 
    concept_id = Column(Integer)
    embedding = mapped_column(Vector(DB_VECSIZE))
    dummy_primary = Column(Integer, primary_key=True)

Maps to the embeddings table, storing vector representations of OMOP concepts for semantic search.

To enable semantic search for mapping informal drug names to OMOP concept IDs, we use pgvector, an open-source PostgreSQL extension for storing, querying, and manipulating high-dimensional vector data. Vectors, representing embeddings generated by a pre-trained machine learning model, are stored as a vector data type in a single row of the embeddings table, each linked to a specific OMOP concept via its concept_id. This allows efficient similarity searches to match informal names to standardized concepts.

Attributes

ColumnTypeDescriptionConstraints
concept_idIntegerID of the associated concept.
embeddingVectorVector representation (size DB_VECSIZE).
dummy_primaryIntegerDummy primary key for ORM compatibility.Primary Key