版本:1.1.0b2 |发布日期:2016年7月1日

SQLAlchemy 1.1文档

配置关系如何连接

relationship() will normally create a join between two tables by examining the foreign key relationship between the two tables to determine which columns should be compared. 这种行为需要定制的情况有很多种。

处理多个连接路径

处理的最常见情况之一是两个表之间有多个外键路径。

考虑一个包含Address类的两个外键的Customer类:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))

    billing_address = relationship("Address")
    shipping_address = relationship("Address")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    city = Column(String)
    state = Column(String)
    zip = Column(String)

上面的映射,当我们尝试使用它,会产生错误:

sqlalchemy.exc.AmbiguousForeignKeysError: Could not determine join
condition between parent/child tables on relationship
Customer.billing_address - there are multiple foreign key
paths linking the tables.  Specify the 'foreign_keys' argument,
providing a list of those columns which should be
counted as containing a foreign key reference to the parent table.

上面的消息很长。There are many potential messages that relationship() can return, which have been carefully tailored to detect a variety of common configurational issues; most will suggest the additional configuration that’s needed to resolve the ambiguity or other missing information.

在这种情况下,消息要求我们通过指示每一个外键列应该被考虑,来限定每个relationship(),并且适当的形式如下:

class Customer(Base):
    __tablename__ = 'customer'
    id = Column(Integer, primary_key=True)
    name = Column(String)

    billing_address_id = Column(Integer, ForeignKey("address.id"))
    shipping_address_id = Column(Integer, ForeignKey("address.id"))

    billing_address = relationship("Address", foreign_keys=[billing_address_id])
    shipping_address = relationship("Address", foreign_keys=[shipping_address_id])

在上面,我们指定了foreign_keys参数,它是一个ColumnColumn对象的列表,指示这些列被视为“外部”换句话说,包含引用父表的值的列。Loading the Customer.billing_address relationship from a Customer object will use the value present in billing_address_id in order to identify the row in Address to be loaded; similarly, shipping_address_id is used for the shipping_address relationship. 两栏的联动在坚持中也起着作用。刚刚插入的Address对象的新生成的主键将在刷新期间被复制到关联的Customer对象的相应外键列中。

使用Declarative指定foreign_keys时,我们也可以使用字符串名称来指定,但是如果使用列表,那么列表是字符串的一部分是很重要的:

billing_address = relationship("Address", foreign_keys="[Customer.billing_address_id]")

在这个具体的例子中,这个列表并不是必须的,因为我们只需要一个Column

billing_address = relationship("Address", foreign_keys="Customer.billing_address_id")

版本0.8中改变: relationship()可以单独根据foreign_keys参数解决外键目标之间的歧义;在这种情况下,不再需要primaryjoin参数。

指定交替连接条件

构造连接时,relationship()的默认行为是将主键列的值与外键引用列的值相等。在使用“secondary”表的情况下,我们可以将这个标准更改为我们希望使用primaryjoin参数的任何内容,以及secondaryjoin参数。

在下面的例子中,我们使用User类以及存储街道地址的Address类创建一个关系boston_addresses加载那些指定“波士顿”城市的Address对象:

from sqlalchemy import Integer, ForeignKey, String, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    boston_addresses = relationship("Address",
                    primaryjoin="and_(User.id==Address.user_id, "
                        "Address.city=='Boston')")

class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey('user.id'))

    street = Column(String)
    city = Column(String)
    state = Column(String)
    zip = Column(String)

在这个字符串SQL表达式中,我们使用了and_()连接结构来为连接条件建立两个不同的谓词 - 连接User.idAddress.user_id列,并将Address中的行限制为city='Boston'使用声明时,基本的SQL函数(如and_())在字符串relationship()参数的计算命名空间中自动可用。

我们在primaryjoin中使用的自定义标准通常仅在SQLAlchemy正在呈现SQL以加载或表示此关系时才有意义。也就是说,它被用在为了执行每个属性的延迟加载而发出的SQL语句中,或者在查询时构建连接(例如通过Query.join())或通过渴望“加入”或“子查询”式的加载。当内存中的对象被操纵的时候,我们可以把任何我们想要的Address对象放到boston_addresses集合中,而不管.city属性是。对象将保留在集合中,直到属性过期并从应用该条件的数据库重新加载。当发生刷新时,boston_addresses内部的对象将被无条件地刷新,将主键user.id列的值赋给外键持有的address.user_id列。city条件在这里没有效果,因为flush过程只关心将主键值同步到引用外键值中。

创建自定义外部条件

主要连接条件的另一个要素是如何确定那些被认为是“外来”的列。通常,Column对象的一些子集将指定ForeignKey,否则将成为与联接条件相关的ForeignKeyConstraint的一部分。relationship() looks to this foreign key status as it decides how it should load and persist data for this relationship. 但是,可以使用primaryjoin参数来创建不涉及任何“模式”级外键的连接条件。我们可以明确地将primaryjoinforeign_keysremote_side结合起来,以建立这样的连接。

下面,一个类HostEntry加入到自身,将字符串content列与ip_address列(这是一个名为INET我们需要使用cast()来将连接的一边转换为另一边的类型:

from sqlalchemy import cast, String, Column, Integer
from sqlalchemy.orm import relationship
from sqlalchemy.dialects.postgresql import INET

from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class HostEntry(Base):
    __tablename__ = 'host_entry'

    id = Column(Integer, primary_key=True)
    ip_address = Column(INET)
    content = Column(String(50))

    # relationship() using explicit foreign_keys, remote_side
    parent_host = relationship("HostEntry",
                        primaryjoin=ip_address == cast(content, INET),
                        foreign_keys=content,
                        remote_side=ip_address
                    )

上面的关系会产生一个连接,如:

SELECT host_entry.id, host_entry.ip_address, host_entry.content
FROM host_entry JOIN host_entry AS host_entry_1
ON host_entry_1.ip_address = CAST(host_entry.content AS INET)

An alternative syntax to the above is to use the foreign() and remote() annotations, inline within the primaryjoin expression. 这个语法代表relationship()通常自己应用于给定foreign_keysremote_side参数的连接条件的注释。当显式连接条件存在时,这些函数可能会更加简洁,并且还可以用来精确地标记“外部”或“远程”列,而不管该列是多次声明的还是复杂的SQL表达式:

from sqlalchemy.orm import foreign, remote

class HostEntry(Base):
    __tablename__ = 'host_entry'

    id = Column(Integer, primary_key=True)
    ip_address = Column(INET)
    content = Column(String(50))

    # relationship() using explicit foreign() and remote() annotations
    # in lieu of separate arguments
    parent_host = relationship("HostEntry",
                        primaryjoin=remote(ip_address) == \
                                cast(foreign(content), INET),
                    )

在连接条件中使用自定义运算符

Another use case for relationships is the use of custom operators, such as Postgresql’s “is contained within” << operator when joining with types such as postgresql.INET and postgresql.CIDR. 对于自定义运算符,我们使用Operators.op()函数:

inet_column.op("<<")(cidr_column)

However, if we construct a primaryjoin using this operator, relationship() will still need more information. This is because when it examines our primaryjoin condition, it specifically looks for operators used for comparisons, and this is typically a fixed list containing known comparison operators such as ==, <, etc. 因此,对于我们的自定义操作员参与此系统,我们需要使用is_comparison参数将其注册为比较运算符:

inet_column.op("<<", is_comparison=True)(cidr_column)

一个完整的例子:

class IPA(Base):
    __tablename__ = 'ip_address'

    id = Column(Integer, primary_key=True)
    v4address = Column(INET)

    network = relationship("Network",
                        primaryjoin="IPA.v4address.op('<<', is_comparison=True)"
                            "(foreign(Network.v4representation))",
                        viewonly=True
                    )
class Network(Base):
    __tablename__ = 'network'

    id = Column(Integer, primary_key=True)
    v4representation = Column(CIDR)

上面,一个查询如:

session.query(IPA).join(IPA.network)

将呈现为:

SELECT ip_address.id AS ip_address_id, ip_address.v4address AS ip_address_v4address
FROM ip_address JOIN network ON ip_address.v4address << network.v4representation

版本0.9.2新增: - 添加了Operators.op.is_comparison标志来帮助创建relationship()结构使用自定义运算符。

重叠外键

使用复合外键时可能会出现一个罕见的情况,例如,单个列可能是通过外键约束引用的多个列的主题。

Consider an (admittedly complex) mapping such as the Magazine object, referred to both by the Writer object and the Article object using a composite primary key scheme that includes magazine_id for both; then to make Article refer to Writer as well, Article.magazine_id is involved in two separate relationships; Article.magazine and Article.writer:

class Magazine(Base):
    __tablename__ = 'magazine'

    id = Column(Integer, primary_key=True)


class Article(Base):
    __tablename__ = 'article'

    article_id = Column(Integer)
    magazine_id = Column(ForeignKey('magazine.id'))
    writer_id = Column()

    magazine = relationship("Magazine")
    writer = relationship("Writer")

    __table_args__ = (
        PrimaryKeyConstraint('article_id', 'magazine_id'),
        ForeignKeyConstraint(
            ['writer_id', 'magazine_id'],
            ['writer.id', 'writer.magazine_id']
        ),
    )


class Writer(Base):
    __tablename__ = 'writer'

    id = Column(Integer, primary_key=True)
    magazine_id = Column(ForeignKey('magazine.id'), primary_key=True)
    magazine = relationship("Magazine")

当上面的映射配置完成后,我们会看到这个警告:

SAWarning: relationship 'Article.writer' will copy column
writer.magazine_id to column article.magazine_id,
which conflicts with relationship(s): 'Article.magazine'
(copies magazine.id to article.magazine_id). Consider applying
viewonly=True to read-only relationships, or provide a primaryjoin
condition marking writable columns with the foreign() annotation.

What this refers to originates from the fact that Article.magazine_id is the subject of two different foreign key constraints; it refers to Magazine.id directly as a source column, but also refers to Writer.magazine_id as a source column in the context of the composite key to Writer. If we associate an Article with a particular Magazine, but then associate the Article with a Writer that’s associated with a different Magazine, the ORM will overwrite Article.magazine_id non-deterministically, silently changing which magazine we refer towards; it may also attempt to place NULL into this columnn if we de-associate a Writer from an Article. 警告让我们知道这是事实。

为了解决这个问题,我们需要打破Article的行为来包含以下所有三个特性:

  1. Article first and foremost writes to Article.magazine_id based on data persisted in the Article.magazine relationship only, that is a value copied from Magazine.id.
  2. Article can write to Article.writer_id on behalf of data persisted in the Article.writer relationship, but only the Writer.id column; the Writer.magazine_id column should not be written into Article.magazine_id as it ultimately is sourced from Magazine.id.
  3. Article takes Article.magazine_id into account when loading Article.writer, even though it doesn’t write to it on behalf of this relationship.

为了得到#1和#2,我们可以只指定Article.writer_id作为Article.writer的“外键”:

class Article(Base):
    # ...

    writer = relationship("Writer", foreign_keys='Article.writer_id')

但是,当查询Writer时,这会影响Article.writer不考虑Article.magazine_id

SELECT article.article_id AS article_article_id,
    article.magazine_id AS article_magazine_id,
    article.writer_id AS article_writer_id
FROM article
JOIN writer ON writer.id = article.writer_id

因此,为了获得#1,#2和#3的全部,我们通过将primaryjoin完整地与foreign_keys参数,或者用foreign()进行注释更简洁:

class Article(Base):
    # ...

    writer = relationship(
        "Writer",
        primaryjoin="and_(Writer.id == foreign(Article.writer_id), "
                    "Writer.magazine_id == Article.magazine_id)")

版本1.0.0更改: ORM将尝试警告当一列同时用作多个关系的同步目标。

非关系比较/物化路径

警告

本节详细介绍了一个实验功能。

使用自定义表达式意味着我们可以产生不符合常规主/外关键字模型的非正统连接条件。一个这样的例子是物化路径模式,在这里我们比较重叠路径令牌的字符串以产生树结构。

通过仔细使用foreign()remote(),我们可以建立起一个关系,有效地产生一个基本的物化路径系统。Essentially, when foreign() and remote() are on the same side of the comparison expression, the relationship is considered to be “one to many”; when they are on different sides, the relationship is considered to be “many to one”. 为了比较我们将在这里使用,我们将处理集合,所以我们保持东西配置为“一对多”:

class Element(Base):
    __tablename__ = 'element'

    path = Column(String, primary_key=True)

    descendants = relationship('Element',
                           primaryjoin=
                                remote(foreign(path)).like(
                                        path.concat('/%')),
                           viewonly=True,
                           order_by=path)

在上面,如果有一个Element对象的路径属性为"/foo/bar2",我们寻找一个Element.descendants看起来像:

SELECT element.path AS element_path
FROM element
WHERE element.path LIKE ('/foo/bar2' || '/%') ORDER BY element.path

版本0.9.5中的新增功能:已添加支持以允许在primaryjoin条件中对自身进行单列比较,以及对使用ColumnOperators.like()

自我指涉的多对多关系

多对多关系可以由primaryjoinsecondaryjoin中的一个或两个来定制 - 后者对于指定使用secondary参数。涉及到primaryjoinsecondaryjoin使用的一个常见情况是在从类到自身之间建立多对多的关系时,如下所示:

from sqlalchemy import Integer, ForeignKey, String, Column, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

node_to_node = Table("node_to_node", Base.metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String)
    right_nodes = relationship("Node",
                        secondary=node_to_node,
                        primaryjoin=id==node_to_node.c.left_node_id,
                        secondaryjoin=id==node_to_node.c.right_node_id,
                        backref="left_nodes"
    )

在上面的地方,SQLAlchemy不能自动知道哪些列应该连接到right_nodesleft_nodes关系。primaryjoinsecondaryjoin参数确定了我们想要如何加入关联表。在上面的声明式表单中,当我们在对应于Node类的Python块中声明这些条件时,id变量可直接作为Column

Alternatively, we can define the primaryjoin and secondaryjoin arguments using strings, which is suitable in the case that our configuration does not have either the Node.id column object available yet or the node_to_node table perhaps isn’t yet available. 当在一个声明性字符串中引用一个普通的Table对象时,我们使用该表的字符串名称,因为它存在于MetaData中:

class Node(Base):
    __tablename__ = 'node'
    id = Column(Integer, primary_key=True)
    label = Column(String)
    right_nodes = relationship("Node",
                        secondary="node_to_node",
                        primaryjoin="Node.id==node_to_node.c.left_node_id",
                        secondaryjoin="Node.id==node_to_node.c.right_node_id",
                        backref="left_nodes"
    )

这里的经典映射情况是类似的,其中node_to_node可以连接到node.c.id

from sqlalchemy import Integer, ForeignKey, String, Column, Table, MetaData
from sqlalchemy.orm import relationship, mapper

metadata = MetaData()

node_to_node = Table("node_to_node", metadata,
    Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
    Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
)

node = Table("node", metadata,
    Column('id', Integer, primary_key=True),
    Column('label', String)
)
class Node(object):
    pass

mapper(Node, node, properties={
    'right_nodes':relationship(Node,
                        secondary=node_to_node,
                        primaryjoin=node.c.id==node_to_node.c.left_node_id,
                        secondaryjoin=node.c.id==node_to_node.c.right_node_id,
                        backref="left_nodes"
                    )})

请注意,在这两个例子中,backref关键字指定一个left_nodes backref - 当relationship()创建反方向的第二个关系时,足以扭转primaryjoinsecondaryjoin参数。

复合“次要”加入

注意

本节介绍SQLAlchemy的一些新的和实验性的功能。

有时,当人们试图在两个表之间建立一个relationship()这是relationship()的一个领域,它试图推动可能的边界,并且通常需要在SQLAlchemy邮件列表上敲定许多这些外来用例的最终解决方案。

在最近版本的SQLAlchemy中,为了提供由多个表组成的复合目标,在这些情况下可以使用secondary参数。下面是一个这样的连接条件的例子(要求版本0.9.2至少要起作用):

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))

    d = relationship("D",
                secondary="join(B, D, B.d_id == D.id)."
                            "join(C, C.d_id == D.id)",
                primaryjoin="and_(A.b_id == B.id, A.id == C.a_id)",
                secondaryjoin="D.id == B.d_id",
                uselist=False
                )

class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    d_id = Column(ForeignKey('d.id'))

class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))
    d_id = Column(ForeignKey('d.id'))

class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)

In the above example, we provide all three of secondary, primaryjoin, and secondaryjoin, in the declarative style referring to the named tables a, b, c, d directly. AD的查询如下所示:

sess.query(A).join(A.d).all()

SELECT a.id AS a_id, a.b_id AS a_b_id FROM a JOIN ( b AS b_1 JOIN d AS d_1 ON b_1.d_id = d_1.id JOIN c AS c_1 ON c_1.d_id = d_1.id) ON a.b_id = b_1.id AND a.id = c_1.a_id JOIN d ON d.id = b_1.d_id

在上面的例子中,我们利用能够将多个表填充到“二级”容器中,以便我们可以跨多个表加入,同时仍然保持relationship()的“简单”在“左”和“右”方面只有“一张”表;复杂性保持在中间。

版本0.9.2中的新增功能:允许将join()结构直接用作secondary参数的目标,包括对连接的支持,急切的加入和延迟加载,以及在声明式中支持指定复杂条件,如将类名作为目标的连接。

与非主映射器的关系

在上一节中,我们举例说明了一种技术,我们使用secondary为了在连接条件中放置额外的表。有一个复杂的连接案例,即使这种技术是不够的,当我们试图从A加入B时,可以使用任何数量的CD等。之间,但是也有AB 直接之间的连接条件。In this case, the join from A to B may be difficult to express with just a complex primaryjoin condition, as the intermediary tables may need special handling, and it is also not expressable with a secondary object, since the A->secondary->B pattern does not support any references between A and B directly. 当这个非常高级的情况出现时,我们可以求助于创建第二个映射作为关系的目标。这是我们使用mapper()的地方,以便映射到一个包含我们所需的所有附加表的类。为了生成这个映射器作为我们类的“替代”映射,我们使用non_primary标志。

Below illustrates a relationship() with a simple join from A to B, however the primaryjoin condition is augmented with two additional entities C and D, which also must have rows that line up with the rows in both A and B simultaneously:

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))

class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)

class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))

class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)
    c_id = Column(ForeignKey('c.id'))
    b_id = Column(ForeignKey('b.id'))

# 1. set up the join() as a variable, so we can refer
# to it in the mapping multiple times.
j = join(B, D, D.b_id == B.id).join(C, C.id == D.c_id)

# 2. Create a new mapper() to B, with non_primary=True.
# Columns in the join with the same name must be
# disambiguated within the mapping, using named properties.
B_viacd = mapper(B, j, non_primary=True, properties={
    "b_id": [j.c.b_id, j.c.d_b_id],
    "d_id": j.c.d_id
    })

A.b = relationship(B_viacd, primaryjoin=A.b_id == B_viacd.c.b_id)

在上面的例子中,当我们查询的时候,我们的B的非主映射器会发出额外的列;这些可以被忽略:

sess.query(A).join(A.b).all()

SELECT a.id AS a_id, a.b_id AS a_b_id FROM a JOIN (b JOIN d ON d.b_id = b.id JOIN c ON c.id = d.c_id) ON a.b_id = b.id

构建启用查询的属性

非常雄心勃勃的自定义连接条件可能无法直接持久,在某些情况下甚至可能无法正确加载。要删除等式的持久性部分,请使用relationship()上的标记viewonly,将其建立为只读属性(写入集合的数据将是在flush()上被忽略)。但是,在极端情况下,请考虑将常规Python属性与Query结合使用,如下所示:

class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)

    def _get_addresses(self):
        return object_session(self).query(Address).with_parent(self).filter(...).all()
    addresses = property(_get_addresses)