How does Delta Lake manage feature compatibility?
Many Delta Lake optimizations require enabling Delta Lake features on a table. Delta Lake features are always backwards compatible, so tables written by a lower Delta Lake version can always be read and written by a higher Delta Lake version. Enabling some features breaks forward compatibility with workloads running in a lower Delta Lake version. For features that break forward compatibility, you must update all workloads that reference the upgraded tables to use a compliant Delta Lake version.
What Delta Lake features require client upgrades?
The following Delta Lake features break forward compatibility. Features are enabled on a table-by-table basis.
Feature | Requires Delta Lake version or later | Documentation |
---|---|---|
CHECK constraints | Delta Lake 0.8.0 | CHECK constraint |
Generated columns | Delta Lake 1.0.0 | Use generated columns |
Column mapping | Delta Lake 1.2.0 | Delta column mapping |
Change data feed | Delta Lake 2.0.0 | Change data feed |
Deletion vectors | Delta Lake 2.3.0 | What are deletion vectors? |
Table features | Delta Lake 2.3.0 | What are table features? |
Timestamp without Timezone | Delta Lake 2.4.0 | TimestampNTZType |
Iceberg Compatibility V1 | Delta Lake 3.0.0 | IcebergCompatV1 |
Iceberg Compatibility V2 | Delta Lake 3.1.0 | IcebergCompatV2 |
V2 Checkpoints | Delta Lake 3.0.0 | V2 Checkpoint Spec |
Domain metadata | Delta Lake 3.0.0 | Domain Metadata Spec |
Clustering | Delta Lake 3.1.0 | Use liquid clustering for Delta tables |
Row Tracking | Delta Lake 3.2.0 | Use row tracking for Delta tables |
Type widening (Preview) | Delta Lake 3.2.0 | Delta type widening |
Identity columns | Delta Lake 3.3.0 | Use identity columns |
In-Commit Timestamps | Delta Lake 3.3.0 | In-Commit Timestamps Spec |
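To decide which workloads need upgrading, you can compare each client's Delta Lake version against the minimums in the table above. The following is a minimal sketch and not part of the Delta Lake API: `MIN_DELTA_VERSION` copies a few rows from the table, and `parse_version`, `required_version`, and `needs_upgrade` are hypothetical helper names.

```python
# Illustrative only: a few rows copied from the table above.
MIN_DELTA_VERSION = {
    "CHECK constraints": "0.8.0",
    "Generated columns": "1.0.0",
    "Column mapping": "1.2.0",
    "Change data feed": "2.0.0",
    "Deletion vectors": "2.3.0",
    "Row Tracking": "3.2.0",
}


def parse_version(version):
    """Turn '2.3.0' into (2, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def required_version(features):
    """Return the lowest Delta Lake version that supports every feature."""
    return max((MIN_DELTA_VERSION[f] for f in features), key=parse_version)


def needs_upgrade(client_version, features):
    """True if a client is older than the version the features require."""
    return parse_version(client_version) < parse_version(required_version(features))
```

For example, a workload on Delta Lake 1.0.0 would need an upgrade before it can use a table with column mapping enabled, because the table above lists 1.2.0 as the minimum.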
What is a table protocol specification?
Every Delta table has a protocol specification that indicates the set of features the table supports. Applications that read or write the table use the protocol specification to determine whether they can handle all of those features. If an application does not know how to handle a feature that is listed as supported in the protocol of a table, then that application is not able to read or write that table.
The protocol specification is separated into two components: the read protocol and the write protocol.
Read protocol
The read protocol lists all features that a table supports and that an application must understand in order to read the table correctly. Upgrading the read protocol of a table requires that all reader applications support the added features.
Write protocol
The write protocol lists all features that a table supports and that an application must understand in order to write to the table correctly. Upgrading the write protocol of a table requires that all writer applications support the added features. It does not affect read-only applications, unless the read protocol is also upgraded.
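The read and write checks described above can be sketched as two version comparisons. This is an illustration under stated assumptions, not Delta Lake's implementation: the `Protocol` and `Client` classes and both functions are hypothetical, and the sketch assumes that a writer must also satisfy the read protocol because it has to read the table before writing to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Protocol versions that a table requires."""
    min_reader_version: int
    min_writer_version: int


@dataclass(frozen=True)
class Client:
    """Highest protocol versions that a (hypothetical) client implements."""
    max_reader_version: int
    max_writer_version: int


def can_read(client, table):
    return client.max_reader_version >= table.min_reader_version


def can_write(client, table):
    # Assumption: writing also requires being able to read the table.
    return can_read(client, table) and (
        client.max_writer_version >= table.min_writer_version
    )
```

With a table at reader version 1 and writer version 3, a client that implements reader version 1 and writer version 2 can still read the table but can no longer write to it.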
Which protocols must be upgraded?
Some features require upgrading both the read protocol and the write protocol. Other features only require upgrading the write protocol.
As an example, support for CHECK constraints is a write protocol feature: only writing applications need to know about CHECK constraints and enforce them.
In contrast, column mapping requires upgrading both the read and write protocols. Because the data is stored differently in the table, reader applications must understand column mapping so they can read the data correctly.
For more on upgrading, see Upgrading protocol versions.
What are table features?
In Delta Lake 2.3.0 and above, Delta Lake table features introduce granular flags specifying which features are supported by a given table. Table features are the successor to protocol versions and are designed with the goal of improved flexibility for clients that read and write Delta Lake. See What is a protocol version?.
A Delta table feature is a marker that indicates that the table supports a particular feature. Every feature is either a write protocol feature (meaning it only upgrades the write protocol) or a read/write protocol feature (meaning both read and write protocols are upgraded to enable the feature).
To learn more about supported table features in Delta Lake, see the Delta Lake protocol.
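With table features, a compatibility check becomes a set comparison on feature names. In the Delta Lake protocol, a table at reader version 3 and writer version 7 lists its features by name in the `readerFeatures` and `writerFeatures` fields of the protocol action. The sketch below assumes that layout; the `unsupported_features` helper and the client feature sets are hypothetical.

```python
def unsupported_features(protocol, client_reader_features, client_writer_features):
    """Return (missing_for_read, missing_for_write) as sorted lists."""
    missing_read = set(protocol.get("readerFeatures", [])) - set(client_reader_features)
    missing_write = set(protocol.get("writerFeatures", [])) - set(client_writer_features)
    return sorted(missing_read), sorted(missing_write)


# Example protocol action for a table with deletion vectors (a read/write
# feature) and CHECK constraints (a write-only feature) enabled.
protocol = {
    "minReaderVersion": 3,
    "minWriterVersion": 7,
    "readerFeatures": ["deletionVectors"],
    "writerFeatures": ["deletionVectors", "checkConstraints"],
}

# Hypothetical client that implements deletion vectors but not CHECK constraints.
missing_read, missing_write = unsupported_features(
    protocol,
    client_reader_features={"deletionVectors"},
    client_writer_features={"deletionVectors"},
)
```

In this example the client can read the table because nothing is missing on the read side, but it should refuse to write because it does not implement `checkConstraints`.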
Do table features change how Delta Lake features are enabled?
If you only interact with Delta tables through Delta Lake, you can continue to track support for Delta Lake features using minimum Delta Lake requirements. If you read and write Delta tables using other systems, you might need to consider how table features impact compatibility, because those systems might not understand the upgraded protocol versions.
What is a protocol version?
A protocol version is a protocol number that indicates a particular grouping of table features. In Delta Lake versions below 2.3.0, you cannot enable table features individually; protocol versions bundle a group of features.
Delta tables specify a separate protocol version for read protocol and write protocol. The transaction log for a Delta table contains protocol versioning information that supports Delta Lake evolution.
The protocol versions bundle all features from previous protocols. See Features by protocol version.
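To see where the protocol versions live, note that each commit in a table's `_delta_log` directory is a file of newline-delimited JSON actions, and the versions appear in a `protocol` action. The following sketch scans such lines for the most recent protocol action. The `latest_protocol` helper and the sample commit contents are made up for illustration, and a real reader would also need to consult checkpoints and replay commits in order.

```python
import json


def latest_protocol(log_lines):
    """Return the last protocol action found in newline-delimited JSON actions."""
    protocol = None
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)
        if "protocol" in action:
            protocol = action["protocol"]
    return protocol


# Sample commit contents (made up for illustration).
sample_commit = [
    '{"commitInfo": {"operation": "CREATE TABLE"}}',
    '{"protocol": {"minReaderVersion": 1, "minWriterVersion": 2}}',
    '{"metaData": {"id": "example-table-id"}}',
]

table_protocol = latest_protocol(sample_commit)
```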
Features by protocol version
The following table shows minimum protocol versions required for Delta Lake features.
Feature | minWriterVersion | minReaderVersion | Documentation |
---|---|---|---|
Basic functionality | 2 | 1 | Welcome to the Delta Lake documentation |
CHECK constraints | 3 | 1 | CHECK constraint |
Change data feed | 4 | 1 | Change data feed |
Generated columns | 4 | 1 | Use generated columns |
Column mapping | 5 | 2 | Delta column mapping |
Identity columns | 6 | 1 | Use identity columns |
Table features read | 7 | 1 | What are table features? |
Table features write | 7 | 3 | What are table features? |
Deletion vectors | 7 | 3 | What are deletion vectors? |
Timestamp without Timezone | 7 | 3 | TimestampNTZType |
Iceberg Compatibility V1 | 7 | 2 | IcebergCompatV1 |
V2 Checkpoints | 7 | 3 | V2 Checkpoint Spec |
Vacuum Protocol Check | 7 | 3 | Vacuum Protocol Check Spec |
Row Tracking | 7 | 3 | Use row tracking for Delta tables |
Type widening (Preview) | 7 | 3 | Delta type widening |
In-Commit Timestamps | 7 | 3 | In-Commit Timestamps Spec |
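Because each protocol version bundles the features of the versions before it, the lowest protocol versions for a set of features are the maximum of each column in the table above. The sketch below works through that arithmetic for the numbered (pre-table-features) rows. `MIN_PROTOCOL` copies a subset of the table, and `lowest_protocol` is a hypothetical helper, not a Delta Lake function.

```python
# (minWriterVersion, minReaderVersion) pairs copied from the table above.
MIN_PROTOCOL = {
    "Basic functionality": (2, 1),
    "CHECK constraints": (3, 1),
    "Change data feed": (4, 1),
    "Generated columns": (4, 1),
    "Column mapping": (5, 2),
    "Identity columns": (6, 1),
    "Deletion vectors": (7, 3),
}


def lowest_protocol(features):
    """Return (min_reader, min_writer) covering every requested feature."""
    writer, reader = MIN_PROTOCOL["Basic functionality"]  # baseline for any table
    for feature in features:
        feature_writer, feature_reader = MIN_PROTOCOL[feature]
        writer = max(writer, feature_writer)
        reader = max(reader, feature_reader)
    return reader, writer
```

The result is returned in reader, writer order. For example, `lowest_protocol(["CHECK constraints"])` gives `(1, 3)`, and adding column mapping raises it to `(2, 5)`.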
Upgrading protocol versions
You can choose to manually update a table to a newer protocol version. We recommend using the lowest protocol versions that support the Delta Lake features required for your table. Upgrading the writer protocol might cause less disruption than upgrading the reader protocol since systems and workloads using older Delta Lake versions can still read from tables, even if they do not support the updated writer protocol.
To upgrade a table to a newer protocol version, set the protocol table properties in SQL or use the DeltaTable.upgradeTableProtocol method:
SQL

    -- Upgrades the reader protocol version to 1 and the writer protocol version to 3.
    ALTER TABLE <table_identifier> SET TBLPROPERTIES('delta.minReaderVersion' = '1', 'delta.minWriterVersion' = '3')

Python

    from delta.tables import DeltaTable

    delta = DeltaTable.forPath(spark, "path_to_table")  # or DeltaTable.forName
    delta.upgradeTableProtocol(1, 3)  # Upgrades to readerVersion=1, writerVersion=3.

Scala

    import io.delta.tables.DeltaTable

    val delta = DeltaTable.forPath(spark, "path_to_table") // or DeltaTable.forName
    delta.upgradeTableProtocol(1, 3) // Upgrades to readerVersion=1, writerVersion=3.