Dropping Partitions
Background
Partitioning is specified on a table when it is initially defined via the PARTITION BY
clause. This clause can contain a SQL expression on any columns, the results of which will define which partition a row is sent to.
The data ^^parts^^ are logically associated with each partition on disk and can be queried in isolation. For the example below, we partition the posts
table by year using the expression toYear(CreationDate)
. As rows are inserted into ClickHouse, this expression will be evaluated against each row and routed to the resulting partition if it exists (if the row is the first for a year, the partition will be created).
Read about setting the partition expression in a section How to set the partition expression.
In ClickHouse, users should principally consider partitioning to be a data management feature, not a query optimization technique. By separating data logically based on a key, each partition can be operated on independently e.g. deleted. This allows users to move partitions, and thus subsets, between storage tiers efficiently on time or expire data/efficiently delete from the cluster.
Drop partitions
ALTER TABLE ... DROP PARTITION
provides a cost-efficient way to drop a whole partition.
This query tags the partition as inactive and deletes data completely, approximately in 10 minutes. The query is replicated – it deletes data on all replicas.
In example, below we remove posts from 2008 for the earlier table by dropping the associated partition.