{"id":18055816,"url":"https://github.com/hbutani/icebergsql","last_synced_at":"2025-04-11T02:06:22.226Z","repository":{"id":141965228,"uuid":"205969793","full_name":"hbutani/icebergSQL","owner":"hbutani","description":"Integration of Iceberg table management into Spark SQL","archived":false,"fork":false,"pushed_at":"2020-01-21T04:42:57.000Z","size":1381,"stargazers_count":11,"open_issues_count":2,"forks_count":4,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-11T02:02:50.406Z","etag":null,"topics":["iceberg","spark","sql"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hbutani.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-03T02:07:39.000Z","updated_at":"2021-09-05T06:19:29.000Z","dependencies_parsed_at":"2023-05-09T14:19:36.040Z","dependency_job_id":null,"html_url":"https://github.com/hbutani/icebergSQL","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbutani%2FicebergSQL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbutani%2FicebergSQL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbutani%2FicebergSQL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hbutani%2FicebergSQL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hbutani","download_url":"https://codeload.github.com/hbutani/ice
bergSQL/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248328170,"owners_count":21085261,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["iceberg","spark","sql"],"created_at":"2024-10-31T01:12:01.004Z","updated_at":"2025-04-11T02:06:22.216Z","avatar_url":"https://github.com/hbutani.png","language":"Scala","readme":"# Integrating Iceberg Table Management into Spark SQL\n\n[Iceberg](https://iceberg.apache.org/spec/) introduces the\nconcept of *Table formats* (as opposed to File Formats) that defines an access\nand management model for Tables in Big Data systems. At its\ncore it is a well-documented and portable specification of versionable Table\nmetadata (both physical and logical metadata). On top of this it provides a set\nof capabilities: Snapshot isolation; significant speed and simplicity in access\nof Table metadata, which keeps Query Planning overhead low (even in the case of datasets with millions\nof files and partitions); schema evolution; and partition layout isolation (hence\nthe ability to change physical layout without changing Applications). \n\nThese capabilities fill some very critical gaps in\nTable management in Big Data systems, and hence various open source communities\nhave quickly adopted/integrated Iceberg functionality. 
Iceberg was initially developed\nat Netflix; subsequently (likely because of its wide appeal) Netflix has\ngraciously incubated it as an [Apache project](https://github.com/apache/incubator-iceberg).\n\nIceberg does this by defining clear contracts for underlying file formats, such\nas how schema and statistics from these formats are made available to Iceberg. It\nprescribes how Iceberg capabilities can be integrated into existing Big Data\nscenarios through various packaged components such as *iceberg-parquet*, *iceberg-orc* and\n*iceberg-avro* for applications that directly manage *parquet, orc and avro*\nfiles. Further, *iceberg-hive*, *iceberg-presto* and *iceberg-spark* are packaged jars\nthat can be dropped into scenarios that use *hive, presto and spark* and want to leverage\nIceberg to manage datasets. \n\n[Apache Spark](https://spark.apache.org/) is rapidly gaining traction as a platform for Enterprise Analytical\nworkloads: for example our [Oracle SNAP Spark native OLAP Platform](https://tinyurl.com/y8hbyp9q) \n([see also](https://tinyurl.com/y59k23bf)) is used by a Fortune 10 company to\npower their Finance Data Lake. These tend to be **SQL heavy** (in fact almost\nexclusively SQL based) solutions. It is a time-honored tradition to surface\nanalytical and management functionality in SQL, for example as SQL Row and Table\nfunctions, or Database options like OLAP and Geospatial capabilities.\nData Management is a critical aspect of an\nAnalytical platform, but unfortunately it is an underdeveloped component of Apache\nSpark. This has led customers to come up with their own data management schemes, \nsuch as using Hive ACID Tables for data management with\nSpark for query processing, or custom solutions using ETL platforms and tools\nsuch as Talend and Airflow. 
Providing Table management that is seamlessly\nintegrated into familiar SQL verbs such as *create table*, *insert*, and *select*\nsimplifies the task of developing Analytical solutions on Apache Spark and\nwill drive further adoption.\n\nFor [Apache Spark](https://spark.apache.org/), Iceberg integration is not fully available at the SQL layer. \nThere is work going on to surface \n[Iceberg Table Management as a V2 Datasource table](https://databricks.com/session/apache-spark-data-source-v2), \nbut V2 Datasources themselves are [not fully integrated into Spark SQL](https://tinyurl.com/y5u576gk) \n([see also](https://tinyurl.com/yylna72p)). Given the significance of Apache Spark 2.x we feel it is useful\nto provide Table Management for Datasource V1 tables, bringing this\nfunctionality to a large deployed base. These reasons led us to develop the ability to use \nIceberg Table Management capabilities with Spark SQL, specifically for Datasource V1 tables. Our component will:\n- allow users to **create managed tables** and define source column to partition\n  column transformations as table options.  \n- have **SQL insert statements create new Iceberg Table snapshots**\n- have **SQL select statements leverage Iceberg Table snapshots** for partition\n  and file pruning\n- provide **a new 'as of' clause to the SQL select statement** to run a query against a\n  particular snapshot of a managed table.\n- **extend Spark SQL with Iceberg management views and statements** to view and manage the\n  snapshots of a managed table.\n\n## How to use the functionality?\n\n- set spark.sql.extensions=org.apache.spark.sql.iceberg.planning.SparkSessionExtensions\n- set up the classpath via\n  - spark.executor.extraClassPath + spark.driver.extraClassPath\n  - OR\n  - spark.jars\n\n## A detailed example\n\n### Table creation\nConsider the following definition of a `store_sales_out` table. 
\n```sql\ncreate table if not exists store_sales_out\n    (\n      ss_sold_time_sk           int,\n      ss_item_sk                int,\n      ss_customer_sk            int,\n      ss_cdemo_sk               int,\n      ss_hdemo_sk               int,\n      ss_addr_sk                int,\n      ss_store_sk               int,\n      ss_promo_sk               int,\n      ss_quantity               int,\n      ss_wholesale_cost         decimal(7,2),\n      ss_list_price             decimal(7,2),\n      ss_sales_price            decimal(7,2),\n      ss_ext_sales_price        decimal(7,2),\n      ss_sold_month             string,\n      ss_sold_day               string,\n      ss_sold_date_sk string\n    )\n    USING parquet\n    OPTIONS (\n      path \"src/test/resources/store_sales\",\n      addTableManagement \"true\",\n      columnDependencies \"ss_sold_date_sk=ss_sold_month:truncate[2],ss_sold_date_sk=ss_sold_day:truncate[4]\"\n    )\n    partitioned by (ss_sold_date_sk)\n```\n\nThis is a regular Spark Datasource V1 table that is partitioned on `ss_sold_date_sk`.\nIt is defined with two extra options. Setting the **addTableManagement** option to true will make this a table \nthat is integrated with the Table management infrastructure from *Iceberg*. The \n**columnDependencies** option can be used to define functional dependencies between table columns in terms\nof *Iceberg Transforms*; these will be used for partition and datafile pruning during query planning.\nMore on this when we discuss querying.\n\nAssume there is another `store_sales` table with the same schema that is used to \ninsert/update the `store_sales_out` table. This entire example is in the\n _BasicCreateAndInsertTest_ test class. We refer you to this class for the DDL of the `store_sales` table and \n other details in this example.\n \nInitially the `store_sales_out` table has no snapshots, as can be seen from the output of the `showSnapShots` invocation\nin _BasicCreateAndInsertTest_. 
We will be providing a `snapshots` view shortly, so users will be able to\nissue a `select * from \u003ctable_name\u003e$snapshots` (this is similar to the `snap$` views we have in our SNAP product).\nThe table identifier has the form \u003ctable_name\u003e followed by the string `$snapshots`. So for `store_sales_out` \nyou would issue a `select * from store_sales_out$snapshots`.\n\n```\nInitially no snapShots:\nselect * from `store_sales_out$snapshots`\n+---+--------+----------+-------------+----------------+--------------------+\n|id |parentId|timeMillis|numAddedFiles|numdDeletedFiles|manifestListLocation|\n+---+--------+----------+-------------+----------------+--------------------+\n+---+--------+----------+-------------+----------------+--------------------+\n```\n\n### Insert into store_sales_out without any partition specification.\nLet's insert into the `store_sales_out` table. So we issue\n```sql\ninsert into  store_sales_out \n  select  *  from store_sales\n```\n\nThis creates a SnapShot with 30 files added. 
The `store_sales` table has `6` partitions with 5 files in each partition.\n```\nselect * from `store_sales_out$snapshots`\n\n+-------------------+--------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|id                 |parentId|timeMillis   |numAddedFiles|numdDeletedFiles|manifestListLocation                                                                                                                                |\n+-------------------+--------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|8311655904283006343|-1      |1566875511640|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-8311655904283006343-1-51cd6794-8cc1-4433-8055-d268dbe62202.avro|\n+-------------------+--------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n```\n\nThe table now has 2109 rows.\n```\nselect count(*) from store_sales_out;\n\n+--------+\n|count(1)|\n+--------+\n|2109    |\n+--------+\n```\n\n### Query with ss_sold_date_sk='0906245' has predicate ss_sold_month='09' added\nRunning a query with a partition filter:\n```sql\nselect count(*) \nfrom store_sales_out \nwhere ss_sold_date_sk='0906245'\n```\n\nWe have defined that `ss_sold_month` is related to `ss_sold_date_sk` via a `truncate` transformation.\nSo for this query the predicate `ss_sold_month='09'` is pushed to the _TableScan_ Iceberg operation.\nThis can be observed by introspecting the `icebergFilter` property of the _IceTableScanExec_ physical\noperator.\n\nThe output shows there are 236 rows in this 
partition.\n```\n+--------+\n|count(1)|\n+--------+\n|236     |\n+--------+\n```\n\n### Issue another Insert into store_sales_out without any partition specification.\n\nLet's issue another insert into the `store_sales_out` table. \n```sql\ninsert into  store_sales_out \n  select  *  from store_sales\n```\nThis creates another SnapShot with 30 files added.\n```\nselect * from `store_sales_out$snapshots`\n\n+-------------------+------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|id                 |parentId          |timeMillis   |numAddedFiles|numdDeletedFiles|manifestListLocation                                                                                                                                |\n+-------------------+------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|369432757121624247 |-1                |1566958072042|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-369432757121624247-1-24288117-f3b9-4f85-aa94-b943cabd844d.avro |\n|2542920950855973853|369432757121624247|1566958075214|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-2542920950855973853-1-39abbb75-1022-4ca2-b274-1ea10f445a9b.avro|\n+-------------------+------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n```\n\nThe table now has 4218 rows.\n```\nselect count(*) from store_sales_out;\n\n+--------+\n|count(1)|\n+--------+\n|4218    |\n+--------+\n```\n\n### Run select as of 
first insert\nIf we query the table as of the first insert, we still see `2109` rows\n```sql\nas of '2019-09-15 20:32:24.062'\nselect count(*) from store_sales_out\n\n+--------+\n|count(1)|\n+--------+\n|2109    |\n+--------+\n\n```\n\n### Issue an Insert Overwrite into store_sales_out without any partition specification.\n\nNow let's issue an insert overwrite on the `store_sales_out` table. \n```sql\ninsert overwrite table  store_sales_out \n  select  *  from store_sales\n```\n\nThis creates a SnapShot with 30 files added and 60 files deleted.\n```\n\nselect * from `store_sales_out$snapshots`\n\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|id                 |parentId           |timeMillis   |numAddedFiles|numdDeletedFiles|manifestListLocation                                                                                                                                |\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|369432757121624247 |-1                 |1566958072042|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-369432757121624247-1-24288117-f3b9-4f85-aa94-b943cabd844d.avro |\n|2542920950855973853|369432757121624247 |1566958075214|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-2542920950855973853-1-39abbb75-1022-4ca2-b274-1ea10f445a9b.avro|\n|6277089168341855684|2542920950855973853|1566958077282|30           |60              
|/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-6277089168341855684-1-99eeb5d3-7b1b-4c54-ba3e-2cb1d6946cbe.avro|\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n```\n\nThe table again has 2109 rows.\n```\nselect count(*) from store_sales_out;\n\n+--------+\n|count(1)|\n+--------+\n|2109    |\n+--------+\n```\n\n### Insert overwrite of 1 partition\n\nNext we insert overwrite 1 partition of store_sales_out\n\n```sql\ninsert overwrite table  store_sales_out partition ( ss_sold_date_sk='0906245' )\n  select ss_sold_time_sk,ss_item_sk,ss_customer_sk,ss_cdemo_sk,ss_hdemo_sk,ss_addr_sk,\n       ss_store_sk,ss_promo_sk,ss_quantity,ss_wholesale_cost,ss_list_price,ss_sales_price,\n       ss_ext_sales_price,ss_sold_month,ss_sold_day from store_sales\n  where ss_sold_date_sk='0906245' \n\n```\n\nThis creates a SnapShot with 5 files added and 5 files deleted.\n```\n\nselect * from `store_sales_out$snapshots`\n\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|id                 |parentId           |timeMillis   |numAddedFiles|numdDeletedFiles|manifestListLocation                                                                                                                                |\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|369432757121624247 |-1                 |1566958072042|30           |0               
|/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-369432757121624247-1-24288117-f3b9-4f85-aa94-b943cabd844d.avro |\n|2542920950855973853|369432757121624247 |1566958075214|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-2542920950855973853-1-39abbb75-1022-4ca2-b274-1ea10f445a9b.avro|\n|6277089168341855684|2542920950855973853|1566958077282|30           |60              |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-6277089168341855684-1-99eeb5d3-7b1b-4c54-ba3e-2cb1d6946cbe.avro|\n|4984732539170247398|6277089168341855684|1566958078575|5            |5               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-4984732539170247398-1-00a53e56-fd8f-4293-9554-41f2b89ae2d2.avro|\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n```\n\nThe table still has 2109 rows.\n```\nselect count(*) from store_sales_out;\n\n+--------+\n|count(1)|\n+--------+\n|2109    |\n+--------+\n```\n\n### Insert overwrite of 1 partition with predicates on the source data\nWe run an insert overwrite on 1 partition again, but now we have an extra predicate on source rows\nthat reduces the number of rows inserted into this partition.\n\n```sql\ninsert overwrite table  store_sales_out partition ( ss_sold_date_sk='0905245' )\n  select ss_sold_time_sk,ss_item_sk,ss_customer_sk,ss_cdemo_sk,ss_hdemo_sk,\n         ss_addr_sk,ss_store_sk,ss_promo_sk,ss_quantity,ss_wholesale_cost,ss_list_price,\n         ss_sales_price,ss_ext_sales_price,ss_sold_month,ss_sold_day from store_sales\n  where ss_sold_date_sk='0905245' and ss_item_sk \u003c 5000 \n```\n\nThis creates a SnapShot with 5 files added and 5 files deleted. 
\n```\n\nselect * from `store_sales_out$snapshots`\n\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|id                 |parentId           |timeMillis   |numAddedFiles|numdDeletedFiles|manifestListLocation                                                                                                                                |\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n|369432757121624247 |-1                 |1566958072042|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-369432757121624247-1-24288117-f3b9-4f85-aa94-b943cabd844d.avro |\n|2542920950855973853|369432757121624247 |1566958075214|30           |0               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-2542920950855973853-1-39abbb75-1022-4ca2-b274-1ea10f445a9b.avro|\n|6277089168341855684|2542920950855973853|1566958077282|30           |60              |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-6277089168341855684-1-99eeb5d3-7b1b-4c54-ba3e-2cb1d6946cbe.avro|\n|4984732539170247398|6277089168341855684|1566958078575|5            |5               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-4984732539170247398-1-00a53e56-fd8f-4293-9554-41f2b89ae2d2.avro|\n|713868995008319946 |4984732539170247398|1566958079727|5            |5               |/Users/hbutani/sparkline/icebergSQL/src/test/resources/store_sales_out/metadata/snap-713868995008319946-1-2ce5fb56-906d-4a66-a27f-2b5fad668560.avro 
|\n+-------------------+-------------------+-------------+-------------+----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+\n```\n\nBut the table count is reduced because of the extra source predicate.\n\n```\nselect count(*) from store_sales_out;\n\n+--------+\n|count(1)|\n+--------+\n|1877    |\n+--------+\n\n```\n\n### Run select as of first insert\nWe can still query the table as of the first insert, and we will get `2109` rows\n```sql\nas of '2019-09-15 20:32:24.062'\nselect count(*) from store_sales_out\n\n+--------+\n|count(1)|\n+--------+\n|2109    |\n+--------+\n\n```\n\n### Query the last inserted partition with the inverse of the predicate used in the last insert\n\n```sql\nselect * from store_sales_out \nwhere ss_item_sk \u003e 5000 and ss_sold_date_sk='0905245'\n```\n\nWe used the predicate `ss_item_sk \u003c 5000` for the insert and here we use its inverse, `ss_item_sk \u003e 5000`.\nWe can validate that this is a _NullScan_ by observing that zero files are scanned by the \n_IceTableScanExec_ physical operator.\n\n## Under the covers\n\nSee this [design note](docs/icebergSparkSQL.pdf) for more details. Note that it is outdated in some\nof the details; when there are discrepancies in detailed feature descriptions, this page is more\ncurrent. But the note still provides a decent picture of how the integration works under the covers. \n\nA `SparkSessionExtensions` is set up with Planning and Parsing extensions. As explained in the \n*How to use* section, set `spark.sql.extensions` to this class so that these rules and extensions are \nin effect in the _SparkSessions_ created in the deployed Spark Context.\n\n### Table Creation\nThe _CreateIcebergTableRule_ checks that tables marked as `addTableManagement=true` are\nsupported. 
Currently we require the table to be non-bucketed and partitioned\n(we plan to relax the partitioning constraint soon); and the table columns must be of \n*Atomic, Map, Array, or Struct type*. \n \n### The Column Dependencies option\nIf the `columnDependencies` option is specified then it\nmust be in the form of a comma-separated list of column dependences. \nA 'column dependence' is of the form `srcCol=destCol:transformFn`, for example\n`date_col=day_col:extract[2]` where `date_col` is a string in the form `DD-MM-YYYY`.\nSemantically a column dependence implies that the destCol value can be determined\nfrom a srcCol value; the columns are in a `one-to-one` or `many-to-one` relationship.\nThe src and dest columns can be any column (data or partition columns) of the table.\n\nCurrently we support *Iceberg Transforms* as mapping functions.\nSo users can relate columns based on `date` or `timestamp` elements,\nbased on `truncating` values or `value buckets`.\n\nDuring a table scan we will attempt to transform a predicate on the `srcCol`\ninto a predicate on the `destCol`. For example `date_col='09-12-2019'` will be transformed\ninto `day_col='09'` and applied. If the `destCol` is a partition column\nthis can lead to partition pruning. For example, if the table is partitioned by\n`day_col` then from a predicate `date_col='09-12-2019'` the inferred predicate\n`day_col='09'` will cause only partitions from the 9th day of each month to\nbe scanned. In case the `destCol` is a data column the inferred predicate can lead\nto datafiles being pruned based on the statistics available on the column.\n\n### Inserting into a snapshot managed table\n\nSpark Insert Plans involve 3 major components: the\n_InsertHadoopFsRelation_ Spark Command, the _FileFormat Writer_ and the \n_File Commit Protocol_. 
Table metadata information (up to the granularity of table\npartitions) is retrieved and updated from the Spark Catalog, whereas File information and \ninteraction is handled via the _File System_ API. \n\n- **InsertHadoopFsRelation:** orchestrates the entire operation; it also handles\n  interaction with the Spark Catalog. Its logic executes in the Driver of\n  the SparkContext. The actions it performs are: compute the affected \n  partitions based on the *partition specification* in the Insert statement, \n  set up the File Commit Protocol and the Write Job that is associated with \n  the File Format Writer, execute and commit/abort the job, compute the set \n  of Added and Deleted partitions, and update the Spark Catalog.\n- **File Commit Protocol:** tracks changes to data files made by the job and\n  provides a rudimentary level of job isolation. It provides a set of \n  callbacks like new Task file, Task commit/abort and Job commit/abort that\n  the other components use to notify it of file changes. On commit\n  it moves files into their final locations, after which other operations\n  will see the new list of the Table files.\n- **Write Tasks:** create and write new Files, and notify the File Commit\n  Protocol of new files.\n  \nSee **Figure: Spark Insert Command execution** in the design note for a detailed \nsequence diagram of how an insert is executed. For snapshot managed tables\nwe replace the _InsertHadoopFsRelation_ Command with an \n_Insert Into IcebergTable_ command.\n\n#### Insert Into IcebergTable command\nThis is a drop-in replacement for InsertIntoHadoopFsRelationCommand, set up by the \n_IcebergTableWriteRule_. By and large it follows the same execution flow as the\n_InsertIntoHadoopFsRelation_ Command, with the following behavior overrides.\n\n- The write must be on a CatalogTable. 
So the catalogTable parameter is not optional.\n- Since this is an Iceberg-managed table we load the IceTable metadata for this table.\n- `initialMatchingPartitions` is computed from the IceTable metadata.\n- Since data files must be managed by Iceberg, custom partition\n  locations cannot be configured for this table.\n- An `IcebergFileCommitProtocol` is set up that wraps the underlying\n  FileCommitProtocol. This mostly defers to the underlying commitProtocol\n  instance; in the process it ensures Iceberg DataFile instances are created for\n  new files on task commit, which are then delivered to the Driver\n  `IcebergFileCommitProtocol` instance via `TaskCommitMessages`.\n- The underlying `FileCommitProtocol` is set up with `dynamicPartitionOverwrite`\n  mode set to false. Since IceTable metadata is used by scan operations to\n  compute what files to scan, we don't have to do the all-or-nothing replacement\n  of files in a partition that is needed for dynamic partition mode using the\n  FileCommitProtocol.\n- In case of dynamicPartitionOverwrite mode we don't clear the specified source\n  Partitions, because we want to keep the current files so that queries can still run\n  against older snapshots.\n- Once the job finishes the Catalog is updated with 'new' and 'deleted'\n  partitions, just as it is in a regular InsertIntoHadoopFsRelationCommand.\n- Then, based on the 'initial set' of DataFiles and the set of DataFiles created by\n  tasks of this job, a new Iceberg Snapshot is created.\n- Finally, cache invalidation and stats update actions happen just like in a\n  regular InsertIntoHadoopFsRelationCommand.\n\n#### Iceberg File Commit Protocol\n\nProvides the following functions on top of the 'normal' Commit Protocol. Commit\nactions are simply deferred to the delegate except in the following: \n\n- track files created for each Task in a TaskPaths instance. This tracks the\n  temporary file location and also the location that the file will be moved to\n  on a commit. 
\n- on Task Commit we build an Iceberg DataFile instance. Currently we also build\n  column-level stats, but only if the file is a parquet file.\n  - The TaskCommitMessage we send back has a payload of\n    IcebergTaskCommitMessage, which encapsulates the TaskCommitMessage built by\n    the delegate and the DataFile instances. \n- we ignore deleteWithJob invocations, as we want to keep historical files\n  around. These will be removed via a clear snapshot command. \n- on a commitJob we extract all the DataFile instances from the\n  IcebergTaskCommitMessage messages and expose an addedDataFiles list, which is\n  used to build the new Iceberg Table Snapshot. \n  \n\n### Scanning a snapshot managed table\nThis is handled by an _Iceberg Table Scan_ physical operator. \nIt is set up as a **parent** Physical Operator of a `FileSourceScanExec`. During execution,\nbefore handing over control to its child `FileSourceScanExec` operator, it updates the child's `selectedPartitions` \nmember.\n\nThis member is computed based on the `partitionFilters` and `dataFilters` associated with this scan.\nThese are converted to an `IceExpression`; further `IceExpression`s are added based on the\n`column dependencies` defined for this table. From the current Iceberg snapshot a list of\n`DataFile`s is computed. Finally the `FileSourceScanExec` list of selected partitions\nis updated to remove files from `PartitionDirectory` instances not in this list of\nDataFiles.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhbutani%2Ficebergsql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhbutani%2Ficebergsql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhbutani%2Ficebergsql/lists"}