{"id":8281717,"url":"https://github.com/apache/datafusion","last_synced_at":"2025-12-12T01:02:27.019Z","repository":{"id":36971235,"uuid":"358917318","full_name":"apache/datafusion","owner":"apache","description":"Apache DataFusion SQL Query Engine","archived":false,"fork":false,"pushed_at":"2025-09-06T18:43:38.000Z","size":158859,"stargazers_count":7686,"open_issues_count":1529,"forks_count":1626,"subscribers_count":112,"default_branch":"main","last_synced_at":"2025-09-06T20:36:35.865Z","etag":null,"topics":["arrow","big-data","dataframe","datafusion","olap","python","query-engine","rust","sql"],"latest_commit_sha":null,"homepage":"https://datafusion.apache.org/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE.txt","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-04-17T15:40:23.000Z","updated_at":"2025-09-06T18:43:42.000Z","dependencies_parsed_at":"2023-12-11T13:01:41.027Z","dependency_job_id":"b5d8b40d-b408-4619-bcb0-03c6bc3e5e7a","html_url":"https://github.com/apache/datafusion","commit_stats":{"total_commits":8825,"total_committers":715,"mean_commits":"12.342657342657343","dds":0.8645892351274788,"last_synced_commit":"0243ebd585264852be55822e3504be54e1e0e406"},"previous_names":["apache/datafusion","apache/arrow-datafusion"],"tags_count":133,"template":false,"template_full_name":null,"purl":"pkg:github/apache/datafusion","repository_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fdatafusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fdatafusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fdatafusion/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fdatafusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/datafusion/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fdatafusion/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274166988,"owners_count":25233962,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arrow","big-data","dataframe","datafusion","olap","python","query-engine","rust","sql"],"created_at":"2024-04-20T09:01:42.291Z","updated_at":"2025-12-12T01:02:26.971Z","avatar_url":"https://github.com/apache.png","language":"Rust","readme":"\u003c!---\n  Licensed to the Apache Software Foundation (ASF) under one\n  or more contributor license agreements.  See the NOTICE file\n  distributed with this work for additional information\n  regarding copyright ownership.  
The ASF licenses this file\n  to you under the Apache License, Version 2.0 (the\n  \"License\"); you may not use this file except in compliance\n  with the License.  You may obtain a copy of the License at\n\n    http://www.apache.org/licenses/LICENSE-2.0\n\n  Unless required by applicable law or agreed to in writing,\n  software distributed under the License is distributed on an\n  \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n  KIND, either express or implied.  See the License for the\n  specific language governing permissions and limitations\n  under the License.\n--\u003e\n\n# Apache DataFusion\n\n[![Crates.io][crates-badge]][crates-url]\n[![Apache licensed][license-badge]][license-url]\n[![Build Status][actions-badge]][actions-url]\n![Commit Activity][commit-activity-badge]\n[![Open Issues][open-issues-badge]][open-issues-url]\n[![Discord chat][discord-badge]][discord-url]\n[![Linkedin][linkedin-badge]][linkedin-url]\n![Crates.io MSRV][msrv-badge]\n\n[crates-badge]: https://img.shields.io/crates/v/datafusion.svg\n[crates-url]: https://crates.io/crates/datafusion\n[license-badge]: https://img.shields.io/badge/license-Apache%20v2-blue.svg\n[license-url]: https://github.com/apache/datafusion/blob/main/LICENSE.txt\n[actions-badge]: https://github.com/apache/datafusion/actions/workflows/rust.yml/badge.svg\n[actions-url]: https://github.com/apache/datafusion/actions?query=branch%3Amain\n[discord-badge]: https://img.shields.io/badge/Chat-Discord-purple\n[discord-url]: https://discord.com/invite/Qw5gKqHxUM\n[commit-activity-badge]: https://img.shields.io/github/commit-activity/m/apache/datafusion\n[open-issues-badge]: https://img.shields.io/github/issues-raw/apache/datafusion\n[open-issues-url]: https://github.com/apache/datafusion/issues\n[linkedin-badge]: https://img.shields.io/badge/Follow-Linkedin-blue\n[linkedin-url]: https://www.linkedin.com/company/apache-datafusion/\n[msrv-badge]: 
https://img.shields.io/crates/msrv/datafusion?label=Min%20Rust%20Version\n\n[Website](https://datafusion.apache.org/) |\n[API Docs](https://docs.rs/datafusion/latest/datafusion/) |\n[Chat](https://discord.com/channels/885562378132000778/885562378132000781)\n\n\u003ca href=\"https://datafusion.apache.org/\"\u003e\n  \u003cimg src=\"https://github.com/apache/datafusion/raw/HEAD/docs/source/_static/images/2x_bgwhite_original.png\" width=\"512\" alt=\"logo\"/\u003e\n\u003c/a\u003e\n\nDataFusion is an extensible query engine written in [Rust] that\nuses [Apache Arrow] as its in-memory format.\n\nThis crate provides libraries and binaries for developers building fast and\nfeature rich database and analytic systems, customized to particular workloads.\nSee [use cases] for examples. The following related subprojects target end users:\n\n- [DataFusion Python](https://github.com/apache/datafusion-python/) offers a Python interface for SQL and DataFrame\n  queries.\n- [DataFusion Comet](https://github.com/apache/datafusion-comet/) is an accelerator for Apache Spark based on\n  DataFusion.\n\n\"Out of the box,\"\nDataFusion offers [SQL] and [`Dataframe`] APIs, excellent [performance],\nbuilt-in support for CSV, Parquet, JSON, and Avro, extensive customization, and\na great community.\n\nDataFusion features a full query planner, a columnar, streaming, multi-threaded,\nvectorized execution engine, and partitioned data sources. 
You can\ncustomize DataFusion at almost all points, including additional data sources,\nquery languages, functions, custom operators and more.\nSee the [Architecture] section for more details.\n\n[rust]: http://rustlang.org\n[apache arrow]: https://arrow.apache.org\n[use cases]: https://datafusion.apache.org/user-guide/introduction.html#use-cases\n[python bindings]: https://github.com/apache/datafusion-python\n[performance]: https://benchmark.clickhouse.com/\n[architecture]: https://datafusion.apache.org/contributor-guide/architecture.html\n\nHere are links to some important information:\n\n- [Project Site](https://datafusion.apache.org/)\n- [Installation](https://datafusion.apache.org/user-guide/cli/installation.html)\n- [Rust Getting Started](https://datafusion.apache.org/user-guide/example-usage.html)\n- [Rust DataFrame API](https://datafusion.apache.org/user-guide/dataframe.html)\n- [Rust API docs](https://docs.rs/datafusion/latest/datafusion)\n- [Rust Examples](https://github.com/apache/datafusion/tree/main/datafusion-examples)\n- [Python DataFrame API](https://arrow.apache.org/datafusion-python/)\n- [Architecture](https://docs.rs/datafusion/latest/datafusion/index.html#architecture)\n\n## What can you do with this crate?\n\nDataFusion is great for building projects such as domain-specific query engines, new database platforms and data pipelines, query languages and more.\nIt lets you start quickly from a fully working engine, and then customize those features specific to your use. 
[Click Here](https://datafusion.apache.org/user-guide/introduction.html#known-users) to see a list of known users.\n\n## Contributing to DataFusion\n\nPlease see the [contributor guide] and [communication] pages for more information.\n\n[contributor guide]: https://datafusion.apache.org/contributor-guide\n[communication]: https://datafusion.apache.org/contributor-guide/communication.html\n\n## Crate features\n\nThis crate has several [features] which can be specified in your `Cargo.toml`.\n\n[features]: https://doc.rust-lang.org/cargo/reference/features.html\n\nDefault features:\n\n- `nested_expressions`: functions for working with nested types, such as `array_to_string`\n- `compression`: reading files compressed with `xz2`, `bzip2`, `flate2`, and `zstd`\n- `crypto_expressions`: cryptographic functions such as `md5` and `sha256`\n- `datetime_expressions`: date and time functions such as `to_timestamp`\n- `encoding_expressions`: `encode` and `decode` functions\n- `parquet`: support for reading the [Apache Parquet] format\n- `regex_expressions`: regular expression functions, such as `regexp_match`\n- `unicode_expressions`: Unicode-aware functions such as `character_length`\n- `unparser`: enables support to reverse LogicalPlans back into SQL\n- `recursive_protection`: uses [recursive](https://docs.rs/recursive/latest/recursive/) for stack overflow protection.\n\nOptional features:\n\n- `avro`: support for reading the [Apache Avro] format\n- `backtrace`: include backtrace information in error messages\n- `parquet_encryption`: support for using [Parquet Modular Encryption]\n- `pyarrow`: conversions between PyArrow and DataFusion types\n- `serde`: enable arrow-schema's `serde` feature\n\n[apache avro]: https://avro.apache.org/\n[apache parquet]: https://parquet.apache.org/\n[parquet modular encryption]: https://parquet.apache.org/docs/file-format/data-pages/encryption/\n\n## DataFusion API Evolution and Deprecation Guidelines\n\nPublic methods in Apache 
DataFusion evolve over time: while we try to maintain a\nstable API, we also improve the API over time. As a result, we typically\ndeprecate methods before removing them, according to the [deprecation guidelines].\n\n[deprecation guidelines]: https://datafusion.apache.org/contributor-guide/api-health.html\n\n## Dependencies and `Cargo.lock`\n\nFollowing the [guidance] on committing `Cargo.lock` files, this project commits\nits `Cargo.lock` file.\n\nCI uses the committed `Cargo.lock` file, and dependencies are updated regularly\nusing [Dependabot] PRs.\n\n[guidance]: https://blog.rust-lang.org/2023/08/29/committing-lockfiles.html\n[dependabot]: https://docs.github.com/en/code-security/dependabot/working-with-dependabot\n","funding_links":[],"categories":["Rust","HarmonyOS","Libraries","语言资源库","Applications"],"sub_categories":["Windows Manager","Data processing","rust"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fdatafusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fdatafusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fdatafusion/lists"}