https://github.com/pyexcel/pyexcel-xlsx

A wrapper library to read, manipulate and write data in xlsx and xlsm format using openpyxl
Host: GitHub
URL: https://github.com/pyexcel/pyexcel-xlsx
Owner: pyexcel
License: other
Created: 2014-12-02T00:03:24.000Z (over 10 years ago)
Default Branch: dev
Last Pushed: 2025-05-03T11:20:00.000Z (2 months ago)
Last Synced: 2025-05-03T12:23:25.191Z (2 months ago)
Topics: xlsx
Language: Python
Size: 335 KB
Stars: 118
Watchers: 6
Forks: 33
Open Issues: 3
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.rst
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project

awesome-python-machine-learning-resources (data reading, writing and extraction)

================================================================================
pyexcel-xlsx - Lets you focus on data instead of the xlsx format
================================================================================

.. image:: https://raw.githubusercontent.com/pyexcel/pyexcel.github.io/master/images/patreon.png
   :target: https://www.patreon.com/chfw

.. image:: https://raw.githubusercontent.com/pyexcel/pyexcel-mobans/master/images/awesome-badge.svg
   :target: https://awesome-python.com/#specific-formats-processing

.. image:: https://codecov.io/gh/pyexcel/pyexcel-xlsx/branch/master/graph/badge.svg
   :target: https://codecov.io/gh/pyexcel/pyexcel-xlsx

.. image:: https://badge.fury.io/py/pyexcel-xlsx.svg
   :target: https://pypi.org/project/pyexcel-xlsx

.. image:: https://anaconda.org/conda-forge/pyexcel-xlsx/badges/version.svg
   :target: https://anaconda.org/conda-forge/pyexcel-xlsx

.. image:: https://pepy.tech/badge/pyexcel-xlsx/month
   :target: https://pepy.tech/project/pyexcel-xlsx

.. image:: https://anaconda.org/conda-forge/pyexcel-xlsx/badges/downloads.svg
   :target: https://anaconda.org/conda-forge/pyexcel-xlsx

.. image:: https://img.shields.io/gitter/room/gitterHQ/gitter.svg
   :target: https://gitter.im/pyexcel/Lobby

.. image:: https://img.shields.io/static/v1?label=continuous%20templating&message=%E6%A8%A1%E7%89%88%E6%9B%B4%E6%96%B0&color=blue&style=flat-square
   :target: https://moban.readthedocs.io/en/latest/#at-scale-continous-templating-for-open-source-projects

.. image:: https://img.shields.io/static/v1?label=coding%20style&message=black&color=black&style=flat-square
   :target: https://github.com/psf/black

**pyexcel-xlsx** is a tiny wrapper library that reads, manipulates and writes data in xlsx and xlsm formats using the `read_only` mode reader and the `write_only` mode writer from openpyxl. You are likely to use it with pyexcel.

Please note:

1. The `auto_detect_int` flag has no effect because openpyxl detects integers in Python 3 by default.

2. `skip_hidden_row_and_column` incurs a performance penalty because `read_only` mode cannot be used with it.

Support the project
================================================================================

If your company uses pyexcel and its components in a revenue-generating product,
please consider supporting the project on GitHub or
`Patreon <https://www.patreon.com/chfw>`_. Your financial
support will enable me to dedicate more time to coding, improving documentation,
and creating engaging content.

Known constraints
==================

Fonts, colors and charts are not supported.
Reading password-protected xls, xlsx and ods files is not supported either.

Installation
================================================================================

You can install pyexcel-xlsx via pip:

.. code-block:: bash

    $ pip install pyexcel-xlsx

or clone it and install it:

.. code-block:: bash

    $ git clone https://github.com/pyexcel/pyexcel-xlsx.git
    $ cd pyexcel-xlsx
    $ python setup.py install

Usage
================================================================================

As a standalone library
--------------------------------------------------------------------------------

.. testcode::
   :hide:

    >>> import os
    >>> import sys
    >>> from io import BytesIO
    >>> from collections import OrderedDict

Write to an xlsx file
********************************************************************************

Here's the sample code to write a dictionary to an xlsx file:

.. code-block:: python

    >>> from pyexcel_xlsx import save_data
    >>> data = OrderedDict() # from collections import OrderedDict
    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
    >>> data.update({"Sheet 2": [["row 1", "row 2", "row 3"]]})
    >>> save_data("your_file.xlsx", data)

Read from an xlsx file
********************************************************************************

Here's the sample code:

.. code-block:: python

    >>> from pyexcel_xlsx import get_data
    >>> data = get_data("your_file.xlsx")
    >>> import json
    >>> print(json.dumps(data))
    {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [["row 1", "row 2", "row 3"]]}
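
The printed result above suggests that the returned value maps each sheet name to a list of rows. As a minimal sketch of how such a mapping could be walked, the snippet below uses a literal stand-in for the returned data so that it runs without any file; the helper name ``summarize`` is hypothetical and not part of pyexcel-xlsx.

```python
from collections import OrderedDict

# Stand-in for the mapping printed above (sheet name -> list of rows).
data = OrderedDict([
    ("Sheet 1", [[1, 2, 3], [4, 5, 6]]),
    ("Sheet 2", [["row 1", "row 2", "row 3"]]),
])


def summarize(book):
    """Hypothetical helper: (sheet name, row count, widest row) per sheet."""
    return [
        (name, len(rows), max((len(row) for row in rows), default=0))
        for name, rows in book.items()
    ]


for name, n_rows, n_cols in summarize(data):
    print(f"{name}: {n_rows} row(s), up to {n_cols} column(s)")
```

Rows are plain Python lists, so ordinary list operations such as slicing and iteration apply to them directly.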

Write an xlsx to memory
********************************************************************************

Here's the sample code to write a dictionary to an in-memory xlsx stream:

.. code-block:: python

    >>> from pyexcel_xlsx import save_data
    >>> data = OrderedDict()
    >>> data.update({"Sheet 1": [[1, 2, 3], [4, 5, 6]]})
    >>> data.update({"Sheet 2": [[7, 8, 9], [10, 11, 12]]})
    >>> io = BytesIO()
    >>> save_data(io, data)
    >>> # do something with the io
    >>> # In reality, you might give it to your http response
    >>> # object for downloading
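
The comments above mention handing the stream to an HTTP response. One way this might look, independent of any web framework, is sketched below. The helper ``as_download`` and the file name ``report.xlsx`` are hypothetical, and the buffer holds placeholder bytes rather than a real workbook; in practice it would be the ``BytesIO`` filled by ``save_data``.

```python
from io import BytesIO

# Standard MIME type for .xlsx workbooks.
XLSX_MIME = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


def as_download(buffer, filename):
    """Hypothetical helper: build (headers, body) for a file download."""
    # getvalue() returns the whole buffer regardless of the stream position.
    body = buffer.getvalue()
    headers = {
        "Content-Type": XLSX_MIME,
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(body)),
    }
    return headers, body


# Placeholder bytes standing in for a workbook written by save_data(io, data).
buffer = BytesIO(b"PK\x03\x04 placeholder")
headers, body = as_download(buffer, "report.xlsx")
print(headers["Content-Disposition"])
```

Using ``getvalue()`` avoids having to rewind the stream with ``seek(0)`` before reading it back.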

Read an xlsx from memory
********************************************************************************

Continuing from the previous example:

.. code-block:: python

    >>> # This is just an illustration
    >>> # In reality, you might deal with xlsx file upload
    >>> # where you will read from request.FILES['YOUR_XLSX_FILE']
    >>> data = get_data(io)
    >>> print(json.dumps(data))
    {"Sheet 1": [[1, 2, 3], [4, 5, 6]], "Sheet 2": [[7, 8, 9], [10, 11, 12]]}

Pagination feature
********************************************************************************

Let's assume the following file is a huge xlsx file:

.. code-block:: python

   >>> huge_data = [
   ...     [1, 21, 31],
   ...     [2, 22, 32],
   ...     [3, 23, 33],
   ...     [4, 24, 34],
   ...     [5, 25, 35],
   ...     [6, 26, 36]
   ... ]
   >>> sheetx = {
   ...     "huge": huge_data
   ... }
   >>> save_data("huge_file.xlsx", sheetx)

And let's pretend to read partial data:

.. code-block:: python

   >>> partial_data = get_data("huge_file.xlsx", start_row=2, row_limit=3)
   >>> print(json.dumps(partial_data))
   {"huge": [[3, 23, 33], [4, 24, 34], [5, 25, 35]]}

And you could as well do the same for columns:

.. code-block:: python

   >>> partial_data = get_data("huge_file.xlsx", start_column=1, column_limit=2)
   >>> print(json.dumps(partial_data))
   {"huge": [[21, 31], [22, 32], [23, 33], [24, 34], [25, 35], [26, 36]]}

Obviously, you could do both at the same time:

.. code-block:: python

   >>> partial_data = get_data("huge_file.xlsx",
   ...     start_row=2, row_limit=3,
   ...     start_column=1, column_limit=2)
   >>> print(json.dumps(partial_data))
   {"huge": [[23, 33], [24, 34], [25, 35]]}
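
Judging from the outputs above, ``start_row`` and ``start_column`` are zero-based offsets, and ``row_limit`` and ``column_limit`` cap how many rows or columns are returned. The sketch below reproduces that behaviour on a plain list of rows. It only illustrates the semantics and is not how the library reads files; the helper ``paginate`` and its use of ``-1`` to mean "no limit" are assumptions made for this example.

```python
def paginate(rows, start_row=0, row_limit=-1, start_column=0, column_limit=-1):
    """Hypothetical helper: slice rows the way the examples above behave."""
    # A negative limit is treated as "no limit" (an assumption of this sketch).
    row_end = None if row_limit < 0 else start_row + row_limit
    col_end = None if column_limit < 0 else start_column + column_limit
    return [row[start_column:col_end] for row in rows[start_row:row_end]]


huge_data = [
    [1, 21, 31],
    [2, 22, 32],
    [3, 23, 33],
    [4, 24, 34],
    [5, 25, 35],
    [6, 26, 36],
]

print(paginate(huge_data, start_row=2, row_limit=3))
# [[3, 23, 33], [4, 24, 34], [5, 25, 35]]
print(paginate(huge_data, start_row=2, row_limit=3, start_column=1, column_limit=2))
# [[23, 33], [24, 34], [25, 35]]
```

The benefit of passing these parameters to ``get_data`` itself, rather than slicing afterwards, is presumably that rows and columns outside the window need not be kept in memory.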

.. testcode::
   :hide:

   >>> os.unlink("huge_file.xlsx")

As a pyexcel plugin
--------------------------------------------------------------------------------

Since pyexcel version 0.2.2, an explicit import is no longer needed. Instead,
this library is auto-loaded. So if you want to read data in xlsx format,
installing it is enough.

Reading from an xlsx file
********************************************************************************

Here is the sample code:

.. code-block:: python

    >>> import pyexcel as pe
    >>> sheet = pe.get_book(file_name="your_file.xlsx")
    >>> sheet
    Sheet 1:
    +---+---+---+
    | 1 | 2 | 3 |
    +---+---+---+
    | 4 | 5 | 6 |
    +---+---+---+
    Sheet 2:
    +-------+-------+-------+
    | row 1 | row 2 | row 3 |
    +-------+-------+-------+

Writing to an xlsx file
********************************************************************************

Here is the sample code:

.. code-block:: python

    >>> sheet.save_as("another_file.xlsx")

Reading from an IO instance
********************************************************************************

Read the file in binary mode, then pass its content together with the file type:

.. code-block:: python

    >>> # This is just an illustration
    >>> # In reality, you might deal with xlsx file upload
    >>> # where you will read from request.FILES['YOUR_XLSX_FILE']
    >>> xlsxfile = "another_file.xlsx"
    >>> with open(xlsxfile, "rb") as f:
    ...     content = f.read()
    ...     r = pe.get_book(file_type="xlsx", file_content=content)
    ...     print(r)
    ...
    Sheet 1:
    +---+---+---+
    | 1 | 2 | 3 |
    +---+---+---+
    | 4 | 5 | 6 |
    +---+---+---+
    Sheet 2:
    +-------+-------+-------+
    | row 1 | row 2 | row 3 |
    +-------+-------+-------+

Writing to a BytesIO instance
********************************************************************************

You need to pass a BytesIO instance to `save_to_memory`:

.. code-block:: python

    >>> data = [
    ...     [1, 2, 3],
    ...     [4, 5, 6]
    ... ]
    >>> io = BytesIO()
    >>> sheet = pe.Sheet(data)
    >>> io = sheet.save_to_memory("xlsx", io)
    >>> # then do something with io
    >>> # In reality, you might give it to your http response
    >>> # object for downloading

License
================================================================================

New BSD License

Developer guide
==================

Development steps for code changes

#. git clone https://github.com/pyexcel/pyexcel-xlsx.git

#. cd pyexcel-xlsx

Upgrade your setup tools and pip. They are needed for development and testing only:

#. pip install --upgrade setuptools pip

Then install relevant development requirements:

#. pip install -r rnd_requirements.txt # if such a file exists

#. pip install -r requirements.txt

#. pip install -r tests/requirements.txt

Once you have finished your changes, please provide test cases and relevant
documentation, and update changelog.yml.

.. note::

    rnd_requirements.txt is usually created when a dependency has not yet
    been released. Once that dependency is released, the version of the
    dependency specified in requirements.txt becomes valid.

How to test your contribution
--------------------------------------------------------------------------------

Although `nose` and `doctest` are both used in code testing, it is advisable
to put unit tests in the tests directory. `doctest` is incorporated only to make
sure the code examples in the documentation remain valid across different
development releases.

On Linux/Unix systems, please launch your tests like this::

    $ make

On Windows, please issue this command::

    > test.bat

Before you commit
------------------------------

Please run::

    $ make format

to format your code; otherwise, your build may fail the unit tests.

.. testcode::
   :hide:

   >>> import os
   >>> os.unlink("your_file.xlsx")
   >>> os.unlink("another_file.xlsx")