PyArrow security vulnerability when reading IPC Streaming or Parquet files. #2115
anjakefala
started this conversation in
General
Hey guys,

Since some of you either use VisiData to read Arrow and Parquet data, or use PyArrow in your lives, I wanted to let you know about an existing security vulnerability in PyArrow.

Deserialization of untrusted data in IPC and Parquet readers in PyArrow versions 0.14.0 to 14.0.0 allows arbitrary code execution. An application is vulnerable if it reads Arrow IPC, Feather or Parquet data from untrusted sources (for example user-supplied input files).

This vulnerability only affects PyArrow, not other Apache Arrow implementations or bindings.

It is recommended that users of PyArrow upgrade to 14.0.1. Similarly, it is recommended that downstream libraries upgrade their dependency requirements to PyArrow 14.0.1 or later. PyPI packages are already available, and so are conda-forge packages: https://anaconda.org/conda-forge/pyarrow

If it is not possible to upgrade, maintainers provide a separate package, pyarrow-hotfix, that disables the vulnerability on older PyArrow versions. See https://pypi.org/project/pyarrow-hotfix/ for instructions.
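
If you want a quick way to tell whether an installed version falls in the affected range (0.14.0 through 14.0.0), here is a minimal sketch. The helper names parse_release and is_affected are my own for illustration and are not part of PyArrow. The parsing is deliberately simple and ignores pre-release suffixes, so treat the result as a hint rather than an authoritative check.

```python
import re

# Version range taken from the advisory: 0.14.0 up to and including 14.0.0.
FIRST_AFFECTED = (0, 14, 0)
FIRST_FIXED = (14, 0, 1)


def parse_release(version):
    """Return (major, minor, patch) from a version string such as '12.0.1'.

    Hypothetical helper: anything after the numeric release part
    (for example '.dev0') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def is_affected(version):
    """True if the version lies in the range named by the advisory."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED


if __name__ == "__main__":
    try:
        import pyarrow
    except ImportError:
        print("pyarrow is not installed")
    else:
        if is_affected(pyarrow.__version__):
            print(f"pyarrow {pyarrow.__version__} is in the affected range: "
                  "upgrade to 14.0.1+ or use pyarrow-hotfix")
        else:
            print(f"pyarrow {pyarrow.__version__} is outside the affected range")
```

As far as I understand from the pyarrow-hotfix PyPI page, the hotfix takes effect once your application imports pyarrow_hotfix, but check the instructions linked above before relying on that.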

PyArrow only supports Python 3.7 up to version 12.0.1. So on VisiData's end, we bumped the minimum dependency version for pyarrow on Python 3.8+ and install pyarrow_hotfix for Python 3.7: 23de62c

A link to the CVE: https://www.cve.org/CVERecord?id=CVE-2023-47248
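
For downstream maintainers facing the same split, here is a rough sketch of how the requirements could be chosen per Python version. This is not the contents of the VisiData commit. The function name arrow_requirements and the exact requirement strings are assumptions for illustration only.

```python
import sys


def arrow_requirements(python_version=None):
    """Pick PyArrow-related requirements for a (major, minor) Python version.

    Hypothetical helper: on 3.8+ require the fixed PyArrow release,
    on older Pythons fall back to any PyArrow plus pyarrow-hotfix.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    if tuple(python_version) >= (3, 8):
        return ["pyarrow>=14.0.1"]
    return ["pyarrow", "pyarrow-hotfix"]


# The same idea written as PEP 508 environment markers, which can go
# straight into a requirements list without any runtime branching.
MARKER_REQUIREMENTS = [
    'pyarrow>=14.0.1; python_version >= "3.8"',
    'pyarrow; python_version < "3.8"',
    'pyarrow-hotfix; python_version < "3.8"',
]

if __name__ == "__main__":
    print(arrow_requirements())
```

The marker form is usually nicer for published packages, since the installer evaluates the condition on the user's machine.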