Update the documentation related to optional packages (#805)

* Update the documentation related to optional packages
* Add `cmake>=3.8` to documentation requirements
* Hardcode lilcom version to 1.1.0 for documentation builds to avoid the cmake error
* Hacky lilcom version workaround for docs
* Yet another workaround
* Fix documentation listing
* Fix documentation listing
* Fix documentation listing
lhotse-speech · Sep 12, 2022 · 0e4ddd2
1 parent 908c828
Showing 4 changed files with 81 additions and 14 deletions.
diff --git a/README.md b/README.md
@@ -32,7 +32,7 @@ Lhotse is a Python library aiming to make speech and audio data preparation flex
 We currently have the following tutorials available in `examples` directory:
 - Basic complete Lhotse workflow [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/00-basic-workflow.ipynb)
 - Transforming data with Cuts [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/01-cut-python-api.ipynb)
-- *(experimental)* WebDataset integration [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/02-webdataset-integration.ipynb)
+- WebDataset integration [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/02-webdataset-integration.ipynb)
 - How to combine multiple datasets [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/03-combining-datasets.ipynb)
 
 ### Examples of use
@@ -71,8 +71,6 @@ To install the latest, unreleased version, do:
 
     pip install git+https://github.com/lhotse-speech/lhotse
 
-_Hint: for up to 50% faster reading of JSONL manifests, use: `pip install lhotse[orjson]` to leverage the [orjson](https://pypi.org/project/orjson/) library._
-
 ### Development installation
 
 For development installation, you can fork/clone the GitHub repo and install with pip:
@@ -92,9 +90,18 @@ This is an editable installation (`-e` option), meaning that your changes to the
 reflected when importing lhotse (no re-install needed). The `[dev]` part means you're installing extra dependencies
 that are used to run tests, build documentation or launch jupyter notebooks.
 
-### Extra dependencies
+### Optional dependencies
+
+**Other pip packages.** You can leverage optional features of Lhotse by installing the relevant supporting package like this: `pip install lhotse[package_name]`. The supported optional packages include:
+- `pip install lhotse[kaldi]` for a maximal feature set related to Kaldi compatibility. It includes libraries such as `kaldi_native_io` (a more efficient variant of `kaldi_io`) and `kaldifeat` that port some of Kaldi functionality into Python.
+- `pip install lhotse[orjson]` for up to 50% faster reading of JSONL manifests.
+- `pip install lhotse[webdataset]`. We support "compiling" your data into WebDataset tarball format for more effective IO. You can still interact with the data as if it was a regular lazy CutSet. To learn more, check out the following tutorial: [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/lhotse-speech/lhotse/blob/master/examples/02-webdataset-integration.ipynb)
+- `pip install h5py` if you want to extract speech features and store them as HDF5 arrays.
+- `pip install dill`. When `dill` is installed, we'll use it to pickle CutSet that uses a lambda function in calls such as `.map` or `.filter`. This is helpful in PyTorch DataLoader with `num_jobs>0`. Without `dill`, depending on your environment, you'll see an exception or a hanging script.
+- `pip install smart_open` to read and write manifests and data in any location supported by `smart_open` (e.g. cloud, http).
+- `pip install opensmile` for feature extraction using the OpenSmile toolkit's Python wrapper.
 
-For reading older LDC SPHERE (.sph) audio files that are compressed with codecs unsupported by ffmpeg and sox, please run:
+**sph2pipe.** For reading older LDC SPHERE (.sph) audio files that are compressed with codecs unsupported by ffmpeg and sox, please run:
 
     # CLI
     lhotse install-sph2pipe
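
Note on the `dill` bullet in the README hunk above: the failure it describes comes from the standard `pickle` module, which serializes functions by their importable name and so cannot handle a lambda. The sketch below reproduces that behaviour with the standard library only. The helper `can_pickle` and the two example filters are hypothetical names for illustration, not part of Lhotse. Also note that the PyTorch `DataLoader` argument for worker processes is `num_workers`, so the `num_jobs>0` wording presumably refers to that.

```python
import pickle


def can_pickle(obj) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different exception types here.
        return False


def named_filter(x: int) -> bool:
    # A module-level function is pickled by reference to its qualified name.
    return x > 0


# A lambda has no importable name, so plain pickle rejects it.
lambda_filter = lambda x: x > 0

if __name__ == "__main__":
    print("named function picklable:", can_pickle(named_filter))
    print("lambda picklable:", can_pickle(lambda_filter))
```

With `dill` installed, `dill.dumps(lambda_filter)` is expected to succeed where `pickle.dumps` does not, which is why the documentation recommends it for lambdas passed to `.map` or `.filter`.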

diff --git a/docs/getting-started.rst b/docs/getting-started.rst
@@ -13,19 +13,28 @@ Main goals
 **********
 
 * Attract a wider community to speech processing tasks with a **Python-centric design**.
+
 * Accommodate experienced Kaldi users with an **expressive command-line interface**.
+
 * Provide **standard data preparation recipes** for commonly used corpora.
+
 * Provide **PyTorch Dataset classes** for speech and audio related tasks.
+
 * Flexible data preparation for model training with the notion of **audio cuts**.
+
 * **Efficiency**, especially in terms of I/O bandwidth and storage capacity.
 
 Tutorials
 *********
 
 We currently have the following tutorials available in `examples` directory:
+
 * Basic complete Lhotse workflow |tutorial00|
+
 * Transforming data with Cuts |tutorial01|
-* *(experimental)* WebDataset integration |tutorial02|
+
+* WebDataset integration |tutorial02|
+
 * How to combine multiple datasets |tutorial03|
 
 .. |tutorial00| image:: https://colab.research.google.com/assets/colab-badge.svg
@@ -44,6 +53,7 @@ Examples of use
 Check out the following links to see how Lhotse is being put to use:
 
 * `Icefall recipes`_: where k2 and Lhotse meet.
+
 * Minimal ESPnet+Lhotse example: |mini librispeech colab notebook|
 
  .. |mini librispeech colab notebook| image:: https://colab.research.google.com/assets/colab-badge.svg
@@ -99,6 +109,37 @@ reflected when importing lhotse (no re-install needed). The ``[dev]`` part means
 that are used to run tests, build documentation or launch jupyter notebooks.
 
 
+Optional dependencies
+*********************
+
+**Other pip packages.** You can leverage optional features of Lhotse by installing the relevant supporting package like this: ``pip install lhotse[package_name]``. The supported optional packages include:
+
+* ``pip install lhotse[kaldi]`` for a maximal feature set related to Kaldi compatibility. It includes libraries such as ``kaldi_native_io`` (a more efficient variant of ``kaldi_io``) and ``kaldifeat`` that port some of Kaldi functionality into Python.
+
+* ``pip install lhotse[orjson]`` for up to 50% faster reading of JSONL manifests.
+
+* ``pip install lhotse[webdataset]``. We support "compiling" your data into WebDataset tarball format for more effective IO. You can still interact with the data as if it was a regular lazy CutSet. To learn more, check out the following tutorial: |tutorial02|
+
+* ``pip install h5py`` if you want to extract speech features and store them as HDF5 arrays.
+
+* ``pip install dill``. When ``dill`` is installed, we'll use it to pickle CutSet that uses a lambda function in calls such as ``.map`` or ``.filter``. This is helpful in PyTorch DataLoader with ``num_jobs>0``. Without ``dill``, depending on your environment, you'll see an exception or a hanging script.
+
+* ``pip install smart_open`` to read and write manifests and data in any location supported by ``smart_open`` (e.g. cloud, http).
+
+* ``pip install opensmile`` for feature extraction using the OpenSmile toolkit's Python wrapper.
+
+**sph2pipe.** For reading older LDC SPHERE (.sph) audio files that are compressed with codecs unsupported by ffmpeg and sox, please run::
+
+    # CLI
+    lhotse install-sph2pipe
+
+    # Python
+    from lhotse.tools import install_sph2pipe
+    install_sph2pipe()
+
+It will download it to ``~/.lhotse/tools``, compile it, and auto-register in ``PATH``. The program should be automatically detected and used by Lhotse.
+
+
 Examples
 --------
 

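Note on the "Optional dependencies" section added above: since each of these packages is optional, it can be handy to check which ones are present before relying on the matching feature. The following is a minimal sketch using only the standard library. The helper `optional_package_available` is a hypothetical name (it is not claimed to be Lhotse's own API), and the list assumes that each package's import name matches the pip name shown in the documentation.

```python
import importlib.util


def optional_package_available(module_name: str) -> bool:
    """Return True if module_name can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken parent packages or modules without a spec.
        return False


# Assumed import names for the optional packages listed in the documentation.
OPTIONAL_MODULES = [
    "kaldi_native_io",
    "kaldifeat",
    "orjson",
    "webdataset",
    "h5py",
    "dill",
    "smart_open",
    "opensmile",
]

if __name__ == "__main__":
    for name in OPTIONAL_MODULES:
        status = "installed" if optional_package_available(name) else "missing"
        print(f"{name}: {status}")
```

The check only locates the module and does not import it, so it stays cheap even for heavy packages.
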
diff --git a/docs/requirements.txt b/docs/requirements.txt
@@ -2,4 +2,4 @@ numpy>=1.18.1
 sphinx_rtd_theme
 sphinx==4.2.0
 sphinx-click==3.0.1
-sphinx-autodoc-typehints==1.12.0
+sphinx-autodoc-typehints==1.12.0
diff --git a/setup.py b/setup.py
@@ -130,13 +130,19 @@ def mark_lhotse_version(version: str) -> None:
     "cytoolz>=0.10.1",
     "dataclasses",
     "intervaltree>= 3.1.0",
-    "lilcom>=1.1.0",
     "numpy>=1.18.1",
     "packaging",
     "pyyaml>=5.3.1",
     "tqdm",
 ]
 
+# Workaround for lilcom cmake issue: https://github.com/danpovey/lilcom/issues/41
+# present in automatic documentation builds.
+if os.environ.get("READTHEDOCS", False):
+    install_requires.append("lilcom==1.1.0")
+else:
+    install_requires.append("lilcom>=1.1.0")
+
 try:
     # If the user already installed PyTorch, make sure he has torchaudio too.
     # Otherwise, we'll just install the latest versions from PyPI for the user.
@@ -168,10 +174,20 @@ def mark_lhotse_version(version: str) -> None:
     "isort==5.10.1",
     "pre-commit>=2.17.0,<=2.19.0",
 ]
-dev_requires = sorted(docs_require + tests_require + ["jupyterlab", "matplotlib"])
-orjson_require = ["orjson>=3.6.6"]
-dill_require = ["dill"]
-all_requires = sorted(dev_requires + orjson_require + dill_require)
+orjson_requires = ["orjson>=3.6.6"]
+webdataset_requires = ["webdataset==0.2.5"]
+dill_requires = ["dill"]
+h5py_requires = ["h5py"]
+kaldi_requires = ["kaldi_native_io", "kaldifeat"]
+dev_requires = sorted(
+    docs_require
+    + tests_require
+    + orjson_requires
+    + webdataset_requires
+    + dill_requires
+    + ["jupyterlab", "matplotlib"]
+)
+all_requires = sorted(dev_requires)
 
 if os.environ.get("READTHEDOCS", False):
     # When building documentation, omit torchaudio installation and mock it instead.
@@ -202,8 +218,11 @@ def mark_lhotse_version(version: str) -> None:
     },
     install_requires=install_requires,
     extras_require={
-        "dill": dill_require,
-        "orjson": orjson_require,
+        "dill": dill_requires,
+        "orjson": orjson_requires,
+        "webdataset": webdataset_requires,
+        "h5py": h5py_requires,
+        "kaldi": kaldi_requires,
         "docs": docs_require,
         "tests": tests_require,
         "dev": dev_requires,
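
Note on the `setup.py` hunks above: the lilcom workaround relies on the truthiness of `os.environ.get("READTHEDOCS", False)`, which means any non-empty value selects the pinned version, including the string `"False"`. The sketch below isolates that logic, together with the way `dev_requires` is assembled, so both can be checked without running `setup.py`. The function names `lilcom_requirement` and `build_dev_requires` are hypothetical and do not appear in the commit.

```python
from typing import List, Mapping


def lilcom_requirement(environ: Mapping[str, str]) -> str:
    """Pick the lilcom requirement string based on the build environment."""
    # Same truthiness check as in the diff: any non-empty value counts,
    # so READTHEDOCS="False" would still select the pinned version.
    if environ.get("READTHEDOCS", False):
        return "lilcom==1.1.0"
    return "lilcom>=1.1.0"


def build_dev_requires(*groups: List[str]) -> List[str]:
    """Concatenate requirement groups into one sorted list."""
    merged: List[str] = []
    for group in groups:
        merged.extend(group)
    return sorted(merged)


if __name__ == "__main__":
    print(lilcom_requirement({}))
    print(lilcom_requirement({"READTHEDOCS": "True"}))
    print(build_dev_requires(["orjson>=3.6.6"], ["webdataset==0.2.5"], ["dill"]))
```

In the hunk as shown, `h5py_requires` and `kaldi_requires` are exposed as extras but are not folded into `dev_requires`, so a `[dev]` install would not pull them in. Whether that is intentional is not stated in the commit.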