From bbcf4082cbdd2bd41b036690ccace615cea6bc41 Mon Sep 17 00:00:00 2001 From: Martin Bammer Date: Sun, 14 Apr 2024 00:07:22 +0200 Subject: [PATCH] Update doc --- doc/count.md | 119 -------------------- doc/scandir.md | 170 ---------------------------- doc/walk.md | 91 --------------- pyscandir/doc/count.md | 119 ++++++++++++++++++++ pyscandir/doc/scandir.md | 170 ++++++++++++++++++++++++++++ pyscandir/doc/walk.md | 91 +++++++++++++++ scandir/doc/count.md | 148 +++++++++++++++++++----- scandir/doc/index.html | 93 --------------- scandir/doc/scandir.md | 238 ++++++++++++++++++++++++++++++++------- scandir/doc/walk.md | 172 +++++++++++++++++++++------- 10 files changed, 832 insertions(+), 579 deletions(-) delete mode 100644 doc/count.md delete mode 100644 doc/scandir.md delete mode 100644 doc/walk.md create mode 100644 pyscandir/doc/count.md create mode 100644 pyscandir/doc/scandir.md create mode 100644 pyscandir/doc/walk.md delete mode 100644 scandir/doc/index.html diff --git a/doc/count.md b/doc/count.md deleted file mode 100644 index c28200d..0000000 --- a/doc/count.md +++ /dev/null @@ -1,119 +0,0 @@ -# The API of class ``Count`` - -## ``Statistics`` - -The ``Statistics`` class is the return value of class methods ``results`` and ``collect`` of class ``Count``. - -### ``Statistics`` has following class members - -- ``dirs`` contains number of directories. -- ``files`` contains number of files. -- ``slinks`` contains number of symlinks. -- ``hlinks`` contains number of hardlinks. -- ``devices`` contains number of devices (only relevant on Unix systems). -- ``pipes`` contains number of named pipes (only relevant on Unix systems). -- ``size`` contains total size of all files. -- ``usage`` contains total usage on disk. -- ``errors`` list of access errors (list of strings). -- ``duration`` time taken for scanning (in seconds as a float). 
- -## ``Count(root_path: str, skip_hidden: bool = False, max_depth: int = 0, max_file_cnt: int = 0, dir_include: List[str] = None, dir_exclude: List[str] = None, file_include: List[str] = None, file_exclude: List[str] = None, case_sensitive: bool = False, return_type: ReturnType = ReturnType.Base)`` - -Creates a class instance for calculating statistics. The class instance initially does nothing. To start the scan either the method ``start`` or the method ``collect`` has to be called or a context has to be created (``with Count(...) as instance:``). When the context is closed the background thread is stopped. - -### Parameters - -- ``root_path`` is directory to scan. ``~`` is allowed on Unix systems. -- ``skip_hidden`` if ``True`` then ignore all hidden files and directories. -- ``max_depth`` is maximum depth of iteration. If ``0`` then depth limit is disabled. -- ``max_file_cnt`` is maximum number of files to collect. If ``0`` then limit is disabled. -- ``dir_include`` list of patterns for directories to include. -- ``dir_exclude`` list of patterns for directories to exclude. -- ``file_include`` list of patterns for files to include. -- ``file_exclude`` list of patterns for files to exclude. -- ``case_sensitive`` if `True` then do case sensitive pattern matching. -- ``return_type`` defines type of data returned. - -For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). - -### Return types - -- ``ReturnType.Base`` calculate statistcs for ``dirs``, ``files``, ``slinks``, ``size`` and ``usage``. -- ``ReturnType.Ext`` in addition to above calculate statistcs ``hlinks`` and on Unix platforms ``devices`` and ``pipes``. - -### Example usage of the context manager - -```python -import scandir_rs as scandir - -with scandir.Count("~/workspace", extended=True)) as instance: - while instance.busy(): - statistics = instance.results() - # Do something -``` - -### ``clear()`` - -Clear all results. 
- -### ``start()`` - -Start calculating statistics in background. Raises an expception if a task is already running. - -### ``join()`` - -Wait for parsing task to finish. - -### ``stop()`` - -Stop parsing task. - -### ``collect() -> Statistics`` - -Calculate statistics and return a ``Statistics`` object when the task has finished. This method is blocking and releases the GIL. - -### ``has_results() -> bool`` - -Returns ``True`` if new statistics are available. - -### ``results() -> Statistics`` - -Return a ``Statistics`` object with the current statistics. - -### ``has_errors() -> bool`` - -Returns ``True`` if errors occured while scanning the directory tree. The errors can be found in the statistics object. - -### ``duration -> float`` - -Returns the duration of the task in seconds as float. As long as the task is running it will return 0. - -### ``finished -> bool`` - -Returns ``True`` after the task has finished. - -### ``busy -> bool`` - -Returns ``True`` while a task is running. - -### ``as_dict() -> dict`` - -Returns statistics as a ``dict``. Result will only contain the keys of which the values are non zero. - -### ``to_speedy() -> bytes`` - -Feature `speedy` enabled. - -Returns statistics as [speedy](https://docs.rs/speedy/latest/speedy) encoded byte string. - -### ``to_bincode() -> bytes`` - -Feature `bincode` enabled. - -Returns statistics as [bincode](https://docs.rs/bincode/latest/bincode) encoded byte string. - -### ``to_json() -> str`` - -Feature `json` enabled. - -Returns statistics as [json](https://docs.rs/serde_json/latest/serde_json) encoded string. diff --git a/doc/scandir.md b/doc/scandir.md deleted file mode 100644 index 4739070..0000000 --- a/doc/scandir.md +++ /dev/null @@ -1,170 +0,0 @@ -# The API of class ``Scandir`` - -## ``ScandirResult`` - -Is an enum which can be: - -``DirEntry`` -``DirEntryExt`` - -## ``DirEntry`` - -- ``path`` relative path -- ``is_symlink`` ``True`` is entry is a symbolic link. 
-- ``is_dir`` ``True`` is entry is a directory. -- ``is_file`` ``True`` is entry is a file. -- ``st_ctime`` creation time in seconds as float. -- ``st_mtime`` modification time in seconds as float. -- ``st_atime`` access time in seconds as float. -- ``st_size`` size of entry. - -## ``DirEntryExt`` - -- ``is_symlink`` ``True`` is entry is a symbolic link. -- ``is_dir`` ``True`` is entry is a directory. -- ``is_file`` ``True`` is entry is a file. -- ``st_ctime`` creation time in seconds as float. -- ``st_mtime`` modification time in seconds as float. -- ``st_atime`` access time in seconds as float. -- ``st_mode`` file access mode / rights. -- ``st_ino`` inode number (only for Unix). -- ``st_dev`` device number (only for Unix). -- ``st_nlink`` number of hard links. -- ``st_size`` size of entry. -- ``st_blksize`` block size of file system. -- ``st_blocks`` number of blocks used. -- ``st_uid`` user id (only for Unix). -- ``st_gid`` groud id (only for Unix). -- ``st_rdev`` device number (for character and block devices on Unix). - -## ``Scandir(root_path: str, sorted: bool = False, skip_hidden: bool = False, metadata: bool = False, metadata_ext: bool = False, max_depth: int = 0, dir_include: list = None, dir_exclude: list = None, file_include: list = None, file_exclude: list = None, case_sensitive: bool = True, return_type: int = RETURN_TYPE_WALK, store: bool = true)`` - -Creates a class object for more control when reading the directory contents. Useful when the iteration should be doine in background without blocking the application. The class instance initially does nothing. To start the scan either the method ``start`` has to be called or a context has to be created (``with ClassInstance:``). When the context is closed the background thread is stopped. - -The returned results are tuples with absolute path and `DirEntry`, `DirEntryExt` or `DirEntryFull` object, depending on the `return_type`. In case of an error an error string is returned. 
- -### Parameters - -- ``root_path`` is directory to scan. ``~`` is allowed on Unix systems. -- ``sorted`` if ``True`` alphabetically sort results. -- ``skip_hidden`` if ``True`` ignore all hidden files and directories. -- ``metadata`` if ``True`` also fetch some metadata. -- ``metadata_ext`` if ``True`` also fetch extended metadata. -- ``max_depth`` is maximum depth of iteration. If ``0`` then depth limit is disabled. -- ``dir_include`` list of patterns for directories to include. -- ``dir_exclude`` list of patterns for directories to exclude. -- ``file_include`` list of patterns for files to include. -- ``file_exclude`` list of patterns for files to exclude. -- ``case_sensitive`` if `True` then do case sensitive pattern matching. -- ``return_type`` defines type of data returned. -- ``store`` store results in local structure. - -For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). - -### Return types - -- ``ReturnType.Base`` return ``DirEntry`` objects. -- ``ReturnType.Ext`` return ``DirEntryExt`` objects. - -### ``clear()`` - -Clear all results. - -### ``start()`` - -Start parsing the directory tree in background. Raises an expception if a task is already running. - -### ``join()`` - -Wait for parsing task to finish. - -### ``stop()`` - -Stop parsing task. - -### ``collect() -> Tuple[List[ScandirResult], List[Tuple[str, str]]]`` - -Parse file tree and wait until parsing has finished. Method ``start`` will be called if not already done. This method returns the same as the ``results`` method. It is blocking and releases the GIL. -``Error`` contains a tuple with 2 strings. First string contains path to file. Second string is the error message. - -### ``has_results(only_new: bool | None = True) -> bool`` - -Returns ``True`` if new entries or errors are available and ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` and any entries and errors have been collected since the start of the parse task. 
- -### ``results_cnt(only_new: bool | None = True) -> int`` - -Returns the number of new entries and errors if ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` the number of entries and errors since the start of the parse task. - -### ``results(only_new: bool | None = True) -> Tuple[List[ScandirResult], List[str, str]]`` - -Returns entries and errors. - -If ``only_new`` is ``True`` (default) then return all results and errors collected so far else return only new results and errors. - -### ``has_entries(only_new: bool | None = True) -> bool`` - -Returns ``True`` if new entries are available and ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` and any entries have been collected since the start of the parse task. - -### ``entries_cnt(only_new: bool | None = True) -> int`` - -Returns the number of new entries if ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` the number of entries since the start of the parse task. - -### ``entries(only_new: bool | None = True) -> List[Tuple[str, Toc]]`` - -Returns entries. - -If ``only_new`` is ``True`` (default) then return all results and errors collected so far else return only new results and errors. - -### ``has_errors() -> bool`` - -Returns ``True`` if new errors are available and ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` and any errors have been collected since the start of the parse task. - -### ``errors_cnt(only_new: bool | None = True) -> int`` - -Returns the number of new errors if ``only_new`` is ``True`` (default) or in case ``only_new`` is ``False`` the number of errors since the start of the parse task. - -### ``errors(only_new: bool | None = True) -> List[Tuple[str, str]]`` - -Returns errors. - -If ``only_new`` is ``True`` (default) then return all results and errors collected so far else return only new results and errors. - -### ``duration -> float`` - -Returns the duration of the parsing task. 
As long as the task is running it will return 0. - -### ``finished -> bool`` - -Returns ``True`` after the parsing task has finished. - -### ``busy -> bool`` - -Returns ``True`` while a parsing task is running. - -### ``statistics -> Statistics`` - -Returns the statistics for all currently collected results. - -### ``as_dict(only_new: bool | None = True) -> Dict[str, DirEntry | DirEntryExt | str]`` - -Returns entries and errors as dictionary. - -If ``only_new`` is ``True`` then return all results collected so far else return only new results. Each result consists of root directory and ``Toc``. - -### ``to_speedy() -> bytes`` - -Feature `speedy` enabled. - -Returns statistics as [speedy](https://docs.rs/speedy/latest/speedy) encoded byte string. - -### ``to_bincode() -> bytes`` - -Feature `bincode` enabled. - -Returns statistics as [bincode](https://docs.rs/bincode/latest/bincode) encoded byte string. - -### ``to_json() -> str`` - -Feature `json` enabled. - -Returns statistics as [json](https://docs.rs/serde_json/latest/serde_json) encoded string. diff --git a/doc/walk.md b/doc/walk.md deleted file mode 100644 index 524f661..0000000 --- a/doc/walk.md +++ /dev/null @@ -1,91 +0,0 @@ -# The API of class ``Walk`` - -## ``Toc`` - -The ``Toc`` class is the return value of class method ``results`` and ``collect`` of class ``Walk``. - -### ``Toc`` has following class members - -- ``dirs`` list of directory names. -- ``files`` list of filenames. -- ``symlinks`` list of symlink names. -- ``other`` list of names of all other entry types. -- ``errors`` list of access errors (list of strings). 
- -## ``Walk(root_path: str, sorted: bool = False, skip_hidden: bool = False, max_depth: int = 0, max_file_cnt: int = 0, dir_include: List[str] = None, dir_exclude: List[str] = None, file_include: List[str] = None, file_exclude: List[str] = None, case_sensitive: bool = True, return_type: ReturnType = ReturnType.Base, store: bool = true)`` - -Creates a class instance for calculating statistics. The class instance initially does nothing. To start the scan either the method ``start`` or the method ``collect`` has to be called or a context has to be created (``with Walk(...) as instance:``). When the context is closed the background thread is stopped. - -### Parameters - -- ``root_path`` is directory to scan. ``~`` is allowed on Unix systems. -- ``sorted`` if ``True`` alphabetically sort results. -- ``skip_hidden`` if ``True`` then ignore all hidden files and directories. -- ``max_depth`` is maximum depth of iteration. If ``0`` then depth limit is disabled. -- ``dir_include`` list of patterns for directories to include. -- ``dir_exclude`` list of patterns for directories to exclude. -- ``file_include`` list of patterns for files to include. -- ``file_exclude`` list of patterns for files to exclude. -- ``case_sensitive`` if `True` then do case sensitive pattern matching. -- ``return_type`` defines type of data returned. -- ``store`` store results in local structure. - -For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). - -### Return types - -- ``ReturnType.Base`` return ``dirs`` and ``files`` as ``os.walk`` does. -- ``ReturnType.Ext`` return additional data: ``symlinks``, ``other`` and ``errors``. - -**Please note:** -> Due to limitations of jwalk the returned errors just contain the error message without any information to which files the errors correspond to. - -### ``clear()`` - -Clear all results. - -### ``start()`` - -Start parsing the directory tree in background. Raises an exception if a task is already running. 
- -### ``join()`` - -Wait for task to finish. - -### ``stop()`` - -Stop task. - -### ``collect() -> Toc`` - -Collect directories, files, etc. and return a ``Toc`` object when the task has finished. This method is blocking and releases the GIL. Method ``start`` will be called if not already done. - -### ``has_results(only_new: bool | None = True) -> bool`` - -Returns ``True`` if new entries are available and ``only_new`` is ``False`` or in case ``only_new`` is ``False`` and any entries have been collected since task start. - -### ``results_cnt(only_new: bool | None = True) -> int`` - -Returns number of results collected so far. If ``update`` is ``True`` then new results are counted too. - -### ``results(ronly_new: bool | None = True) -> List[Tuple[str, Toc]]`` - -Returns entries and errors. - -If ``only_new`` is ``True`` (default) then return all ``Toc`` collected so far else return only new ``Toc``. - -### ``has_errors() -> bool`` - -Returns ``True`` if errors occured while walking through the directory tree. The error messages can be found in ``Toc`` objects returned. - -### ``duration -> float`` - -Returns the duration of the task in seconds as float. As long as the task is running it will return 0. - -### ``finished -> bool`` - -Returns ``True`` after the task has finished. - -### ``busy -> bool`` - -Returns ``True`` while a task is running. diff --git a/pyscandir/doc/count.md b/pyscandir/doc/count.md new file mode 100644 index 0000000..7a4098c --- /dev/null +++ b/pyscandir/doc/count.md @@ -0,0 +1,119 @@ +# The API of class `Count` + +## `Statistics` + +The `Statistics` class is the return value of class methods `results` and `collect` of class `Count`. + +### `Statistics` has following class members + +- `dirs` contains number of directories. +- `files` contains number of files. +- `slinks` contains number of symlinks. +- `hlinks` contains number of hardlinks. +- `devices` contains number of devices (only relevant on Unix systems). 
+- `pipes` contains number of named pipes (only relevant on Unix systems). +- `size` contains total size of all files. +- `usage` contains total usage on disk. +- `errors` list of access errors (list of strings). +- `duration` time taken for scanning (in seconds as a float). + +## `Count(root_path: str, skip_hidden: bool = False, max_depth: int = 0, max_file_cnt: int = 0, dir_include: List[str] = None, dir_exclude: List[str] = None, file_include: List[str] = None, file_exclude: List[str] = None, case_sensitive: bool = False, return_type: ReturnType = ReturnType.Base)` + +Creates a class instance for calculating statistics. The class instance initially does nothing. To start the scan either the method `start` or the method `collect` has to be called or a context has to be created (`with Count(...) as instance:`). When the context is closed the background thread is stopped. + +### Parameters + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `skip_hidden` if `True` then ignore all hidden files and directories. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `max_file_cnt` is maximum number of files to collect. If `0` then limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `True` then do case sensitive pattern matching. +- `return_type` defines type of data returned. + +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType.Base` calculate statistics for `dirs`, `files`, `slinks`, `size` and `usage`. +- `ReturnType.Ext` in addition to above calculate statistics for `hlinks` and on Unix platforms `devices` and `pipes`.
+ +### Example usage of the context manager + +```python +import scandir_rs as scandir + +with scandir.Count("~/workspace", extended=True) as instance: + while instance.busy(): + statistics = instance.results() + # Do something +``` + +### `clear()` + +Clear all results. + +### `start()` + +Start calculating statistics in background. Raises an exception if a task is already running. + +### `join()` + +Wait for parsing task to finish. + +### `stop()` + +Stop parsing task. + +### `collect() -> Statistics` + +Calculate statistics and return a `Statistics` object when the task has finished. This method is blocking and releases the GIL. + +### `has_results() -> bool` + +Returns `True` if new statistics are available. + +### `results() -> Statistics` + +Return a `Statistics` object with the current statistics. + +### `has_errors() -> bool` + +Returns `True` if errors occurred while scanning the directory tree. The errors can be found in the statistics object. + +### `duration -> float` + +Returns the duration of the task in seconds as float. As long as the task is running it will return 0. + +### `finished -> bool` + +Returns `True` after the task has finished. + +### `busy -> bool` + +Returns `True` while a task is running. + +### `as_dict() -> dict` + +Returns statistics as a `dict`. The result only contains the keys whose values are non-zero. + +### `to_speedy() -> bytes` + +Feature `speedy` enabled. + +Returns statistics as [speedy](https://docs.rs/speedy/latest/speedy) encoded byte string. + +### `to_bincode() -> bytes` + +Feature `bincode` enabled. + +Returns statistics as [bincode](https://docs.rs/bincode/latest/bincode) encoded byte string. + +### `to_json() -> str` + +Feature `json` enabled. + +Returns statistics as [json](https://docs.rs/serde_json/latest/serde_json) encoded string.
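To make the `Statistics` fields documented above more concrete, here is a minimal sketch that computes the `ReturnType.Base` style counters (`dirs`, `files`, `slinks`, `size`) using only the Python standard library. It is not how `scandir_rs` is implemented: `BaseStatistics` and `count_tree` are hypothetical names introduced for illustration, the sketch assumes the root directory itself is not counted in `dirs`, and it omits `usage`, filtering, and the background thread.

```python
import os
from dataclasses import dataclass, field


@dataclass
class BaseStatistics:
    """Hypothetical stand-in for a few of the documented `Statistics` fields."""
    dirs: int = 0
    files: int = 0
    slinks: int = 0
    size: int = 0
    errors: list = field(default_factory=list)


def count_tree(root_path: str) -> BaseStatistics:
    """Walk `root_path` iteratively and count entries (illustrative only)."""
    stats = BaseStatistics()
    # `~` is expanded here to mirror the documented behaviour of `root_path`.
    stack = [os.path.expanduser(root_path)]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_symlink():
                        # Symlinks are counted but never followed.
                        stats.slinks += 1
                    elif entry.is_dir(follow_symlinks=False):
                        stats.dirs += 1
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        stats.files += 1
                        stats.size += entry.stat(follow_symlinks=False).st_size
        except OSError as exc:
            # Access errors are collected as strings, like the `errors` field.
            stats.errors.append(str(exc))
    return stats
```

Calling `count_tree("~/workspace")` blocks until the whole tree has been visited, which roughly corresponds to what `collect()` is described as doing, although without the speed of the native implementation.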
diff --git a/pyscandir/doc/scandir.md b/pyscandir/doc/scandir.md new file mode 100644 index 0000000..889e12f --- /dev/null +++ b/pyscandir/doc/scandir.md @@ -0,0 +1,170 @@ +# The API of class `Scandir` + +## `ScandirResult` + +Is an enum which can be: + +`DirEntry` +`DirEntryExt` + +## `DirEntry` + +- `path` relative path +- `is_symlink` `True` is entry is a symbolic link. +- `is_dir` `True` is entry is a directory. +- `is_file` `True` is entry is a file. +- `st_ctime` creation time in seconds as float. +- `st_mtime` modification time in seconds as float. +- `st_atime` access time in seconds as float. +- `st_size` size of entry. + +## `DirEntryExt` + +- `is_symlink` `True` is entry is a symbolic link. +- `is_dir` `True` is entry is a directory. +- `is_file` `True` is entry is a file. +- `st_ctime` creation time in seconds as float. +- `st_mtime` modification time in seconds as float. +- `st_atime` access time in seconds as float. +- `st_mode` file access mode / rights. +- `st_ino` inode number (only for Unix). +- `st_dev` device number (only for Unix). +- `st_nlink` number of hard links. +- `st_size` size of entry. +- `st_blksize` block size of file system. +- `st_blocks` number of blocks used. +- `st_uid` user id (only for Unix). +- `st_gid` groud id (only for Unix). +- `st_rdev` device number (for character and block devices on Unix). + +## `Scandir(root_path: str, sorted: bool = False, skip_hidden: bool = False, metadata: bool = False, metadata_ext: bool = False, max_depth: int = 0, dir_include: list = None, dir_exclude: list = None, file_include: list = None, file_exclude: list = None, case_sensitive: bool = True, return_type: int = RETURN_TYPE_WALK, store: bool = true)` + +Creates a class object for more control when reading the directory contents. Useful when the iteration should be doine in background without blocking the application. The class instance initially does nothing. 
To start the scan either the method `start` has to be called or a context has to be created (`with ClassInstance:`). When the context is closed the background thread is stopped. + +The returned results are tuples with absolute path and `DirEntry`, `DirEntryExt` or `DirEntryFull` object, depending on the `return_type`. In case of an error an error string is returned. + +### Parameters + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `sorted` if `True` alphabetically sort results. +- `skip_hidden` if `True` ignore all hidden files and directories. +- `metadata` if `True` also fetch some metadata. +- `metadata_ext` if `True` also fetch extended metadata. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `True` then do case sensitive pattern matching. +- `return_type` defines type of data returned. +- `store` store results in local structure. + +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType.Base` return `DirEntry` objects. +- `ReturnType.Ext` return `DirEntryExt` objects. + +### `clear()` + +Clear all results. + +### `start()` + +Start parsing the directory tree in background. Raises an expception if a task is already running. + +### `join()` + +Wait for parsing task to finish. + +### `stop()` + +Stop parsing task. + +### `collect() -> Tuple[List[ScandirResult], List[Tuple[str, str]]]` + +Parse file tree and wait until parsing has finished. Method `start` will be called if not already done. This method returns the same as the `results` method. It is blocking and releases the GIL. +`Error` contains a tuple with 2 strings. First string contains path to file. 
Second string is the error message. + +### `has_results(only_new: bool | None = True) -> bool` + +Returns `True` if new entries or errors are available and `only_new` is `True` (default) or in case `only_new` is `False` and any entries and errors have been collected since the start of the parse task. + +### `results_cnt(only_new: bool | None = True) -> int` + +Returns the number of new entries and errors if `only_new` is `True` (default) or in case `only_new` is `False` the number of entries and errors since the start of the parse task. + +### `results(only_new: bool | None = True) -> Tuple[List[ScandirResult], List[Tuple[str, str]]]` + +Returns entries and errors. + +If `only_new` is `True` (default) then return only new results and errors else return all results and errors collected so far. + +### `has_entries(only_new: bool | None = True) -> bool` + +Returns `True` if new entries are available and `only_new` is `True` (default) or in case `only_new` is `False` and any entries have been collected since the start of the parse task. + +### `entries_cnt(only_new: bool | None = True) -> int` + +Returns the number of new entries if `only_new` is `True` (default) or in case `only_new` is `False` the number of entries since the start of the parse task. + +### `entries(only_new: bool | None = True) -> List[Tuple[str, Toc]]` + +Returns entries. + +If `only_new` is `True` (default) then return only new entries else return all entries collected so far. + +### `has_errors() -> bool` + +Returns `True` if new errors are available and `only_new` is `True` (default) or in case `only_new` is `False` and any errors have been collected since the start of the parse task. + +### `errors_cnt(only_new: bool | None = True) -> int` + +Returns the number of new errors if `only_new` is `True` (default) or in case `only_new` is `False` the number of errors since the start of the parse task.
+ +### `errors(only_new: bool | None = True) -> List[Tuple[str, str]]` + +Returns errors. + +If `only_new` is `True` (default) then return all results and errors collected so far else return only new results and errors. + +### `duration -> float` + +Returns the duration of the parsing task. As long as the task is running it will return 0. + +### `finished -> bool` + +Returns `True` after the parsing task has finished. + +### `busy -> bool` + +Returns `True` while a parsing task is running. + +### `statistics -> Statistics` + +Returns the statistics for all currently collected results. + +### `as_dict(only_new: bool | None = True) -> Dict[str, DirEntry | DirEntryExt | str]` + +Returns entries and errors as dictionary. + +If `only_new` is `True` then return all results collected so far else return only new results. Each result consists of root directory and `Toc`. + +### `to_speedy() -> bytes` + +Feature `speedy` enabled. + +Returns statistics as [speedy](https://docs.rs/speedy/latest/speedy) encoded byte string. + +### `to_bincode() -> bytes` + +Feature `bincode` enabled. + +Returns statistics as [bincode](https://docs.rs/bincode/latest/bincode) encoded byte string. + +### `to_json() -> str` + +Feature `json` enabled. + +Returns statistics as [json](https://docs.rs/serde_json/latest/serde_json) encoded string. diff --git a/pyscandir/doc/walk.md b/pyscandir/doc/walk.md new file mode 100644 index 0000000..bc82f7e --- /dev/null +++ b/pyscandir/doc/walk.md @@ -0,0 +1,91 @@ +# The API of class `Walk` + +## `Toc` + +The `Toc` class is the return value of class method `results` and `collect` of class `Walk`. + +### `Toc` has following class members + +- `dirs` list of directory names. +- `files` list of filenames. +- `symlinks` list of symlink names. +- `other` list of names of all other entry types. +- `errors` list of access errors (list of strings). 
+ +## `Walk(root_path: str, sorted: bool = False, skip_hidden: bool = False, max_depth: int = 0, max_file_cnt: int = 0, dir_include: List[str] = None, dir_exclude: List[str] = None, file_include: List[str] = None, file_exclude: List[str] = None, case_sensitive: bool = True, return_type: ReturnType = ReturnType.Base, store: bool = true)` + +Creates a class instance for calculating statistics. The class instance initially does nothing. To start the scan either the method `start` or the method `collect` has to be called or a context has to be created (`with Walk(...) as instance:`). When the context is closed the background thread is stopped. + +### Parameters + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `sorted` if `True` alphabetically sort results. +- `skip_hidden` if `True` then ignore all hidden files and directories. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `True` then do case sensitive pattern matching. +- `return_type` defines type of data returned. +- `store` store results in local structure. + +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType.Base` return `dirs` and `files` as `os.walk` does. +- `ReturnType.Ext` return additional data: `symlinks`, `other` and `errors`. + +**Please note:** +> Due to limitations of jwalk the returned errors just contain the error message without any information to which files the errors correspond to. + +### `clear()` + +Clear all results. + +### `start()` + +Start parsing the directory tree in background. Raises an exception if a task is already running. + +### `join()` + +Wait for task to finish. 
+ +### `stop()` + +Stop task. + +### `collect() -> Toc` + +Collect directories, files, etc. and return a `Toc` object when the task has finished. This method is blocking and releases the GIL. Method `start` will be called if not already done. + +### `has_results(only_new: bool | None = True) -> bool` + +Returns `True` if new entries are available and `only_new` is `True` (default) or in case `only_new` is `False` and any entries have been collected since task start. + +### `results_cnt(only_new: bool | None = True) -> int` + +Returns number of results collected so far. If `update` is `True` then new results are counted too. + +### `results(only_new: bool | None = True) -> List[Tuple[str, Toc]]` + +Returns entries and errors. + +If `only_new` is `True` (default) then return only new `Toc` else return all `Toc` collected so far. + +### `has_errors() -> bool` + +Returns `True` if errors occurred while walking through the directory tree. The error messages can be found in `Toc` objects returned. + +### `duration -> float` + +Returns the duration of the task in seconds as float. As long as the task is running it will return 0. + +### `finished -> bool` + +Returns `True` after the task has finished. + +### `busy -> bool` + +Returns `True` while a task is running. diff --git a/scandir/doc/count.md b/scandir/doc/count.md index 632008b..3e14bac 100644 --- a/scandir/doc/count.md +++ b/scandir/doc/count.md @@ -1,26 +1,122 @@ -# Count - -```rust -// collect() starts the worker thread and waits until it has finished. The line below is blocking.
-let statistics = instance.collect()?; -``` - -```rust -let mut instance = Count::new(&path)?; -instance.start()?; -loop { - if !instance.busy() { - break; - } - // Do something - thread::sleep(Duration::from_millis(10)); -} -// collect() immediately returns because the worker thread has already finished. -let statistics = instance.collect()?; -``` +# The API of class `Count` + +## Statistics + +The `Statistics` class is the return value of class methods `results` and `collect` of class `Count`. + +### `Statistics` has following class members + +- `dirs` contains number of directories. +- `files` contains number of files. +- `slinks` contains number of symlinks. +- `hlinks` contains number of hardlinks. +- `devices` contains number of devices (only relevant on Unix systems). +- `pipes` contains number of named pipes (only relevant on Unix systems). +- `size` contains total size of all files. +- `usage` contains total usage on disk. +- `errors` list of access errors (list of strings). +- `duration` time taken for scanning (in seconds as a float). + +## Count::new>(root_path: P) -> Result + +Creates a class instance for calculating statistics. The class instance initially does nothing. To start the scan either the method `start` or the method `collect` has to be called. + +### Class members + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `skip_hidden` if `true` then ignore all hidden files and directories. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `max_file_cnt` is maximum number of files to collect. If `0` then limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `true` then do case sensitive pattern matching. +- `return_type` defines type of data returned. 
+ +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType::Base` calculate statistics for `dirs`, `files`, `slinks`, `size` and `usage`. +- `ReturnType::Ext` in addition to the above, calculate statistics for `hlinks` and, on Unix platforms, `devices` and `pipes`. + +### skip_hidden(mut self, skip_hidden: bool) -> Self + +Set to `true` to skip hidden (starting with a dot) files. + +### max_depth(mut self, depth: usize) -> Self + +Set the maximum depth of entries yielded by the iterator. + +### max_file_cnt(mut self, max_file_cnt: usize) -> Self + +Set maximum number of files to collect. + +### dir_include(mut self, dir_include: Option>) -> Self + +Set directory include filter. + +### dir_exclude(mut self, dir_exclude: Option>) -> Self + +Set directory exclude filter. + +### file_include(mut self, file_include: Option>) -> Self + +Set file include filter. + +### file_exclude(mut self, file_exclude: Option>) -> Self + +Set file exclude filter. + +### case_sensitive(mut self, case_sensitive: bool) -> Self + +Set case sensitive filename filtering. + +### extended(mut self, extended: bool) -> Self + +Set extended file type counting. + +### clear(&mut self) + +Clear all results. + +### start(&mut self) -> Result<(), Error> + +Start calculating statistics in background. Returns an error if a task is already running. + +### join(&mut self) -> bool + +Wait for parsing task to finish. + +### stop(&mut self) -> bool + +Stop parsing task. + +### collect(&mut self) -> Result + +Calculate statistics and return a `Statistics` object when the task has finished. + +### has_results(&self) -> bool + +Returns `true` if new statistics are available. + +### results(&mut self) -> Statistics + +Return a `Statistics` object with the current statistics. + +### has_errors(&mut self) -> bool + +Returns `true` if errors occurred while scanning the directory tree. The errors can be found in the statistics object. 
+ +### duration(&mut self) -> f64 + +Returns the duration of the task in seconds as float. As long as the task is running it will return 0. + +### finished(&self) -> bool + +Returns `true` after the task has finished. + +### busy(&self) -> bool + +Returns `true` while a task is running. diff --git a/scandir/doc/index.html b/scandir/doc/index.html deleted file mode 100644 index 7ffacdb..0000000 --- a/scandir/doc/index.html +++ /dev/null @@ -1,93 +0,0 @@ - - - - - - Benchmarks - - - - -
-

Benchmarks

-

These are the results of the benchmarks.rs run.

-

Benchmark results on Linux

-

Walk linux-5.9

-

Benchmark results for linux-5.9 file tree for walkdir crate including a metadata call for each entry and - scandir::Walk call.

- -

Walk usr

-

Benchmark results for /usr file tree for walkdir crate including a metadata call for each entry and - scandir::Walk call.

- -

Scandir linux-5.9

-

Benchmark results for linux-5.9 file tree for scan_dir crate including a metadata call for each entry and - scandir::Scandir call with and without collecting extended metadata information.

- -

Scandir usr

-

Benchmark results for /usr file tree for scan_dir crate including a metadata call for each entry and - scandir::Scandir call with and without collecting extended metadata information.

- -

Benchmark results on Windows

-

Walk linux-5.9

-

Benchmark results for linux-5.9 file tree for walkdir crate including a metadata call for each entry and - scandir::Walk call.

- -

Walk Windows

-

Benchmark results for /usr file tree for walkdir crate including a metadata call for each entry and - scandir::Walk call.

- -

Scandir linux-5.9

-

Benchmark results for linux-5.9 file tree for scan_dir crate including a metadata call for each entry and - scandir::Scandir call with and without collecting extended metadata information.

- -

Scandir Windows

-

Benchmark results for /usr file tree for scan_dir crate including a metadata call for each entry and - scandir::Scandir call with and without collecting extended metadata information.

- -
- - - \ No newline at end of file diff --git a/scandir/doc/scandir.md b/scandir/doc/scandir.md index 13e35b1..4874157 100644 --- a/scandir/doc/scandir.md +++ b/scandir/doc/scandir.md @@ -1,40 +1,198 @@ -# Scandir - -The most simple way of using `Scandir` is the example below. Use this if you just need the final results. - -```rust -// collect() starts the worker thread and waits until it has finished. The line below is blocking. -let results = Scandir::new(&path, None)?.collect()?; -``` - -If you need some more information, which `Scandir` via `instance` provides then use the example below. - -```rust -let mut instance = Scandir::new(&path, None)?; -// collect() starts the worker thread and waits until it has finished. The line below is blocking. -let results = instance.collect()?; -``` - -The example below uses extended metadata to identify more file types. Of course, it is slower. - -```rust -let mut instance = Scandir::new(&path, None)?.return_type(ReturnType::Ext); -let results = instance.collect()?; -``` - -If you want to have intermediate results, e.g. you want to show the progress to the user, the use the example below. - -```rust -let mut instance = Scandir::new(&path, None)?; -instance.start()?; -loop { - if !instance.busy() { - break; - } - let new_results = instance.results(true); - // Do something - thread::sleep(Duration::from_millis(10)); -} -// collect() immediately returns because the worker thread has already finished. -let results = instance.collect()?; -``` +# The API of class `Scandir` + +## ScandirResult + +Is an enum which can be: + +`DirEntry` +`DirEntryExt` + +## `DirEntry` + +- `path` relative path +- `is_symlink` `true` if entry is a symbolic link. +- `is_dir` `true` if entry is a directory. +- `is_file` `true` if entry is a file. +- `st_ctime` creation time in seconds as float. +- `st_mtime` modification time in seconds as float. +- `st_atime` access time in seconds as float. +- `st_size` size of entry. 
+ +## `DirEntryExt` + +- `is_symlink` `true` if entry is a symbolic link. +- `is_dir` `true` if entry is a directory. +- `is_file` `true` if entry is a file. +- `st_ctime` creation time in seconds as float. +- `st_mtime` modification time in seconds as float. +- `st_atime` access time in seconds as float. +- `st_mode` file access mode / rights. +- `st_ino` inode number (only for Unix). +- `st_dev` device number (only for Unix). +- `st_nlink` number of hard links. +- `st_size` size of entry. +- `st_blksize` block size of file system. +- `st_blocks` number of blocks used. +- `st_uid` user id (only for Unix). +- `st_gid` group id (only for Unix). +- `st_rdev` device number (for character and block devices on Unix). + +## Scandir::new>(root_path: P, store: Option) -> Result + +Creates a class instance for getting the metadata of the entries of a file tree. The class instance initially does nothing. To start the scan either the method `start` or the method `collect` has to be called. + +### Class members + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `sorted` if `true` alphabetically sort results. +- `skip_hidden` if `true` ignore all hidden files and directories. +- `metadata` if `true` also fetch some metadata. +- `metadata_ext` if `true` also fetch extended metadata. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `true` then do case sensitive pattern matching. +- `return_type` defines type of data returned. +- `store` store results in local structure. + +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType::Base` return `DirEntry` objects. 
+- `ReturnType::Ext` return `DirEntryExt` objects. + +### sorted(mut self, sorted: bool) -> Self + +Return results in sorted order. + +### skip_hidden(mut self, skip_hidden: bool) -> Self + +Set to `true` to skip hidden (starting with a dot) files. + +### max_depth(mut self, depth: usize) -> Self + +Set the maximum depth of entries yielded by the iterator. + +### max_file_cnt(mut self, max_file_cnt: usize) -> Self + +Set maximum number of files to collect. + +### dir_include(mut self, dir_include: Option>) -> Self + +Set directory include filter. + +### dir_exclude(mut self, dir_exclude: Option>) -> Self + +Set directory exclude filter. + +### file_include(mut self, file_include: Option>) -> Self + +Set file include filter. + +### file_exclude(mut self, file_exclude: Option>) -> Self + +Set file exclude filter. + +### case_sensitive(mut self, case_sensitive: bool) -> Self + +Set case sensitive filename filtering. + +### return_type(mut self, return_type: ReturnType) -> Self + +Set the type of data returned (`ReturnType::Base` or `ReturnType::Ext`). + +### clear(&mut self) + +Clear all results. + +### start(&mut self) -> Result<(), Error> + +Start parsing the directory tree in background. Returns an error if a task is already running. + +### join(&mut self) -> bool + +Wait for parsing task to finish. + +### stop(&mut self) -> bool + +Stop parsing task. + +### collect(&mut self) -> Result + +Collect the scan results and return them when the task has finished. This method is blocking. 
+ +### has_results(&mut self, only_new: bool) -> bool + +If `only_new` is `true` this method returns `true` if new results are available. +If `only_new` is `false` this method returns `true` if any results are available. + +### results_cnt(&mut self, only_new: bool) -> usize + +If `only_new` is `true` this method returns the number of new results. +If `only_new` is `false` this method returns the total number of results. + +### results(&mut self, only_new: bool) -> ScandirResults + +If `only_new` is `true` this method returns only the new results. +If `only_new` is `false` this method returns all results. + +### has_entries(&mut self, only_new: bool) -> bool + +If `only_new` is `true` this method returns `true` if new results are available. +If `only_new` is `false` this method returns `true` if any results are available. + +### entries_cnt(&mut self, only_new: bool) -> usize + +If `only_new` is `true` this method returns the number of new results. +If `only_new` is `false` this method returns the total number of results. + +### entries(&mut self, only_new: bool) -> Vec + +If `only_new` is `true` this method returns only the new results. +If `only_new` is `false` this method returns all results. + +### has_errors(&mut self) -> bool + +Returns `true` if errors occurred while scanning the file tree. + +### errors_cnt(&mut self) -> usize + +Returns the number of errors that occurred while scanning the file tree. + +### errors(&mut self, only_new: bool) -> ErrorsType + +Returns the errors. + +### to_speedy(&self) -> Result, speedy::Error> + +Returns the results serialized with `speedy`. +For this method the feature `speedy` needs to be enabled. + +### to_bincode(&self) -> bincode::Result> + +Returns the results serialized with `bincode`. +For this method the feature `bincode` needs to be enabled. + +### to_json(&self) -> serde_json::Result + +Returns the results serialized as `json`. +For this method the feature `json` needs to be enabled. 
+ +### statistics(&self) -> Statistics + +Returns the statistics of the results. + +### duration(&mut self) -> f64 + +Returns the duration of the task in seconds as float. As long as the task is running it will return 0. + +### finished(&self) -> bool + +Returns `true` after the task has finished. + +### busy(&self) -> bool + +Returns `true` while a task is running. diff --git a/scandir/doc/walk.md b/scandir/doc/walk.md index 8b1983b..6b0050a 100644 --- a/scandir/doc/walk.md +++ b/scandir/doc/walk.md @@ -1,40 +1,132 @@ -# Walk - -The most simple way of using `Walk` is the example below. Use this if you just need the final results. - -```rust -// collect() starts the worker thread and waits until it has finished. The line below is blocking. -let results = Walk::new(&path, None)?.collect()?; -``` - -If you need some more information, which `Walk` via `instance` provides then use the example below. - -```rust -let mut instance = Walk::new(&path, None)?; -// collect() starts the worker thread and waits until it has finished. The line below is blocking. -let results = instance.collect()?; -``` - -The example below uses extended metadata to identify more file types. Of course, it is slower. - -```rust -let mut instance = Walk::new(&path, None)?.return_type(ReturnType::Ext); -let results = instance.collect()?; -``` - -If you want to have intermediate results, e.g. you want to show the progress to the user, the use the example below. - -```rust -let mut instance = Walk::new(&path, None)?; -instance.start()?; -loop { - if !instance.busy() { - break; - } - let new_results = instance.results(true); - // Do something - thread::sleep(Duration::from_millis(10)); -} -// collect() immediately returns because the worker thread has already finished. -let results = instance.collect()?; -``` +# The API of class `Walk` + +## Toc + +The `Toc` class is the return value of class method `results` and `collect` of class `Walk`. 
+ +### `Toc` has following class members + +- `dirs` list of directory names. +- `files` list of filenames. +- `symlinks` list of symlink names. +- `other` list of names of all other entry types. +- `errors` list of access errors (list of strings). + +## Walk::new>(root_path: P, store: Option) -> Result + +Creates a class instance for getting the file tree. The class instance initially does nothing. To start the scan either the method `start` or the method `collect` has to be called. + +### Class members + +- `root_path` is directory to scan. `~` is allowed on Unix systems. +- `sorted` if `true` alphabetically sort results. +- `skip_hidden` if `true` then ignore all hidden files and directories. +- `max_depth` is maximum depth of iteration. If `0` then depth limit is disabled. +- `dir_include` list of patterns for directories to include. +- `dir_exclude` list of patterns for directories to exclude. +- `file_include` list of patterns for files to include. +- `file_exclude` list of patterns for files to exclude. +- `case_sensitive` if `true` then do case sensitive pattern matching. +- `return_type` defines type of data returned. +- `store` store results in local structure. + +For valid file patterns see module [glob](https://docs.rs/glob/0.3.0/glob/struct.Pattern.html). + +### Return types + +- `ReturnType::Base` return `dirs` and `files` as `os.walk` does. +- `ReturnType::Ext` return additional data: `symlinks`, `other` and `errors`. + +**Please note:** +> Due to limitations of jwalk the returned errors just contain the error message without any information about which files the errors correspond to. + +### sorted(mut self, sorted: bool) -> Self + +Return results in sorted order. + +### skip_hidden(mut self, skip_hidden: bool) -> Self + +Set to `true` to skip hidden (starting with a dot) files. + +### max_depth(mut self, depth: usize) -> Self + +Set the maximum depth of entries yielded by the iterator. 
+ +### max_file_cnt(mut self, max_file_cnt: usize) -> Self + +Set maximum number of files to collect. + +### dir_include(mut self, dir_include: Option>) -> Self + +Set directory include filter. + +### dir_exclude(mut self, dir_exclude: Option>) -> Self + +Set directory exclude filter. + +### file_include(mut self, file_include: Option>) -> Self + +Set file include filter. + +### file_exclude(mut self, file_exclude: Option>) -> Self + +Set file exclude filter. + +### case_sensitive(mut self, case_sensitive: bool) -> Self + +Set case sensitive filename filtering. + +### return_type(mut self, return_type: ReturnType) -> Self + +Set the type of data returned (`ReturnType::Base` or `ReturnType::Ext`). + +### clear(&mut self) + +Clear all results. + +### start(&mut self) -> Result<(), Error> + +Start parsing the directory tree in background. Returns an error if a task is already running. + +### join(&mut self) -> bool + +Wait for parsing task to finish. + +### stop(&mut self) -> bool + +Stop parsing task. + +### collect(&mut self) -> Result + +Collect the file tree and return a `Toc` object when the task has finished. This method is blocking. + +### has_results(&mut self, only_new: bool) -> bool + +If `only_new` is `true` this method returns `true` if new results are available. +If `only_new` is `false` this method returns `true` if any results are available. + +### results_cnt(&mut self, only_new: bool) -> usize + +If `only_new` is `true` this method returns the number of new results. +If `only_new` is `false` this method returns the total number of results. + +### results(&mut self, only_new: bool) -> Vec<(String, Toc)> + +If `only_new` is `true` this method returns only the new results. +If `only_new` is `false` this method returns all results. + +### has_errors(&mut self) -> bool + +Returns `true` if errors occurred while scanning the directory tree. The errors can be found in the returned `Toc` objects. + +### duration(&mut self) -> f64 + +Returns the duration of the task in seconds as float. 
As long as the task is running it will return 0. + +### finished(&self) -> bool + +Returns `true` after the task has finished. + +### busy(&self) -> bool + +Returns `true` while a task is running.
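
Reviewer note on the `only_new` flag: the docs above describe `has_results`, `results_cnt` and `results` as returning either only the new results or all results collected so far. The sketch below is one way to read those semantics. It is a self-contained model, not the crate's implementation: `ResultBuffer`, its `cursor` field and `push` are hypothetical names, and the choice that every `results` call marks all entries as seen is an assumption.

```rust
/// Hypothetical stand-in for a scanner's result store (not part of the scandir crate).
struct ResultBuffer {
    entries: Vec<String>,
    /// Index of the first entry that has not been handed out yet.
    cursor: usize,
}

impl ResultBuffer {
    fn new() -> Self {
        ResultBuffer { entries: Vec::new(), cursor: 0 }
    }

    /// Stands in for the background worker finding one more entry.
    fn push(&mut self, entry: &str) {
        self.entries.push(entry.to_string());
    }

    /// `only_new == true`: are there unseen entries? Otherwise: are there any entries?
    fn has_results(&self, only_new: bool) -> bool {
        if only_new {
            self.cursor < self.entries.len()
        } else {
            !self.entries.is_empty()
        }
    }

    /// `only_new == true`: number of unseen entries. Otherwise: total number of entries.
    fn results_cnt(&self, only_new: bool) -> usize {
        if only_new {
            self.entries.len() - self.cursor
        } else {
            self.entries.len()
        }
    }

    /// Returns unseen or all entries. Assumption: every call marks all entries as seen.
    fn results(&mut self, only_new: bool) -> Vec<String> {
        let start = if only_new { self.cursor } else { 0 };
        self.cursor = self.entries.len();
        self.entries[start..].to_vec()
    }
}

fn main() {
    let mut buffer = ResultBuffer::new();
    buffer.push("a.txt");
    buffer.push("b.txt");
    // First poll: both entries are new.
    println!("{:?}", buffer.results(true));
    buffer.push("c.txt");
    // Second poll: only the entry added since the last poll.
    println!("{:?}", buffer.results(true));
    // Asking for everything returns all three entries.
    println!("{:?}", buffer.results(false));
}
```

In a progress loop like the removed `Walk` example, passing `true` keeps each poll cheap because only the delta is copied, while `false` suits a final summary.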