🪦 This repo will soon be retired. extendio has been superseded by other snivilised projects (traverse, li18ngo, pants) as a result of lessons learned on the journey to learning Go programming.
This project provides extensions and alternative implementations of Go standard library packages, typically (but not limited to) `io` and `filepath`. It is intended that the client should be able to use it alongside standard packages such as `io/fs`. To make this easier, the convention within `extendio` is to name the sub-packages it contains with a prefix of x, so that there is no clash with the standard version and no need for an import alias; eg the `fs` package inside `extendio` is called `xfs`.
To invoke a traversal, create a `PrimarySession` with the root path:
import ("github.com/snivilised/extendio/xfs/nav")
session := nav.PrimarySession{
Path: "/foo/bar/",
}
Then configure it by calling the `Configure` method on the session:
callback := nav.LabelledTraverseCallback{
Fn: func(item *nav.TraverseItem) error {
fmt.Printf("Current Item Path: '%v' \n", item.Path)
// process the item here; return a non-nil error to report a failure
return nil
},
}
result := session.Configure(func(o *nav.TraverseOptions) {
o.Store.Subscription = nav.SubscribeFolders
o.Callback = callback
}).Run()
noOfFoldersFound := (*result.Metrics)[nav.MetricNoFoldersEn].Count
Points of Note:
- the callback here is actually an instance of `LabelledTraverseCallback`, which is a struct that contains the function to be invoked and a `Label`. The `Label` is optional and was defined for debugging purposes. When you have a lot of func definitions, it's difficult to identify which is which without having some form of identification.
- the function signature of `TraverseCallback` is defined as follows: `func(item *TraverseItem) error`
- `Configure` requires a function to be passed in that receives an instance of `TraverseOptions`, which is already populated with default values. The function the client provides simply sets the required options (see options reference). The `Callback` option is mandatory; if it is not set, the traversal will fail with a panic.
- the call to `Configure` returns an instance of `NavigationRunner`, which contains a single `Run` method that returns a `TraverseResult`
- the `TraverseResult` contains a `Metrics` item (of type `MetricCollection`) which currently indicates the number of files and folders the callback has been invoked for during the traversal. To inspect, use the `MetricEnum` (`MetricNoFilesEn`, `MetricNoFoldersEn`) to index into `Metrics` as illustrated in the example.
- this example traverses the file system rooted at the path indicated in the session ('/foo/bar/') and invokes the callback for all folders found in the tree.
- Provides a pre-emptive declarative paradigm, to allow the client to be notified on a wider set of criteria and to minimise callback invocation. This allows for more efficiency when navigating large directory trees.
- More comprehensive filtering capabilities incorporating that which is already provided by `filepath.Match`. The filtering will include positive and negative matching for globs (shell patterns) and regular expressions.
- The client is able to define custom filters
- The callback function signature will differ from `WalkDir`. Instead of being passed just the corresponding `fs.DirEntry`, another custom type will be introduced which contains `fs.DirEntry` as a member. More properties can be attached to this new abstraction to support more features (as indicated below).
- Add `Depth` property. This will indicate to the callback how many levels of descent have occurred relative to the root directory.
- Add `IsLeaf` property. The client may need to know if the current directory being notified for is a leaf directory. In fact, as part of the declarative paradigm, the client may, if they wish, request to be notified for leaf nodes only and this will be achieved using the `IsLeaf` property.
- Add `Resume` function. Typically required in recovery scenarios, particularly when a large directory tree is being traversed and has been terminated part way, possibly in response to a CTRL-C interrupt. Instead of requiring a full traversal of the directory tree, the `Resume` function can be used to process only that part of the tree not visited in the previous run. The `Resume` function would require the `Root` path parameter and a checkpoint path. The term fractured ancestor is introduced, which denotes those directory nodes in the tree whose contents were only partially visited. `Resume` would traverse the tree beginning at the checkpoint, then get the parent and find successor sibling nodes, invoking their corresponding trees. It would then ascend and repeat the process until the root is encountered. `Resume` needs to invoke `Traverse` for each sub tree individually.
- In order to support i18n, error handling will be implemented slightly differently to the standard error handling paradigm already established in Go. Simply returning an error which is just a string containing an error message, is not i18n friendly. We could just return an error code which of course would be documented, but it would be more useful to return a richer abstraction, another object which contains various properties about the error. This object will contain an error code (probably int based, or pseudo enum). It can even contain a string member which contains the error message in English, but the error code would allow for messages to be translated (possibly using Go templates). The exact implementation has not been finalised yet, but this is the direction of travel.
- Contains an alternative version of bus. The requirement for a bus implementation is based upon the need to create loosely coupled internal code. The original bus was designed with concurrency in mind so it uses locks to achieve thread safety. This aspect is surplus to requirements as all we need it for are synchronous scenarios, so it has been stripped out.
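The ascent described for `Resume` can be illustrated with a small, self-contained sketch. It is not the library's implementation: the `Tree` type and `resumePlan` function are hypothetical names, the tree is held in memory rather than read from disk, and siblings are assumed to be visited in lexical order.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// Tree is a hypothetical in-memory stand-in for a directory tree:
// it maps a folder path to the paths of its child folders.
type Tree map[string][]string

// resumePlan returns the roots of the sub trees that a resume would traverse:
// the checkpoint first, then the successor siblings found at each fractured
// ancestor while ascending towards root. Lexical sibling order is assumed.
func resumePlan(tree Tree, root, checkpoint string) []string {
	plan := []string{checkpoint}
	current := checkpoint
	for current != root {
		parent := path.Dir(current)
		if parent == current {
			break // checkpoint was not below root; nothing more to ascend
		}
		siblings := append([]string(nil), tree[parent]...)
		sort.Strings(siblings)
		for _, sibling := range siblings {
			if sibling > current {
				plan = append(plan, sibling) // not yet visited in the previous run
			}
		}
		current = parent
	}
	return plan
}

func main() {
	tree := Tree{
		"/music":      {"/music/ambient", "/music/jazz", "/music/rock"},
		"/music/jazz": {"/music/jazz/bebop", "/music/jazz/fusion", "/music/jazz/swing"},
	}
	for _, subTree := range resumePlan(tree, "/music", "/music/jazz/fusion") {
		fmt.Println(subTree)
	}
}
```

In this example `/music/jazz` and `/music` are the fractured ancestors; each entry in the returned plan would be traversed as an individual sub tree.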
The Traverse feature comes with many options to customise the way a file system is traversed, as illustrated in the following table:
Name | Default | Reference |
---|---|---|
Store | | REF |
Subscription | SubscribeAny | REF |
DoExtend | false | |
Behaviours | | |
SubPath | | |
KeepTrailingSep | true | |
Sort | | |
IsCaseSensitive | false | |
DirectoryEntryOrder | DirectoryEntryOrderFoldersFirstEn | |
Listen | | |
InclusiveStart | true | |
InclusiveStop | false | |
Logging | | |
Path | ~/snivilised.extendio.nav.log | |
TimeStampFormat | 2006-01-02 15:04:05 | |
Level | InfoLevel | |
Rotation | | |
MaxSizeInMb | 50 | |
MaxNoOfBackups | 3 | |
MaxAgeInDays | 28 | |
Callback | (mandatory) | |
Notify | | |
OnBegin | no-op | |
OnEnd | no-op | |
OnDescend | no-op | |
OnAscend | no-op | |
OnStart | no-op | |
OnStop | no-op | |
Hooks | | |
QueryStatus | LstatHookFn | |
ReadDirectory | ReadEntries | |
FolderSubPath | RootParentSubPath | |
FileSubPath | RootParentSubPath | |
InitFilters | InitFiltersHookFn | |
Sort | CaseSensitiveSortHookFn / CaseInSensitiveSortHookFn | |
Extend | DefaultExtendHookFn / no-op | |
Listen | | |
Start | no-op | |
Stop | no-op | |
Persist | | |
Format | PersistInJSONEn | |
`Sort.IsCaseSensitive`: blah
`Sort.DirectoryEntryOrder`: blah
`Logging`: blah
A subscription defines which file system item type the callback gets invoked for. The client can make a subscription of one of the following types:
- files (`SubscribeFiles`): callback invoked for file items only
- folders (`SubscribeFolders`): callback invoked for folder items only
- folders with files (`SubscribeFoldersWithFiles`): callback invoked for folder items only, but includes all files that are contained inside the folder, as the `Children` property of the `TraverseItem` that the callback is invoked with
- all (`SubscribeAny`): callback invoked for files and folders
Extra semantics have been assigned to folders which allows for enhanced filtering. Each folder is allocated a scope depending on a combination of factors including the depth of that folder relative to the root and whether the folder contains any child folders. Available scopes are:
- root: the root node, ie the path specified by the user to start traversing from
- top: any node that is a direct descendant of the root node
- leaf: any node that has no sub folders
- intermediate: nodes which are neither root, top nor leaf nodes
A node may contain multiple scope designations. The following are valid combinations:
- root and leaf
- top and leaf
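One way to model scopes that can be combined is a bit set. The sketch below is an assumption about how such a designation could be derived, not the library's code: the `Scope` constants and `scopeOf` are hypothetical names, and the root is taken to be at depth 0.

```go
package main

import "fmt"

// Scope is a hypothetical bit set, so a node can carry more than one
// designation (eg root and leaf).
type Scope uint

const (
	ScopeRoot Scope = 1 << iota
	ScopeTop
	ScopeLeaf
	ScopeIntermediate
)

// scopeOf derives the scope of a folder from its depth relative to the root
// (root is assumed to be depth 0) and whether it contains any sub folders.
func scopeOf(depth int, hasSubFolders bool) Scope {
	var scope Scope
	switch depth {
	case 0:
		scope |= ScopeRoot
	case 1:
		scope |= ScopeTop
	}
	if !hasSubFolders {
		scope |= ScopeLeaf
	}
	if scope == 0 {
		scope = ScopeIntermediate // neither root, top nor leaf
	}
	return scope
}

func main() {
	topLeaf := scopeOf(1, false)
	fmt.Println(topLeaf&ScopeTop != 0, topLeaf&ScopeLeaf != 0)
	fmt.Println(scopeOf(3, true) == ScopeIntermediate)
}
```

Because intermediate is only assigned when no other bit is set, the only combinations this sketch can produce are root and leaf, and top and leaf, matching the list above.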
There are 2 categories of filters: a node filter (defined in options at `Options.Store.FilterDefs.Node`) and a child filter (defined at `Options.Store.FilterDefs.Children`). The node filter is applied to a single entity (the file system item for which the callback is being invoked), whereas the child filter is a compound filter which is applied to a collection, ie the list of the current folder's file items (for subscription type folders with files).
The following filter types are available:
- regex: built in filter by a Go regular expression
- glob: built in filter by a glob pattern, characterised by use of *
- custom: allows the client to perform custom filtering
Built-in filters also benefit from the following features:
- negation: a filter's logic can be reversed by setting the `Negate` property of the `FilterDef` to `true`. The callback will then only be invoked for a node if it does not match the defined pattern.
- scope: a filter can be restricted to only be applied to those nodes matching the defined scope. Eg a filter may specify a scope of intermediate, which means that it is only applicable to intermediate nodes. To turn off scope based filtering, use the all scope (`ScopeAllEn`) in the filter definition (`FilterDef.Scope`)
- ifNotApplicable: when scope filtering is in use, we can also change the behaviour of the filter if it is not applicable to the node. By default, if the filter is not applicable, the callback will not be invoked for that node. The client can invert this behaviour so that if the filter is not applicable, the filter does not activate and the callback is allowed to be invoked. To use this, the `FilterDef`'s `IfNotApplicable` property should be set to `true`.
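How negation, scope and ifNotApplicable interact can be shown with a minimal glob filter built on `filepath.Match`. This is a sketch only: `globFilter`, its `IsMatch` method and the local `Scope` constants are hypothetical and simply borrow the property names used above.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Scope is a hypothetical bit set of node scopes.
type Scope uint

const (
	ScopeRoot Scope = 1 << iota
	ScopeTop
	ScopeLeaf
	ScopeIntermediate
	ScopeAll = ScopeRoot | ScopeTop | ScopeLeaf | ScopeIntermediate
)

// globFilter is a hypothetical filter whose fields borrow the property
// names used in the text above.
type globFilter struct {
	Pattern         string
	Negate          bool
	Scope           Scope
	IfNotApplicable bool
}

// IsMatch reports whether the callback should be invoked for a node with the
// given name and scope.
func (f globFilter) IsMatch(name string, nodeScope Scope) (bool, error) {
	if f.Scope&nodeScope == 0 {
		// The filter does not apply to this node's scope.
		return f.IfNotApplicable, nil
	}
	matched, err := filepath.Match(f.Pattern, name)
	if err != nil {
		return false, err
	}
	return matched != f.Negate, nil // Negate reverses the outcome
}

func main() {
	notFlac := globFilter{Pattern: "*.flac", Negate: true, Scope: ScopeAll}
	for _, name := range []string{"track.flac", "cover.jpg"} {
		ok, _ := notFlac.IsMatch(name, ScopeLeaf)
		fmt.Println(name, ok)
	}
}
```

The scope check runs before the pattern check, so `Negate` never affects nodes the filter does not apply to; only `IfNotApplicable` decides their outcome.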
The Extension provides extra information contained in the `TraverseItem` that is passed to the client callback. To request the Extension, the client should set the `DoExtend` property in the traverse options at `Options.Store.DoExtend` to `true`.
Warning: Only use the properties on the Extension (`TraverseItem.Extension`) if the `DoExtend` option described above has been set to `true`. If the Extension is not active, attempting to reference a field on it will result in a panic.
Extension properties include the following:
- Depth: traversal depth relative to the root
- IsLeaf: is the item a leaf node (file items are always leaf nodes)
- Name: is just the name portion of the item's path (`TraverseItem.Path`)
- Parent: is the parent path of the current node
- SubPath: represents the relative path between the root and the current node
- NodeScope: scope designation applied to the current node
- Custom: a client defined property that can be set by overriding the Extension (see next)
The Extension can be overridden using the hook function. The default Extension hook is implemented by the exported function `DefaultExtendHookFn`. The client needs to set a custom extend function on the options at `Options.Hooks.Extend`. See hooks for the function signature. If the client just needs to augment the default functionality rather than replace it, the custom function implemented by the client just needs to invoke the default function `DefaultExtendHookFn`.
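The pattern of augmenting the default hook rather than replacing it can be sketched as follows. The types and signatures here are hypothetical (the real hook signature is documented with the hooks): `Extension`, `Item`, `ExtendHook`, `defaultExtend` and `customExtend` exist only for this illustration, and paths are assumed to be slash separated.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Extension is a hypothetical subset of the properties listed above.
type Extension struct {
	Depth  int
	Name   string
	Parent string
	Custom any
}

// Item is a hypothetical traverse item; Extension stays nil unless an
// extend hook has run, which is why unguarded access would panic.
type Item struct {
	Path      string
	Extension *Extension
}

// ExtendHook is an assumed signature for an extend hook.
type ExtendHook func(root string, item *Item)

// defaultExtend stands in for the default hook: it fills in the
// standard properties derived from the item's path.
func defaultExtend(root string, item *Item) {
	depth := 0
	if item.Path != root {
		rel := strings.TrimPrefix(item.Path, strings.TrimSuffix(root, "/")+"/")
		depth = strings.Count(rel, "/") + 1
	}
	item.Extension = &Extension{
		Depth:  depth,
		Name:   path.Base(item.Path),
		Parent: path.Dir(item.Path),
	}
}

// customExtend augments the default behaviour: it calls the default hook
// first and then sets the client defined Custom property.
func customExtend(root string, item *Item) {
	defaultExtend(root, item)
	item.Extension.Custom = strings.ToUpper(item.Extension.Name)
}

func main() {
	var hook ExtendHook = customExtend
	item := &Item{Path: "/music/jazz/fusion"}
	hook("/music", item)
	fmt.Printf("%+v\n", *item.Extension)
}
```

Calling the default first means the custom hook can rely on the standard properties already being populated when it computes `Custom`.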
When composing the `SubPath` on the Extension, 2 hooks are employed: `FileSubPath` for files and `FolderSubPath` for folders. The SubPath created by both of these can be configured to retain a trailing path separator using the option setting `Options.Store.Behaviours.SubPath.KeepTrailingSep`, which defaults to `true`.
The behaviour of the traversal process can be modified by use of the declared hooks. The following shows the hooks with the function type and default hook indicated inside brackets:
- `QueryStatus` (`QueryStatusHookFn`, `LstatHookFn`): acquires the `fs.FileInfo` entry of the root node
- `ReadDirectory` (`ReadDirectoryHookFn`, `ReadEntries`): reads the contents of a directory
- `FolderSubPath` (`SubPathHookFn`, `RootParentSubPath`): used to populate the `SubPath` property of `TraverseItem.Extension` for folder nodes
- `FileSubPath` (`SubPathHookFn`, `RootParentSubPath`): used to populate the `SubPath` property of `TraverseItem.Extension` for file nodes
- `InitFilters` (`FilterInitHookFn`, `InitFiltersHookFn`): filter initialisation function
- `Sort` (`SortEntriesHookFn`, set depending on value of `Options.Store.Behaviours.Sort.IsCaseSensitive`): sorting function. When `Options.Store.Behaviours.Sort.IsCaseSensitive` is set to `true`, the default function is `CaseSensitiveSortHookFn`, otherwise `CaseInSensitiveSortHookFn`
- `Extend` (`ExtendHookFn`, set depending on value of `Options.Store.DoExtend`): when `Options.Store.DoExtend` is set to `true`, the default function is `DefaultExtendHookFn`, otherwise it is set to an internally defined no-op function.
Enables the client to be called back during specific moments of the traversal. The following notifications are available (with the function type indicated inside brackets):
- `OnBegin` (`BeginHandler`): beginning of traversal
- `OnEnd` (`EndHandler`): end of traversal
- `OnDescend` (`AscendancyHandler`): invoked as a folder is descended
- `OnAscend` (`AscendancyHandler`): invoked as a folder is ascended
- `OnStart` (`ListenHandler`): start listening condition met (if listening enabled)
- `OnStop` (`ListenHandler`): finish listening condition met (if listening enabled)
The Listen feature allows the client to define a particular condition when callback invocation is to start and when to stop. The client does this by defining predicate functions in the options at `Options.Listen.Start` and `Options.Listen.Stop`.
The client can choose to define either or both of the Listen events. If Start is defined, then once traversal begins, the callback will not be invoked until the first node is encountered that satisfies the condition. If Stop is defined, then the callback will cease to be called at the point when the Stop predicate fires and the traversal is ended early.
The Start and Stop conditions are defined using `ListenBy`, eg:
session.Configure(func(o *nav.TraverseOptions) {
o.Store.Behaviours.Listen.InclusiveStart = true
o.Store.Behaviours.Listen.InclusiveStop = false
o.Listen.Start = nav.ListenBy{
Name: "Start listening at Night Drive",
Fn: func(item *nav.TraverseItem) bool {
return item.Extension.Name == "Night Drive"
},
}
o.Listen.Stop = nav.ListenBy{
Name: "Stop listening at Electric Youth",
Fn: func(item *nav.TraverseItem) bool {
return item.Extension.Name == "Electric Youth"
},
}
})
Points of Note:
- start listening when node found whose name is "Night Drive"
- stop listening when node found whose name is "Electric Youth"
- `InclusiveStart` and `InclusiveStop` shown in this example are the default values so do not need to be specified (they are shown here for illustration only). The Inclusive settings allow the client to adjust whether the callback is invoked at the time the predicate fires. When Inclusive is true, the callback is invoked for the current item. When false, the callback is not invoked for the current item. So for the default settings, the callback is invoked when the Start predicate fires, but not when the Stop predicate fires (inclusive for Start and exclusive for Stop)
- the predicates for Start and for Stop are defined by the `Listener` interface. This means that the client can use a filter to define these predicates; the previous example defined with filters is shown as follows:
NOT IMPLEMENTED YET: see issue #125
session.Configure(func(o *nav.TraverseOptions) {
o.Listen.Start = nav.RegexFilter{
Filter: nav.Filter{
Name: "Start listening at Night Drive",
RequiredScope: nav.ScopeAllEn,
Pattern: "^Night Drive$",
},
}
o.Listen.Stop = nav.GlobFilter{
Filter: nav.Filter{
Name: "Stop listening at Electric Youth",
RequiredScope: nav.ScopeAllEn,
Pattern: "Electric Youth",
},
}
})
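The inclusive and exclusive semantics of the Start and Stop predicates can be checked with a small model that works on a flat list of node names instead of a real traversal. Everything in it is hypothetical (`Listener` here is a plain struct, not the library's interface, and `listen` is an illustrative helper); it only mirrors the behaviour described in the points of note.

```go
package main

import "fmt"

// Listener is a hypothetical model of the listen options: two optional
// predicates plus the two Inclusive flags.
type Listener struct {
	Start, Stop    func(name string) bool
	InclusiveStart bool
	InclusiveStop  bool
}

// listen returns the names for which the callback would be invoked when the
// nodes are visited in the given order.
func listen(names []string, l Listener) []string {
	var invoked []string
	listening := l.Start == nil // with no Start predicate, listen from the outset
	for _, name := range names {
		if !listening {
			if !l.Start(name) {
				continue
			}
			listening = true
			if !l.InclusiveStart {
				continue // exclusive start: skip the node that fired Start
			}
		}
		if l.Stop != nil && l.Stop(name) {
			if l.InclusiveStop {
				invoked = append(invoked, name)
			}
			break // traversal ends early once Stop fires
		}
		invoked = append(invoked, name)
	}
	return invoked
}

func main() {
	names := []string{"Intro", "Night Drive", "Interlude", "Electric Youth", "Outro"}
	defaults := Listener{
		Start:          func(n string) bool { return n == "Night Drive" },
		Stop:           func(n string) bool { return n == "Electric Youth" },
		InclusiveStart: true,
		InclusiveStop:  false,
	}
	fmt.Println(listen(names, defaults)) // prints [Night Drive Interlude]
}
```

With the default flags the node that fires Start is reported and the node that fires Stop is not; swapping both flags would instead report "Interlude" and "Electric Youth".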
RxGo
To support concurrency features, extendio uses the reactive model provided by RxGo. However, since RxGo seems to be a dead project, with its last release in April 2021 and its unit tests not currently running successfully, the decision has been made to re-implement this locally. One of the main reasons for the project no longer being actively maintained is the release of the generics feature in Go version 1.18; supporting generics in RxGo would require significant effort to re-write the entire library. While work on this has begun, it's unclear when it will be delivered. Despite this, the reactive model's support for concurrency is highly valued, and extendio aims to make use of a minimal set of functionality for parallel processing during directory traversal. The goal is to replace the local implementation with the next version of RxGo when it becomes available.
See: