-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Transformer.compile improvements #451
Comments
Also, the current rulesets are a bit limited. For instance, |
Compilation/rewriting rules was a key contribution of the ICTIR paper, but I'm not sure if we should continue to maintain it.
|
I think that in a few use cases, it's valuable. E.g., rank cutoffs, including text metadata during retrieval, etc. Though I'm also not opposed to dropping the feature completely. There's currently a rule for merging pyterrier/pyterrier/batchretrieve.py Lines 851 to 854 in 776fe2e
|
Current Approach
Currently, we use
matchpy
to support pipeline optimization. A set of replacement rules are defined inpyterrier.transformer.rewrite_rules
, which all get executed whenTransformer.compile()
is called.Transformer.compile
is not used heavily, though it can be important for getting the most out of pipelines.Problems with the current approach
The current approach has several downsides:
ComposedPipeline
to define how it can be combined/optimized, not some other place in the code.rewrite_rules
variable.matchpy.Operations
with anarity
attribute.matchpy
feels like overkill for our needs, which are relatively simple combination rules. Using it introduces a dependency that we probably don't need.Proposed Solution
I propose dropping
matchpy
in favor of defining optimization logic in individual transformers. This will be faciliated with three methods:Transformer.compile(self) -> Transformer
-- existing method, but now individual transformers are responsible for implementing it. By default, it returns the transformer itself, but can be overridden with other functionality, such as mergingComposedPipeline
together.Transformer.fuse_left(self, left: Transformer) -> Optinal[Transformer]
-- Fuses this transformer with a transformer that immediately precedes this one in a pipeline. The new transformer should have the same effect as performing the two transformers in sequence, i.e.,pipeline_unfused
andpipeline_fused
in the following example should provide the same results:Transformer.fuse_right(self, right: Transformer) -> Optinal[Transformer]
-- Similar tofuse_left
but for a transformer that immediately follows this one.I'd expect that most pipelines wouldn't need to implement any of these methods. Perhaps only
ComposedPipeline
would definecompile
, which would be responsible for performing the merging logic, something like:An example implemetnation of
fuse_left
that merges aBatchRetrieve
into aFeaturesBatchRetrieve
, using the current logic:The text was updated successfully, but these errors were encountered: