Allow recursive keep_keys #8571

MPStudyly · 2024-07-01T17:32:43Z

SUMMARY

This PR adds a parameter and functionality to the keep_keys filter introduced by #8438 to allow recursive searches over nested lists of dicts. This is a draft for now, as it for sure needs some polishing. Also I was unsure if the checks performed should be moved to the according plugins_utils file. remove_keys and replace_keys are also not touched by this (yet), as I'd like to have some input on this first.

Due to no issue existing yet, I was unaware on how to name/number a changelog fragment appropriately. If I can simply use the ref number of the PR, I'll create one with that. If I have to open a feature request ticket, I'll do that.

ISSUE TYPE

Feature Pull Request

COMPONENT NAME

keep_keys

ADDITIONAL INFORMATION

- name: Debug before change
  vars:
    router_static_routes:
    - address_families:
        - afi: "ipv4"
          routes:
            - description: "WANv4"
              dest: "0.0.0.0/0" # default route
              next_hops:
                - forward_router_address: "192.168.0.1"
        - afi: "ipv6"
          routes:
            - description: "WANv6"
              dest: "::/0" # default route
              next_hops:
                - forward_router_address: "2001:db8::1"
  run_once: true
  ansible.builtin.debug:
    msg: {{ groups.all | map('extract', hostvars) | community.general.keep_keys(target=['dest', 'forward_router_address']) }}

gives

TASK [Debug static routes non recursive] ****************************************
ok: [localhost] => 
  msg: []

while

- name: Debug adter change
  vars:
    router_static_routes:
    - address_families:
        - afi: "ipv4"
          routes:
            - description: "WANv4"
              dest: "0.0.0.0/0" # default route
              next_hops:
                - forward_router_address: "192.168.0.1"
        - afi: "ipv6"
          routes:
            - description: "WANv6"
              dest: "::/0" # default route
              next_hops:
                - forward_router_address: "2001:db8::1"
  run_once: true
  ansible.builtin.debug:
    msg: {{ groups.all | map('extract', hostvars) | community.general.keep_keys(target=['dest', 'forward_router_address'], recurse_lists=true)) }}

gives

TASK [Debug static routes recursive] ************************************
ok: [localhost] => 
  msg:
  - router_static_routes:
    - address_families:
      - routes:
        - dest: 0.0.0.0/0
          next_hops:
          - forward_router_address: 192.168.0.1
      - routes:
        - dest: ::/0
          next_hops:
          - forward_router_address: 2001:db8::1

Add a parameter to allow recursive search of keys to be kept Adjust result yielding to loop over subsequent lists of dicts if requested

ansibullbot · 2024-07-01T17:47:45Z

cc @vbotka
click here for bot help

vbotka · 2024-07-02T02:20:13Z

Get the lists you want to transform and map the filter

    t: [key_in_list, other]
    rl: "{{ testvar |
            map(attribute='list_of_dicts') |
            map('community.general.keep_keys', target=t) }}"

gives

  rl:
    - - {key_in_list: one, other: one other}
      - {key_in_list: two, other: two other}
    - - {key_in_list: three, other: three other}
      - {key_in_list: four, other: four other}

Use the filter dict_kv if you want a list of dictionaries

  rd: "{{ rl | map('community.general.dict_kv', 'list_of_dicts' ) }}"

gives what you want

  rd:
    - list_of_dicts:
      - {key_in_list: one, other: one other}
      - {key_in_list: two, other: two other}
    - list_of_dicts:
      - {key_in_list: three, other: three other}
      - {key_in_list: four, other: four other}

This use case is not consistent and can be solved with current tools. I'd propose to reject this PR.

MPStudyly · 2024-07-02T06:13:21Z

Get the lists you want to transform and map the filter

    t: [key_in_list, other]
    rl: "{{ testvar |
            map(attribute='list_of_dicts') |
            map('community.general.keep_keys', target=t) }}"

gives

  rl:
    - - {key_in_list: one, other: one other}
      - {key_in_list: two, other: two other}
    - - {key_in_list: three, other: three other}
      - {key_in_list: four, other: four other}

You are right, it does. But as another lever gets added, this won't work anymore. I should have made a more accurate example of the situation I'm facing to show how this PR makes sense.

Use the filter dict_kv if you want a list of dictionaries

  rd: "{{ rl | map('community.general.dict_kv', 'list_of_dicts' ) }}"

gives what you want

  rd:
    - list_of_dicts:
      - {key_in_list: one, other: one other}
      - {key_in_list: two, other: two other}
    - list_of_dicts:
      - {key_in_list: three, other: three other}
      - {key_in_list: four, other: four other}

Again, in this case, it does. It still requires me to "hardcode" a new dict key name to restore the original layout. Also this breaks, when at least one other level is added.

This use case is not consistent and can be solved with current tools. I'd propose to reject this PR.

As explained, there is no meaningful way yet to apply this filter to deeply nested dicts with lists of dicts. Mapping the way through a dict is a possible way around (map(attribute=...)), but heavily limits flexibility, as the whole dict structure has to be known beforehand. I will update my example code to better reflect the actual problem I try to solve.

EDIT: I've updated the example @vbotka, please have another look.

vbotka · 2024-07-02T09:04:52Z

There are recursive filters ansible.utils (keep_keys, remove_keys, and replace_keys) . The design of the community.general filters is simple, compact, and clear. To keep it that way, the filters are intentionally not recursive. The complexity of your PR would increase the maintenance effort not worth the use case benefit.

Instead, the filter ansible.utils.keep_keys is recursive. Quoting:

This plugin keeps only specified keys from provided data recursively

There is a bug:

keep_keys doesn't exclude a key if the value is list.
keep_keys doesn't exclude a key if the value is list. ansible.utils#353

Instead of fixing the filter ansible.utils.keep_keys complexity I decided to write the simple non-recursive filter community.general.keep_keys. I'm sure you understand that merging your PR would be counterproductive. If you need the recursion I suggest you fix the ansible.utils filters or write your recursive filters, e.g. keep_keys_recursive. Please feel free to use the code.

To avoid a monolithic design the code must be properly structured. If you decide to write your filter reuse the code from plugins/filter/keep_keys.py

    if matching_parameter == 'equal':
        def keep_key(key):
            return key in tt
    elif matching_parameter == 'starts_with':
        def keep_key(key):
            return key.startswith(tt)
    elif matching_parameter == 'ends_with':
        def keep_key(key):
            return key.endswith(tt)
    elif matching_parameter == 'regex':
        def keep_key(key):
            return tt.match(key) is not None

    return [dict((k, v) for k, v in d.items() if keep_key(k)) for d in data]

and create function _keep_keys in plugins/plugin_utils/keys_filter.py

def _keep_keys(data,tt, matching_parameter):
    ...

Then, import the functions in your code

from ansible_collections.community.general.plugins.plugin_utils.keys_filter import (
    _keys_filter_params,
    _keys_filter_target_str,
    _keep_keys)

MPStudyly · 2024-07-11T09:17:50Z

Sorry for my late response.

There are recursive filters ansible.utils (keep_keys, remove_keys, and replace_keys) . The design of the community.general filters is simple, compact, and clear. To keep it that way, the filters are intentionally not recursive. The complexity of your PR would increase the maintenance effort not worth the use case benefit.

Okay, while I don't like the naming collision of these two filters that basically do different stuff (I know, the FQDN is different, but still), I accept that keeping complexity low is a valid reason to not merge this.

Instead, the filter ansible.utils.keep_keys is recursive. Quoting:

This plugin keeps only specified keys from provided data recursively

There is a bug:
* keep_keys doesn't exclude a key if the value is list.
  [keep_keys doesn't exclude a key if the value is list. ansible.utils#353](https://github.com/ansible-collections/ansible.utils/issues/353)

That's the exact reason we resorted to the community.general variant. We first misunderstood this to be recursive as well, which it wasn't. As we had a look at both filters, we found this one way easier to adapt than fixing the other one.

Instead of fixing the filter ansible.utils.keep_keys complexity I decided to write the simple non-recursive filter community.general.keep_keys. I'm sure you understand that merging your PR would be counterproductive. If you need the recursion I suggest you fix the ansible.utils filters or write your recursive filters, e.g. keep_keys_recursive. Please feel free to use the code.

I'll see if I can have another look at the ansible.utils filter. We found that code unnecessarily hard to grasp. For now we're working with a local copy of this filter only, as that suffices our needs.

To avoid a monolithic design the code must be properly structured. If you decide to write your filter reuse the code from plugins/filter/keep_keys.py

    if matching_parameter == 'equal':
        def keep_key(key):
            return key in tt
    elif matching_parameter == 'starts_with':
        def keep_key(key):
            return key.startswith(tt)
    elif matching_parameter == 'ends_with':
        def keep_key(key):
            return key.endswith(tt)
    elif matching_parameter == 'regex':
        def keep_key(key):
            return tt.match(key) is not None

    return [dict((k, v) for k, v in d.items() if keep_key(k)) for d in data]

and create function _keep_keys in plugins/plugin_utils/keys_filter.py

def _keep_keys(data,tt, matching_parameter):
    ...

Then, import the functions in your code

from ansible_collections.community.general.plugins.plugin_utils.keys_filter import (
    _keys_filter_params,
    _keys_filter_target_str,
    _keep_keys)

Thx for the input! This is why the whole PR is tagged as a draft, this code wasn't meant to be merged as is anyway, as there is quite some stuff missing (e.g. remove_keys, replace_keys adaptions, tests, etc.) :)

How should we proceed with this? Should we close the PR and move on at ansible.utils or is this still of relevance for the community.general collection?

EDIT: I just noticed your issue/PR which referenced this. As far as the examples go, it looks like it is doing the exact same thing, but I don't really get the naming there.

vbotka · 2024-07-11T11:44:41Z

How should we proceed with this? Should we close the PR and move on at ansible.utils or is this still of relevance for the community.general collection?

Yes. I think this PR should be closed. I think the justification to solve this use case by this filter is weak. IMO, if the data are reported by a device you might want to parse it first. There is a new universe in Parsing semi-structured text with Ansible. If this is configuration data you might find a better structure that fits your use case. A custom filter might be a solution too. Generally, the below structure is wrong

    - ld:
      - {key: 1, other: other1}
      - {key: 2, other: other2}
    - ld:
      - {key: 3, other: other3}
      - {key: 4, other: other4}

This can be reduced to what you ultimately need (optionally, including other dictionaries)

  ld:
    1: other1
    2: other2
    3: other3
    4: other4

, or if you need the batches

  ld:
    - 1: other1
      2: other2
    - 3: other3
      4: other4

EDIT: I just noticed your issue/PR which referenced this. As far as the examples go, it looks like it is doing the exact same thing, but I don't really get the naming there.

Yes. The filter community.general.reveal_ansible_type and the test community.general.ansible_type should help to solve use cases like yours. For more details see the plugin examples or the integration test. There are also my public notes.

MPStudyly · 2024-07-15T08:51:59Z

How should we proceed with this? Should we close the PR and move on at ansible.utils or is this still of relevance for the community.general collection?

Yes. I think this PR should be closed. I think the justification to solve this use case by this filter is weak. IMO, if the data are reported by a device you might want to parse it first. There is a new universe in Parsing semi-structured text with Ansible. If this is configuration data you might find a better structure that fits your use case. A custom filter might be a solution too. Generally, the below structure is wrong
    - ld:
      - {key: 1, other: other1}
      - {key: 2, other: other2}
    - ld:
      - {key: 3, other: other3}
      - {key: 4, other: other4}

The structure is actually required by the collection we are using to configure routers. Though, we came to a similar conclusion that treating this structure as source of truth is not optimal. Some restructuring already cleared most of the issues we had in this regard.

Principally I don't see why such structures should be "wrong" per se, as nested lists of dicts are way easier to traverse in a generic way than nested dicts. I know there is dict2items, but that often unnecessarily bloats templates, at least in our use cases.

EDIT: I just noticed your issue/PR which referenced this. As far as the examples go, it looks like it is doing the exact same thing, but I don't really get the naming there.

Yes. The filter community.general.reveal_ansible_type and the test community.general.ansible_type should help to solve use cases like yours. For more details see the plugin examples or the integration test. There are also my public notes.

I'm not 100% convinced yet that it would really help us/our use case (if we didn't work around it yet), as it is still required to generate a list of dicts from a list of lists using a statically defined key. I appreciate your work though!

vbotka · 2024-07-15T09:49:48Z

    - ld:
      - {key: 1, other: other1}
      - {key: 2, other: other2}
    - ld:
      - {key: 3, other: other3}
      - {key: 4, other: other4}

... I don't see why such structures should be "wrong" per se, as nested lists of dicts are way easier to traverse
in a generic way than nested dicts. I know there is dict2items, but that often unnecessarily bloats templates

The dictionary's lower complexity O(1) should be chosen over list O(n) when possible. See TimeComplexity. The benefit of lower complexity exceeds the cost of Ansible filter dict2items, or Python method .items(). The tools (filters, tests, methods, ...) are designed to cover rational choices.

MPStudyly · 2024-07-15T11:00:07Z

    - ld:
      - {key: 1, other: other1}
      - {key: 2, other: other2}
    - ld:
      - {key: 3, other: other3}
      - {key: 4, other: other4}
... I don't see why such structures should be "wrong" per se, as nested lists of dicts are way easier to traverse
in a generic way than nested dicts. I know there is dict2items, but that often unnecessarily bloats templates

The dictionary's lower complexity O(1) should be chosen over list O(n) when possible. See TimeComplexity. The benefit of lower complexity exceeds the cost of Ansible filter dict2items, or Python method .items(). The tools (filters, tests, methods, ...) are designed to cover rational choices.

Thank you for explaining this, I was completely ignorant about these types of complexity in this context 😅

Allow recursive keep_keys

54e3cd1

Add a parameter to allow recursive search of keys to be kept Adjust result yielding to loop over subsequent lists of dicts if requested

MPStudyly marked this pull request as draft July 1, 2024 17:36

MPStudyly marked this pull request as ready for review July 1, 2024 17:38

MPStudyly marked this pull request as draft July 1, 2024 17:38

This comment was marked as outdated.

Sign in to view

ansibullbot added WIP Work in progress ci_verified Push fixes to PR branch to re-run CI feature This issue/PR relates to a feature request filter filter plugin new_contributor Help guide this first time contributor plugins plugin (any type) labels Jul 1, 2024

vbotka mentioned this pull request Jul 7, 2024

Create a filter and a test to reveal/validate Ansible data types. #8594

Closed

1 task

ansibullbot added has_issue stale_ci CI is older than 7 days, rerun before merging labels Jul 11, 2024

MPStudyly closed this Jul 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow recursive keep_keys #8571

Allow recursive keep_keys #8571

MPStudyly commented Jul 1, 2024 •

edited

Loading

This comment was marked as outdated.

ansibullbot commented Jul 1, 2024

vbotka commented Jul 2, 2024

MPStudyly commented Jul 2, 2024 •

edited

Loading

vbotka commented Jul 2, 2024 •

edited

Loading

MPStudyly commented Jul 11, 2024 •

edited

Loading

vbotka commented Jul 11, 2024

MPStudyly commented Jul 15, 2024 •

edited

Loading

vbotka commented Jul 15, 2024

MPStudyly commented Jul 15, 2024

Allow recursive keep_keys #8571

Allow recursive keep_keys #8571

Conversation

MPStudyly commented Jul 1, 2024 • edited Loading

SUMMARY

ISSUE TYPE

COMPONENT NAME

ADDITIONAL INFORMATION

This comment was marked as outdated.

ansibullbot commented Jul 1, 2024

vbotka commented Jul 2, 2024

MPStudyly commented Jul 2, 2024 • edited Loading

vbotka commented Jul 2, 2024 • edited Loading

MPStudyly commented Jul 11, 2024 • edited Loading

vbotka commented Jul 11, 2024

MPStudyly commented Jul 15, 2024 • edited Loading

vbotka commented Jul 15, 2024

MPStudyly commented Jul 15, 2024

MPStudyly commented Jul 1, 2024 •

edited

Loading

MPStudyly commented Jul 2, 2024 •

edited

Loading

vbotka commented Jul 2, 2024 •

edited

Loading

MPStudyly commented Jul 11, 2024 •

edited

Loading

MPStudyly commented Jul 15, 2024 •

edited

Loading