feat: Mapper stream_name glob syntax #1605

Closed
visch opened this issue Apr 12, 2023 · 5 comments
Labels
kind/Feature New feature or request valuestream/SDK

Comments

@visch
Contributor

visch commented Apr 12, 2023

Feature scope

Other

Description

Came across this use case today for stream_maps.

Example stream map today

    config:
      start_date: '2023-02-20'
      stream_maps:
        "employer_stats_report":
          "client_id": "config['client_id_inject']"
        "campaign_performance_stats":
          "client_id": "config['client_id_inject']"
      stream_map_config:
        client_id_inject: $TAP_INDEED_CLIENT_ID

In production we have to do this for more than 7 streams. Ideally this would look something like:

    config:
      start_date: '2023-02-20'
      stream_maps:
        "*":
          "client_id": "config['client_id_inject']"
      stream_map_config:
        client_id_inject: $TAP_INDEED_CLIENT_ID

There are other fixes for this problem that don't involve stream_maps, but I thought this was a good idea!
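
To illustrate the requested behaviour, here is a minimal sketch of how glob-style `stream_maps` keys could be expanded into one mapping per stream. This is not the SDK's implementation: `expand_stream_maps` is a hypothetical helper, and the choice of `fnmatch` and the "later patterns override earlier ones" merge rule are assumptions made only for this example.

```python
from fnmatch import fnmatchcase


def expand_stream_maps(stream_maps, stream_names):
    """Hypothetical helper (not part of the SDK): expand glob keys per stream."""
    expanded = {}
    for name in stream_names:
        merged = {}
        # Apply every pattern that matches this stream name, in config order.
        # Assumption: later patterns override earlier ones on key conflicts.
        for pattern, mapping in stream_maps.items():
            if fnmatchcase(name, pattern):
                merged.update(mapping)
        if merged:
            expanded[name] = merged
    return expanded


stream_maps = {"*": {"client_id": "config['client_id_inject']"}}
streams = ["employer_stats_report", "campaign_performance_stats"]
expanded = expand_stream_maps(stream_maps, streams)
```

With the two streams from the first example, `expanded` contains the same `client_id` mapping under each stream name, which is what the hand-written config spells out today.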

@visch visch added kind/Feature New feature or request valuestream/SDK labels Apr 12, 2023
@stale

stale bot commented Aug 10, 2023

This has been marked as stale because it is unassigned, and has not had recent activity. It will be closed after 21 days if no further activity occurs. If this should never go stale, please add the evergreen label, or request that it be added.

@stale stale bot added the stale label Aug 10, 2023
@stale stale bot closed this as completed Aug 31, 2023
@tayloramurphy tayloramurphy reopened this Aug 31, 2023
@stale stale bot removed the stale label Aug 31, 2023
@gruckion

Also want this. I feel like the docs say it is allowed.

https://sdk.meltano.com/en/latest/stream_maps.html#applying-a-mapping-across-two-or-more-streams

I tried to apply a change based on the docs, but I do not see acc_num in the output model 🤷

        stream_maps:
          "*":
            acc_num: account_number
            account_number: __NULL__

What I actually want is:

        stream_maps:
          "*":
            created_at:
              record['created_at'] if record.get('created_at') and '0000'
              not in record['created_at'] and '-00' not in record['created_at'] else
              '1970-01-01 00:00:00'
            updated_at:
              record['updated_at'] if record.get('updated_at') and '0000'
              not in record['updated_at'] and '-00' not in record['updated_at'] else
              '1970-01-01 00:00:00'

But this doesn't work. I'm running a MySQL tap -> Postgres sync. The source MySQL DB has a load of date/timestamp values that are invalid in Postgres, which I need to accommodate.
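
For reference, the expression above can be read as the following plain Python function. This is only a sketch for checking the logic outside of a pipeline: `sanitize_timestamp` and `FALLBACK` are names made up for this example, and it assumes the values arrive as strings.

```python
FALLBACK = "1970-01-01 00:00:00"


def sanitize_timestamp(record, key):
    """Mirror of the stream map expression, written as a function."""
    value = record.get(key)
    # Keep the value only if it is present and shows no zero-date markers.
    if value and "0000" not in value and "-00" not in value:
        return value
    return FALLBACK


valid = sanitize_timestamp({"created_at": "2023-04-12 08:30:00"}, "created_at")
zero_date = sanitize_timestamp({"created_at": "0000-00-00 00:00:00"}, "created_at")
zero_month = sanitize_timestamp({"created_at": "2023-00-15 10:00:00"}, "created_at")
missing = sanitize_timestamp({}, "created_at")
```

One caveat with plain substring checks: a valid value with zero fractional seconds, such as `2023-04-12 08:30:00.000000`, also contains `0000` and would be replaced by the fallback.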

Disgustingly, I have to put

        stream_maps:
          mydatabase-my_table_name:
            created_at:
              record['created_at'] if record.get('created_at') and '0000'
              not in record['created_at'] and '-00' not in record['created_at'] else
              '1970-01-01 00:00:00'
            updated_at:
              record['updated_at'] if record.get('updated_at') and '0000'
              not in record['updated_at'] and '-00' not in record['updated_at'] else
              '1970-01-01 00:00:00'

manually for each and every table 😢.
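
Until a wildcard works for this case, one way to avoid the copy-paste is to generate the per-table block. The sketch below is an assumption-laden workaround, not an SDK or Meltano feature: `date_expr` and `build_stream_maps` are hypothetical helpers, and `mydatabase-another_table` is a made-up second table name. It prints JSON, which YAML 1.2 parsers accept as-is.

```python
import json


def date_expr(column):
    """Build the same fallback expression as above for a given column."""
    return (
        f"record['{column}'] if record.get('{column}') and '0000' "
        f"not in record['{column}'] and '-00' not in record['{column}'] else "
        "'1970-01-01 00:00:00'"
    )


def build_stream_maps(stream_names, columns=("created_at", "updated_at")):
    """Repeat the column expressions under every stream name."""
    return {
        name: {column: date_expr(column) for column in columns}
        for name in stream_names
    }


# Hypothetical table list; replace with the real stream names.
tables = ["mydatabase-my_table_name", "mydatabase-another_table"]
stream_maps = build_stream_maps(tables)
print(json.dumps({"stream_maps": stream_maps}, indent=2))
```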

@gruckion

@visch did you ever get a good solution for this?

The docs suggest it is possible but it doesn't appear to work.

@edgarrmondragon
Collaborator

@gruckion

> But this doesn't work.

Can you elaborate? I can confirm the following works:

    plugins:
      loaders:
      - name: target-postgres
        variant: meltanolabs
        pip_url: meltanolabs-target-postgres
        config:
          database: postgres
          port: 5433
          host: localhost
          user: postgres
          stream_maps:
            "*":
              acc_num: account_number
              account_number: __NULL__

when piping some sample data to the loader:

    cat tap.jsonl | meltano invoke target-postgres

and I do get the property mapped across all streams.

Input data

    {"type": "SCHEMA", "stream": "bbva", "schema": {"properties": {"id": {"type": "integer"}, "account_number": {"type": "string"}}}, "key_properties": ["id"]}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 1, "account_number": "123456789"}, "time_extracted": "2024-01-01T00:00:01Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 2, "account_number": "987654321"}, "time_extracted": "2024-01-01T00:00:02Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 3, "account_number": "456123789"}, "time_extracted": "2024-01-01T00:00:03Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 4, "account_number": "321654987"}, "time_extracted": "2024-01-01T00:00:04Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 5, "account_number": "789123456"}, "time_extracted": "2024-01-01T00:00:05Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 6, "account_number": "654987321"}, "time_extracted": "2024-01-01T00:00:06Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 7, "account_number": "147258369"}, "time_extracted": "2024-01-01T00:00:07Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 8, "account_number": "963852741"}, "time_extracted": "2024-01-01T00:00:08Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 9, "account_number": "258147963"}, "time_extracted": "2024-01-01T00:00:09Z"}
    {"type": "RECORD", "stream": "bbva", "record": {"id": 10, "account_number": "741369852"}, "time_extracted": "2024-01-01T00:00:10Z"}
    {"type": "SCHEMA", "stream": "banamex", "schema": {"properties": {"id": {"type": "integer"}, "account_number": {"type": "string"}}}, "key_properties": ["id"]}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 11, "account_number": "852963741"}, "time_extracted": "2024-01-01T00:00:01Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 12, "account_number": "369147258"}, "time_extracted": "2024-01-01T00:00:02Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 13, "account_number": "159753486"}, "time_extracted": "2024-01-01T00:00:03Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 14, "account_number": "486231579"}, "time_extracted": "2024-01-01T00:00:04Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 15, "account_number": "753159624"}, "time_extracted": "2024-01-01T00:00:05Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 16, "account_number": "624897531"}, "time_extracted": "2024-01-01T00:00:06Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 17, "account_number": "531246879"}, "time_extracted": "2024-01-01T00:00:07Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 18, "account_number": "798512346"}, "time_extracted": "2024-01-01T00:00:08Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 19, "account_number": "246835791"}, "time_extracted": "2024-01-01T00:00:09Z"}
    {"type": "RECORD", "stream": "banamex", "record": {"id": 20, "account_number": "135792468"}, "time_extracted": "2024-01-01T00:00:10Z"}
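
As a rough illustration of what the `"*"` mapping above is expected to do to each record, here is a simplified stand-in written in plain Python. It is not the SDK's mapper: `apply_simple_map` is a hypothetical function that only understands bare property names (aliases) and `__NULL__` (removal), whereas the real stream maps evaluate full expressions.

```python
def apply_simple_map(record, mapping):
    """Simplified stand-in for a stream map: aliases and __NULL__ only."""
    result = dict(record)
    # First copy aliased values, reading from the original record.
    for target, source in mapping.items():
        if source != "__NULL__":
            result[target] = record[source]
    # Then drop every property mapped to __NULL__.
    for target, source in mapping.items():
        if source == "__NULL__":
            result.pop(target, None)
    return result


mapping = {"acc_num": "account_number", "account_number": "__NULL__"}
bbva_row = apply_simple_map({"id": 1, "account_number": "123456789"}, mapping)
banamex_row = apply_simple_map({"id": 11, "account_number": "852963741"}, mapping)
```

Under these assumptions, both the `bbva` and `banamex` records end up with `acc_num` in place of `account_number`.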

@edgarrmondragon
Collaborator

Closing (via #1888). Refer to #2602 or create a new issue.

4 participants