GCS Additonal e2e tests #1306

Merged

Conversation

bharatgulati
Contributor

No description provided.

@bharatgulati bharatgulati added the build Trigger unit test build label Sep 22, 2023
@bharatgulati bharatgulati force-pushed the gcsAdditonalTests branch 2 times, most recently from 7c1cd96 to 7a10c09 Compare September 22, 2023 14:00
@bharatgulati bharatgulati marked this pull request as ready for review September 23, 2023 08:43
Then Validate the data from GCS Source to GCS Sink with expected csv file and target data in GCS bucket

@GCS_CSV @GCS_SINK_TEST
Scenario: To verify the pipeline is getting failed from GCS to GCS when Schema is not cleared in GCS source On Single File
Member

@itsankit-google itsankit-google Sep 25, 2023

nit: To verify the pipeline....when default schema is not cleared....

Contributor Author

Changed it to: 'To verify the pipeline is getting failed from GCS to GCS when default schema is not cleared in GCS source On Single File'

Then Validate the data from GCS Source to GCS Sink with expected json file and target data in GCS bucket

@GCS_MULTIPLE_FILES_REGEX_TEST @GCS_SINK_TEST
Scenario: To verify the pipeline is getting failed from GCS to GCS On Multiple File with filter regex without using connection
Member

Do the multiple files have different schemas here?

Can we add a success scenario as well to understand when multiple files are supported?

Contributor Author

@bharatgulati bharatgulati Sep 25, 2023

Three files were used (vehicle_inventory-part-1, vehicle_inventory-part-2, and book). The two vehicle_inventory files share the same schema, and book uses a different one. A regex filter (.+vehicle_inventory.*) is applied on top of them, so only the two vehicle_inventory files are read and the output schema is generated from files with the same schema.

This test already covers the success case; the word "failed" was used in the title by mistake instead of "transferred".
Changed the scenario title to: 'To verify data is getting transferred from GCS to GCS On Multiple File with filter regex without using connection'
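
To make the effect of the regex filter described above concrete, here is a minimal sketch in Python. It assumes the filter must match the whole object path, and the `testdata/` prefix is a made-up layout; the plugin's actual matching rules are not shown in this thread.

```python
import re

# Regex quoted in the comment above.
FILE_REGEX = re.compile(r".+vehicle_inventory.*")

# Hypothetical object paths; the "testdata/" prefix is an assumption.
paths = [
    "testdata/vehicle_inventory-part-1",
    "testdata/vehicle_inventory-part-2",
    "testdata/book",
]


def select_paths(candidates):
    """Return the paths that the regex matches in full."""
    return [p for p in candidates if FILE_REGEX.fullmatch(p)]


selected = select_paths(paths)
print(selected)
```

Under that full-match assumption, the leading `.+` requires at least one character before `vehicle_inventory`, so a bare name with no path prefix would not be selected.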

@@ -85,3 +85,306 @@ Feature: GCS source - Verification of GCS to GCS Additional Tests successful
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from GCS Source to GCS Sink with Expected avro file and target data in GCS bucket

@GCS_CSV @GCS_SINK_TEST @EXISTING_GCS_CONNECTION
Scenario: To verify data is getting transferred from GCS Source to GCS Sink using test Schema Detection On Single File with connection
Member

nit: To verify data is getting transferred from GCS Source to GCS Sink using Schema Detection On Single File with connection
The word "test" seems unnecessary here.
This comment applies to all similar scenarios.

Contributor Author

In the ITN class, useConnection is set to true for Single File and for Multiple File with regex, and it is not used for Multiple File with different schema. So we are removing the tests that do not match this usage. If you are suggesting that we remove unnecessary tests, we can remove the following:

1) The scenario for multiple files with different schemas with connection (true).
2) The scenario for multiple files with regex with connection (false).
3) The scenario for single file with connection (false).

Member

I only suggested removing the word "test" from the scenario name, not the whole scenario.

Sorry for the confusion.

Contributor Author

Removed the word "test" from the scenario name.

Then Verify the pipeline status is "Failed"

@GCS_MULTIPLE_FILES_TEST @GCS_SINK_TEST @EXISTING_GCS_CONNECTION
Scenario: To verify the pipeline is getting failed from GCS Source to GCS Sink On Multiple File with connection
Member

It is not clear from the scenario title why the pipeline will fail. Is it due to multiple files having different schemas?

If so, we can rename it to: 'To verify the pipeline is getting failed from GCS Source to GCS Sink On Multiple File having different schemas with connection'

Contributor Author

Correct, the pipeline will fail because the multiple files have different schemas.
Renamed it to 'To verify the pipeline is getting failed from GCS Source to GCS Sink On Multiple File having different schemas with connection'.
Thanks for the suggestion.
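
As a rough illustration of why a single schema cannot describe files with different columns, here is a sketch in Python. It is not the plugin's logic: the helper names `header_of` and `common_header` are hypothetical, and the CSV contents are made-up sample data, since the real test files are not shown in this thread.

```python
import csv
import io


def header_of(csv_text):
    """Return the first row of a CSV document as a tuple of column names."""
    return tuple(next(csv.reader(io.StringIO(csv_text))))


def common_header(files):
    """Return the header shared by all files, or raise ValueError on a mismatch."""
    items = iter(files.items())
    first_name, first_text = next(items)
    expected = header_of(first_text)
    for name, text in items:
        actual = header_of(text)
        if actual != expected:
            raise ValueError(
                f"{name} has columns {actual}, but {first_name} has {expected}"
            )
    return expected


# Made-up sample contents (assumption); only the file names come from the thread.
files = {
    "vehicle_inventory-part-1": "id,make,model\n1,Acme,Roadster\n",
    "vehicle_inventory-part-2": "id,make,model\n2,Acme,Hauler\n",
    "book": "isbn,title\n000,Example Title\n",
}
```

Calling `common_header(files)` on all three sample files raises `ValueError`, while restricting the input to the two vehicle_inventory files returns their shared header.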

Then Verify the pipeline status is "Failed"

@GCS_MULTIPLE_FILES_TEST @GCS_SINK_TEST
Scenario: To verify the pipeline is getting failed from GCS Source to GCS Sink On Multiple File without connection
Member

similar comment here

Contributor Author

Renamed it to 'To verify the pipeline is getting failed from GCS Source to GCS Sink On Multiple File having different schemas without connection'

Member

@itsankit-google itsankit-google left a comment

Please squash commits & rebase so that we can merge.

@bharatgulati
Contributor Author

Please squash commits & rebase so that we can merge.

All commits have been squashed and the code has been rebased. I'll notify you once the build is green and the PR is ready to be merged.

@bharatgulati bharatgulati added build Trigger unit test build and removed build Trigger unit test build labels Oct 4, 2023
@itsankit-google itsankit-google merged commit f75f09d into data-integrations:develop Oct 4, 2023