Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BQ e2e Tests Scenarios #1311

Merged
merged 1 commit into from
Oct 10, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
150 changes: 150 additions & 0 deletions src/e2e-test/features/bigquery/sink/BigQueryToBigQuerySink.feature
Original file line number Diff line number Diff line change
Expand Up @@ -195,3 +195,153 @@ Feature: BigQuery sink - Verification of BigQuery to BigQuery successful data tr
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Verify the partition table is created with partitioned on field "bqPartitionFieldTime"

@BQ_EXISTING_SOURCE_DATATYPE_TEST @BQ_EXISTING_SINK_DATATYPE_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate user is able to read the records from BigQuery source(existing table),source table here has more columns than BigQuery sink(existing table) with update button schema with use connection functionality
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "BigQuery" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery" from the plugins list as: "Sink"
Then Connect plugins: "BigQuery" and "BigQuery2" to establish connection
Then Navigate to the properties page of plugin: "BigQuery"
Then Click plugin property: "switch-useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Verify input plugin property: "table" contains value: "bqSourceTable"
Then Click on the Get Schema button
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
Then Click plugin property: "updateTableSchema"
Then Validate "BigQuery" plugin properties
Then Close the BigQuery properties
Then Save the pipeline
Then Preview and run the pipeline
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bgInsertDatatypeFile"

@BQ_INSERT_SOURCE_TEST @BQ_UPDATE_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario:Validate successful records transfer from BigQuery to BigQuery with Advanced Operations Update without updating the destination table schema with use connection functionality
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "BigQuery" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery" from the plugins list as: "Sink"
Then Connect plugins: "BigQuery" and "BigQuery2" to establish connection
Then Navigate to the properties page of plugin: "BigQuery"
Then Click plugin property: "switch-useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Verify input plugin property: "table" contains value: "bqSourceTable"
Then Click on the Get Schema button
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
And Select radio button plugin property: "operation" with value: "update"
Then Click on the Add Button of the property: "relationTableKey" with value:
| TableKey |
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Save the pipeline
Then Preview and run the pipeline
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Wait till pipeline is in running state
Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table

@BQ_INSERT_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario:Validate successful records transfer from BigQuery to BigQuery with Advanced operations Upsert without updating the destination table schema with use connection functionality
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "BigQuery" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery" from the plugins list as: "Sink"
Then Connect plugins: "BigQuery" and "BigQuery2" to establish connection
Then Navigate to the properties page of plugin: "BigQuery"
Then Click plugin property: "switch-useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Verify input plugin property: "table" contains value: "bqSourceTable"
Then Click on the Get Schema button
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
And Select radio button plugin property: "operation" with value: "upsert"
Then Click on the Add Button of the property: "relationTableKey" with value:
| TableKey |
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Save the pipeline
Then Preview and run the pipeline
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Wait till pipeline is in running state
Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table
92 changes: 92 additions & 0 deletions src/e2e-test/features/bigquery/source/BigQueryToBigQuery.feature
Original file line number Diff line number Diff line change
Expand Up @@ -262,3 +262,95 @@ Feature: BigQuery source - Verification of BigQuery to BigQuery successful data
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table

@BQ_EXISTING_SOURCE_TEST @BQ_EXISTING_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate user is able to read data from BigQuery source(existing table) and store them in BigQuery sink(existing table) with use connection functionality
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "BigQuery" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery" from the plugins list as: "Sink"
Then Connect plugins: "BigQuery" and "BigQuery2" to establish connection
Then Navigate to the properties page of plugin: "BigQuery"
Then Click plugin property: "switch-useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Verify input plugin property: "table" contains value: "bqSourceTable"
Then Click on the Get Schema button
Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
Then Validate "BigQuery" plugin properties
Then Close the BigQuery properties
Then Save the pipeline
Then Preview and run the pipeline
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqExpectedFile"

@BQ_EXISTING_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate user is able to read data from BigQuery source(existing table) without clicking on the validate button of BigQuery source and store them in BigQuery sink(new table) with use connection functionality
Given Open Datafusion Project to configure pipeline
When Expand Plugin group in the LHS plugins list: "Source"
When Select plugin: "BigQuery" from the plugins list as: "Source"
When Expand Plugin group in the LHS plugins list: "Sink"
When Select plugin: "BigQuery" from the plugins list as: "Sink"
Then Connect plugins: "BigQuery" and "BigQuery2" to establish connection
Then Navigate to the properties page of plugin: "BigQuery"
Then Click plugin property: "switch-useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Verify input plugin property: "table" contains value: "bqSourceTable"
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Then Click on the Browse Connections button
Then Select connection: "bqConnectionName"
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
Then Validate "BigQuery" plugin properties
Then Close the BigQuery properties
Then Save the pipeline
Then Preview and run the pipeline
Then Wait till pipeline preview is in running state
Then Open and capture pipeline preview logs
Then Verify the preview run status of pipeline in the logs is "succeeded"
Then Close the pipeline logs
Then Close the preview
Then Deploy the pipeline
Then Run the Pipeline in Runtime
Then Wait till pipeline is in running state
Then Open and capture logs
Then Verify the pipeline status is "Succeeded"
Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
package io.cdap.plugin.bigquery.stepsdesign;

import com.esotericsoftware.minlog.Log;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.TableResult;
import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import io.cdap.e2e.utils.BigQueryClient;
import io.cdap.e2e.utils.PluginPropertyUtils;
import io.cucumber.core.logging.Logger;
import io.cucumber.core.logging.LoggerFactory;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/**
* BigQuery Plugin Existing Table validation.
*/
public class BQValidationExistingTables {

private static final Logger LOG = LoggerFactory.getLogger(BQValidationExistingTables.class);
private static final Gson gson = new Gson();

/**
* Validates the actual data in a BigQuery table against the expected data in a JSON file.
* @param table The name of the BigQuery table to retrieve data from.
* @param fileName The name of the JSON file containing the expected data.
* @return True if the actual data matches the expected data, false otherwise.
*/
public static boolean validateActualDataToExpectedData(String table, String fileName) throws IOException,
InterruptedException, URISyntaxException {
Map<String, JsonObject> bigQueryMap = new HashMap<>();
Map<String, JsonObject> fileMap = new HashMap<>();
Path bqExpectedFilePath = Paths.get(BQValidationExistingTables.class.getResource("/" + fileName).toURI());

getBigQueryTableData(table, bigQueryMap);
getFileData(bqExpectedFilePath.toString(), fileMap);
boolean isMatched = bigQueryMap.equals(fileMap);
return isMatched;
}

/**
* Reads a JSON file line by line and populates a map with JSON objects using a specified ID key.
*@param fileName The path to the JSON file to be read.
* @param fileMap A map where the extracted JSON objects will be stored with their ID values as keys.
*/

public static void getFileData(String fileName, Map<String, JsonObject> fileMap) {
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
String line;
while ((line = br.readLine()) != null) {
JsonObject json = gson.fromJson(line, JsonObject.class);
String idKey = getIdKey(json);
if (idKey != null) {
JsonElement idElement = json.get(idKey);
if (idElement.isJsonPrimitive()) {
String idValue = idElement.getAsString();
fileMap.put(idValue, json);
}
} else {
Log.error("ID key not found");
}
}
} catch (IOException e) {
Log.error("Error reading the file: " + e.getMessage());
}
}

private static void getBigQueryTableData(String targetTable, Map<String, JsonObject> bigQueryMap)
throws IOException, InterruptedException {
String dataset = PluginPropertyUtils.pluginProp("dataset");
String projectId = PluginPropertyUtils.pluginProp("projectId");
String selectQuery = "SELECT TO_JSON(t) FROM `" + projectId + "." + dataset + "." + targetTable + "` AS t";
TableResult result = BigQueryClient.getQueryResult(selectQuery);

for (FieldValueList row : result.iterateAll()) {
JsonObject json = gson.fromJson(row.get(0).getStringValue(), JsonObject.class);
String idKey = getIdKey(json); // Get the actual ID key from the JSON object
if (idKey != null) {
JsonElement idElement = json.get(idKey);
if (idElement.isJsonPrimitive()) {
String id = idElement.getAsString();
bigQueryMap.put(id, json);
} else {
Log.error("Data Mismatched");
}
}
}
}

/**
* Retrieves the key for the ID element in the provided JSON object.
*
* @param json The JSON object to search for the ID key.
*/
private static String getIdKey(JsonObject json) {
if (json.has("ID")) {
return "ID";
} else if (json.has("Name")) {
return "Name";
} else if (json.has("Price")) {
return "Price";
} else if (json.has("Customer_Exists")) {
return "Customer_Exists";
} else {
return null;
}
}
}
Loading