Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

removing locals and ref's blocks to repository names, the rulesets will be based of the custom propertys > rulesets #48

Merged
merged 5 commits into from
Sep 4, 2024

Conversation

ConnorOKane-Kainos
Copy link
Collaborator

@ConnorOKane-Kainos ConnorOKane-Kainos commented Sep 4, 2024

Jira link - https://tools.hmcts.net/jira/browse/DTSPO-18475

See PROJ-XXXXXX

Change description

Updated the locals file to remove the old code we no longer need for the rulesets. As the rulesets will be based of repos that are assigned the custom property called is_production.

Fixed formatting in the python script.

Removed rules that where not required upon second review to ensure rules are relevant.

Testing done

Checklist

  • commit messages are meaningful and follow good commit message guidelines
  • README and other documentation has been updated / added (if needed)
  • tests have been updated / new tests has been added (if needed)
  • Does this PR introduce a breaking change

🤖AEP PR SUMMARY🤖

  • components/locals.tf

    • Removed the block defining local variables excluded_repositories, all_repositories, and included_repositories.
    • Updated the env_display_names local variable.
  • components/main.tf

    • Updated the name attribute of the "github_organization_ruleset" resource.
    • Cleared the include attribute of the "repository_name" block.
    • Removed attributes from the "pull_request" block.
  • custom-properties/set_org_custom_properties.py

    • Added error handling for the API response status code not being 200, 204.
    • Updated the docstrings for the functions set_custom_properties, get_custom_properties, and load_production_repos.

…ll be based of the custom propertys > rulesets
Copy link

github-actions bot commented Sep 4, 2024

Terraform Improvements

Use of locals for Clear Naming and Reusability

  • Before: The removal of locals for excluded_repositories and related filtering logic simplifies the code. However, ensuring that the logic for filtering or modifying repository lists is encapsulated using locals could make the Terraform code more maintainable and readable.
  • Example: If there's a future requirement to exclude or include repositories based on additional criteria, considering structuring your locals to easily accommodate such changes could be beneficial. For example:

hcl
locals {
// Assuming future modifications
dynamic_repository_filter = [for repo in local.all_repositories : repo if condition]
}


This pattern allows for more flexibility and clearer intention behind repository filtering or modifications.

### Python Improvements

#### Error Handling Enhancement

- **Context:** Proper error handling in the case of API failures is crucial for maintaining robustness in scripts interacting with external services.
- **Improvement:** While logging and raising HTTP errors are implemented, consider adding a mechanism for retrying failed requests, especially for transient network issues or rate limiting by the GitHub API.
- **Example:** Implement a retry mechanism with exponential backoff for the API calls.

```python
import time
import requests

def robust_request(url, max_retries=5):
    retry_delay = 1  # start with 1 second
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code == 200:
            return response
        print(f\"Request failed, retrying in {retry_delay} seconds...\")
        time.sleep(retry_delay)
        retry_delay *= 2  # Exponential backoff
    # Final attempt, if this fails, propagate the error
    response.raise_for_status()

# This function would replace direct calls to requests.get or requests.patch in the existing functions.

Resource Handling and Comments

  • Pay attention to resource handling, especially when dealing with file operations. Ensuring that files are properly closed after being opened is important for resource management.
  • Example:
    • Use with statements when dealing with files to ensure they're automatically closed.
def load_production_repos():
    try:
        with open('production-repos.json', 'r') as file:
            return json.load(file)
    except FileNotFoundError as e:
        logging.error(f\"File not found: {e}\")
    except json.JSONDecodeError as e:
        logging.error(f\"JSON decode error: {e}\")

Use of Docstrings

  • Assessment: The addition of Error Handling and Args sections in the comments enhances the documentation, making it easier to understand the function's purpose, parameters, and error handling practices. This aligns well with best practices in Python development.
  • Suggestion: Ensure that the docstrings (currently described in comments) are following the Python docstring format for improved consistency and compatibility with documentation tools.
def load_production_repos():
    \"\"\"
    Loads a list of production repositories from a JSON file.

    Args:
        None

    Returns:
        list: The parsed JSON content from the production-repos.json file.

    Error Handling:
        - Handles FileNotFoundError by logging an error if the JSON file is not found...
        - Handles JSONDecodeError by logging an error if the JSON file cannot be parsed correctly...

    \"\"\"

Copy link

github-actions bot commented Sep 4, 2024

Terraform Code Review

locals.tf Changes

While the adjustment in locals.tf simplifies the configuration by removing excluded_repositories and all_repositories, this change might limit flexibility. Consider the following improvements:

  1. Reintroduce Flexibility for Repository Inclusion/Exclusion:
    If the use case may require dynamic repository filtering in the future, consider reintroducing a mechanism to include or exclude repositories. For example:
    hcl
    locals {
    // Placeholder for future logic to exclude or include repositories dynamically
    excluded_repositories = toset([])
    included_repositories = toset(["repo1", "repo2"]) // Example default
    }
    This approach retains flexibility without complicating the current setup.
    
    
  2. Use Descriptive Names for Key Values:
    The env_display_names map introduces a clear naming scheme for environments. Continuing this pattern, consider adding comments or using more descriptive key names to enhance readability and maintainability.

main.tf Adjustments

Setting include and exclude to null in repository_name may affect functionality. Ensure that this change aligns with your GitHub organization's policy requirements. If this is a preparation for future development, document it accordingly.

Python Script Enhancements (set_org_custom_properties.py)

  1. Consistent Error Handling Documentation:
    The addition of error handling documentation is excellent. Ensure consistency in describing error handling across functions to aid maintainability.

  2. Refactor Error Handling:
    Consider creating a separate function to handle API response errors to standardize error logging and reduce repetition. For instance:

    def handle_api_error(response):
        if not response.ok:
            error_msg = response.json().get('message', f\"HTTP Status Code: {response.status_code}\")
            logging.error(f\"API Request failed: {error_msg}\")
            response.raise_for_status()

    This function can then be used across define_custom_property, set_custom_properties, and get_custom_properties.

  3. Enhance Security by Using Environment Variables for Sensitive Data:
    Ensure that any API keys or sensitive configuration data are not hardcoded but instead accessed securely via environment variables or a secure vault solution.

  4. Improve Resource Management:
    When working with files or network connections, ensure resources are managed properly using context managers (with statement in Python) to avoid leaks.

  5. Logging and Monitoring:
    Increase the robustness of logging, especially for operations such as file loading and API requests. Monitoring success cases and anomalies can significantly improve maintenance.

General Recommendations

  • Documentation:
    Further document the code, especially complex logic or configurations that may not be immediately clear to maintainers or other developers.

  • Testing:
    Introduce unit tests for the Python scripts to validate not only the success paths but also error handling. For Terraform, consider using a combination of terraform validate for syntax checks and integration tests to ensure configurations behave as expected.

  • Cost and Carbon Usage:
    Given the nature of the changes, there's no direct implication on cost or carbon footprint from the provided diff. However, efficiently managing cloud resources and minimizing wasteful API calls can indirectly contribute to cost saving and reduced carbon usage. Consider implementing logging or a cost monitoring tool for visibility into resource utilization and potential optimizations.

Copy link

github-actions bot commented Sep 4, 2024

Terraform Improvements

  1. Refactor Removal of included_repositories Calculation:
    The removal of the included_repositories computation in locals.tf and the subsequent edit to set include = [] directly in main.tf suggests that including repositories is now handled differently or not at all. Ensure this aligns with the intended use case. If dynamic inclusion is needed in the future, consider reintroducing a flexible mechanism.

    Example Improvement:
    hcl
    locals {
    // Potentially reintroduce with a dynamic condition if required
    included_repositories = var.enable_dynamic_inclusion ? [for repo in local.all_repositories : repo if condition] : []
    }

    
    
  2. Validation of env_display_names:
    It's beneficial to validate the contents of env_display_names to ensure they meet expected patterns or values, especially if these values impact deployment or operational aspects. Terraform 0.13+ supports custom validation rules.

    Example Improvement:

    variable \"env_display_names\" {
      type = map(string)
    
      validation {
        condition     = can(regex(\"^[A-Za-z]+$\", value))
        error_message = \"The environment display names must only contain letters.\"
      }
    }

Python Script Improvements

  1. Enhance Error Handling with More Specific Exceptions:
    While generic HTTP error handling is implemented, catching and handling more specific exceptions could improve troubleshooting and user feedback.

    Example Improvement:

    import requests
    
    try:
        response = requests.get('https://api.example.com/endpoint')
        response.raise_for_status()
    except requests.exceptions.HTTPError as errh:
        print(f'Http Error: {errh}')
    except requests.exceptions.ConnectionError as errc:
        print(f'Error Connecting: {errc}')
    except requests.exceptions.Timeout as errt:
        print(f'Timeout Error: {errt}')
    except requests.exceptions.RequestException as err:
        print(f'Oops: Something Else: {err}')
  2. Optimization of JSON File Reading in load_production_repos:
    Current error handling for file reading and JSON parsing is good, but optimizing file reading and handling large JSON files can further enhance performance and error management.

    Example Improvement:

    import json
    
    def load_production_repos_optimized():
        try:
            with open('production-repos.json', 'r') as file:
                return json.load(file)
        except FileNotFoundError as e:
            print(f\"File not found: {e}\")
        except json.JSONDecodeError as e:
            print(f\"JSON Decode Error: {e}\")
  3. Duplication in Comments:
    The comments repeating "Setting the custom properties for a repository." could be seen as redundant since the function names (set_custom_properties) are already descriptive. Ensure comments add value rather than restating the obvious.

    Example Improvement:

    # Instead of repeating the function's purpose, use comments to explain non-obvious parts of the code or why specific choices were made.
  4. Documenting Expected Structure of properties:
    For set_custom_properties, it's clear what repo_full_name should look like, but the properties dictionary's expected structure is not documented. Providing examples or a schema can help users understand how to use the function correctly.

    Example Improvement:

    def set_custom_properties(repo_full_name, properties):
        \"\"\"
        Sets the custom properties for a repository.
    
        Args:
            repo_full_name (str): The full name of the repository (org/repo).
            properties (dict): The custom properties to set, formatted as { \"property_name\": value }.
        \"\"\"

General Considerations

  • Cost and Carbon Usage: The changes appear to be mostly code organization and processing enhancements without directly impacting infrastructure costs or carbon usage. Still, optimizing code and reducing unnecessary API calls or large file processing can have indirect benefits on efficiency and resource usage, leading to cost savings and reduced carbon footprint in the long run.

  • Security: Ensure that any sensitive information, particularly in the context of API interactions (e.g., tokens, repository names), is securely handled, not logged in plaintext, and that permissions are scoped as narrowly as possible.

Copy link

github-actions bot commented Sep 4, 2024

Code Review Suggestions

Terraform Files

  1. Environment Variable Consistency:
    For components/locals.tf, ensure consistency in environment variable naming conventions. The removal of excluded_repositories and all_repositories simplifies the local variables but consider if this simplification aligns with future scalability or customization requirements. For instance, if excluded_repositories might be used for specific environment setups, a more dynamic approach to include or exclude repositories might be necessary.

  2. Repository Inclusion Logic:

    • After updating main.tf to set include under repository_name to an empty list, evaluate the implications on your GitHub organization ruleset. This change might inadvertently apply rules to repositories you intended to exclude or not apply to any repositories at all, depending on your intended behavior.
    • If the intent is to dynamically include/exclude repositories, a better approach might involve parameterizing the inclusion/exclusion lists or implementing a conditional logic that can be easily adjusted.
  3. Best Practices with Terraform:

    • Immutable Infrastructure: Consider adopting an immutable infrastructure approach with Terraform, especially for critical settings like GitHub organization rules. This involves fully defining the desired state in code and avoiding manual changes that could lead to drift from the codified configurations.
    • Version Pinning: Ensure that the Terraform GitHub provider and other providers used are pinned to specific versions to avoid unexpected changes due to provider updates.

Python Scripts

  1. Error Handling and Logging:

    • Enhancements in error handling and logging are commendable. Further improvements could include more descriptive log messages that provide context around the operation being performed during the error. This helps in quicker debugging and resolution.
    • Structured logging might be beneficial for easier parsing of logs, especially in distributed systems or when aggregating logs into a centralized system.
  2. API Call Optimization:

    • When setting custom properties (set_custom_properties function) or retrieving them (get_custom_properties function), consider batching requests or using GraphQL to minimize the number of calls to GitHub's API, thus reducing rate limit consumption and potential latency.
    • In scenarios where multiple repositories are processed, leveraging concurrency mechanisms (e.g., threading, asyncio) could significantly improve performance by handling API calls in parallel.
  3. Configuration Management:

    • Move API endpoint URLs, tokens, and other configurable parameters to a separate configuration file or environment variables. This promotes better security practices by avoiding hard-coded sensitive information and facilitates easier changes to configuration without modifying the code.
  4. Documentation and Type Hints:

    • The addition of Args and Error Handling documentation is good practice. Further improve code readability and maintainability by incorporating Python type hints for function parameters and return types.
    • Example:
      python
      def set_custom_properties(repo_full_name: str, properties: dict) -> None:
      
      
  5. Cost and Carbon Usage Consideration: While not directly applicable to the changes reviewed, always consider the cost (both operational and environmental) of increased API call volumes and data processing. Efficient code that minimizes unnecessary processing and optimizes resource usage contributes to lower operational costs and reduced carbon footprint.

In terms of cost, the direct changes here do not imply a substantial shift in operational cost in GBP or carbon usage. However, indirectly, optimizations in API call efficiency and processing can contribute to cost savings over time, especially at scale.

Copy link

github-actions bot commented Sep 4, 2024

Terraform Code Review

  1. Use of Hard-coded Values:

    • In the github_organization_ruleset resource, the include parameter is now set to [] (an empty list). This may be intentional, but if the goal is to dynamically include repositories based on certain criteria, consider reintroducing logic to dynamically set this.
      • Suggestion: If you had to remove local.included_repositories due to changes in how repositories are decided, consider defining a new logic under locals that reflects the current requirements. Possibly leverage Terraform's data sources or external data source to fetch and filter repository names dynamically.
  2. Configuration Naming Consistency:

    • The renaming of the ruleset's name from "Rule Set Production Repositories" to "Production Repositories" represents an improvement in clarity. Ensure that this renaming aligns with any other dependent configurations or documentation to maintain consistency.
      • Suggestion: Review documentation or any infrastructure as code (IaC) references to ensure they match the new naming convention to avoid confusion.

Python Code Review

  1. Error Handling Improvements:

    • The Python code snippets include basic error handling (checking for certain status codes), which is a good practice. However, consider enhancing this error handling by attempting retries for specific HTTP error status codes that are typically transient (e.g., 502, 503, 504).
      • Suggestion: Use a retry library (like backoff or retry) to implement exponential backoff for transient errors. This can help to make your script more robust in the face of temporary issues with the GitHub API.
  2. Security Practice:

    • The code does not show how credentials for the GitHub API are handled. For best practices regarding security, ensure that API keys or tokens are not hard-coded but rather securely fetched from environment variables or secrets managers.
      • Suggestion: Use os.environ to access environment variables or integrate with a secrets manager (e.g., AWS Secrets Manager, Google Cloud Secret Manager) to fetch API credentials securely at runtime.
  3. Optimization:

    • Consider adding functionality to cache responses from the GitHub API, especially for information that does not change frequently (e.g., repository custom properties). This can reduce the number of API calls, thus saving cost, reducing rate limit consumption, and improving overall script performance.
      • Suggestion: Utilize a lightweight caching mechanism such as a file-based cache or in-memory cache (e.g., cachetools) to store responses for frequently accessed data.
  4. Logging Enhancements:

    • The Python script includes logging in error scenarios, which is positive. However, for better traceability and operational insight, consider enhancing the logging to include successful operation messages and possibly debug level logging for more granular troubleshooting.
      • Suggestion: Extend the use of the logging module to include different logging levels (DEBUG, INFO, WARNING, ERROR) across the script to have a better understanding of the script's execution path, especially useful in larger scale or automated environments.

General Recommendations

  • Documentation:
    • Ensure changes are well-documented, especially the rationale behind removing certain blocks or hardcoding values. Updated documentation helps in future maintenance and understanding the evolution of the codebase.
  • Code Review and Testing:
    • Comprehensive code review practices, including running static code analysis tools, can preemptively identify potential issues. Additionally, integrating automated tests can ensure the reliability and stability of both the Terraform configuration and the Python scripts.
  • Cost Analysis:
    • The changes reviewed do not directly indicate a significant impact on costs. However, consider monitoring API usage and storage consumption as they scale, especially for larger or dynamic datasets managed by the Python scripts.
  • Environmental Impact:
    • Optimization and efficiency in code execution, as well as effective error handling and retry mechanisms, contribute to reduced compute usage. This indirectly impacts the carbon footprint by minimizing unnecessary processing and data transfer.

@ConnorOKane-Kainos ConnorOKane-Kainos merged commit f1e2542 into main Sep 4, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants