Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DTSPO 18475 Fix Up #39

Closed
wants to merge 100 commits into from
Closed

DTSPO 18475 Fix Up #39

wants to merge 100 commits into from

Conversation

ConnorOKane-Kainos
Copy link
Collaborator

@ConnorOKane-Kainos ConnorOKane-Kainos commented Aug 7, 2024

  • Fresh PR made with changes

Jira link - https://tools.hmcts.net/jira/secure/RapidBoard.jspa?rapidView=1617&projectKey=DTSPO&view=detail&selectedIssue=DTSPO-18475&sprint=14008

Change description

  • I have created a python script that will be used to create the custom properties as this is currently unsupported through the terraform provider. This script will run and create the property called is_production at the organisation level, which will then be passed down to the repository level from the JSON file.

  • Made some changes to the terraform pipeline to include the CNP library and ensure that the format check happens during the PR process.

  • A rule set will be created at the organisation level with rules such as must require 1 PR review before merging, this will then also be passed down to the repository level.

  • We created a terraform pre check pipeline that will trigger on a pull request and will check if the terraform code needs formatted, if it does then this will fix the code if required. We also ensured the Terraform CI/CD pipeline runs in parallel with the pre-check.

Checklist

  • commit messages are meaningful and follow good commit message guidelines
  • README and other documentation has been updated / added (if needed)
  • tests have been updated / new tests has been added (if needed)
  • Does this PR introduce a breaking change

🤖AEP PR SUMMARY🤖

ℹ️ Created new files:

  • .github/workflows/pr-reviewer.yaml
  • .github/workflows/pr-summary.yaml
  • .github/workflows/terraform-precheck.yaml
  • .github/workflows/terraform.yaml
  • custom-properties/set_org_custom_properties.py

⚙️ Updated files:

  • ReadMe.md
  • .github/workflows/update-repos.yaml
  • scripts/update-readme.py
  • scripts/update-repo-list.py

The pull request introduced a new CI pipeline for PR reviewing and summary, a Terraform pre-check workflow, and a Terraform CI/CD pipeline. Additionally, custom properties were defined for the organization using the set_org_custom_properties.py script. The ReadMe was also updated to display repository counts and link to the production repositories JSON file.

Copy link

General Observations and Improvements

Consistency and Best Practices

  1. Use of Specific Version Tags for GitHub Actions: It's a best practice to use specific versions of GitHub Actions rather than tags like v2 or v4. This ensures consistent behavior in your workflows as new versions of actions may introduce breaking changes. For example:
    yaml

    • uses: actions/checkout@v2 # Consider using a specific version, e.g., actions/checkout@v2.3.4
    
    
  2. Uniform Environmental Variable Naming: Ensure consistency in environmental variable naming. Prefer uppercase with underscores for separation. There's inconsistency in casing, e.g., WORKING_DIRECTORY vs ApiHost.

    apiHost: \"https://example.com\" # Should be API_HOST: \"https://example.com\"
  3. Hardcoded API Host in GitHub Actions Workflows: The API_HOST is hardcoded in multiple jobs. Consider moving such values to GitHub Secrets or at least to GitHub environment configs for easier management across different environments (dev, staging, production).

    API_HOST: \"https://app-gippi-api-s-latest-uksouth.azurewebsites.net/\"
  4. Sensitive Data Handling: Ensure that all sensitive tokens, API keys, and any form of credentials are securely managed using GitHub Secrets, and not passed as plain text or hardcoded in scripts.

  5. Echoing Potentially Sensitive Data: In the script sections where responses from APIs are echoed, be cautious about potentially sensitive data being printed in logs.

    echo \"Response: $response\"
  6. Error Handling in Script Steps: Ensure proper error handling in your scripts. In bash scripts particularly, consider using set -euo pipefail at the start of your scripts. This ensures the script exits on first error (-e), undefined variable references cause errors (-u), and errors in pipelines are caught (-o pipefail).

  7. Define External Dependencies Clearly: For scripts relying on external dependencies (jq, curl), ensure there's clear documentation or a setup step that ensures these dependencies are met. For jq:

    - name: Install jq
      run: sudo apt-get install jq -y

    Consider having a dedicated step for dependencies installation to ensure it's clear for maintainers.

  8. Duplicated Workflow Jobs: It seems like similar jobs are replicated across different workflow files which could be consolidated or reused to avoid repetition and promote maintainability.

Security

  1. Validate External Inputs: When dealing with external inputs, especially in case of handling API responses or dealing with file operations, ensure to validate and sanitize these inputs to mitigate injection attacks or unintended operations.

  2. Print Sensitive Data: Ensure no sensitive data from API responses is logged unintentionally.

  3. Use of GitHub Tokens: Make sure GitHub tokens are used with the minimal scope necessary for the job at hand and handled securely.

Workflow Dispatch

  1. Explicit Trigger Controls: For workflows that can be dispatched manually (workflow_dispatch trigger), ensure proper documentation is in place so that contributors understand their use and potential impacts.

  2. Branch Targeting for Pull Requests: In the pull_request trigger, only the main branch is targeted. If the workflow is intended for use with feature branches or other branches as well, consider expanding the trigger criteria.

    pull_request:
      branches:
        - main
  3. Versioning in Actions:
    Always aim to use specific versions of GitHub Actions to avoid breaking changes affecting your workflows. Investigate if the latest versions provide any necessary features or security patches.

Copy link

The provided git diff includes a variety of workflow and script updates, each with its focus. Here are additional improvements and specific examples to consider for enhanced code quality, security, and best practices:

General

  • Ensure all external actions used in workflows specify a SHA or version tag to prevent unexpected changes from affecting your pipelines. For example:
    yaml

    • uses: actions/checkout@v2
    Should be:
    ```yaml
    - uses: actions/checkout@v2.3.4 # Use the latest release or an appropriate commit SHA
    
  • Add shell: specification for all run: commands for consistency and clarity, even where default shells are used.

Security

  • Minimize permissions in all GitHub Actions to adhere to the principle of least privilege. For example, if a job does not need write permissions to issues, those should be explicitly set to read or none.
    permissions:
      contents: write
      issues: read

Code Quality and Best Practices

  • For the Python scripts, ensure that error handling is comprehensive, especially try/except blocks should catch specific exceptions rather than general exceptions, e.g., except Exception as e: could be more specific about the types of exceptions expected from the operations inside the try block.

  • In Python scripts, consider using more descriptive variable names for readability. For example, prod_count could be production_repository_count.

  • Utilize GitHub environment secrets instead of pipeline-level env variables for sensitive or environment-specific configurations. This enhances security and makes the pipelines more versatile across different environments or configurations without modifying the code.

  • For operations that might fail, such as network calls or file operations, implement retry logic (with exponential backoff, if possible) to increase robustness. For instance, when calling GitHub APIs, transient network errors or rate limiting may cause requests to fail.

Efficiency

  • Where possible, leverage GitHub's matrix strategy to parallelize jobs that can be executed independently, reducing overall workflow execution time.

Examples:

Best Practice: Improved Action Versioning

- uses: actions/checkout@v2.3.4
- uses: azure/login@v1.3.0

Security: Minimized Permissions

permissions:
  contents: write
  issues: read
  pull-requests: write

Efficiency: Using Matrix for Parallelization

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node: [12, 14, 16]
    steps:
    - uses: actions/checkout@v2
    - name: Use Node.js ${{ matrix.node }}
      uses: actions/setup-node@v2
      with:
        node-version: ${{ matrix.node }}
    - run: npm install
    - run: npm test

Cost & Carbon Usage

  • While the changes proposed do not directly impact cost, following best practices such as minimizing runtime through parallelization and efficient retries can lead to more efficient use of GitHub Actions minutes, indirectly affecting cost.
  • Specific cost or carbon usage improvements are harder to quantify without more context on the frequency of workflow runs and the size of operations involved. However, optimizing the efficiency of operations generally contributes to lower computational resource utilization, aligning with sustainability goals.

These recommendations aim to balance code maintainability, security, efficiency, and adherence to best practices.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants