Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(tests): Create tests for testing local and remote worker selection #587

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

restevens402
Copy link
Contributor

@restevens402 restevens402 commented Oct 7, 2024

Introduce MockHost and MockNetwork structs for unit tests to simulate network interactions. Adjust method visibility in WorkHandlerManager for better accessibility and fix logging in AppConfig. Add new unit tests for worker selection and handlers.

Added tests for worker selection and work distribution.
One challenge I faced is that we don't have the ability to create a "remote" node in this kind of test.
I added a mock Host to try to overcome this, but it was not sufficient to do what I hoped. I am leaving the Mock code in so that we can build on it so we can write tests that need remote node testing.

Should these tests be refactored into their own package?
Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

Introduce MockHost and MockNetwork structs for unit tests to simulate network interactions. Adjust method visibility in WorkHandlerManager for better accessibility and fix logging in AppConfig. Add new unit tests for worker selection and handlers.
Copy link

github-actions bot commented Oct 7, 2024

PR description is too short and seems to not fulfill PR template, please fill in

@restevens402 restevens402 linked an issue Oct 15, 2024 that may be closed by this pull request
commit d654c56
Author: smeb y <48400087+a3165458@users.noreply.github.com>
Date:   Mon Oct 21 15:55:05 2024 +0800

    fix(Dockerfile): add ca-certificate (#604)

    Update Dockerfile

    Signed-off-by: smeb y <48400087+a3165458@users.noreply.github.com>

commit c1e624a
Author: Ettore Di Giacinto <mudler@users.noreply.github.com>
Date:   Thu Oct 17 15:23:03 2024 +0200

    chore(docs): drop unnecessary duplicated content

    Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

commit 9529e4c
Author: Ettore Di Giacinto <mudler@users.noreply.github.com>
Date:   Thu Oct 17 15:21:07 2024 +0200

    chore(docs): update .env.example (#603)

    Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

commit 038fad6
Author: Ettore Di Giacinto <mudler@users.noreply.github.com>
Date:   Thu Oct 17 14:55:25 2024 +0200

    fix(contracts): load config from embedded (#602)

    * fix(contracts): load config from embedded

    There was still some code reading from the filesystem instead of the
    embedded files in the binary.
    Regression introduced in #523.

    Fixes: #578

    See also: #579

    Signed-off-by: mudler <mudler@localai.io>

    * chore(tests): temporarly disable Twitter tests

    These are going to be taken care of as part of
    #573

    Signed-off-by: mudler <mudler@localai.io>

    ---------

    Signed-off-by: mudler <mudler@localai.io>

commit a8a77a6
Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com>
Date:   Tue Oct 15 14:32:29 2024 -0700

    feat(twitter): Implement random sleep and improve login process (#601)

    - Add RandomSleep function to introduce variability in request timing
    - Update NewScraper to use RandomSleep before and after login attempts
    - Adjust sleep duration range to 500ms - 2s for more natural behavior
    - Improve error handling and logging in the login process

commit 01ec8c4
Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com>
Date:   Tue Oct 15 08:59:08 2024 -0700

    chore(version): update protocol version and update twitter_cookies.example.json

commit 6f594e1
Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com>
Date:   Tue Oct 15 08:48:01 2024 -0700

    feat(twitter): Enhanced Twitter Worker Selection Algorithm (#591)

    * Add detailed error logging and track worker update time

    Enhanced the worker manager to append specific error messages to a list for better debugging. Additionally, updated node data to track the last update time, improving data consistency and traceability.

    * Update version.go

    * refactor(twitter): remove retry functionality from scraper

    - Remove Retry function and MaxRetries constant from config.go
    - Update ScrapeFollowersForProfile, ScrapeTweetsProfile, and ScrapeTweetsByQuery
      to remove Retry wrapper
    - Adjust error handling in each function to directly return errors
    - Simplify code structure and reduce complexity
    - Maintain rate limit handling functionality

    * chore(workers): update max workers to 50

    * chore(workers): upate to 25

    * feat(pubsub): improve node sorting algorithm for Twitter reliability

    - Prioritize nodes with more recent last returned tweets
    - Maintain high importance for total returned tweet count
    - Consider time since last timeout to allow recovery from temporary issues
    - Deprioritize nodes with recent "not found" occurrences
    - Remove NotFoundCount from sorting criteria

    This change aims to better balance node performance and recent activity,
    while allowing nodes to recover quickly from temporary issues like rate limiting.

    * feat(workers): improve Twitter worker selection algorithm

    - Modify GetEligibleWorkers to use a specialized selection for Twitter workers
    - Introduce controlled randomness in Twitter worker selection
    - Balance between prioritizing high-performing Twitter workers and fair distribution
    - Maintain existing behavior for non-Twitter worker selection
    - Preserve handling of local worker and respect original worker limit

    This change enhances the worker selection algorithm for Twitter tasks to provide
    a better balance between utilizing top-performing nodes and ensuring fair work
    distribution. It introduces a dynamic pool size calculation and controlled
    randomness for Twitter workers, while maintaining the existing round-robin
    approach for other worker types.

    ---------

    Co-authored-by: Bob Stevens <35038919+restevens402@users.noreply.github.com>

commit f09fb20
Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com>
Date:   Tue Oct 8 15:38:45 2024 -0700

    Feat(workers) implement adaptive worker selection for improved task distribution (#589)

    * feat(worker-selection): Implement performance-based worker sorting

    - Add performance metrics fields to NodeData struct
    - Implement NodeSorter for flexible sorting of worker nodes
    - Create SortNodesByTwitterReliability function for Twitter workers
    - Update GetEligibleWorkerNodes to use category-specific sorting
    - Modify GetEligibleWorkers to use sorted workers and add worker limit

    This commit enhances the worker selection process by prioritizing workers
    based on their performance metrics. It introduces a flexible sorting
    mechanism that can be easily extended to other worker categories in the
    future. The changes improve reliability and efficiency in task allocation
    across the Masa Oracle network.

    * feat(worker-selection): Implement priority-based selection for Twitter work

    - Update DistributeWork to use priority selection for Twitter category
    - Maintain round-robin selection for other work categories by shuffling workers
    - Integrate new GetEligibleWorkers function with work type-specific behavior
    - Respect MaxRemoteWorkers limit for all work types
    - Add distinct logging for Twitter and non-Twitter worker selection

    This commit enhances the work distribution process by implementing
    priority-based worker selection for Twitter-related tasks while
    preserving the existing round-robin behavior for other work types.
    It leverages the newly added performance metrics to choose the most
    reliable workers for Twitter tasks, and ensures consistent behavior
    for other categories by shuffling the worker list. This hybrid approach
    improves efficiency for Twitter tasks while maintaining the expected
    behavior for all other work types.

    * Update .gitignore

    * feat(worker-selection): Implement priority-based sorting for Twitter workers

    - Add LastNotFoundTime and NotFoundCount fields to NodeData struct
    - Enhance SortNodesByTwitterReliability function with multi-criteria sorting:
      1. Prioritize nodes found more often (lower NotFoundCount)
      2. Consider recency of last not-found occurrence
      3. Sort by higher number of returned tweets
      4. Consider recency of last returned tweet
      5. Prioritize nodes with fewer timeouts
      6. Consider recency of last timeout
      7. Use PeerId for stable sorting when no performance data is available
    - Remove random shuffling from GetEligibleWorkers function

    This commit improves worker selection for Twitter tasks by implementing
    a more sophisticated sorting algorithm that takes into account node
    reliability and performance metrics. It aims to enhance the efficiency
    and reliability of task distribution in the Masa Oracle network.

    * feat(worker-selection): Update Twitter fields in NodeData and Worker Manager

    Add functions to update Twitter-related metrics in NodeData and integrate updates into Worker Manager processes. This ensures accurate tracking of tweet-related events and peer activity in the system.

    * feat(worker-selection): Add unit tests for NodeData and NodeDataTracker

    Introduce unit tests for the NodeData and NodeDataTracker functionalities, covering scenarios involving updates to Twitter-related fields. These tests ensure the correctness of the UpdateTwitterFields method in NodeData and the UpdateNodeDataTwitter method in NodeDataTracker.

    * chore(workers): update timeouts and bump version

    ---------

    Co-authored-by: Bob Stevens <35038919+restevens402@users.noreply.github.com>

commit 0ef0df4
Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com>
Date:   Tue Oct 8 14:37:01 2024 -0700

    feat(api): Add configurable API server enablement (#586)

    * feat(api): Add configurable API server enablement

    This commit introduces a new feature that allows the API server to be
    conditionally enabled or disabled based on configuration. The changes
    include:

    1. In cmd/masa-node/main.go:
       - Refactored signal handling into a separate function `handleSignals`
       - Added conditional logic to start the API server only if enabled
       - Improved logging to indicate API server status

    2. In pkg/config/app.go:
       - Added `APIEnabled` field to the `AppConfig` struct
       - Set default value for `APIEnabled` to false in `setDefaultConfig`
       - Added command-line flag for `apiEnabled` in `setCommandLineConfig`

    3. In pkg/config/constants.go:
       - Added `APIEnabled` constant for environment variable configuration

    These changes provide more flexibility in node configuration, allowing
    users to run the node with or without the API server. This can be useful
    for security purposes or in scenarios where the API is not needed.

    The API can now be enabled via:
    - Environment variable: API_ENABLED=true
    - Command-line flag: --apiEnabled
    - Configuration file: apiEnabled: true

    By default, the API server will be disabled for enhanced security.

    * chore(config): update to take api-enabled=true and update Makefile with run-api case

    * Update Makefile
Copy link

PR description is too short and seems to not fulfill PR template, please fill in

The GetEligibleWorkers function now requires an additional integer parameter. This change updates the test code to include the new mandatory parameter, ensuring the tests remain compatible with the modified function signature.

Changed IP source to "https://checkip.amazonaws.com"
Copy link

PR description is too short and seems to not fulfill PR template, please fill in

mudler
mudler previously approved these changes Oct 29, 2024
Copy link
Contributor

@mudler mudler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

…d add tests.

Introducing `UseLocalWorkerAsRemote` to treat the local worker as remote for testing purposes. This includes updates to the node options, new tests in `worker_test.go`, and modifications in the worker selection logic. Also adjusted error logging for loading environment variables.
Copy link

github-actions bot commented Nov 7, 2024

PR description is too short and seems to not fulfill PR template, please fill in

Copy link

codecov bot commented Nov 7, 2024

Codecov Report

Attention: Patch coverage is 7.63359% with 121 lines in your changes missing coverage. Please review.

Project coverage is 17.03%. Comparing base (119f740) to head (cca860d).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/tests/mock_host.go 0.00% 119 Missing ⚠️
node/options.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##            main     #587      +/-   ##
=========================================
+ Coverage   8.89%   17.03%   +8.14%     
=========================================
  Files         99      100       +1     
  Lines       7800     7923     +123     
=========================================
+ Hits         694     1350     +656     
+ Misses      7032     6418     -614     
- Partials      74      155      +81     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

This commit upgrades various package versions in the go.mod file, ensuring the project uses more recent library versions. These updates include enhancements, bug fixes, and new features from the dependencies, which can improve the overall performance and security of the project.
Copy link

github-actions bot commented Nov 7, 2024

PR description is too short and seems to not fulfill PR template, please fill in

@restevens402 restevens402 changed the title Create MockHost for testing and update method visibility test: Create tests for testing local and remote worker selection Nov 7, 2024
Copy link

github-actions bot commented Nov 7, 2024

PR description is too short and seems to not fulfill PR template, please fill in

@mudler mudler changed the title test: Create tests for testing local and remote worker selection chore(tests): Create tests for testing local and remote worker selection Nov 7, 2024
Copy link

github-actions bot commented Nov 7, 2024

PR description is too short and seems to not fulfill PR template, please fill in

mudler
mudler previously approved these changes Nov 13, 2024
mudler and others added 3 commits November 22, 2024 10:58
Signed-off-by: mudler <mudler@localai.io>
Changed logging from Info to Debug for less verbose output. Improved error type checking by using errors.Is for context.DeadlineExceeded. Removed an unnecessary import statement in common.go.
Set the log level to Debug in the test initialization to ensure detailed logging. Updated the OracleNode creation with additional parameters including IsValidator, WithPageSize, and WithOracleProtocol for enhanced configuration flexibility. Modified test expectations to reflect changes in remote and local workers.
Introduce mutex locks in the peer connection count and node data update functions to ensure thread safety. These changes prevent potential race conditions when multiple goroutines access shared resources. Note that some outdated comments have been updated to reflect the need for careful cleanup.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test: worker selection integration test
2 participants