-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(tests): Create tests for testing local and remote worker selection #587
base: main
Are you sure you want to change the base?
Conversation
Introduce MockHost and MockNetwork structs for unit tests to simulate network interactions. Adjust method visibility in WorkHandlerManager for better accessibility and fix logging in AppConfig. Add new unit tests for worker selection and handlers.
PR description is too short and seems to not fulfill PR template, please fill in |
commit d654c56 Author: smeb y <48400087+a3165458@users.noreply.github.com> Date: Mon Oct 21 15:55:05 2024 +0800 fix(Dockerfile): add ca-certificate (#604) Update Dockerfile Signed-off-by: smeb y <48400087+a3165458@users.noreply.github.com> commit c1e624a Author: Ettore Di Giacinto <mudler@users.noreply.github.com> Date: Thu Oct 17 15:23:03 2024 +0200 chore(docs): drop unnecessary duplicated content Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> commit 9529e4c Author: Ettore Di Giacinto <mudler@users.noreply.github.com> Date: Thu Oct 17 15:21:07 2024 +0200 chore(docs): update .env.example (#603) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> commit 038fad6 Author: Ettore Di Giacinto <mudler@users.noreply.github.com> Date: Thu Oct 17 14:55:25 2024 +0200 fix(contracts): load config from embedded (#602) * fix(contracts): load config from embedded There was still some code reading from the filesystem instead of the embedded files in the binary. Regression introduced in #523. Fixes: #578 See also: #579 Signed-off-by: mudler <mudler@localai.io> * chore(tests): temporarly disable Twitter tests These are going to be taken care of as part of #573 Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: mudler <mudler@localai.io> commit a8a77a6 Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com> Date: Tue Oct 15 14:32:29 2024 -0700 feat(twitter): Implement random sleep and improve login process (#601) - Add RandomSleep function to introduce variability in request timing - Update NewScraper to use RandomSleep before and after login attempts - Adjust sleep duration range to 500ms - 2s for more natural behavior - Improve error handling and logging in the login process commit 01ec8c4 Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com> Date: Tue Oct 15 08:59:08 2024 -0700 chore(version): update protocol version and update twitter_cookies.example.json commit 6f594e1 Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com> Date: Tue Oct 15 08:48:01 2024 -0700 feat(twitter): Enhanced Twitter Worker Selection Algorithm (#591) * Add detailed error logging and track worker update time Enhanced the worker manager to append specific error messages to a list for better debugging. Additionally, updated node data to track the last update time, improving data consistency and traceability. * Update version.go * refactor(twitter): remove retry functionality from scraper - Remove Retry function and MaxRetries constant from config.go - Update ScrapeFollowersForProfile, ScrapeTweetsProfile, and ScrapeTweetsByQuery to remove Retry wrapper - Adjust error handling in each function to directly return errors - Simplify code structure and reduce complexity - Maintain rate limit handling functionality * chore(workers): update max workers to 50 * chore(workers): upate to 25 * feat(pubsub): improve node sorting algorithm for Twitter reliability - Prioritize nodes with more recent last returned tweets - Maintain high importance for total returned tweet count - Consider time since last timeout to allow recovery from temporary issues - Deprioritize nodes with recent "not found" occurrences - Remove NotFoundCount from sorting criteria This change aims to better balance node performance and recent activity, while allowing nodes to recover quickly from temporary issues like rate limiting. * feat(workers): improve Twitter worker selection algorithm - Modify GetEligibleWorkers to use a specialized selection for Twitter workers - Introduce controlled randomness in Twitter worker selection - Balance between prioritizing high-performing Twitter workers and fair distribution - Maintain existing behavior for non-Twitter worker selection - Preserve handling of local worker and respect original worker limit This change enhances the worker selection algorithm for Twitter tasks to provide a better balance between utilizing top-performing nodes and ensuring fair work distribution. It introduces a dynamic pool size calculation and controlled randomness for Twitter workers, while maintaining the existing round-robin approach for other worker types. --------- Co-authored-by: Bob Stevens <35038919+restevens402@users.noreply.github.com> commit f09fb20 Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com> Date: Tue Oct 8 15:38:45 2024 -0700 Feat(workers) implement adaptive worker selection for improved task distribution (#589) * feat(worker-selection): Implement performance-based worker sorting - Add performance metrics fields to NodeData struct - Implement NodeSorter for flexible sorting of worker nodes - Create SortNodesByTwitterReliability function for Twitter workers - Update GetEligibleWorkerNodes to use category-specific sorting - Modify GetEligibleWorkers to use sorted workers and add worker limit This commit enhances the worker selection process by prioritizing workers based on their performance metrics. It introduces a flexible sorting mechanism that can be easily extended to other worker categories in the future. The changes improve reliability and efficiency in task allocation across the Masa Oracle network. * feat(worker-selection): Implement priority-based selection for Twitter work - Update DistributeWork to use priority selection for Twitter category - Maintain round-robin selection for other work categories by shuffling workers - Integrate new GetEligibleWorkers function with work type-specific behavior - Respect MaxRemoteWorkers limit for all work types - Add distinct logging for Twitter and non-Twitter worker selection This commit enhances the work distribution process by implementing priority-based worker selection for Twitter-related tasks while preserving the existing round-robin behavior for other work types. It leverages the newly added performance metrics to choose the most reliable workers for Twitter tasks, and ensures consistent behavior for other categories by shuffling the worker list. This hybrid approach improves efficiency for Twitter tasks while maintaining the expected behavior for all other work types. * Update .gitignore * feat(worker-selection): Implement priority-based sorting for Twitter workers - Add LastNotFoundTime and NotFoundCount fields to NodeData struct - Enhance SortNodesByTwitterReliability function with multi-criteria sorting: 1. Prioritize nodes found more often (lower NotFoundCount) 2. Consider recency of last not-found occurrence 3. Sort by higher number of returned tweets 4. Consider recency of last returned tweet 5. Prioritize nodes with fewer timeouts 6. Consider recency of last timeout 7. Use PeerId for stable sorting when no performance data is available - Remove random shuffling from GetEligibleWorkers function This commit improves worker selection for Twitter tasks by implementing a more sophisticated sorting algorithm that takes into account node reliability and performance metrics. It aims to enhance the efficiency and reliability of task distribution in the Masa Oracle network. * feat(worker-selection): Update Twitter fields in NodeData and Worker Manager Add functions to update Twitter-related metrics in NodeData and integrate updates into Worker Manager processes. This ensures accurate tracking of tweet-related events and peer activity in the system. * feat(worker-selection): Add unit tests for NodeData and NodeDataTracker Introduce unit tests for the NodeData and NodeDataTracker functionalities, covering scenarios involving updates to Twitter-related fields. These tests ensure the correctness of the UpdateTwitterFields method in NodeData and the UpdateNodeDataTwitter method in NodeDataTracker. * chore(workers): update timeouts and bump version --------- Co-authored-by: Bob Stevens <35038919+restevens402@users.noreply.github.com> commit 0ef0df4 Author: Brendan Playford <34052452+teslashibe@users.noreply.github.com> Date: Tue Oct 8 14:37:01 2024 -0700 feat(api): Add configurable API server enablement (#586) * feat(api): Add configurable API server enablement This commit introduces a new feature that allows the API server to be conditionally enabled or disabled based on configuration. The changes include: 1. In cmd/masa-node/main.go: - Refactored signal handling into a separate function `handleSignals` - Added conditional logic to start the API server only if enabled - Improved logging to indicate API server status 2. In pkg/config/app.go: - Added `APIEnabled` field to the `AppConfig` struct - Set default value for `APIEnabled` to false in `setDefaultConfig` - Added command-line flag for `apiEnabled` in `setCommandLineConfig` 3. In pkg/config/constants.go: - Added `APIEnabled` constant for environment variable configuration These changes provide more flexibility in node configuration, allowing users to run the node with or without the API server. This can be useful for security purposes or in scenarios where the API is not needed. The API can now be enabled via: - Environment variable: API_ENABLED=true - Command-line flag: --apiEnabled - Configuration file: apiEnabled: true By default, the API server will be disabled for enhanced security. * chore(config): update to take api-enabled=true and update Makefile with run-api case * Update Makefile
PR description is too short and seems to not fulfill PR template, please fill in |
The GetEligibleWorkers function now requires an additional integer parameter. This change updates the test code to include the new mandatory parameter, ensuring the tests remain compatible with the modified function signature. Changed IP source to "https://checkip.amazonaws.com"
PR description is too short and seems to not fulfill PR template, please fill in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
…d add tests. Introducing `UseLocalWorkerAsRemote` to treat the local worker as remote for testing purposes. This includes updates to the node options, new tests in `worker_test.go`, and modifications in the worker selection logic. Also adjusted error logging for loading environment variables.
PR description is too short and seems to not fulfill PR template, please fill in |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #587 +/- ##
=========================================
+ Coverage 8.89% 17.03% +8.14%
=========================================
Files 99 100 +1
Lines 7800 7923 +123
=========================================
+ Hits 694 1350 +656
+ Misses 7032 6418 -614
- Partials 74 155 +81 ☔ View full report in Codecov by Sentry. |
This commit upgrades various package versions in the go.mod file, ensuring the project uses more recent library versions. These updates include enhancements, bug fixes, and new features from the dependencies, which can improve the overall performance and security of the project.
PR description is too short and seems to not fulfill PR template, please fill in |
PR description is too short and seems to not fulfill PR template, please fill in |
PR description is too short and seems to not fulfill PR template, please fill in |
Signed-off-by: mudler <mudler@localai.io>
Changed logging from Info to Debug for less verbose output. Improved error type checking by using errors.Is for context.DeadlineExceeded. Removed an unnecessary import statement in common.go.
Set the log level to Debug in the test initialization to ensure detailed logging. Updated the OracleNode creation with additional parameters including IsValidator, WithPageSize, and WithOracleProtocol for enhanced configuration flexibility. Modified test expectations to reflect changes in remote and local workers.
Introduce mutex locks in the peer connection count and node data update functions to ensure thread safety. These changes prevent potential race conditions when multiple goroutines access shared resources. Note that some outdated comments have been updated to reflect the need for careful cleanup.
Introduce MockHost and MockNetwork structs for unit tests to simulate network interactions. Adjust method visibility in WorkHandlerManager for better accessibility and fix logging in AppConfig. Add new unit tests for worker selection and handlers.
Added tests for worker selection and work distribution.
One challenge I faced is that we don't have the ability to create a "remote" node in this kind of test.
I added a mock Host to try to overcome this, but it was not sufficient to do what I hoped. I am leaving the Mock code in so that we can build on it so we can write tests that need remote node testing.
Should these tests be refactored into their own package?
Notes for Reviewers
Signed commits