Releases: sustainable-computing-io/kepler-model-server
v0.7.12-release
What's Changed
- fix: missing run in run-collector-client make target by @sunya-ch in #389
- feat: get type in /best-models API and pass source and type from estimator by @sunya-ch in #392
- fix: remove remaining intel_rapl source in script by @sunya-ch in #400
- feat: add --machine-spec arg to estimator and get_machine_spec by @sunya-ch in #398
- feat: add compute similarity and loose selection by @sunya-ch in #401
- feat: add machine_spec to metadata by @sunya-ch in #405
- chore(deps): Bump scipy from 1.14.0 to 1.14.1 by @dependabot in #393
- chore(deps): Bump werkzeug from 3.0.3 to 3.0.4 by @dependabot in #403
- fix: pass true as string in pipeline param by @sunya-ch in #430
- fix(estimator): logging and machine-spec computation by @sthaha in #444
- feat: support --config-dir arg to point to the configuration directory by @sthaha in #456
- chore(deps): Bump scikit-learn from 1.5.1 to 1.5.2 by @dependabot in #446
- chore(deps): Bump boto3 from 1.34.155 to 1.35.39 by @dependabot in #497
- chore(deps): Bump pandas from 2.2.2 to 2.2.3 by @dependabot in #500
- chore(deps): Bump prometheus-client from 0.20.0 to 0.21.0 by @dependabot in #501
- chore(deps): Bump boto3 from 1.35.39 to 1.35.43 by @dependabot in #504
- chore: refactor and enhance e2e test script by @vprashar2929 in #483
- fix(workflow): enable tekton test and integration test on every PR by @vimalk78 in #477
- fix: disable expose idle power by @vimalk78 in #510
- chore(deps): Bump psutil from 6.0.0 to 6.1.0 by @dependabot in #512
- fix(config): use predictable paths and proper typings by @sthaha in #509
- fix(prom): remove unused PROM_ configs by @sthaha in #515
- feat(vm_metrics): Enabled vm metric use for local model training by @KaiyiLiu1234 in #464
- fix: check length of PROM_THIRDPARTY_METRICS by @vprashar2929 in #516
Full Changelog: v0.7.11-2-release...v0.7.12
v0.7.11-2-release
What's Changed
- fix: add boto3, spec pipeline version, version export path, and BREAKING CHANGE remove group feature by @sunya-ch in #341
- chore(deps): Bump xgboost from 2.1.0 to 2.1.1 by @dependabot in #338
- chore(deps): Bump protobuf from 5.27.2 to 5.27.3 by @dependabot in #340
- chore: update README to use v0.7.11 by @sthaha in #335
- chore(deps): Bump boto3 from 1.34.69 to 1.34.154 by @dependabot in #353
- chore(deps): Bump boto3 from 1.34.154 to 1.34.155 by @dependabot in #357
- fix: pd.unique warning and ignore file not found by @sunya-ch in #360
- bump up kepler-action to 0.0.8 by @SamYuan1990 in #313
- Revert "bump up kepler-action to 0.0.8" by @sthaha in #363
- fix: incompatibility of specpower model on 0.7.11 by @sunya-ch in #367
- chore: Use top-level kepler_model python package by @sthaha in #323
- doc(tekton): fix tekton release URL by @ideaship in #372
- chore: add integration test with latest by @sunya-ch in #374
- fix(estimator): use click to handle log-level by @sthaha in #375
- fix(server): use logging instead of print by @sthaha in #376
- feat: update select logic with spec similarity computation by @sunya-ch in #370
- fix: apply kepler tag to deploy manifests by @sunya-ch in #377
- Revert "feat: update select logic with spec similarity computation" by @sthaha in #381
- fix(estimator): crash when logging by @sthaha in #380
- fix: ignore whitespaces in config files by @sthaha in #384
- fix: mismatch model request check (unexpectedly-repeated model request) by @sunya-ch in #383
- fix: ignore whitespaces in MODEL_CONFIG file by @sthaha in #386
- chore: hatch fmt to format all source to 120 char width by @sthaha in #388
- chore(compose): add compose for local development by @sthaha in #382
- chore: minor cleanup to add logging by @sthaha in #387
New Contributors
Full Changelog: v0.7.11-release...v0.7.11-2-release
v0.7.11-release
What's Changed
- fix logic to obtain new model by @sunya-ch in #246
- change kepler config path and export func name by @sunya-ch in #253
- update local_dev_cluster_version to v0.0.5 by @sunya-ch in #255
- fix(s3): use python 3.10 by @sthaha in #285
- fix(tekton): update the README by @vprashar2929 in #298
- Revert to latest tag instead of v0.7 by @KaiyiLiu1234 in #283
- change image version to latest by @sunya-ch in #310
- enrich ec2 train/plot and remove replacement of NodeAttribute.PROCESSOR with instance profile by @sunya-ch in #311
- Base image fix and try to add container in dependbot scan by @SamYuan1990 in #317
- add manual training with entrypoint instruction by @sunya-ch in #312
- bump kepler version 0.7.11 by @sunya-ch in #325
- update instance list and pipeline for ec2-0.7.11 by @sunya-ch in #322
- change default pipeline to ec2-0.7.11 by @sunya-ch in #327
- make filter pod by benchmark optional by @sunya-ch in #326
- remove condition to append energy source by @sunya-ch in #330
- change default pipeline name to std_v0.7.11 by @sunya-ch in #329
- apply energy source param on train-model flow by @sunya-ch in #331
- fix: enhance cluster setup script for model training by @vprashar2929 in #320
New Contributors
- @vprashar2929 made their first contribution in #298
Full Changelog: v0.7.7-release...v0.7.11-release
v0.7.7-release
Highlights
- Add page_cache_hit in BPFOnly feature group
- Remove KubeletOnly feature group
- Change bpf_cpu_time unit from us to ms
- Change component energy source from rapl to intel_rapl
- Support third party metrics
- Add XgboostTrainer and CurveFitModel trainer class
- Add new trainers: XgboostFitTrainer, ExponentialRegressionTrainer, LogisticRegressionTrainer, LogarithmicRegressionTrainer
- Tekton pipeline integration for power model training
- Self-hosted power data collection and model training on EC2 bare-metal spot instances
- Support multiple node types in a pipeline
- CI updates
  - Add base-image build
  - Add local-model-db
  - Add Tekton test
  - Add kepler-model-db integration test
  - Enable dependabot
- 317 new node types of CPU power models from the SPECPower database (platform power, acpi energy source)
- 5 new node types of CPU power models from Kepler metrics on EC2 instances with the Stress-NG workload
What's Changed
- add missing writer file by @sunya-ch in #177
- change python version 3.8+ to 3.8 by @sunya-ch in #180
- update base image for 0.7 by @sunya-ch in #182
- relabel with version 0.7 by @sunya-ch in #183
- migrate xgboost to trainer implementation by @sunya-ch in #184
- Support third party metrics in model server by @LeiZhou-97 in #185
- minor bug fixes to prepare a kind cluster by @knarayan in #186
- fix unittest (fix dep)/integration test (remove short option) by @sunya-ch in #188
- add page_cache_hit and MAPE by @sunya-ch in #192
- update weight format by @sunya-ch in #190
- changes to collect metrics from Prometheus with benchmark run outside of kepler-model-server by @knarayan in #191
- encode xgboost weight, fix MAPE, reorganize exported folder by @sunya-ch in #193
- add isolate_from_data and train_from_data, with refactor entrypoint (tekton prerequisite) by @sunya-ch in #195
- add Tekton pipelinerun by @sunya-ch in #196
- add to-csv option to query command and correct outdated description by @sunya-ch in #197
- [CI] enable dependabot for kepler model server by @SamYuan1990 in #198
- Bump docker/setup-qemu-action from 2 to 3 by @dependabot in #201
- Bump docker/login-action from 1 to 3 by @dependabot in #202
- Bump docker/setup-buildx-action from 1 to 3 by @dependabot in #203
- Bump sustainable-computing-io/kepler-action from 0.0.1 to 0.0.4 by @dependabot in #204
- Bump actions/checkout from 2 to 4 by @dependabot in #200
- [CI] image build logic enhancement by @SamYuan1990 in #199
- Bump actions/checkout from 3 to 4 by @dependabot in #207
- Integrate tekton to model_training and GitHub workflows by @sunya-ch in #209
- Bump sustainable-computing-io/aws_ec2_self_hosted_runner from 1 to 3 by @dependabot in #211
- Bump sustainable-computing-io/kepler-action from 0.0.4 to 0.0.5 by @dependabot in #208
- upgrade xgboost to 2.0.1 by @sunya-ch in #210
- Bump sustainable-computing-io/aws_ec2_self_hosted_runner from 3 to 4 by @dependabot in #217
- Add curvefit trainer (fix minor issue, update test data) by @sunya-ch in #215
- Update energy source rapl to intel_rapl by @Yanbo0101 in #218
- add local db, remove kubelet, update push-pr by @sunya-ch in #222
- update README, simplified fig by @sunya-ch in #223
- fix wrong image name by @sunya-ch in #224
- fix: minor doc typo corrections by @sthaha in #226
- add node type indexing by @sunya-ch in #225
- change cpu_time unit (us to ms) by @sunya-ch in #227
- fix pipeline bug and workflow by @sunya-ch in #228
- Add metadata plot and Fix bugs by @sunya-ch in #229
- feat: add uml diagrams by @vimalk78 in #230
- update curvefit (log func) and generate_spec by @sunya-ch in #232
- add MAPE to estimator result and update exported figure by @sunya-ch in #233
- specify kepler tag by @sunya-ch in #236
- fix env error in push-to-main ci by @sunya-ch in #237
- Bump sustainable-computing-io/kepler-action from 0.0.5 to 0.0.6 by @dependabot in #239
- [CI] adjust CI settings by @SamYuan1990 in #240
- Update kepler-model-db link, serve model by spec, kepler-model-db integration CI, visualize power curve by @sunya-ch in #244
New Contributors
- @LeiZhou-97 made their first contribution in #185
- @knarayan made their first contribution in #186
- @SamYuan1990 made their first contribution in #198
- @dependabot made their first contribution in #201
- @sthaha made their first contribution in #226
- @vimalk78 made their first contribution in #230
Full Changelog: v0.6.0-release...v0.7.7-release
v0.6.0
Power model information
Model Output Type
AbsPower, DynPower
Energy source
Energy/power source | Energy/power components |
---|---|
rapl | package, core, uncore, dram |
acpi | platform |
Feature group
Group Name | Features | Kepler Metric Source(s) |
---|---|---|
CounterOnly | COUNTER_FEATURES | Hardware Counter |
CgroupOnly | CGROUP_FEATURES | cGroups |
BPFOnly | BPF_FEATURES | BPF |
KubeletOnly | KUBELET_FEATURES | Kubelet |
IRQOnly | IRQ_FEATURES | IRQ |
AcceleratorOnly | ACCELERATOR_FEATURES | Accelerator |
CounterIRQCombined | COUNTER_FEATURES, IRQ_FEATURES | BPF and Hardware Counter |
Basic | COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES, KUBELET_FEATURES | All except IRQ and node information |
WorkloadOnly | COUNTER_FEATURES, CGROUP_FEATURES, BPF_FEATURES, IRQ_FEATURES, KUBELET_FEATURES, ACCELERATOR_FEATURES | All except node information |
Full | WORKLOAD_FEATURES, SYSTEM_FEATURES | All |
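The composition in the feature group table can be sketched as a small lookup. This is only an illustration, not the project's code: the names `FEATURE_GROUPS`, `ALIASES`, `expand_group`, and `groups_using` are hypothetical, the feature-set labels are treated as opaque strings because the individual metric names are not listed here, and it assumes `WORKLOAD_FEATURES` is the same union that the WorkloadOnly row spells out.

```python
# Feature-set labels are copied from the table as opaque strings; the concrete
# metric names behind each set are not given in these release notes.
FEATURE_GROUPS = {
    "CounterOnly": ["COUNTER_FEATURES"],
    "CgroupOnly": ["CGROUP_FEATURES"],
    "BPFOnly": ["BPF_FEATURES"],
    "KubeletOnly": ["KUBELET_FEATURES"],
    "IRQOnly": ["IRQ_FEATURES"],
    "AcceleratorOnly": ["ACCELERATOR_FEATURES"],
    "CounterIRQCombined": ["COUNTER_FEATURES", "IRQ_FEATURES"],
    "Basic": [
        "COUNTER_FEATURES",
        "CGROUP_FEATURES",
        "BPF_FEATURES",
        "KUBELET_FEATURES",
    ],
    "WorkloadOnly": [
        "COUNTER_FEATURES",
        "CGROUP_FEATURES",
        "BPF_FEATURES",
        "IRQ_FEATURES",
        "KUBELET_FEATURES",
        "ACCELERATOR_FEATURES",
    ],
    "Full": ["WORKLOAD_FEATURES", "SYSTEM_FEATURES"],
}

# Assumption (not stated in the table): WORKLOAD_FEATURES is the union that the
# WorkloadOnly row lists explicitly.
ALIASES = {"WORKLOAD_FEATURES": FEATURE_GROUPS["WorkloadOnly"]}


def expand_group(name):
    """Return the flat, de-duplicated list of feature-set labels for a group."""
    flat = []
    for label in FEATURE_GROUPS[name]:
        for item in ALIASES.get(label, [label]):
            if item not in flat:
                flat.append(item)
    return flat


def groups_using(feature_set):
    """Return the sorted names of groups whose expansion includes feature_set."""
    return sorted(g for g in FEATURE_GROUPS if feature_set in expand_group(g))


if __name__ == "__main__":
    print(expand_group("Basic"))
    print(groups_using("IRQ_FEATURES"))
```

Under these assumptions, `groups_using("IRQ_FEATURES")` lists every group that depends on IRQ features, which matches the table's note that Basic excludes IRQ.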
Train methods
- PolynomialRegression
- GradientBoostingRegressor
- SGDRegressor
- KNeighborsRegressor
- LinearRegression
- SVRRegressor
Power model accuracy report
version | machine ID | pipeline | feature group | component power source | total power source | Local LR MAE in watts (Node Components/Total) | Estimator Sidecar MAE in watts (Node Components/Total) | Reference Power Range in watts |
---|---|---|---|---|---|---|---|---|
0.6 | nx12 | std_v0.6 | BPFOnly | rapl | acpi | 66.32/93.57 | 34.40/49.52 | 505.79 |
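To compare the two MAE columns in the row above, the "components/total" cells can be split and the relative reduction computed. This is a minimal sketch using only the numbers in the table; the helper names `parse_pair` and `relative_reduction` are hypothetical and not part of kepler-model-server.

```python
def parse_pair(cell):
    """Split a 'components/total' cell such as '66.32/93.57' into two floats."""
    components, total = cell.split("/")
    return float(components), float(total)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


# Values copied from the accuracy report row (MAE in watts).
local_lr = parse_pair("66.32/93.57")
sidecar = parse_pair("34.40/49.52")

components_gain = relative_reduction(local_lr[0], sidecar[0])
total_gain = relative_reduction(local_lr[1], sidecar[1])

if __name__ == "__main__":
    print(f"components MAE reduction: {components_gain:.1%}")
    print(f"total MAE reduction: {total_gain:.1%}")
```

For this single row, both reductions come out a little under one half, so the estimator sidecar's error is roughly half that of the local LR model on this machine.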
PRs to other projects
Contributors
- @rootfs made their first contribution in #1
- @KaiyiLiu1234 made their first contribution in #3
- @husky-parul made their first contribution in #20
- @Shreyanand made their first contribution in #32
- @sunya-ch made their first contribution in #34
- @Yanbo0101 made their first contribution in #128
- @jiangphcn made their first contribution in #142
- @marceloamaral made their first contribution in #146
- @mcalman made their first contribution in #159
- @Saurabhkr952 made their first contribution in #168
Full Changelog: https://github.com/sustainable-computing-io/kepler-model-server/commits/v0.6.0-release