Commit

add metric family (#42)
* add metric family

the metric set designs do not necessarily map to an explicit
type, so I am adding family to allow for better categorization.

Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch authored Aug 14, 2023
1 parent 85c05a9 commit 8b0ba8b
Showing 18 changed files with 189 additions and 80 deletions.
10 changes: 10 additions & 0 deletions docs/_static/data/metrics.json
@@ -2,69 +2,79 @@
{
"name": "app-amg",
"description": "parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids",
"family": "simulation",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-amg:latest",
"url": "https://github.com/LLNL/AMG"
},
{
"name": "app-kripke",
"description": "parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids",
"family": "simulation",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-kripke:latest",
"url": "https://github.com/LLNL/Kripke"
},
{
"name": "app-lammps",
"description": "LAMMPS molecular dynamic simulation",
"family": "simulation",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-lammps:latest",
"url": "https://www.lammps.org/"
},
{
"name": "app-pennant",
"description": "Unstructured mesh hydrodynamics for advanced architectures ",
"family": "simulation",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-pennant:latest",
"url": "https://github.com/LLNL/pennant"
},
{
"name": "app-quicksilver",
"description": "A proxy app for the Monte Carlo Transport Code",
"family": "simulation",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-quicksilver:latest",
"url": "https://github.com/LLNL/Quicksilver"
},
{
"name": "io-fio",
"description": "Flexible IO Tester (FIO)",
"family": "storage",
"type": "storage",
"image": "ghcr.io/converged-computing/metric-fio:latest",
"url": "https://fio.readthedocs.io/en/latest/fio_doc.html"
},
{
"name": "io-sysstat",
"description": "statistics for Linux tasks (processes) : I/O, CPU, memory, etc.",
"family": "storage",
"type": "storage",
"image": "ghcr.io/converged-computing/metric-sysstat:latest",
"url": "https://github.com/sysstat/sysstat"
},
{
"name": "network-netmark",
"description": "point to point networking tool",
"family": "network",
"type": "standalone",
"image": "vanessa/netmark:latest",
"url": ""
},
{
"name": "network-osu-benchmark",
"description": "point to point MPI benchmarks",
"family": "network",
"type": "standalone",
"image": "ghcr.io/converged-computing/metric-osu-benchmark:latest",
"url": "https://mvapich.cse.ohio-state.edu/benchmarks/"
},
{
"name": "perf-sysstat",
"description": "statistics for Linux tasks (processes) : I/O, CPU, memory, etc.",
"family": "performance",
"type": "application",
"image": "ghcr.io/converged-computing/metric-sysstat:latest",
"url": "https://github.com/sysstat/sysstat"
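
As a minimal sketch of how the new "family" field could be consumed, the Go program below parses a trimmed copy of a few records from this file and groups metric names by family. The `MetricRecord` type and the `GroupByFamily` helper are hypothetical names introduced for illustration; they are not part of the Metrics Operator code shown in this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// MetricRecord mirrors the fields of one entry in metrics.json.
// The type name is an assumption made for this example.
type MetricRecord struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Family      string `json:"family"`
	Type        string `json:"type"`
	Image       string `json:"image"`
	Url         string `json:"url"`
}

// sample is a trimmed copy of a few records from the diff above.
const sample = `[
  {"name": "app-lammps", "family": "simulation", "type": "standalone"},
  {"name": "io-fio", "family": "storage", "type": "storage"},
  {"name": "network-netmark", "family": "network", "type": "standalone"},
  {"name": "app-amg", "family": "simulation", "type": "standalone"}
]`

// GroupByFamily (hypothetical helper) maps each family to the sorted
// names of the metrics that belong to it.
func GroupByFamily(data []byte) (map[string][]string, error) {
	var records []MetricRecord
	if err := json.Unmarshal(data, &records); err != nil {
		return nil, err
	}
	groups := map[string][]string{}
	for _, r := range records {
		groups[r.Family] = append(groups[r.Family], r.Name)
	}
	// Sort each group in place so the output is deterministic.
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups, nil
}

func main() {
	groups, err := GroupByFamily([]byte(sample))
	if err != nil {
		panic(err)
	}
	for family, names := range groups {
		fmt.Println(family, names)
	}
}
```

Note that the design type ("standalone", "storage", "application") and the family are independent fields, so two standalone metrics can land in different families.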
18 changes: 12 additions & 6 deletions docs/_static/data/table.html
@@ -416,6 +416,7 @@
<tr>
<th>Name</th>
<th>Type</th>
<th>Family</th>
<th>Description</th>
<th>Container</th>
</tr>
@@ -440,6 +441,7 @@
return "<a href='" + row['url'] + "' target='_blank'>" + data +"</a>";},
},
{ data: "type"},
{ data: "family"},
{ data: "description"},
{ data: "image",
render: function ( data, type, row ) {
@@ -453,14 +455,18 @@
 }}
 ],
 'rowCallback': function(row, data, index){
-if(data.type == 'storage'){
-$(row).find('td:eq(1)').css('background-color', '#f5f580'); // yellow
+// Distinguish family
+if(data.family == 'storage'){
+$(row).find('td:eq(2)').css('background-color', 'lavender');
 }
-if(data.type == 'standalone'){
-$(row).find('td:eq(1)').css('background-color', '#f79fb7'); // pinkish
+if(data.family == 'performance'){
+$(row).find('td:eq(2)').css('background-color', '#f79fb7');
 }
-if(data.type == 'application'){
-$(row).find('td:eq(1)').css('background-color', '#8af98a'); // lime green
+if(data.family == 'network'){
+$(row).find('td:eq(2)').css('background-color', 'orange');
 }
+if(data.family == 'simulation'){
+$(row).find('td:eq(2)').css('background-color', 'skyblue');
+}
 }
 });
9 changes: 7 additions & 2 deletions docs/getting_started/metrics.md
@@ -7,7 +7,11 @@ The following metrics are under development (or being planned).
 - [Application Metrics](https://converged-computing.github.io/metrics-operator/getting_started/metrics.html#application)
 - [Standalone Metrics](https://converged-computing.github.io/metrics-operator/getting_started/metrics.html#standalone)
 
-<iframe src="../_static/data/table.html" style="width:100%; height:700px;" frameBorder="0"></iframe>
+Each of the above is a metric design, which is primarily represented in the Metrics Operator code. However, within each design
+there are different families of metrics (e.g., storage, network, performance, simulation) shown in the table below as the "Family" column.
+We likely will tweak and improve upon these categories.
+
+<iframe src="../_static/data/table.html" style="width:100%; height:800px;" frameBorder="0"></iframe>
 
 All metrics can be customized with the following variables
 
@@ -18,7 +22,8 @@ All metrics can be customized with the following variables
 
 ## Implemented Metrics
 
-Each metric has a link to the type, along with (optionally) examples.
+Each metric has a link to the type, along with (optionally) examples. These sections will better be organized by
+family once we decide on a more final set.
 
 ### Performance
 
5 changes: 3 additions & 2 deletions hack/docs-gen/main.go
@@ -2,7 +2,6 @@ package main
 
 import (
 "encoding/json"
-"io/ioutil"
 "log"
 "os"
 "sort"
@@ -23,6 +22,7 @@ var (
type MetricOutput struct {
Name string `json:"name"`
Description string `json:"description"`
Family string `json:"family"`
Type string `json:"type"`
Image string `json:"image"`
Url string `json:"url"`
@@ -38,6 +38,7 @@ func main() {
newRecord := MetricOutput{
Name: metric.Name(),
Description: metric.Description(),
Family: metric.Family(),
Type: metric.Type(),
Image: metric.Image(),
Url: metric.Url(),
@@ -54,7 +55,7 @@ func main() {
 if err != nil {
 log.Fatalf("Could not marshall records %s\n", err.Error())
 }
-err = ioutil.WriteFile(filename, file, 0644)
+err = os.WriteFile(filename, file, 0644)
 if err != nil {
 log.Fatalf("Could not write to file %s: %s\n", filename, err.Error())
 }
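
The hunks above add a `Family` field to the generated records and swap `ioutil.WriteFile` (deprecated since Go 1.16) for `os.WriteFile`. A minimal, self-contained sketch of that generate-and-write flow might look like the following. The helpers `WriteRecords`, `ReadRecords`, and `ExamplePath` are hypothetical, and sorting by name and the indent string are assumptions, since the diff only shows that `sort` is imported and does not show the marshalling call.

```go
package main

import (
	"encoding/json"
	"log"
	"os"
	"path/filepath"
	"sort"
)

// MetricOutput matches the struct in the diff, including the new Family field.
type MetricOutput struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	Family      string `json:"family"`
	Type        string `json:"type"`
	Image       string `json:"image"`
	Url         string `json:"url"`
}

// ExamplePath (hypothetical) returns a scratch location for the output file.
func ExamplePath() string {
	return filepath.Join(os.TempDir(), "metrics-example.json")
}

// WriteRecords (hypothetical) sorts records by name, an assumption about the
// real generator, and writes them as indented JSON using os.WriteFile.
func WriteRecords(filename string, records []MetricOutput) error {
	sort.Slice(records, func(i, j int) bool {
		return records[i].Name < records[j].Name
	})
	file, err := json.MarshalIndent(records, "", " ")
	if err != nil {
		return err
	}
	return os.WriteFile(filename, file, 0644)
}

// ReadRecords (hypothetical) loads the file back, to check the round trip.
func ReadRecords(filename string) ([]MetricOutput, error) {
	data, err := os.ReadFile(filename)
	if err != nil {
		return nil, err
	}
	var records []MetricOutput
	err = json.Unmarshal(data, &records)
	return records, err
}

func main() {
	records := []MetricOutput{
		{Name: "io-fio", Family: "storage", Type: "storage"},
		{Name: "app-amg", Family: "simulation", Type: "standalone"},
	}
	filename := ExamplePath()
	if err := WriteRecords(filename, records); err != nil {
		log.Fatalf("Could not write to file %s: %s\n", filename, err.Error())
	}
	log.Printf("wrote %d records to %s\n", len(records), filename)
}
```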
82 changes: 82 additions & 0 deletions pkg/jobs/application.go
@@ -0,0 +1,82 @@
/*
Copyright 2023 Lawrence Livermore National Security, LLC
(c.f. AUTHORS, NOTICE.LLNS, COPYING)
SPDX-License-Identifier: MIT
*/

package jobs

import (
api "github.com/converged-computing/metrics-operator/api/v1alpha1"
metrics "github.com/converged-computing/metrics-operator/pkg/metrics"
"k8s.io/apimachinery/pkg/util/intstr"
jobset "sigs.k8s.io/jobset/api/jobset/v1alpha2"
)

// These are common templates for application metrics

// SingleApplication is a Metric base for a simple application metric.
// It is exported so that it is accessible by other packages (and does not conflict with function names)
type SingleApplication struct {
Identifier string
Rate int32
Summary string
Completions int32
Container string
Workdir string
ResourceSpec *api.ContainerResources
AttributeSpec *api.ContainerSpec
}

// Name returns the metric name
func (m SingleApplication) Name() string {
return m.Identifier
}

// Description returns the metric description
func (m SingleApplication) Description() string {
return m.Summary
}

// Default SingleApplication is generic performance family
func (m SingleApplication) Family() string {
return metrics.PerformanceFamily
}

// Return container resources for the metric container
func (m SingleApplication) Resources() *api.ContainerResources {
return m.ResourceSpec
}
func (m SingleApplication) Attributes() *api.ContainerSpec {
return m.AttributeSpec
}

// Validation
func (m SingleApplication) Validate(spec *api.MetricSet) bool {
return true
}

// Container variables
func (m SingleApplication) Image() string {
return m.Container
}
func (m SingleApplication) WorkingDir() string {
return m.Workdir
}

func (m SingleApplication) ReplicatedJobs(spec *api.MetricSet) ([]jobset.ReplicatedJob, error) {
return []jobset.ReplicatedJob{}, nil
}

func (m SingleApplication) ListOptions() map[string][]intstr.IntOrString {
return map[string][]intstr.IntOrString{}
}

func (m SingleApplication) SuccessJobs() []string {
return []string{}
}

func (m SingleApplication) Type() string {
return metrics.ApplicationMetric
}
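
The base type above returns the generic performance family, while the app metrics later in this commit return the simulation family. One way this can work in Go is through struct embedding: an outer type inherits the promoted `Family()` method unless it declares its own. The sketch below illustrates that mechanism under the assumption that concrete metrics embed a base type, which this diff does not show. The types `Generic` and `Simulated` are hypothetical, and the constants and `SingleApplication` are reduced local copies.

```go
package main

import "fmt"

// Local copies of two family constants from pkg/metrics/metricset.go.
const (
	PerformanceFamily = "performance"
	SimulationFamily  = "simulation"
)

// Metric is a reduced stand-in for the real interface.
type Metric interface {
	Name() string
	Family() string
}

// SingleApplication is a cut-down copy of the base type in this file.
type SingleApplication struct {
	Identifier string
}

func (m SingleApplication) Name() string   { return m.Identifier }
func (m SingleApplication) Family() string { return PerformanceFamily }

// Generic (hypothetical) embeds the base and keeps the default family.
type Generic struct {
	SingleApplication
}

// Simulated (hypothetical) embeds the base and declares its own Family,
// which shadows the promoted method.
type Simulated struct {
	SingleApplication
}

func (m Simulated) Family() string { return SimulationFamily }

func main() {
	metrics := []Metric{
		Generic{SingleApplication{Identifier: "perf-example"}},
		Simulated{SingleApplication{Identifier: "app-example"}},
	}
	for _, m := range metrics {
		fmt.Printf("%s -> %s\n", m.Name(), m.Family())
	}
}
```

With this arrangement a default lives in one place, and only the metrics that differ need their own method.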
5 changes: 5 additions & 0 deletions pkg/jobs/launcher.go
@@ -59,6 +59,11 @@ func (m LauncherWorker) Description() string {
return m.Summary
}

// Family returns a generic performance family
func (m LauncherWorker) Family() string {
return metrics.PerformanceFamily
}

// Jobs required for success condition (n is the LauncherWorker run)
func (m *LauncherWorker) SuccessJobs() []string {
m.ensureDefaultNames()
5 changes: 5 additions & 0 deletions pkg/jobs/storage.go
@@ -35,6 +35,11 @@ func (m StorageGeneric) Name() string {
return m.Identifier
}

// Family returns the storage family
func (m StorageGeneric) Family() string {
return metrics.StorageFamily
}

// Description returns the metric description
func (m StorageGeneric) Description() string {
return m.Summary
5 changes: 5 additions & 0 deletions pkg/metrics/app/amg.go
@@ -31,6 +31,11 @@ func (m AMG) Url() string {
return "https://github.com/LLNL/AMG"
}

// I think this is a simulation?
func (m AMG) Family() string {
return metrics.SimulationFamily
}

// Set custom options / attributes for the metric
func (m *AMG) SetOptions(metric *api.Metric) {
m.Rate = metric.Rate
5 changes: 5 additions & 0 deletions pkg/metrics/app/kripke.go
@@ -29,6 +29,11 @@ func (m Kripke) Url() string {
return "https://github.com/LLNL/Kripke"
}

// I think this is a simulation?
func (m Kripke) Family() string {
return metrics.SimulationFamily
}

// Set custom options / attributes for the metric
func (m *Kripke) SetOptions(metric *api.Metric) {
m.Rate = metric.Rate
5 changes: 5 additions & 0 deletions pkg/metrics/app/lammps.go
@@ -28,6 +28,11 @@ func (m Lammps) Url() string {
return "https://www.lammps.org/"
}

// I think this is a simulation?
func (m Lammps) Family() string {
return metrics.SimulationFamily
}

// Set custom options / attributes for the metric
func (m *Lammps) SetOptions(metric *api.Metric) {
m.Rate = metric.Rate
5 changes: 5 additions & 0 deletions pkg/metrics/app/pennant.go
@@ -25,6 +25,11 @@ type Pennant struct {
prefix string
}

// I think this is a simulation?
func (m Pennant) Family() string {
return metrics.SimulationFamily
}

func (m Pennant) Url() string {
return "https://github.com/LLNL/pennant"
}
5 changes: 5 additions & 0 deletions pkg/metrics/app/quicksilver.go
@@ -25,6 +25,11 @@ type Quicksilver struct {
mpirun string
}

// I think this is a simulation?
func (m Quicksilver) Family() string {
return metrics.SimulationFamily
}

func (m Quicksilver) Url() string {
return "https://github.com/LLNL/Quicksilver"
}
1 change: 1 addition & 0 deletions pkg/metrics/metrics.go
@@ -32,6 +32,7 @@ type Metric interface {
// Attributes to expose for containers
WorkingDir() string
Image() string
Family() string

// One or more replicated jobs to populate a JobSet
ReplicatedJobs(*api.MetricSet) ([]jobset.ReplicatedJob, error)
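
Adding `Family()` to the `Metric` interface means every concrete metric must provide the method, either directly or through an embedded base type, or the package stops compiling wherever the type is used as a `Metric`. A common Go idiom for surfacing that early is a compile-time assertion. The sketch below uses a reduced copy of the interface and a hypothetical `Example` type; it is not code from this repository.

```go
package main

import "fmt"

// Metric is a reduced stand-in for the interface in pkg/metrics/metrics.go,
// keeping only the three methods visible in the hunk above.
type Metric interface {
	WorkingDir() string
	Image() string
	Family() string
}

// Example is a hypothetical metric used only for this illustration.
type Example struct{}

func (m Example) WorkingDir() string { return "" }
func (m Example) Image() string      { return "ghcr.io/converged-computing/metric-amg:latest" }
func (m Example) Family() string     { return "simulation" }

// Compile-time check: this line fails to build if Example ever stops
// satisfying Metric, for instance if Family() were removed.
var _ Metric = Example{}

func main() {
	var m Metric = Example{}
	fmt.Println(m.Family())
}
```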
9 changes: 9 additions & 0 deletions pkg/metrics/metricset.go
@@ -20,9 +20,18 @@ var (
)

const (
// Metric Design Types
ApplicationMetric = "application"
StorageMetric = "storage"
StandaloneMetric = "standalone"

// Metric Family Types (these likely can be changed)
StorageFamily = "storage"
NetworkFamily = "network"
SimulationFamily = "simulation"

// Generic (more than one type, CPU/io, etc)
PerformanceFamily = "performance"
)

// A MetricSet interface holds one or more Metrics
5 changes: 5 additions & 0 deletions pkg/metrics/network/netmark.go
@@ -42,6 +42,11 @@ type Netmark struct {
storeEachTrial bool
}

// Family returns the network family
func (n Netmark) Family() string {
return metrics.NetworkFamily
}

func (m Netmark) Url() string {
return ""
}
5 changes: 5 additions & 0 deletions pkg/metrics/network/osu-benchmark.go
@@ -192,6 +192,11 @@ func (m OSUBenchmark) Validate(spec *api.MetricSet) bool {
return spec.Spec.Pods == 2
}

// Family returns the network family
func (n OSUBenchmark) Family() string {
return metrics.NetworkFamily
}

// Return lookup of entrypoint scripts
func (m OSUBenchmark) EntrypointScripts(
spec *api.MetricSet,