Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open Telemetry Integration #569

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -100,8 +100,8 @@ docker/stop:
docker/stack/up:
docker-compose up --detach

docker/stack/stop:
docker-compose stop
docker/otel/up:
docker-compose up postgres redis jaeger otel_collector --detach

docker/stack/build:
docker buildx build \
Expand All @@ -123,4 +123,4 @@ git/tag/push:
git tag armory-$(TAG)
git tag policy-engine-$(TAG)
git tag vault-$(TAG)
git push origin armory-$(TAG) policy-engine-$(TAG) vault-$(TAG)
git push origin armory-$(TAG) policy-engine-$(TAG) vault-$(TAG)
44 changes: 42 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ Alternatively, you can run the entire development stack in Docker containers.
make docker/stack/build

make docker/stack/up
make docker/stack/stop
make docker/stop
```

## Testing
Expand Down Expand Up @@ -116,7 +116,7 @@ below to generate a project of your choice.

```bash
# Generate an standard JavaScript library.
npx nx g @nrwl/workspace:lib
npx nx g @nrwl/js:lib
# Generate an NestJS library.
npx nx g @nx/nest:library
# Generate an NestJS application.
Expand All @@ -142,6 +142,46 @@ packages to NPM.
You can find the publishable packages listed in the `release.projects` value in
the `nx.json`.

## OpenTelemetry

The Armory Stack uses [OpenTelemetry](https://opentelemetry.io/docs/) (OTEL)
for observability through traces and metrics.

For details on how OTEL is instrumented, head over
[@narval/open-telemetry](./packages/open-telemetry/).

### Setup

1. Start local dependencies:

```bash
make docker/otel/up
```

2. Set environment variables:

```bash
OTEL_SDK_DISABLED=false
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
```

> [!NOTE]
> OTEL is disabled by default in development.

3. Restart the application.

4. Access Jaeger UI: http://localhost:16686

### Naming Conventions

- Traces: `operation.entity.action` (e.g., `policyStore.policies.update`)
- Metrics: `system_entity_unit_state` (e.g., `policy_store_updates_total`)

See [Metric Naming](./packages/nestjs-shared/src/lib/module/open-telemetry/service/metric.service.ts)
and [Trace/Span
Name](./packages/nestjs-shared/src/lib/module/open-telemetry/service/trace.service.ts)
adopted conventions.

## Troubleshooting

### DB URL in env variable fails when using `docker run`, but works when running outside docker
Expand Down
10 changes: 8 additions & 2 deletions apps/armory/.env.default
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,8 @@ ADMIN_API_KEY=171c50ec62122b8c08362dcf9dce9b016ed615cfc7b90d4bc3fe5b223d967fb2
# APP db connection string
APP_DATABASE_URL=postgresql://postgres:postgres@localhost:5432/armory?schema=public

# MIGRATOR db credentials. host/port/name should be the same, username&password may be different
# Migrator db credentials.
# The host/port/name should be the same, username&password may be different
APP_DATABASE_USERNAME=postgres
APP_DATABASE_PASSWORD=postgres
APP_DATABASE_HOST=host.docker.internal
Expand All @@ -30,4 +31,9 @@ HISTORICAL_TRANSFER_FEED_PRIVATE_KEY=0xf5c8f17cc09215c5038f6b8d5e557c0d98d341236
POLICY_ENGINE_URLS=http://localhost:3010
POLICY_ENGINE_ADMIN_API_KEYS=engine-admin-api-key

MANAGED_DATASTORE_BASE_URL=http://localhost:3005/v1/data
MANAGED_DATASTORE_BASE_URL=http://localhost:3005/v1/data

# OpenTelemetry configuration.
OTEL_SDK_DISABLED=true
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_LOGS_EXPORTER=otlp
5 changes: 3 additions & 2 deletions apps/armory/src/armory.module.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { ConfigModule } from '@narval/config-module'
import { HttpLoggerMiddleware, LoggerModule } from '@narval/nestjs-shared'
import { HttpLoggerMiddleware, LoggerModule, OpenTelemetryModule } from '@narval/nestjs-shared'
import { ClassSerializerInterceptor, MiddlewareConsumer, Module, NestModule } from '@nestjs/common'
import { APP_INTERCEPTOR } from '@nestjs/core'
import { AppModule } from './app/app.module'
Expand All @@ -18,7 +18,8 @@ const INFRASTRUCTURE_MODULES = [
load: [load],
isGlobal: true
}),
QueueModule.forRoot()
QueueModule.forRoot(),
OpenTelemetryModule.forRoot()
]

const DOMAIN_MODULES = [AppModule, OrchestrationModule, TransferTrackingModule, ManagedDataStoreModule, ClientModule]
Expand Down
3 changes: 2 additions & 1 deletion apps/armory/src/client/__test__/e2e/client.spec.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { ConfigModule, ConfigService } from '@narval/config-module'
import { LoggerModule, secret } from '@narval/nestjs-shared'
import { LoggerModule, OpenTelemetryModule, secret } from '@narval/nestjs-shared'
import { DataStoreConfiguration, HttpSource, PublicClient, Source, SourceType } from '@narval/policy-engine-shared'
import { getPublicKey, privateKeyToJwk } from '@narval/signature'
import { HttpStatus, INestApplication } from '@nestjs/common'
Expand Down Expand Up @@ -77,6 +77,7 @@ describe('Client', () => {
load: [load],
isGlobal: true
}),
OpenTelemetryModule.forTest(),
ClientModule
]
}).compile()
Expand Down
8 changes: 8 additions & 0 deletions apps/armory/src/main.ts
Original file line number Diff line number Diff line change
@@ -1,3 +1,11 @@
import { instrumentOpenTelemetry } from '@narval/open-telemetry'

// IMPORTANT: OpenTelemetry SDK must be registered before any other imports to
// ensure proper instrumentation. The instrumentation packages patches Node.js
// runtime - if NestFactory or other dependencies load first, they'll use the
// unpatched runtime and won't be instrumented correctly.
instrumentOpenTelemetry({ serviceName: 'auth' })

import { ConfigService } from '@narval/config-module'
import { LoggerService, withApiVersion, withCors, withLogger, withSwagger } from '@narval/nestjs-shared'
import { ClassSerializerInterceptor, INestApplication, ValidationPipe } from '@nestjs/common'
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { ConfigModule } from '@narval/config-module'
import { LoggerModule, secret } from '@narval/nestjs-shared'
import { LoggerModule, OpenTelemetryModule, secret } from '@narval/nestjs-shared'
import {
Criterion,
Entities,
Expand Down Expand Up @@ -134,6 +134,7 @@ describe('Data Store', () => {
load: [load],
isGlobal: true
}),
OpenTelemetryModule.forTest(),
ManagedDataStoreModule
]
})
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
import { MetricService, TraceService } from '@narval/nestjs-shared'
import { EntityStore } from '@narval/policy-engine-shared'
import { HttpStatus, Injectable, NotFoundException } from '@nestjs/common'
import { HttpStatus, Inject, Injectable, NotFoundException } from '@nestjs/common'
import { Counter } from '@opentelemetry/api'
import { ClientService } from '../../../client/core/service/client.service'
import { ClusterService } from '../../../policy-engine/core/service/cluster.service'
import { SetEntityStoreResponse } from '../../http/rest/dto/set-entity-store.dto'
Expand All @@ -8,21 +10,45 @@ import { SignatureService } from './signature.service'

@Injectable()
export class EntityDataStoreService extends SignatureService {
private getCounter: Counter
private setCounter: Counter

constructor(
private entityDataStoreRepository: EntityDataStoreRepository,
private clientService: ClientService,
private clusterService: ClusterService
private clusterService: ClusterService,
@Inject(TraceService) private traceService: TraceService,
@Inject(MetricService) private metricService: MetricService
) {
super()

this.getCounter = this.metricService.createCounter('entity_data_store_get_count')
this.setCounter = this.metricService.createCounter('entity_data_store_set_count')
}

async getEntities(clientId: string): Promise<EntityStore | null> {
this.getCounter.add(1, { clientId })

const span = this.traceService.startSpan(`${EntityDataStoreService.name}.getEntities`, {
attributes: { clientId }
})
Comment on lines +32 to +34
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm questioning naming code -- is there a benefit to using ${EntityDataStoreService.name} instead of just "EntityDataStoreService" as a string? When using a raw string instead of interpolation, it makes it easier to do a "find all" in the codebase when you're looking for the line a log was from.

Obv...it tells you the service & method name, so maybe my workflow is what doesn't make sense. But I tend to find something in logs, and want to just copy/paste in "find" to get to the file. wdyt?

I'm fine either way really, just throwing this out in case we all do this..

Copy link
Collaborator Author

@wcalderipe wcalderipe Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I opted for ${EntityDataStoreService.name} is to make the span name refactoring proof. Although the side effects are:

  • Refactoring the class name will affect the data
  • You can't simply copy the span name and look at the whole code base for it

I'm on the fence now.. let me think a bit.


const entityStore = await this.entityDataStoreRepository.getLatestDataStore(clientId)

return entityStore ? EntityStore.parse(entityStore.data) : null
const response = entityStore ? EntityStore.parse(entityStore.data) : null

span.end()

return response
}

async setEntities(clientId: string, payload: EntityStore) {
this.setCounter.add(1, { clientId })

const span = this.traceService.startSpan(`${EntityDataStoreService.name}.setEntities`, {
attributes: { clientId }
})

const client = await this.clientService.findById(clientId)

if (!client) {
Expand All @@ -47,10 +73,14 @@ export class EntityDataStoreService extends SignatureService {

const success = await this.clusterService.sync(clientId)

return SetEntityStoreResponse.parse({
const response = SetEntityStoreResponse.parse({
latestSync: { success },
entity: EntityStore.parse(data),
version
})

span.end()

return response
}
}
Original file line number Diff line number Diff line change
@@ -1,27 +1,53 @@
import { MetricService, TraceService } from '@narval/nestjs-shared'
import { PolicyStore } from '@narval/policy-engine-shared'
import { HttpStatus, Injectable, NotFoundException } from '@nestjs/common'
import { HttpStatus, Inject, Injectable, NotFoundException } from '@nestjs/common'
import { Counter } from '@opentelemetry/api'
import { ClientService } from '../../../client/core/service/client.service'
import { ClusterService } from '../../../policy-engine/core/service/cluster.service'
import { PolicyDataStoreRepository } from '../../persistence/repository/policy-data-store.repository'
import { SignatureService } from './signature.service'

@Injectable()
export class PolicyDataStoreService extends SignatureService {
private getCounter: Counter
private setCounter: Counter

constructor(
private policyDataStoreRepository: PolicyDataStoreRepository,
private clientService: ClientService,
private clusterService: ClusterService
private clusterService: ClusterService,
@Inject(TraceService) private traceService: TraceService,
@Inject(MetricService) private metricService: MetricService
) {
super()

this.getCounter = this.metricService.createCounter('policy_data_store_get_count')
this.setCounter = this.metricService.createCounter('policy_data_store_set_count')
Comment on lines +24 to +25
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use constants for the metric names so that we have one place to look to find all the possible metrics we have?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would push the metrics discoverability problem to the observability stack rather than trying to solve it in the code.

}

async getPolicies(clientId: string): Promise<PolicyStore | null> {
this.getCounter.add(1, { clientId })

const span = this.traceService.startSpan(`${PolicyDataStoreService.name}.getPolicies`, {
attributes: { clientId }
})

const policyStore = await this.policyDataStoreRepository.getLatestDataStore(clientId)

return policyStore ? PolicyStore.parse(policyStore.data) : null
const response = policyStore ? PolicyStore.parse(policyStore.data) : null

span.end()

return response
}

async setPolicies(clientId: string, payload: PolicyStore) {
this.setCounter.add(1, { clientId })

const span = this.traceService.startSpan(`${PolicyDataStoreService.name}.setPolicies`, {
attributes: { clientId }
})

const client = await this.clientService.findById(clientId)

if (!client) {
Expand All @@ -46,10 +72,14 @@ export class PolicyDataStoreService extends SignatureService {

const success = await this.clusterService.sync(clientId)

return {
const response = {
latestSync: { success },
policy: PolicyStore.parse(data),
version
}

span.end()

return response
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -93,8 +93,8 @@ export class DataStoreController {
status: HttpStatus.CREATED,
type: SetEntityStoreResponseDto
})
setEntities(@Query('clientId') clientId: string, @Body() body: SetEntityStoreDto) {
return this.entityDataStoreService.setEntities(clientId, body)
async setEntities(@Query('clientId') clientId: string, @Body() body: SetEntityStoreDto) {
return await this.entityDataStoreService.setEntities(clientId, body)
}

@Post('/policies')
Expand All @@ -105,11 +105,11 @@ export class DataStoreController {
status: HttpStatus.CREATED,
type: SetPolicyStoreResponseDto
})
setPolicies(
async setPolicies(
@Query('clientId') clientId: string,
@Body() body: SetPolicyStoreDto
): Promise<SetPolicyStoreResponseDto> {
return this.policyDataStoreService.setPolicies(clientId, body)
return await this.policyDataStoreService.setPolicies(clientId, body)
}

@Post('/sync')
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { ConfigModule } from '@narval/config-module'
import { LoggerModule, secret } from '@narval/nestjs-shared'
import { LoggerModule, OpenTelemetryModule, secret } from '@narval/nestjs-shared'
import { Action, AuthorizationRequest, Eip712TypedData, FIXTURE, Request } from '@narval/policy-engine-shared'
import { getQueueToken } from '@nestjs/bull'
import { HttpStatus, INestApplication } from '@nestjs/common'
Expand Down Expand Up @@ -56,6 +56,7 @@ describe('Authorization Request', () => {
}),
QueueModule.forRoot(),
PersistenceModule,
OpenTelemetryModule.forTest(),
OrchestrationModule
]
})
Expand Down
4 changes: 1 addition & 3 deletions apps/armory/src/orchestration/orchestration.module.ts
Original file line number Diff line number Diff line change
@@ -1,11 +1,10 @@
import { BullAdapter } from '@bull-board/api/bullAdapter'
import { BullBoardModule } from '@bull-board/nestjs'
import { ConfigModule } from '@narval/config-module'
import { HttpModule } from '@nestjs/axios'
import { BullModule } from '@nestjs/bull'
import { Module } from '@nestjs/common'
import { ConditionalModule } from '@nestjs/config'
import { Env, isEnv, load } from '../armory.config'
import { Env, isEnv } from '../armory.config'
import { AUTHORIZATION_REQUEST_PROCESSING_QUEUE, DEFAULT_HTTP_MODULE_PROVIDERS } from '../armory.constant'
import { ClientModule } from '../client/client.module'
import { DataFeedModule } from '../data-feed/data-feed.module'
Expand All @@ -23,7 +22,6 @@ import { AuthorizationRequestProcessingConsumer } from './queue/consumer/authori
import { AuthorizationRequestProcessingProducer } from './queue/producer/authorization-request-processing.producer'

const INFRASTRUCTURE_MODULES = [
ConfigModule.forRoot({ load: [load] }),
HttpModule,
PersistenceModule,
// TODO (@wcalderipe, 11/01/24): Figure out why can't I have a wrapper to
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { ConfigModule } from '@narval/config-module'
import { LoggerModule, secret } from '@narval/nestjs-shared'
import { LoggerModule, OpenTelemetryModule, secret } from '@narval/nestjs-shared'
import {
Action,
AuthorizationRequest,
Expand Down Expand Up @@ -99,6 +99,7 @@ describe(AuthorizationRequestProcessingConsumer.name, () => {
BullModule.registerQueue({
name: AUTHORIZATION_REQUEST_PROCESSING_QUEUE
}),
OpenTelemetryModule.forTest(),
HttpModule,
PolicyEngineModule
],
Expand Down
Loading