# init commit #7

Merged 2 commits on Nov 20, 2023.
**`.github/workflows/deploy.yml`** (48 additions, 0 deletions)

```yaml
name: docusaurus.io

on:
  push:
    branches:
      # - gh-pages
      - main
    # Review gh actions docs if you want to further define triggers, paths, etc.
    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#on

permissions:
  contents: write

jobs:
  deploy:
    name: Deploy to GitHub Pages
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install docusaurus
        run: npm install --global docusaurus-init

      - name: Build website
        run: npm run build

      # Popular action to deploy to GitHub Pages:
      # Docs: https://github.com/peaceiris/actions-gh-pages#%EF%B8%8F-docusaurus
      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          # Build output to publish to the `gh-pages` branch:
          publish_dir: ./build
          # The following lines assign commit authorship to the official
          # GH Actions bot for deploys to the `gh-pages` branch:
          # https://github.com/actions/checkout/issues/13#issuecomment-724415212
          # The GH Actions bot is used by default if these two fields are omitted;
          # you can swap them out with your own user credentials.
          user_name: github-actions[bot]
          user_email: 41898282+github-actions[bot]@users.noreply.github.com
```
**`.github/workflows/test-deploy.yml`** (29 additions, 0 deletions)

```yaml
name: Test deployment

on:
  pull_request:
    branches:
      - main
    # Review gh actions docs if you want to further define triggers, paths, etc.
    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#on

jobs:
  test-deploy:
    name: Test deployment
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18
          cache: npm

      - name: Install dependencies
        run: npm ci

      - name: Install docusaurus
        run: npm install --global docusaurus-init

      - name: Build website
        run: npm run build
```
**`.gitignore`** (20 additions, 0 deletions)

```
# Dependencies
/node_modules

# Production
/build

# Generated files
.docusaurus
.cache-loader

# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*
```
**`babel.config.js`** (3 additions, 0 deletions)

```js
module.exports = {
  presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
};
```
**`docs/Deployment/Azure Model Deployment.md`** (30 additions, 0 deletions)
## Create and Deploy OpenAI Model in Azure

In this section, you will learn how to create and deploy an Azure OpenAI model (GPT-3.5-turbo or GPT-4).

1. Log in to the Azure portal (https://portal.azure.com/).
2. Use the search bar to look up **Azure OpenAI** and click it to navigate to the **Azure AI Services|Azure OpenAI** page.

![](img/step2.jpg)

3. In **Azure AI Services|Azure OpenAI**, click **Create** and fill in all the required fields.

![](img/step3.jpg)

4. You may need to request access to Azure OpenAI Services. Follow the link in the notification to do that.

![](img/step5.jpg)

5. When done, you should have your OpenAI model in the **Azure AI services** section. Click it to open and then click **Go to Azure OpenAI Studio** in the top bar.

![](img/step8.jpg)

6. In Azure OpenAI Studio, click **Deployment** in the navigation menu and then click **Create new deployment**. Fill in the required fields and click **Create** to deploy the model.

![](img/step9.jpg)

> Note that certain models may not be available for deployment in a particular region. If you need a specific model, you will have to submit a separate request or relocate your Azure OpenAI resource to a different region.

7. Go back to your model page and click **Keys and Endpoint**. In this section, you can find your key and endpoint that you will need to provide in [AI DIAL configuration file](./dialConfig.yaml#L30).

![](img/step13.jpg)
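Once you have the key and endpoint, you can sanity-check the deployment with a quick chat-completions call before wiring it into the AI DIAL config. The sketch below is illustrative only: the resource name, deployment name, and API version are placeholders, and the `azure_chat_url`/`ask` helpers are not part of any AI DIAL tooling.

```python
import json
import urllib.request


def azure_chat_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Build the Azure OpenAI chat-completions URL for a given deployment."""
    return (f"{endpoint.rstrip('/')}/openai/deployments/"
            f"{deployment}/chat/completions?api-version={api_version}")


def ask(endpoint: str, api_key: str, deployment: str, prompt: str,
        api_version: str = "2023-05-15") -> str:
    """Send a single user message to the deployment and return the reply text."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(
        azure_chat_url(endpoint, deployment, api_version),
        data=body,
        headers={"api-key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]


# Example call (placeholder endpoint/key; requires a real Azure OpenAI resource):
# ask("https://my-resource.openai.azure.com", "<your-key>", "gpt-35-turbo", "Hello!")
```

If the call returns a reply, the same endpoint and key can be used in the configuration file.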
**`docs/Deployment/configuration.md`** (98 additions, 0 deletions)
# Configuration

> Refer to the provided [config example](./dialConfig.yaml), which describes the application-specific parameters.

The `dialConfig.yaml` configuration file of the AI DIAL application consists of several main sections:

* Standard Helm chart parameters. Refer to the [Helm docs](https://helm.sh/).
* [Front-end parameters](#front-end-parameters): in `env` and `secrets` sections in config.
* [Back-end parameters](#back-end-parameters): in `proxy` section in the config.
* [Configuration of Adapters for models](#configuration-of-adapters).

> **Important**: it is assumed that you have a working knowledge of standard Helm chart parameters in order to define them within the configuration file.

## Front-End Parameters

> Refer to [AI DIAL Chat](https://github.com/epam/ai-dial-chat#environment-variables) for complete documentation.

Configure front-end parameters in the [`env`](./dialConfig.yaml#L18) and [`secrets`](./dialConfig.yaml#L30) sections of the config file:

|Parameter|Description|
|---------|-----------|
|NEXTAUTH_URL|Public URL of the application. When deploying to production, set `NEXTAUTH_URL` to the canonical URL of your site.|
|NEXTAUTH_SECRET|A random string used as a seed for authentication. Used to encrypt the NextAuth.js JWT and to hash email verification tokens.|
|NEXT_PUBLIC_DEFAULT_TEMPERATURE|Default temperature setting in the range [0, 1].|
|NEXT_PUBLIC_DEFAULT_SYSTEM_PROMPT|The default system prompt.|
|NEXT_PUBLIC_APP_NAME|Public application name.|
|OPENAI_API_HOST|AI DIAL back-end URL.|
|OPENAI_API_KEY|OpenAI API key.|
|OPENAI_API_VERSION|Version of the OpenAI API.|
|DEFAULT_MODEL|Default LLM.|
|ENABLED_FEATURES|A list of enabled UI features.|
|AVAILABLE_MODELS_USERS_LIMITATIONS|Specify models and the users that have access to them. Skip to allow all users to access all models.|
|AVAILABLE_ADDONS_USERS_LIMITATIONS|Specify Addons and the users that have access to them. Skip to allow all users to access all Addons.|
|CLIENT_ID|Your client ID at the auth provider.|
|TENANT_ID|Your tenant ID at the auth provider.|
|SECRET|Your secret at the auth provider.|
|NAME|Display name in the AI DIAL app.|
|HOST|Auth provider URL.|
|AUDIENCE|Your audience at the auth provider.|
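For illustration, the front-end part of the config might look like the minimal sketch below. All values are placeholders; the exact structure should follow the provided `dialConfig.yaml` example.

```yaml
env:
  NEXTAUTH_URL: "https://dial.example.com"   # placeholder URL
  NEXT_PUBLIC_APP_NAME: "AI DIAL"
  NEXT_PUBLIC_DEFAULT_TEMPERATURE: "0.7"
  DEFAULT_MODEL: "gpt-35-turbo"              # placeholder deployment name

secrets:
  NEXTAUTH_SECRET: "<random-string>"
  OPENAI_API_KEY: "<your-api-key>"
```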

## Back-End Parameters

> Refer to [AI DIAL Core](https://github.com/epam/ai-dial-core) for complete documentation.

Configure back-end parameters in the [`proxy`](./dialConfig.yaml#L74) section of the config file.

You can provide dynamic or static configurations for the back-end. Provide the path to the corresponding configuration in the `proxy.env` section.

Static settings are configured at application startup and do not change throughout the application lifecycle.

Configurations are applied in the following priority order:

* Environment variables with the extra `proxy.` prefix, e.g. `proxy.server.port`, `proxy.config.files`.
* The file specified in the `PROXY_SETTINGS` environment variable.
* The default resource file: `src/main/resources/proxy.settings.json`.

|Parameter|Default Value|Description|
|---------|-----------|-------------|
|config.files|proxy.config.json |Config files with parts of the whole config.|
|config.reload|60000|Config reload interval in milliseconds.|
|identityProvider.jwksUrl|-|URL of the JWKS provider.|
|identityProvider.appName|dial|App name to search for in the `resource_access` claim of the JWT token to check access for deployments.|
|identityProvider.loggingKey|-|User information to search for in the claims of the JWT token.|
|identityProvider.loggingSalt|-|Salt used to hash user information for logging.|
|identityProvider.cacheSize|10|How many JWT tokens to cache.|
|identityProvider.cacheExpiration|10|How long to retain JWT tokens in the cache.|
|identityProvider.cacheExpirationUnit|MINUTES|Unit of cache expiration.|

This file includes standard [Vertex library configurations](https://cloud.google.com/vertex-ai/docs/start/client-libraries).
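As an illustration of the static parameters in the table above, a settings file might look like the following sketch. It assumes the dotted parameter names map to nested JSON objects; all values are examples, not defaults you should copy.

```json
{
  "config": {
    "files": ["proxy.config.json"],
    "reload": 60000
  },
  "identityProvider": {
    "jwksUrl": "https://login.example.com/.well-known/jwks.json",
    "appName": "dial",
    "cacheSize": 10,
    "cacheExpiration": 10,
    "cacheExpirationUnit": "MINUTES"
  }
}
```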

**Dynamic** settings are defined in the `proxy.config.json` file by default. You can override settings in this file by configuring static settings.

### proxy.config.json parameters

> Refer to the [configuration file](./dialConfig.yaml#L112) to view an example.

|Parameter|Description|
|---------|-----------|
|routes|Paths to route to specific upstreams or to respond to with a configured body.|
|route-rate|Parameters for the vote endpoint.|
|applications|A list of deployed AI DIAL Applications and their parameters:<br />`endpoint`: AI DIAL Application API for chat completions.<br />`iconUrl`: a path to the icon used for the AI DIAL Application on the UI.<br />`description`: a brief description of the AI DIAL Application rendered on the UI.<br />`displayName`: a name of the AI DIAL Application used on the UI.|
|models|A list of deployed models and their parameters:<br />`type`: specify `chat` or `embedding` model type.<br />`iconUrl`: a path to the icon used for the model on the UI.<br />`description`: a brief description of the model rendered on the UI.<br />`displayName`: a name of the model rendered on the UI.<br />`endpoint`: model API for chat completions or embeddings.<br />`upstreams`: upstreams are used for load-balancing. A request will be sent to the configured model endpoint and will contain X-UPSTREAM-ENDPOINT and X-UPSTREAM-KEY headers:<br />`endpoint`: model endpoint.<br />`key`: your API key.|
|keys|API key parameters:<br />`<proxyKey>`: your API key.<br />`project`: the project name this key is assigned to.<br />`role`: the name of one of the configured roles. Defines permissions for the key.<br />`userAuth`: can be disabled, enabled, or optional.<br />**Disabled**: the Authorization header is ignored and not sent to the upstream.<br />**Enabled**: the Authorization header is required and sent to the upstream.<br />**Optional**: the Authorization header is optional and sent to the upstream if present.|
|roles|A list of configured roles with their limitations. Specify a role name, and in its limits specify deployments (models, Applications, Addons, Assistants) and their limit configurations:<br />`minute`: the total tokens per minute that can be sent to the model, enforced with a floating-window approach so that the limit is well distributed over a one-minute window.<br />`day`: the total tokens per day that can be sent to the model, enforced with the same floating-window approach over a 24-hour window.|
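To make the table concrete, here is a minimal illustrative sketch of a `proxy.config.json`. All names, endpoints, and keys are placeholders; refer to the provided `dialConfig.yaml` for the authoritative example.

```json
{
  "models": {
    "gpt-35-turbo": {
      "type": "chat",
      "displayName": "GPT-3.5 Turbo",
      "endpoint": "http://adapter-openai/openai/deployments/gpt-35-turbo/chat/completions",
      "upstreams": [
        {
          "endpoint": "https://my-resource.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions",
          "key": "<azure-key>"
        }
      ]
    }
  },
  "keys": {
    "<proxyKey>": { "project": "my-project", "role": "default" }
  },
  "roles": {
    "default": {
      "limits": { "gpt-35-turbo": { "minute": 100000, "day": 1000000 } }
    }
  }
}
```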

## Configuration of Adapters

To work with Azure, AWS, or GCP models, AI DIAL uses applications called Adapters. You can configure Adapters in the configuration file.

Refer to these repositories for complete documentation:

* [Adapter for Bedrock](https://github.com/epam/ai-dial-adapter-bedrock)
* [Adapter for Vertex](https://github.com/epam/ai-dial-adapter-vertexai)
* [Adapter for OpenAI](https://github.com/epam/ai-dial-adapter-openai)

> Refer to the provided [config example](./dialConfig.yaml#L263) to view Adapter configuration examples.
