transformers.js sample #581
Changes from 5 commits
389d123
81ece54
c8bf279
590f8bb
9a56588
a1dec5c
4fd7b9a
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
--- | ||
title: Transformer.js | ||
Review comment: The title field in the metadata section should be 'Transformers.js' instead of 'Transformer.js'.
|
||
sidebar: | ||
order: 20 | ||
--- | ||
import { Code } from "@astrojs/starlight/components" | ||
import sampleSrc from "../../../../../packages/sample/genaisrc/summary-with-transformers.genai?raw" | ||
|
||
|
||
HuggingFace [Transformers.js](https://huggingface.co/docs/transformers.js/index) is a JavaScript library | ||
Review comment: The URL provided for Transformers.js documentation is incorrect; it should be
|
||
that lets you run pretrained models locally on your machine. The library uses [onnxruntime](https://onnxruntime.ai/) | ||
to leverage the CPU/GPU capabilities of your hardware. | ||
|
||
In this guide, we will show how to create [summaries](https://huggingface.co/tasks/summarization) using the [Transformers.js](https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.SummarizationPipeline) library. | ||
Review comment: The URL provided for Transformers.js summarization pipeline is incorrect; it should be
|
||
|
||
:::tip | ||
|
||
Transformers.js has an extensive list of tasks available. This guide covers only one, but check out their [documentation](https://huggingface.co/docs/transformers.js/pipelines#tasks) | ||
Review comment: The URL provided for Transformers.js tasks documentation is incorrect; it should be
|
||
for more. | ||
|
||
::: | ||
|
||
## Installation | ||
|
||
Following the [installation instructions](https://huggingface.co/docs/transformers.js/installation), | ||
Review comment: The URL provided for Transformers.js installation instructions is incorrect; it should be
|
||
we add the [@xenova/transformers](https://www.npmjs.com/package/@xenova/transformers) package to the current project. | ||
Review comment: The package name
|
||
|
||
```bash | ||
npm install @xenova/transformers | ||
Review comment: The package name
|
||
``` | ||
|
||
You can also install this library globally to be able to use it in any project. | ||
|
||
```bash "-g" | ||
npm install -g @xenova/transformers | ||
Review comment: The package name
|
||
``` | ||
|
||
## Import the pipeline | ||
|
||
The snippet below imports the Transformers.js library and loads the summarizer pipeline and model. | ||
You can specify a model name or let the library pick a default model for the task. | ||
|
||
```js | ||
import { pipeline } from "@xenova/transformers" | ||
Review comment: The import statement uses an incorrect package name
Review comment: The import statement should use '@huggingface/transformers' instead of '@xenova/transformers'.
|
||
const summarizer = await pipeline("summarization") | ||
``` | ||
|
||
Allocating and loading the model can take some time, | ||
so it's best to do this at the beginning of your script | ||
and only once. | ||
|
||
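One way to follow the "load once, at the beginning" advice in the guide text above is to memoize the loader. The sketch below is an illustration only and is not code from this PR: `createLazyLoader` is a hypothetical helper, and the stand-in loader takes the place of a real call such as `pipeline("summarization")`.

```js
// Hypothetical helper: wraps an async loader so the expensive work runs at most once.
function createLazyLoader(load) {
    let pending // caches the in-flight or settled promise
    return () => {
        if (!pending) {
            pending = Promise.resolve()
                .then(load)
                .catch((err) => {
                    pending = undefined // allow a retry after a failed load
                    throw err
                })
        }
        return pending
    }
}

// Usage sketch with a stand-in loader (assumption); in a real script the
// loader could be () => pipeline("summarization").
let loads = 0
const getSummarizer = createLazyLoader(async () => {
    loads++
    return async (text) => [{ summary_text: text.slice(0, 20) }]
})
```

Every caller awaits `getSummarizer()`, so concurrent callers share one load instead of each allocating the model.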
:::note[Migrate your script to `.mjs`] | ||
|
||
To use the `Transformers.js` library, you need to use the `.mjs` extension for your script (or `.mts` for TypeScript support). | ||
Review comment: The advice to migrate scripts to
|
||
If your script is ending in `.genai.js`, rename it to `.genai.mjs`. | ||
|
||
::: | ||
|
||
## Invoke the pipeline | ||
|
||
The summarizer pipeline takes a single argument, the content to summarize. It returns an array of summaries, | ||
which we need to unpack to access the final summary text. In the snippet below, `summary_text` contains the summary text. | ||
|
||
```js | ||
const [summary] = await summarizer(content) | ||
// @ts-ignore | ||
const { summary_text } = summary | ||
``` | ||
|
||
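The unpacking step in the snippet above relies on `// @ts-ignore`. As an alternative sketch, a small runtime check can extract the text instead. This assumes the result shape described in the guide text (an array of objects with a `summary_text` string); `firstSummaryText` is a hypothetical helper and not part of Transformers.js or this PR.

```js
// Hypothetical helper: pulls the first summary string out of a summarization result.
// Accepts either an array of results or a single result object.
function firstSummaryText(output) {
    const first = Array.isArray(output) ? output[0] : output
    if (first && typeof first.summary_text === "string") return first.summary_text
    return undefined // unexpected shape or empty result
}
```

Returning `undefined` for an unexpected shape lets the caller decide whether to skip the file or fail loudly.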
## Final code | ||
|
||
The example below generates a summary of each input file | ||
before letting the model generate a full summary. | ||
|
||
<Code | ||
Review comment: The file reference
|
||
title="transformers.genai.mjs" | ||
code={sampleSrc} | ||
wrap={true} | ||
lang="js" | ||
/> |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -45,7 +45,7 @@ import { | |
} from "../../core/src/models" | ||
import { createBundledParsers } from "../../core/src/pdf" | ||
import { AbortSignalOptions, TraceOptions } from "../../core/src/trace" | ||
import { unique } from "../../core/src/util" | ||
import { logVerbose, unique } from "../../core/src/util" | ||
|
||
class NodeServerManager implements ServerManager { | ||
async start(): Promise<void> { | ||
|
@@ -73,6 +73,7 @@ class ModelManager implements ModelService { | |
if (provider === MODEL_PROVIDER_OLLAMA) { | ||
if (this.pulled.includes(modelid)) return { ok: true } | ||
|
||
logVerbose(`ollama: pulling ${modelid}...`) | ||
Review comment: The logVerbose function is called without any context. It might be better to provide more information about the current state of the application. 🤔
|
||
const conn = await this.getModelToken(modelid) | ||
const res = await fetch(`${conn.base}/api/pull`, { | ||
method: "POST", | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ import { assert } from "console" | |
import { host } from "./host" | ||
import { logError } from "./util" | ||
import { TraceOptions } from "./trace" | ||
import { pathToFileURL } from "url" | ||
import { fileURLToPath, pathToFileURL } from "url" | ||
|
||
function resolveGlobal(): any { | ||
if (typeof window !== "undefined") | ||
|
@@ -46,10 +46,10 @@ export async function importPrompt( | |
import.meta.url ?? | ||
pathToFileURL(__filename ?? host.projectFolder()).toString() | ||
|
||
trace?.itemValue(`import`, `${modulePath}, parent: ${parentURL}`) | ||
const onImport = (file: string) => { | ||
trace?.itemValue("📦 import", file) | ||
// trace?.itemValue("📦 import", fileURLToPath(file)) | ||
} | ||
Review comment: You have commented out the trace function call. This might lead to loss of important debugging information. Please ensure this is intentional. 🕵️‍♀️
|
||
onImport(modulePath) | ||
Review comment: The
Review comment: The onImport function is called with modulePath but the result is not used or stored. This could lead to unexpected behavior. 😕
|
||
const { tsImport, register } = await import("tsx/esm/api") | ||
unregister = register({ onImport }) | ||
const module = await tsImport(modulePath, { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,6 +19,7 @@ import { isJSONSchema } from "./schema" | |
import { consoleLogFormat } from "./logging" | ||
import { resolveFileDataUri } from "./file" | ||
import { isGlobMatch } from "./glob" | ||
import { logVerbose } from "./util" | ||
|
||
export function createChatTurnGenerationContext( | ||
options: GenerationOptions, | ||
|
@@ -28,7 +29,10 @@ export function createChatTurnGenerationContext( | |
|
||
const log = (...args: any[]) => { | ||
const line = consoleLogFormat(...args) | ||
if (line) trace.log(line) | ||
if (line) { | ||
trace.log(line) | ||
logVerbose(line) | ||
Review comment: The logVerbose function is called with the same argument as trace.log. This could lead to duplicate log entries. 🙃
Review comment: The logVerbose function is called inside the trace.log condition. This could lead to excessive logging and performance issues. Consider moving it outside the condition or adding additional checks. 😮
Review comment: The logVerbose function is called right after trace.log with the same argument. This could lead to duplicate logs. Consider removing one of them to avoid unnecessary duplication. 🔄
|
||
} | ||
} | ||
const console = Object.freeze<PromptGenerationConsole>({ | ||
log, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
script({ | ||
title: "summary of summary - transformers.js", | ||
model: "ollama:phi3", | ||
files: ["src/rag/markdown.md"], | ||
tests: { | ||
files: ["src/rag/markdown.md"], | ||
keywords: ["markdown"], | ||
}, | ||
}) | ||
|
||
console.log("loading summarizer transformer") | ||
import { pipeline } from "@xenova/transformers" | ||
const summarizer = await pipeline("summarization") | ||
|
||
for (const file of env.files) { | ||
console.log(`summarizing ${file.filename}`) | ||
const [summary] = await summarizer(file.content) | ||
// @ts-ignore | ||
const { summary_text } = summary | ||
def("FILE", { | ||
filename: file.filename, | ||
// @ts-ignore | ||
content: summary_text, | ||
}) | ||
} | ||
|
||
console.log(`summarize all summaries`) | ||
$`Summarize all the contents in FILE.` |
Review comment: The filename `Transformer.js` should be `transformers.js` to match the library's actual name.