About token counting #30
I had an AI write this:

// Note: reader, decoder, progressData, longReplyFlag and the other identifiers below
// come from the enclosing streaming function in the project; only the chunk counting
// (totalChunks) is new.
const readChunk = async (totalChunks = 0) => { // take totalChunks as a parameter
    return reader.read().then(async ({value, done}) => {
        if (!done) {
            value = decoder.decode(value);
            let chunks = value.split(/\n{2}/g);
            chunks = chunks.filter(item => {
                return item.trim();
            });
            for (let i = 0; i < chunks.length; i++) {
                let chunk = chunks[i];
                if (chunk) {
                    totalChunks++; // increment once per processed chunk (one SSE data event)
                    let payload;
                    try {
                        payload = JSON.parse(chunk.slice(6)); // strip the leading "data: " prefix
                    } catch (e) {
                        break;
                    }
                    if (payload.choices[0].finish_reason) {
                        let lenStop = payload.choices[0].finish_reason === "length";
                        longReplyFlag = enableLongReply && lenStop; // no `let` here: assumed declared in the enclosing scope, since it is read again after the stream ends
                        if (!enableLongReply && lenStop) {currentResEle.children[1].children[0].className = "halfRefReq"}
                        else {currentResEle.children[1].children[0].className = "refreshReq"};
                        if (existVoice && enableAutoVoice && currentVoiceIdx === autoVoiceDataIdx) {
                            let voiceText = longReplyFlag ? "" : progressData.slice(autoVoiceIdx), stop = !longReplyFlag;
                            autoSpeechEvent(voiceText, currentResEle, false, stop);
                        }
                        break;
                    } else {
                        let content = payload.choices[0].delta.content;
                        if (content) {
                            if (!progressData && !content.trim()) continue;
                            if (existVoice && enableAutoVoice && currentVoiceIdx === autoVoiceDataIdx) {
                                let spliter = content.match(/\.|\?|!|。|？|！|\n/);
                                if (spliter) {
                                    let voiceText = progressData.slice(autoVoiceIdx) + content.slice(0, spliter.index + 1);
                                    autoVoiceIdx += voiceText.length;
                                    autoSpeechEvent(voiceText, currentResEle);
                                }
                            }
                            if (progressData) await delay();
                            progressData += content;
                            currentResEle.children[0].innerHTML = md.render(progressData);
                            if (!isRefresh) {
                                scrollToBottom();
                            }
                        }
                    }
                }
            }
            return readChunk(totalChunks);
        } else {
            console.log('Total chunks processed:', totalChunks); // log the total number of data chunks
            if (isRefresh) {
                data[refreshIdx].content = progressData;
                if (longReplyFlag) return streamGen(true);
            } else {
                if (long) {data[data.length - 1].content = progressData}
                else {data.push({role: "assistant", content: progressData})}
                if (longReplyFlag) return streamGen(true);
            }
            stopLoading(false);
        }
    });
};
Then add a tiktoken service on top of that, and the input tokens can be counted as well.
I see. I'm still looking into how to implement tiktoken on the frontend... but you're right, the input can only be counted with tiktoken.
It looks like the only option is to wrap an API myself.
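For reference, a minimal sketch of such a self-hosted token-counting service, assuming Node.js with Express and the @dqbd/tiktoken port of OpenAI's tiktoken (the package choice, route name, and port are illustrative, not something this project ships):

// token-count-server.js — illustrative sketch only
const express = require("express");
const { get_encoding } = require("@dqbd/tiktoken");

const app = express();
app.use(express.json());

// cl100k_base is the encoding used by gpt-3.5-turbo and gpt-4
const enc = get_encoding("cl100k_base");

// POST /count with body { "text": "..." } returns { "tokens": <count> }
app.post("/count", (req, res) => {
    const text = req.body.text || "";
    res.json({ tokens: enc.encode(text).length });
});

app.listen(3000, () => console.log("token counter listening on :3000"));

The same count could also be done directly in the browser with a pure-JS port such as js-tiktoken or gpt-tokenizer, at the cost of shipping the encoder data to the client.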
Thanks for such polished and concise code. It seems that no streaming setup makes token counting convenient, so I looked at the official docs: the stream returns data one token at a time, i.e. each data event corresponds to one token.
So could you add a running count of the data events once the output is complete? That count could then be used as the number of tokens consumed.
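As a standalone illustration of that idea (the project's streaming code in the comment above does the same thing inline), counting the data: events of a streamed chat completion might look roughly like this; the endpoint, model, and key handling are placeholders, and events split across reads make the count approximate:

// sketch: count streamed "data:" events as an approximation of completion tokens
async function countStreamedChunks(apiKey, messages) {
    const resp = await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
            "Content-Type": "application/json",
            "Authorization": `Bearer ${apiKey}`,
        },
        body: JSON.stringify({model: "gpt-3.5-turbo", messages, stream: true}),
    });
    const reader = resp.body.getReader();
    const decoder = new TextDecoder("utf-8");
    let totalChunks = 0;
    while (true) {
        const {value, done} = await reader.read();
        if (done) break;
        for (const line of decoder.decode(value, {stream: true}).split("\n")) {
            // every "data:" line except the final [DONE] marker is one event
            if (line.startsWith("data:") && !line.includes("[DONE]")) totalChunks++;
        }
    }
    return totalChunks; // roughly the number of completion tokens
}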