Support for Claude Prompt Caching #927

Open
anrgct opened this issue Dec 10, 2024 · 6 comments

Comments

@anrgct

anrgct commented Dec 10, 2024

I'd like GenAIScript to support Claude's prompt caching feature, which is very useful for large-scale code analysis.
I want to complain a bit. It's frustrating that any modification to the order or characters of the cached prompts can invalidate the cache. Additionally, it's unclear why cache points need to be marked separately...
https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching
https://openrouter.ai/docs/prompt-caching
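
For readers wondering why a small edit invalidates the cache: Claude's prompt caching matches on the prompt prefix up to a marked breakpoint, so the prefix has to be identical for a hit. The sketch below is a toy model of that idea only, not Anthropic's implementation. The `Block` type, the `cacheBreakpoint` field, and `prefixKeys` are hypothetical names made up for illustration, and a real system would presumably hash the prefix rather than keep it as a string.

```typescript
// Toy model of prefix-based caching. All names here are hypothetical.
type Block = { text: string; cacheBreakpoint?: boolean }

// Returns one key per marked breakpoint. Each key covers every block from the
// start of the prompt up to and including the marked block.
function prefixKeys(blocks: Block[]): string[] {
    const keys: string[] = []
    const prefix: string[] = []
    for (const block of blocks) {
        prefix.push(block.text)
        if (block.cacheBreakpoint) {
            // JSON keeps block boundaries and order part of the key.
            keys.push(JSON.stringify(prefix))
        }
    }
    return keys
}

const system: Block = { text: "You are a code reviewer." }
const files: Block = { text: "<contents of many files>", cacheBreakpoint: true }

// Same prefix, different trailing question: the keys match, so the cache can be reused.
const first = prefixKeys([system, files, { text: "Summarize the files." }])
const second = prefixKeys([system, files, { text: "List the exported functions." }])
console.log(first[0] === second[0])

// One extra character before the breakpoint: the keys differ, so the cache is missed.
const edited = prefixKeys([{ text: "You are a code reviewer!" }, files])
console.log(first[0] === edited[0])
```

In this model the breakpoint is what tells the cache where the reusable prefix ends, which may be part of why cache points are marked explicitly.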


I understand your desire to improve Claude's prompt caching feature for large-scale code analysis. Your feedback is valuable and helps address a few usability issues:

  1. Caching Invalidation: Currently, any modification to the order or characters of cached prompts invalidates the cache. It would be beneficial to introduce more robust mechanisms that allow modifications without necessitating cache invalidation.
  2. Clarity in Cache Marking: The current process for marking cache points is unclear and may lead to misconfigurations. Providing a clearer explanation or perhaps adding visual indicators could enhance usability.

Here's a suggested comment to add:


Enhancing prompt caching for large-scale code analysis is highly valuable. Your feedback has highlighted important areas for improvement:

  1. Caching Invalidation: It would be helpful to have more granular control over cache invalidation. Modifications like changing the order or characters of prompts should not automatically invalidate the cache.
  2. Cache Marking Clarity: The unclear process for marking cache points could lead to misconfigurations. Consider providing clearer instructions or visual indicators to make this feature more intuitive.

Thank you for bringing these issues to our attention. We appreciate your support in improving Claude's capabilities.


generated by issue-reviewer

@pelikhan
Member

Prompt caching is implemented: https://microsoft.github.io/genaiscript/reference/scripts/prompt-caching/

I suggest reaching out to Anthropic to change their caching behavior.

@anrgct
Author

anrgct commented Dec 10, 2024

It was updated yesterday! Thank you for adding this feature right away! I updated to genaiscript@1.83.4 and found a small issue:

// This one does not work
def("FILE", env.files, { cacheControl: "ephemeral" })
// This one works
$`Summarize all files in FILE in a single paragraph.`.cacheControl("ephemeral")

@pelikhan
Member

Fixed in the next release. For now, use def(...., { ephemeral: true })

@anrgct
Author

anrgct commented Dec 10, 2024

thank you!👍

@anrgct
Author

anrgct commented Dec 24, 2024

Sorry, I upgraded to v1.86.4, and this issue still exists. It seems to be because ephemeral is not passed to children.

// packages/core/src/promptdom.ts
...
        def: async (n) => {
            try {
                names.add(n.name)
                const value = await n.value
                n.resolved = value
                n.resolved.content = extractRange(n.resolved.content, n)
                const rendered = renderDefNode(n)
                n.preview = rendered
                n.tokens = estimateTokens(rendered, encoder)
-               n.children = [createTextNode(rendered)]
+               let defOption = {};
+               if (n.ephemeral) {
+                   defOption = { ephemeral: true };
+               }
+               n.children = [createTextNode(rendered, defOption)];
            } catch (e) {
                n.error = e
            }
        },
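
The proposed change can be tried in isolation with a minimal sketch. The types below are simplified stand-ins, not the real node definitions from packages/core/src/promptdom.ts. `NodeOptions`, `attachRendered`, and this local `createTextNode` are assumptions made for illustration; only the idea of copying `ephemeral` from the def node to its child text node comes from the diff above.

```typescript
// Simplified stand-ins for the prompt node shapes. The real GenAIScript types
// carry more fields. Every name here is an illustrative assumption.
interface NodeOptions {
    ephemeral?: boolean
}
interface TextNode extends NodeOptions {
    type: "text"
    value: string
}
interface DefNode extends NodeOptions {
    type: "def"
    name: string
    children?: TextNode[]
}

// Local stand-in for createTextNode: builds a text node and copies the options onto it.
function createTextNode(value: string, options: NodeOptions = {}): TextNode {
    return { type: "text", value, ...options }
}

// Mirrors the patched lines: the child text node inherits the def node's ephemeral flag.
function attachRendered(node: DefNode, rendered: string): void {
    const options: NodeOptions = node.ephemeral ? { ephemeral: true } : {}
    node.children = [createTextNode(rendered, options)]
}

const cachedDef: DefNode = { type: "def", name: "FILE", ephemeral: true }
attachRendered(cachedDef, "FILE:\n<file contents>")
console.log(cachedDef.children?.[0]?.ephemeral)
```

Without the options argument, the child text node never sees the flag, which would match the behavior reported above where `def` with `ephemeral` has no effect.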
