Use Anthropic Claude 3 Haiku to generate topics via AWS Bedrock #503
base: develop
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #503 +/- ##
=============================================
- Coverage 45.70% 39.06% -6.64%
+ Complexity 319 282 -37
=============================================
Files 50 57 +7
Lines 2700 2785 +85
Branches 210 214 +4
=============================================
- Hits 1234 1088 -146
- Misses 1375 1613 +238
+ Partials 91 84 -7
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Hmm, any thoughts about keeping this file around? It was generated using OpenAI, and I included it for comparison purposes.
I think it's a good idea to give an example
Minor documentation questions
pom.xml
Outdated
@@ -38,7 +38,7 @@
     <github.url>scm:git:git@github.com:dockstore/dockstore-support.git</github.url>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-    <dockstore-core.version>1.16.0-alpha.16</dockstore-core.version>
+    <dockstore-core.version>1.16.0-SNAPSHOT</dockstore-core.version>
Probably should be versioned; I think you can use a dated snapshot in a pinch.
Like
<dependency>
<groupId>io.dockstore</groupId>
<artifactId>bom-internal</artifactId>
<version>1.16.0-20240926.155529-115</version>
<type>pom</type>
</dependency>
but I haven't tried it yet
Yes, I'll update this once I create a new webservice tag 🙂
"#workflow/github.com/DataBiosphere/analysis_pipeline_WDL/ld-pruning-wdl",v7.1.1,https://raw.githubusercontent.com/DataBiosphere/analysis_pipeline_WDL/v7.1.1/ld-pruning/ld-pruning.wdl,eabd52bf3f223c58ad220a418f87f2ec23d2dd91958b61f12066144a8a84a444,false,4460,41,0.0011662500000000002,end_turn,"Calculates linkage disequilibrium, subsets GDS files, merges the subsetted files, and checks the merged file against the inputs."
"#workflow/github.com/AnalysisCommons/genesis_wdl/genesis_GWAS",v1_5,https://raw.githubusercontent.com/AnalysisCommons/genesis_wdl/v1_5//genesis_GWAS.wdl,9c0c50df22bb95869dcc66d77693ecaf702b980878f38feca3ebd75e00eb9fbe,false,4265,43,0.00112,end_turn,"Execute the null model generation, association testing, and summarization tasks to perform a genome-wide association study (GWAS) using the GENESIS biostatistical package."
"#workflow/github.com/aofarrel/covstats-wdl",master,https://raw.githubusercontent.com/aofarrel/covstats-wdl/master/covstats/covstats.wdl,49dec26c695bbb88e0e1198e04c6857f497c80b8b59f8bd86377eaf76ee74a4a,false,2532,42,6.855E-4,end_turn,"Perform read length and coverage analysis on input BAM/CRAM files, generate a report summarizing the results, and handle various file types and runtime configurations."
"#workflow/github.com/broadinstitute/warp/Optimus",aa-PD2413,https://raw.githubusercontent.com/broadinstitute/warp/aa-PD2413/pipelines/skylab/optimus/Optimus.wdl,fb1b9fbd4be73e7210dec444d446c7405afcbcb11f9030391b5e63dd9defe4b6,false,3305,66,9.0875E-4,end_turn,"Imports necessary WDL workflows, defines input parameters, performs input checks, splits FASTQ files, aligns reads, merges BAM files, calculates gene and cell metrics, generates sparse count matrix, runs EmptyDrops, and produces an H5AD output file."
"Imports necessary WDL workflows"
Amusing but I guess it's not wrong
}

private String createClaudeRequest(String prompt) {
    final double temperature = 0.5;
Javadoc or a link if there was reasoning behind a temperature of 0.5; guessing this makes it more predictable / less creative in general (compared to 1).
This was copied from their example 😅 Their default is 1; let me try that.
private String createClaudeRequest(String prompt) {
    final double temperature = 0.5;
    final String anthropicVersion = "bedrock-2023-05-31"; // Must be this value
Ditto for this: 2023 seems on the old side, but I'm guessing it really means "includes data up to 2023" rather than "released in 2023".
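For context, here is a minimal sketch of the kind of JSON body a `createClaudeRequest` method would assemble for Bedrock's Anthropic Messages API. The field names (`anthropic_version`, `max_tokens`, `temperature`, `messages`) follow the AWS Bedrock documentation; the method shape, the `max_tokens` value, and the hand-rolled escaping are my assumptions, not the PR's actual code:

```java
public class ClaudeRequestSketch {

    // Hypothetical reconstruction of createClaudeRequest: builds the JSON body
    // that Bedrock's InvokeModel expects for Anthropic's Messages API.
    static String createClaudeRequest(String prompt, double temperature) {
        final String anthropicVersion = "bedrock-2023-05-31"; // required literal per the Bedrock docs
        // Minimal escaping for the prompt; a real implementation would use a JSON library.
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{"
            + "\"anthropic_version\":\"" + anthropicVersion + "\","
            + "\"max_tokens\":512,"
            + "\"temperature\":" + temperature + ","
            + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escaped + "\"}]"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(createClaudeRequest("Summarize the workflow in one sentence.", 0.5));
    }
}
```

Note that `anthropic_version` here is an API version string required by Bedrock, separate from the model's training cutoff.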
token: YOUR_ADMIN_OR_CURATOR_DOCKSTORE_TOKEN

[ai]
openai-api-key: YOUR_OPENAI_API_KEY
It seems, since the OpenAI code is present, you could leave openai-api-key and just note that it is optional.
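A minimal sketch of what that could look like in the config file (the key name comes from the diff above; the comments are mine, and the assumption that the Bedrock path uses standard AWS credentials rather than a config key is mine as well):

```ini
[ai]
# Optional: only needed when experimenting with the deprecated OpenAI path.
# The Claude-via-Bedrock path would pick up credentials from the standard
# AWS credential chain instead of a key listed here.
openai-api-key: YOUR_OPENAI_API_KEY
```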
This is cool to see. I think it might be worth trying to do a benchmark of the performance of multiple models here and publishing on it.
@@ -0,0 +1,24 @@
trsId,version,descriptorUrl,descriptorChecksum,isTruncated,promptTokens,completionTokens,cost,finishReason,aiTopic
"#workflow/github.com/iwc-workflows/sars-cov-2-pe-illumina-artic-variant-calling/COVID-19-PE-ARTIC-ILLUMINA",main,https://raw.githubusercontent.com/iwc-workflows/sars-cov-2-pe-illumina-artic-variant-calling/main//pe-artic-variation.ga,dcc2761eb35156d7d09479112daf089439774fc29938f02cb6ee8cda87906758,false,24595,43,0.0062025,end_turn,"Trim ARTIC primer sequences, realign reads, call and filter variants, annotate variants, and apply a strand-bias soft filter to the final annotated variants."
How come some of the verbs have an s and some don't? Trim here, but Filters on the next line? I guess it depends on the content? I'd like it if they were all the same, but that's probably just me and probably isn't worth the effort and/or may not even make sense.
Could be easy to fix by adding a description of the tense to the prompt, à la "starts with a first-person present tense verb".
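A sketch of how that tense description might slot into the existing prompt string from this PR; any wording beyond the original prompt (the example verbs, the parenthetical) is hypothetical:

```java
public class PromptSketch {

    // Hypothetical tweak to the PR's prompt: spell out the desired verb form
    // so the model doesn't mix "Trim ..." and "Filters ..." styles across entries.
    static String buildPrompt(String entryType, String content) {
        return "Summarize the " + entryType + " in one sentence that starts with a "
            + "first-person present tense verb (e.g. \"Trim\", not \"Trims\") in the <summary> tags. "
            + "Use a maximum of 150 characters.\n<content>" + content + "</content>";
    }

    public static void main(String[] args) {
        System.out.println(buildPrompt("workflow", "workflow descriptor text here"));
    }
}
```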
LOG.info("Generated topic for entry with TRS ID {} and version {}", trsId, versionId);
String prompt = "Summarize the " + entryType + " in one sentence that starts with a verb in the <summary> tags. Use a maximum of 150 characters.\n<content>" + descriptorFile.getContent() + "</content>";
Optional<AIResponseInfo> aiResponseInfo = aiModel.generateTopic(prompt);
if (aiResponseInfo.isPresent()) {
No need to change, but you could use Optional.ifPresentOrElse to convert this conditional to a "safer" form.
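A minimal, self-contained illustration of that suggestion; a plain String stands in for AIResponseInfo to keep the sketch runnable:

```java
import java.util.Optional;

public class OptionalSketch {
    public static void main(String[] args) {
        // Stand-in for Optional<AIResponseInfo>.
        Optional<String> aiResponseInfo = Optional.of("some generated topic");

        // ifPresentOrElse takes both branches up front, so the "empty" case
        // can't silently be forgotten the way a bare isPresent() check can.
        aiResponseInfo.ifPresentOrElse(
            info -> System.out.println("Generated topic: " + info),
            () -> System.out.println("No topic was generated")
        );
    }
}
```

ifPresentOrElse has been available since Java 9.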
@@ -6,7 +6,7 @@ token: 08932ab0c9ae39a880905666902f8659633ae0232e94ba9f3d2094cb928397e7

[s3]
bucketName: local-dockstore-metrics-data
-endpointOverride: http://localhost:4566
+endpointOverride: https://s3.localhost.localstack.cloud:4566
Is this spurious or does it relate to the ai topics somehow?
Think this was a side-effect of a necessary AWS SDK upgrade
I should've commented in this PR as well, but here's the reason why this changed: dockstore/dockstore#6003 (comment)
 *
 * @return
 */
public abstract Optional<AIResponseInfo> generateTopic(String prompt);
Eventually, we'll probably use AIModel to submit prompts for other [non-topic-related] purposes, and it looks like the code already would support that, so would it make sense to name generateTopic something more suggestive of the generalized purpose, like submitPrompt? (And similarly, rename the field with a name that contains topic in the response record.)
/**
 * An AI model that generates topics.
 */
public abstract class AIModel {
In hierarchies like this, it can be useful to define the core interface as an interface, à la:

public interface AIModel {
    String getModelName();
    double getPricePer1kInputTokens();
    double getPricePer1kOutputTokens();
    double getMaxContentLength();
    [...]
}

Then, you can define the "utility" base abstract class as follows:

public abstract class BaseAIModel implements AIModel {
    // implementations of basic functionality
}

Model-specific classes can then extend BaseAIModel:

public class AnthropicClaudeModel extends BaseAIModel {
    // claude-specific implementation
}

What does this buy you? Amongst other things, if you want to implement a new AIModel that works differently than the derivatives of BaseAIModel - say it wraps another AIModel, or combines several AIModels, or whatever - you can simply implement AIModel directly, and not have to deal with all the stuff in BaseAIModel that isn't needed or doesn't relate.
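Putting the pieces above together, here is a compilable sketch of that split, including the kind of wrapper it enables. All names and method signatures besides AIModel are illustrative (submitPrompt echoes the rename floated earlier in this review, and the stubbed return values are placeholders):

```java
import java.util.Optional;

// Core interface: any model, wrapper, or combinator can implement this directly.
interface AIModel {
    String getModelName();
    Optional<String> submitPrompt(String prompt);
}

// "Utility" base class: shared plumbing (pricing, truncation, logging, ...) lives here.
abstract class BaseAIModel implements AIModel {
}

// A model-specific class extends the base.
class AnthropicClaudeModel extends BaseAIModel {
    public String getModelName() {
        return "anthropic.claude-3-haiku";
    }

    public Optional<String> submitPrompt(String prompt) {
        return Optional.of("stubbed response"); // a real impl would call Bedrock here
    }
}

// The payoff: a wrapper combining two models implements AIModel directly,
// without inheriting BaseAIModel machinery it doesn't need.
class FallbackModel implements AIModel {
    private final AIModel primary;
    private final AIModel secondary;

    FallbackModel(AIModel primary, AIModel secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    public String getModelName() {
        return primary.getModelName() + "+" + secondary.getModelName();
    }

    public Optional<String> submitPrompt(String prompt) {
        Optional<String> response = primary.submitPrompt(prompt);
        return response.isPresent() ? response : secondary.submitPrompt(prompt);
    }
}

public class ModelHierarchySketch {
    public static void main(String[] args) {
        AIModel model = new FallbackModel(new AnthropicClaudeModel(), new AnthropicClaudeModel());
        System.out.println(model.getModelName());
        System.out.println(model.submitPrompt("Summarize ...").orElse("none"));
    }
}
```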
trsId,version,descriptorUrl,descriptorChecksum,isTruncated,promptTokens,completionTokens,finishReason,aiTopic
"#workflow/github.com/dockstore-testing/testWorkflow",master,https://raw.githubusercontent.com/dockstore-testing/testWorkflow/master/Dockstore.cwl,07d68a2bce6118b31018c31a013325cda07030efc92750c036df4ab112849c02,true,16283,50,stop,"The workflow starts by inputting Illumina-sequenced ARTIC data and involves a voyeur."
trsId,version,descriptorUrl,descriptorChecksum,isTruncated,promptTokens,completionTokens,cost,finishReason,aiTopic
"#workflow/github.com/dockstore-testing/testWorkflow",master,https://raw.githubusercontent.com/dockstore-testing/testWorkflow/master/Dockstore.cwl,07d68a2bce6118b31018c31a013325cda07030efc92750c036df4ab112849c02,true,16283,50,0,stop,"The workflow starts by inputting Illumina-sequenced ARTIC data and involves a voyeur."
wut (lol)
#497
This was the least offensive way I could trigger one of the obscenity filters for training.
Amusingly, IIRC, I ruled out one of the filters because it found "breast cancer" too offensive, which seemed like it could trip up some of our workflows. Could be a fun conference poster.
Quality Gate failed
Failed conditions
Description
Corresponding webservice PR: dockstore/dockstore#6003
This PR updates the topics generator so that it uses AWS Bedrock to invoke the Anthropic Claude 3 Haiku model to generate topics. See the ticket comment for more details about why we want to do this and some benchmarking information.
Along with Claude 3 Haiku, the program can also invoke the Claude 3.5 Sonnet model, which is more intelligent and more expensive. This is mainly for experimentation or if we find that Claude 3 Haiku isn't doing so well for whatever reason.
I deprecated the OpenAI functionality in the program, but it is still accessible if we wish to experiment with it.
I've included a new CSV file in the results directory for the topics generated by Claude 3 Haiku. If you read the ticket, you'll see that these topics are cleaner than the ones I initially generated with the old prompt. So Claude 3 Haiku appears to be sufficient, but do take a look at the topics.
Review Instructions
(will fill in)
Issue
SEAB-6469
Security
If there are any concerns that require extra attention from the security team, highlight them here.
Please make sure that you've checked the following before submitting your pull request. Thanks!
mvn clean install in the project that you have modified (until https://ucsc-cgl.atlassian.net/browse/SEAB-5300 adds multi-module support properly)