improve lipophilicity and benchmarking templates
kjappelbaum committed Aug 13, 2024
1 parent db352d4 commit 113cd7e
Showing 10 changed files with 83 additions and 46 deletions.
48 changes: 24 additions & 24 deletions data/tabular/choline_transporter_butkiewicz/meta.yaml
@@ -62,56 +62,56 @@ templates:
  - The polymer with the {compound_name__names__noun} of {compound_name#} has an {Tg_exp__names__noun} of {Tg_exp#} {Tg_exp__units}.
  - The polymer with the {compound_name__names__noun} of {compound_name#} has a {Tg_calc__names__noun} of {Tg_calc#} {Tg_calc__units}.
  - The polymer with the {compound_name__names__noun} of {compound_name#} has a {rho_300K_calc__names__noun} of {rho_300K_calc#} {rho_300K_calc__units}.
- - What is the {Tg_exp__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI> {Tg_exp#} {Tg_exp__units}.
- - What is the {Tg_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI> {Tg_calc#} {Tg_calc__units}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI> {rho_300K_calc#} {rho_300K_calc__units}.
- - What is the {Tg_exp__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI> {Tg_exp#} {Tg_exp__units}.
- - What is the {Tg_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI> {Tg_calc#} {Tg_calc__units}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI> {rho_300K_calc#} {rho_300K_calc__units}.
+ - What is the {Tg_exp__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI>{Tg_exp#} {Tg_exp__units}.
+ - What is the {Tg_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI>{Tg_calc#} {Tg_calc__units}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}? Answer:<EOI>{rho_300K_calc#} {rho_300K_calc__units}.
+ - What is the {Tg_exp__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI>{Tg_exp#} {Tg_exp__units}.
+ - What is the {Tg_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI>{Tg_calc#} {Tg_calc__units}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}? Answer:<EOI>{rho_300K_calc#} {rho_300K_calc__units}.
  - The polymer with the {PSMILES__description} {PSMILES#} has an {Tg_exp__names__noun} of {Tg_exp#} {Tg_exp__units} and a {Tg_calc__names__noun} of {Tg_calc#} {Tg_calc__units}.
  - The polymer with the {compound_name__names__noun} {compound_name#} has an {Tg_exp__names__noun} of {Tg_exp#} {Tg_exp__units} and a {Tg_calc__names__noun} of {Tg_calc#} {Tg_calc__units}.
- - Compare the {Tg_exp__names__noun} and {Tg_calc__names__noun} for the polymer with the {PSMILES__description} {PSMILES#}. Answer:<EOI> {Tg_exp#} {Tg_exp__units}, {Tg_calc#} {Tg_calc__units}.
- - Compare the {Tg_exp__names__noun} and {Tg_calc__names__noun} for the polymer with the {compound_name__names__noun} {compound_name#}. Answer:<EOI> {Tg_exp#} {Tg_exp__units}, {Tg_calc#} {Tg_calc__units}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} at 300K? Answer:<EOI> {rho_300K_calc#} {rho_300K_calc__units}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} at 300K? Answer:<EOI> {rho_300K_calc#} {rho_300K_calc__units}.
- - What is the {Tg_exp__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in Kelvin? Answer:<EOI> {Tg_exp#}.
- - What is the {Tg_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in Kelvin? Answer:<EOI> {Tg_calc#}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in g/cm^3? Answer:<EOI> {rho_300K_calc#}.
- - What is the {Tg_exp__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in Kelvin? Answer:<EOI> {Tg_exp#}.
- - What is the {Tg_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in Kelvin? Answer:<EOI> {Tg_calc#}.
- - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in g/cm^3? Answer:<EOI> {rho_300K_calc#}.
+ - Compare the {Tg_exp__names__noun} and {Tg_calc__names__noun} for the polymer with the {PSMILES__description} {PSMILES#}. Answer:<EOI>{Tg_exp#} {Tg_exp__units}, {Tg_calc#} {Tg_calc__units}.
+ - Compare the {Tg_exp__names__noun} and {Tg_calc__names__noun} for the polymer with the {compound_name__names__noun} {compound_name#}. Answer:<EOI>{Tg_exp#} {Tg_exp__units}, {Tg_calc#} {Tg_calc__units}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} at 300K? Answer:<EOI>{rho_300K_calc#} {rho_300K_calc__units}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} at 300K? Answer:<EOI>{rho_300K_calc#} {rho_300K_calc__units}.
+ - What is the {Tg_exp__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in Kelvin? Answer:<EOI>{Tg_exp#}.
+ - What is the {Tg_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in Kelvin? Answer:<EOI>{Tg_calc#}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#} in g/cm^3? Answer:<EOI>{rho_300K_calc#}.
+ - What is the {Tg_exp__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in Kelvin? Answer:<EOI>{Tg_exp#}.
+ - What is the {Tg_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in Kelvin? Answer:<EOI>{Tg_calc#}.
+ - What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#} in g/cm^3? Answer:<EOI>{rho_300K_calc#}.
  - The polymer with the {PSMILES__description} {PSMILES#} has an {Tg_exp__names__noun} of {Tg_exp#} {Tg_exp__units} and a {rho_300K_calc__names__noun} of {rho_300K_calc#} {rho_300K_calc__units}.
  - The polymer with the {compound_name__names__noun} {compound_name#} has an {Tg_exp__names__noun} of {Tg_exp#} {Tg_exp__units} and a {rho_300K_calc__names__noun} of {rho_300K_calc#} {rho_300K_calc__units}.
- - Compare the {Tg_exp__names__noun} and {rho_300K_calc__names__noun} for the polymer with the {PSMILES__description} {PSMILES#}. Answer:<EOI> {Tg_exp#} {Tg_exp__units}, {rho_300K_calc#} {rho_300K_calc__units}.
- - Compare the {Tg_exp__names__noun} and {rho_300K_calc__names__noun} for the polymer with the {compound_name__names__noun} {compound_name#}. Answer:<EOI> {Tg_exp#} {Tg_exp__units}, {rho_300K_calc#} {rho_300K_calc__units}.
+ - Compare the {Tg_exp__names__noun} and {rho_300K_calc__names__noun} for the polymer with the {PSMILES__description} {PSMILES#}. Answer:<EOI>{Tg_exp#} {Tg_exp__units}, {rho_300K_calc#} {rho_300K_calc__units}.
+ - Compare the {Tg_exp__names__noun} and {rho_300K_calc__names__noun} for the polymer with the {compound_name__names__noun} {compound_name#}. Answer:<EOI>{Tg_exp#} {Tg_exp__units}, {rho_300K_calc#} {rho_300K_calc__units}.
  - |-
  Question: What is the {Tg_exp__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {Tg_exp%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
  - Question: What is the {Tg_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {Tg_calc%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
  - Question: What is the {rho_300K_calc__names__noun} of the polymer with the {PSMILES__description} {PSMILES#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {rho_300K_calc%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
  - Question: What is the {Tg_exp__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {Tg_exp%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
  - Question: What is the {Tg_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {Tg_calc%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
  - Question: What is the {rho_300K_calc__names__noun} of the polymer with the {compound_name__names__noun} {compound_name#}?
  Constraint: You must pick either {%multiple_choice_enum%3%aA1} without using any other words.
  Options:
  {rho_300K_calc%}
- Answer:<EOI> {%multiple_choice_result}
+ Answer:<EOI>{%multiple_choice_result}
39 changes: 38 additions & 1 deletion data/tabular/lipophilicity/meta.yaml
@@ -47,7 +47,7 @@ templates:
  {exp%}
  Answer: {%multiple_choice_result}
  - |-
- Question: Please estimate the {exp__names__noun} of {SMILES#} by picking one choice of {%multiple_choice_enum%3-6%aA1}.
+ Question: Please {#estimate|guess|predict|provide!} the {exp__names__noun} of {SMILES#} by picking one choice of {%multiple_choice_enum%3-6%aA1}.
  Options:
  {exp%}
  Answer: {%multiple_choice_result}
@@ -57,3 +57,40 @@ templates:
  Options:
  {exp%}
  Answer:<EOI>{%multiple_choice_result}
+ - |-
+ Question: What is the {exp__names__noun} for the {#molecule|chemical|compound!} represented by the {SMILES__description} {SMILES#}?
+ Answer:<EOI>{exp}
+ - |-
+ Task: Determine the {exp__names__noun} for the given {SMILES__description}.
+ Molecule: {SMILES#}
+ Answer:<EOI>{exp}
+ - |-
+ Task: Please {#estimate|guess|predict|provide!} the {exp__names__noun} for the following {SMILES__description}.
+ Molecule: {SMILES#}
+ Answer:<EOI>{exp}
+ - |-
+ Question: What is the experimental {exp__names__noun} for the molecule with the {SMILES__description} {SMILES#}?
+ Answer:<EOI>{exp}
+ - |-
+ Task: Identify the {exp__names__noun} for the given {#molecule|chemical|compound!}.
+ Molecule: {SMILES#}
+ Answer:<EOI>{exp}
+ - |-
+ Task: Please select the correct {exp__names__noun} for the {#molecule|chemical|compound!} represented by the {SMILES__description} {SMILES#}.
+ Options:
+ {exp%}
+ Answer:<EOI>{%multiple_choice_result}
+ - |-
+ Task: {#Estimate|Guess|Predict|Provide!} the {exp__names__noun} for the {#molecule|chemical|compound!} with the {SMILES__description} {SMILES#}.
+ Answer:<EOI>{exp}
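Note on the `{#a|b|c!}` groups introduced in the lipophilicity templates: they appear to mark a set of interchangeable phrasings, one of which is picked when a prompt is sampled. The following is a minimal sketch of that expansion under this assumption; `expand_choices` and the `CHOICE` pattern are hypothetical names for illustration and are not taken from the repository's own sampler.

```python
import random
import re

# Hypothetical pattern for the `{#a|b|c!}` choice groups seen in the templates.
# Plain placeholders such as `{SMILES#}` or `{exp__names__noun}` do not match.
CHOICE = re.compile(r"\{#([^{}]*?)!\}")


def expand_choices(template: str, rng: random.Random) -> str:
    """Replace every `{#a|b|c!}` group with one randomly picked option."""
    return CHOICE.sub(lambda m: rng.choice(m.group(1).split("|")), template)


template = (
    "Task: {#Estimate|Guess|Predict|Provide!} the {exp__names__noun} "
    "for the {#molecule|chemical|compound!}."
)
result = expand_choices(template, random.Random(0))
print(result)
```

An empty option, as in `{# representation|!}`, simply expands to an empty string in this sketch, which matches how the templates use it to make a word optional.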
6 changes: 3 additions & 3 deletions data/tabular/melting_points/meta.yaml
@@ -92,17 +92,17 @@ templates:
  Compound: {NAME#}
- Result:<EOI> {mp#} {mp__units}
+ Result:<EOI>{mp#} {mp__units}
  - |-
  Task: Please estimate the {mp_names__noun} of a compound.
  {SMILES__description}: {SMILES#}
- Result:<EOI> {mp#} {mp__units}
+ Result:<EOI>{mp#} {mp__units}
  - |-
  Question: What is the {mp_names__noun} of a compound with the {SMILES__description} {SMILES#} in {mp__units}?
- Answer:<EOI> {mp#}
+ Answer:<EOI>{mp#}
  - |-
  Question: Which molecule has a {mp_names__noun} of {mp#} {mp__units}?
  Pick {%multiple_choice_enum%3%aA1}.
8 changes: 4 additions & 4 deletions data/tabular/mol_repr_transl/transform.py
@@ -46,18 +46,18 @@
"""User: Can you {#tell me|create|generate!} the {IDENTIFIER__names__noun} of the molecule with the {TARGET__names__noun} {TARGET#}?
Assistant: {#Yes|Of course|Sure|Yes, I'm happy to help!}, this molecule has a {IDENTIFIER__names__noun} of {IDENTIFIER#}.""", # noqa: E501
  # Benchmarking text templates
- "The molecule with the {IDENTIFIER__names__noun} {#representation of |!}{IDENTIFIER#} can also be represented with the {TARGET__names__noun}{# representation|!}:<EOI> {TARGET#}.", # noqa: E501
- "The molecule with the {TARGET__names__noun} {#representation of |!}{TARGET#} can also be represented with the {IDENTIFIER__names__noun}{# representation|!}:<EOI> {IDENTIFIER#}.", # noqa: E501
+ "The molecule with the {IDENTIFIER__names__noun} {#representation of |!}{IDENTIFIER#} can also be represented with the {TARGET__names__noun}{# representation|!}:<EOI>{TARGET#}.", # noqa: E501
+ "The molecule with the {TARGET__names__noun} {#representation of |!}{TARGET#} can also be represented with the {IDENTIFIER__names__noun}{# representation|!}:<EOI>{IDENTIFIER#}.", # noqa: E501
  """Task: Please {#create|generate!} a molecule representation based on {#the input molecule representation and |!}the description.
  Description: {#Generate|Create!} the {TARGET__names__noun} from the {IDENTIFIER__names__noun}.
  {#Molecule |!}{IDENTIFIER__names__noun}: {IDENTIFIER#}
  Constraint: Even if you are {#uncertain|not sure!}, you must answer with a representation without using any {#other|additional!} words.
- Result:<EOI> {TARGET#}""", # noqa: E501
+ Result:<EOI>{TARGET#}""", # noqa: E501
  """Task: Please {#create|generate!} a molecule representation based on {#the input molecule representation and |!}the description.
  Description: {#Generate|Create!} the {IDENTIFIER__names__noun} from the {TARGET__names__noun}.
  {#Molecule |!}{TARGET__names__noun}: {TARGET#}
  Constraint: Even if you are {#uncertain|not sure!}, you must answer with a representation without using any {#other|additional!} words.
- Result:<EOI> {IDENTIFIER#}""", # noqa: E501
+ Result:<EOI>{IDENTIFIER#}""", # noqa: E501
  ],
  }

4 changes: 2 additions & 2 deletions data/tabular/uniprot_binding_single/meta.yaml
@@ -59,9 +59,9 @@ templates:
  Task: {#Find|Identify|Come up with!} a binding site in the {#AA sequence|amino acid sequence|peptide sequence|protein!} for the {#molecule|chemical|compound!}.
  {#AA sequence|Amino acid sequence|Peptide sequence|Protein!}: {sequence#}
  {SMILES__description}{# representation|!}: {SMILES#}
- {#Output|Result!}:<EOI> {start_binding_site#}
+ {#Output|Result!}:<EOI>{start_binding_site#}
  - |-
  Task: {#Create|Design|Come up with!} a {#molecule|chemical|compound!} that binds to the given {#binding site|site|position|!} in the {#AA sequence|amino acid sequence|peptide sequence|protein!}.
  {#AA sequence|Amino acid sequence|Peptide sequence|Protein!}: {sequence#}
  Binding site{# position|!}: {start_binding_site#}
- {#Output|Result!}:<EOI> {SMILES#}
+ {#Output|Result!}:<EOI>{SMILES#}
4 changes: 2 additions & 2 deletions data/tabular/uniprot_binding_sites_multiple/meta.yaml
@@ -64,9 +64,9 @@ templates:
  Task: {#Find|Identify|Come up with!} a binding site in the {#AA sequence|amino acid sequence|peptide sequence|protein!} for the {#molecule|chemical|compound!}.
  {#AA sequence|Amino acid sequence|Peptide sequence|Protein!}: {sequence#}
  {SMILES__description}{# representation|!}: {SMILES#}
- {#Output|Result!}:<EOI> {start_binding_site#}-{end_binding_site#}
+ {#Output|Result!}:<EOI>{start_binding_site#}-{end_binding_site#}
  - |-
  Task: {#Create|Design|Come up with!} a {#molecule|chemical|compound!} that binds to the given {#binding site|site|position|!} in the {#AA sequence|amino acid sequence|peptide sequence|protein!}.
  {#AA sequence|Amino acid sequence|Peptide sequence|Protein!}: {sequence#}
  Binding site{# position|!}: {start_binding_site#}{#-| to !}{end_binding_site#}
- {#Output|Result!}:<EOI> {SMILES#}
+ {#Output|Result!}:<EOI>{SMILES#}
2 changes: 1 addition & 1 deletion data/tabular/uniprot_organisms/meta.yaml
@@ -43,4 +43,4 @@ templates:
  - |-
  Task: {#Predict|Identify!} the organism in which {#the below|this!} {#protein|amino acid sequence|AA sequence|polypeptide!} can be found.
  {#Amino acid sequence|Sequence|AA sequence!}: {other#}
- Result:<EOI> {organisms#}
+ Result:<EOI>{organisms#}
4 changes: 2 additions & 2 deletions data/tabular/uniprot_reactions/meta.yaml
@@ -48,8 +48,8 @@ templates:
  - |-
  Task: {#Predict|Identify!} a {#biochemical |chemical |!}reaction that can be catalyzed by {#this|the following!} {#protein|amino acid sequence|AA sequence|polypeptide!}.
  {#Amino acid sequence |Sequence|AA sequence!}: {other#}
- Result:<EOI> {reactions#}
+ Result:<EOI>{reactions#}
  - |-
  Task: {#Generate|Create|Come up with|Design!} a {#protein|amino acid sequence|AA sequence|polypeptide!} that can catalyze {#a|this!} specific {#biochemical |chemical |!}reaction.
  Reaction: {reactions#}
- {#Output|Result!}:<EOI> {other#}
+ {#Output|Result!}:<EOI>{other#}
4 changes: 2 additions & 2 deletions data/tabular/uniprot_sentences/meta.yaml
@@ -48,8 +48,8 @@ templates:
  - |-
  Task: {#Generate|Create|Come up with!} a description {#of a few sentences |!}for the {#protein|amino acid sequence|AA sequence|polypeptide!}{# below|!}.
  {#Protein|Amino acid sequence|AA sequence|Polypeptide!}: {sequence#}
- {#Output|Result!}:<EOI> {sentences#}
+ {#Output|Result!}:<EOI>{sentences#}
  - |-
  Task: {#Generate|Create|Come up with!} a {#protein|amino acid sequence|AA sequence|polypeptide!} based on the description.
  Description: {sentences#}
- {#Output|Result!}:<EOI> {sequence#}
+ {#Output|Result!}:<EOI>{sequence#}
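Note on the recurring change: outside the lipophilicity additions, every modified line drops the single space that followed the `<EOI>` marker. The commit does not say how the edit was made; the sketch below only illustrates the same normalization as a regex substitution, and `strip_space_after_eoi` is a hypothetical helper name.

```python
import re

# Horizontal whitespace directly after the <EOI> marker (assumed to be the
# only thing that needs normalizing; newlines are left alone).
EOI_GAP = re.compile(r"(<EOI>)[ \t]+")


def strip_space_after_eoi(text: str) -> str:
    """Remove spaces or tabs that immediately follow `<EOI>`."""
    return EOI_GAP.sub(r"\1", text)


before = "Result:<EOI> {mp#} {mp__units}"
after = strip_space_after_eoi(before)
print(after)  # Result:<EOI>{mp#} {mp__units}
```

Lines that already have no space after the marker pass through unchanged, so applying the helper twice gives the same result as applying it once.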