Replies: 2 comments
-
I interpret this as being dependent on having a real interactive model rather than the data-creation tasks. It would require logging the conversation flows between users and a live model, then letting people go back through their own interaction logs and evaluate the results. I think it's a pretty good idea, but it's blocked on an interactive model.
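The log-then-review flow described above could look roughly like this. This is just a sketch: the class, field names, and 1-5 rating scale are illustrative assumptions, not an existing schema in the project.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class LoggedTurn:
    """One user/assistant exchange captured from a live model session.
    (Hypothetical structure for illustration.)"""
    session_id: str
    prompt: str
    reply: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # Filled in later, when the user reviews their own interaction log:
    rating: Optional[int] = None        # e.g. a 1-5 quality score (assumed scale)
    better_reply: Optional[str] = None  # user-supplied improved answer, if any

def review(turn: LoggedTurn, rating: int,
           better_reply: Optional[str] = None) -> LoggedTurn:
    """Attach retrospective feedback to a previously logged turn."""
    turn.rating = rating
    turn.better_reply = better_reply
    return turn
```

The point is that evaluation happens after the fact on the user's own logs, so the live model and the feedback collection can be decoupled.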
-
I don't know enough about this to understand the answer, but I've had another idea regarding it. If the fine-tuning process is not too expensive, there could be multiple model versions, like ChatGPT is doing. That way, people could start using the assistant with the points they gain by collaborating. While this approach would be more expensive because of the multiple fine-tuning runs, it would likely result in more engagement and feedback from users.
-
I would like to use the assistant, and if I don't like an answer it gives, be able to come back to that same answer later once I've found a better one and submit the new answer as feedback. I'd also like to rate the assistant's answers on quality, helpfulness, etc.
Right now, replying as the assistant or classifying the assistant's replies is very difficult because they cover topics I don't know about.
I don't know whether RLHF is an ongoing process or whether all the data has to be collected before it can be used for training. But if it is an ongoing process, I think this approach is best, and more people would be interested in contributing if they could use the assistant.
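For what it's worth, the interleaved collect-and-train idea could be sketched as a buffer of preference pairs that signals when enough new feedback has accumulated to retrain a reward model. Everything here (class name, pair format, retraining threshold) is hypothetical, just to illustrate that collection and training can alternate rather than requiring all data up front:

```python
class PreferenceBuffer:
    """Accumulates (prompt, preferred, rejected) pairs as users submit
    better answers or ratings; signals when enough new pairs have
    arrived to justify another training round. (Illustrative only.)"""

    def __init__(self, retrain_every: int = 1000):
        self.pairs: list[tuple[str, str, str]] = []
        self.retrain_every = retrain_every
        self._since_last_train = 0

    def add(self, prompt: str, preferred: str, rejected: str) -> bool:
        """Record one feedback pair; return True when a retraining
        round should be triggered."""
        self.pairs.append((prompt, preferred, rejected))
        self._since_last_train += 1
        if self._since_last_train >= self.retrain_every:
            self._since_last_train = 0
            return True
        return False
```

Under this scheme, users keep interacting with the current assistant while their feedback accumulates, and the model is periodically retrained on the growing preference dataset.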