
petals models local support using python env #435

Merged: 62 commits into bit-gpt:main on Nov 4, 2023

Conversation

@biswaroop1547 (Contributor) commented Oct 27, 2023

Before merge, a few issues need to be taken care of:

@biswaroop1547 marked this pull request as ready for review on October 31, 2023 11:04
@casperdcl (Contributor) commented Nov 2, 2023

just merged #440; lemme know if you need any help rebasing @biswaroop1547 :)

@tiero (Contributor) commented Nov 2, 2023

Using the latest commit I downloaded Stable Beluga; once I click Open it hangs and never really seems to start:

src/controller_binaries.rs:97 2023-11-03T00:50:41 [INFO] - serve_command: setup-petals.sh --model-id petals-team/StableBeluga2 --model-path . --dht-prefix StableBeluga2-hf --port 8734
src/controller_binaries.rs:102 2023-11-03T00:50:41 [INFO] - binary_path: "/Users/tiero/Library/Application Support/io.premai.prem-app/models/stable-beluga-2/setup-petals.sh"
src/controller_binaries.rs:122 2023-11-03T00:50:41 [INFO] - args: ["--model-id", "petals-team/StableBeluga2", "--model-path", ".", "--dht-prefix", "StableBeluga2-hf", "--port", "8734"]
src/controller_binaries.rs:166 2023-11-03T00:50:42 [ERROR] - Failed to send request: error sending request for url (http://localhost:8734/v1): error trying to connect: tcp connect error: Connection refused (os error 61)
src/controller_binaries.rs:166 2023-11-03T00:50:43 [ERROR] - Failed to send request: error sending request for url (http://localhost:8734/v1): error trying to connect: tcp connect error: Connection refused (os error 61)
src/controller_binaries.rs:166 2023-11-03T00:50:43 [ERROR] - Failed to send request: error sending request for url (http://localhost:8734/v1): error trying to connect: tcp connect error: Connection refused (os error 61)
src/controller_binaries.rs:166 2023-11-03T00:50:44 [ERROR] - Failed to send request: error sending request for url (http://localhost:8734/v1): error trying to connect: tcp connect error: Connection refused (os error 61)
[TRUNCATED]

@biswaroop1547 (Contributor, Author) commented Nov 3, 2023

@tiero it actually takes around 30 seconds to start up the model server. To check whether the server is up after that, you can also run lsof -i :8734

@biswaroop1547 (Contributor, Author) commented Nov 3, 2023

@tiero (Contributor) commented Nov 3, 2023

We should have a timeout; I waited more than 5 minutes and it kept hanging for me. What can I do to debug?
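One way to picture the suggested timeout (a minimal Python sketch; the actual controller is Rust, so this only illustrates the idea, not the implementation in this PR): poll the same http://localhost:8734/v1 endpoint seen in the logs above and give up after an overall deadline instead of retrying forever.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str = "http://localhost:8734/v1",
                    deadline_s: float = 300.0,
                    interval_s: float = 2.0) -> bool:
    """Return True once `url` answers, False if `deadline_s` elapses first."""
    start = time.monotonic()
    while time.monotonic() - start < deadline_s:
        try:
            with urllib.request.urlopen(url, timeout=5):
                return True                     # got an HTTP response: server is up
        except urllib.error.HTTPError:
            return True                         # server answered, even with an error status
        except (urllib.error.URLError, OSError):
            time.sleep(interval_s)              # connection refused etc.: not up yet, retry
    return False


if __name__ == "__main__":
    if not wait_for_server():
        raise SystemExit("petals server did not come up before the deadline")
```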

@biswaroop1547 (Contributor, Author)

@tiero that's weird, because it shouldn't take more than 30 seconds if you've run the swarm before (that creates the Python env, which is reused for petals). If you're starting fresh it takes around 3 minutes, since it also installs and sets up the Python environment before starting the server (currently we don't show any message while the env is being set up after you click "Open"). To debug, can you remove ~/.config/prem once and try again? It should take 2-3 minutes for the server to start.
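A rough sketch of that reuse-if-present flow (the env location under ~/.config/prem and the use of venv/pip are assumptions for illustration; the real logic lives in setup-petals.sh and may differ):

```python
import subprocess
import sys
from pathlib import Path

ENV_DIR = Path.home() / ".config" / "prem" / "python-env"   # assumed location


def ensure_python_env() -> Path:
    """Create the shared Python env on first use; later runs reuse it."""
    python = ENV_DIR / "bin" / "python"
    if not python.exists():
        # First run: takes a few minutes, since the env is created and petals installed.
        ENV_DIR.parent.mkdir(parents=True, exist_ok=True)
        subprocess.run([sys.executable, "-m", "venv", str(ENV_DIR)], check=True)
        subprocess.run([str(ENV_DIR / "bin" / "pip"), "install", "petals"], check=True)
    # Subsequent runs skip straight to starting the model server (~30 s).
    return python
```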

@filopedraz (Contributor)

The problem is not in the env creation. Logs attached here:

(screenshot of the logs)

@casperdcl (Contributor) commented Nov 3, 2023

btw I guess the dev registry needs to be reverted back to v1 before this can be merged, right?

@filopedraz (Contributor)

Yes, correct @casperdcl, it's just for testing. The service actually works, but it takes a huge amount of time: after the health request succeeded, it took another 60 seconds to load the chat screen.

(screenshot attached)

@biswaroop1547 (Contributor, Author) commented Nov 3, 2023

@filopedraz yeah, it takes longer if it's creating the env from scratch, but if the env is already present it takes between 30 seconds and 1 minute. Do we want to show some kind of message while this is happening? (It's mentioned as one of the issues/todos in this PR description.)

This is the actual time it takes to load the model into memory after starting the server; open to ideas on how we can reduce it though 🙏🏻
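For context, that load window roughly corresponds to instantiating a Petals client for the model from the logs above. This uses the public petals API and is only an illustration; the server started by setup-petals.sh may do this differently:

```python
import time

from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

MODEL_ID = "petals-team/StableBeluga2"

t0 = time.monotonic()
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Loads the embedding / LM-head weights locally and joins the swarm;
# this is the slow step being discussed.
model = AutoDistributedModelForCausalLM.from_pretrained(MODEL_ID)
print(f"model ready after {time.monotonic() - t0:.1f}s")
```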

@filopedraz (Contributor)

Good for now. I am more worried about the time between the toast and the loading of the chat; I don't know what causes it. It happens to me with Mistral too, actually.

@filopedraz (Contributor)

The download doesn't even start now. Here's a Loom.

@filopedraz (Contributor) commented Nov 4, 2023

Now it seems to work. I also rebased onto main, but there seems to be an issue with the generation:

(two screenshots of the generated output)

Seems related to a special token.
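Not necessarily what the follow-up issue will conclude, but as an illustration of how a special token can leak into chat output: if raw generated ids are decoded without filtering, tokens such as </s> show up in the text, and transformers' decode() can strip them.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("petals-team/StableBeluga2")

ids = tokenizer("Hello there", return_tensors="pt")["input_ids"][0].tolist()
ids.append(tokenizer.eos_token_id)  # simulate the model emitting its end-of-sequence token

print(tokenizer.decode(ids))                            # special tokens visible in the text
print(tokenizer.decode(ids, skip_special_tokens=True))  # special tokens stripped
```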

@filopedraz (Contributor)

The PR looks good and it works well for me. I created a new issue here for the generation problem.

@casperdcl (Contributor)

I suggest we squash-merge because it's ultimately quite small & not worth rebasing/preserving history

@filopedraz merged commit 4c35880 into bit-gpt:main on Nov 4, 2023
1 check passed
Labels: enhancement (New feature or request)
6 participants