ExApp context_chat_backend initialization timed out #36

Closed
xavierg1909 opened this issue Apr 8, 2024 · 29 comments
Labels
bug Something isn't working stale

Comments

@xavierg1909

Describe the bug
ExApp context_chat_backend initialization times out when deployed with one click

To Reproduce
Steps to reproduce the behavior:

  1. Go to External Apps
  2. Search Context Chat Backend
  3. Click on Install and Deploy
  4. See error

Expected behavior
context_chat_backend enabled

Server logs (if applicable)
_nextcloud-aio-nextcloud_logs (2).txt


Context Chat Backend logs (if applicable, from the docker container)
_nc_app_context_chat_backend_logs (5).txt


Screenshots

image

Setup Details (please complete the following information):

  • Nextcloud Version: 28.0.4
  • AppAPI Version: 2.0.1
  • Context Chat PHP Version [e.g. 1.0.0]
  • Context Chat Backend Version 2.0.1

  • Nextcloud deployment method: [eg. Apache bare-metal]
  • Context Chat Backend deployment method: [eg. manual, one-click]


@xavierg1909 xavierg1909 added the bug Something isn't working label Apr 8, 2024
@kyteinsky
Contributor

kyteinsky commented Apr 9, 2024

Hello, that is not a big issue, but it is definitely something to improve. What happens is that the context chat php app tries to index all the users' files while the backend isn't initialised yet, which is why you see the 503 response.
It would help to disable the php app (occ app:disable context_chat) until the backend is installed. It downloads more than 5 GB so please be patient there :)
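
The workaround above can be sketched as follows. This is a minimal sketch that assumes a Nextcloud AIO setup where occ runs inside a container named nextcloud-aio-nextcloud as the www-data user; both names are assumptions based on AIO defaults, and the `occ`, `pause_context_chat` and `resume_context_chat` helpers are hypothetical wrappers, not part of Nextcloud.

```shell
#!/usr/bin/env bash
# Assumption: AIO deployment; the Nextcloud container is called
# nextcloud-aio-nextcloud and occ must run as www-data.
NC_CONTAINER="${NC_CONTAINER:-nextcloud-aio-nextcloud}"

# Hypothetical helper: run an occ command inside the Nextcloud container.
occ() {
    docker exec --user www-data "$NC_CONTAINER" php occ "$@"
}

# Disable the PHP app so it stops sending documents to a backend
# that has not finished initialising.
pause_context_chat() {
    occ app:disable context_chat
}

# Re-enable the PHP app once the backend has downloaded its models.
resume_context_chat() {
    occ app:enable context_chat
}

# Usage:
#   pause_context_chat
#   ... install the backend and wait for the model download ...
#   resume_context_chat
```

On a bare-metal install the `occ` helper would have to call `php occ` directly as the web server user instead of going through `docker exec`.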

How do you arrive at the conclusion that it timed out? Did something happen after this which is not in the logs?

@socialize-IT

How do you arrive at the conclusion that it timed out? Did something happen after this which is not in the logs?

You can see the timeout message in the posted screenshot.

@xavierg1909
Author

xavierg1909 commented Apr 25, 2024

Update: I just did a clean installation using AIO with the new version 29 and no user files. The first thing I did was install context_chat_backend from AppAPI, which is now at version 2.5.0.

Unfortunately the problem persists: the deployment reaches 50% and stays there. I was able to confirm that it downloads the image and deploys the container, but it doesn't work.

  • I used the command occ app_api:app:register context_chat_backend docker_aio --force-scopes, and it registered the app with the daemon.

  • Then I used occ app_api:app:enable, which enabled it (checked with occ app_api:app:list).

  • Finally, I deployed and enabled the assistants, but it still says that the backend is disabled.
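
The manual steps listed above could be scripted roughly like this. It is only a sketch: the occ subcommands and the --force-scopes flag are the ones quoted in this comment (their availability may differ between AppAPI versions), while the container name nextcloud-aio-nextcloud and the `occ` and `register_and_enable` helpers are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Assumption: occ is reachable through the AIO Nextcloud container.
NC_CONTAINER="${NC_CONTAINER:-nextcloud-aio-nextcloud}"
APP_ID="context_chat_backend"
DAEMON="docker_aio"   # deploy daemon name used in the comment above

# Hypothetical helper: run an occ command inside the Nextcloud container.
occ() {
    docker exec --user www-data "$NC_CONTAINER" php occ "$@"
}

# Register the ExApp with the deploy daemon, enable it, then list the
# ExApps to confirm its state. Stops at the first failing step.
register_and_enable() {
    occ app_api:app:register "$APP_ID" "$DAEMON" --force-scopes &&
    occ app_api:app:enable "$APP_ID" &&
    occ app_api:app:list
}
```

Chaining with `&&` keeps a failed registration from being masked by a later command that happens to succeed.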

backend

image

@kyteinsky
Contributor

You can see the timeout message in the posted screenshot.

oops :)

@xavierg1909 Thanks for the update. Would you mind posting the full log of the docker container to see what exactly is going wrong?
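
One way to capture the full container log for attaching to the issue is sketched below. The container name nc_app_context_chat_backend is an assumption inferred from the attached log file names, and `save_backend_log` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Assumption: AppAPI names the container nc_app_<appid>, as the attached
# log file names suggest.
BACKEND_CONTAINER="${BACKEND_CONTAINER:-nc_app_context_chat_backend}"

# Write the container's complete log (stdout and stderr) to a file.
# Argument: output file name (default: context_chat_backend.log).
save_backend_log() {
    local outfile="${1:-context_chat_backend.log}"
    docker logs "$BACKEND_CONTAINER" > "$outfile" 2>&1
}

# Usage:
#   save_backend_log backend.log
```

Redirecting stderr as well matters here because `docker logs` replays the container's stderr stream on stderr, so a plain `>` could miss part of the output.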

@xavierg1909
Author

Hello, inexplicably the AI works today. The strange thing is that the timeout still shows up for context_chat_backend. I am attaching the screenshot and the logs in case they help you fix that. I hope this helps.

image
_nc_app_context_chat_backend_logs.txt
_nextcloud-aio-nextcloud_logs (1).txt

@kyteinsky
Contributor

That is very helpful indeed. Thank you very much!

The models were still downloading when you tried to launch a query with the assistant, and the same goes for the loading of sources. The error messages definitely need improvement, and the context chat php app also needs a check that the backend's initialisation has completed before it spams the backend with documents.

I'll look into AppAPI's reporting of the timeout.

Also, thanks for the logs again, looks like a regression with odfpy being removed 🙈 (it is required to parse ods files).

Will fix all the issues on Monday. Have a nice weekend.


This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label May 27, 2024
@architectonio

I got the context_chat_backend deployed, however it isn't working.
The "ExApps" interface shows “Heartbeat check failed” and “Healthchecking” .... and this after more than 24 hours.
After 12 hours waiting, I restarted Docker and then the whole server. Unfortunately, the status remains the same.
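
To see whether the container itself answers heartbeat requests, independently of the ExApps interface, something like the following could be tried. This sketch assumes the backend listens on port 23001 and serves the AppAPI heartbeat route at /heartbeat; the host name in `BACKEND_URL` and the `wait_for_heartbeat` helper are hypothetical.

```shell
#!/usr/bin/env bash
# Assumptions: the backend listens on port 23001 and serves the AppAPI
# heartbeat route at /heartbeat; the host name below is hypothetical.
BACKEND_URL="${BACKEND_URL:-http://localhost:23001}"

# Poll the heartbeat route until it answers or the attempts run out.
# Arguments: number of attempts (default 10), delay in seconds (default 5).
wait_for_heartbeat() {
    local attempts="${1:-10}" delay="${2:-5}" i
    for ((i = 1; i <= attempts; i++)); do
        if curl -fsS --max-time 5 "$BACKEND_URL/heartbeat" > /dev/null 2>&1; then
            echo "heartbeat ok after $i attempt(s)"
            return 0
        fi
        sleep "$delay"
    done
    echo "no heartbeat after $attempts attempts" >&2
    return 1
}

# Usage:
#   wait_for_heartbeat 12 10
```

If this succeeds from the Docker host but the interface still reports a failed heartbeat, the problem is more likely in how Nextcloud reaches the container than in the backend itself.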

@kyteinsky
Contributor

hello @architectonio, your issue seems similar. Can you provide me the server logs and the docker container logs?

@architectonio

Hello @kyteinsky, the same happens even with the ExApp "Test Deploy".
Which log do you need, "docker_socket_proxy" or "context_chat_backend"?
Please let me know.

PS.
Since the "context_chat_backend" ExApp wasn't working, I removed it. I'll deploy it again if you need its log.

@kyteinsky
Contributor

The context_chat_backend's logs, please.

@architectonio

I am trying to deploy it again, but the deployment process remains at 0%.
The docker_socket_proxy container is up and running, and reachable.
Now I am going to restart the web server, PHP and the docker daemon, hoping that I get the ExApp deployed.

@kyteinsky
Contributor

Do you see any server error logs regarding this?

@architectonio

Here is the log:

Detecting hardware...
Detected hardware: cuda
Config file already exists in the persistent storage ("/nc_app_context_chat_backend_data/config.yaml").
App config:
{
    "debug": true,
    "disable_aaa": false,
    "httpx_verify_ssl": true,
    "use_colors": true,
    "uvicorn_workers": 1,
    "disable_custom_model_download": false,
    "model_download_uri": "https://download.nextcloud.com/server/apps/context_chat_backend",
    "vectordb": [
        "chroma",
        {
            "is_persistent": true
        }
    ],
    "embedding": [
        "instructor",
        {
            "model_name": "hkunlp/instructor-base",
            "model_kwargs": {
                "device": "cuda"
            }
        }
    ],
    "llm": [
        "llama",
        {
            "model_path": "dolphin-2.2.1-mistral-7b.Q5_K_M.gguf",
            "n_batch": 10,
            "n_ctx": 8192,
            "n_gpu_layers": -1,
            "template": "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant, good at finding relevant context from documents to answer questions provided by the user. <|im_end|>\n<|im_start|> user\nUse the following documents as context to answer the question at the end. REMEMBER to excersice source critisicm as the documents are returned by a search provider that can return unrelated documents.\n\nSTART OF CONTEXT: \n{context} \n\nEND OF CONTEXT!\n\nIf you don't know the answer or are unsure, just say that you don't know, don't try to make up an answer. Don't mention the context in your answer but rather just answer the question directly. \nQuestion: {question} Let's think this step-by-step. \n<|im_end|>\n<|im_start|> assistant\n",
            "no_ctx_template": "<|im_start|> system \nYou're an AI assistant named Nextcloud Assistant.<|im_end|>\n<|im_start|> user\n{question}<|im_end|>\n<|im_start|> assistant\n",
            "end_separator": "<|im_end|>",
            "model_kwargs": {
                "device": "cuda"
            }
        }
    ]
}

App disabled at startup
INFO: Started server process [1]
INFO: Waiting for application startup.
TRACE: ASGI [1] Started scope={'type': 'lifespan', 'asgi': {'version': '3.0', 'spec_version': '2.0'}, 'state': {}}
TRACE: ASGI [1] Receive {'type': 'lifespan.startup'}
TRACE: ASGI [1] Send {'type': 'lifespan.startup.complete'}
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:23001 (Press CTRL+C to quit)

@kyteinsky
Contributor

looks perfect. At this point app_api should've called the /init endpoint for model download. I'll check with AppAPI's devs about the issue.
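
A quick way to confirm from the container side whether a request to /init ever arrived is to search its log for that path. This is a rough sketch: the container name nc_app_context_chat_backend is an assumption, as is the expectation that uvicorn logs each incoming request together with its path; `init_was_called` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Assumptions: the container is named nc_app_context_chat_backend and
# uvicorn logs each incoming request with its path.
BACKEND_CONTAINER="${BACKEND_CONTAINER:-nc_app_context_chat_backend}"

# Succeeds when the container log mentions a request to /init.
init_was_called() {
    docker logs "$BACKEND_CONTAINER" 2>&1 | grep -q '/init'
}

# Usage:
#   if init_was_called; then
#       echo "AppAPI reached /init"
#   else
#       echo "no /init request seen in the log"
#   fi
```

For the log posted above, which ends right after "Uvicorn running", such a check would report that no /init request was seen.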

@architectonio

OK. Is there anything I still can do?

@kyteinsky
Contributor

You may manually download the gguf and tar.gz files from here: https://download.nextcloud.com/server/apps/context_chat_backend/

Extract the tar.gz file and place both the extracted folder and the gguf file in the docker volume attached to the context chat's container (mounted at /nc_ something). docker cp should work for this purpose.
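
The download and extraction part could look roughly like this. It is a sketch under the assumption that the files are served directly under the URL above by their file name; the archive name is deliberately left as a placeholder because it is not given in this thread, and `fetch_model_file` and `extract_archive` are hypothetical helpers.

```shell
#!/usr/bin/env bash
BASE_URL="https://download.nextcloud.com/server/apps/context_chat_backend"

# Download one file from the model directory into the current directory.
# Assumption: files are served directly under BASE_URL by their file name.
fetch_model_file() {
    local name="$1"
    curl -fL --output "$name" "$BASE_URL/$name"
}

# Extract a downloaded .tar.gz archive into a staging directory.
# Arguments: archive path, destination directory (default: current dir).
extract_archive() {
    local archive="$1" dest="${2:-.}"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Usage (ARCHIVE_NAME is a placeholder; take the real name from the
# directory listing at BASE_URL):
#   fetch_model_file dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
#   fetch_model_file "$ARCHIVE_NAME"
#   extract_archive "$ARCHIVE_NAME" ./staging
```

Extracting into a separate staging directory first makes it easier to check what the archive contains before anything is copied into the Docker volume.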

@architectonio

OK, I'll try to figure it out and give you feedback.
Thanks a lot.

@architectonio

architectonio commented Jun 7, 2024

Is this the Volume you are referring to?

docker volume inspect nc_app_context_chat_backend_data

[
    {
        "CreatedAt": "2024-05-29T07:19:08+02:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/nc_app_context_chat_backend_data/_data",
        "Name": "nc_app_context_chat_backend_data",
        "Options": null,
        "Scope": "local"
    }
]

And if yes, I just need to proceed as you wrote, by copying the content of the downloaded files into the volume and, I guess, then restarting the container?

@kyteinsky
Contributor

Yes and yes. Remember to extract the tar.gz file before moving it to the volume.
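
If only the host path of the volume is needed, the `--format` option of `docker volume inspect` can print it directly instead of the full JSON. A small sketch, using the volume name from the comment above; `volume_mountpoint` is a hypothetical helper.

```shell
#!/usr/bin/env bash
VOLUME="${VOLUME:-nc_app_context_chat_backend_data}"

# Print only the host path of the volume instead of the full JSON.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$VOLUME"
}

# Usage:
#   ls -l "$(volume_mountpoint)"
```

Reading files under that path usually requires root, since Docker owns the volume directory.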

@kyteinsky
Contributor

By the way, since you said the Test Deploy doesn't work for you, would you mind creating an issue with the relevant details (server and socket proxy logs, and maybe the console logs) and screenshots here: https://github.com/cloud-py-api/app_api/issues ?

@architectonio

architectonio commented Jun 7, 2024

I did it.
nextcloud/app_api#300
Please let me know if the created issue has all information you need

@kyteinsky
Contributor

Thank you, replying there and closing this issue.

@architectonio

architectonio commented Jun 7, 2024

@kyteinsky, me again.
Even after copying the files into the Docker Volume and restarting the Docker daemon, (unfortunately) the issue remains the same.
Here is the content of the volume:
ls -l /var/lib/docker/volumes/nc_app_context_chat_backend_data/_data
total 5011180
-rw-r--r-- 1 root root 3207 May 29 07:19 config.yaml
-rwxr--r-- 1 root root 5131421440 Jun 7 13:14 dolphin-2.2.1-mistral-7b.Q5_K_M.gguf
drwxr-xr-x 4 root root 4096 Dec 21 12:41 hkunlp_instructor-base
drwxr-x--- 2 root root 4096 May 29 07:19 model_files
-rw-r--r-- 1 root root 32 Jun 7 15:13 repair.info
drwxr-x--- 2 root root 4096 May 29 07:19 vector_db_data

Since even the Test Deploy ExApp isn't working well, now I believe that something is wrong with the docker socket proxy.

docker ps
3d02fb1d6b04 ghcr.io/cloud-py-api/nextcloud-appapi-dsp:release "/bin/bash start.sh" 9 days ago Up 9 minutes (healthy) 0.0.0.0:2375->2375/tcp, :::2375->2375/tcp nextcloud-appapi-dsp

docker inspect nextcloud-appapi-dsp

[
{
"Id": "3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e",
"Created": "2024-05-29T05:12:18.405325929Z",
"Path": "/bin/bash",
"Args": [
"start.sh"
],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 1746048,
"ExitCode": 0,
"Error": "",
"StartedAt": "2024-06-07T13:13:09.681232788Z",
"FinishedAt": "2024-06-07T13:13:00.795691168Z",
"Health": {
"Status": "healthy",
"FailingStreak": 0,
"Log": [
{
"Start": "2024-06-07T15:22:28.698304527+02:00",
"End": "2024-06-07T15:22:28.773476851+02:00",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2024-06-07T15:22:38.824927991+02:00",
"End": "2024-06-07T15:22:38.888937822+02:00",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2024-06-07T15:22:48.942928435+02:00",
"End": "2024-06-07T15:22:49.014493303+02:00",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2024-06-07T15:22:59.177293415+02:00",
"End": "2024-06-07T15:22:59.244118844+02:00",
"ExitCode": 0,
"Output": ""
},
{
"Start": "2024-06-07T15:23:09.296206704+02:00",
"End": "2024-06-07T15:23:09.368839036+02:00",
"ExitCode": 0,
"Output": ""
}
]
}
},
"Image": "sha256:b18e1d6b96402528193c61812287c10464db01d47e6f29cdecd91f9745b9453f",
"ResolvConfPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/hostname",
"HostsPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/hosts",
"LogPath": "/var/lib/docker/containers/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e/3d02fb1d6b045ea989702058afc91663aac11aa088022cd32342475d3eeea67e-json.log",
"Name": "/nextcloud-appapi-dsp",
"RestartCount": 0,
"Driver": "overlay2",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "unconfined",
"ExecIDs": null,
"HostConfig": {
"Binds": [
"/var/run/docker.sock:/var/run/docker.sock"
],
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "bridge",
"PortBindings": {
"2375/tcp": [
{
"HostIp": "",
"HostPort": "2375"
}
]
},
"RestartPolicy": {
"Name": "always",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"CapAdd": null,
"CapDrop": null,
"CgroupnsMode": "private",
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "private",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": true,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": [
"label=disable"
],
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": [],
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
"DeviceRequests": null,
"KernelMemory": 0,
"KernelMemoryTCP": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": null,
"OomKillDisable": null,
"PidsLimit": null,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"MaskedPaths": null,
"ReadonlyPaths": null
},
"GraphDriver": {
"Data": {
"LowerDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347-init/diff:/var/lib/docker/overlay2/ac50879b0255f066f84a0b47b3206e028c8af2e8e6503fdeb753b5b237cd68ef/diff:/var/lib/docker/overlay2/591d1b230c4b8d5f2ddd4172ebc91315460942fc4ad9da127af178a577f8c186/diff:/var/lib/docker/overlay2/59df22bf62d3de6be5d4c8130855dfb480783789b20ce8fbc2dff9589a8d6d50/diff:/var/lib/docker/overlay2/c7bf4c9593cb13d483dd3fba00f4ae3315b08624005f0e12229280c262abc020/diff:/var/lib/docker/overlay2/e5f5509fbdfe27b76beee06ad3525dcb41f2d57a4d11d81ff1132f601f2d09da/diff:/var/lib/docker/overlay2/06e8da6f1c2cd34e2abac9cc5da702709fbe93367b12a600b426322ea4f68f44/diff:/var/lib/docker/overlay2/a6fcf681693b8cc6e1945e064ee0ea150f417ed19eb2435bfc66e5e81e7475ad/diff:/var/lib/docker/overlay2/1740c94b57f2ac262c2da7227b98fa3c5d7e23ffb33ca22468aa5c46bdbb931e/diff:/var/lib/docker/overlay2/1f8575e8282f19d381d1e4e88d1de7bbf510047c61b3de2c20040a56b3d458ad/diff:/var/lib/docker/overlay2/fa767248d51a8f064e893d5731c64382eed07123a94e69e05a114c6fb70bf33f/diff",
"MergedDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/merged",
"UpperDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/diff",
"WorkDir": "/var/lib/docker/overlay2/d35b8b6da014cdf53a6df2d2494cafb004c1a4e0b87ad19e720e2c12d9d03347/work"
},
"Name": "overlay2"
},
"Mounts": [
{
"Type": "bind",
"Source": "/var/run/docker.sock",
"Destination": "/var/run/docker.sock",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
],
"Config": {
"Hostname": "nextcloud-appapi-dsp",
"Domainname": "",
"User": "root",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"2375/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"HAPROXY_VERSION=2.9.7",
"HAPROXY_URL=https://www.haproxy.org/download/2.9/src/haproxy-2.9.7.tar.gz",
"HAPROXY_SHA256=d1a0a56f008a8d2f007bc0c37df6b2952520d1f4dde33b8d3802710e5158c131",
"HAPROXY_PORT=2375",
"BIND_ADDRESS=*",
"EX_APPS_NET=localhost",
"EX_APPS_COUNT=30",
"TIMEOUT_CONNECT=10s",
"TIMEOUT_CLIENT=30s",
"TIMEOUT_SERVER=1800s"
],
"Cmd": null,
"Healthcheck": {
"Test": [
"CMD-SHELL",
"/healthcheck.sh"
],
"Interval": 10000000000,
"Timeout": 10000000000,
"Retries": 9
},
"Image": "ghcr.io/cloud-py-api/nextcloud-appapi-dsp:release",
"Volumes": null,
"WorkingDir": "/",
"Entrypoint": [
"/bin/bash",
"start.sh"
],
"OnBuild": null,
"Labels": {
"com.centurylinklabs.watchtower.enable": "false"
},
"StopSignal": "SIGUSR1"
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "98f89015eaaad8d3b92a8a9ce141f27c28595c734dfbe60327423e7bfc23c9ac",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {
"2375/tcp": [
{
"HostIp": "0.0.0.0",
"HostPort": "2375"
},
{
"HostIp": "::",
"HostPort": "2375"
}
]
},
"SandboxKey": "/var/run/docker/netns/98f89015eaaa",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "234dba6a71296e5c51a88d66de5bfacc1d545a09508fddf37bc9b1af36ca1f11",
"Gateway": "172.17.0.1",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "172.17.0.3",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"MacAddress": "02:42:ac:11:00:03",
"Networks": {
"bridge": {
"IPAMConfig": null,
"Links": null,
"Aliases": null,
"NetworkID": "18400b33df6b221a3abcff93e6bbb499aedc186f4ad5dc8c44b90d798464e69f",
"EndpointID": "234dba6a71296e5c51a88d66de5bfacc1d545a09508fddf37bc9b1af36ca1f11",
"Gateway": "172.17.0.1",
"IPAddress": "172.17.0.3",
"IPPrefixLen": 16,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:ac:11:00:03",
"DriverOpts": null
}
}
}
}
]

My Nextcloud instance talks with nextcloud-appapi-dsp through docker.sock (/var/run/docker.sock)
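
Most of the `docker inspect` output above is not needed to judge the proxy. A sketch that prints only the health status, the network mode and the attached network names is shown below; the container name comes from the output above, and `proxy_summary` is a hypothetical helper.

```shell
#!/usr/bin/env bash
PROXY="${PROXY:-nextcloud-appapi-dsp}"

# Print the few fields of `docker inspect` that matter for this problem:
# health status, network mode and the names of the attached networks.
proxy_summary() {
    docker inspect --format \
        'health={{ .State.Health.Status }} mode={{ .HostConfig.NetworkMode }} networks={{ range $name, $cfg := .NetworkSettings.Networks }}{{ $name }} {{ end }}' \
        "$PROXY"
}

# Usage:
#   proxy_summary
```

Running the same helper with `PROXY` set to the Nextcloud container name shows at a glance whether both containers share a network.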

@kyteinsky
Contributor

Ah, my bad: you need to move dolphin-2.2.1-mistral-7b.Q5_K_M.gguf and hkunlp_instructor-base inside the model_files folder. Restarting the container should then enable it by default.
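
The move and restart could be sketched as follows, assuming the two items already sit in the volume root as in the directory listing earlier in this thread. The volume path is the Mountpoint reported by `docker volume inspect`, the container name nc_app_context_chat_backend is an assumption, and `install_models` is a hypothetical helper that needs write access to the volume.

```shell
#!/usr/bin/env bash
# Assumptions: the volume path is the Mountpoint reported by
# `docker volume inspect`, and the container is named
# nc_app_context_chat_backend. Run as a user allowed to write to the volume.
VOLUME_DIR="${VOLUME_DIR:-/var/lib/docker/volumes/nc_app_context_chat_backend_data/_data}"
BACKEND_CONTAINER="${BACKEND_CONTAINER:-nc_app_context_chat_backend}"

# Move the manually downloaded model files from the volume root into
# model_files, then restart the container so it picks them up.
install_models() {
    local item
    for item in dolphin-2.2.1-mistral-7b.Q5_K_M.gguf hkunlp_instructor-base; do
        if [ -e "$VOLUME_DIR/$item" ]; then
            mv "$VOLUME_DIR/$item" "$VOLUME_DIR/model_files/" || return 1
        fi
    done
    docker restart "$BACKEND_CONTAINER"
}

# Usage:
#   install_models
```

The trailing slash on the `mv` target makes the command fail instead of renaming the file if model_files does not exist.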

@architectonio

OK, I did it. I am going to run some tests and then give you feedback.
Thanks again

@architectonio

After a lot of tests, including removing and reinstalling images and containers, and of course copying the files above into the container, unfortunately nothing has changed.
I am going to wait until a working app is released and then give it another try.

@architectonio

architectonio commented Jul 1, 2024

Any news on this?

@m-schmoock

Hi @architectonio
I had the same issue and solved it by putting the Nextcloud container and the AppAPI proxy (which installs the external context_chat_backend) into the same Docker network: create it with docker network create nextcloud, then run both containers with --network=nextcloud.

This is required so that Nextcloud can do a hostname lookup (this won't work on the default bridge network!).
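
For containers that already exist, a comparable result can be reached with `docker network connect` instead of recreating them with --network. The sketch below swaps in that alternative; the network name follows the comment above, the container names in the usage lines are assumptions, and `join_network` is a hypothetical helper.

```shell
#!/usr/bin/env bash
NETWORK="${NETWORK:-nextcloud}"

# Create the user-defined network if it does not exist yet, then attach
# every container given as an argument to it.
join_network() {
    local c
    docker network inspect "$NETWORK" > /dev/null 2>&1 ||
        docker network create "$NETWORK" || return 1
    for c in "$@"; do
        docker network connect "$NETWORK" "$c" || return 1
    done
}

# Usage (container names are assumptions, replace with your own):
#   join_network nextcloud-aio-nextcloud nextcloud-appapi-dsp
```

User-defined networks give containers name resolution for each other, which the default bridge network does not, and that matches the hostname lookup problem described above.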
