SiriDB errors on Docker Swarm with Glusterfs filesystem #46

Open
gianmarco-mameli opened this issue Dec 18, 2020 · 8 comments

Comments

@gianmarco-mameli

gianmarco-mameli commented Dec 18, 2020

Hi Nico640, first of all, thank you for your work getting UNMS to run on the ARM architecture.
I set up a 3-node Raspberry Pi cluster running Docker Swarm, with GlusterFS persistent storage and a Traefik reverse proxy. Everything works like a charm if I bind-mount the /config folder on the Raspberry Pi's local storage, but I need to put the mount on the GlusterFS mount path so that the storage persists when the container moves to a different node.
In the log, these lines are repeated every two or three seconds.

[W 2020-12-18 20:52:40] Database directory not found, creating directory '/config/siridb/'.,
[E 2020-12-18 20:52:40] Cannot create directory '/config/siridb/'.,
[W 2020-12-18 20:52:40] Closing SiriDB Server (version: 2.0.34)

Any idea of what's happening?

Thanks in advance
Gianmarco

@Nico640
Owner

Nico640 commented Dec 19, 2020

Hmm, not sure if this is a permission or a SiriDB issue. What happens if you manually create the /config/siridb directory? Does anything other than SiriDB show any errors?
You could also check if you get any different behaviour when using the nico640/docker-unms:testing image, as it uses a more recent SiriDB version.
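
One way to separate a permission problem from a filesystem problem is to repeat the failing operations by hand on the mount. The following is only a minimal sketch: `probe_directory` is a hypothetical helper, not part of the image or of SiriDB, and the choice of operations (mkdir, file write, symlink) is an assumption based on the errors reported here.

```python
import errno
import os
import shutil
import uuid


def _attempt(action):
    """Run `action` and return "ok" or the symbolic errno name (e.g. "EACCES")."""
    try:
        action()
        return "ok"
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc))


def probe_directory(base):
    """Try mkdir, a file write, and a symlink under `base`; report each result."""
    probe = os.path.join(base, "probe-" + uuid.uuid4().hex)
    target = os.path.join(probe, "file.txt")
    link = os.path.join(probe, "link")

    def write_file():
        with open(target, "w") as handle:
            handle.write("probe")

    def make_and_read_link():
        os.symlink(target, link)
        with open(link) as handle:
            handle.read()

    results = {"mkdir": _attempt(lambda: os.mkdir(probe))}
    if results["mkdir"] != "ok":
        return results  # nothing else can be tested without the directory
    try:
        results["write"] = _attempt(write_file)
        results["symlink"] = _attempt(make_and_read_link)
    finally:
        # Remove the probe directory so the volume is left unchanged.
        shutil.rmtree(probe, ignore_errors=True)
    return results
```

Calling `probe_directory("/config")` inside the container, once with the local bind mount and once with the GlusterFS mount, would show which step differs. A permission problem would typically surface as `EACCES`, while other error names point towards the filesystem itself.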

@gianmarco-mameli
Author

Other errors I found in the docker log:

In Kernel.php line 765:,
Unable to create the "logs" directory (/usr/src/ucrm/app/logs). ,
Makefile:12: recipe for target 'server_with_migrate' failed,
make: *** [server_with_migrate] Error 1,
2020/12/20 23:28:25 [error] 1621#1621: *1 connect() failed (111: Connection refused) while connecting to upstream, client: 10.0.1.3, server: , request: "GET // HTTP/1.1", upstream: "http://127.0.0.1:8082//", host: "unms.xxxxxx",
FATAL: role "root" does not exist,

I tried creating, deleting, and recreating the folder, with the container both stopped and running, but nothing changed.

The testing image seems to work; I will run some tests on it over the next few days.
Is it possible that the problem is related to the symbolic link on the siridb folder?

Thanks

@Nico640
Owner

Nico640 commented Dec 21, 2020

I don't think it's a SiriDB-specific issue, as UCRM also seems to have issues creating / accessing directories, but yes, it could be a symbolic link issue. Not sure why it works on the testing image though, as there are too many changes in the testing branch (e.g. change of the base image from Debian 9 to Alpine Linux 3.12).

@gianmarco-mameli
Author

Hi Nico,
I seem to have found the problem: the cluster nodes run Raspbian OS, and I had not checked whether the GlusterFS version in the official repository was outdated; it was 5.5. I configured the buster-backports repository, which contains GlusterFS version 8, then reinstalled and reconfigured the storage, and now the UNMS container seems to be working fine.
With the old version of GlusterFS, I found a "stale file handle" error in the log, and a Google search suggests it is related to some old bugs.
Thanks for your time
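
The version check described above can be sketched as a small script. This is an illustration under assumptions: the helper names are hypothetical, the input is assumed to be the text printed by `glusterfs --version` with a first line of the form `glusterfs 5.5`, and the default threshold of 8 is only the version that worked in this thread, not a documented minimum.

```python
import re


def parse_gluster_version(version_output):
    """Extract (major, minor) from text such as 'glusterfs 5.5'."""
    match = re.search(r"glusterfs\s+(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no GlusterFS version found in: %r" % version_output)
    major = int(match.group(1))
    minor = int(match.group(2) or 0)  # a missing minor part counts as 0
    return (major, minor)


def is_at_least(version_output, minimum=(8, 0)):
    """Return True if the parsed version is >= `minimum` (assumed threshold)."""
    return parse_gluster_version(version_output) >= minimum
```

Running this against the version output of every node would flag any node still on the older package before the storage is reconfigured.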

@gianmarco-mameli
Author

Hi Nico, sometimes the Docker container's CPU usage spikes. From the logs, it seems the CRM module writes many files or log entries to storage, and the GlusterFS problem is not solved after all. It seems to work well if I disable the CRM module in the settings. Do you know how to completely disable the CRM module in the container?

@mwinters-stuff

I have made a version of the Docker image that removes CRM, so that it's not continuously creating processes to check whether CRM is enabled (CRM didn't work for me and was not needed).
https://github.com/mwinters-stuff/docker-unms-no-ucrm

@Nico640
Owner

Nico640 commented Jan 20, 2021

Hmm, I once had an issue with high CPU usage and a frozen container. It was caused by the /config volume becoming temporarily unreadable because the network share that stored the volume was unreachable for a short period, which caused symbolic link loops inside the container. Not sure if that's what was happening here. Did this happen using the latest or the testing image?

If you want to completely disable the CRM module, you could either remove it from the container as @mwinters-stuff did, or try disabling only the service for the CRM module (/etc/services.d/ucrm). Note that both approaches will break the UNMS backup functionality, as UNMS will be unable to communicate with the CRM module. Cron jobs like logrotate will also not run.
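
To check whether symbolic link loops like the ones mentioned above are present, the directory tree can be scanned for links that fail to resolve with `ELOOP`. This is a minimal sketch with a hypothetical helper name. It only catches links whose own resolution loops (for example two links pointing at each other); a link that points back to a parent directory is not flagged.

```python
import errno
import os


def find_symlink_loops(root):
    """Return the sorted paths of symlinks under `root` that resolve with ELOOP."""
    loops = []
    # followlinks=False keeps the walk itself from descending through symlinks.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            try:
                os.stat(path)  # follows the link chain to its final target
            except OSError as exc:
                if exc.errno == errno.ELOOP:
                    loops.append(path)
    return sorted(loops)
```

Running `find_symlink_loops("/config")` inside the container while the CPU spike is happening would show whether this is the same failure mode.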

@gianmarco-mameli
Author

Unfortunately the problems are the same. Today the Docker node where UNMS runs spiked on CPU, making other containers unreachable and disconnecting the GlusterFS storage from the node. The top command reported CPU spikes at 300%, and only a complete restart of all three swarm nodes solved the problem. A few days ago I modified the Docker Compose file of UNMS to limit the CPU to 50%, but nothing changed. The next step is to check whether removing CRM changes anything.
Thanks
