gitlab backup restore
- backup:
- restore:
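For backup and restore, the GitLab version on the two instances (the one backed up and the one restored to) must be exactly the same. Check it with: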
gitlab-rake gitlab:env:info
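The backup file name itself carries the GitLab version (for example `1493107454_2017_04_25_9.1.0_gitlab_backup.tar`), so a quick pre-check can compare it with the version of the target instance. A minimal sketch, assuming the `<timestamp>_<yyyy>_<mm>_<dd>_<version>_gitlab_backup.tar` naming seen in this document; `backup_version` and `same_version` are hypothetical helper names, not GitLab commands:

```shell
# Extract the version field from a backup file name.
backup_version() {
    name=${1##*/}                      # drop any directory part
    name=${name%_gitlab_backup.tar}    # drop the fixed suffix
    echo "${name#*_*_*_*_}"            # drop "<timestamp>_<yyyy>_<mm>_<dd>_"
}

# Succeed only when the backup's version equals the given installed version.
# The installed version is passed in by hand (read it from `gitlab-rake gitlab:env:info`).
same_version() {
    [ "$(backup_version "$1")" = "$2" ]
}

# Usage:
# same_version 1493107454_2017_04_25_9.1.0_gitlab_backup.tar 9.1.0 && echo "versions match"
```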
Besides the application data, the backup must also cover the config and secret data.
ref: use STRATEGY copy
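# or you will get a "too many open files" error
# note: -u raises the max-user-processes limit; the open-files limit itself is controlled by `ulimit -n`
ulimit -u unlimited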
sudo -u git gitlab-rake gitlab:backup:create STRATEGY=copy
Back up the configuration under /etc/gitlab:
pushd /var/opt/gitlab/backups
umask 0077
tar -cf $(date "+etc-gitlab-%s_%Y_%m_%d.tar") -C / etc/gitlab
popd
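Because the config archive is what carries the secrets, it may be worth confirming that it really contains them before relying on it. A minimal sketch, assuming an Omnibus install where `/etc/gitlab` holds `gitlab.rb` and `gitlab-secrets.json`; `check_config_archive` is a hypothetical helper name:

```shell
# Verify that a config archive contains the main config and the secrets file.
check_config_archive() {
    archive=$1
    listing=$(tar -tf "$archive") || return 1
    for f in etc/gitlab/gitlab.rb etc/gitlab/gitlab-secrets.json; do
        # -x: whole-line match, -F: treat the path as a fixed string
        printf '%s\n' "$listing" | grep -qxF "$f" || {
            echo "missing: $f" >&2
            return 1
        }
    done
    echo "ok: $archive"
}

# Usage:
# check_config_archive /var/opt/gitlab/backups/etc-gitlab-1504750531_2017_09_07.tar
```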
ref: https://docs.gitlab.com/ce/raketasks/backup_restore.html#restore-for-omnibus-installations
Place the backup file in the backup directory:
cp 1493107454_2017_04_25_9.1.0_gitlab_backup.tar /var/opt/gitlab/backups/
Stop the services that connect to the database:
gitlab-ctl stop unicorn
gitlab-ctl stop sidekiq
gitlab-ctl status
Restore:
gitlab-rake gitlab:backup:restore BACKUP=1493107454_2017_04_25_9.1.0 force=yes
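Note that the `BACKUP=` value is the file name without its `_gitlab_backup.tar` suffix. A small sketch that derives it, so the two cannot drift apart; `backup_id` is a hypothetical helper name:

```shell
# Print the BACKUP= identifier for a backup file, or fail on an unexpected name.
backup_id() {
    name=${1##*/}    # drop any directory part
    case $name in
        *_gitlab_backup.tar)
            echo "${name%_gitlab_backup.tar}"
            ;;
        *)
            echo "not a GitLab backup file name: $name" >&2
            return 1
            ;;
    esac
}

# Usage:
# gitlab-rake gitlab:backup:restore BACKUP="$(backup_id /var/opt/gitlab/backups/1493107454_2017_04_25_9.1.0_gitlab_backup.tar)" force=yes
```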
Start the services and check integrity:
gitlab-ctl start
gitlab-rake gitlab:check SANITIZE=true
Back up the existing configuration, then overwrite it with the backed-up data:
mv /etc/gitlab /etc/gitlab.$(date "+%s_%Y_%m_%d")
tar -xf etc-gitlab-1504750531_2017_09_07.tar -C /
After changing the configuration data, remember to reconfigure:
gitlab-ctl reconfigure
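The data-restore part of the steps above (stop, restore, start, check) can be collected into one function that stops at the first failure. This is only a sketch of the sequence from this document, not an official procedure; the `DRY_RUN` variable and the `run` and `restore_backup` names are hypothetical, and the config restore plus `gitlab-ctl reconfigure` are left as manual steps:

```shell
# Run a command, or only print it when DRY_RUN is non-empty.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop the database clients, restore, start everything again, then check.
# $1 is the BACKUP= identifier, e.g. 1493107454_2017_04_25_9.1.0
restore_backup() {
    id=$1
    run gitlab-ctl stop unicorn &&
    run gitlab-ctl stop sidekiq &&
    run gitlab-rake gitlab:backup:restore "BACKUP=$id" force=yes &&
    run gitlab-ctl start &&
    run gitlab-rake gitlab:check SANITIZE=true
}

# Usage (prints the commands without running them):
# DRY_RUN=1 restore_backup 1493107454_2017_04_25_9.1.0
```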
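Approach:
- Delete rows in projects whose namespace_id does not exist in namespaces
- Delete rows in routes with source_type='Project' whose source_id does not exist in projects
- Delete rows in routes with source_type='Namespace' whose source_id does not exist in namespaces
Operation log from 2017-09-05 10:14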
select (select count(*) from projects) as n_pro, (select count(*) from routes where source_type='Project') as n_rou_pro, (select count(*) from namespaces) as n_name, (select count(*) from routes where source_type='Namespace') as n_rou_name, (select count(*) from routes) as n_rou;
5739 | 5739 | 4755 | 4769 | 10508
The projects table and the Project routes match in count (5739 each); routes holds 14 more Namespace rows than there are namespaces (4769 vs 4755).
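The counts can be cross-checked with a little arithmetic. The numbers below are copied from the query result above; the variable names follow the column aliases in that query. A count difference only hints at orphaned rows; the id-based queries are what actually identify them.

```shell
# Counts from the query result above.
n_pro=5739
n_rou_pro=5739
n_name=4755
n_rou_name=4769
n_rou=10508

# Routes of each type minus the rows they should point at.
extra_project_routes=$((n_rou_pro - n_pro))
extra_namespace_routes=$((n_rou_name - n_name))
# Routes that are neither Project nor Namespace routes.
other_routes=$((n_rou - n_rou_pro - n_rou_name))

echo "extra project routes:   $extra_project_routes"
echo "extra namespace routes: $extra_namespace_routes"
echo "routes of other types:  $other_routes"
```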
- Delete rows in projects whose namespace_id does not exist in namespaces
- Delete rows in routes with source_type='Project' whose source_id does not exist in projects
select * from projects where namespace_id not in (select id from namespaces);
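There are currently 7 such rows; they are effectively dead data.
Delete them from both the routes table and the projects table.
Delete from routes first (the subquery relies on the orphaned projects rows still being present):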
select * from routes where source_type='Project' and source_id in (select id from projects where namespace_id not in (select id from namespaces));
-- delete with caution
delete from routes where source_type='Project' and source_id in (select id from projects where namespace_id not in (select id from namespaces));
Delete from projects:
select * from projects where namespace_id not in (select id from namespaces);
-- delete with caution
delete from projects where namespace_id not in (select id from namespaces);
- Delete rows in routes with source_type='Namespace' whose source_id does not exist in namespaces
select * from routes where source_type='Namespace' and source_id not in (select id from namespaces);
There are 14 such rows in total; they are effectively dead data.
-- delete with caution
delete from routes where source_type='Namespace' and source_id not in (select id from namespaces);
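Since these deletes are irreversible, one cautious option is to rehearse them inside a transaction that rolls back by default and only commit once the reported row counts look right. A minimal sketch: `cleanup_sql` is a hypothetical helper that only prints SQL; the statements are the three deletes from this section, in the same order.

```shell
# Print the cleanup statements wrapped in a single transaction.
# "commit" as first argument makes the deletes permanent; anything else rolls back.
cleanup_sql() {
    if [ "${1:-}" = "commit" ]; then
        end=COMMIT
    else
        end=ROLLBACK
    fi
    cat <<EOF
BEGIN;
delete from routes where source_type='Project' and source_id in (select id from projects where namespace_id not in (select id from namespaces));
delete from projects where namespace_id not in (select id from namespaces);
delete from routes where source_type='Namespace' and source_id not in (select id from namespaces);
$end;
EOF
}

# Rehearsal (psql prints "DELETE <n>" per statement, then the changes are rolled back):
# cleanup_sql | sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql/ gitlabhq_production
# For real:
# cleanup_sql commit | sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql/ gitlabhq_production
```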
log below
Unpacking backup ... done
Before restoring the database we recommend removing all existing
tables to avoid future upgrade problems. Be aware that if you have
custom tables in the GitLab database these tables and all data will be
removed.
Do you want to continue (yes/no)? Removing all tables. Press `Ctrl-C` within 5 seconds to abort
Cleaning the database ...
done
Restoring database ...
Restoring PostgreSQL database gitlabhq_production ... SET
SET
SET
SET
SET
SET
SET
SET
SET
ERROR: relation "public.timelogs" does not exist
ERROR: relation "public.timelogs" does not exist
ERROR: relation "public.merge_requests_closing_issues" does not exist
ERROR: relation "public.protected_tag_create_access_levels" does not exist
ERROR: relation "public.label_priorities" does not exist
ERROR: relation "public.merge_request_metrics" does not exist
ERROR: relation "public.label_priorities" does not exist
ERROR: relation "public.subscriptions" does not exist
ERROR: relation "public.labels" does not exist
ERROR: relation "public.u2f_registrations" does not exist
ERROR: relation "public.protected_tag_create_access_levels" does not exist
ERROR: relation "public.web_hook_logs" does not exist
ERROR: relation "public.merge_request_diff_files" does not exist
ERROR: relation "public.issue_metrics" does not exist
ERROR: relation "public.merge_requests_closing_issues" does not exist
ERROR: relation "public.chat_teams" does not exist
ERROR: relation "public.merge_request_metrics" does not exist
ERROR: relation "public.container_repositories" does not exist
ERROR: relation "public.merge_request_diff_commits" does not exist
ERROR: relation "public.protected_tag_create_access_levels" does not exist
.....
.....
ref: https://gitlab.com/gitlab-org/gitlab-ce/issues/30755
The cause: all tables are dropped before the restore, so the statements in the dump that drop relations and indexes fail with these errors.
In sum, the warnings above are just noise because we clean the database before the restore, and the SQL restore doesn't know that.
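If the restore output is saved to a file, the expected noise can be filtered out so that any other error stands out. A minimal sketch, assuming the output was captured in a hypothetical `restore.log`; `unexpected_errors` is a hypothetical helper name:

```shell
# Print ERROR lines that are NOT of the 'relation "..." does not exist' kind.
# Exit status is non-zero when no such line is found.
unexpected_errors() {
    grep '^ERROR' "$1" | grep -v 'relation ".*" does not exist'
}

# Usage:
# unexpected_errors restore.log || echo "only the known noise was found"
```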
ref: https://gitlab.com/gitlab-org/gitlab-ce/issues/16343
this will rebuild authorized_keys file, you will lose data stored in authorized_keys file, want to continue?
choose yes!
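The git user's home directory is /var/opt/gitlab. It contains a .ssh/authorized_keys file, which holds the public keys that users have added.
When we back up from server-master and restore on server-slave, the restore may overwrite server-slave's authorized_keys. From a restore point of view this is reasonable, since it keeps the two consistent.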
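If the existing keys on the restore target matter, the file can be copied aside before answering yes. A minimal sketch; `save_authorized_keys` is a hypothetical helper name, and the timestamp suffix simply mirrors the one this document uses for /etc/gitlab:

```shell
# Copy a file next to itself with a timestamp suffix and print the copy's path.
# Defaults to the authorized_keys location of the git user.
save_authorized_keys() {
    src=${1:-/var/opt/gitlab/.ssh/authorized_keys}
    dst="$src.$(date "+%s_%Y_%m_%d")"
    cp -p "$src" "$dst" && echo "$dst"
}

# Usage (run as a user that can read the file, e.g. via sudo):
# save_authorized_keys
```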
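GitLab is essentially a tool for managing git repos. Its data consists mainly of the git repos, the database that holds account and repository information, and some static assets that are not managed by git.
A backup in effect asks each of these parts to back itself up, then bundles the results into a single tar file.
- The database is dumped with pg_dump
- Git repos and static assets are copied as files
Restore is the reverse of backup:
- Import the database
- Import the user repositories according to the database information
- Restore the static assets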
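To see which parts a given backup actually bundles, the top-level entries of the tar file can be listed without extracting it. A minimal sketch; `backup_parts` is a hypothetical helper name, and the only entry this document confirms is `db/` (holding `database.sql.gz`):

```shell
# Print the distinct top-level entries of a backup tar file, one per line.
backup_parts() {
    tar -tf "$1" |
        sed 's|^\./||' |   # archives packed from "." prefix every entry with "./"
        cut -d/ -f1 |      # keep only the first path component
        grep -v '^$' |     # drop the empty line left by the "./" entry itself
        sort -u
}

# Usage:
# backup_parts /var/opt/gitlab/backups/1503976223_2017_08_29_9.4.3_gitlab_backup.tar
```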
GitLab runs the backup/restore process through a rake script:
path: /opt/gitlab/embedded/service/gitlab-rails/lib/tasks/gitlab/backup.rake
It uses the Backup module, which is defined in:
path: /opt/gitlab/embedded/service/gitlab-rails/lib/backup/*.rb
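The PostgreSQL database bundled with GitLab runs locally and does not accept external connections; it can only be reached from inside the host through its socket file.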
ref: https://stackoverflow.com/questions/30282149/direct-access-to-to-gitlab-database
The key is the -h option, which points to a local path (the socket directory):
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql/ gitlabhq_production
The second way:
gitlab-rails dbconsole
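For repeated queries, the long socket-based command can be wrapped in a small function that passes extra arguments straight through to psql. A minimal sketch; `gitlab_psql` is a hypothetical helper name, and the paths are the ones used in this document:

```shell
# Run the bundled psql against the GitLab database over the local socket.
# Any extra arguments (for example -c '<sql>') are handed to psql unchanged.
gitlab_psql() {
    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql \
        -h /var/opt/gitlab/postgresql/ gitlabhq_production "$@"
}

# Usage:
# gitlab_psql -c 'select count(*) from projects;'
```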
Once you can connect to the database locally, backing up or restoring the database on its own is straightforward.
ref:
- https://www.postgresql.org/docs/9.5/static/backup-dump.html
- https://docs.gitlab.com/omnibus/settings/database.html
Back up the database. Note the -c option, which makes the dump drop each table before recreating it:
sudo -u gitlab-psql /opt/gitlab/embedded/bin/pg_dump -c -h /var/opt/gitlab/postgresql/ gitlabhq_production | gzip > /tmp/dump.sql.gz
Restore the database:
gunzip -c /tmp/dump.sql.gz | sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql/ gitlabhq_production
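One caveat with the `pg_dump ... | gzip > file` form: the pipeline's exit status is gzip's, so a failed pg_dump can still leave a valid but empty .gz file behind. A minimal sketch of a sanity check to run before feeding the dump to psql; `verify_dump` is a hypothetical helper name:

```shell
# Succeed only if the file is an intact gzip archive with non-empty content.
verify_dump() {
    gzip -t "$1" &&                   # archive integrity
        gunzip -c "$1" | grep -q .    # at least one non-empty line inside
}

# Usage:
# verify_dump /tmp/dump.sql.gz && echo "dump looks usable"
```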
See the makefile at the end.
Typical usage:
gitlab-rake gitlab:backup:create
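After the backup, a tar file containing all the backed-up data is generated in the /var/opt/gitlab/backups directory.
This file can be moved to a test environment to experiment with the restore.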
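Prerequisites
- We are in a fresh GitLab test environment whose version is exactly the same as the environment the backup came from
- The tar file is already in the /var/opt/gitlab/backups directory
- The tar file is owned by git:git
- The makefile is also in the /var/opt/gitlab/backups directory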
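The file-related prerequisites can be checked in one go before running any make target. A minimal sketch; `check_prereqs` is a hypothetical helper name, and the ownership is only displayed for manual inspection rather than enforced:

```shell
# Check that the backup tar file and a makefile are present in the backup directory.
# $1 = backup directory, $2 = tar file name. Returns non-zero if anything is missing.
check_prereqs() {
    dir=$1
    tarfile=$2
    status=0
    if [ -f "$dir/$tarfile" ]; then
        ls -l "$dir/$tarfile"    # owner and group should read git git
    else
        echo "missing: $dir/$tarfile" >&2
        status=1
    fi
    if [ ! -f "$dir/makefile" ] && [ ! -f "$dir/Makefile" ]; then
        echo "missing: makefile in $dir" >&2
        status=1
    fi
    return $status
}

# Usage:
# check_prereqs /var/opt/gitlab/backups 1503976223_2017_08_29_9.4.3_gitlab_backup.tar
```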
Keep a copy of the tar file:
make bak
Unpack the tar file to see what data it contains. The data is extracted into the extract directory; the database is in extract/db:
make unpack
After modifying the files and the database, pack them back up to serve as the restore file:
make pack
Restore GitLab (strictly speaking, the sidekiq and unicorn services should be stopped first; that is skipped here):
make restore
Check integrity:
make sanitize
If the tar file has been modified beyond recognition, bring back the original:
make origin
TARFILE=1503976223_2017_08_29_9.4.3_gitlab_backup.tar
# Recipe lines must start with a tab. Variables are written $(TARFILE):
# a bare $TARFILE would be read by make as $(T) followed by "ARFILE".
all:
	echo "select your action..."
bak:
	cp -a $(TARFILE) $(TARFILE).bak
unpack:
	mkdir -p extract
	tar -xvf $(TARFILE) -C extract/
	(cd extract/db; sudo -u git gunzip -c database.sql.gz > database.sql)
pack:
	(cd extract/db; sudo -u git gzip -c database.sql > database.sql.gz)
	(cd extract; sudo -u git tar -cvf ../$(TARFILE) .)
restore:
	gitlab-rake gitlab:backup:restore BACKUP=1503976223_2017_08_29_9.4.3
origin:
	cp -a $(TARFILE).bak $(TARFILE)
sanitize:
	gitlab-rake gitlab:check SANITIZE=true
dumpsql:
	sudo -u gitlab-psql /opt/gitlab/embedded/bin/pg_dump -c -h /var/opt/gitlab/postgresql/ gitlabhq_production | gzip > /tmp/dump.sql.gz
restoresql:
	gunzip -c /tmp/dump.sql.gz | sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql/ gitlabhq_production