fix: the rp are inconsistent before and after the data is migrated (#17)
Signed-off-by: shilinlee <836160610@qq.com>
shilinlee authored Dec 11, 2023
1 parent 268cd1d commit ae90caf
Showing 12 changed files with 275 additions and 152 deletions.
121 changes: 96 additions & 25 deletions README.md
@@ -9,11 +9,11 @@ The dataMigrate directly reads data from the TSM file of InfluxDB and writes the

### requirements

Go version >1.16
Go version > 1.16

Setting Environment Variables

```
```bash
> export GOPATH=/path/to/dir
> export GO111MODULE=on
> export GONOSUMDB=*
@@ -22,39 +22,110 @@ Setting Environment Variables

### compile

```
```bash
> bash build.sh
```

### data migration
## data migration

Before migrating, create the corresponding database and retention policy (RP) in openGemini. (This requirement may be removed in a future release.)

```bash
> dataMigrate run --from dir/to/influxdb/data --to ip:port --database dbname
```
> dataMigrate --from path/to/tsm-file --to ip:port --database dbname

**WARNING**: If possible, shut down InfluxDB before migrating; migrating from a running instance may cause
unknown problems. To keep the migrated data as complete as possible, stop all writes before shutting down
InfluxDB and wait for the data in the cache to be flushed to disk (10 minutes by default).
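
The prerequisite above (creating the database and RP in openGemini before migrating) could be scripted roughly as follows. This is a minimal sketch, assuming openGemini accepts InfluxQL `CREATE DATABASE` and `CREATE RETENTION POLICY` statements on an InfluxDB 1.x style `/query` endpoint. The helper names `setupStatements` and `queryURL`, and the `INF` duration and replication factor, are illustrative and not part of this repository. The sketch only builds the requests; it does not send them.

```go
package main

import (
	"fmt"
	"net/url"
)

// setupStatements returns the InfluxQL statements that create a database and
// one retention policy on it. Duration and replication are illustrative values.
func setupStatements(db, rp, duration string) []string {
	return []string{
		fmt.Sprintf("CREATE DATABASE %q", db),
		fmt.Sprintf("CREATE RETENTION POLICY %q ON %q DURATION %s REPLICATION 1", rp, db, duration),
	}
}

// queryURL builds a /query URL that carries one statement in the "q" parameter.
// The endpoint shape is an assumption based on the InfluxDB 1.x HTTP API.
func queryURL(addr, stmt string) string {
	q := url.Values{}
	q.Set("q", stmt)
	u := url.URL{Scheme: "http", Host: addr, Path: "/query", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Print the requests that would need to be issued (for example with an
	// HTTP POST) before running the migration for database "db0".
	for _, stmt := range setupStatements("db0", "autogen", "INF") {
		fmt.Println(queryURL("127.0.0.1:8086", stmt))
	}
}
```

The RP name should match the one used on the InfluxDB side (the second path component in shard names such as `db0/autogen/2`), since that is the RP the migrated points are written to.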


### example 1: Migrate all databases

Example InfluxDB data directory: `/var/lib/influxdb/data`

```bash
> ls -l /var/lib/influxdb/data
total 0
drwx------ 4 root root 128B 12 6 14:58 _internal
drwx------ 4 root root 128B 12 6 14:59 db0
drwx------ 4 root root 128B 12 8 09:01 db1
```

Migrate the `_internal` database:

```bash
> ./dataMigrate run --from /var/lib/influxdb/data --to ip:port --database _internal

2023/12/08 14:17:48 Data migrate tool starting
2023/12/08 14:17:48 Debug mode is enabled
2023/12/08 14:17:48 Searching for tsm files to migrate
2023/12/08 14:17:48 Writing out data from shard _internal/monitor/1, [2/4]...
2023/12/08 14:17:48 Writing out data from shard db0/autogen/2, [4/4]...
2023/12/08 14:17:48 Writing out data from shard _internal/monitor/3, [3/4]...
2023/12/08 14:17:48 Writing out data from shard db1/autogen/5, [1/4]...
2023/12/08 14:17:48 Dealing file: /Users/shilinlee/.influxdb/data/_internal/monitor/1/000000001-000000001.tsm
2023/12/08 14:17:48 Dealing file: /Users/shilinlee/.influxdb/data/_internal/monitor/3/000000001-000000001.tsm
2023/12/08 14:17:48 Dealing file: /Users/shilinlee/.influxdb/data/db1/autogen/5/000000001-000000001.tsm
2023/12/08 14:17:48 Dealing file: /Users/shilinlee/.influxdb/data/db0/autogen/2/000000001-000000001.tsm
2023/12/08 14:17:48 Shard db0/autogen/2 takes 1.703084ms to migrate, with 1 tags, 2 fields, 2 rows read
2023/12/08 14:17:48 Shard db1/autogen/5 takes 2.076959ms to migrate, with 5 tags, 1 fields, 3 rows read
2023/12/08 14:17:48 Shard _internal/monitor/1 takes 467.09275ms to migrate, with 49 tags, 115 fields, 34098 rows read
2023/12/08 14:17:48 Shard _internal/monitor/3 takes 475.290791ms to migrate, with 49 tags, 115 fields, 22443 rows read
2023/12/08 14:17:48 Total: takes 477.482791ms to migrate, with 54 tags, 118 fields, 56546 rows read.
```
Usage: dataMigrate [flags]

-database string
Optional: the database to read
-end string
Optional: the end time to read (RFC3339 format)
-from string
Data storage path (default "/var/lib/Influxdb/data")
-retention string
Optional: the retention policy to read (requires -database)
-start string
Optional: the start time to read (RFC3339 format)
-batch int
Optional: specify batch size for inserting lines (default 1000)
-mode string
Optional: whether to enable debug log or not (set as "Debug" to enable it)
-to string
Destination host to write data to (default "127.0.0.1:8086",which is the openGemini service default address)
### example 2: Migrate the specified database

```bash
> ./dataMigrate run --from /var/lib/influxdb/data --to ip:port --database db0

2023/12/08 14:31:47 Data migrate tool starting
2023/12/08 14:31:47 Debug mode is enabled
2023/12/08 14:31:47 Searching for tsm files to migrate
2023/12/08 14:31:47 Writing out data from shard db0/autogen/2, [1/1]...
2023/12/08 14:31:47 Dealing file: /Users/shilinlee/.influxdb/data/db0/autogen/2/000000001-000000001.tsm
2023/12/08 14:31:47 Shard db0/autogen/2 takes 45.883209ms to migrate, with 1 tags, 2 fields, 2 rows read
2023/12/08 14:31:47 Total: takes 48.502792ms to migrate, with 1 tags, 2 fields, 2 rows read.
```

**Notice**: When using this tool, please do not migrate data without shutting down InfluxDB if possible; otherwise, some
unknown problems may occur. To ensure that data is as complete as possible after migration, keep the empty write load
running before shutting down InfluxDB and wait for data in the cache to complete disk dumping (10 minutes by default).
### example 3: Migrate the specified database with auth and https

```bash
> ./dataMigrate run --from /var/lib/influxdb/data --to ip:port --database db0 \
--ssl --unsafeSsl --username rwusr --password This@123

2023/12/08 14:31:47 Data migrate tool starting
2023/12/08 14:31:47 Debug mode is enabled
2023/12/08 14:31:47 Searching for tsm files to migrate
2023/12/08 14:31:47 Writing out data from shard db0/autogen/2, [1/1]...
2023/12/08 14:31:47 Dealing file: /Users/shilinlee/.influxdb/data/db0/autogen/2/000000001-000000001.tsm
2023/12/08 14:31:47 Shard db0/autogen/2 takes 45.883209ms to migrate, with 1 tags, 2 fields, 2 rows read
2023/12/08 14:31:47 Total: takes 48.502792ms to migrate, with 1 tags, 2 fields, 2 rows read.
```


## For more help

```bash
Reads TSM files into InfluxDB line protocol format and write into openGemini

Usage:
  run [flags]

Flags:
      --batch int          Optional: specify batch size for inserting lines (default 1000)
      --database string    Optional: the database to read
      --debug              Optional: whether to enable debug log or not
      --end string         Optional: the end time to read (RFC3339 format)
  -f, --from string        Influxdb Data storage path. See your influxdb config item: data.dir (default "/var/lib/influxdb/data")
  -h, --help               help for run
  -p, --password string    Optional: The password to connect to the openGemini cluster.
      --retention string   Optional: the retention policy to read (required -database)
      --ssl                Optional: Use https for requests.
      --start string       Optional: the start time to read (RFC3339 format)
  -t, --to string          Destination host to write data to (default "127.0.0.1:8086")
      --unsafeSsl          Optional: Set this when connecting to the cluster using https and not use SSL verification.
  -u, --username string    Optional: The username to connect to the openGemini cluster.
```

**Contributions of new features are welcome.**
44 changes: 44 additions & 0 deletions cmd/root.go
@@ -0,0 +1,44 @@
package cmd

import (
	"github.com/openGemini/dataMigrate/src"
	"github.com/spf13/cobra"
)

var (
	RootCmd *cobra.Command // represents the cluster command
	opt     src.DataMigrateOptions
)

func init() {
	RootCmd = &cobra.Command{
		Use:           "run",
		Short:         "Reads TSM files into InfluxDB line protocol format and write into openGemini",
		SilenceUsage:  true,
		SilenceErrors: true,
		RunE: func(cmd *cobra.Command, args []string) error {
			migrateCmd := src.NewDataMigrateCommand(&opt)
			if err := migrateCmd.Run(); err != nil {
				return err
			}
			return nil
		},
	}

	RootCmd.Flags().StringVarP(&opt.Username, "username", "u", "", "Optional: The username to connect to the openGemini cluster.")
	RootCmd.Flags().StringVarP(&opt.Password, "password", "p", "", "Optional: The password to connect to the openGemini cluster.")
	RootCmd.Flags().StringVarP(&opt.DataDir, "from", "f", "/var/lib/influxdb/data", "Influxdb Data storage path. See your influxdb config item: data.dir")
	RootCmd.Flags().StringVarP(&opt.Out, "to", "t", "127.0.0.1:8086", "Destination host to write data to")
	RootCmd.Flags().StringVarP(&opt.Database, "database", "", "", "Optional: the database to read")
	RootCmd.Flags().StringVarP(&opt.RetentionPolicy, "retention", "", "", "Optional: the retention policy to read (required -database)")
	RootCmd.Flags().StringVarP(&opt.Start, "start", "", "", "Optional: the start time to read (RFC3339 format)")
	RootCmd.Flags().StringVarP(&opt.End, "end", "", "", "Optional: the end time to read (RFC3339 format)")
	RootCmd.Flags().IntVarP(&opt.BatchSize, "batch", "", 1000, "Optional: specify batch size for inserting lines")
	RootCmd.Flags().BoolVarP(&opt.Debug, "debug", "", false, "Optional: whether to enable debug log or not")
	RootCmd.Flags().BoolVarP(&opt.Ssl, "ssl", "", false, "Optional: Use https for requests.")
	RootCmd.Flags().BoolVarP(&opt.UnsafeSsl, "unsafeSsl", "", false, "Optional: Set this when connecting to the cluster using https and not use SSL verification.")
}

func Execute() error {
	return RootCmd.Execute()
}
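
The flag descriptions above state two constraints: `--retention` requires `--database`, and `--start`/`--end` are RFC3339 timestamps. A minimal sketch of how such constraints could be checked before the migration starts is shown below. The `options` struct and `validate` function are hypothetical and only mirror a few of the flags; the diff does not show how (or whether) the tool performs these checks.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// options is a hypothetical stand-in for the subset of flags being validated.
type options struct {
	Database        string
	RetentionPolicy string
	Start           string
	End             string
}

// validate enforces the constraints described in the flag help text.
func validate(o options) error {
	if o.RetentionPolicy != "" && o.Database == "" {
		return errors.New("--retention requires --database")
	}
	var start, end time.Time
	var err error
	if o.Start != "" {
		if start, err = time.Parse(time.RFC3339, o.Start); err != nil {
			return fmt.Errorf("invalid --start: %w", err)
		}
	}
	if o.End != "" {
		if end, err = time.Parse(time.RFC3339, o.End); err != nil {
			return fmt.Errorf("invalid --end: %w", err)
		}
	}
	// Rejecting an inverted range is an extra check assumed here for illustration.
	if o.Start != "" && o.End != "" && end.Before(start) {
		return errors.New("--end is before --start")
	}
	return nil
}

func main() {
	fmt.Println(validate(options{RetentionPolicy: "autogen"}))
	fmt.Println(validate(options{
		Database: "db0",
		Start:    "2023-12-01T00:00:00Z",
		End:      "2023-12-08T00:00:00Z",
	}))
}
```

With cobra, a check like this would typically run at the top of `RunE`, so that an invalid combination is reported before any TSM file is opened.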
Binary file added dataMigrate
Binary file not shown.
1 change: 1 addition & 0 deletions go.mod
@@ -7,6 +7,7 @@ require (
github.com/influxdata/influxdb v1.8.0
github.com/influxdata/influxdb1-client v0.0.0-20200827194710-b269163b24ab
github.com/pkg/errors v0.8.1
github.com/spf13/cobra v0.0.3
go.uber.org/atomic v1.3.2
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e
golang.org/x/sys v0.8.0 // indirect
3 changes: 3 additions & 0 deletions go.sum
@@ -114,6 +114,7 @@ github.com/gopherjs/gopherjs v0.0.0-20181017120253-0766667cb4d1/go.mod h1:wJfORR
github.com/hashicorp/golang-lru v0.5.0/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/hashicorp/golang-lru v0.5.1/go.mod h1:/m3WP610KZHVQ1SGc6re/UDhFvYD7pJ4Ao+sR/qLZy8=
github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/inconshreveable/mousetrap v1.0.0 h1:Z8tu5sraLXCXIcARxBp/8cbvlwVa7Z1NHg9XEKhtSvM=
github.com/inconshreveable/mousetrap v1.0.0/go.mod h1:PxqpIevigyE2G7u3NXJIT2ANytuPF1OarO4DADm73n8=
github.com/influxdata/flux v0.65.0 h1:57tk1Oo4gpGIDbV12vUAPCMtLtThhaXzub1XRIuqv6A=
github.com/influxdata/flux v0.65.0/go.mod h1:BwN2XG2lMszOoquQaFdPET8FRQfrXiZsWmcMO9rkaVY=
@@ -216,7 +217,9 @@ github.com/smartystreets/goconvey v1.6.4/go.mod h1:syvi0/a8iFYH4r/RixwvyeAJjdLS9
github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72 h1:qLC7fQah7D6K1B0ujays3HV9gkFtllcxhzImRR7ArPQ=
github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
github.com/spf13/cast v1.3.0/go.mod h1:Qx5cxh0v+4UWYiBimWS+eyWzqEqokIECu5etghLkUJE=
github.com/spf13/cobra v0.0.3 h1:ZlrZ4XsMRm04Fr5pSFxBgfND2EBVa1nLpiy1stUsX/8=
github.com/spf13/cobra v0.0.3/go.mod h1:1l0Ry5zgKvJasoi3XT1TypsSe7PqH0Sj9dhYf7v3XqQ=
github.com/spf13/pflag v1.0.3 h1:zPAT6CGy6wXeQ7NtTnaTerfKOsV6V6F8agHXFiazDkg=
github.com/spf13/pflag v1.0.3/go.mod h1:DYY7MBk1bdzusC3SYhjObp+wFpr4gzcvqqNjLnInEg4=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.1.1/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
18 changes: 4 additions & 14 deletions main.go
@@ -15,26 +15,16 @@ limitations under the License.
package main

import (
"os"

"github.com/openGemini/dataMigrate/cmd"
"github.com/openGemini/dataMigrate/src"
"os"
)

func main() {
defer src.Logger.Close()
src.Logger.LogString("Data migrate tool starting", src.TOCONSOLE, src.LEVEL_INFO)
if err := Run(os.Args[1:]...); err != nil {
err := cmd.Execute()
if err != nil {
src.Logger.LogError(err)
os.Exit(1)
}
}

func Run(args ...string) error {
if len(args) > 0 {
cmd := src.NewDataMigrateCommand()
if err := cmd.Run(args...); err != nil {
return err
}
}
return nil
}
5 changes: 3 additions & 2 deletions src/cursor.go
@@ -341,8 +341,9 @@ func (s *Scanner) writeBatches(c client.Client, cmd Migrator) error {
for {
if flag {
bp, _ = client.NewBatchPoints(client.BatchPointsConfig{
Database: cmd.getDatabase(),
Precision: "ns",
Database: cmd.getDatabase(),
RetentionPolicy: cmd.getRetentionPolicy(),
Precision: "ns",
})
flag = false
}
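
The hunk above adds `RetentionPolicy` to the batch configuration, which is what the commit title refers to. In the InfluxDB 1.x HTTP write API the retention policy travels as the `rp` query parameter of `/write`; when it is absent, the server writes to the database's default RP, so data from a non-default RP (for example `_internal/monitor`) would end up under a different RP after migration. The sketch below only illustrates the request shape with the standard library. `writeURL` is a hypothetical helper, not the client library's code.

```go
package main

import (
	"fmt"
	"net/url"
)

// writeURL builds an InfluxDB 1.x style /write URL. When rp is empty the
// parameter is omitted and the server falls back to the database's default
// retention policy.
func writeURL(addr, db, rp, precision string) string {
	q := url.Values{}
	q.Set("db", db)
	if rp != "" {
		q.Set("rp", rp)
	}
	q.Set("precision", precision)
	u := url.URL{Scheme: "http", Host: addr, Path: "/write", RawQuery: q.Encode()}
	return u.String()
}

func main() {
	// Before the fix: no retention policy, so the default RP is used.
	fmt.Println(writeURL("127.0.0.1:8086", "_internal", "", "ns"))
	// After the fix: the source shard's RP is passed through.
	fmt.Println(writeURL("127.0.0.1:8086", "_internal", "monitor", "ns"))
}
```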