Experiment: avoid copying data back to write buffer from pending data #768

Draft · wants to merge 2 commits into base: develop

Conversation

zliang-min
Collaborator

PR checklist:

  • Did you run ClangFormat?
  • Did you separate headers into a different section in the existing community code base?
  • Did you surround new code in the existing community code base with proton: starts/ends?

Please write a user-readable short description of the changes:

This PR implements a solution that avoids copying data from pending_data back to wb. The idea is to use a swap instead: when it detects that there is data in pending_data after wb has been reset (by calling next()), it swaps pending_data and wb, so there is no extra memcpy. A simplified sketch of the two approaches is shown below.
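
For illustration, here is a minimal sketch of the two strategies. The names wb and pending_data follow the description above, but the buffer type and helper functions are simplified stand-ins, not the actual proton classes:

#include <string>
#include <utility>

/// Simplified stand-ins for the real write buffer and pending-data members.
struct MessageBuffers
{
    std::string wb;            /// the message currently being built
    std::string pending_data;  /// bytes that did not fit into the last message

    /// Old behaviour: after the finished message is handed off and wb is reset,
    /// the leftover bytes are copied back into wb (one extra memcpy per spill).
    void restartWithCopy()
    {
        wb.clear();
        wb.append(pending_data);
        pending_data.clear();
    }

    /// Behaviour proposed in this PR: swap the two buffers instead, so the
    /// leftover bytes simply become the start of the next message, no memcpy.
    void restartWithSwap()
    {
        wb.clear();
        std::swap(wb, pending_data);
    }
};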

@zliang-min zliang-min requested a review from chenziliang June 8, 2024 08:23
@zliang-min zliang-min self-assigned this Jun 8, 2024
@zliang-min
Collaborator Author

zliang-min commented Jun 8, 2024

I did a simple performance test:

  1. Created a stream called users and inserted 5 million records into it. Each row, after being encoded into JSON, is about 80 bytes, give or take.
  2. Created an external stream called foo_json.
  3. Inserted data into foo_json by selecting all the data from users:
    INSERT INTO foo_json (name, favorite_number, favorite_color)
    SETTINGS kafka_max_message_size = 100
    SELECT
      name, favorite_number, favorite_color
    FROM
      table(users)
    This SQL makes the pending_data copy logic trigger for every row, so we can tell the difference between before and after the change (a toy sketch of the arithmetic follows this list). The SQL was run for 5 rounds on each branch.
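
For context, here is a toy sketch of the arithmetic behind this setup; the 80-byte row size is the estimate from step 1 and the framing logic is deliberately simplified, not the actual proton implementation:

#include <cstdio>

int main()
{
    /// kafka_max_message_size = 100 and ~80-byte rows mean a message can hold
    /// only one row, so the next row overflows the remaining space and its
    /// bytes have to go through pending_data on essentially every row.
    const unsigned max_message_size = 100;
    const unsigned row_size = 80;

    unsigned used = 0, spills = 0;
    for (unsigned row = 0; row < 10; ++row)
    {
        if (used + row_size > max_message_size)
        {
            ++spills;                                   /// this row overflows the current message
            used = used + row_size - max_message_size;  /// the overflow starts the next message
        }
        else
            used += row_size;
    }
    std::printf("%u of 10 rows hit the pending_data path\n", spills);
}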

Results without the changes (develop branch)

Round 1

0 rows in set. Elapsed: 34.039 sec. Processed 5.00 million rows, 228.45 MB (146.89 thousand rows/s., 6.71 MB/s.)

Round 2

0 rows in set. Elapsed: 34.026 sec. Processed 5.00 million rows, 228.45 MB (146.95 thousand rows/s., 6.71 MB/s.)

Round 3

0 rows in set. Elapsed: 34.538 sec. Processed 5.00 million rows, 228.45 MB (144.77 thousand rows/s., 6.61 MB/s.)

Round 4

0 rows in set. Elapsed: 34.026 sec. Processed 5.00 million rows, 228.45 MB (146.95 thousand rows/s., 6.71 MB/s.)

Round 5

0 rows in set. Elapsed: 34.523 sec. Processed 5.00 million rows, 228.45 MB (144.83 thousand rows/s., 6.62 MB/s.)

Results with the changes (this PR's branch)

Round 1

0 rows in set. Elapsed: 35.021 sec. Processed 5.00 million rows, 228.45 MB (142.77 thousand rows/s., 6.52 MB/s.)

Round 2

0 rows in set. Elapsed: 35.028 sec. Processed 5.00 million rows, 228.45 MB (142.74 thousand rows/s., 6.52 MB/s.)

Round 3

0 rows in set. Elapsed: 35.524 sec. Processed 5.00 million rows, 228.45 MB (140.75 thousand rows/s., 6.43 MB/s.)

Round 4

0 rows in set. Elapsed: 35.023 sec. Processed 5.00 million rows, 228.45 MB (142.76 thousand rows/s., 6.52 MB/s.)

Round 5

0 rows in set. Elapsed: 35.526 sec. Processed 5.00 million rows, 228.45 MB (140.74 thousand rows/s., 6.43 MB/s.)

Summary

From the above results, the performance was actually better with the original implementation (with memcpy).

@zliang-min
Collaborator Author

I have done another round of tests with 1,000,000 records; the record sizes are around 1 KiB (somewhere between 900 and 1500 bytes).

The results still look similar: the performance is better without the changes in this PR.

INSERT INTO users_big_json SELECT
  * EXCEPT _tp_time
FROM
  table(users_big)
LIMIT 1000000
SETTINGS kafka_max_message_size = 1800

Results without the changes (develop branch)

Round 1

0 rows in set. Elapsed: 60.544 sec. Processed 1.01 million rows, 1.25 GB (16.71 thousand rows/s., 20.71 MB/s.)

Round 2

0 rows in set. Elapsed: 58.045 sec. Processed 1.01 million rows, 1.25 GB (17.43 thousand rows/s., 21.60 MB/s.)

Round 3

0 rows in set. Elapsed: 58.039 sec. Processed 1.01 million rows, 1.25 GB (17.44 thousand rows/s., 21.60 MB/s.)

Results with the changes (this PR's branch)

Round 1

0 rows in set. Elapsed: 63.542 sec. Processed 1.01 million rows, 1.25 GB (15.93 thousand rows/s., 19.73 MB/s.)

Round 2

0 rows in set. Elapsed: 64.046 sec. Processed 1.01 million rows, 1.25 GB (15.80 thousand rows/s., 19.58 MB/s.)

Round 3

0 rows in set. Elapsed: 62.545 sec. Processed 1.01 million rows, 1.25 GB (16.18 thousand rows/s., 20.05 MB/s.)
