Experiment: avoid copying data back to write buffer from pending data #768

Draft · wants to merge 2 commits into base: develop

Conversation

zliang-min
Collaborator

PR checklist:

  • Did you run ClangFormat?
  • Did you separate headers into a different section in the existing community code base?
  • Did you surround new code in the existing community code base with proton: starts/ends?

Please write a user-readable short description of the changes:

This PR implements a solution that avoids copying data from pending_data back to wb. The idea is to use a swap instead: when it detects that there is data in pending_data after wb has been reset (by calling next()), it swaps pending_data and wb, so there is no extra memcpy. A simplified sketch of the two approaches is shown below.
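
For illustration, here is a minimal sketch of the two strategies. The names wb and pending_data follow the description above, but the buffer type and helper functions are simplified stand-ins, not the actual proton classes:

#include <string>
#include <utility>

/// Simplified stand-ins for the real write buffer and pending-data members.
struct MessageBuffers
{
    std::string wb;            /// the message currently being built
    std::string pending_data;  /// bytes that did not fit into the last message

    /// Old behaviour: after the finished message is handed off and wb is reset,
    /// the leftover bytes are copied back into wb (one extra memcpy per spill).
    void restartWithCopy()
    {
        wb.clear();
        wb.append(pending_data);
        pending_data.clear();
    }

    /// Behaviour proposed in this PR: swap the two buffers instead, so the
    /// leftover bytes simply become the start of the next message, no memcpy.
    void restartWithSwap()
    {
        wb.clear();
        std::swap(wb, pending_data);
    }
};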

@zliang-min zliang-min requested a review from chenziliang June 8, 2024 08:23
@zliang-min zliang-min self-assigned this Jun 8, 2024
@zliang-min
Collaborator Author

zliang-min commented Jun 8, 2024

I did a simple performance test:

  1. Created a stream called users and inserted 5 million records into it. Each row, after being encoded into JSON, is about 80 bytes, give or take.
  2. Created an external stream called foo_json.
  3. Inserted data into foo_json by selecting all the data from users:
    INSERT INTO foo_json (name, favorite_number, favorite_color)
    SETTINGS kafka_max_message_size = 100
    SELECT
      name, favorite_number, favorite_color
    FROM
      table(users)
    This SQL makes the pending_data copy logic trigger for every row, so we can tell the difference between before and after the change (a toy sketch of the arithmetic follows this list). The SQL was run for 5 rounds on each branch.
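
For context, here is a toy sketch of the arithmetic behind this setup; the 80-byte row size is the estimate from step 1 and the framing logic is deliberately simplified, not the actual proton implementation:

#include <cstdio>

int main()
{
    /// kafka_max_message_size = 100 and ~80-byte rows mean a message can hold
    /// only one row, so the next row overflows the remaining space and its
    /// bytes have to go through pending_data on essentially every row.
    const unsigned max_message_size = 100;
    const unsigned row_size = 80;

    unsigned used = 0, spills = 0;
    for (unsigned row = 0; row < 10; ++row)
    {
        if (used + row_size > max_message_size)
        {
            ++spills;                                   /// this row overflows the current message
            used = used + row_size - max_message_size;  /// the overflow starts the next message
        }
        else
            used += row_size;
    }
    std::printf("%u of 10 rows hit the pending_data path\n", spills);
}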

Results without the changes (develop branch)

Round 1

0 rows in set. Elapsed: 34.039 sec. Processed 5.00 million rows, 228.45 MB (146.89 thousand rows/s., 6.71 MB/s.)

Round 2

0 rows in set. Elapsed: 34.026 sec. Processed 5.00 million rows, 228.45 MB (146.95 thousand rows/s., 6.71 MB/s.)

Round 3

0 rows in set. Elapsed: 34.538 sec. Processed 5.00 million rows, 228.45 MB (144.77 thousand rows/s., 6.61 MB/s.)

Round 4

0 rows in set. Elapsed: 34.026 sec. Processed 5.00 million rows, 228.45 MB (146.95 thousand rows/s., 6.71 MB/s.)

Round 5

0 rows in set. Elapsed: 34.523 sec. Processed 5.00 million rows, 228.45 MB (144.83 thousand rows/s., 6.62 MB/s.)

Results with the changes (this PR's branch)

Round 1

0 rows in set. Elapsed: 35.021 sec. Processed 5.00 million rows, 228.45 MB (142.77 thousand rows/s., 6.52 MB/s.)

Round 2

0 rows in set. Elapsed: 35.028 sec. Processed 5.00 million rows, 228.45 MB (142.74 thousand rows/s., 6.52 MB/s.)

Round 3

0 rows in set. Elapsed: 35.524 sec. Processed 5.00 million rows, 228.45 MB (140.75 thousand rows/s., 6.43 MB/s.)

Round 4

0 rows in set. Elapsed: 35.023 sec. Processed 5.00 million rows, 228.45 MB (142.76 thousand rows/s., 6.52 MB/s.)

Round 5

0 rows in set. Elapsed: 35.526 sec. Processed 5.00 million rows, 228.45 MB (140.74 thousand rows/s., 6.43 MB/s.)

Summary

From the above results, the performance was actually better with the original implementation (with memcpy).

@zliang-min
Collaborator Author

I have done another round of tests with 1,000,000 records; the record sizes are around 1 KiB (somewhere between 900 and 1500 bytes).

The results still look similar: the performance is better without the changes in this PR.

INSERT INTO users_big_json SELECT
  * EXCEPT _tp_time
FROM
  table(users_big)
LIMIT 1000000
SETTINGS kafka_max_message_size = 1800

Results without the changes (develop branch)

Round 1

0 rows in set. Elapsed: 60.544 sec. Processed 1.01 million rows, 1.25 GB (16.71 thousand rows/s., 20.71 MB/s.)

Round 2

0 rows in set. Elapsed: 58.045 sec. Processed 1.01 million rows, 1.25 GB (17.43 thousand rows/s., 21.60 MB/s.)

Round 3

0 rows in set. Elapsed: 58.039 sec. Processed 1.01 million rows, 1.25 GB (17.44 thousand rows/s., 21.60 MB/s.)

Results with the changes (this PR's branch)

Round 1

0 rows in set. Elapsed: 63.542 sec. Processed 1.01 million rows, 1.25 GB (15.93 thousand rows/s., 19.73 MB/s.)

Round 2

0 rows in set. Elapsed: 64.046 sec. Processed 1.01 million rows, 1.25 GB (15.80 thousand rows/s., 19.58 MB/s.)

Round 3

0 rows in set. Elapsed: 62.545 sec. Processed 1.01 million rows, 1.25 GB (16.18 thousand rows/s., 20.05 MB/s.)
