Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DP-1774] - topicctl get partitions to display under replicated and offline #155

Merged
merged 7 commits into from
Oct 9, 2023

Conversation

ssingudasu
Copy link
Contributor

@ssingudasu ssingudasu commented Sep 20, 2023

Description:

topicctl get partitions currently displays the partitions status information as below:

Out-of-sync: This mean either under-replicated or offline partition
Wrong Leader: This means partition does not have preferred leader replica
Ok: This means partition is in a good state (inSync)

We need to know whether a partition is ok (inSync), under-replicated, offline and also check if the partition has a preferred replica leader.

Current topicctl get partitions problems/gaps:

  • A partition can be under-replicated(out-of-sync) and also Wrong Leader. current topicctl just shows out-of-sync
    • Status should be partition status - [ok, offline, under-replicated] -> refer Status column
    • Preferred leader should be - [ok, wrong-leader] -> refer Leader column
  • If a partition is offline, get partitions is showing partition as ok. This is wrong
  • get partitions can display information for only one topic. There is a need to display all partitions for all/specific topics
  • Summary for partitions to know just the partition IDs along with status count
  • Ability to filter partitions by parition status (Allowed: Ok, Offline, Under-replicated)

topicctl get partitions Expectations:

This PR will add following features to topicctl

  • topicctl get partitions -> gets all topics partitions information
  • topicctl get partitions --status <ok|offline|under-replicated> -> gets all topics partitions information for specified status
  • topicctl get partitions --summary -> Get the summary of all topics partitions status
  • topicctl get partitions should take 0 or any number of topic arguments. (Before, get partitions was taking only 1 topic argument)
  • repl> get partitions <topic> [--summary] (refer below for repl expectations)

topicctl:repl> get partitions Expectations:

From terminal, topicctl get partitions can take more than one argument.

From repl, filtering multiple topics can get tricky. Current repl implementation takes only fixed number of words (command.args).

Hence in repl, we will make get partitions work with only one argument (topic) and PartitionStatus as "" implying all status

  • repl get partitions expect minimum of 3 arguments and maximum of 4
  • repl> get partitions [topic] -> this works
  • repl> get partitions [topic] --summary -> this works
  • repl> get partitions [topic1] [topic2] -> this works only for [topic1]
  • repl> get partitions [topic1] [topic2] --summary -> this will not work

Partition Status:

  • Under replicated: partition is in under replicated if the number of ISR are less than the Replicas available for the partition
  • Offline: partition is in offline state if partition leader broker ID is not found. (i.e) kafka-go metadata call with partitions having ListenerNotFound Error observed for leader partition
  • Ok: partition does not have under-replicated or offline

Preferred Leader:

  • Preferred Leader Replica: if the leader partition broker id is equal to first available Replicas broker id
  • Un Preferred Leader Replica: if the leader partition broker id is not equal to first available Replicas broker id

Considerations:

  • We leverage kafka-go metadata for all the above partitions statuses
  • Unlike sarama which gives the offline leader broker id as -1, kafka-go gives the offline broker id as 0 (also Port is 0 with Partition Error information etc)
  • To make readability better for users, we will modify broker ID from:
    • any ISR broker ID that does not have valid Host or Port from 0 to -1
    • any Replica broker ID that does not have valid Host or Port from 0 to -1
    • (Offline) Leader Broker ID that does not have a valid Host or Port from 0 to -1

Local Set UP Details:

  • Set up type: Docker
  • Number of Kafka Nodes: 3 (one in each rack). Broker IDs - 0,1,2 running on ports 19092, 29092, 39092. Kafka containers are kafka-1 (broker ID-0), kafka-2(broker ID-1), kafka-3(broker ID-2)
  • Number of zookeeper Nodes: 1, running on port 12181
  • A Mock Producer produces to topic: test-1
  • A Mock Consumer reads from topic: test-1
  • topic: __consumer_offsets - partitions: 50, replication: 3, strategy - any
  • topic: test-1 - partitions 1, replication 3, strategy - any
  • topic: threepartition - partitions 3, replication 2, strategy - Fixed placement. We will have partitions 0,1 spread in brokers 1,2, partition 2 in brokers 0,1

Build

  • Build to run locally - go build -o ./build/topicctl -a ./cmd/topicctl
    • this is to check if we are able to achieve same results leveraging --broker-addr
    • usage: topicctl get partitions --broker-addr localhost:19092
  • Build to run inside container GOOS=linux GOARCH=amd64 go build -o ./build/topicctl-linux-amd64 -a ./cmd/topicctl
    • this is to check if we are able to achieve same results leveraging --zk-addr
    • copy to zk container: docker cp ./build/topicctl-linux-amd64 zookeeper:/tmp
  • usage: docker exec -u root -it zookeeper /tmp/topicctl-linux-amd64 get partitions --zk-addr localhost:2181

topicctl get partitions --help

# ./build/topicctl get partitions --help

Screenshot 2023-09-20 at 19 57 43

Partitions Information

Created a topic threepartition

Screenshot 2023-09-20 at 20 04 59

topic: threepartition

  • partitions 0,1 replicas are in brokers 1, 2
  • partition 2 replicas are in brokers 1, 0
Screenshot 2023-09-20 at 20 05 19

Get all partitions status

# ./build/topicctl get partitions --broker-addr localhost:19092
# docker exec -u root -it zookeeper /tmp/topicctl-linux-amd64 get partitions --zk-addr localhost:2181

Screenshot 2023-09-20 at 19 59 36 Screenshot 2023-09-20 at 19 59 55 Screenshot 2023-09-20 at 20 01 56 Screenshot 2023-09-20 at 20 02 15

Get all partitions status summary

# ./build/topicctl get partitions --broker-addr localhost:19092 --summary
# docker exec -u root -it zookeeper /tmp/topicctl-linux-amd64 get partitions --zk-addr localhost:2181 --summary

Screenshot 2023-09-20 at 20 02 57 Screenshot 2023-09-20 at 20 03 22

Kill Broker IDs 1, 2

# docker stop kafka-2; docker stop kafka-3


Getting current all partition status

# ./build/topicctl get partitions --broker-addr localhost:19092
# docker exec -u root -it zookeeper /tmp/topicctl-linux-amd64 get partitions --zk-addr localhost:2181

Screenshot 2023-09-20 at 20 08 08 Screenshot 2023-09-20 at 20 08 23 Screenshot 2023-09-20 at 20 08 55 Screenshot 2023-09-20 at 20 09 09

Get current all partitions status --summary

# ./build/topicctl get partitions --broker-addr localhost:19092 --summary
# docker exec -u root -it zookeeper /tmp/topicctl-linux-amd64 get partitions --zk-addr localhost:2181 --summary

Screenshot 2023-09-20 at 20 10 52 Screenshot 2023-09-20 at 20 11 08

start broker ID 1

docker start kafka-2


Getting current all partition status

# ./build/topicctl get partitions --broker-addr localhost:19092

Screenshot 2023-09-20 at 20 12 40 Screenshot 2023-09-20 at 20 12 51

repl and get partitions info for topic: __consumer_offsets

# ./build/topicctl repl --broker-addr localhost:19092
# get partitions __consumer_offsets
# get partitions __consumer_offsets --summary

NOTE: repl can take only one topic argument and optional flag --summary

Screenshot 2023-09-20 at 20 14 24 Screenshot 2023-09-20 at 20 14 37 Screenshot 2023-09-20 at 20 14 53

start broker ID 2

# docker start kafka-3


Getting current all partition status and filter for test-1, threepartition

# ./build/topicctl get partitions --broker-addr localhost:29092
# ./build/topicctl get partitions test-1 threepartition --broker-addr localhost:29092 --summary
# ./build/topicctl get partitions test-1 threepartition --broker-addr localhost:29092 --summary --status under-replicated

Screenshot 2023-09-20 at 20 18 11 Screenshot 2023-09-20 at 20 18 31 Screenshot 2023-09-20 at 20 18 49 Screenshot 2023-09-20 at 20 20 18

Getting stable all partition status

# ./build/topicctl get partitions --broker-addr localhost:29092

Screenshot 2023-09-20 at 20 21 21 Screenshot 2023-09-20 at 20 21 34

running preferred leader for topic: partitionthree fixes the wrong leader

Before preferred leader:
# ./build/topicctl get partitions threepartition --broker-addr localhost:29092

Screenshot 2023-09-20 at 20 23 04

After preferred leader:
# ./build/topicctl get partitions threepartiton --broker-addr localhost:29092

Screenshot 2023-09-20 at 20 23 54

threepartition summary:
# ./build/topicctl get partitions threepartition --broker-addr localhost:29092 --summary
Screenshot 2023-09-20 at 20 24 38

threepartition topic summary filter for status ok:
# ./build/topicctl get partitions threepartition --broker-addr localhost:29092 --summary --status ok

Screenshot 2023-09-20 at 20 25 23

fetching partition information for topic that does not displays error message

# ./build/topicctl get partitions topicdoesnotexist --broker-addr localhost:29092

Screenshot 2023-09-20 at 20 27 50

@ssingudasu ssingudasu requested a review from a team as a code owner September 20, 2023 16:52

client := c.GetConnector().KafkaClient
req := kafka.MetadataRequest{
Topics: allTopics,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just pass through nil here instead of fetching all the topics to get metadata for the all topics. Also, we already have a getMetadata method in this file. Should we use that instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

func getMetadata is in brokerclient and not in zkclient.

For consistency in code, I made following modification - 9a9b80f

This has been tested in local as well

If this looks good, Can you please resolve this conversation.

@ssingudasu
Copy link
Contributor Author

All comments have been resolved. Merging the PR

@ssingudasu ssingudasu merged commit 01c619c into master Oct 9, 2023
12 checks passed
@ssingudasu ssingudasu deleted the ssingudasu/dp-1774-get-partitions-changes branch October 9, 2023 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants