-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Cursors through API Query Params #14110
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14110 +/- ##
============================================
- Coverage 61.75% 55.36% -6.40%
- Complexity 207 791 +584
============================================
Files 2436 2078 -358
Lines 133233 109585 -23648
Branches 20636 17341 -3295
============================================
- Hits 82274 60668 -21606
+ Misses 44911 44073 -838
+ Partials 6048 4844 -1204
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Setup CursorRequestHandlerDelegate Checkpoint - compiles and sanity test works. Setup CursorRequestHandlerDelegate Checkpoint - compiles and sanity test works. add spi for cursors Checkpoint: testCursorWorkflow works. Checkpoint: most tests work. Fix checkstyle Register ResultStoreCleaner Redesign SPI. Make cursor integration test work Add Test for FsResultStore Add Test for Auth Remove unnecessary files. Checkpoint. Fix tests to use URLs Undo unnecessary changes.
Add comments and fix failures.
CommonConstants.CursorConfigs.DEFAULT_QUERY_RESULT_SIZE); | ||
} | ||
|
||
if (numRows > CommonConstants.CursorConfigs.MAX_QUERY_RESULT_SIZE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we want to introduce the MAX_QUERY_RESULT_SIZE
here if there is no such restriction on the generic queries?
@@ -100,6 +103,9 @@ | |||
public class PinotClientRequest { | |||
private static final Logger LOGGER = LoggerFactory.getLogger(PinotClientRequest.class); | |||
|
|||
@Inject | |||
PinotConfiguration _brokerConf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't have to inject the broker config here as it's already in the BaseBrokerRequestHandler
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no getter right now to get the config. I can add that. However is that a better option than injecting the config ?
private static final Logger LOGGER = LoggerFactory.getLogger(ResponseStoreResource.class); | ||
|
||
@Inject | ||
private PinotConfiguration _brokerConf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, you might not need to inject the broker config here.
} | ||
|
||
response.setResultTable( | ||
new ResultTable(resultTable.getDataSchema(), resultTable.getRows().subList(offset, sliceEnd))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that mean every time a read request will fetch the whole result table into memory first and then pick the page from the memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation does read the whole result table into memory. Does LinkedIn implementation support seeking within the result table ? Do you have suggestions on how can the interface be changed to also support seek ?
* Starts from 0. | ||
* @return current offset. | ||
*/ | ||
int getOffset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: put the methods for the same object together, i.e. swapping the setNumRows()
and getOffset()
methods
x -> new InstanceInfo(x.getInstanceName(), x.getHostName(), Integer.parseInt(x.getPort())))); | ||
|
||
try { | ||
Map<String, List<CursorResponseNative>> brokerResponses = getAllQueryResults(brokers, Collections.emptyMap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, it seems every controller will try to fetch all the results from all the brokers, is it kind of a waste of resources to do that? Could we distribute the cleanup workloads to controllers instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that only the lead controller will run the periodic tasks. Right now, the controller does not have any information about ResponseStores. Only the broker has information.
The major point to discuss - is it a requirement that a ResponseStore should be accessible from all nodes ? Then the default implementation of using the broker local filesystem has to be changed.
The current perioidc task and cleaner assumes that a only a specific broker knows about the responses it has stored.
Cursor support will allow Pinot clients to consume query results in smaller chunks. This feature allows clients to work with lesser resources esp. memory. Application logic is simpler with cursors. For example an app UI paginates through results in a table or a graph. Cursor support has been implemented using APIs.
Design Doc
Implementation for #13185
API
POST /query/sql
A new broker API parameter has been added to trigger pagination.
The API accepts the following new optional query parameters:
The response contains the following extra fields:
GET /resultStore/{requestId}/results
This is a broker API that can be used to iterate over the result set of a query submitted using the above API.
The API accepts the following query parameters:
GET /resultStore/{requestId}/
Returns the BrokerResponse metadata of the query.
GET /resultStore
Lists all the requestIds of all the query results available in the response store.
DELETE /resultStore/{requestId}/
Delete the results of a query.
The API accepts the following query parameters:
SPI
The PR implements a FileSystem ResponseStore and a JSON ResponseSerde.
The feature provides two SPIs to extend the feature to support other implementations:
Both SPIs use Java SPI and the default ServiceLoader to find implementation of the SPIs. All implementation should be annotated with AutoService to help generate files for discovering the implementations.
Configuration
ResponseStore
File Response Store
Miscellaneous
tags: feature, multi-stage, release-notes