Issue #264: Update qviz #401

Closed
wants to merge 30 commits into from

Conversation

jorgeMarin1
Contributor

@jorgeMarin1 jorgeMarin1 commented Sep 6, 2024

Description

Fixes #264.

The OTree Index Visualization (qviz) crashes when it is run against tables written in the new Qbeast format, so it needs to be updated.

Type of change

Bug fix.

Checklist:

  • New feature / bug fix has been committed following the Contribution guide.
  • Add tests.
  • Your branch is updated to the main branch (dependent changes have been merged).

How Has This Been Tested? (Optional)

The OTree Visualization (qviz) was run locally on my PC.

The fix was also tested from a pyspark shell. A table of 4,000,000 rows of fake data was created. From this table, three Qbeast tables were written and then read back as Delta tables.

For each of these tables, we compared its total number of elements with the sum of the element_count of its cubes. Since the values matched, we concluded that the fix works as intended.
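The check described above can be sketched roughly as follows. This is not the code used in the PR. It assumes that each file's block metadata is available as a JSON string with an elementCount field, as in the sample posted later in this conversation. The function names and the toy data are hypothetical.

```python
import json


def total_element_count(blocks_per_file):
    """Sum elementCount over every block of every file.

    `blocks_per_file` is any iterable of JSON strings (a pandas Series
    of strings would also work), one string per file.
    """
    return sum(
        block["elementCount"]
        for blocks_json in blocks_per_file
        for block in json.loads(blocks_json)
    )


def counts_match(blocks_per_file, expected_rows):
    """Return True when the block counts add up to the table's row count."""
    return total_element_count(blocks_per_file) == expected_rows


# Hypothetical toy data: two files holding 3 + 2 and 5 elements.
sample_blocks = [
    '[{"cubeId": "w", "elementCount": 3}, {"cubeId": "wg", "elementCount": 2}]',
    '[{"cubeId": "w", "elementCount": 5}]',
]

print(counts_match(sample_blocks, expected_rows=10))
```

In a real run, expected_rows would come from counting the rows of the table itself, for example with a DataFrame count in the pyspark shell.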

@jorgeMarin1 jorgeMarin1 added the type: bug Something isn't working label Sep 6, 2024
@jorgeMarin1 jorgeMarin1 self-assigned this Sep 6, 2024
@Jiaweihu08 Jiaweihu08 changed the title Issue 264: Update qviz Issue #264: Update qviz Sep 9, 2024
@jorgeMarin1
Contributor Author

jorgeMarin1 commented Sep 10, 2024

Cubes are obtained by using: cubes = df["tags.blocks"]

The result is a pandas Series whose entries have the following structure (this is one entry):
[{"cubeId":"wggg","minWeight":-103008559,"maxWeight":2147483647,"elementCount":9521,"replicated":false},{"cubeId":"wggQ","minWeight":-100165008,"maxWeight":2147483647,"elementCount":880,"replicated":false},{"cubeId":"wgg","minWeight":-1401341541,"maxWeight":-103459089,"elementCount":9759,"replicated":false},{"cubeId":"wggA","minWeight":-102673026,"maxWeight":2147483647,"elementCount":5982,"replicated":false},{"cubeId":"wggw","minWeight":-91591261,"maxWeight":2147483647,"elementCount":1242,"replicated":false}]

Each element of the list above is a block, which is related to a cube through its cubeId.
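As an illustration of how blocks relate to cubes, here is a minimal sketch that groups blocks by cubeId and adds up their element counts. It assumes that each Series entry is a JSON string shaped like the sample above. The helper names are hypothetical and this is not necessarily how qviz does it.

```python
import json
from collections import defaultdict

# One Series entry, copied from the sample above and trimmed to three blocks.
entry = (
    '[{"cubeId":"wggg","minWeight":-103008559,"maxWeight":2147483647,'
    '"elementCount":9521,"replicated":false},'
    '{"cubeId":"wggQ","minWeight":-100165008,"maxWeight":2147483647,'
    '"elementCount":880,"replicated":false},'
    '{"cubeId":"wgg","minWeight":-1401341541,"maxWeight":-103459089,'
    '"elementCount":9759,"replicated":false}]'
)


def group_blocks_by_cube(entries):
    """Collect the blocks of every entry under their cubeId."""
    cubes = defaultdict(list)
    for blocks_json in entries:
        for block in json.loads(blocks_json):
            cubes[block["cubeId"]].append(block)
    return dict(cubes)


def summarize_cube(blocks):
    """Summarize one cube: number of blocks and total element count."""
    return {
        "block_count": len(blocks),
        "element_count": sum(b["elementCount"] for b in blocks),
    }


cubes = group_blocks_by_cube([entry])
summary = {cube_id: summarize_cube(blocks) for cube_id, blocks in cubes.items()}
print(summary["wgg"])
```

With several files, the same cubeId can appear in more than one entry, which is why the blocks are accumulated in a list per cube before being summarized.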

@jorgeMarin1
Contributor Author

jorgeMarin1 commented Sep 12, 2024

This is the structure of the metadata for revision_id = 1:

Metadata: {'revisionID': 1, 'timestamp': 1726134712571, 'tableID': '/Users/jorgemarin/Documents/qviz-bug/qbeast-spark/utils/visualizer/tests/resources/test_table_delta_multiples_writes/', 'desiredCubeSize': 10000, 'columnTransformers': [{'className': 'io.qbeast.core.transform.LinearTransformer', 'columnName': 'user_id', 'dataType': 'IntegerDataType'}, {'className': 'io.qbeast.core.transform.StringHistogramTransformer', 'columnName': 'brand', 'dataType': 'StringDataType'}], 'transformations': [{'className': 'io.qbeast.core.transform.LinearTransformation', 'minNumber': 274969076, 'maxNumber': 566347286, 'nullValue': 391116828, 'orderedDataType': 'IntegerDataType'}, {'className': 'io.qbeast.core.transform.StringHistogramTransformation', 'histogram': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']}]}
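To make the structure easier to read, the sketch below lists each indexed column together with its transformer and transformation. It assumes that the entries of columnTransformers and transformations correspond by position, which the dump suggests but does not state. The dictionary is a trimmed copy of the metadata above and indexed_columns is a hypothetical helper.

```python
# Trimmed copy of the metadata above; fields not needed here are omitted.
metadata = {
    "revisionID": 1,
    "desiredCubeSize": 10000,
    "columnTransformers": [
        {
            "className": "io.qbeast.core.transform.LinearTransformer",
            "columnName": "user_id",
            "dataType": "IntegerDataType",
        },
        {
            "className": "io.qbeast.core.transform.StringHistogramTransformer",
            "columnName": "brand",
            "dataType": "StringDataType",
        },
    ],
    "transformations": [
        {
            "className": "io.qbeast.core.transform.LinearTransformation",
            "minNumber": 274969076,
            "maxNumber": 566347286,
        },
        {"className": "io.qbeast.core.transform.StringHistogramTransformation"},
    ],
}


def short_name(class_name):
    """Drop the package prefix from a fully qualified class name."""
    return class_name.rsplit(".", 1)[-1]


def indexed_columns(meta):
    """Pair each transformer with the transformation at the same position.

    Assumption: the two lists are aligned by index.
    """
    return [
        (
            transformer["columnName"],
            short_name(transformer["className"]),
            short_name(transformation["className"]),
        )
        for transformer, transformation in zip(
            meta["columnTransformers"], meta["transformations"]
        )
    ]


for column, transformer, transformation in indexed_columns(metadata):
    print(f"{column}: {transformer} -> {transformation}")
```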

@Jiaweihu08
Member

Can you add ruff for code formatting? @jorgeMarin1

@Jiaweihu08
Member

Jiaweihu08 commented Sep 17, 2024

Also, remove the redundant test tables. You can use the most complex one to cover all the test cases.
You don't need the JSON log files that end with crc.

Member

@osopardo1 osopardo1 left a comment


Some small things that I noticed while reviewing!

utils/visualizer/qviz/qviz.py (outdated)
utils/visualizer/tests/content_loader_test.py (outdated)
utils/visualizer/tests/drawing_elements_test.py (outdated)
utils/visualizer/pyproject.toml (outdated)
@Qbeast-io Qbeast-io deleted a comment from jorgeMarin1 Oct 17, 2024
@Jiaweihu08
Member

I'm closing this PR as I'm not allowed to update the branch. We will be working on this issue with #437 instead.

@Jiaweihu08 Jiaweihu08 closed this Oct 17, 2024
Labels
type: bug Something isn't working
Development

Successfully merging this pull request may close these issues.

Update qviz
3 participants