Loading graph limited by single CPU core, parallelize? #119

Open
KonradHoeffner opened this issue Sep 6, 2022 · 1 comment
Labels
enhancement New feature or request
Milestone

Comments

@KonradHoeffner
Contributor

KonradHoeffner commented Sep 6, 2022

Loading 16 million triples from a 3.6 GB N-Triples file into a LightGraph takes 40 seconds, with a single CPU core maxed out the whole time, on an Intel Core i9-12900K with 24 threads.
Is it possible to parallelize this somehow? Since N-Triples is line-based, a file can be split at any line boundary into n blocks, which could then be loaded in parallel.
Alternatively, when using a FastGraph, the SPO wrapper could be initialized in parallel with the OPS wrapper.
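
To illustrate the block-splitting idea, here is a minimal sketch using only the Rust standard library. It does not use sophia's API: `split_at_line_boundaries`, `parse_line`, and `load_parallel` are hypothetical names, and `parse_line` is a naive whitespace-based placeholder rather than a conforming N-Triples parser. It also assumes the whole file is already in memory as a `&str`.

```rust
use std::thread;

/// Split `data` into at most `n` chunks, each ending at a line boundary
/// (or at the end of the input).
fn split_at_line_boundaries(data: &str, n: usize) -> Vec<&str> {
    let n = n.max(1);
    let bytes = data.as_bytes();
    // Ceiling division, so that no more than `n` chunks are produced.
    let target = ((data.len() + n - 1) / n).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < data.len() {
        let mut end = (start + target).min(data.len());
        // Extend the chunk until it ends just after a newline.
        while end < data.len() && bytes[end - 1] != b'\n' {
            end += 1;
        }
        chunks.push(&data[start..end]);
        start = end;
    }
    chunks
}

/// Placeholder parser (an assumption, not a real N-Triples parser):
/// splits a line into subject, predicate, and object at whitespace.
fn parse_line(line: &str) -> Option<(&str, &str, &str)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    let body = line.strip_suffix('.')?.trim_end();
    let (s, rest) = body.split_once(char::is_whitespace)?;
    let (p, o) = rest.trim_start().split_once(char::is_whitespace)?;
    Some((s, p, o.trim_start()))
}

/// Parse each chunk on its own scoped thread, then concatenate the
/// per-chunk results in file order.
fn load_parallel(data: &str, n: usize) -> Vec<(&str, &str, &str)> {
    let chunks = split_at_line_boundaries(data, n);
    thread::scope(|scope| {
        let handles: Vec<_> = chunks
            .into_iter()
            .map(|chunk| {
                scope.spawn(move || chunk.lines().filter_map(parse_line).collect::<Vec<_>>())
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect::<Vec<_>>()
    })
}
```

In this sketch only the parsing runs in parallel. Inserting the parsed triples into a single graph would still be sequential unless the graph itself supports concurrent or merged insertion.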

@pchampin pchampin added the enhancement New feature or request label Sep 7, 2022
@pchampin
Owner

pchampin commented Sep 7, 2022

Re. LightGraph, I don't see much that can be parallelized (I may be wrong). One thing worth exploring would be to avoid blocking on IO and instead do the indexing during IO latency (e.g. using async code). As I understand it, @Tpt is working on a new parser infrastructure that would make this possible.
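
A rough sketch of overlapping IO with indexing follows. It swaps in a reader thread and a bounded channel from the standard library in place of async code, so it shows the general idea rather than the parser infrastructure mentioned above. `load_pipelined` is a hypothetical name, and the "index" is just a `HashSet` of raw lines standing in for a real graph.

```rust
use std::collections::HashSet;
use std::io::{BufRead, BufReader, Read};
use std::sync::mpsc;
use std::thread;

/// Read lines on one thread while the calling thread indexes them.
/// Lines travel in batches over a bounded channel, so the reader can
/// run ahead of the indexer by at most a few batches.
fn load_pipelined<R: Read + Send>(input: R, batch_size: usize) -> HashSet<String> {
    let (tx, rx) = mpsc::sync_channel::<Vec<String>>(4);
    thread::scope(|scope| {
        // Producer: performs the blocking reads.
        scope.spawn(move || {
            let mut batch = Vec::with_capacity(batch_size);
            for line in BufReader::new(input).lines() {
                batch.push(line.expect("read error"));
                if batch.len() >= batch_size {
                    let full = std::mem::replace(&mut batch, Vec::with_capacity(batch_size));
                    if tx.send(full).is_err() {
                        return; // the consumer is gone
                    }
                }
            }
            if !batch.is_empty() {
                let _ = tx.send(batch);
            }
            // `tx` is dropped here, which ends the consumer loop below.
        });

        // Consumer: placeholder indexing (an assumption for illustration).
        let mut index = HashSet::new();
        for batch in rx {
            for line in batch {
                let line = line.trim();
                if !line.is_empty() && !line.starts_with('#') {
                    index.insert(line.to_owned());
                }
            }
        }
        index
    })
}
```

Whether this helps depends on how much of the 40 seconds is spent waiting on IO. If the load is CPU-bound, as the maxed-out core suggests, the gain from overlapping alone may be small.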

Re. FastGraph, yes, the creation of all indexes could indeed be parallelized.
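
As a sketch of what parallel index creation could look like, the following builds an SPO and an OPS index on separate threads from a shared slice of triples. This is not FastGraph's actual internals: the `Term`, `Triple`, and `Index` types and both functions are assumptions made for illustration, with terms reduced to integer IDs.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::thread;

// Hypothetical representation: terms are integer IDs, and an index is a
// three-level nested map in some permutation of (subject, predicate, object).
type Term = u32;
type Triple = [Term; 3];
type Index = BTreeMap<Term, BTreeMap<Term, BTreeSet<Term>>>;

/// Build one index; `order` says which triple position feeds each level.
fn build_index(triples: &[Triple], order: [usize; 3]) -> Index {
    let mut index = Index::new();
    for t in triples {
        index
            .entry(t[order[0]])
            .or_default()
            .entry(t[order[1]])
            .or_default()
            .insert(t[order[2]]);
    }
    index
}

/// Build the SPO and OPS indexes concurrently. Both only read `triples`,
/// so they can share the slice without any locking.
fn build_indexes_parallel(triples: &[Triple]) -> (Index, Index) {
    thread::scope(|scope| {
        let ops = scope.spawn(|| build_index(triples, [2, 1, 0]));
        let spo = build_index(triples, [0, 1, 2]);
        (spo, ops.join().expect("OPS worker panicked"))
    })
}
```

This requires the triples to be collected first (or each triple to be sent to both index builders), which costs extra memory compared to inserting into both indexes as each triple is parsed.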

@pchampin pchampin added this to the later milestone Dec 14, 2023
2 participants