Project: GRAIL

Group # 27

students: Giovanni Tangredi (s276086), Francesco Xia (s276509)

Suggested action for compiling and executing the project*

* this project has been tested on a Linux machine.

To compile and run the program:

make
./bin/main <graph_file_path> <num_intervals> <query_file_path> [-l] [-q]

OPTIONS
	-l     saves generated labels to "test/output/labels_out.txt"
	-q     saves query results to "test/output/queries_out.txt"

To run the test suites:

make test_node
./bin/test

The other rules available rules are: test_label, test_graph, test_query, test_bitmap

NOTE: the TEST flag must be set to 0 when the program is executed in normal mode. Otherwise, the labels are not generated randomly. Set TEST to 1 when running the test suites.

Project Stucture

SDPProjectQ2
├── README.md
├── bin
├── gen_query.c
├── include
│   ├── bitmap.h
│   ├── constants.h
│   ├── graph.h
│   ├── label.h
│   ├── menu.h
│   ├── query.h
│   └── time_tracker.h
├── makefile
├── src
│   ├── bitmap.c
│   ├── graph.c
│   ├── label.c
│   ├── main.c
│   ├── menu.c
│   ├── query.c
│   └── time_tracker.c
└── test
    ├── input
    │   ├── grafo20.gra
    │   ├── grafo20.png
    │   ├── grafo20_25.que
    │   └── grafo_con_rango.png
    │ 
    │       
    ├── output
    │   ├── labels_out.txt
    │   └── queries_out.txt
    └── tests
        ├── bitmap_test.c
        ├── graph_test.c
        ├── label_test.c
        ├── node_test.c
        └── query_test.c

This repository has been divided as follows:

the bin folder contains the executables.
the src folder contains the source files.
the test folder is further divided in:
- the input folder contains an example of a graph and a set of queries.
- the output folder contains the output generated by the program (i.e., the labels, the query results, and copy of the graph).
- the tests folder contains the test source files.
the include folder contains the header files.

The user can customize the program's behavior by modifying the values of the flags found in constant.h:

DEBUG: verbose mode.
TEST: if set to 1, the labels are not randomly generated.
MAX_THREADS_GRAPH: number of running threads used during the graph generation phase. MUST be > 0.
MAX_THREADS_QUERY: number of running threads used to solve the queries. MUST be > 0.
ALL_NODES: prints the graph nodes on standard output.

The tests were created using the Check.h framework.

Program output

If the option -q is passed, the program creates a file called queries_out.txt, where each line represents the result of a query: source_id destination_id {0/1} (unreachable/reachable).

If the option -q is missing, the user can interactively query the reachability of two nodes.

Similarly passing the option -l, the program creates a file named labels_out.txt, with the following format: node_id: [l1_l, l1_r] [l2_l, l2_r] ...

Data Structures

The graph is represented using a data structure similar to an adjacency list. Specifically, we used a vector of Node pointers to depict the graph, where each node can be indexed through its node_id. Furthermore, each node stores:

a list of its outgoing edges.
a vector of intervals of size num_intervals

The queries are stored in a static struct called queries in query.c. Specifically:

a single query is represented as a route (route.src and route.dst);
The result of a particular query is stored in a fixed position of a bitmap, whose position is determined by the location of the query inside the file query_file_path. Each bit indicates either non-reachability/reachability between two nodes (0/1). The i-th bit of the bitmap represents the state of i-th query (i.e., the query found at i-th lines of the file <query_file>);

Strategy

The generation of the graph follows these main ideas:

the generation is done concurrently by running MAX_THREADS_GRAPH number of threads
at each iteration, each thread
- reads x lines
- increment shared counter of read lines (see p_curr_iteration)
- parse those lines (e.g. find the node_id of the current node and of its children)
- allocates a single chunk of memory, big enough to store x Nodes (see node_create_multiple)
- save each created Node at graph->nodes[<node_id>] = node;

repeat until end of file.

The intervals are generated concurrently by running n threads, where n is equal to the parameter num_intervals. The interval generated by i-th thread is saved in the corresponding Node.intervals[i-th];

The label generator is an implementation of the algorithm 1 found in [1].

Regarding the queries, they are equally divided in n blocks, where n is empirically set to 4 (see constants.h). Each block is then assigned to a thread, which task is to find the non-reachability/reachability of every query in its block. The query solver is an implementation of algorithm 2 found in [1].

References:

[1]: Yıldırım, H., Chaoji, V. & Zaki, M.J. GRAIL: a scalable index for reachability queries in very large graphs. The VLDB Journal 21, 509–534 (2012). https://doi.org/10.1007/s00778-011-0256-4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Project: GRAIL

Suggested action for compiling and executing the project*

Project Stucture

Program output

Data Structures

Strategy

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 114 Commits
bin		bin
docs		docs
include		include
src		src
test		test
.gitignore		.gitignore
README.md		README.md
gen_query.c		gen_query.c
makefile		makefile

RocinanteExp/GRAIL

Folders and files

Latest commit

History

Repository files navigation

Project: GRAIL

Suggested action for compiling and executing the project*

Project Stucture

Program output

Data Structures

Strategy

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages