libgrape-lite
A C++ library for parallel graph processing
libgrape-lite Documentation


libgrape-lite

A C++ library for parallel graph processing

libgrape-lite is a C++ library from Alibaba for parallel graph processing. It differs from prior systems in its ability to parallelize sequential graph algorithms as a whole by following the PIE programming model from GRAPE. Sequential algorithms can be easily "plugged into" libgrape-lite with only minor changes and get parallelized to handle large graphs efficiently. In addition to the ease of programming, libgrape-lite is designed to be highly efficient and flexible, to cope the scale, variety and complexity from real-life graph applications.

Building libgrape-lite

Dependencies

libgrape-lite is developed and tested on CentOS 7. It should also work on other unix-like distributions. Building libgrape-lite requires the following softwares installed as dependencies.

Here are the dependencies for optional features:

Extra dependencies are required by examples:

Building libgrape-lite and examples

Once the required dependencies have been installed, go to the root directory of libgrape-lite and do a out-of-source build using CMake.

mkdir build && cd build
cmake ..
make -j

The building targets include a shared/static library, and two sets of examples: analytical_apps and a gnn_sampler.

Alternatively, you can build a particular target with command:

make libgrape-lite # or
make analytical_apps # or
make gnn_sampler

Running libgrape-lite applications

Graph format

The input of libgrape-lite is formatted following the LDBC Graph Analytics benchmark, with two files for each graph, a .v file for vertices with 1 or 2 columns, which are a vertex_id and optionally followed by the data assigned to the vertex; and a .e file for edges with 2 or 3 columns, representing source, destination and optionally the data on the edge, correspondingly. See sample files p2p-31.v and p2p-31.e under the dataset directory.

Example applications

libgrape-lite provides six algorithms from the LDBC benchmark as examples. The deterministic algorithms are, single-source shortest path(SSSP), connected component(WCC), PageRank, local clustering coefficient(LCC), community detection of label propagation(CDLP), and breadth first search(BFS).

To run a specific analytical application, users may use command like this:

# run single-source shortest path with 4 workers in local.
mpirun -n 4 ./run_app --vfile ../dataset/p2p-31.v --efile ../dataset/p2p-31.e --application sssp --sssp_source 6 --out_prefix ./output_sssp --directed
# or run connected component with 4 workers on a cluster.
# HOSTFILE provides a list of hosts where MPI processes are launched.
mpirun -n 4 -hostfile HOSTFILE ./run_app --application=wcc --vfile ../dataset/p2p-31.v --efile ../dataset/p2p-31.e --out_prefix ./output_wcc
# see more flags info.
./run_app --help

LDBC benchmarking

The analytical applications support the LDBC Analytical Benchmark suite with the provided ldbc_driver. Please refer to ldbc_driver for more details. The benchmark results for libgrape-lite and other state-of-the-art systems could be found here.

GNN sampler

In addition to offline graph analytics, libgrape-lite could also be utilized to handle more complex graph tasks. A sampler for GNN training/inference on dynamic graphs (taking graph changes and queries, and producing results via Kafka) is included as an example. Please refer to examples/gnn_sampler for more details.

Documentation

Documentation is generated using Doxygen. Users can build doxygen documentation in the build directory using:

cd build
make doc
# open docs/index.html

The latest version of online documentation can be found at https://alibaba.github.io/libgrape-lite

License

libgrape-lite is distributed under Apache License 2.0. Please note that third-party libraries may not have the same license as libgrape-lite.

Acknowledgements

Publications

Getting involved

Thank you in advance for your contributions!