[release] 0.5.0
Kevin Gómez committed Aug 21, 2019
2 parents c169694 + 161883c commit eeacd78
Showing 1,046 changed files with 12,585 additions and 23,589 deletions.
93 changes: 28 additions & 65 deletions README.md
@@ -1,23 +1,27 @@
[![Apache License, Version 2.0, January 2004](https://img.shields.io/github/license/apache/maven.svg?label=License)](https://www.apache.org/licenses/LICENSE-2.0)
[![Maven Central](https://img.shields.io/badge/Maven_Central-0.4.5-blue.svg?label=Maven%20Central)](http://search.maven.org/#search%7Cga%7C1%7Cgradoop)
[![Maven Central](https://img.shields.io/badge/Maven_Central-0.5.0-blue.svg?label=Maven%20Central)](http://search.maven.org/#search%7Cga%7C1%7Cgradoop)
[![Build Status](https://travis-ci.org/dbs-leipzig/gradoop.svg?branch=master)](https://travis-ci.org/dbs-leipzig/gradoop)
[![Code Quality: Java](https://img.shields.io/lgtm/grade/java/g/dbs-leipzig/gradoop.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/dbs-leipzig/gradoop/context:java)
[![Total Alerts](https://img.shields.io/lgtm/alerts/g/dbs-leipzig/gradoop.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/dbs-leipzig/gradoop/alerts)

## Gradoop: Distributed Graph Analytics on Hadoop

[Gradoop](http://www.gradoop.com) is an open source (ALv2) research framework for scalable
graph analytics built on top of [Apache Flink™](http://flink.apache.org/). It offers a graph data model which
graph analytics built on top of [Apache Flink](http://flink.apache.org/). It offers a graph data model which
extends the widespread [property graph model](https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model)
by the concept of logical graphs and further provides operators that can be applied
on single logical graphs and collections of logical graphs. The combination of these
operators allows the flexible, declarative definition of graph analytical workflows.
Gradoop can be easily integrated in a workflow which already uses Flink™ operators
and Flink™ libraries (i.e. Gelly, ML and Table).
Gradoop can be easily integrated in a workflow which already uses Flink® operators
and Flink® libraries (i.e. Gelly, ML and Table).

Gradoop is **work in progress** which means APIs may change. It is currently used
as a proof of concept implementation and far from production ready.

The project's documentation can be found in our [Wiki](https://github.com/dbs-leipzig/gradoop/wiki).
The Wiki also contains a [tutorial](https://github.com/dbs-leipzig/gradoop/wiki/Getting-started) to
help getting started using Gradoop.

##### Further Information (articles and talks)

* [Declarative and distributed graph analytics with GRADOOP, VLDB Demo, August 2018](http://www.vldb.org/pvldb/vol11/p2006-junghanns.pdf)
@@ -50,54 +54,7 @@ properties even if they have the same label.

The EPGM provides operators for both single logical graphs as well as collections
of logical graphs; operators may also return single graphs or graph collections.
The following tables contains an overview (GC = Graph Collection, G = Logical Graph).

#### Unary logical graph operators (one graph as input):

| Operator | Output | Output description | Impl |
|:--------------|:-------|:-------------------------------------------------------------|:----:|
| Aggregation | G | Graph with result of an aggregate function as a new property | Yes |
| Matching | GC | Graphs that match a given graph pattern | Yes |
| Transformation| G | Graph with transformed (graph, vertex, edge) data | Yes |
| Grouping | G | Structural condense of the input graph | Yes |
| Subgraph | G | Subgraph that fulfils given vertex and edge predicates | Yes |

#### Binary logical graph operators (two graphs as input):

| Operator | Output | Output description | Impl |
|:--------------|:--------------|:-----------------------------------------------------------------------|:----:|
| Combination | G | Graph with vertices and edges from both input graphs | Yes |
| Overlap | G | Graph with vertices and edges that exist in both input graphs | Yes |
| Exclusion | G | Graph with vertices and edges that exist only in the first graph | Yes |
| Equality | {true, false} | Compare graphs in terms of identity or equality of contained elements | Yes |
| VertexFusion | G | The second graph is fused to a single vertex within the first graph | Yes |

#### Unary graph collection operators (one collection as input):

| Operator | Output | Output description | Impl |
|:--------------|:--------|:--------------------------------------------------------------------|:----:|
| Matching | GC | Graphs that match a given graph pattern | Yes |
| Selection | GC | Filter graphs based on their attached data (i.e. label, properties) | Yes |
| Distinct | GC | Collection with no duplicate graphs | Yes |
| SortBy | GC | Collection sorted by values of a given property key | No |
| Limit | GC | The first n arbitrary elements of the input collection | Yes |

#### Binary graph collection operators (two collections as input):

| Operator | Output | Output description | Impl |
|:--------------|:--------------|:---------------------------------------------------------------------------|:----:|
| Union | GC | All graphs from both input collections | Yes |
| Intersection | GC | Only graphs that exist in both collections | Yes |
| Difference | GC | Only graphs that exist only in the first collection | Yes |
| Equality | {true, false} | Compare collections in terms of identity or equality of contained elements | Yes |

#### Auxiliary operators:

| Operator | In | Out | Output description | Impl |
|:--------------|:-----|:-----|:------------------------------------------------------------------------|:----:|
| Apply | GC | GC | Applies unary operator (e.g. aggregate) on each graph in the collection | Yes |
| Reduce | GC | G | Reduces collection to single graph using binary operator (e.g. combine) | Yes |
| Call | GC/G | GC/G | Applies external algorithm on graph or graph collection | Yes |
An overview and detailed descriptions of the implemented operators can be found in the [Gradoop Wiki](https://github.com/dbs-leipzig/gradoop/wiki/List-of-Operators).

## Setup

@@ -107,16 +64,16 @@ The following tables contains an overview (GC = Graph Collection, G = Logical Gr

Stable:

```
```xml
<dependency>
<groupId>org.gradoop</groupId>
<artifactId>gradoop-flink</artifactId>
<version>0.4.5</version>
<version>0.5.0</version>
</dependency>
```

Latest nightly build (additional repository is required):
```
```xml
<repositories>
<repository>
<id>oss.sonatype.org-snapshot</id>
@@ -126,16 +83,17 @@ Latest nightly build (additional repository is required):
</repository>
</repositories>
```
```

```xml
<dependency>
<groupId>org.gradoop</groupId>
<artifactId>gradoop-flink</artifactId>
<version>0.5.0-SNAPSHOT</version>
<version>0.6.0-SNAPSHOT</version>
</dependency>

```
In any case you also need Apache Flink (version 1.7.2):
```
```xml
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
@@ -166,21 +124,27 @@ In any case you also need Apache Flink (version 1.7.2):
### gradoop-common

The main contents of that module are the EPGM data model and a corresponding POJO
implementation which is used in Flink&trade;. The persistent representation of the EPGM
implementation which is used in Flink&reg;. The persistent representation of the EPGM
is also contained in gradoop-common and together with its mapping to HBase&trade;.

### gradoop-data-integration

Provides functionalities to support graph data integration.
This includes minimal CSV and JSON importers as well as graph transformation operators
(e.g. connect neighbors or conversion of edges to vertices and vice versa).

### gradoop-accumulo

Input and output formats for reading and writing graph collections from [Apache Accumulo](https://accumulo.apache.org/).
Input and output formats for reading and writing graph collections from [Apache Accumulo&reg;](https://accumulo.apache.org/).

### gradoop-hbase

Input and output formats for reading and writing graph collections from [Apache HBase](https://hbase.apache.org/).
Input and output formats for reading and writing graph collections from [Apache HBase&trade;](https://hbase.apache.org/).

### gradoop-flink

This module contains reference implementations of the EPGM operators. The
EPGM is mapped to Flink&trade; DataSets while the operators are implemented
EPGM is mapped to Flink&reg; DataSets while the operators are implemented
using DataSet transformations. The module also contains implementations of
general graph algorithms (e.g. Label Propagation, Frequent Subgraph Mining)
adapted to be used with the EPGM model.
@@ -192,7 +156,6 @@ Contains example pipelines showing use cases for Gradoop.
* Graph grouping example (build structural aggregates of property graphs)
* Social network examples (composition of multiple operators to analyze social networks graphs)
* Input/Output examples (usage of DataSource and DataSink implementations)
* Benchmarks used for cluster evaluations

### gradoop-checkstyle

@@ -204,8 +167,8 @@ See the [Changelog](https://github.com/dbs-leipzig/gradoop/wiki/Changelog) at th

### Disclaimer

Apache®, Apache Flink&trade;, Flink&trade;, Apache HBase&trade; and HBase&trade;
are either registered trademarks or trademarks of the Apache Software Foundation
Apache&reg;, Apache Accumulo&reg;, Apache Flink, Flink&reg;, Apache HBase&trade; and
HBase&trade; are either registered trademarks or trademarks of the Apache Software Foundation
in the United States and/or other countries.


2 changes: 1 addition & 1 deletion dev-support/gradoop-idea-codestyle.xml
@@ -1,6 +1,6 @@
<code_scheme name="Gradoop">
<option name="CLASS_COUNT_TO_USE_IMPORT_ON_DEMAND" value="10" />
<option name="RIGHT_MARGIN" value="100" />
<option name="RIGHT_MARGIN" value="110" />
<option name="WRAP_WHEN_TYPING_REACHES_RIGHT_MARGIN" value="true" />
<XML>
<option name="XML_LEGACY_SETTINGS_IMPORTED" value="true" />
16 changes: 1 addition & 15 deletions gradoop-checkstyle/pom.xml
@@ -5,7 +5,7 @@
<parent>
<groupId>org.gradoop</groupId>
<artifactId>gradoop-parent</artifactId>
<version>0.4.5</version>
<version>0.5.0</version>
</parent>

<artifactId>gradoop-checkstyle</artifactId>
@@ -64,23 +64,9 @@
<id>javadoc</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<executions>
<execution>
<phase>none</phase>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
<executions>
<execution>
<phase>none</phase>
</execution>
</executions>
</plugin>
</plugins>
</build>
@@ -17,6 +17,9 @@
<suppress checks="IllegalCatch"
files="PrintTableSink"
lines="90-100"/>
<suppress checks="LineLength"
files="SocialNetworkGraph.java"
lines="0-68"/>

<!-- less restrictive checkstyle for tests -->
<suppress checks="JavadocMethod"
11 changes: 5 additions & 6 deletions gradoop-checkstyle/src/main/resources/gradoop/checkstyle.xml
@@ -14,15 +14,16 @@
-->

<!DOCTYPE module PUBLIC
"-//Puppy Crawl//DTD Check Configuration 1.3//EN"
"http://www.puppycrawl.com/dtds/configuration_1_3.dtd">
"-//Checkstyle//DTD Checkstyle Configuration 1.3//EN"
"https://checkstyle.org/dtds/configuration_1_3.dtd">

<module name="checker">
<property name="localeLanguage" value="en"/>
<property name="cacheFile" value="target/checkstyle-cachefile"/>

<!-- Checks for the correct license header in *.java files -->
<module name="Header">
<property name="headerFile" value="${checkstyle.header.file}"/>
<property name="headerFile" value="/gradoop/LICENSE.txt"/>
<property name="fileExtensions" value="java"/>
</module>

@@ -44,8 +45,6 @@
<module name="FileTabCharacter"/>

<module name="TreeWalker">
<property name="cacheFile" value="target/checkstyle-cachefile"/>

<!-- Checks for blocks. -->
<!-- See http://checkstyle.sf.net/config_blocks.html -->
<module name="EmptyBlock">
@@ -201,7 +200,7 @@
<!-- See http://checkstyle.sf.net/config_sizes.html -->
<!-- Lines cannot exceed 100 chars -->
<module name="LineLength">
<property name="max" value="100"/>
<property name="max" value="110"/>
<property name="ignorePattern" value="^[import|package]|@see|@link"/>
</module>
<!-- Over time, we will revised this down -->
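
Review note on the `ignorePattern` value in the hunk above: in a Java regular expression, `[import|package]` is a character class rather than an alternation, so `^[import|package]` matches any line whose first character is one of those letters or `|`. The sketch below is not part of the commit. It compares the pattern as written with a grouped variant. The names `IgnorePatternDemo`, `AS_WRITTEN`, `GROUPED`, and `ignored` are hypothetical, and it assumes the check applies the pattern with `Matcher.find()`.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Gradoop or Checkstyle.
class IgnorePatternDemo {
    // The pattern exactly as it appears in checkstyle.xml.
    static final Pattern AS_WRITTEN =
        Pattern.compile("^[import|package]|@see|@link");

    // A grouped variant that matches the literal words "import" or "package"
    // at the start of the line (an assumption about the intended meaning).
    static final Pattern GROUPED =
        Pattern.compile("^(import|package)|@see|@link");

    // Assumption: a line is exempt from the length limit when the pattern
    // is found anywhere in it (find semantics, not a full match).
    static boolean ignored(Pattern pattern, String line) {
        return pattern.matcher(line).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "import java.util.List;",
            "private int x;",
            "return x;",
            "    return x;",
            " * @see Foo"
        };
        for (String line : samples) {
            System.out.printf("%-26s asWritten=%-5b grouped=%b%n",
                "\"" + line + "\"",
                ignored(AS_WRITTEN, line),
                ignored(GROUPED, line));
        }
    }
}
```

Under these assumptions, unindented lines such as `private int x;` are exempt with the pattern as written but not with the grouped variant. Indented lines behave the same under both, which may be why the difference rarely shows up in practice.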
13 changes: 13 additions & 0 deletions gradoop-common/gradoop-common-testng.xml
@@ -0,0 +1,13 @@
<!DOCTYPE suite SYSTEM "http://testng.org/testng-1.0.dtd" >

<suite name="Gradoop Common Suite" verbose="1" >
<test name="Gradoop Common Tests">
<packages>
<package name="org.gradoop.common.model.impl.properties"/>
<package name="org.gradoop.common.util"/>
<package name="org.gradoop.common.model.impl.pojo"/>
<package name="org.gradoop.common.model.impl.metadata"/>
<package name="org.gradoop.common.model.impl.id"/>
</packages>
</test>
</suite>
25 changes: 18 additions & 7 deletions gradoop-common/pom.xml
@@ -6,7 +6,7 @@
<parent>
<groupId>org.gradoop</groupId>
<artifactId>gradoop-parent</artifactId>
<version>0.4.5</version>
<version>0.5.0</version>
</parent>

<artifactId>gradoop-common</artifactId>
@@ -66,10 +66,6 @@
<id>javadoc</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
@@ -90,8 +86,8 @@
<artifactId>maven-checkstyle-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>findbugs-maven-plugin</artifactId>
<groupId>com.github.spotbugs</groupId>
<artifactId>spotbugs-maven-plugin</artifactId>
</plugin>
<!-- Creates an extra *-tests.jar which can be used as dependency -->
<plugin>
@@ -109,6 +105,15 @@
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<suiteXmlFiles>
<suiteXmlFile>gradoop-common-testng.xml</suiteXmlFile>
</suiteXmlFiles>
</configuration>
</plugin>
</plugins>
</build>

@@ -162,6 +167,12 @@
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>