Hadoop MapReduce InputFormat/OutputFormat for TFRecords

This directory contains an Apache Hadoop MapReduce InputFormat/OutputFormat implementation for TensorFlow's TFRecords format. It can also be used with Apache Spark.
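For readers unfamiliar with the format, the sketch below shows how a single TFRecords record is framed on disk: a little-endian 64-bit payload length, a masked CRC32C of the length bytes, the payload, and a masked CRC32C of the payload. It is only an illustration of the record layout, not the code used by this package; the class name `TFRecordFraming` and its methods are hypothetical, and it assumes Java 9 or later for `java.util.zip.CRC32C`.

```
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32C;

// Hypothetical helper, for illustration only; not part of tensorflow-hadoop.
final class TFRecordFraming {
    // Constant added after rotating the CRC, as in the TFRecords format.
    private static final int MASK_DELTA = 0xa282ead8;

    // CRC32C of the bytes, rotated right by 15 bits and offset by MASK_DELTA.
    static int maskedCrc32c(byte[] bytes) {
        CRC32C crc = new CRC32C();
        crc.update(bytes, 0, bytes.length);
        int value = (int) crc.getValue();
        return ((value >>> 15) | (value << 17)) + MASK_DELTA;
    }

    // Wraps a payload as: length (8 bytes) | crc(length) | payload | crc(payload).
    static byte[] encode(byte[] payload) {
        byte[] lengthBytes = ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(payload.length)
                .array();
        ByteBuffer out = ByteBuffer.allocate(8 + 4 + payload.length + 4)
                .order(ByteOrder.LITTLE_ENDIAN);
        out.put(lengthBytes);
        out.putInt(maskedCrc32c(lengthBytes));
        out.put(payload);
        out.putInt(maskedCrc32c(payload));
        return out.array();
    }

    // Reads one framed record and returns its payload, verifying both checksums.
    static byte[] decode(byte[] record) {
        ByteBuffer in = ByteBuffer.wrap(record).order(ByteOrder.LITTLE_ENDIAN);
        byte[] lengthBytes = new byte[8];
        in.get(lengthBytes);
        if (in.getInt() != maskedCrc32c(lengthBytes)) {
            throw new IllegalArgumentException("length checksum mismatch");
        }
        long length = ByteBuffer.wrap(lengthBytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong();
        if (length < 0 || length > in.remaining() - 4) {
            throw new IllegalArgumentException("record is truncated");
        }
        byte[] payload = new byte[(int) length];
        in.get(payload);
        if (in.getInt() != maskedCrc32c(payload)) {
            throw new IllegalArgumentException("payload checksum mismatch");
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] record = encode("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decode(record), StandardCharsets.UTF_8));
    }
}
```

A TFRecords file is a plain concatenation of such records, and the payload is typically a serialized `tf.train.Example` protocol buffer.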

Prerequisites

Apache Maven
Tested with Hadoop 2.6.0. Patches are welcome if there are incompatibilities with your Hadoop version.

Breaking changes

08/20/2018 - Reverted artifactId back to org.tensorflow.tensorflow-hadoop
05/29/2018 - Changed the artifactId from org.tensorflow.tensorflow-hadoop to org.tensorflow.hadoop

Build and install

Compile the code
```
mvn clean package
```
Alternatively, if you would like to build jars for a different version of TensorFlow, e.g., 1.5.0:
```
mvn versions:set -DnewVersion=1.5.0
mvn clean package
```

Optionally install (or deploy) the jars

```
mvn install
```

After installation (or deployment), the package can be used with the following dependency:

```
<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>tensorflow-hadoop</artifactId>
  <version>1.10.0</version>
</dependency>
```

Use with MapReduce

The Hadoop MapReduce example can be found here.

Use with Apache Spark

The Spark-TensorFlow-Connector uses TensorFlow Hadoop to load and save TensorFlow's TFRecords format using Spark DataFrames.
