Environment Setup

Prerequisites

Install Vagrant: http://www.vagrantup.com
Install VirtualBox: https://www.virtualbox.org/wiki/Downloads
Download XDATA VM (xdata-0.1.box): http://sotera.github.io/xdata-vm/
Note: Use Internet Explorer or Firefox for this download. In Chrome, the download fails at the end.

If you are installing or setting up on your own machine instead, see the XDATA VM wiki for the baseline software.

Load VM into Vagrant

Note: The current version is 0.1.

Add the XDATA VM box definition to Vagrant.

    $ vagrant box add xdata-vm-[version] [path_to_file]\xdata-0.1.box

Create a location for hosting your VM files.

    $ mkdir [path_to_virtual_machines_home]\xdata-vm-[version]

Initialize a new VM based on the XDATA VM box configuration.

    $ cd [path_to_virtual_machines_home]\xdata-vm-[version]
    $ vagrant init xdata-vm-[version]

Install Project Components

Note: The xdata-vm-0.2.box release should have these components pre-loaded.

Note: The xdata-vm-0.2.1.box release should also have Track Communities pre-installed.

Option A: Use Vagrant Provisioning

This project contains a folder called 'vagrant'. Copy the contents of that folder into the directory where you ran `vagrant init` (above).
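The copy step can be sketched as a small shell function. This is a minimal sketch, assuming a Unix-like shell on the host; the function name `copy_provisioning` and both path arguments are hypothetical and should be replaced with your own locations.

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of the project's 'vagrant' folder
# into the directory where `vagrant init` was run.
copy_provisioning() {
    src_dir="$1"   # placeholder: the project's 'vagrant' folder
    vm_dir="$2"    # placeholder: the VM directory created earlier

    if [ ! -d "$src_dir" ]; then
        echo "source folder not found: $src_dir" >&2
        return 1
    fi
    mkdir -p "$vm_dir" || return 1
    # The trailing '/.' copies the folder's contents (including dotfiles),
    # not the folder itself.
    cp -R "$src_dir/." "$vm_dir/"
}
```

Call it with the source folder first and the VM directory second, for example `copy_provisioning ./vagrant [path_to_virtual_machines_home]/xdata-vm-[version]`.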

Option B: Use Manual Installation

SSH into the VM as bigdata/bigdata and execute the following series of commands:

Install Aggregate Micro Paths:

    $ cd /srv/software/
    $ git clone https://github.com/Sotera/aggregate-micro-paths.git

Additional Configurations

Start your virtual machine.

    $ vagrant up

SSH into the VM as bigdata/bigdata, then edit the following configuration file and add the properties shown below it. These settings cap the task JVM heap size and the number of concurrent map and reduce tasks, which should help protect a single-VM setup from memory and task-scheduling problems in later steps.

    $ sudo vi /etc/hadoop/conf/mapred-site.xml
    
    <property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx1024m</value>
    </property>

    <property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>3</value>
    </property>
    
    <property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>3</value>
    </property>
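
After saving the file, it can be worth confirming that a property was written as intended. The following is a minimal sketch, not part of the project: `get_property` is a hypothetical helper, and it assumes each `<name>` and `<value>` element sits on its own line, as in the snippet above. It also relies on `grep -A`, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: print the <value> that directly follows a given
# <name> in a Hadoop-style XML configuration file.
# Assumes <name> and <value> are on consecutive lines.
get_property() {
    conf_file="$1"
    prop_name="$2"
    # -F treats the property name as a literal string, so the dots in it
    # are not interpreted as regex wildcards.
    grep -F -A 1 "<name>$prop_name</name>" "$conf_file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

For example, `get_property /etc/hadoop/conf/mapred-site.xml mapred.child.java.opts` should print the heap setting if the edit was saved. An empty result suggests the property is missing or laid out differently.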

Stop your virtual machine.

    $ vagrant halt

Testing the System

Start your virtual machine.

    $ vagrant up

SSH into the VM as bigdata/bigdata, then run the following commands to confirm that the system is configured correctly:

    $ hadoop fs -ls /
    $ hive -e "show tables"
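
If you prefer a pass/fail summary over reading each command's raw output, the checks can be wrapped in a small helper. This is a sketch under the assumption that a zero exit status means the check passed; `run_check` is a hypothetical name, not something the VM provides.

```shell
#!/bin/sh
# Hypothetical helper: run a command quietly and report whether it
# exited successfully. Returns 1 if the command failed.
run_check() {
    label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        return 1
    fi
}
```

Inside the VM this could be used as `run_check "HDFS listing" hadoop fs -ls /` and `run_check "Hive tables" hive -e "show tables"`. Because the helper discards the command's output, rerun a failing command directly to see its error message.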

Stop your virtual machine.

    $ vagrant halt

Additional Resources

XDATA VM Wiki
