pig fails to use lzo as compression for temp files #115

jefimm · 2016-05-30T17:28:42Z

The following setup fails with Hadoop 2.7.2 and Pig 0.15.0 (Google Cloud Dataproc).
The same job completes fine without LZO compression for temp files and fails when LZO compression for temp files is enabled (pig.tmpfilecompression=true, pig.tmpfilecompression.codec=lzo).
Setup on all nodes during startup:

sudo apt-get install liblzo2-dev
sudo ln -s /lib/x86_64-linux-gnu/liblzo2.so.2 /usr/lib/hadoop/lib/native/

Copied hadoop-lzo-0.4.20-SNAPSHOT.jar to /usr/lib/hadoop-mapreduce/

Edited core-site.xml and added:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
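To rule out a node where the core-site.xml edit did not land, a small check like the following could be run against each node's file. This is only a sketch: the helper names `read_properties` and `missing_lzo_codecs` are hypothetical, and it only confirms that the two LZO codec classes from the snippet above are listed in io.compression.codecs. It does not verify that the native library or the jar actually loads.

```python
import xml.etree.ElementTree as ET

# Codec classes that the core-site.xml snippet in this issue registers for LZO.
REQUIRED_LZO_CODECS = {
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> element in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_lzo_codecs(xml_text):
    """Return the required LZO codec classes absent from io.compression.codecs."""
    value = read_properties(xml_text).get("io.compression.codecs", "")
    listed = {c.strip() for c in value.split(",") if c.strip()}
    return sorted(REQUIRED_LZO_CODECS - listed)
```

Passing the contents of a node's core-site.xml (its location depends on the distribution) to `missing_lzo_codecs` returns an empty list when both classes are registered, and otherwise the names that are missing.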

Error: java.lang.RuntimeException: java.io.IOException: Not a valid BCFile.
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.init(WeightedRangePartitioner.java:155)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:75)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPartition(WeightedRangePartitioner.java:58)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:715)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:135)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:281)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:274)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Not a valid BCFile.
    at org.apache.hadoop.io.file.tfile.BCFile$Magic.readAndVerify(BCFile.java:927)
    at org.apache.hadoop.io.file.tfile.BCFile$Reader.<init>(BCFile.java:628)
    at org.apache.hadoop.io.file.tfile.TFile$Reader.<init>(TFile.java:804)
    at org.apache.pig.impl.io.TFileRecordReader.initialize(TFileRecordReader.java:64)
    at org.apache.pig.impl.io.ReadToEndLoader.initializeReader(ReadToEndLoader.java:212)
    at org.apache.pig.impl.io.ReadToEndLoader.getNextHelper(ReadToEndLoader.java:250)
    at org.apache.pig.impl.io.ReadToEndLoader.getNext(ReadToEndLoader.java:231)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.init(WeightedRangePartitioner.java:129)
    ... 17 more

jrottinghuis · 2016-05-31T14:51:11Z

Is that a Pig issue, or a problem with hadoop-lzo?

jefimm · 2016-05-31T15:02:58Z

I really don't know. The problem is that enabling temp file compression in Pig with lzo as the codec does not work.
