
lzo with gradle #130

Open
Arnold1 opened this issue Oct 8, 2017 · 0 comments
Arnold1 commented Oct 8, 2017

Hi,

I use Apache Spark and Scala, and build my app with Gradle. For testing I need the LZO codec installed.

I keep running into the following error. Any ideas? The error is not directly related to this repo, but I would still like to know how to use the LZO codec in my tests.

I followed these steps:
https://gist.github.com/zedar/c43cbc7ff7f98abee885

Here is how I edited my build.gradle:

repositories {
  mavenLocal()
  mavenCentral()
  maven { url "https://repository.cloudera.com/artifactory/cloudera-repos/" }

  maven {
    url "http://maven.twttr.com/"
  }
}

dependencies {
  compile "org.scala-lang:scala-library:$versions.scala_full"
  compile "org.scala-lang:scala-compiler:$versions.scala_full"
  compile "ch.qos.logback:logback-classic:$versions.logback"
  compile "ch.qos.logback:logback-core:$versions.logback"
  compile "com.typesafe.scala-logging:scala-logging_$versions.scala:$versions.scala_logging"
  compile "com.github.scopt:scopt_$versions.scala:$versions.scopt"
  compile "org.apache.spark:spark-core_$versions.scala:$versions.spark"
  compile "org.apache.spark:spark-sql_$versions.scala:$versions.spark"
  compile "org.apache.spark:spark-streaming_$versions.scala:$versions.spark"
  compile "org.apache.spark:spark-hive_$versions.scala:$versions.spark"
  compile "org.slf4j:log4j-over-slf4j:$versions.log4j_over_slf4j"
  compile "com.typesafe:config:$versions.typesafe_config"
  testCompile "com.holdenkarau:spark-testing-base_$versions.scala:${versions.spark}_$versions.spark_testing_base"
  testCompile "org.mockito:mockito-core:$versions.mockito"
  scoverage "org.scoverage:scalac-scoverage-plugin_$versions.scala:$versions.scoverage", "org.scoverage:scalac-scoverage-runtime_$versions.scala:$versions.scoverage"
  testRuntime "org.pegdown:pegdown:$versions.pegdown"
  testCompile "org.scalatest:scalatest_$versions.scala:$versions.scalatest"
  testCompile group: 'com.hadoop.gplcompression', name: 'hadoop-lzo', version: '0.4.17'
  testCompile group: 'org.apache.zookeeper', name: 'zookeeper', version: '3.4.10'
}
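For what it's worth, adding the hadoop-lzo jar to the test classpath only makes the Java classes available; the codec usually also has to be registered with the Hadoop configuration that Spark uses, and the native libraries (libgplcompression / liblzo2) must be resolvable via java.library.path. A minimal sketch of what that registration might look like in a test, assuming the standard com.hadoop.compression.lzo.LzopCodec class shipped in hadoop-lzo (the exact setup is my guess, not confirmed in this issue):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: register the LZO codecs with the Hadoop configuration Spark uses.
// Assumes hadoop-lzo is on the test classpath and the native libraries
// (libgplcompression / liblzo2) can be loaded via java.library.path.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("lzo-test")
  .config("spark.hadoop.io.compression.codecs",
    "org.apache.hadoop.io.compress.DefaultCodec," +
      "com.hadoop.compression.lzo.LzoCodec," +
      "com.hadoop.compression.lzo.LzopCodec")
  .getOrCreate()

// Hadoop picks the codec by file extension; files written by lzop end in
// .lzo and are handled by LzopCodec.
val df = spark.read.csv("path/to/data.txt.lzo")
```

Any `spark.hadoop.*` config key is forwarded into the underlying Hadoop Configuration, which is how `io.compression.codecs` reaches the input format.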

It seems spark.read.csv still cannot read my .txt.lzo file. Error:

- Should load from raw data *** FAILED *** (739 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 0.0 failed 1 times, most recent failure: Lost task 3.0 in stage 0.0 (TID 3, localhost, executor driver): java.lang.NumberFormatException: For input string: ":�:h}P~j09��}10827#��)80.02,�w0:45:5"�n+<�79<790127827391"
   at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
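The binary garbage in the input string suggests Spark is reading the compressed bytes verbatim, i.e. no codec was applied to the file at all. A quick sanity check (my own suggestion, not from the issue) is to ask Hadoop's CompressionCodecFactory directly whether it resolves a codec for the file's extension:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.compress.CompressionCodecFactory

// Sanity check: does Hadoop resolve a codec for the .lzo extension?
val conf = new Configuration()
conf.set("io.compression.codecs",
  "com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec")

val codec = new CompressionCodecFactory(conf).getCodec(new Path("data.txt.lzo"))
println(if (codec == null) "no codec found" else codec.getClass.getName)
```

If this prints "no codec found" (or throws ClassNotFoundException), the jar or the configuration is not reaching the test JVM; if it resolves LzopCodec but reading still fails, the native libraries are the likelier culprit.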