Job aborted due to stage failure while reading a simple Text File from HDFS #49
Comments
As additional information, I ran the same test while connected directly to the spark-master container and it worked well: `scala> val textFile = sc.textFile("/user/root/vannbehandlingsanlegg.csv")` followed by `scala> textFile.count`. The issue is probably in the Spark Notebook configuration.
Hi @radianv, sorry for the late reply. I had a lot of issues with Spark Notebook and switched to Apache Zeppelin in the end. The issue you had is most likely a Spark version mismatch between Spark Notebook and the Spark master.
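One rough way to test the version-mismatch theory is to compare the Spark version the notebook driver reports (`sc.version`) with the version shown by the Spark master. The sketch below is only a heuristic helper: `VersionCheck`, `majorMinor`, and `compatible` are hypothetical names, not part of Spark or Spark Notebook, and matching major.minor numbers do not guarantee compatibility (the Scala binary version can differ as well).

```scala
// Hypothetical helper for illustration; not part of Spark or Spark Notebook.
object VersionCheck {

  /** Extracts (major, minor) from a version string such as "2.1.0".
    * Returns None when the first two components are not plain numbers. */
  def majorMinor(version: String): Option[(Int, Int)] =
    version.trim.split("[.\\-]").toList match {
      case maj :: min :: _
          if maj.nonEmpty && min.nonEmpty &&
            maj.forall(_.isDigit) && min.forall(_.isDigit) =>
        Some((maj.toInt, min.toInt))
      case _ => None
    }

  /** Heuristic: treat two versions as compatible only when both parse
    * and their major.minor components are identical. */
  def compatible(driver: String, cluster: String): Boolean =
    (majorMinor(driver), majorMinor(cluster)) match {
      case (Some(a), Some(b)) => a == b
      case _                  => false
    }
}
```

In a notebook cell or spark-shell session, this could be called as `VersionCheck.compatible(sc.version, masterVersion)`, where `masterVersion` is a placeholder for whatever version string the Spark master reports. The same comparison can be applied to `scala.util.Properties.versionNumberString` on both sides to check the Scala version.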
I have the same issue! Any solution?
This is also an error inside spark-master container for From the abundance of errors in the issues related to HDFS and nodes/workers, it seems like something in the configuration is definitely missing. It is also worth noting that the walk-through blog steps do not work: https://www.big-data-europe.eu/scalable-sparkhdfs-workbench-using-docker/ Can anyone successfully complete the steps in this blog post?
I am working with Spark Notebook, following the Scalable Spark/HDFS Workbench using Docker walk-through:
val textFile = sc.textFile("/user/root/vannbehandlingsanlegg.csv")
textFile: org.apache.spark.rdd.RDD[String] = /user/root/vannbehandlingsanlegg.csv MapPartitionsRDD[1] at textFile at <console>:67
Counting the lines should show the execution time and the number of lines in the CSV file, but I got the following error:
`cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD`
I have been searching and saw that it could be related to executor dependencies. Any idea?