Scala 2.13 support #14203

Open
thirstler opened this issue Mar 12, 2024 · 4 comments
@thirstler

Description

Scala 2.12 seems to be the de facto standard across a lot of Spark packages, but I'm using packages that require Spark built with Scala 2.13 specifically, and that is preventing me from working with Spark NLP (also, my understanding is that the Spark project will be deprecating Scala 2.12 in 3.6). Is a Spark NLP build using Scala 2.13 possible?

Preferred Solution

A Spark package published for Scala 2.13, e.g.:
conf.set("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.13:5.3.1")
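For context, a minimal sketch of what the requested usage would look like from Scala. The spark-nlp_2.13 artifact does not exist as of this thread, so the coordinate below is the hypothetical one the issue is asking for:

// Sketch only: spark-nlp_2.13 is the artifact being requested,
// not one that is currently published.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("spark-nlp-scala-2.13")
  .master("local[*]")
  // Ask Spark to resolve and add the (hypothetical) Scala 2.13 build of Spark NLP.
  .config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.13:5.3.1")
  .getOrCreate()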

Additional Context

None

@maziyarpanahi
Member

Unfortunately, we are going to wait until Apache Spark introduces support for Scala 3.x. Moving Spark NLP to any new major Scala version means re-doing all of our saved pipelines and 99% of our models (they contain serialized Java objects; if they were saved under one major Scala version, they simply cannot be used in another).

We tried to find solutions for this, but like many other libraries built natively on top of Apache Spark, we also suffer from saved models not being compatible with newer versions of Scala.
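As an aside, here is a minimal, self-contained sketch (not Spark NLP code) of the incompatibility described above: models saved with Java serialization capture the concrete Scala runtime classes that backed them, and those classes differ between Scala major versions.

import java.io._

object SerializationSketch {
  def main(args: Array[String]): Unit = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    // Java serialization records the concrete classes backing List, which
    // differ between Scala 2.12, 2.13, and 3.
    out.writeObject(List("token", "embedding"))
    out.close()

    // Reading it back under the SAME Scala version works fine...
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    println(in.readObject())
    // ...but a runtime on a different Scala major version can fail here with
    // InvalidClassException or ClassNotFoundException, which is why every
    // saved pipeline would need to be re-exported after a Scala upgrade.
  }
}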

That said, if we have to move to a newer version of Scala, given that Scala 3.0 has been out for a while, we would rather wait until we can do this once for Scala 3.x support. (I believe they will introduce 3.x support before deprecating 2.12.)


This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions github-actions bot added the Stale label Sep 10, 2024
@SemyonSinchenko

I believe before deprecating 2.12, they will introduce 3.x support

As far as I understand, Spark 4.0 works only with Scala 2.13: https://issues.apache.org/jira/browse/SPARK-45314

@maziyarpanahi
Member

I believe before deprecating 2.12, they will introduce 3.x support

As far as I understand, Spark 4.0 works only with Scala 2.13: https://issues.apache.org/jira/browse/SPARK-45314

Yes, and we are actually waiting for that release. Once it's out:

  • we will support Scala 3.x (not 2.13)
  • we will most definitely retire Java 8 in favor of Java 11 as the default (no cloud provider, including Databricks, launches Spark instances with Java 11 or higher yet; we can't do this until they do). A generic build sketch of such a setup follows below.
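For reference, a hypothetical sbt sketch (not the actual Spark NLP build definition) of what a move to Scala 3 with a Java 11 baseline might look like:

// build.sbt -- hypothetical, for illustration only.
ThisBuild / scalaVersion       := "3.3.3"
// Cross-build while downstream users migrate off Scala 2.12.
ThisBuild / crossScalaVersions := Seq("2.12.18", "2.13.14", "3.3.3")
// Emit Java 11 bytecode so the artifacts no longer run on Java 8.
ThisBuild / javacOptions      ++= Seq("--release", "11")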

@github-actions github-actions bot removed the Stale label Sep 11, 2024