
How to accelerate inference speed with LightPipeline? #13921


Hi,

  • Here is an article about LightPipelines: https://medium.com/spark-nlp/spark-nlp-101-lightpipeline-a544e93f20f1
  • You are still using `.transform()`, which operates on a DataFrame.
  • You are still passing DataFrame data. To avoid the DataFrame latency with a LightPipeline, pass a plain string or a list of strings instead.
  • You are using `.collect()`, which is a bad practice: it pulls all the data into the driver's memory.
from sparknlp.base import LightPipeline

# Fit the pipeline on a DataFrame as usual...
data = spark.createDataFrame(points).toDF("text")
model = pipeline.fit(data)

# ...then wrap the fitted PipelineModel in a LightPipeline.
light_model = LightPipeline(pipelineModel=model, parse_embeddings=False)

# annotate() accepts a plain string (or a list of strings), no DataFrame needed.
result = light_model.annotate("Here is a text that must be summarized ....")

# now result is a dict you can a…
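For a single input string, `annotate` returns a plain Python dict that maps each output column name to a list of annotation strings, so you can consume it without touching Spark at all. Here is a minimal sketch of working with such a result; the keys `document` and `summary` are hypothetical examples, since the actual keys depend on the output columns of your pipeline's stages:

```python
# Sketch of consuming a LightPipeline.annotate() result.
# The keys "document" and "summary" are hypothetical; your pipeline's
# output column names determine the actual keys.
result = {
    "document": ["Here is a text that must be summarized ...."],
    "summary": ["A short summary of the input text."],
}

# Each value is a list of annotation strings, so join them for display.
summary_text = " ".join(result.get("summary", []))
print(summary_text)
```

If you pass a list of strings to `annotate` instead, you get back a list of such dicts, one per input.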

Answer selected by AayushSameerShah