Hi, My s are throwing errors when running on clusters:

Error When Calling toPandas() (spark-sklearn, closed)

sdjksdafji commented on June 1, 2024

Error When Calling toPandas()

from spark-sklearn.

Comments (6)

xhengstb commented on June 1, 2024

I hit this error frequently. My workarounds:
1) Write a retry function and call it 2-3 times. When it fails the first time, it has ALWAYS worked on the second try.
2) Increase your driver memory.
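
The retry workaround can be sketched roughly as below. This is a minimal sketch, not the commenter's actual code: the helper name `retry`, its parameters, and the one-second delay are all assumptions made for illustration.

```python
import time


def retry(func, attempts=3, delay_seconds=1.0, exceptions=(Exception,)):
    """Call func() until it succeeds or the attempts run out (hypothetical helper)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            time.sleep(delay_seconds)  # brief pause before the next try


# Assumed usage, where `df` is an existing Spark DataFrame:
#   pandas_df = retry(lambda: df.toPandas(), attempts=3)
```

Retrying only hides the failure; if the second attempt also fails regularly, the other suggestions in this thread are worth trying first.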

schulbe commented on June 1, 2024

Thanks! This was helpful. Retry function works fine for some reason...

SemanticBeeng commented on June 1, 2024

There is a hardcoded timeout value here:

https://github.com/apache/spark/blob/628c7b517969c4a7ccb26ea67ab3dd61266073ca/core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala#L403

  def serveIterator(items: Iterator[_], threadName: String): Array[Any] = {
    val serverSocket = new ServerSocket(0, 1, InetAddress.getByName("localhost"))
    // Close the socket if no connection in 15 seconds
    serverSocket.setSoTimeout(15000)
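
To illustrate what such an accept timeout means in practice, here is a small Python analogue (a sketch, not Spark's code): a listening socket with a timeout gives up if no client connects in time. The function name `accept_with_timeout` is hypothetical, and the short timeout in the usage comment is chosen only so the example finishes quickly.

```python
import socket


def accept_with_timeout(timeout_seconds):
    """Listen on an ephemeral localhost port; return True if a client connected in time."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        server.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
        server.listen(1)
        server.settimeout(timeout_seconds)  # applies to accept() on this socket
        try:
            conn, _addr = server.accept()
        except socket.timeout:
            return False                    # nobody connected before the deadline
        conn.close()
        return True
    finally:
        server.close()


# Assumed usage: nothing connects here, so the call gives up after about 0.1 s.
#   accept_with_timeout(0.1)
```

If the Python side is slow to connect back (for example on a heavily loaded driver), a fixed deadline like this would expire, which may be why a second attempt often succeeds.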

zouqian2468 commented on June 1, 2024

Enable Arrow-based columnar data transfers

spark.conf.set("spark.sql.execution.arrow.enabled", "true")
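
A hedged sketch of applying this setting across Spark versions: in Spark 3.x the key above is deprecated in favor of `spark.sql.execution.arrow.pyspark.enabled`. The helper names `arrow_config_key` and `enable_arrow` are hypothetical, and the code assumes `spark` is an existing SparkSession and that pyarrow is installed on the driver.

```python
def arrow_config_key(spark_version):
    """Pick the Arrow toggle key for a version string such as '2.4.5' or '3.1.2'."""
    major = int(spark_version.split(".")[0])
    if major >= 3:
        return "spark.sql.execution.arrow.pyspark.enabled"
    return "spark.sql.execution.arrow.enabled"


def enable_arrow(spark):
    """Enable Arrow-based transfers on a session-like object and return the key used."""
    key = arrow_config_key(spark.version)
    spark.conf.set(key, "true")
    return key


# Assumed usage, with `spark` an existing SparkSession and `df` a Spark DataFrame:
#   enable_arrow(spark)
#   pandas_df = df.toPandas()
```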

jasstionzyf commented on June 1, 2024

I solved it by changing the rdd.py file. In the line for res in socket.getaddrinfo("localhost", port, socket.AF_UNSPEC, socket.SOCK_STREAM): I changed localhost to my IP, and it works.
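
Before editing rdd.py, it may help to check what "localhost" resolves to on the driver machine. The following is a diagnostic sketch only; the function name `resolve_localhost` is hypothetical, and it makes the same `socket.getaddrinfo` call quoted above.

```python
import socket


def resolve_localhost(port=0):
    """Return (address family, address) pairs that 'localhost' resolves to."""
    try:
        results = socket.getaddrinfo(
            "localhost", port, socket.AF_UNSPEC, socket.SOCK_STREAM
        )
    except socket.gaierror:
        return []  # resolution failed entirely, e.g. a broken hosts file
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return [(family, sockaddr[0]) for family, _type, _proto, _canon, sockaddr in results]


# Assumed usage:
#   for family, address in resolve_localhost():
#       print(family, address)
```

An empty list, or an address other than the loopback address, would suggest a name-resolution problem on the machine rather than a Spark bug.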

aytida commented on June 1, 2024

I faced a similar issue. Nothing worked until I changed the database completely. I don't understand why, but it seems the PySpark DataFrame created from SQL Server threw this error while the one created from Redshift did not.
