
[SPARK-51927][SPARK-52213][BUILD] Upgrade jackson to 2.19.0 and upgrade kubernetes-client to version 7.3.0 #50730


Closed
wants to merge 5 commits

Conversation

LuciferYang
Contributor

@LuciferYang LuciferYang commented Apr 27, 2025

What changes were proposed in this pull request?

The primary objective of this PR is to upgrade Jackson from 2.18.2 to 2.19.0, and to upgrade kubernetes-client from 7.2.0 to 7.3.0 at the same time to ensure compatibility with Jackson 2.19.0.
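As a quick sanity check (not part of the PR itself), an upgrade like this should leave every com.fasterxml.jackson artifact resolving to a single version. A minimal sketch, using a made-up manifest in place of Spark's real dependency lists:

```shell
# Stand-in dependency manifest; the format loosely mimics a resolved-deps list
# (illustrative only, not Spark's actual dev/deps file format).
cat > /tmp/deps.txt <<'EOF'
jackson-annotations/2.19.0//jackson-annotations-2.19.0.jar
jackson-core/2.19.0//jackson-core-2.19.0.jar
jackson-databind/2.19.0//jackson-databind-2.19.0.jar
EOF
# A healthy upgrade yields exactly one distinct version across Jackson artifacts.
cut -d/ -f2 /tmp/deps.txt | sort -u   # prints: 2.19.0
```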

Why are the changes needed?

The new version of Jackson brings several bug fixes.

The full release notes are as follows:

kubernetes-client 7.3.0 was released solely to ensure compatibility with Jackson 2.19.0:

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass GitHub Actions
  • Manual check:
build/sbt clean package -Phive
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
Running PySpark tests. Output is in /Users/yangjie01/SourceCode/git/spark-sbt/python/unit-tests.log
Will test against the following Python executables: ['python3.11']
Will test the following Python tests: ['pyspark.sql.tests.arrow.test_arrow_python_udf']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.12
Starting test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (temp output: /Users/yangjie01/SourceCode/git/spark-sbt/python/target/83c10dc1-64a2-4b7d-80b4-4977dadd26fa/python3.11__pyspark.sql.tests.arrow.test_arrow_python_udf___gzv2b5u.log)
Finished test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (62s) ... 8 tests were skipped
Tests passed in 62 seconds

Skipped tests in pyspark.sql.tests.arrow.test_arrow_python_udf with python3.11:
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft April 27, 2025 07:14
@github-actions github-actions bot added the BUILD label Apr 27, 2025
@LuciferYang
Contributor Author

Test first

Member

@dongjoon-hyun dongjoon-hyun left a comment


Thank you for testing.

@LuciferYang
Contributor Author

[info] - Start pod creation from template *** FAILED *** (4 seconds, 669 milliseconds)
[info]   java.lang.AssertionError: assertion failed: Failed to execute -- /home/runner/work/spark/spark/bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi --master k8s://https://192.168.49.2:8443/ --conf spark.master=k8s://https://192.168.49.2:8443/ --conf spark.kubernetes.executor.request.cores=0.2 --conf spark.testing=false --conf spark.kubernetes.executor.podTemplateFile=/home/runner/work/spark/spark/resource-managers/kubernetes/integration-tests/target/scala-2.13/test-classes/executor-template.yml --conf spark.kubernetes.submission.waitAppCompletion=false --conf spark.app.name=spark-test-app --conf spark.executor.cores=1 --conf spark.authenticate=true --conf spark.kubernetes.namespace=spark-b70a830a8e104ceca3d01b6beb08bd8c --conf spark.kubernetes.driver.request.cores=0.2 --conf spark.kubernetes.container.image=docker.io/kubespark/spark:dev --conf spark.kubernetes.authenticate.driver.serviceAccountName=default --conf spark.kubernetes.executor.label.spark-app-locator=5d2f15d22423482fae1ba563d4631426 --conf spark.kubernetes.driver.pod.name=spark-test-app-abcd9c097a204aaea9b7ef8111e34db6-driver --conf spark.kubernetes.driver.podTemplateFile=/home/runner/work/spark/spark/resource-managers/kubernetes/integration-tests/target/scala-2.13/test-classes/driver-template.yml --conf spark.executor.instances=1 --conf spark.kubernetes.driver.label.spark-app-locator=5d2f15d22423482fae1ba563d4631426 --conf spark.ui.enabled=true local:///opt/spark/examples/jars/spark-examples_2.13-4.1.0-SNAPSHOT.jar --
[info] WARNING: Using incubator modules: jdk.incubator.vector
[info] Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
[info] 	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:109)
[info] 	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:102)
[info] 	at io.fabric8.kubernetes.client.utils.KubernetesSerialization.asJson(KubernetesSerialization.java:174)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:338)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:754)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:98)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1155)
[info] 	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:98)
[info] 	at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:154)
[info] 	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6(KubernetesClientApplication.scala:258)
[info] 	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6$adapted(KubernetesClientApplication.scala:252)
[info] 	at org.apache.spark.util.SparkErrorUtils.tryWithResource(SparkErrorUtils.scala:48)
[info] 	at org.apache.spark.util.SparkErrorUtils.tryWithResource$(SparkErrorUtils.scala:46)
[info] 	at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:100)
[info] 	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:252)
[info] 	at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:225)
[info] 	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1027)
[info] 	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:204)
[info] 	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:227)
[info] 	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:96)
[info] 	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1132)
[info] 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1141)
[info] 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
[info] Caused by: com.fasterxml.jackson.databind.JsonMappingException: Cannot invoke "com.fasterxml.jackson.databind.JsonSerializer.serialize(Object, com.fasterxml.jackson.core.JsonGenerator, com.fasterxml.jackson.databind.SerializerProvider)" because "keySerializer" is null (through reference chain: io.fabric8.kubernetes.api.model.Pod["getAdditionalProperties"]->java.util.LinkedHashMap["Kind"])
[info] 	at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:400)
[info] 	at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:359)
[info] 	at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:324)
[info] 	at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:810)
[info] 	at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeWithoutTypeInfo(MapSerializer.java:763)
[info] 	at com.fasterxml.jackson.databind.ser.AnyGetterWriter.getAndSerialize(AnyGetterWriter.java:81)
[info] 	at com.fasterxml.jackson.databind.ser.AnyGetterWriter.serializeAsField(AnyGetterWriter.java:89)
[info] 	at io.fabric8.kubernetes.model.jackson.BeanPropertyWriterDelegate.serializeAsField(BeanPropertyWriterDelegate.java:68)
[info] 	at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:760)
[info] 	at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:183)
[info] 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:503)
[info] 	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:342)
[info] 	at com.fasterxml.jackson.databind.ObjectMapper._writeValueAndClose(ObjectMapper.java:4859)
[info] 	at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:4079)
[info] 	at io.fabric8.kubernetes.client.utils.KubernetesSerialization.asJson(KubernetesSerialization.java:172)
[info] 	... 21 more
[info] Caused by: java.lang.NullPointerException: Cannot invoke "com.fasterxml.jackson.databind.JsonSerializer.serialize(Object, com.fasterxml.jackson.core.JsonGenerator, com.fasterxml.jackson.databind.SerializerProvider)" because "keySerializer" is null
[info] 	at com.fasterxml.jackson.databind.ser.std.MapSerializer.serializeFields(MapSerializer.java:796)
[info] 	... 32 more
[info]   at scala.Predef$.assert(Predef.scala:279)
[info]   at org.apache.spark.deploy.k8s.integrationtest.ProcessUtils$.executeProcess(ProcessUtils.scala:54)
[info]   at org.apache.spark.deploy.k8s.integrationtest.SparkAppLauncher$.launch(KubernetesTestComponents.scala:137)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.runSparkApplicationAndVerifyCompletion(KubernetesSuite.scala:469)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.runSparkPiAndVerifyCompletion(KubernetesSuite.scala:240)
[info]   at org.apache.spark.deploy.k8s.integrationtest.PodTemplateSuite.$anonfun$$init$$1(PodTemplateSuite.scala:33)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   at org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
[info]   at org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
[info]   at org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
[info]   at org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
[info]   at org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:155)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
[info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:227)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:69)
[info]   at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.org$scalatest$BeforeAndAfter$$super$runTest(KubernetesSuite.scala:45)
[info]   at org.scalatest.BeforeAndAfter.runTest(BeforeAndAfter.scala:213)
[info]   at org.scalatest.BeforeAndAfter.runTest$(BeforeAndAfter.scala:203)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.runTest(KubernetesSuite.scala:45)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:334)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Suite.run(Suite.scala:1114)
[info]   at org.scalatest.Suite.run$(Suite.scala:1096)
[info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
[info]   at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:69)
[info]   at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
[info]   at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info]   at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.org$scalatest$BeforeAndAfter$$super$run(KubernetesSuite.scala:45)
[info]   at org.scalatest.BeforeAndAfter.run(BeforeAndAfter.scala:273)
[info]   at org.scalatest.BeforeAndAfter.run$(BeforeAndAfter.scala:271)
[info]   at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.run(KubernetesSuite.scala:45)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
[info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
[info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[info]   at java.base/java.lang.Thread.run(Thread.java:840)

It appears there may be a compatibility issue with the kubernetes-client. I need to verify this further.
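When a fabric8 serialization error like the one above shows up, one quick check is which kubernetes-client and Jackson jars actually coexist on the runtime classpath. A hedged sketch with an illustrative jar directory (the path and jar set are assumptions, not Spark's real layout):

```shell
# Illustrative jar directory: kubernetes-client 7.2.0 sitting alongside
# Jackson 2.19.0, i.e. the mismatched pairing behind the failure above.
mkdir -p /tmp/spark-jars
touch /tmp/spark-jars/kubernetes-client-7.2.0.jar
touch /tmp/spark-jars/jackson-databind-2.19.0.jar
# List the relevant jars; mismatched versions stand out immediately.
ls /tmp/spark-jars | grep -E 'kubernetes-client|jackson' | sort
```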

@LuciferYang
Contributor Author

LuciferYang commented Apr 27, 2025

@LuciferYang
Contributor Author

Tested Jackson 2.19.0 with kubernetes-client 7.3.0:

https://github.com/fabric8io/kubernetes-client/releases/tag/v7.3.0


@LuciferYang LuciferYang changed the title [SPARK-51927][BUILD] Upgrade jackson to 2.19.0 [SPARK-51927][SPARK-52213][BUILD] Upgrade jackson to 2.19.0 and upgrade kubernetes-client to version 7.3.0 May 19, 2025
@LuciferYang LuciferYang marked this pull request as ready for review May 19, 2025 12:26
@LuciferYang
Contributor Author

cc @dongjoon-hyun. I'm not sure whether waiting a while would lead to a better solution.

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM. Thank you, @LuciferYang .

@dongjoon-hyun
Member

Merged to master for Apache Spark 4.1.0.

@HyukjinKwon
Member

Hmmm .. this actually breaks the SBT build with PySpark tests for some reason:

build/sbt clean package
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf'
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.NoSuchMethodError: 'com.fasterxml.jackson.annotation.OptBoolean com.fasterxml.jackson.annotation.JsonProperty.isRequired()' [in thread "Executor task launch worker for task 1.0 in stage 3.0 (TID 4)"]
        at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.hasRequiredMarker(JacksonAnnotationIntrospector.java:466)
        at com.fasterxml.jackson.databind.introspect.POJOPropertyBuilder.getMetadata(POJOPropertyBuilder.java:225)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._anyIndexed(POJOPropertiesCollector.java:1687)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._sortProperties(POJOPropertiesCollector.java:1579)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.collectAll(POJOPropertiesCollector.java:499)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.getPotentialCreators(POJOPropertiesCollector.java:230)
        at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.getPotentialCreators(BasicBeanDescription.java:348)

Let me revert it out for now 🙏

@dongjoon-hyun
Member

Oh, thank you for reporting, @HyukjinKwon .

@LuciferYang
Contributor Author

Thank you @HyukjinKwon

@LuciferYang
Contributor Author

LuciferYang commented May 20, 2025

Hmmm .. this actually breaks the SBT build with PySpark tests for some reason:

build/sbt clean package
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf'
Caused by: java.lang.ExceptionInInitializerError: Exception java.lang.NoSuchMethodError: 'com.fasterxml.jackson.annotation.OptBoolean com.fasterxml.jackson.annotation.JsonProperty.isRequired()' [in thread "Executor task launch worker for task 1.0 in stage 3.0 (TID 4)"]
        at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.hasRequiredMarker(JacksonAnnotationIntrospector.java:466)
        at com.fasterxml.jackson.databind.introspect.POJOPropertyBuilder.getMetadata(POJOPropertyBuilder.java:225)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._anyIndexed(POJOPropertiesCollector.java:1687)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._sortProperties(POJOPropertiesCollector.java:1579)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.collectAll(POJOPropertiesCollector.java:499)
        at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.getPotentialCreators(POJOPropertiesCollector.java:230)
        at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.getPotentialCreators(BasicBeanDescription.java:348)

Let me revert it out for now 🙏

The root cause is that when packaging the spark-protobuf module with sbt's assembly task, redundant artifacts such as jackson-annotations 2.18.1 are bundled in. As a result, multiple Jackson versions end up on the classpath when running the test commands above.
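The failure mode can be illustrated with a hedged sketch (jar names only, created in a scratch directory; the real culprit was a jackson-annotations 2.18.1 bundled into the spark-protobuf sbt assembly, and the NoSuchMethodError above suggests JsonProperty.isRequired() only exists in the 2.19 annotations jar):

```shell
# Mixed Jackson generations on one classpath: databind 2.19 calls
# JsonProperty.isRequired(); if the stray 2.18.1 annotations jar is the one
# loaded, that call fails with the NoSuchMethodError shown above.
mkdir -p /tmp/jackson-cp
touch /tmp/jackson-cp/jackson-annotations-2.18.1.jar   # stray jar from the assembly
touch /tmp/jackson-cp/jackson-core-2.19.0.jar
touch /tmp/jackson-cp/jackson-databind-2.19.0.jar
# A listing with inconsistent versions across Jackson artifacts signals the problem.
ls /tmp/jackson-cp | sort
```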

I'll first fix the spark-protobuf packaging issue so that sbt and Maven produce consistent results. Note that the same test passes even with Jackson 2.19.0 when built with Maven:

build/mvn clean package -DskipTests -Phive 
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables "python3.11"
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables "python3.11"
Running PySpark tests. Output is in /Users/yangjie01/SourceCode/git/spark-maven/python/unit-tests.log
Will test against the following Python executables: ['python3.11']
Will test the following Python tests: ['pyspark.sql.tests.arrow.test_arrow_python_udf']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.12
Starting test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (temp output: /Users/yangjie01/SourceCode/git/spark-maven/python/target/4490ada2-de2e-41a3-b0fb-d5361171cfd5/python3.11__pyspark.sql.tests.arrow.test_arrow_python_udf__w7hkmvmi.log)
Finished test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (138s) ... 6 tests were skipped
Tests passed in 138 seconds

Skipped tests in pyspark.sql.tests.arrow.test_arrow_python_udf with python3.11:
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)

@LuciferYang LuciferYang reopened this May 20, 2025
@LuciferYang LuciferYang marked this pull request as draft May 20, 2025 13:03
@LuciferYang
Contributor Author

@HyukjinKwon I rebased this PR and manually re-checked the cases you provided. The previous errors no longer occur. Are there any other scenarios that require manual verification?

build/sbt clean package -Phive
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
python/run-tests --testnames 'pyspark.sql.tests.arrow.test_arrow_python_udf' --python-executables=python3.11
Running PySpark tests. Output is in /Users/yangjie01/SourceCode/git/spark-sbt/python/unit-tests.log
Will test against the following Python executables: ['python3.11']
Will test the following Python tests: ['pyspark.sql.tests.arrow.test_arrow_python_udf']
python3.11 python_implementation is CPython
python3.11 version is: Python 3.11.12
Starting test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (temp output: /Users/yangjie01/SourceCode/git/spark-sbt/python/target/83c10dc1-64a2-4b7d-80b4-4977dadd26fa/python3.11__pyspark.sql.tests.arrow.test_arrow_python_udf___gzv2b5u.log)
Finished test(python3.11): pyspark.sql.tests.arrow.test_arrow_python_udf (62s) ... 8 tests were skipped
Tests passed in 62 seconds

Skipped tests in pyspark.sql.tests.arrow.test_arrow_python_udf with python3.11:
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.ArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)
      test_broadcast_in_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_broadcast_in_udf) ... skip (0.000s)
      test_datasource_with_udf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_datasource_with_udf) ... skip (0.001s)
      test_register_java_function (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_function) ... skip (0.000s)
      test_register_java_udaf (pyspark.sql.tests.arrow.test_arrow_python_udf.AsyncArrowPythonUDFTests.test_register_java_udaf) ... skip (0.000s)

@LuciferYang LuciferYang marked this pull request as ready for review May 21, 2025 03:56
Member

@HyukjinKwon HyukjinKwon left a comment


Let's merge it back and see how it goes. What was the cause, btw?

@LuciferYang
Contributor Author

Let's merge it back and see how it goes. What was the cause, btw?

I provided an answer to this question here: #50730 (comment)

The root cause is that when packaging the spark-protobuf module with sbt's assembly task, redundant artifacts such as jackson-annotations 2.18.1 are bundled in. As a result, multiple Jackson versions end up on the classpath when running the aforementioned test commands.

Then, I fixed the issue yesterday at #50951.

@dongjoon-hyun
Member

Thank you, @LuciferYang and @HyukjinKwon !

@LuciferYang
Contributor Author

Already merged into master. Thanks @HyukjinKwon and @dongjoon-hyun ~
