pyspark.testing.assertSchemaEqual

pyspark.testing.assertSchemaEqual(actual: pyspark.sql.types.StructType, expected: pyspark.sql.types.StructType)
A util function to assert equality between DataFrame schemas actual and expected.

New in version 3.5.0.

Parameters

actual : StructType
    The DataFrame schema that is being compared or tested.

expected : StructType
    The expected schema, for comparison with the actual schema.
 
Notes

When assertSchemaEqual fails, the error message uses the Python difflib library to display a diff log of the actual and expected schemas.

Examples

>>> from pyspark.sql.types import StructType, StructField, ArrayType, IntegerType, DoubleType
>>> from pyspark.testing import assertSchemaEqual
>>> s1 = StructType([StructField("names", ArrayType(DoubleType(), True), True)])
>>> s2 = StructType([StructField("names", ArrayType(DoubleType(), True), True)])
>>> assertSchemaEqual(s1, s2)  # pass, schemas are identical

>>> df1 = spark.createDataFrame(data=[(1, 1000), (2, 3000)], schema=["id", "number"])
>>> df2 = spark.createDataFrame(data=[("1", 1000), ("2", 5000)], schema=["id", "amount"])
>>> assertSchemaEqual(df1.schema, df2.schema)
Traceback (most recent call last):
...
PySparkAssertionError: [DIFFERENT_SCHEMA] Schemas do not match.
--- actual
+++ expected
- StructType([StructField('id', LongType(), True), StructField('number', LongType(), True)])
?                               ^^                              ^^^^^
+ StructType([StructField('id', StringType(), True), StructField('amount', LongType(), True)])
?                               ^^^^ ++++                           ^
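
For context, the sketch below shows how assertSchemaEqual might be used as a standalone script rather than inside a doctest session. The local SparkSession setup, the sample data, and the expected schema are illustrative assumptions and not part of the official reference.

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, LongType
from pyspark.testing import assertSchemaEqual

# Illustrative local session; any existing SparkSession works the same way.
spark = SparkSession.builder.master("local[1]").appName("schema-check").getOrCreate()

# DataFrame whose schema is under test; columns are inferred as StringType and LongType.
df = spark.createDataFrame([("a", 1), ("b", 2)], schema=["name", "count"])

# The schema the DataFrame is expected to have.
expected = StructType([
    StructField("name", StringType(), True),
    StructField("count", LongType(), True),
])

# Passes silently when the schemas match; raises PySparkAssertionError
# with a difflib-style diff of the two schemas when they do not.
assertSchemaEqual(df.schema, expected)

spark.stop()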