pyspark.sql.DataFrameStatFunctions

class pyspark.sql.DataFrameStatFunctions(df: pyspark.sql.dataframe.DataFrame)
Functionality for statistical functions with DataFrame.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.

Methods

- approxQuantile(col, probabilities, relativeError): Calculates the approximate quantiles of numerical columns of a DataFrame.
- corr(col1, col2[, method]): Calculates the correlation of two columns of a DataFrame as a double value.
- cov(col1, col2): Calculates the sample covariance for the given columns, specified by their names, as a double value.
- crosstab(col1, col2): Computes a pair-wise frequency table of the given columns.
- freqItems(cols[, support]): Finds frequent items for columns, possibly with false positives.
- sampleBy(col, fractions[, seed]): Returns a stratified sample without replacement based on the fraction given on each stratum.
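
To make the values described for cov, corr, and crosstab concrete, here is a minimal plain-Python sketch of the same quantities on small in-memory lists. It does not use Spark and is not how Spark computes them; the helper names (sample_cov, pearson_corr, pair_counts) are hypothetical and exist only for illustration. It assumes the Pearson method for the correlation.

```python
from collections import Counter
from math import sqrt


def sample_cov(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_corr(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


def pair_counts(pairs):
    """Pair-wise frequencies: how often each (value1, value2) pair occurs."""
    return Counter(pairs)


# Illustrative data only (not taken from the documentation).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

cov_xy = sample_cov(xs, ys)
corr_xy = pearson_corr(xs, ys)
counts = pair_counts([("a", "x"), ("a", "y"), ("a", "x"), ("b", "y")])
```

On a DataFrame df, the corresponding methods are reached through df.stat, for example df.stat.cov("a", "b") or df.stat.corr("a", "b"). Note that crosstab returns a DataFrame laid out as a table, whereas this sketch only keeps a count per pair.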