Returns a stratified sample without replacement
Returns a stratified sample without replacement, based on the sampling fraction given for each stratum.
Usage
sampleBy(x, col, fractions, seed)
# S4 method for SparkDataFrame,character,list,numeric
sampleBy(x, col, fractions, seed)
Arguments
- x: A SparkDataFrame.
- col: Name of the column that defines the strata.
- fractions: A named list giving the sampling fraction for each stratum. If a stratum is not specified, its fraction is treated as zero.
- seed: Random seed.
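The handling of `fractions` can be sketched in base R on a local data.frame. This is an illustration only, not SparkR's implementation: `sample_by_local` is a hypothetical helper, and it assumes each row is kept independently with probability equal to its stratum's fraction, which this page does not state.

```r
# Hypothetical helper (base R only): mimics the `fractions` lookup on a local data.frame.
sample_by_local <- function(df, col, fractions, seed) {
  set.seed(seed)
  keys <- as.character(df[[col]])
  # Look up each row's fraction; strata missing from `fractions` get 0.
  p <- vapply(keys, function(k) {
    f <- fractions[[k]]
    if (is.null(f)) 0 else as.numeric(f)
  }, numeric(1), USE.NAMES = FALSE)
  # Assumption: keep each row independently with probability p.
  df[runif(nrow(df)) < p, , drop = FALSE]
}

df <- data.frame(key = rep(c("a", "b", "c"), each = 100), value = seq_len(300))
fractions <- list(a = 0.1, b = 1.0)  # "c" is omitted, so no "c" rows are kept
out <- sample_by_local(df, "key", fractions, seed = 36)
table(out$key)
```

Under this assumption the number of rows kept from stratum "a" is only approximately 10% of that stratum, while a fraction of 1.0 keeps every row and an omitted stratum contributes none.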
See also
Other stat functions: 
approxQuantile(),
corr(),
cov(),
crosstab(),
freqItems()
Examples
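if (FALSE) {
df <- read.json("/path/to/file.json")
# Hypothetical strata: assumes column "key" holds the values "0" and "1".
# Strata not named in `fractions` are sampled with fraction zero.
fractions <- list("0" = 0.1, "1" = 0.2)
# Named `sampled` to avoid masking the sample() function.
sampled <- sampleBy(df, "key", fractions, 36)
}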