pyspark.RDD.pipe¶
- 
RDD.pipe(command: str, env: Optional[Dict[str, str]] = None, checkCode: bool = False) → pyspark.rdd.RDD[str][source]¶
- Return an RDD created by piping elements to a forked external process. - New in version 0.7.0. - Parameters
- commandstr
- command to run. 
- envdict, optional
- environment variables to set. 
- checkCodebool, optional
- whether to check the return value of the shell command. 
 
- Returns
 - Examples - >>> sc.parallelize(['1', '2', '', '3']).pipe('cat').collect() ['1', '2', '', '3']