pyspark.pandas.DataFrame.drop

DataFrame.drop(labels: Union[Any, Tuple[Any, …], List[Union[Any, Tuple[Any, …]]], None] = None, axis: Union[int, str, None] = 0, index: Union[Any, Tuple[Any, …], List[Union[Any, Tuple[Any, …]]]] = None, columns: Union[Any, Tuple[Any, …], List[Union[Any, Tuple[Any, …]]]] = None) → pyspark.pandas.frame.DataFrame
- Drop specified labels from rows or columns.
- Remove rows and/or columns by specifying label names and the corresponding axis, or by specifying index and/or column names directly. Dropping rows of a MultiIndex DataFrame is not supported yet.
- Parameters
- labels: single label or list-like
- Index or column labels to drop.
- axis: {0 or 'index', 1 or 'columns'}, default 0
- Whether to drop labels from the index (0 or 'index') or from the columns (1 or 'columns'). Changed in version 3.3: Dropping by index is now the default.
- index: single label or list-like
- Alternative to specifying axis (labels, axis=0 is equivalent to index=labels). Changed in version 3.3: Added dropping rows by 'index'.
- columns: single label or list-like
- Alternative to specifying axis (labels, axis=1 is equivalent to columns=labels).
 
- Returns
- dropped: DataFrame
 
- Notes
- Dropping rows of a MultiIndex DataFrame is not supported yet.
- Examples

>>> import numpy as np
>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])
>>> df
   A  B   C   D
0  0  1   2   3
1  4  5   6   7
2  8  9  10  11

- Drop columns

>>> df.drop(['B', 'C'], axis=1)
   A   D
0  0   3
1  4   7
2  8  11

>>> df.drop(columns=['B', 'C'])
   A   D
0  0   3
1  4   7
2  8  11

- Drop rows by index

>>> df.drop([0, 1])
   A  B   C   D
2  8  9  10  11

>>> df.drop(index=[0, 1], columns='A')
   B   C   D
2  9  10  11

- Dropping columns is also supported for MultiIndex columns

>>> df = ps.DataFrame({'x': [1, 2], 'y': [3, 4], 'z': [5, 6], 'w': [7, 8]},
...                   columns=['x', 'y', 'z', 'w'])
>>> columns = [('a', 'x'), ('a', 'y'), ('b', 'z'), ('b', 'w')]
>>> df.columns = pd.MultiIndex.from_tuples(columns)
>>> df
   a     b
   x  y  z  w
0  1  3  5  7
1  2  4  6  8
>>> df.drop(labels='a', axis=1)
   b
   z  w
0  5  7
1  6  8