pyspark.pandas.DataFrame.idxmin
- DataFrame.idxmin(axis=0)
Return index of first occurrence of minimum over requested axis. NA/null values are excluded.
Note
This API collects all rows that hold the minimum value using to_pandas(), on the assumption that the number of such rows is usually small.
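The strategy described in the note can be illustrated with a minimal pure-Python sketch. It is not the pyspark.pandas implementation: the helper names `is_missing` and `idxmin_column` are hypothetical, plain lists stand in for distributed columns, and returning `None` for a column with no valid values is an assumption made only for this sketch.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing, mirroring the NA/null exclusion."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def idxmin_column(index, values):
    """Return the index label of the first minimum, ignoring missing values."""
    present = [(label, v) for label, v in zip(index, values) if not is_missing(v)]
    if not present:
        return None  # assumption: the source does not say what an all-missing column yields
    # Step 1: compute the minimum (a single aggregate in a distributed setting).
    minimum = min(v for _, v in present)
    # Step 2: keep only the rows that attain the minimum; this subset is
    # expected to be small, which is why collecting it locally is acceptable.
    candidates = [label for label, v in present if v == minimum]
    # Step 3: the first candidate in row order is the result.
    return candidates[0]


index = [0, 1, 2, 3]
columns = {
    "a": [1, 2, 3, 2],
    "b": [4.0, 2.0, 3.0, 1.0],
    "c": [300, 200, 400, 200],
}
result = {name: idxmin_column(index, values) for name, values in columns.items()}
print(result)  # {'a': 0, 'b': 3, 'c': 1}
```

Column `c` reaches its minimum of 200 at labels 1 and 3, so two rows would be collected and the first one, 1, is returned.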
- Parameters
- axis{0 or ‘index’, 1 or ‘columns’}, default 0
The axis to use. 0 or ‘index’ for row-wise, 1 or ‘columns’ for column-wise.
- Returns
- Series
Examples
>>> psdf = ps.DataFrame({'a': [1, 2, 3, 2],
...                      'b': [4.0, 2.0, 3.0, 1.0],
...                      'c': [300, 200, 400, 200]})
>>> psdf
   a    b    c
0  1  4.0  300
1  2  2.0  200
2  3  3.0  400
3  2  1.0  200

>>> psdf.idxmin()
a    0
b    3
c    1
dtype: int64
For axis=1, return the column label of the minimum value in each row:
>>> psdf.idxmin(axis=1)
0    a
1    a
2    a
3    b
dtype: object
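The axis=1 result, including how ties are resolved, can be reproduced with a small pure-Python sketch. This is an illustration of the semantics only, not the library's code; the helper name `idxmin_row` is hypothetical, and tuples stand in for DataFrame rows.

```python
import math


def idxmin_row(labels, row):
    """Return the label of the first minimum value in a row, skipping None/NaN."""
    best_label, best_value = None, None
    for label, value in zip(labels, row):
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue  # NA/null values are excluded
        # Strict comparison keeps the earliest column when values tie.
        if best_value is None or value < best_value:
            best_label, best_value = label, value
    return best_label


labels = ["a", "b", "c"]
rows = [
    (1, 4.0, 300),
    (2, 2.0, 200),
    (3, 3.0, 400),
    (2, 1.0, 200),
]
row_result = [idxmin_row(labels, row) for row in rows]
print(row_result)  # ['a', 'a', 'a', 'b']
```

In rows 1 and 2, columns `a` and `b` hold equal values, so the first occurrence, `a`, is reported.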
For a DataFrame with multi-level (MultiIndex) columns:
>>> psdf = ps.DataFrame({'a': [1, 2, 3, 2],
...                      'b': [4.0, 2.0, 3.0, 1.0],
...                      'c': [300, 200, 400, 200]})
>>> psdf.columns = pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z')])
>>> psdf
   a    b    c
   x    y    z
0  1  4.0  300
1  2  2.0  200
2  3  3.0  400
3  2  1.0  200

>>> psdf.idxmin()
a  x    0
b  y    3
c  z    1
dtype: int64