Pandas Tricks -Part 1: Get First and Second Max Column for Each Row
PANDAS
7/21/20251 min read
import pandas as pd
import numpy as np
# Sample DataFrame
df = pd.DataFrame({
'A': [10, 25, 30],
'B': [20, 15, 10],
'C': [15, 30, 25]
})
print("Original DataFrame:")
print(df)
# First max column name per row
first_max_col = df.idxmax(axis=1)
# Mask the first max to find second max
masked_df = df.mask(df.eq(df.max(axis=1), axis=0))
# Second max column name per row
second_max_col = masked_df.idxmax(axis=1)
# Combine results
result = pd.DataFrame({
'First Max Column': first_max_col,
'Second Max Column': second_max_col
})
print("\nColumn names with first and second max values:")
print(result)
Explanation:
df.idxmax(axis=1) โ gets the column name with the maximum value per row.
df.max(axis=1) โ gets the max value per row.
df.eq(df.max(axis=1), axis=0) โ returns a boolean DataFrame where True indicates the first max in each row.
df.mask(...) replaces the max values with NaN, so the second max can be found.
masked_df.idxmax(axis=1) โ now gives the column of the second max.
If you want column indexes (integers) instead of names:
first_max_idx = df.values.argmax(axis=1)
# Mask the max values
masked = df.where(df.ne(df.max(axis=1), axis=0))
second_max_idx = masked.values.argmax(axis=1)
print("\nColumn indexes:")
for i, (first, second) in enumerate(zip(first_max_idx, second_max_idx)):
print(f"Row {i}: First max at index {first}, Second max at index {second}")
sorting column names programmatically
import pandas as pd
df = pd.DataFrame({
'SKU': [1, 2],
'Demand': [100, 200],
'RDC': ['A', 'B'],
'Class': ['A', 'B']
})
# Sort columns alphabetically
df = df[sorted(df.columns)]
print(df)