Real Pandas for Industrial Applications: How to Find Common Columns Between Two DataFrames in Pandas
7/3/2025
Working with multiple DataFrames is a common task in data analysis and data science. Sometimes, you need to compare DataFrames to find out which columns they have in common. This can be especially useful before merging, joining, or comparing datasets.
In this blog post, we'll explore how to find common columns between two pandas DataFrames in Python.
Why Find Common Columns?
Finding common columns is useful when:
You want to perform a merge or join operation.
You're comparing schema structures between datasets.
You're identifying overlaps for data quality checks.
The Setup
Let's create two simple DataFrames with overlapping and unique column names:
import pandas as pd
# Define the first DataFrame
df1 = pd.DataFrame(columns=['A', 'B', 'C', 'D'])
# Define the second DataFrame
df2 = pd.DataFrame(columns=['B', 'C', 'E', 'F'])
Method 1: Using Set Intersection
The simplest way to find common columns is to use Python's set operations.
common_cols = list(set(df1.columns) & set(df2.columns))
print("Common columns:", common_cols)
output (the order may vary, because sets are unordered):
Common columns: ['C', 'B']
This method works by:
Converting the columns of each DataFrame to a set.
Using the & operator to find the intersection.
Converting the result back to a list.
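Because a set has no defined order, the list built this way can come back in a different order from one run to the next. Here is a minimal sketch, assuming alphabetical order is acceptable and all column names are strings, that sorts the intersection so the result is reproducible. The names common_sorted, only_in_df1 and only_in_df2 are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame(columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(columns=['B', 'C', 'E', 'F'])

# Intersect the column names, then sort to get a stable order
common_sorted = sorted(set(df1.columns) & set(df2.columns))
print("Common columns:", common_sorted)

# The same idea with set difference shows what is unique to each DataFrame
only_in_df1 = sorted(set(df1.columns) - set(df2.columns))
only_in_df2 = sorted(set(df2.columns) - set(df1.columns))
print("Only in df1:", only_in_df1)
print("Only in df2:", only_in_df2)
```

Sorting raises a TypeError if the column labels mix types that cannot be compared (for example strings and integers), so this variant is best kept for DataFrames with uniform label types.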
Method 2: Preserve Column Order
If preserving the order of columns from one of the DataFrames (e.g., df1) is important, you can use a list comprehension:
common_cols = [col for col in df1.columns if col in df2.columns]
print("Ordered common columns:", common_cols)
output:
Ordered common columns: ['B', 'C']
This ensures the order of the common columns matches their order in df1.
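If you prefer to stay within pandas, the column labels are a pandas Index, which has its own intersection method. The sketch below is one possible alternative rather than a replacement for the list comprehension: the ordering of the returned Index can depend on the inputs and on the sort argument, so treat the order as unspecified unless you check it for your pandas version. The names common_index and df1_shared are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(columns=['A', 'B', 'C', 'D'])
df2 = pd.DataFrame(columns=['B', 'C', 'E', 'F'])

# Index.intersection returns a new Index with the labels present in both
common_index = df1.columns.intersection(df2.columns)
common_cols = list(common_index)
print("Common columns via Index.intersection:", common_cols)

# The result can be used directly to select the shared columns
df1_shared = df1[common_cols]
df2_shared = df2[common_cols]
```

When the exact order from df1 matters, the list comprehension remains the more explicit choice.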
Use Case Example
Let's say you want to merge df1 and df2, but only on common columns:
merged_df = pd.merge(df1, df2, on=common_cols)
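When on is omitted, pd.merge already defaults to the columns shared by both DataFrames. Finding the common columns first makes the join keys explicit, lets you inspect them before merging, and helps avoid errors due to mismatched keys.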
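The DataFrames in this post have no rows, so the merge result is empty. To see the effect on real values, here is a small sketch with made-up data. The column names, the values and the helper merge_on_shared are all hypothetical and exist only for illustration.

```python
import pandas as pd

def merge_on_shared(left, right, how="inner"):
    """Merge two DataFrames on the columns they share, in left's column order."""
    shared = [col for col in left.columns if col in right.columns]
    if not shared:
        # Fail early with a clear message instead of merging on nothing
        raise ValueError("The DataFrames have no columns in common")
    return pd.merge(left, right, on=shared, how=how)

# Hypothetical sample data
sales = pd.DataFrame({
    "id": [1, 2, 3],
    "region": ["N", "S", "E"],
    "sales": [10, 20, 30],
})
costs = pd.DataFrame({
    "id": [1, 2, 4],
    "region": ["N", "S", "W"],
    "cost": [4, 9, 7],
})

merged = merge_on_shared(sales, costs)
print(merged)
```

Here the shared columns are id and region, so only rows that match on both keys survive the inner merge. Passing how="outer" would keep the unmatched rows from both sides instead.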
Summary
Use set(df1.columns) & set(df2.columns) to quickly find common columns.
Use list comprehensions to preserve order if needed.
This technique helps with merging, schema comparison, and more.
Final Thoughts
Working with large or unfamiliar datasets? Always check for column overlaps before performing merge operations. It can save you time and prevent hard-to-debug errors.
Happy coding with pandas!