Suppose the two DataFrames have identical column names.
import pandas as pd

df_a = pd.read_csv("df_a.tsv", sep="\t")
df_a = df_a.set_index("xxx").sort_index()

df_b = pd.read_csv("df_b.tsv", sep="\t")
df_b = df_b.set_index("xxx").sort_index()
>>> df_a.equals(df_b)
True
>>> bool((df_a == df_b).all().all())
True
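One detail worth checking when collapsing the element-wise comparison into a single boolean: Python's built-in all() iterates over a DataFrame's column labels, not its cell values, so it returns True whenever the labels are truthy. The sketch below uses made-up in-memory data (the "val" column and its values are assumptions, standing in for the TSV files) to show the difference from reducing over both axes with .all().all().

```python
import pandas as pd

# Hypothetical toy data standing in for df_a.tsv / df_b.tsv;
# the last row differs on purpose.
df_a = pd.DataFrame({"xxx": [1, 2, 3], "val": [10, 20, 30]})
df_a = df_a.set_index("xxx").sort_index()

df_b = pd.DataFrame({"xxx": [1, 2, 3], "val": [10, 20, 99]})
df_b = df_b.set_index("xxx").sort_index()

# Built-in all() iterates over the column labels ("val"),
# which are non-empty strings and therefore truthy.
builtin_all = all(df_a == df_b)

# Reducing over rows and then over columns inspects the cell values.
cellwise_all = bool((df_a == df_b).all().all())

print(builtin_all, cellwise_all)
```

Also note that == treats NaN as unequal to NaN, whereas DataFrame.equals() counts NaNs in the same location as equal, so the two checks can disagree on data with missing values.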
sort_index() is a MUST because:
DataFrame.equals() does not align records by index label. It compares the two DataFrames position by position, so the same records in a different row order are reported as unequal.
df_a == df_b also compares row by row, but if the indices of the two DataFrames are not exactly the same (in both values and order), it raises
ValueError: Can only compare identically-labeled DataFrame objects.
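Both behaviours can be reproduced with a minimal sketch. The data here is invented for illustration (the "val" column and the row order are assumptions); the two frames hold the same records, only in a different order.

```python
import pandas as pd

# Same records, different row order (hypothetical toy data).
df_a = pd.DataFrame({"xxx": ["a", "b", "c"], "val": [1, 2, 3]}).set_index("xxx")
df_b = pd.DataFrame({"xxx": ["c", "a", "b"], "val": [3, 1, 2]}).set_index("xxx")

# Without sorting, equals() reports the frames as different.
unsorted_equal = df_a.equals(df_b)

# Without sorting, == refuses to compare because the index order differs.
try:
    df_a == df_b
    raised = False
except ValueError:
    raised = True

# After sorting both indices, the frames line up and compare equal.
sorted_equal = df_a.sort_index().equals(df_b.sort_index())

print(unsorted_equal, raised, sorted_equal)
```

Sorting both frames by the same index gives them a common row order, which is what makes the positional comparison meaningful.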