On this page
pandas.DataFrame.diff
- DataFrame.diff(periods=1, axis=0)[source]
-
First discrete difference of element.
Calculates the difference of a DataFrame element compared with another element in the DataFrame (default is element in previous row).
- Parameters
-
- periods:int, default 1
-
Periods to shift for calculating difference, accepts negative values.
- axis:{0 or ‘index’, 1 or ‘columns’}, default 0
-
Take difference over rows (0) or columns (1).
- Returns
-
- DataFrame
-
First differences of the Series.
See also
-
DataFrame.pct_change -
Percent change over given number of periods.
-
DataFrame.shift -
Shift index by desired number of periods with an optional time freq.
-
Series.diff -
First discrete difference of object.
Notes
For boolean dtypes, this uses
operator.xor()rather thanoperator.sub(). The result is calculated according to current dtype in DataFrame, however dtype of the result is always float64.Examples
Difference with previous row
>>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], ... 'b': [1, 1, 2, 3, 5, 8], ... 'c': [1, 4, 9, 16, 25, 36]}) >>> df a b c 0 1 1 1 1 2 1 4 2 3 2 9 3 4 3 16 4 5 5 25 5 6 8 36>>> df.diff() a b c 0 NaN NaN NaN 1 1.0 0.0 3.0 2 1.0 1.0 5.0 3 1.0 1.0 7.0 4 1.0 2.0 9.0 5 1.0 3.0 11.0Difference with previous column
>>> df.diff(axis=1) a b c 0 NaN 0 0 1 NaN -1 3 2 NaN -1 7 3 NaN -1 13 4 NaN 0 20 5 NaN 2 28Difference with 3rd previous row
>>> df.diff(periods=3) a b c 0 NaN NaN NaN 1 NaN NaN NaN 2 NaN NaN NaN 3 3.0 2.0 15.0 4 3.0 4.0 21.0 5 3.0 6.0 27.0Difference with following row
>>> df.diff(periods=-1) a b c 0 -1.0 0.0 -3.0 1 -1.0 -1.0 -5.0 2 -1.0 -1.0 -7.0 3 -1.0 -2.0 -9.0 4 -1.0 -3.0 -11.0 5 NaN NaN NaNOverflow in input dtype
>>> df = pd.DataFrame({'a': [1, 0]}, dtype=np.uint8) >>> df.diff() a 0 NaN 1 255.0
© 2008–2022, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/1.5.0/reference/api/pandas.DataFrame.diff.html