pandas.DataFrame.update

DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore') [source]

Modify in place using non-NA values from another DataFrame.

Aligns on indices. There is no return value.

Parameters:	`other : DataFrame, or object coercible into a DataFrame` Should have at least one matching index/column label with the original DataFrame. If a Series is passed, its name attribute must be set, and that will be used as the column name to align with the original DataFrame. `join : {‘left’}, default ‘left’` Only left join is implemented, keeping the index and columns of the original object. `overwrite : bool, default True` How to handle non-NA values for overlapping keys: True: overwrite original DataFrame’s values with values from `other`. False: only update values that are NA in the original DataFrame. `filter_func : callable(1d-array) -> bool 1d-array, optional` Can choose to replace values other than NA. Return True for values that should be updated. `errors : {‘raise’, ‘ignore’}, default ‘ignore’` If ‘raise’, will raise a ValueError if the DataFrame and `other` both contain non-NA data in the same place. Changed in version 0.24.0: Changed from `raise_conflict=False\|True` to `errors=’ignore’\|’raise’`.
Returns:	`None : method directly changes calling object`
Raises:	ValueError When `errors=’raise’` and there’s overlapping non-NA data. When `errors` is not either `‘ignore’` or `‘raise’` NotImplementedError If `join != ‘left’`

Examples

>>> df = pd.DataFrame({'A': [1, 2, 3],
...                    'B': [400, 500, 600]})
>>> new_df = pd.DataFrame({'B': [4, 5, 6],
...                        'C': [7, 8, 9]})
>>> df.update(new_df)
>>> df
   A  B
0  1  4
1  2  5
2  3  6

The DataFrame’s length does not increase as a result of the update, only values at matching index/column labels are updated.

>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = pd.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})
>>> df.update(new_df)
>>> df
   A  B
0  a  d
1  b  e
2  c  f

For Series, it’s name attribute must be set.

>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_column = pd.Series(['d', 'e'], name='B', index=[0, 2])
>>> df.update(new_column)
>>> df
   A  B
0  a  d
1  b  y
2  c  e
>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = pd.DataFrame({'B': ['d', 'e']}, index=[1, 2])
>>> df.update(new_df)
>>> df
   A  B
0  a  x
1  b  d
2  c  e

If other contains NaNs the corresponding values are not updated in the original dataframe.

>>> df = pd.DataFrame({'A': [1, 2, 3],
...                    'B': [400, 500, 600]})
>>> new_df = pd.DataFrame({'B': [4, np.nan, 6]})
>>> df.update(new_df)
>>> df
   A      B
0  1    4.0
1  2  500.0
2  3    6.0

© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/pandas.DataFrame.update.html

Docs

Docs4dev

Title here

pandas.DataFrame.update

Examples