pandas.DataFrame.update
DataFrame.update(other, join='left', overwrite=True, filter_func=None, errors='ignore')
Modify in place using non-NA values from another DataFrame.
Aligns on indices. There is no return value.
Parameters:
other : DataFrame, or object coercible into a DataFrame
    Should have at least one matching index/column label with the original DataFrame. If a Series is passed, its name attribute must be set, and that will be used as the column name to align with the original DataFrame.
join : {'left'}, default 'left'
    Only left join is implemented, keeping the index and columns of the original object.
overwrite : bool, default True
    How to handle non-NA values for overlapping keys:
    - True: overwrite original DataFrame's values with values from other.
    - False: only update values that are NA in the original DataFrame.
filter_func : callable(1d-array) -> bool 1d-array, optional
    Can choose to replace values other than NA. Return True for values that should be updated.
errors : {'raise', 'ignore'}, default 'ignore'
    If 'raise', will raise a ValueError if the DataFrame and other both contain non-NA data in the same place.
    Changed in version 0.24.0: Changed from raise_conflict=False|True to errors='ignore'|'raise'.
Returns:
None : method directly changes calling object
Raises:
ValueError
    - When errors='raise' and there's overlapping non-NA data.
    - When errors is not either 'ignore' or 'raise'.
NotImplementedError
    - If join != 'left'.
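The error conditions listed under Raises can be sketched as follows. This is a minimal illustration rather than part of the official examples; the names `df`, `other`, `conflict`, and `unsupported_join` are chosen here for demonstration only.

```python
import numpy as np
import pandas as pd

# Row 0 holds non-NA data in both frames, so errors='raise' reports a conflict.
df = pd.DataFrame({'A': [1.0, np.nan]})
other = pd.DataFrame({'A': [10.0, 20.0]})

conflict = False
try:
    df.update(other, errors='raise')
except ValueError:
    conflict = True

# Any join other than 'left' is rejected with NotImplementedError.
unsupported_join = False
try:
    df.update(other, join='outer')
except NotImplementedError:
    unsupported_join = True
```

With the default errors='ignore', the same call would instead overwrite the value in row 0 without complaint.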
See also
dict.update
- Similar method for dictionaries.
DataFrame.merge
- For column(s)-on-column(s) operations.
Examples
>>> df = pd.DataFrame({'A': [1, 2, 3],
...                    'B': [400, 500, 600]})
>>> new_df = pd.DataFrame({'B': [4, 5, 6],
...                        'C': [7, 8, 9]})
>>> df.update(new_df)
>>> df
   A  B
0  1  4
1  2  5
2  3  6
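Because the method modifies the calling object and returns None, assigning its result to a variable discards nothing useful but also yields no DataFrame. One possible pattern for keeping the original intact is to update a copy, sketched below; the names `updated` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [400, 500, 600]})
new_df = pd.DataFrame({'B': [4, 5, 6],
                       'C': [7, 8, 9]})

# Work on a copy so that the original df keeps its values.
updated = df.copy()
result = updated.update(new_df)  # update() itself returns None

# Column 'C' is not added: only labels already present in the caller are updated.
```

Here `updated` carries the new values of column B, while `df` is left unchanged.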
The DataFrame’s length does not increase as a result of the update; only values at matching index/column labels are updated.
>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = pd.DataFrame({'B': ['d', 'e', 'f', 'g', 'h', 'i']})
>>> df.update(new_df)
>>> df
   A  B
0  a  d
1  b  e
2  c  f
For a Series, its name attribute must be set.
>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_column = pd.Series(['d', 'e'], name='B', index=[0, 2])
>>> df.update(new_column)
>>> df
   A  B
0  a  d
1  b  y
2  c  e

>>> df = pd.DataFrame({'A': ['a', 'b', 'c'],
...                    'B': ['x', 'y', 'z']})
>>> new_df = pd.DataFrame({'B': ['d', 'e']}, index=[1, 2])
>>> df.update(new_df)
>>> df
   A  B
0  a  x
1  b  d
2  c  e
If other contains NaNs, the corresponding values are not updated in the original DataFrame.
>>> df = pd.DataFrame({'A': [1, 2, 3],
...                    'B': [400, 500, 600]})
>>> new_df = pd.DataFrame({'B': [4, np.nan, 6]})
>>> df.update(new_df)
>>> df
   A      B
0  1    4.0
1  2  500.0
2  3    6.0
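The overwrite and filter_func parameters described above change which values are replaced. The following is a small sketch under illustrative data, not one of the official examples; the frame names `gaps`, `signed`, and `other` are assumptions made for this demonstration. Float columns are used throughout so that no dtype conversion is involved.

```python
import numpy as np
import pandas as pd

other = pd.DataFrame({'A': [10.0, 20.0, 30.0]})

# overwrite=False: only values that are NA in the original are filled.
gaps = pd.DataFrame({'A': [1.0, np.nan, 3.0]})
gaps.update(other, overwrite=False)

# filter_func: only positions where the callable returns True are replaced.
# The callable receives the original column as a 1d array.
signed = pd.DataFrame({'A': [1.0, -1.0, 3.0]})
signed.update(other, filter_func=lambda values: values < 0)
```

In both cases only the middle row takes its value from `other`: in `gaps` because it was NA, in `signed` because it was negative.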
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/pandas.DataFrame.update.html