{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Examples**\n",
"\n",
"**Scalar `to_replace` and `value`**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0    6\n",
"1    2\n",
"2    3\n",
"3    4\n",
"4    5\n",
"dtype: int64"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s = pd.Series([0, 2, 3, 4, 5])\n",
"s.replace(0, 6)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>6</td><td>6</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   P   Q  R\n",
"0  6   6  p\n",
"1  2   7  q\n",
"2  3   8  r\n",
"3  4   9  s\n",
"4  5  10  t"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame({'P': [0, 2, 3, 4, 5],\n",
"                   'Q': [6, 7, 8, 9, 10],\n",
"                   'R': ['p', 'q', 'r', 's', 't']})\n",
"df.replace(0, 6)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**List-like `to_replace`**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>5</td><td>6</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>5</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>5</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>5</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   P   Q  R\n",
"0  5   6  p\n",
"1  5   7  q\n",
"2  5   8  r\n",
"3  5   9  s\n",
"4  5  10  t"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace([0, 2, 3, 4], 5)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>5</td><td>6</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>4</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>2</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"   P   Q  R\n",
"0  5   6  p\n",
"1  4   7  q\n",
"2  3   8  r\n",
"3  2   9  s\n",
"4  5  10  t"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace([0, 2, 3, 4], [5, 4, 3, 2])"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0    0\n",
"1    4\n",
"2    4\n",
"3    4\n",
"4    5\n",
"dtype: int64"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s.replace([2, 3], method='bfill')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**dict-like `to_replace`**"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>20</td><td>6</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>200</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P   Q  R\n",
"0   20   6  p\n",
"1  200   7  q\n",
"2    3   8  r\n",
"3    4   9  s\n",
"4    5  10  t"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace({0: 20, 2: 200})"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>100</td><td>100</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q  R\n",
"0  100  100  p\n",
"1    2    7  q\n",
"2    3    8  r\n",
"3    4    9  s\n",
"4    5   10  t"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace({'P': 0, 'Q': 6}, 100)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th><th>R</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>200</td><td>6</td><td>p</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>7</td><td>q</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>8</td><td>r</td></tr>\n",
"    <tr><th>3</th><td>400</td><td>9</td><td>s</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>10</td><td>t</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P   Q  R\n",
"0  200   6  p\n",
"1    2   7  q\n",
"2    3   8  r\n",
"3  400   9  s\n",
"4    5  10  t"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace({'P': {0: 200, 4: 400}})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Regular expression `to_replace`**"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>new</td><td>mno</td></tr>\n",
"    <tr><th>1</th><td>ffa</td><td>new</td></tr>\n",
"    <tr><th>2</th><td>bfg</td><td>xyz</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q\n",
"0  new  mno\n",
"1  ffa  new\n",
"2  bfg  xyz"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.DataFrame({'P': ['bar', 'ffa', 'bfg'],\n",
"                   'Q': ['mno', 'bat', 'xyz']})\n",
"df.replace(to_replace=r'^ba.$', value='new', regex=True)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>new</td><td>mno</td></tr>\n",
"    <tr><th>1</th><td>ffa</td><td>bat</td></tr>\n",
"    <tr><th>2</th><td>bfg</td><td>xyz</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q\n",
"0  new  mno\n",
"1  ffa  bat\n",
"2  bfg  xyz"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace({'P': r'^ba.$'}, {'P': 'new'}, regex=True)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>new</td><td>mno</td></tr>\n",
"    <tr><th>1</th><td>ffa</td><td>new</td></tr>\n",
"    <tr><th>2</th><td>bfg</td><td>xyz</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q\n",
"0  new  mno\n",
"1  ffa  new\n",
"2  bfg  xyz"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace(regex=r'^ba.$', value='new')"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>new</td><td>mno</td></tr>\n",
"    <tr><th>1</th><td>xyz</td><td>new</td></tr>\n",
"    <tr><th>2</th><td>bfg</td><td>xyz</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q\n",
"0  new  mno\n",
"1  xyz  new\n",
"2  bfg  xyz"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace(regex={r'^ba.$': 'new', 'ffa': 'xyz'})"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>P</th><th>Q</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>new</td><td>mno</td></tr>\n",
"    <tr><th>1</th><td>new</td><td>new</td></tr>\n",
"    <tr><th>2</th><td>bfg</td><td>xyz</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"     P    Q\n",
"0  new  mno\n",
"1  new  new\n",
"2  bfg  xyz"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.replace(regex=[r'^ba.$', 'ffa'], value='new')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
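"Note that when replacing multiple `bool` or `datetime64` objects, the data types in the `to_replace`\n",
"parameter must match the data type of the value being replaced.\n",
"\n",
"The Series defined next is used to compare `s.replace({'p': None})` with `s.replace('p', None)`:"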
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"s = pd.Series([10, 'p', 'p', 'q', 'p'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When a dict is passed as `to_replace`, the values in the dict play the role of the `value` parameter.\n",
"`s.replace({'p': None})` is equivalent to `s.replace(to_replace={'p': None},\n",
"value=None, method=None)`:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0      10\n",
"1    None\n",
"2    None\n",
"3       q\n",
"4    None\n",
"dtype: object"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s.replace({'p': None})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When `value=None` and `to_replace` is a scalar, list, or tuple, `replace` uses the `method` parameter\n",
"(default `'pad'`) to do the replacement. That is why the `'p'` values are replaced by `10` in\n",
"rows 1 and 2 and by `'q'` in row 4. The call `s.replace('p', None)` is equivalent\n",
"to `s.replace(to_replace='p', value=None, method='pad')`:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0    10\n",
"1    10\n",
"2    10\n",
"3     q\n",
"4     q\n",
"dtype: object"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"s.replace('p', None)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}