diff --git a/Readme.rst b/Readme.rst index aab5992..8ef3453 100644 --- a/Readme.rst +++ b/Readme.rst @@ -50,8 +50,9 @@ Uplift modeling estimates a causal effect of treatment and uses it to effectivel Read more about uplift modeling problem in `User Guide `__. -Articles in russian on habr.com: `Part 1 `__ -and `Part 2 `__. +Articles in russian on habr.com: `Part 1 `__ , +`Part 2 `__ +and `Part 3 `__. **Features**: diff --git a/docs/_static/images/client_types.png b/docs/_static/images/client_types.png deleted file mode 100644 index a896bc9..0000000 Binary files a/docs/_static/images/client_types.png and /dev/null differ diff --git a/docs/_static/images/client_types_RU.png b/docs/_static/images/client_types_RU.png deleted file mode 100644 index d9492d4..0000000 Binary files a/docs/_static/images/client_types_RU.png and /dev/null differ diff --git a/docs/changelog.md b/docs/changelog.md index 5d9a6a8..4c78e4e 100644 --- a/docs/changelog.md +++ b/docs/changelog.md @@ -8,6 +8,24 @@ * 🔨 something that previously didn’t work as documentated – or according to reasonable expectations – should now work. * ❗️ you will need to change your code to have the same effect in the future; or a feature will be removed in the future. +## Version 0.3.2 + +### [sklift.datasets](https://www.uplift-modeling.com/en/v0.3.1/api/datasets/index.html) + +* 🔨 Fix bug in [fetch_x5](https://www.uplift-modeling.com/en/v0.3.1/api/datasets/fetch_x5.html) function by [@Muhamob](https://github.com/Muhamob). + +### [sklift.metrics](https://www.uplift-modeling.com/en/v0.3.1/api/index/metrics.html) + +* 📝 Fix docstring in [uplift_by_percentile](https://www.uplift-modeling.com/en/v0.3.1/api/metrics/uplift_by_percentile.html) function by [@ElisovaIra](https://github.com/ElisovaIra).
+ +### [sklift.viz](https://www.uplift-modeling.com/en/v0.3.1/api/viz/index.html) + +* 🔨 Fix bug in [plot_uplift_preds](https://www.uplift-modeling.com/en/v0.3.1/api/viz/plot_uplift_preds.html) function by [@bwbelljr](https://github.com/bwbelljr). + +### Miscellaneous + +* 📝 Change some images in ["RetailHero tutorial"](https://nbviewer.jupyter.org/github/maks-sh/scikit-uplift/blob/master/notebooks/RetailHero_EN.ipynb). + ## Version 0.3.1 ### [sklift.datasets](https://www.uplift-modeling.com/en/v0.3.1/api/datasets/index.html) diff --git a/docs/index.rst b/docs/index.rst index e728b1f..96ebe1d 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -24,8 +24,9 @@ The main idea is to provide easy-to-use and fast python package for uplift model Read more about *uplift modeling* problem in `User Guide `__, -Articles in russian on habr.com: `Part 1 `__ -and `Part 2 `__. +Articles in russian on habr.com: `Part 1 `__ , +`Part 2 `__ +and `Part 3 `__. Features ######### diff --git a/notebooks/RetailHero.ipynb b/notebooks/RetailHero.ipynb index 67d66d3..35a0286 100644 --- a/notebooks/RetailHero.ipynb +++ b/notebooks/RetailHero.ipynb @@ -19,10 +19,11 @@ " SCIKIT-UPLIFT DOCS | \n", " USER GUIDE\n", "
\n", - " ENGLISH VERSION\n", + " ENGLISH VERSION\n", "
\n", " HABR ARTICLE PART 1 | \n", - " HABR ARTICLE PART 2\n", + " HABR ARTICLE PART 2 | \n", + " HABR ARTICLE PART 3\n", "\n", "" ] @@ -61,10 +62,10 @@ "Historically, according to the impact of communication, marketers divide all customers into 4 categories:\n", "\n", "

\n", - " \"Categories\n", + " \"Categories\n", "

\n", "\n", - "1. **`Sleeping dog`** - a person who will react negatively if you communicate with them. A vivid example: customers who have forgotten about a paid subscription. Once reminded of it, they will certainly cancel it. But if they are left alone, these customers will keep bringing in money. In terms of math: $W_i = 1, Y_i = 0$ or $W_i = 0, Y_i = 1$.\n", + "1. **`Do not disturb`** - a person who will react negatively if you communicate with them. A vivid example: customers who have forgotten about a paid subscription. Once reminded of it, they will certainly cancel it. But if they are left alone, these customers will keep bringing in money. In terms of math: $W_i = 1, Y_i = 0$ or $W_i = 0, Y_i = 1$.\n", "2. **`Lost cause`** - a person who will not perform the target action regardless of communications. Interacting with such customers brings no additional revenue but creates additional costs. In terms of math: $W_i = 1, Y_i = 0$ or $W_i = 0, Y_i = 0$.\n", "3. **`Loyal`** - a person who will respond positively no matter what - the most loyal kind of customer. As with the previous item, such customers also consume resources. 
However, in this case the costs are much higher, because **loyal** customers also take advantage of the marketing offer (discounts, coupons, and so on). In terms of math: $W_i = 1, Y_i = 1$ or $W_i = 0, Y_i = 1$.\n", "4. **`Persuadable`** - a person who responds positively to the offer but would not have performed the target action without it. These are the people we would like to identify with our model so that we can communicate with them. In terms of math: $W_i = 0, Y_i = 0$ or $W_i = 1, Y_i = 1$.\n", diff --git a/notebooks/RetailHero_EN.ipynb b/notebooks/RetailHero_EN.ipynb index f702dc1..981e424 100644 --- a/notebooks/RetailHero_EN.ipynb +++ b/notebooks/RetailHero_EN.ipynb @@ -53,7 +53,7 @@ "Historically, according to the impact of communication, marketers divide all customers into 4 categories:\n", "\n", "

\n", - " \"Customer\n", + " \"Customer\n", "

\n", "\n", "- **`Do-Not-Disturbs`** *(a.k.a. Sleeping-dogs)* have a strong negative response to a marketing communication. They are going to purchase if *NOT* treated and will *NOT* purchase *IF* treated. It is not only a wasted marketing budget but also a negative impact. For instance, customers targeted could result in rejecting current products or services. In terms of math: $W_i = 1, Y_i = 0$ or $W_i = 0, Y_i = 1$.\n", diff --git a/sklift/__init__.py b/sklift/__init__.py index e1424ed..73e3bb4 100644 --- a/sklift/__init__.py +++ b/sklift/__init__.py @@ -1 +1 @@ -__version__ = '0.3.1' +__version__ = '0.3.2' diff --git a/sklift/datasets/datasets.py b/sklift/datasets/datasets.py index 4af27f0..6610512 100644 --- a/sklift/datasets/datasets.py +++ b/sklift/datasets/datasets.py @@ -228,7 +228,7 @@ def fetch_x5(data_home=None, dest_subdir=None, download_if_missing=True): dest_filename=file_clients, download_if_missing=download_if_missing) clients = pd.read_csv(csv_clients_path) - clients_features = list(clients.column) + clients_features = list(clients.columns) url_purchases = 'https://timds.s3.eu-central-1.amazonaws.com/purchases.csv.gz' file_purchases = url_purchases.split('/')[-1] diff --git a/sklift/metrics/metrics.py b/sklift/metrics/metrics.py index 63566ef..8d33cf8 100644 --- a/sklift/metrics/metrics.py +++ b/sklift/metrics/metrics.py @@ -569,7 +569,7 @@ def uplift_by_percentile(y_true, uplift, treatment, strategy='overall', std (bool): If True, add columns with the uplift standard deviation and the response rate standard deviation. Default is False. total (bool): If True, add the last row with the total values. Default is False. - The total uplift is a weighted average uplift. See :func:`.weighted_average_uplift`. + The total uplift computes as a total response rate treatment - a total response rate control. The total response rate is a response rate on the full data amount. bins (int): Determines the number of bins (and the relative percentile) in the data. 
Default is 10. string_percentiles (bool): type of percentiles in the index: float or string. Default is True (string). diff --git a/sklift/viz/base.py b/sklift/viz/base.py index 6da2cc1..8959e1b 100644 --- a/sklift/viz/base.py +++ b/sklift/viz/base.py @@ -28,7 +28,6 @@ def plot_uplift_preds(trmnt_preds, ctrl_preds, log=False, bins=100): # TODO: Add k as parameter: vertical line on plots check_consistent_length(trmnt_preds, ctrl_preds) - check_is_binary(treatment) if not isinstance(bins, int) or bins <= 0: raise ValueError(
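
Reviewer note on the `uplift_by_percentile` docstring change: the new wording describes the total uplift as the treatment response rate minus the control response rate, both computed over the full data. The snippet below is a minimal pure-Python sketch of that description only; it is not the library's implementation, and the helper names `response_rate` and `total_uplift` are hypothetical.

```python
def response_rate(y_true, treatment, group):
    """Share of positive outcomes among samples whose treatment flag equals `group`."""
    outcomes = [y for y, w in zip(y_true, treatment) if w == group]
    return sum(outcomes) / len(outcomes)


def total_uplift(y_true, treatment):
    """Total uplift as worded in the docstring: total treatment response rate
    minus total control response rate, both taken over the full data."""
    return response_rate(y_true, treatment, 1) - response_rate(y_true, treatment, 0)


# Toy data (made up for illustration): 4 treated and 4 control customers.
y = [1, 0, 1, 1, 0, 0, 1, 0]
t = [1, 1, 1, 1, 0, 0, 0, 0]

# Treatment response rate is 3/4, control response rate is 1/4.
print(total_uplift(y, t))  # 0.5
```

Under this reading, the total row is computed directly from the two full groups rather than from the per-bin rows, which is what the removed reference to `weighted_average_uplift` had implied.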