{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n# Cumulative distributions\n\nThis example shows how to plot the empirical cumulative distribution function\n(ECDF) of a sample. We also show the theoretical CDF.\n\nIn engineering, ECDFs are sometimes called \"non-exceedance\" curves: the y-value\nfor a given x-value gives probability that an observation from the sample is\nbelow that x-value. For example, the value of 220 on the x-axis corresponds to\nabout 0.80 on the y-axis, so there is an 80% chance that an observation in the\nsample does not exceed 220. Conversely, the empirical *complementary*\ncumulative distribution function (the ECCDF, or \"exceedance\" curve) shows the\nprobability y that an observation from the sample is above a value x.\n\nA direct method to plot ECDFs is `.Axes.ecdf`. Passing ``complementary=True``\nresults in an ECCDF instead.\n\nAlternatively, one can use ``ax.hist(data, density=True, cumulative=True)`` to\nfirst bin the data, as if plotting a histogram, and then compute and plot the\ncumulative sums of the frequencies of entries in each bin. Here, to plot the\nECCDF, pass ``cumulative=-1``. Note that this approach results in an\napproximation of the E(C)CDF, whereas `.Axes.ecdf` is exact.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\nimport numpy as np\n\nnp.random.seed(19680801)\n\nmu = 200\nsigma = 25\nn_bins = 25\ndata = np.random.normal(mu, sigma, size=100)\n\nfig = plt.figure(figsize=(9, 4), layout=\"constrained\")\naxs = fig.subplots(1, 2, sharex=True, sharey=True)\n\n# Cumulative distributions.\naxs[0].ecdf(data, label=\"CDF\")\nn, bins, patches = axs[0].hist(data, n_bins, density=True, histtype=\"step\",\n cumulative=True, label=\"Cumulative histogram\")\nx = np.linspace(data.min(), data.max())\ny = ((1 / (np.sqrt(2 * np.pi) * sigma)) *\n np.exp(-0.5 * (1 / sigma * (x - mu))**2))\ny = y.cumsum()\ny /= y[-1]\naxs[0].plot(x, y, \"k--\", linewidth=1.5, label=\"Theory\")\n\n# Complementary cumulative distributions.\naxs[1].ecdf(data, complementary=True, label=\"CCDF\")\naxs[1].hist(data, bins=bins, density=True, histtype=\"step\", cumulative=-1,\n label=\"Reversed cumulative histogram\")\naxs[1].plot(x, 1 - y, \"k--\", linewidth=1.5, label=\"Theory\")\n\n# Label the figure.\nfig.suptitle(\"Cumulative distributions\")\nfor ax in axs:\n ax.grid(True)\n ax.legend()\n ax.set_xlabel(\"Annual rainfall (mm)\")\n ax.set_ylabel(\"Probability of occurrence\")\n ax.label_outer()\n\nplt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ".. tags:: plot-type: ecdf, plot-type: histogram, domain: statistics\n\n.. admonition:: References\n\n The use of the following functions, methods, classes and modules is shown\n in this example:\n\n - `matplotlib.axes.Axes.hist` / `matplotlib.pyplot.hist`\n - `matplotlib.axes.Axes.ecdf` / `matplotlib.pyplot.ecdf`\n\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.2" } }, "nbformat": 4, "nbformat_minor": 0 }