<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Toban Wiebe - Writing</title>
    <link>https://tobanwiebe.com/writing/</link>
    <description>Essays by Toban Wiebe on economics, math, programming, and tools.</description>
    <atom:link href="https://tobanwiebe.com/writing/rss.xml" rel="self" type="application/rss+xml"/>
    <lastBuildDate>Sat, 20 Jun 2026 12:00:00 GMT</lastBuildDate>
    <language>en</language>
    <managingEditor>tobanw@gmail.com (Toban Wiebe)</managingEditor>
<item>
  <title>Use eCDFs instead of histograms</title>
  <link>https://tobanwiebe.com/writing/use-ecdfs-instead-of-histograms/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/use-ecdfs-instead-of-histograms/</guid>
  <pubDate>Sat, 20 Jun 2026 12:00:00 GMT</pubDate>
  <description>Histograms are familiar, but empirical CDFs are usually a better default for visualizing distributions: no bin widths, easy percentiles, and better behavior in the tails.</description>
  <content:encoded><![CDATA[<p>Histograms are usually the first plot people reach for when they want to look at a distribution.
They are familiar and often good enough.
But for many distribution questions, they are the wrong default.</p>
<p>My default has long been the empirical CDF, or eCDF.
Once you get used to reading them, eCDFs are almost always cleaner: there are no bin widths to choose, percentiles can be read directly off the curve, and zooming in does not make you forget about the tails.</p>
<p>An eCDF plots the fraction of observations less than or equal to each value.
Formally, for a sample <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>x</mi><mn>1</mn></msub><mo separator="true">,</mo><mo>…</mo><mo separator="true">,</mo><msub><mi>x</mi><mi>n</mi></msub></mrow><annotation encoding="application/x-tex">x_1,\ldots,x_n</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner">…</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.1514em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span>, it is</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mover accent="true"><mi>F</mi><mo>^</mo></mover><mo stretchy="false">(</mo><mi>t</mi><mo stretchy="false">)</mo><mo>=</mo><mfrac><mn>1</mn><mi>n</mi></mfrac><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></munderover><mn mathvariant="bold">1</mn><mo stretchy="false">(</mo><msub><mi>x</mi><mi>i</mi></msub><mo>≤</mo><mi>t</mi><mo stretchy="false">)</mo><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">\hat F(t) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}(x_i \le t).</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1.1968em;vertical-align:-0.25em"></span><span class="mord accent"><span class="vlist-t"><span class="vlist-r"><span class="vlist" style="height:0.9468em"><span style="top:-3em"><span class="pstrut" style="height:3em"></span><span class="mord mathnormal" style="margin-right:0.1389em">F</span></span><span style="top:-3.2523em"><span class="pstrut" style="height:3em"></span><span class="accent-body" style="left:-0.1667em"><span class="mord">^</span></span></span></span></span></span></span><span class="mopen">(</span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:2.9291em;vertical-align:-1.2777em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3214em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">n</span></span></span><span style="top:-3.23em"><span 
class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.686em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.6514em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mrel mtight">=</span><span class="mord mtight">1</span></span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span><span style="top:-4.3em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">n</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathbf">1</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span 
class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">≤</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal">t</span><span class="mclose">)</span><span class="mord">.</span></span></span></span></span>
<p>In plain English: the x-axis is the value, and the y-axis is the cumulative share of the data.
If the curve is at 0.8 when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>x</mi><mo>=</mo><mn>12</mn></mrow><annotation encoding="application/x-tex">x = 12</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord mathnormal">x</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">12</span></span></span></span>, then 80% of the observations are at or below 12.</p>
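<p>A minimal sketch of that definition in Python, assuming NumPy is available.
The function names <code>ecdf</code> and <code>ecdf_at</code> and the ten-point sample are made up for illustration; they are not from any particular library.</p>

```python
import numpy as np


def ecdf(sample):
    """Return the distinct sorted values and the eCDF height at each one."""
    x = np.sort(np.asarray(sample, dtype=float))
    # Collapse ties so each distinct value gets a single step.
    values, counts = np.unique(x, return_counts=True)
    # Height at a value = share of observations less than or equal to it.
    heights = np.cumsum(counts) / x.size
    return values, heights


def ecdf_at(sample, t):
    """Evaluate the eCDF at t: the share of observations <= t."""
    x = np.sort(np.asarray(sample, dtype=float))
    # side="right" counts every observation <= t, including ties at t.
    return np.searchsorted(x, t, side="right") / x.size


# Illustrative sample (an assumption): 8 of the 10 values are at or below 12.
data = [3, 7, 7, 9, 12, 15, 20, 4, 11, 12]
values, heights = ecdf(data)
share_at_12 = ecdf_at(data, 12)
print(f"share of observations <= 12: {share_at_12:.0%}")
```

<p>Plotting <code>values</code> against <code>heights</code> as a step function gives the eCDF curve, and <code>ecdf_at</code> is the "read the curve at x" operation described above.</p>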
<p>Here is a simple example.
The sample mostly comes from a central distribution, with a visible right-tailed outlier component.
Draw a few random samples, change the histogram bin width, and toggle the x-axis zoom.
The histogram keeps asking you to make plotting choices; the eCDF answers distribution questions directly.</p>
<figure class="viz ecdf-viz" aria-labelledby="_r1R_0_"><div class="viz-header"><div><figcaption id="_r1R_0_">Histogram vs eCDF on the same sample</figcaption><p>The sample has a visible right-tailed outlier component. Zooming hides those x-values, but the eCDF still shows the missing tail mass as vertical distance from 100%.</p></div><code>full sample range</code></div><div class="ecdf-controls" aria-label="Visualization controls"><button type="button">draw random sample</button><button class="" type="button" aria-pressed="false">zoom x-axis</button><div class="ecdf-slider-control"><div><label for="_r1R_0H1_">bin width</label><output for="_r1R_0H1_">0.5</output></div><input id="_r1R_0H1_" aria-label="Histogram bin width" max="2" min="0.2" step="0.1" type="range" value="0.5"/><div class="ecdf-stepper" aria-label="Bin width step controls"><button type="button" aria-label="Decrease bin width" title="Decrease bin width">-</button><button type="button" aria-label="Increase bin width" title="Increase bin width">+</button></div></div></div><dl class="ecdf-stats" aria-label="Sample summary"><div><dt>sample size</dt><dd>420</dd><span>420 visible</span></div><div><dt>hidden tails</dt><dd>0.0%</dd><span>0 below / 0 above</span></div><div><dt>median</dt><dd>0.2</dd><span>read at eCDF = 50%</span></div><div><dt>90th pct.</dt><dd>1.9</dd><span>read at eCDF = 90%</span></div></dl><svg class="viz-plot ecdf-plot" role="img" viewBox="0 0 1040 360" aria-label="Two-panel plot comparing a histogram and empirical cumulative distribution function for the same sample."><g><text class="ecdf-panel-title" x="58" y="26">Histogram</text><text class="viz-label" x="152" y="26">local mass depends on bin width</text></g><g><text class="ecdf-panel-title" x="568" y="26">eCDF</text><text class="viz-label" x="662" y="26">cumulative share keeps the tail mass visible</text></g><g><g><line class="viz-axis" x1="58" y1="58" x2="58" y2="288"></line><line class="viz-axis" x1="58" y1="288" x2="488" 
y2="288"></line><g><line class="viz-grid" x1="58" y1="288" x2="488" y2="288"></line><text class="viz-label" x="48" y="292" text-anchor="end">0%</text></g><g><line class="viz-grid" x1="58" y1="173" x2="488" y2="173"></line><text class="viz-label" x="48" y="177" text-anchor="end">10%</text></g><g><line class="viz-grid" x1="58" y1="58" x2="488" y2="58"></line><text class="viz-label" x="48" y="62" text-anchor="end">20%</text></g><g><line class="viz-grid" x1="58" y1="58" x2="58" y2="288"></line><text class="viz-label" x="58" y="312" text-anchor="middle">-20</text></g><g><line class="viz-grid" x1="158" y1="58" x2="158" y2="288"></line><text class="viz-label" x="158" y="312" text-anchor="middle">-10</text></g><g><line class="viz-grid" x1="258" y1="58" x2="258" y2="288"></line><text class="viz-label" x="258" y="312" text-anchor="middle">0</text></g><g><line class="viz-grid" x1="358" y1="58" x2="358" y2="288"></line><text class="viz-label" x="358" y="312" text-anchor="middle">10</text></g><g><line class="viz-grid" x1="458" y1="58" x2="458" y2="288"></line><text class="viz-label" x="458" y="312" text-anchor="middle">20</text></g></g><rect class="ecdf-hist-bar" x="59" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="64" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="69" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="74" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="79" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="84" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="89" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="94" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="99" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="104" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="109" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="114" y="288" 
width="3" height="0"></rect><rect class="ecdf-hist-bar" x="119" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="124" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="129" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="134" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="139" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="144" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="149" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="154" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="159" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="164" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="169" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="174" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="179" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="184" y="279.7857142857143" width="3" height="8.214285714285722"></rect><rect class="ecdf-hist-bar" x="189" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="194" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="199" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="204" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="209" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="214" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="219" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="224" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="229" y="282.5238095238095" width="3" height="5.476190476190482"></rect><rect class="ecdf-hist-bar" x="234" y="266.0952380952381" width="3" height="21.904761904761926"></rect><rect class="ecdf-hist-bar" 
x="239" y="255.14285714285714" width="3" height="32.85714285714286"></rect><rect class="ecdf-hist-bar" x="244" y="200.38095238095238" width="3" height="87.61904761904762"></rect><rect class="ecdf-hist-bar" x="249" y="156.57142857142858" width="3" height="131.42857142857142"></rect><rect class="ecdf-hist-bar" x="254" y="120.97619047619048" width="3" height="167.02380952380952"></rect><rect class="ecdf-hist-bar" x="259" y="60.73809523809524" width="3" height="227.26190476190476"></rect><rect class="ecdf-hist-bar" x="264" y="129.1904761904762" width="3" height="158.8095238095238"></rect><rect class="ecdf-hist-bar" x="269" y="186.6904761904762" width="3" height="101.3095238095238"></rect><rect class="ecdf-hist-bar" x="274" y="192.16666666666669" width="3" height="95.83333333333331"></rect><rect class="ecdf-hist-bar" x="279" y="263.35714285714283" width="3" height="24.642857142857167"></rect><rect class="ecdf-hist-bar" x="284" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="289" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="294" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="299" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="304" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="309" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="314" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="319" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="324" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="329" y="263.35714285714283" width="3" height="24.642857142857167"></rect><rect class="ecdf-hist-bar" x="334" y="282.5238095238095" width="3" height="5.476190476190482"></rect><rect class="ecdf-hist-bar" x="339" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="344" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="349" 
y="282.5238095238095" width="3" height="5.476190476190482"></rect><rect class="ecdf-hist-bar" x="354" y="277.04761904761904" width="3" height="10.952380952380963"></rect><rect class="ecdf-hist-bar" x="359" y="282.5238095238095" width="3" height="5.476190476190482"></rect><rect class="ecdf-hist-bar" x="364" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="369" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="374" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="379" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="384" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="389" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="394" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="399" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="404" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="409" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="414" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="419" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="424" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="429" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="434" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="439" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="444" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="449" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="454" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="459" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="464" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="469" y="288" width="3" 
height="0"></rect><rect class="ecdf-hist-bar" x="474" y="285.26190476190476" width="3" height="2.738095238095241"></rect><rect class="ecdf-hist-bar" x="479" y="288" width="3" height="0"></rect><rect class="ecdf-hist-bar" x="484" y="288" width="3" height="0"></rect><text class="viz-label" x="488" y="330" text-anchor="end">value</text></g><g><rect class="ecdf-tail-gap" x="568" y="58" width="430" height="0"></rect><g><line class="viz-axis" x1="568" y1="58" x2="568" y2="288"></line><line class="viz-axis" x1="568" y1="288" x2="998" y2="288"></line><g><line class="viz-grid" x1="568" y1="288" x2="998" y2="288"></line><text class="viz-label" x="558" y="292" text-anchor="end">0%</text></g><g><line class="viz-grid" x1="568" y1="230.5" x2="998" y2="230.5"></line><text class="viz-label" x="558" y="234.5" text-anchor="end">25%</text></g><g><line class="viz-grid" x1="568" y1="173" x2="998" y2="173"></line><text class="viz-label" x="558" y="177" text-anchor="end">50%</text></g><g><line class="viz-grid" x1="568" y1="115.5" x2="998" y2="115.5"></line><text class="viz-label" x="558" y="119.5" text-anchor="end">75%</text></g><g><line class="viz-grid" x1="568" y1="58" x2="998" y2="58"></line><text class="viz-label" x="558" y="62" text-anchor="end">100%</text></g><g><line class="viz-grid" x1="568" y1="58" x2="568" y2="288"></line><text class="viz-label" x="568" y="312" text-anchor="middle">-20</text></g><g><line class="viz-grid" x1="668" y1="58" x2="668" y2="288"></line><text class="viz-label" x="668" y="312" text-anchor="middle">-10</text></g><g><line class="viz-grid" x1="768" y1="58" x2="768" y2="288"></line><text class="viz-label" x="768" y="312" text-anchor="middle">0</text></g><g><line class="viz-grid" x1="868" y1="58" x2="868" y2="288"></line><text class="viz-label" x="868" y="312" text-anchor="middle">10</text></g><g><line class="viz-grid" x1="968" y1="58" x2="968" y2="288"></line><text class="viz-label" x="968" y="312" text-anchor="middle">20</text></g></g><path 
class="ecdf-step-line" d="M 568.00 288.00 L 575.16 288.00 L 575.16 287.45 L 665.99 287.45 L 665.99 286.90 L 671.27 286.90 L 671.27 286.36 L 683.02 286.36 L 683.02 285.81 L 690.61 285.81 L 690.61 285.26 L 693.07 285.26 L 693.07 284.71 L 693.32 284.71 L 693.32 284.17 L 695.85 284.17 L 695.85 283.62 L 741.39 283.62 L 741.39 283.07 L 741.90 283.07 L 741.90 282.52 L 743.97 282.52 L 743.97 281.98 L 744.78 281.98 L 744.78 281.43 L 745.38 281.43 L 745.38 280.88 L 745.41 280.88 L 745.41 280.33 L 745.51 280.33 L 745.51 279.79 L 746.58 279.79 L 746.58 279.24 L 747.80 279.24 L 747.80 278.69 L 747.89 278.69 L 747.89 278.14 L 748.03 278.14 L 748.03 277.60 L 748.31 277.60 L 748.31 277.05 L 748.36 277.05 L 748.36 276.50 L 749.57 276.50 L 749.57 275.95 L 750.21 275.95 L 750.21 275.40 L 750.56 275.40 L 750.56 274.86 L 750.60 274.86 L 750.60 274.31 L 751.33 274.31 L 751.33 273.76 L 752.35 273.76 L 752.35 273.21 L 752.43 273.21 L 752.43 272.67 L 752.86 272.67 L 752.86 272.12 L 752.94 272.12 L 752.94 271.57 L 753.23 271.57 L 753.23 271.02 L 753.27 271.02 L 753.27 270.48 L 753.69 270.48 L 753.69 269.93 L 753.89 269.93 L 753.89 269.38 L 754.10 269.38 L 754.10 268.83 L 754.13 268.83 L 754.13 268.29 L 754.36 268.29 L 754.36 267.74 L 754.43 267.74 L 754.43 267.19 L 754.85 267.19 L 754.85 266.64 L 754.98 266.64 L 754.98 266.10 L 755.03 266.10 L 755.03 265.55 L 755.23 265.55 L 755.23 265.00 L 755.34 265.00 L 755.34 264.45 L 755.35 264.45 L 755.35 263.90 L 755.75 263.90 L 755.75 263.36 L 756.25 263.36 L 756.25 262.81 L 756.31 262.81 L 756.31 262.26 L 756.71 262.26 L 756.71 261.71 L 756.82 261.71 L 756.82 261.17 L 756.86 261.17 L 756.86 260.62 L 756.89 260.62 L 756.89 260.07 L 756.96 260.07 L 756.96 259.52 L 757.18 259.52 L 757.18 258.98 L 757.23 258.98 L 757.23 258.43 L 757.40 258.43 L 757.40 257.88 L 757.48 257.88 L 757.48 257.33 L 757.49 257.33 L 757.49 256.79 L 757.57 256.79 L 757.57 256.24 L 757.75 256.24 L 757.75 255.69 L 757.89 255.69 L 757.89 255.14 L 757.89 255.14 L 757.89 254.60 L 
757.91 254.60 L 757.91 254.05 L 758.12 254.05 L 758.12 253.50 L 758.13 253.50 L 758.13 252.95 L 758.58 252.95 L 758.58 252.40 L 758.62 252.40 L 758.62 251.86 L 758.69 251.86 L 758.69 251.31 L 758.72 251.31 L 758.72 250.76 L 758.78 250.76 L 758.78 250.21 L 758.86 250.21 L 758.86 249.67 L 758.92 249.67 L 758.92 249.12 L 758.97 249.12 L 758.97 248.57 L 759.00 248.57 L 759.00 248.02 L 759.01 248.02 L 759.01 247.48 L 759.07 247.48 L 759.07 246.93 L 759.14 246.93 L 759.14 246.38 L 759.15 246.38 L 759.15 245.83 L 759.17 245.83 L 759.17 245.29 L 759.30 245.29 L 759.30 244.74 L 759.48 244.74 L 759.48 244.19 L 759.50 244.19 L 759.50 243.64 L 759.84 243.64 L 759.84 243.10 L 759.90 243.10 L 759.90 242.55 L 760.05 242.55 L 760.05 242.00 L 760.12 242.00 L 760.12 241.45 L 760.55 241.45 L 760.55 240.90 L 760.83 240.90 L 760.83 240.36 L 760.88 240.36 L 760.88 239.81 L 761.05 239.81 L 761.05 239.26 L 761.10 239.26 L 761.10 238.71 L 761.25 238.71 L 761.25 238.17 L 761.57 238.17 L 761.57 237.62 L 761.66 237.62 L 761.66 237.07 L 761.69 237.07 L 761.69 236.52 L 761.86 236.52 L 761.86 235.98 L 761.87 235.98 L 761.87 235.43 L 761.95 235.43 L 761.95 234.88 L 762.04 234.88 L 762.04 234.33 L 762.09 234.33 L 762.09 233.79 L 762.15 233.79 L 762.15 233.24 L 762.20 233.24 L 762.20 232.69 L 762.22 232.69 L 762.22 232.14 L 762.28 232.14 L 762.28 231.60 L 762.37 231.60 L 762.37 231.05 L 762.39 231.05 L 762.39 230.50 L 762.74 230.50 L 762.74 229.95 L 762.91 229.95 L 762.91 229.40 L 762.94 229.40 L 762.94 228.86 L 762.98 228.86 L 762.98 228.31 L 762.99 228.31 L 762.99 227.76 L 763.13 227.76 L 763.13 227.21 L 763.14 227.21 L 763.14 226.67 L 763.21 226.67 L 763.21 226.12 L 763.33 226.12 L 763.33 225.57 L 763.35 225.57 L 763.35 225.02 L 763.41 225.02 L 763.41 224.48 L 763.62 224.48 L 763.62 223.93 L 763.71 223.93 L 763.71 223.38 L 763.88 223.38 L 763.88 222.83 L 764.04 222.83 L 764.04 222.29 L 764.05 222.29 L 764.05 221.74 L 764.16 221.74 L 764.16 221.19 L 764.17 221.19 L 764.17 220.64 L 764.33 220.64 L 
764.33 220.10 L 764.40 220.10 L 764.40 219.55 L 764.51 219.55 L 764.51 219.00 L 764.55 219.00 L 764.55 218.45 L 764.57 218.45 L 764.57 217.90 L 764.65 217.90 L 764.65 217.36 L 764.79 217.36 L 764.79 216.81 L 764.92 216.81 L 764.92 216.26 L 764.93 216.26 L 764.93 215.71 L 765.01 215.71 L 765.01 215.17 L 765.04 215.17 L 765.04 214.62 L 765.18 214.62 L 765.18 214.07 L 765.35 214.07 L 765.35 213.52 L 765.43 213.52 L 765.43 212.98 L 765.61 212.98 L 765.61 212.43 L 765.62 212.43 L 765.62 211.88 L 765.71 211.88 L 765.71 211.33 L 765.80 211.33 L 765.80 210.79 L 765.93 210.79 L 765.93 210.24 L 765.97 210.24 L 765.97 209.69 L 766.30 209.69 L 766.30 209.14 L 766.32 209.14 L 766.32 208.60 L 766.36 208.60 L 766.36 208.05 L 766.41 208.05 L 766.41 207.50 L 766.42 207.50 L 766.42 206.95 L 766.42 206.95 L 766.42 206.40 L 766.46 206.40 L 766.46 205.86 L 766.50 205.86 L 766.50 205.31 L 766.61 205.31 L 766.61 204.76 L 766.71 204.76 L 766.71 204.21 L 766.82 204.21 L 766.82 203.67 L 766.83 203.67 L 766.83 203.12 L 766.97 203.12 L 766.97 202.57 L 767.17 202.57 L 767.17 202.02 L 767.18 202.02 L 767.18 201.48 L 767.28 201.48 L 767.28 200.93 L 767.33 200.93 L 767.33 200.38 L 767.35 200.38 L 767.35 199.83 L 767.53 199.83 L 767.53 199.29 L 767.73 199.29 L 767.73 198.74 L 767.76 198.74 L 767.76 198.19 L 767.77 198.19 L 767.77 197.64 L 767.80 197.64 L 767.80 197.10 L 767.81 197.10 L 767.81 196.55 L 767.83 196.55 L 767.83 196.00 L 767.86 196.00 L 767.86 195.45 L 767.88 195.45 L 767.88 194.90 L 767.99 194.90 L 767.99 194.36 L 768.05 194.36 L 768.05 193.81 L 768.07 193.81 L 768.07 193.26 L 768.08 193.26 L 768.08 192.71 L 768.09 192.71 L 768.09 192.17 L 768.10 192.17 L 768.10 191.62 L 768.10 191.62 L 768.10 191.07 L 768.16 191.07 L 768.16 190.52 L 768.20 190.52 L 768.20 189.98 L 768.23 189.98 L 768.23 189.43 L 768.31 189.43 L 768.31 188.88 L 768.32 188.88 L 768.32 188.33 L 768.32 188.33 L 768.32 187.79 L 768.51 187.79 L 768.51 187.24 L 768.55 187.24 L 768.55 186.69 L 768.61 186.69 L 768.61 186.14 L 
768.68 186.14 L 768.68 185.60 L 768.75 185.60 L 768.75 185.05 L 768.80 185.05 L 768.80 184.50 L 768.89 184.50 L 768.89 183.95 L 768.92 183.95 L 768.92 183.40 L 769.05 183.40 L 769.05 182.86 L 769.18 182.86 L 769.18 182.31 L 769.23 182.31 L 769.23 181.76 L 769.28 181.76 L 769.28 181.21 L 769.38 181.21 L 769.38 180.67 L 769.66 180.67 L 769.66 180.12 L 769.74 180.12 L 769.74 179.57 L 769.76 179.57 L 769.76 179.02 L 769.83 179.02 L 769.83 178.48 L 769.90 178.48 L 769.90 177.93 L 769.93 177.93 L 769.93 177.38 L 770.01 177.38 L 770.01 176.83 L 770.06 176.83 L 770.06 176.29 L 770.08 176.29 L 770.08 175.74 L 770.10 175.74 L 770.10 175.19 L 770.16 175.19 L 770.16 174.64 L 770.20 174.64 L 770.20 174.10 L 770.23 174.10 L 770.23 173.55 L 770.34 173.55 L 770.34 173.00 L 770.36 173.00 L 770.36 172.45 L 770.42 172.45 L 770.42 171.90 L 770.55 171.90 L 770.55 171.36 L 770.56 171.36 L 770.56 170.81 L 770.57 170.81 L 770.57 170.26 L 770.58 170.26 L 770.58 169.71 L 770.60 169.71 L 770.60 169.17 L 770.69 169.17 L 770.69 168.62 L 770.75 168.62 L 770.75 168.07 L 771.01 168.07 L 771.01 167.52 L 771.06 167.52 L 771.06 166.98 L 771.09 166.98 L 771.09 166.43 L 771.11 166.43 L 771.11 165.88 L 771.12 165.88 L 771.12 165.33 L 771.14 165.33 L 771.14 164.79 L 771.21 164.79 L 771.21 164.24 L 771.30 164.24 L 771.30 163.69 L 771.35 163.69 L 771.35 163.14 L 771.38 163.14 L 771.38 162.60 L 771.39 162.60 L 771.39 162.05 L 771.45 162.05 L 771.45 161.50 L 771.48 161.50 L 771.48 160.95 L 771.68 160.95 L 771.68 160.40 L 771.73 160.40 L 771.73 159.86 L 771.74 159.86 L 771.74 159.31 L 771.77 159.31 L 771.77 158.76 L 771.88 158.76 L 771.88 158.21 L 771.89 158.21 L 771.89 157.67 L 771.92 157.67 L 771.92 157.12 L 771.98 157.12 L 771.98 156.57 L 772.05 156.57 L 772.05 156.02 L 772.06 156.02 L 772.06 155.48 L 772.22 155.48 L 772.22 154.93 L 772.30 154.93 L 772.30 154.38 L 772.31 154.38 L 772.31 153.83 L 772.43 153.83 L 772.43 153.29 L 772.46 153.29 L 772.46 152.74 L 772.46 152.74 L 772.46 152.19 L 772.56 152.19 L 
772.56 151.64 L 772.67 151.64 L 772.67 151.10 L 772.71 151.10 L 772.71 150.55 L 772.92 150.55 L 772.92 150.00 L 772.97 150.00 L 772.97 149.45 L 772.98 149.45 L 772.98 148.90 L 773.27 148.90 L 773.27 148.36 L 773.56 148.36 L 773.56 147.81 L 773.69 147.81 L 773.69 147.26 L 773.72 147.26 L 773.72 146.71 L 773.75 146.71 L 773.75 146.17 L 773.93 146.17 L 773.93 145.62 L 774.13 145.62 L 774.13 145.07 L 774.33 145.07 L 774.33 144.52 L 774.48 144.52 L 774.48 143.98 L 774.48 143.98 L 774.48 143.43 L 774.49 143.43 L 774.49 142.88 L 774.55 142.88 L 774.55 142.33 L 774.66 142.33 L 774.66 141.79 L 774.76 141.79 L 774.76 141.24 L 774.79 141.24 L 774.79 140.69 L 774.82 140.69 L 774.82 140.14 L 774.94 140.14 L 774.94 139.60 L 775.06 139.60 L 775.06 139.05 L 775.06 139.05 L 775.06 138.50 L 775.17 138.50 L 775.17 137.95 L 775.26 137.95 L 775.26 137.40 L 775.28 137.40 L 775.28 136.86 L 775.32 136.86 L 775.32 136.31 L 775.34 136.31 L 775.34 135.76 L 775.39 135.76 L 775.39 135.21 L 775.45 135.21 L 775.45 134.67 L 775.46 134.67 L 775.46 134.12 L 775.51 134.12 L 775.51 133.57 L 775.56 133.57 L 775.56 133.02 L 775.60 133.02 L 775.60 132.48 L 775.65 132.48 L 775.65 131.93 L 775.76 131.93 L 775.76 131.38 L 775.77 131.38 L 775.77 130.83 L 775.83 130.83 L 775.83 130.29 L 775.88 130.29 L 775.88 129.74 L 775.95 129.74 L 775.95 129.19 L 775.97 129.19 L 775.97 128.64 L 776.00 128.64 L 776.00 128.10 L 776.05 128.10 L 776.05 127.55 L 776.11 127.55 L 776.11 127.00 L 776.18 127.00 L 776.18 126.45 L 776.22 126.45 L 776.22 125.90 L 776.22 125.90 L 776.22 125.36 L 776.23 125.36 L 776.23 124.81 L 776.24 124.81 L 776.24 124.26 L 776.72 124.26 L 776.72 123.71 L 776.74 123.71 L 776.74 123.17 L 777.07 123.17 L 777.07 122.62 L 777.13 122.62 L 777.13 122.07 L 777.17 122.07 L 777.17 121.52 L 777.36 121.52 L 777.36 120.98 L 777.37 120.98 L 777.37 120.43 L 777.40 120.43 L 777.40 119.88 L 777.41 119.88 L 777.41 119.33 L 777.80 119.33 L 777.80 118.79 L 777.85 118.79 L 777.85 118.24 L 777.89 118.24 L 777.89 117.69 L 
777.94 117.69 L 777.94 117.14 L 778.03 117.14 L 778.03 116.60 L 778.26 116.60 L 778.26 116.05 L 778.33 116.05 L 778.33 115.50 L 778.44 115.50 L 778.44 114.95 L 778.45 114.95 L 778.45 114.40 L 778.95 114.40 L 778.95 113.86 L 778.96 113.86 L 778.96 113.31 L 779.05 113.31 L 779.05 112.76 L 779.14 112.76 L 779.14 112.21 L 779.26 112.21 L 779.26 111.67 L 779.32 111.67 L 779.32 111.12 L 779.46 111.12 L 779.46 110.57 L 779.46 110.57 L 779.46 110.02 L 779.59 110.02 L 779.59 109.48 L 779.79 109.48 L 779.79 108.93 L 779.97 108.93 L 779.97 108.38 L 780.14 108.38 L 780.14 107.83 L 780.14 107.83 L 780.14 107.29 L 780.14 107.29 L 780.14 106.74 L 780.25 106.74 L 780.25 106.19 L 780.60 106.19 L 780.60 105.64 L 780.63 105.64 L 780.63 105.10 L 780.92 105.10 L 780.92 104.55 L 781.06 104.55 L 781.06 104.00 L 781.22 104.00 L 781.22 103.45 L 781.44 103.45 L 781.44 102.90 L 781.48 102.90 L 781.48 102.36 L 781.82 102.36 L 781.82 101.81 L 781.82 101.81 L 781.82 101.26 L 781.92 101.26 L 781.92 100.71 L 781.93 100.71 L 781.93 100.17 L 782.09 100.17 L 782.09 99.62 L 782.26 99.62 L 782.26 99.07 L 782.27 99.07 L 782.27 98.52 L 782.37 98.52 L 782.37 97.98 L 782.52 97.98 L 782.52 97.43 L 782.54 97.43 L 782.54 96.88 L 783.03 96.88 L 783.03 96.33 L 783.07 96.33 L 783.07 95.79 L 783.21 95.79 L 783.21 95.24 L 783.22 95.24 L 783.22 94.69 L 783.23 94.69 L 783.23 94.14 L 783.39 94.14 L 783.39 93.60 L 783.43 93.60 L 783.43 93.05 L 783.58 93.05 L 783.58 92.50 L 783.60 92.50 L 783.60 91.95 L 783.88 91.95 L 783.88 91.40 L 783.90 91.40 L 783.90 90.86 L 783.91 90.86 L 783.91 90.31 L 783.93 90.31 L 783.93 89.76 L 784.02 89.76 L 784.02 89.21 L 784.02 89.21 L 784.02 88.67 L 784.02 88.67 L 784.02 88.12 L 784.24 88.12 L 784.24 87.57 L 784.46 87.57 L 784.46 87.02 L 784.58 87.02 L 784.58 86.48 L 784.73 86.48 L 784.73 85.93 L 784.73 85.93 L 784.73 85.38 L 785.28 85.38 L 785.28 84.83 L 785.86 84.83 L 785.86 84.29 L 786.13 84.29 L 786.13 83.74 L 786.26 83.74 L 786.26 83.19 L 786.68 83.19 L 786.68 82.64 L 786.69 82.64 L 
786.69 82.10 L 786.76 82.10 L 786.76 81.55 L 786.77 81.55 L 786.77 81.00 L 787.24 81.00 L 787.24 80.45 L 787.30 80.45 L 787.30 79.90 L 787.39 79.90 L 787.39 79.36 L 787.61 79.36 L 787.61 78.81 L 787.66 78.81 L 787.66 78.26 L 787.69 78.26 L 787.69 77.71 L 788.40 77.71 L 788.40 77.17 L 788.89 77.17 L 788.89 76.62 L 789.01 76.62 L 789.01 76.07 L 789.50 76.07 L 789.50 75.52 L 789.54 75.52 L 789.54 74.98 L 790.23 74.98 L 790.23 74.43 L 790.45 74.43 L 790.45 73.88 L 791.70 73.88 L 791.70 73.33 L 792.00 73.33 L 792.00 72.79 L 804.69 72.79 L 804.69 72.24 L 838.04 72.24 L 838.04 71.69 L 838.39 71.69 L 838.39 71.14 L 838.91 71.14 L 838.91 70.60 L 839.18 70.60 L 839.18 70.05 L 839.43 70.05 L 839.43 69.50 L 840.34 69.50 L 840.34 68.95 L 842.12 68.95 L 842.12 68.40 L 842.25 68.40 L 842.25 67.86 L 842.88 67.86 L 842.88 67.31 L 843.42 67.31 L 843.42 66.76 L 847.04 66.76 L 847.04 66.21 L 853.88 66.21 L 853.88 65.67 L 858.92 65.67 L 858.92 65.12 L 860.38 65.12 L 860.38 64.57 L 863.17 64.57 L 863.17 64.02 L 864.23 64.02 L 864.23 63.48 L 864.75 63.48 L 864.75 62.93 L 865.48 62.93 L 865.48 62.38 L 872.11 62.38 L 872.11 61.83 L 872.25 61.83 L 872.25 61.29 L 873.42 61.29 L 873.42 60.74 L 880.03 60.74 L 880.03 60.19 L 883.23 60.19 L 883.23 59.64 L 929.38 59.64 L 929.38 59.10 L 958.08 59.10 L 958.08 58.55 L 985.37 58.55 L 985.37 58.00 L 998.00 58.00"></path><g><line class="ecdf-quantile-line" x1="568" y1="173" x2="770.3521653601398" y2="173"></line><line class="ecdf-quantile-line" x1="770.3521653601398" y1="173" x2="770.3521653601398" y2="288"></line><circle class="ecdf-quantile-point" cx="770.3521653601398" cy="173" r="4"></circle><text class="viz-label" x="776.3521653601398" y="166">median</text></g><g><line class="ecdf-quantile-line" x1="568" y1="81" x2="786.8150827297563" y2="81"></line><line class="ecdf-quantile-line" x1="786.8150827297563" y1="81" x2="786.8150827297563" y2="288"></line><circle class="ecdf-quantile-point" cx="786.8150827297563" cy="81" r="4"></circle><text 
class="viz-label" x="792.8150827297563" y="74">90th</text></g><text class="viz-label" x="998" y="330" text-anchor="end">value</text><text class="ecdf-tail-label" x="994" y="73" text-anchor="end">0.0% above range</text></g></svg></figure>
<h2 id="no-bins">No bins</h2>
<p>The first advantage is that eCDFs have no bin width.</p>
<p>With a histogram, the shape can change depending on where the bins start and how wide they are.
Make the bins too narrow and the plot looks noisy.
Make them too wide and you smooth away real structure.
There are rules of thumb for choosing bin widths, but the fact that you need a rule of thumb is already a smell.</p>
<p>An eCDF has no bins.
Every observation is used directly, and the plot means exactly what it says.
There is no hidden smoothing parameter quietly changing the visual impression.</p>
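<p>To make the “no bins” point concrete, here is a minimal sketch of how an eCDF can be computed in plain Python. The function name <code>ecdf</code> and the sample values are illustrative assumptions, not code from this site’s plots.</p>

```python
def ecdf(values):
    """Return (xs, ys): sorted values and the fraction of the sample at or below each."""
    xs = sorted(values)
    n = len(xs)
    # One step of height 1/n per observation; there is no bin width to choose.
    ys = [(i + 1) / n for i in range(n)]
    return xs, ys


xs, ys = ecdf([3.0, 1.0, 2.0, 10.0])
for x, y in zip(xs, ys):
    print(x, y)
```

<p>The only inputs are the data themselves. When values are tied, the same x appears more than once and the highest y at that x is the eCDF value there, which a step plot draws correctly.</p>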
<h2 id="percentiles-are-the-native-units">Percentiles are the native units</h2>
<p>The second advantage is that percentiles are trivial to read.
Want the median?
Look at where the curve crosses 0.5.
Want the 90th percentile?
Look at where it crosses 0.9.
Want to compare two distributions?
Plot two eCDFs and see which curve reaches a given cumulative share sooner.
This makes the plot especially nice for questions like “how bad is the worst 10%?” or “what share is under this threshold?”</p>
<p>Histograms are worse for this because they show local mass rather than cumulative probability.
That is sometimes what you want, but often you end up mentally integrating the bars anyway.
If the question is about ranks, percentiles, thresholds, or tail probabilities, the eCDF is already in the right units.</p>
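<p>Reading a percentile off the curve amounts to inverting the eCDF. The sketch below shows one common convention (the smallest observed value whose cumulative share reaches the target); the helper name <code>ecdf_quantile</code> is an assumption for illustration, and other interpolation conventions exist.</p>

```python
import math


def ecdf_quantile(values, q):
    """Smallest observed value whose eCDF height is at least q, for q in (0, 1]."""
    xs = sorted(values)
    n = len(xs)
    # The k-th smallest value has eCDF height k/n, so find the smallest k with k/n >= q.
    k = max(1, math.ceil(q * n))
    return xs[k - 1]


data = list(range(1, 11))  # the values 1 through 10
median = ecdf_quantile(data, 0.5)
p90 = ecdf_quantile(data, 0.9)
print(median, p90)
```

<p>This is the same operation as tracing a horizontal line from 0.5 or 0.9 on the y-axis to the curve and dropping down to the x-axis.</p>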
<h2 id="zooming-without-losing-the-tails">Zooming without losing the tails</h2>
<p>The third advantage is that eCDFs behave much better when there are outliers.</p>
<p>With a histogram, a few extreme values can stretch the x-axis so far that the relevant part of the distribution gets crushed into a small region.
You can zoom in, but then the tails disappear from the plot entirely.
So you are forced to choose between seeing the main body clearly and remembering that the tails exist.</p>
<p>With an eCDF, zooming into the relevant x-range does not have the same failure mode.
If you cut off the x-axis at, say, the 99th percentile, the curve simply stops near 0.99.
The missing 1% is still visible as missing vertical distance.
You can focus on the region where most of the data lives without pretending that the tail mass is zero.</p>
<p>This is a nice separation of concerns.
The x-axis can focus on the region you care about, while the y-axis still accounts for the whole sample.
If the curve ends at 0.96, then 4% of the sample remains above the visible range.
That is much harder to miss than a zoomed histogram with chopped-off bars.</p>
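<p>A rough sketch of the zooming behavior, under the assumption that the plot simply drops points beyond a chosen x cutoff. The function name <code>zoomed_ecdf</code> and the sample are made up for illustration.</p>

```python
from bisect import bisect_right


def zoomed_ecdf(values, x_max):
    """eCDF points with x at or below x_max, plus the share of the sample above x_max."""
    xs = sorted(values)
    n = len(xs)
    visible = bisect_right(xs, x_max)  # number of observations at or below x_max
    # Heights are still computed against the full sample size n, not the visible count.
    points = [(xs[i], (i + 1) / n) for i in range(visible)]
    share_above = 1 - visible / n
    return points, share_above


# 96 ordinary values plus 4 extreme ones (hypothetical data).
sample = list(range(1, 97)) + [500, 900, 2000, 10000]
points, share_above = zoomed_ecdf(sample, x_max=100)
print(points[-1], round(share_above, 2))
```

<p>Because the y-values are fractions of the whole sample, the last visible point sits at 0.96 and the remaining 4% shows up as the gap to 1.</p>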
<h2 id="the-familiarity-trap">The familiarity trap</h2>
<p>The real advantage of histograms is familiarity.
People know how to read them because they have seen them a thousand times.
That is not nothing, and in narrow cases it may be decisive.
Otherwise, “histogram” mostly means “the plot everyone is used to”, which is not a statistical argument.</p>
<p>For routine distribution checks, histograms are usually a bad bargain.
They make you choose bins, they hide percentiles, and they handle outliers poorly.
If the question is about thresholds, ranks, quantiles, tail probabilities, or comparing distributions, a histogram is forcing the reader to do extra work.
The eCDF is already showing the relevant object.</p>]]></content:encoded>
</item>
<item>
  <title>Stop worrying about class imbalance</title>
  <link>https://tobanwiebe.com/writing/stop-worrying-about-class-imbalance/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/stop-worrying-about-class-imbalance/</guid>
  <pubDate>Fri, 22 May 2026 12:00:00 GMT</pubDate>
  <description>Data scientists worry too much about class imbalance in classifier training sets. If you care about calibrated probabilities, log loss already handles the imbalance correctly.</description>
  <content:encoded><![CDATA[<p>Data scientists worry too much about class imbalance.</p>
<p>The standard example is something like fraud detection: suppose only 1% of transactions are fraudulent.
People see the 99/1 split and immediately reach for over-sampling, under-sampling, SMOTE, class weights, or some other intervention to “fix” the training set.
But in many cases there is nothing to fix.
The imbalance is not a bug in the data.
It is a fact about the world that the model is supposed to learn.</p>
<p>This is especially true if the classifier is meant to produce probabilities.
If the true probability of fraud is usually small, then a good model should usually predict small probabilities.
That is not a failure mode.
That is the correct answer.</p>
<p>The loss function matters here.
For a binary classifier that predicts <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>i</mi></msub><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><msub><mi>y</mi><mi>i</mi></msub><mo>=</mo><mn>1</mn><mo>∣</mo><msub><mi>x</mi><mi>i</mi></msub><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">p_i = Pr[y_i = 1 \mid x_i]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal 
mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∣</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">]</span></span></span></span>, the negative log-likelihood, aka log loss, is</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mrow><mo>−</mo><munder><mo>∑</mo><mi>i</mi></munder><mrow><mo fence="true">[</mo><msub><mi>y</mi><mi>i</mi></msub><mi>log</mi><mo>⁡</mo><mo stretchy="false">(</mo><msub><mi>p</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><msub><mi>y</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mi>log</mi><mo>⁡</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><msub><mi>p</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mo fence="true">]</mo></mrow><mi mathvariant="normal">.</mi></mrow><annotation encoding="application/x-tex">-\sum_i \left[y_i \log(p_i) + (1-y_i)\log(1-p_i)\right].</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:2.3277em;vertical-align:-1.2777em"></span><span class="mord">−</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop op-limits"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.05em"><span style="top:-1.8723em;margin-left:0em"><span class="pstrut" style="height:3.05em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span><span style="top:-3.05em"><span class="pstrut" style="height:3.05em"></span><span><span class="mop op-symbol large-op">∑</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:1.2777em"><span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em">[</span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" 
style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop">lo<span style="margin-right:0.0139em">g</span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop">lo<span style="margin-right:0.0139em">g</span></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span><span class="mclose delimcenter" style="top:0em">]</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord">.</span></span></span></span></span>
<p>This loss already accounts for both classes.
A positive example contributes <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>−</mo><mi>log</mi><mo>⁡</mo><mo stretchy="false">(</mo><msub><mi>p</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">-\log(p_i)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord">−</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop">lo<span style="margin-right:0.0139em">g</span></span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>; a negative example contributes <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>−</mo><mi>log</mi><mo>⁡</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><msub><mi>p</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">-\log(1-p_i)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord">−</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mop">lo<span style="margin-right:0.0139em">g</span></span><span class="mopen">(</span><span 
class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span></span>.
If positives are rare, there are fewer positive terms in the sum because positives really are rare.
The maximum likelihood estimate is trying to fit the conditional probability distribution in the population you sampled from.
It is not secretly confused by the fact that one class occurs more often than the other.</p>
<p>In fact, the imbalance is often exactly what determines the intercept, or more generally the baseline probability.
If you train a logistic regression model on representative data where positives occur 1% of the time, the model has to learn that the base rate is around 1%.
If you over-sample the positives until the training data is 50/50, you have told the model a different story about the world.
Unless you correct for that later, the predicted probabilities will be too high.</p>
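<p>One way to sketch the correction, assuming the only effect of weighting positives by a factor <code>weight</code> is to multiply the fitted odds by that factor (exact only for a well-specified model, approximate otherwise). The helper name <code>undo_positive_weight</code> is hypothetical.</p>

```python
def undo_positive_weight(p_weighted, weight):
    """Map a probability from a positive-weighted fit back to the original base rate.

    Assumes p_weighted is strictly between 0 and 1.
    """
    # Weighting positives by `weight` multiplies the fitted odds by `weight`,
    # so dividing the odds by `weight` undoes the shift.
    odds = p_weighted / (1 - p_weighted) / weight
    return odds / (1 + odds)


# Balancing a 1% base rate to 50/50 means about 99 copies of each positive.
print(undo_positive_weight(0.5, 99))
```

<p>Under this assumption, a prediction of 50% from the balanced model corresponds to about 1% in the real population, which is why uncorrected probabilities from a balanced training set come out too high.</p>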
<p>This is the main point that gets lost in the usual discussion: resampling changes the objective.
Over-sampling the minority class by duplicating examples is equivalent to giving those examples extra weight in the loss.
Under-sampling the majority class is equivalent to throwing away information from the common class.
Both can be useful if you deliberately want a different objective, but they are not neutral preprocessing steps.
They change what the model is optimizing.</p>
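<p>The equivalence between duplicating examples and weighting them can be checked directly. This is a small sketch with made-up labels and predictions; <code>log_loss</code> here is a hand-written helper, not a library function.</p>

```python
import math


def log_loss(labels, probs, weights=None):
    """Weighted sum of negative log-likelihood terms for binary labels."""
    if weights is None:
        weights = [1.0] * len(labels)
    total = 0.0
    for y, p, w in zip(labels, probs, weights):
        total -= w * (y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


labels = [1, 0, 0, 0]
probs = [0.2, 0.1, 0.3, 0.05]

# Over-sampling: repeat the single positive example three times.
oversampled_labels = [1, 1, 1, 0, 0, 0]
oversampled_probs = [0.2, 0.2, 0.2, 0.1, 0.3, 0.05]

# Weighting: keep the data as is, but count the positive three times.
weights = [3.0, 1.0, 1.0, 1.0]

loss_oversampled = log_loss(oversampled_labels, oversampled_probs)
loss_weighted = log_loss(labels, probs, weights)
print(loss_oversampled, loss_weighted)
```

<p>The two losses agree up to floating-point rounding, and both differ from the unweighted loss on the original data. That difference is the changed objective.</p>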
<p>Here is a simulated logistic regression example.
The model is well-specified and the test set has the real base rate.
Changing the positive-class weight mostly shifts the log-odds, so the ranking survives while the probabilities get worse.
The calibration plot compares mean predicted probability on the x-axis with the observed positive rate on the y-axis; the dashed diagonal is where predictions match reality:</p>
<figure class="viz imbalance-viz" aria-labelledby="_r1R_0_"><div class="viz-header"><div><figcaption id="_r1R_0_">Weighted logistic regression on a representative test set</figcaption><p>The data-generating process has a 0.97% positive rate. Weighting positives shifts the fitted log-odds by log(weight), which changes probabilities but preserves ranking.</p></div><code>194 positives / 20,000</code></div><div class="imbalance-control"><div class="imbalance-control-head"><label for="_r1R_0H1_">positive class weight</label><output for="_r1R_0H1_">25x</output></div><input id="_r1R_0H1_" aria-label="Positive class weight" class="imbalance-slider" max="2" min="0" step="0.01" type="range" value="1.3979400086720377"/><div class="imbalance-presets" aria-label="Positive class weight presets"><button class="" type="button" aria-pressed="false">1x</button><button class="" type="button" aria-pressed="false">10x</button><button class="" type="button" aria-pressed="false">100x</button></div></div><dl class="imbalance-metrics" aria-label="Simulation metrics"><div class="imbalance-metric is-harm"><dt>mean predicted</dt><dd>1.00% -&gt; 10.65%</dd><span>true rate 0.97%</span></div><div class="imbalance-metric is-harm"><dt>test log loss</dt><dd>0.0375 -&gt; 0.1463</dd><span>worse on real base rate</span></div><div class="imbalance-metric is-good"><dt>ROC AUC</dt><dd>0.916 -&gt; 0.916</dd><span>ranking unchanged</span></div><div class="imbalance-metric is-neutral"><dt>decision threshold</dt><dd>56.8%</dd><span>matches a calibrated 5.0% cutoff</span></div></dl><div class="imbalance-chart-wrap"><svg class="viz-plot imbalance-plot" role="img" viewBox="0 0 720 440" aria-label="Calibration plot comparing an unweighted calibrated logistic model with a positive-class weighted model."><text class="imbalance-plot-title" x="58" y="24">Calibration curve: predicted vs observed rates</text><line class="viz-axis" x1="58" y1="60" x2="58" y2="388"></line><line class="viz-axis" x1="58" y1="388" x2="692" 
y2="388"></line><g><line class="viz-grid" x1="58" y1="60" x2="58" y2="388"></line><line class="viz-grid" x1="58" y1="388" x2="692" y2="388"></line><text class="viz-label" x="58" y="416" text-anchor="middle">0%</text><text class="viz-label" x="48" y="392" text-anchor="end">0%</text></g><g><line class="viz-grid" x1="216.5" y1="60" x2="216.5" y2="388"></line><line class="viz-grid" x1="58" y1="306" x2="692" y2="306"></line><text class="viz-label" x="216.5" y="416" text-anchor="middle">25%</text><text class="viz-label" x="48" y="310" text-anchor="end">25%</text></g><g><line class="viz-grid" x1="375" y1="60" x2="375" y2="388"></line><line class="viz-grid" x1="58" y1="224" x2="692" y2="224"></line><text class="viz-label" x="375" y="416" text-anchor="middle">50%</text><text class="viz-label" x="48" y="228" text-anchor="end">50%</text></g><g><line class="viz-grid" x1="533.5" y1="60" x2="533.5" y2="388"></line><line class="viz-grid" x1="58" y1="142" x2="692" y2="142"></line><text class="viz-label" x="533.5" y="416" text-anchor="middle">75%</text><text class="viz-label" x="48" y="146" text-anchor="end">75%</text></g><g><line class="viz-grid" x1="692" y1="60" x2="692" y2="388"></line><line class="viz-grid" x1="58" y1="60" x2="692" y2="60"></line><text class="viz-label" x="692" y="416" text-anchor="middle">100%</text><text class="viz-label" x="48" y="64" text-anchor="end">100%</text></g><line class="imbalance-ideal" x1="58" y1="388" x2="692" y2="60"></line><path class="imbalance-base-line" d="M 58.02 388.00 L 58.08 387.67 L 58.18 388.00 L 58.33 388.00 L 58.58 387.51 L 59.01 387.51 L 59.79 386.85 L 61.42 385.87 L 65.65 384.23 L 106.35 364.55"></path><path class="imbalance-weighted-line" d="M 58.54 388.00 L 59.98 387.67 L 62.35 388.00 L 66.07 388.00 L 72.16 387.51 L 82.26 387.51 L 99.81 386.85 L 133.34 385.87 L 204.33 384.23 L 416.22 364.55"></path><circle class="imbalance-base-point" cx="58.02175437707165" cy="388" r="4"></circle><circle class="imbalance-base-point" 
cx="58.07959844657345" cy="387.672" r="4"></circle><circle class="imbalance-base-point" cx="58.175367623144986" cy="388" r="4"></circle><circle class="imbalance-base-point" cx="58.326913971910976" cy="388" r="4"></circle><circle class="imbalance-base-point" cx="58.579313055266034" cy="387.508" r="4"></circle><circle class="imbalance-base-point" cx="59.008505223111534" cy="387.508" r="4"></circle><circle class="imbalance-base-point" cx="59.78919801130401" cy="386.852" r="4"></circle><circle class="imbalance-base-point" cx="61.41676478310845" cy="385.868" r="4"></circle><circle class="imbalance-base-point" cx="65.650124171282" cy="384.228" r="4"></circle><circle class="imbalance-base-point" cx="106.35246033722709" cy="364.548" r="4"></circle><circle class="imbalance-weighted-point" cx="58.54326228008776" cy="388" r="4"></circle><circle class="imbalance-weighted-point" cx="59.983555830639794" cy="387.672" r="4"></circle><circle class="imbalance-weighted-point" cx="62.354215232194726" cy="388" r="4"></circle><circle class="imbalance-weighted-point" cx="66.07016371761051" cy="388" r="4"></circle><circle class="imbalance-weighted-point" cx="72.16421797309114" cy="387.508" r="4"></circle><circle class="imbalance-weighted-point" cx="82.2640626579327" cy="387.508" r="4"></circle><circle class="imbalance-weighted-point" cx="99.81490541383937" cy="386.852" r="4"></circle><circle class="imbalance-weighted-point" cx="133.34177501450716" cy="385.868" r="4"></circle><circle class="imbalance-weighted-point" cx="204.32525860604918" cy="384.228" r="4"></circle><circle class="imbalance-weighted-point" cx="416.2237961861186" cy="364.548" r="4"></circle><text class="viz-label" x="58" y="50">observed positive rate</text><text class="viz-label" x="692" y="432" text-anchor="end">mean predicted probability</text></svg><div class="imbalance-legend" aria-hidden="true"><span><i class="imbalance-key imbalance-key-base"></i> calibrated</span><span><i class="imbalance-key 
imbalance-key-weighted"></i> weighted</span><span><i class="imbalance-key imbalance-key-ideal"></i> perfect calibration</span></div></div></figure>
<p>This matters because log loss is a proper scoring rule.
In plain English, that means the expected loss is minimized by telling the truth: if the conditional probability is 0.03, the best prediction under log loss is 0.03.
But if you train on an artificially balanced dataset, the empirical distribution no longer has the same base rate as the real distribution.
The model may still learn a ranking that is useful for discrimination, but its raw probabilities will generally be miscalibrated.
And calibration is exactly what you want if the downstream decision depends on expected value, risk, or any comparison of probabilities.</p>
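<p>The “telling the truth” property can be verified numerically. The sketch below scans a grid of candidate predictions for an event whose true probability is 0.03; the grid resolution is an arbitrary choice.</p>

```python
import math


def expected_log_loss(true_p, predicted_q):
    """Expected log loss of predicting q when the event occurs with probability p."""
    return -(true_p * math.log(predicted_q) + (1 - true_p) * math.log(1 - predicted_q))


true_p = 0.03
candidates = [i / 1000 for i in range(1, 1000)]  # 0.001, 0.002, ..., 0.999
best_q = min(candidates, key=lambda q: expected_log_loss(true_p, q))
print(best_q)
```

<p>Any prediction other than the true probability, including the 0.5 a balanced training set would push toward, has a strictly higher expected loss.</p>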
<p>There are reasonable caveats.
If the rare class is extremely rare, you may not have enough examples to learn much about it.
That is a sample size problem, not a class imbalance problem.
The solution is usually more data, better features, pooling across related cases, or stronger regularization.
Also, if your optimization procedure struggles because mini-batches rarely contain positives, there may be engineering tricks that help training.
But those tricks should be understood as optimization aids, not as corrections to the statistical target.</p>
<p>Similarly, if your goal is not calibrated probabilities but a specific decision rule, then asymmetric costs may justify a weighted loss.
For example, missing a fraud case may be much worse than annoying a customer with a review.
But then the class weights should come from the costs of the decision problem, not from the class frequencies themselves.
The fact that fraud is rare does not automatically imply that fraudulent examples deserve 99 times the weight.</p>
<p>The clean separation is:</p>
<ol>
<li>Use log loss if you want calibrated probabilities.</li>
<li>Choose a threshold using the costs and benefits of action.</li>
<li>Evaluate calibration and decision quality on data with the real base rate.</li>
</ol>
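<p>Step 2 can be sketched as follows, assuming the only costs are a fixed cost for acting on a negative and a fixed cost for failing to act on a positive. The function names and cost values are hypothetical.</p>

```python
def cost_threshold(cost_false_positive, cost_false_negative):
    """Probability above which acting has lower expected cost than not acting."""
    # Acting on a negative costs cost_false_positive, with probability 1 - p.
    # Not acting on a positive costs cost_false_negative, with probability p.
    # The two expected costs are equal at the probability returned here.
    return cost_false_positive / (cost_false_positive + cost_false_negative)


def should_review(p_fraud, cost_review, cost_missed_fraud):
    """Review a transaction when its calibrated fraud probability exceeds the threshold."""
    return p_fraud > cost_threshold(cost_review, cost_missed_fraud)


# Hypothetical costs: a review costs 1 unit, a missed fraud costs 19 units.
print(cost_threshold(1, 19))
print(should_review(0.08, 1, 19), should_review(0.02, 1, 19))
```

<p>The threshold depends only on the costs, not on how rare fraud is. The base rate enters through the calibrated probability itself.</p>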
<p>The common mistake is to mix these steps together.
People see an imbalanced training set, modify it to make the classes look balanced, train a classifier, and then wonder why the predicted probabilities are off.
But the probabilities are off because the model was trained on a distorted version of reality.</p>
<p>Class imbalance can be a warning sign that you need more data or that naive accuracy will be a useless metric.
It is not, by itself, a reason to resample the training set.
If the training set is representative and the model is trained with log loss, the imbalance is already part of the likelihood.
Treat it as signal, not contamination.</p>]]></content:encoded>
</item>
<item>
  <title>Quantifying uncertainty in probability predictions</title>
  <link>https://tobanwiebe.com/writing/quantifying-uncertainty-in-probability-predictions/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/quantifying-uncertainty-in-probability-predictions/</guid>
  <pubDate>Sun, 24 Feb 2019 13:00:00 GMT</pubDate>
  <description>Suppose you&apos;re interested in knowing the chances of an event X occurring (e.g., X = &quot;a nuclear strike over any populated area in the year 2019&quot;). When making predictions a...</description>
  <content:encoded><![CDATA[<p>Suppose you’re interested in knowing the chances of an event <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> occurring (e.g., <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi><mo>=</mo></mrow><annotation encoding="application/x-tex">X =</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span></span></span></span> “<em>a nuclear strike over any populated area in the year 2019</em>”).
When making predictions about events with binary outcomes (either the event happens or it doesn’t), people generally report a single probability (e.g., a 2% chance of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> occurring).
But, you may wonder, why not report an interval around that prediction, e.g. a prediction interval like 2% <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>±</mo></mrow><annotation encoding="application/x-tex">\pm</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6667em;vertical-align:-0.0833em"></span><span class="mord">±</span></span></span></span> 0.5%, or a distribution of probabilities (e.g., a Beta distribution) to reflect uncertainty?</p>
<p>For example, this question comes up with prediction markets, where the market price can be interpreted as the best estimate of the probability of the event <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> occurring.
But there are no prediction intervals around this market price.
Or consider models for classification, such as logistic regression or other machine learning algorithms, which produce predicted probabilities for each possible class (e.g., <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>X</mi><mi>i</mi></msub><mo>=</mo></mrow><annotation encoding="application/x-tex">X_i =</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.8333em;vertical-align:-0.15em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0785em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span></span></span></span> “<em>transaction <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>i</mi></mrow><annotation encoding="application/x-tex">i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6595em"></span><span class="mord mathnormal">i</span></span></span></span> is fraudulent</em>”).
In both of these cases, we face the same issue with representing uncertainty — how does the market/model express confidence in its predicted probabilities?</p>
<p>In this post, I’ll explain why this question stems from a fundamental confusion:
<strong>it’s a misconception to think that a predicted probability is a point estimate that doesn’t convey any uncertainty.</strong>
Below, I’ll show that there are two distinct sources of uncertainty that are being conflated here, and that one or both can be used to express uncertainty.</p>
<h2 id="two-types-of-uncertainty">Two types of uncertainty</h2>
<p>The key distinction here is between:</p>
<ol>
<li>Uncertainty over the outcome, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> vs <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">¬</mi><mi>X</mi></mrow><annotation encoding="application/x-tex">\neg X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>
<ul>
<li>Also known as <a href="https://en.wikipedia.org/wiki/Uncertainty_quantification#Aleatoric_and_epistemic_uncertainty">aleatoric uncertainty</a></li>
<li>(FYI: the symbol &quot;<span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">¬</mi></mrow><annotation encoding="application/x-tex">\neg</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.4306em"></span><span class="mord">¬</span></span></span></span>&quot; is the negation operator and can be read as “not”)</li>
</ul>
</li>
<li>Uncertainty over the model parameters that are used to generate a prediction for the outcome
<ul>
<li>Also known as <a href="https://en.wikipedia.org/wiki/Uncertainty_quantification#Aleatoric_and_epistemic_uncertainty">epistemic uncertainty</a></li>
</ul>
</li>
</ol>
<p>Let’s unpack each case in depth.</p>
<h3 id="uncertainty-over-outcomes">Uncertainty over outcomes</h3>
<p>When we aren’t working with a model, we only have the first source of uncertainty to deal with.
But it isn’t obvious where the uncertainty lies: if we say <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>X</mi><mo stretchy="false">]</mo><mo>=</mo><mn>0.02</mn></mrow><annotation encoding="application/x-tex">Pr[X] = 0.02</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">0.02</span></span></span></span>, it may appear that we’ve just given a point estimate.
Recall, though, that this is a binary outcome space, i.e., the only possible outcomes are <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> or <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">¬</mi><mi>X</mi></mrow><annotation encoding="application/x-tex">\neg X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>.
So the full probability distribution (over the two possible outcomes) can be summarized by one probability, <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>:</mo><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>X</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">p := Pr[X]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">:=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span></span></span></span> (which implies <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>1</mn><mo>−</mo><mi>p</mi><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">1-p = Pr[\neg X]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7278em;vertical-align:-0.0833em"></span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span></span><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span><span class="mspace" 
style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span></span></span></span>).
As we’ve provided a full probability distribution over the outcome space, it’s not possible to say anything more — any uncertainty must be embedded in this distribution.</p>
<p>Intuitively, probabilities near 0 or 1 reflect a high degree of certainty.
A prediction without any uncertainty at all would just be a <em>yes or no</em> answer, i.e., a predicted probability of 0 or 1.
It would just state which outcome will occur, with no notion of uncertainty or hedging.</p>
<p>More precisely, confidence in a probability prediction is reflected by how extreme it is <em>relative to a baseline or prior belief</em>.
To see this, suppose that there is an event <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> that is very likely to occur, and that a prediction market has given <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> a predicted probability of 0.97.
If you are maximally uncertain/ignorant about <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>, what probability do you assign?
Intuitively, you <em>hedge your bets</em> and stick to 0.97.
Here, 0.97 is the baseline, which you can treat as your prior probability.
Given this prior information, a prediction of 0.97 reflects maximal uncertainty: it simply repeats the baseline and adds no information of your own.
(If you didn’t have any prior information whatsoever, you would go with 0.5.)</p>
<p>Then, if you have some new information about <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>, you can update your prior to get a posterior.
If your information provides strong evidence in favor of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>, then your posterior probability might jump up to, say, 0.997.
On the other hand, if your information strongly supports <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">¬</mi><mi>X</mi></mrow><annotation encoding="application/x-tex">\neg X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>, then your posterior might drop to, say, 0.78.
Thus, your degree of confidence is revealed by the degree to which your probability moves away from the baseline and toward 0 or 1.</p>
<p>You can use Bayes’ Theorem to play with some numbers yourself.
Denote your prior by <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>:</mo><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>X</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">p := Pr[X]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">:=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span></span></span></span>, and assume you’ve used your information <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0278em">D</span></span></span></span> to compute the likelihoods <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo><mo>:</mo><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>D</mi><mo>∣</mo><mi>X</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">q(X) := Pr[D \mid X]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span 
class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">:=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0278em">D</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∣</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span></span></span></span> and <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>q</mi><mo stretchy="false">(</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">)</mo><mo>:</mo><mo>=</mo><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>D</mi><mo>∣</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">]</mo></mrow><annotation encoding="application/x-tex">q(\neg X) := Pr[D \mid \neg X]</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mspace" 
style="margin-right:0.2778em"></span><span class="mrel">:=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0278em">D</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∣</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">]</span></span></span></span>.
Denote the likelihood ratio by <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>:</mo><mo>=</mo><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo><mi mathvariant="normal">/</mi><mi>q</mi><mo stretchy="false">(</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">\lambda := q(X) / q(\neg X)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em"></span><span class="mord mathnormal">λ</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">:=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mord">/</span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span></span></span></span>.</p>
<p>Then compute the posterior and rearrange in terms of the likelihood ratio:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right left" columnspacing="0em"><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mi>P</mi><mi>r</mi><mo stretchy="false">[</mo><mi>X</mi><mo>∣</mo><mi>D</mi><mo stretchy="false">]</mo></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>p</mi><mo>⋅</mo><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo></mrow><mrow><mi>p</mi><mo>⋅</mo><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="false">)</mo><mi>q</mi><mo stretchy="false">(</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>p</mi><mo>⋅</mo><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo><mi mathvariant="normal">/</mi><mi>q</mi><mo stretchy="false">(</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">)</mo></mrow><mrow><mi>p</mi><mo>⋅</mo><mi>q</mi><mo stretchy="false">(</mo><mi>X</mi><mo stretchy="false">)</mo><mi mathvariant="normal">/</mi><mi>q</mi><mo stretchy="false">(</mo><mi mathvariant="normal">¬</mi><mi>X</mi><mo stretchy="false">)</mo><mo>+</mo><mo stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>p</mi><mo>⋅</mo><mi>λ</mi></mrow><mrow><mi>p</mi><mo>⋅</mo><mi>λ</mi><mo>+</mo><mo 
stretchy="false">(</mo><mn>1</mn><mo>−</mo><mi>p</mi><mo stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
Pr[X \mid D] &amp;= \frac{p \cdot q(X)}{p \cdot q(X) + (1-p)q(\neg X)}\\
&amp;= \frac{p \cdot q(X)/q(\neg X)}{p \cdot q(X)/q(\neg X) + (1-p)}\\
&amp;= \frac{p \cdot \lambda}{p \cdot \lambda + (1-p)}
\end{aligned}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:7.6334em;vertical-align:-3.5667em"></span><span class="mord"><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:4.0667em"><span style="top:-6.0667em"><span class="pstrut" style="height:3.427em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.1389em">P</span><span class="mord mathnormal" style="margin-right:0.0278em">r</span><span class="mopen">[</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∣</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0278em">D</span><span class="mclose">]</span></span></span><span style="top:-3.4037em"><span class="pstrut" style="height:3.427em"></span><span class="mord"></span></span><span style="top:-0.7963em"><span class="pstrut" style="height:3.427em"></span><span class="mord"></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.5667em"><span></span></span></span></span></span><span class="col-align-l"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:4.0667em"><span style="top:-6.0667em"><span class="pstrut" style="height:3.427em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.427em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord 
mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">p</span><span class="mclose">)</span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.936em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span><span style="top:-3.4037em"><span class="pstrut" style="height:3.427em"></span><span class="mord"><span class="mord"></span><span class="mspace" 
style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.427em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mord">/</span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">p</span><span class="mclose">)</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord mathnormal" 
style="margin-right:0.0785em">X</span><span class="mclose">)</span><span class="mord">/</span><span class="mord mathnormal" style="margin-right:0.0359em">q</span><span class="mopen">(</span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span><span class="mclose">)</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.936em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span><span style="top:-0.7963em"><span class="pstrut" style="height:3.427em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.3714em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">λ</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mopen">(</span><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">p</span><span class="mclose">)</span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord 
mathnormal">p</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">⋅</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord mathnormal">λ</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.936em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.5667em"><span></span></span></span></span></span></span></span></span></span></span></span>
<p>Note that the posterior depends only on the prior and the likelihood ratio, not on the individual likelihoods.
Their magnitudes don’t matter; all that matters is their ratio, which indicates how much the information <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0278em">D</span></span></span></span> favors <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> <em>relative to</em> <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi mathvariant="normal">¬</mi><mi>X</mi></mrow><annotation encoding="application/x-tex">\neg X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord">¬</span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>.</p>
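<p>As a rough sketch, the final line of the derivation can be written as a couple of small Python functions.
The names <code>posterior</code> and <code>implied_likelihood_ratio</code> are my own, chosen for illustration; the second one simply inverts the formula by dividing posterior odds by prior odds.</p>

```python
# Illustrative helpers (hypothetical names), implementing
# Pr[X | D] = p * lambda / (p * lambda + (1 - p)).


def posterior(prior: float, likelihood_ratio: float) -> float:
    """Posterior probability of X given the prior p and likelihood ratio lambda."""
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1 - prior))


def implied_likelihood_ratio(prior: float, post: float) -> float:
    """Likelihood ratio that would move `prior` to `post`.

    Obtained by inverting the formula: posterior odds divided by prior odds.
    Both arguments must lie strictly between 0 and 1.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = post / (1 - post)
    return posterior_odds / prior_odds


# Start from the 0.97 baseline and try a few likelihood ratios.
prior = 0.97
for lam in (0.1, 1.0, 10.0):
    print(f"lambda = {lam:5.1f}  posterior = {posterior(prior, lam):.4f}")

# Which likelihood ratios correspond to the posteriors 0.997 and 0.78?
for post in (0.997, 0.78):
    lam = implied_likelihood_ratio(prior, post)
    print(f"posterior = {post}  implied lambda = {lam:.3f}")
```

<p>Starting from the 0.97 baseline, a likelihood ratio of about 10 is enough to reach a posterior near 0.997, while a ratio of about 0.11 brings it down to roughly 0.78, which matches the two scenarios described earlier.</p>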
<p>If you play with this formula, you’ll get a sense of how the information in the likelihoods updates the prior to a posterior probability.
Notice that when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>=</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\lambda = 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6944em"></span><span class="mord mathnormal">λ</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">1</span></span></span></span>, the posterior reduces to <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi></mrow><annotation encoding="application/x-tex">p</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span></span></span></span>, the prior.
In other words, when <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0278em">D</span></span></span></span> is uninformative about <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span>, it leaves your prior belief unchanged.
Furthermore, for any prior belief <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi></mrow><annotation encoding="application/x-tex">p</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span></span></span></span>, if <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>&gt;</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\lambda &gt; 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7335em;vertical-align:-0.0391em"></span><span class="mord mathnormal">λ</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">&gt;</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">1</span></span></span></span>, then your posterior will be pushed upward from your prior (and vice versa for <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>λ</mi><mo>&lt;</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">\lambda &lt; 1</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.7335em;vertical-align:-0.0391em"></span><span class="mord mathnormal">λ</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">&lt;</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">1</span></span></span></span>).
That is, any information in favor of <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> will increase your confidence in <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>X</mi></mrow><annotation encoding="application/x-tex">X</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.6833em"></span><span class="mord mathnormal" style="margin-right:0.0785em">X</span></span></span></span> — even if <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>p</mi><mo>=</mo><mn>0.999</mn></mrow><annotation encoding="application/x-tex">p=0.999</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord mathnormal">p</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span></span><span class="base"><span class="strut" style="height:0.6444em"></span><span class="mord">0.999</span></span></span></span>!</p>
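<p>To play with the formula numerically, here is a minimal Python sketch.
It assumes <code>likelihood_ratio</code> is the likelihood of the data under X divided by its likelihood under not-X, which matches the description of λ above; the function name and the example values are illustrative only.</p>

```python
def posterior(prior, likelihood_ratio):
    """Posterior P(X | D) from the prior P(X) and the ratio P(D | X) / P(D | not X)."""
    # Bayes' rule with numerator and denominator both divided by P(D | not X).
    weighted = prior * likelihood_ratio
    return weighted / (weighted + (1.0 - prior))


# Try a few likelihood ratios against a neutral prior and a very confident one.
for lam in (0.5, 1.0, 2.0, 10.0):
    neutral = posterior(0.5, lam)
    confident = posterior(0.999, lam)
    print(f"lambda={lam:5.1f}  from p=0.5: {neutral:.4f}  from p=0.999: {confident:.6f}")
```

<p>A ratio of 1 returns the prior unchanged, and any ratio above 1 nudges even the 0.999 prior a little higher.</p>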
<p>Judging the quality of probability predictions is simple: just check that they’re <a href="https://scikit-learn.org/stable/modules/calibration.html">calibrated</a>.
For example, predictions made with 80% confidence should be correct 80% of the time.
With enough completed predictions, you can plot a <a href="https://www.metoffice.gov.uk/research/climate/seasonal-to-decadal/gpc-outlooks/user-guide/interpret-reliability">reliability diagram</a> to assess the calibration of the predictions.</p>
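<p>The numbers behind a reliability diagram are easy to compute by hand.
The sketch below is one possible approach, not a reference implementation: it groups predictions into equal-width confidence bins and compares the mean predicted probability with the observed frequency in each bin.
The function name <code>reliability_table</code> and the track record are made up for illustration.</p>

```python
def reliability_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width confidence bins.

    Returns one (mean predicted probability, observed frequency, count)
    tuple per non-empty bin, in order of increasing confidence.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[index].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((mean_pred, observed, len(members)))
    return rows


# Hypothetical track record: ten forecasts at 80% and ten at 30%.
probs = [0.8] * 10 + [0.3] * 10
outcomes = [1] * 8 + [0] * 2 + [1] * 3 + [0] * 7

for mean_pred, observed, count in reliability_table(probs, outcomes):
    print(f"predicted {mean_pred:.2f}  observed {observed:.2f}  n={count}")
```

<p>Plotting observed frequency against mean predicted probability gives the diagram; calibrated predictions fall near the diagonal.</p>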
<h3 id="uncertainty-over-models">Uncertainty over models</h3>
<p>When we <em>are</em> working with a model, we also have a second source of uncertainty — that of the model.
This uncertainty is reflected in the posterior distribution over the parameters (at least for Bayesians — frequentists would use the sampling distribution of the parameter estimator to derive confidence intervals / standard errors).
Because these parameters are used to produce predictions, their uncertainty propagates through to produce additional uncertainty over the outcome.</p>
<p>This is <em>meta</em>-uncertainty: uncertainty over the model which produces the uncertain prediction of the outcome.
(In fact, you can have higher levels of meta-uncertainty by including uncertainty over any hyperparameters of the model.)</p>
<p>For example, here’s a specification for a Bayesian logistic regression model, where I’ve put an informative prior on the model coefficients:</p>
<span class="katex-display"><span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML" display="block"><semantics><mtable rowspacing="0.25em" columnalign="right left" columnspacing="0em"><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><msub><mi>y</mi><mi>i</mi></msub></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>∼</mo><mi>B</mi><mi>e</mi><mi>r</mi><mi>n</mi><mi>o</mi><mi>u</mi><mi>l</mi><mi>l</mi><mi>i</mi><mo stretchy="false">(</mo><msub><mi>p</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mi>log</mi><mo>⁡</mo><mrow><mo fence="true">(</mo><mfrac><msub><mi>p</mi><mi>i</mi></msub><mrow><mn>1</mn><mo>−</mo><msub><mi>p</mi><mi>i</mi></msub></mrow></mfrac><mo fence="true">)</mo></mrow></mrow></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>=</mo><msub><mi>β</mi><mn>0</mn></msub><mo>+</mo><msub><mi>x</mi><mrow><mi>i</mi><mn>1</mn></mrow></msub><msub><mi>β</mi><mn>1</mn></msub><mo>+</mo><msub><mi>x</mi><mrow><mi>i</mi><mn>2</mn></mrow></msub><msub><mi>β</mi><mn>2</mn></msub><mo>+</mo><mo>…</mo><mo>+</mo><msub><mi>x</mi><mrow><mi>i</mi><mi>K</mi></mrow></msub><msub><mi>β</mi><mi>K</mi></msub></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><msub><mi>β</mi><mn>0</mn></msub></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>∼</mo><mi mathvariant="script">N</mi><mo stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mn>1.5</mn><mo stretchy="false">)</mo></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle scriptlevel="0" displaystyle="true"><msub><mi>β</mi><mi>k</mi></msub></mstyle></mtd><mtd><mstyle scriptlevel="0" displaystyle="true"><mrow><mrow></mrow><mo>∼</mo><mi mathvariant="script">N</mi><mo stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mn>1.5</mn><mo 
stretchy="false">)</mo><mo separator="true">,</mo><mtext>  </mtext><mi>k</mi><mo>=</mo><mn>1</mn><mo separator="true">,</mo><mn>2</mn><mo separator="true">,</mo><mo>…</mo><mo separator="true">,</mo><mi>K</mi></mrow></mstyle></mtd></mtr></mtable><annotation encoding="application/x-tex">\begin{aligned}
y_i &amp;\sim Bernoulli(p_i) \\
\log\left(\frac{p_i}{1 - p_i}\right) &amp;= \beta_0 + x_{i1} \beta_1 + x_{i2} \beta_2 + \ldots + x_{iK} \beta_K \\
\beta_0 &amp;\sim \mathcal{N}(0,1.5) \\
\beta_k &amp;\sim \mathcal{N}(0,1.5), \; k = 1,2,\ldots,K
\end{aligned}</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:6.9em;vertical-align:-3.2em"></span><span class="mord"><span class="mtable"><span class="col-align-r"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:3.7em"><span style="top:-6.31em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0359em">y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:-0.0359em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-4.2em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mop">lo<span style="margin-right:0.0139em">g</span></span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner"><span class="mopen delimcenter" style="top:0em"><span class="delimsizing size3">(</span></span><span class="mord"><span class="mopen nulldelimiter"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:1.1076em"><span style="top:-2.314em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord">1</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">−</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span 
style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-3.23em"><span class="pstrut" style="height:3em"></span><span class="frac-line" style="border-bottom-width:0.04em"></span></span><span style="top:-3.677em"><span class="pstrut" style="height:3em"></span><span class="mord"><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.8804em"><span></span></span></span></span></span><span class="mclose nulldelimiter"></span></span><span class="mclose delimcenter" style="top:0em"><span class="delimsizing size3">)</span></span></span></span></span><span style="top:-2.11em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord 
mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-0.61em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3361em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0315em">k</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.2em"><span></span></span></span></span></span><span class="col-align-l"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:3.7em"><span style="top:-6.31em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∼</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathnormal" style="margin-right:0.0502em">B</span><span class="mord mathnormal" style="margin-right:0.0278em">er</span><span class="mord mathnormal">n</span><span class="mord mathnormal">o</span><span class="mord mathnormal">u</span><span class="mord mathnormal" style="margin-right:0.0197em">l</span><span class="mord mathnormal" style="margin-right:0.0197em">l</span><span class="mord mathnormal">i</span><span class="mopen">(</span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span 
class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mclose">)</span></span></span><span style="top:-4.2em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">0</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s">​</span></span><span 
class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">1</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mord mtight">2</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3011em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight">2</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" 
style="height:0.15em"><span></span></span></span></span></span></span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="minner">…</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mbin">+</span><span class="mspace" style="margin-right:0.2222em"></span><span class="mord"><span class="mord mathnormal">x</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mathnormal mtight">i</span><span class="mord mathnormal mtight" style="margin-right:0.0715em">K</span></span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span><span class="mord"><span class="mord mathnormal" style="margin-right:0.0528em">β</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3283em"><span style="top:-2.55em;margin-left:-0.0528em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight" style="margin-right:0.0715em">K</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span><span style="top:-2.11em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∼</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathcal" 
style="margin-right:0.1474em">N</span><span class="mopen">(</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord">1.5</span><span class="mclose">)</span></span></span><span style="top:-0.61em"><span class="pstrut" style="height:3.45em"></span><span class="mord"><span class="mord"></span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">∼</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord mathcal" style="margin-right:0.1474em">N</span><span class="mopen">(</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord">1.5</span><span class="mclose">)</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathnormal" style="margin-right:0.0315em">k</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mrel">=</span><span class="mspace" style="margin-right:0.2778em"></span><span class="mord">1</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord">2</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="minner">…</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord mathnormal" style="margin-right:0.0715em">K</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:3.2em"><span></span></span></span></span></span></span></span></span></span></span></span>
<p>You can see how the uncertainty from the model’s prior (Normal distribution) propagates through, adding to the uncertainty in the likelihood (Bernoulli distribution).
As a result, we get a full density for <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>p</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">p_i</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:0.625em;vertical-align:-0.1944em"></span><span class="mord"><span class="mord mathnormal">p</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlist-r"><span class="vlist" style="height:0.3117em"><span style="top:-2.55em;margin-left:0em;margin-right:0.05em"><span class="pstrut" style="height:2.7em"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mathnormal mtight">i</span></span></span></span><span class="vlist-s">​</span></span><span class="vlist-r"><span class="vlist" style="height:0.15em"><span></span></span></span></span></span></span></span></span></span> over the interval <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo stretchy="false">(</mo><mn>0</mn><mo separator="true">,</mo><mn>1</mn><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">(0,1)</annotation></semantics></math></span><span class="katex-html" aria-hidden="true"><span class="base"><span class="strut" style="height:1em;vertical-align:-0.25em"></span><span class="mopen">(</span><span class="mord">0</span><span class="mpunct">,</span><span class="mspace" style="margin-right:0.1667em"></span><span class="mord">1</span><span class="mclose">)</span></span></span></span>.
The spread of this density reflects model uncertainty, whereas the distance from the prior distribution reflects the degree of confidence in the prediction of the outcome.
The key thing to realize is that these two sources of uncertainty are orthogonal.</p>
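<p>A rough way to see this propagation is to simulate it.
The sketch below draws the coefficients from their priors only (no data, so this is a prior predictive simulation rather than a fitted posterior) and pushes them through the log-odds equation.
It assumes the 1.5 in the specification is a standard deviation, and the covariate values <code>x_i</code> are hypothetical.</p>

```python
import math
import random
import statistics


def sample_p(x, rng, scale=1.5):
    """Draw one p_i for covariates x by sampling the coefficients from their priors."""
    beta_0 = rng.gauss(0.0, scale)  # intercept, assuming 1.5 is a standard deviation
    betas = [rng.gauss(0.0, scale) for _ in x]  # one slope per covariate
    log_odds = beta_0 + sum(x_k * b_k for x_k, b_k in zip(x, betas))
    return 1.0 / (1.0 + math.exp(-log_odds))  # invert the log-odds link


rng = random.Random(0)
x_i = [0.5, -1.0]  # hypothetical covariates for one observation
draws = [sample_p(x_i, rng) for _ in range(5000)]

# The spread of these draws is the model uncertainty about p_i.
print("mean of p_i:", round(statistics.mean(draws), 3))
print("std of p_i: ", round(statistics.stdev(draws), 3))

# Each p_i then feeds a Bernoulli draw, which adds the outcome uncertainty on top.
y_draws = [int(p > rng.random()) for p in draws]
print("share of y = 1:", round(statistics.mean(y_draws), 3))
```

<p>With real data, the same two-step draw would use posterior samples of the coefficients in place of the prior draws.</p>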
<p>For example, you could simultaneously have a very confident prediction of the outcome but with a lot of model uncertainty — a widely spread posterior distribution that is far away from the prior distribution:</p>
<figure class="viz beta-viz" aria-labelledby="confident-prediction-uncertain-model-plot"><div class="viz-header"><div><figcaption id="confident-prediction-uncertain-model-plot">Confident prediction, uncertain model</figcaption><p>Posterior Beta(2, 4) and prior Beta(95, 5) densities over predicted probabilities.</p></div><code>posterior Beta(2, 4)</code></div><svg class="viz-plot" viewBox="0 0 720 360" role="img" aria-label="Posterior beta distribution with alpha 2 and beta 4, compared with prior beta distribution with alpha 95 and beta 5."><line class="viz-axis" x1="46" y1="28" x2="46" y2="312"></line><line class="viz-axis" x1="46" y1="312" x2="694" y2="312"></line><g><line class="viz-grid" x1="46" y1="28" x2="46" y2="312"></line><text class="viz-label" x="46" y="338" text-anchor="middle">.00</text></g><g><line class="viz-grid" x1="208" y1="28" x2="208" y2="312"></line><text class="viz-label" x="208" y="338" text-anchor="middle">.25</text></g><g><line class="viz-grid" x1="370" y1="28" x2="370" y2="312"></line><text class="viz-label" x="370" y="338" text-anchor="middle">.50</text></g><g><line class="viz-grid" x1="532" y1="28" x2="532" y2="312"></line><text class="viz-label" x="532" y="338" text-anchor="middle">.75</text></g><g><line class="viz-grid" x1="694" y1="28" x2="694" y2="312"></line><text class="viz-label" x="694" y="338" text-anchor="middle">1.00</text></g><line class="viz-grid" x1="46" y1="99" x2="694" y2="99"></line><line class="viz-grid" x1="46" y1="170" x2="694" y2="170"></line><line class="viz-grid" x1="46" y1="241" x2="694" y2="241"></line><path class="viz-fill" d="M 46.00 311.97 L 49.62 310.41 L 53.24 308.88 L 56.86 307.40 L 60.48 305.97 L 64.10 304.59 L 67.72 303.26 L 71.34 301.98 L 74.96 300.75 L 78.58 299.57 L 82.20 298.43 L 85.82 297.33 L 89.44 296.28 L 93.06 295.28 L 96.68 294.31 L 100.30 293.39 L 103.92 292.51 L 107.54 291.67 L 111.16 290.87 L 114.78 290.11 L 118.40 289.39 L 122.02 288.71 L 125.64 288.06 L 129.26 287.44 L 132.88 286.87 L 136.50 
286.32 L 140.12 285.81 L 143.74 285.33 L 147.36 284.89 L 150.98 284.48 L 154.60 284.09 L 158.22 283.74 L 161.84 283.41 L 165.46 283.12 L 169.08 282.85 L 172.70 282.61 L 176.32 282.40 L 179.94 282.21 L 183.56 282.04 L 187.18 281.90 L 190.80 281.79 L 194.42 281.70 L 198.04 281.63 L 201.66 281.58 L 205.28 281.56 L 208.91 281.55 L 212.53 281.57 L 216.15 281.60 L 219.77 281.65 L 223.39 281.73 L 227.01 281.82 L 230.63 281.92 L 234.25 282.05 L 237.87 282.18 L 241.49 282.34 L 245.11 282.51 L 248.73 282.69 L 252.35 282.89 L 255.97 283.10 L 259.59 283.33 L 263.21 283.57 L 266.83 283.81 L 270.45 284.07 L 274.07 284.35 L 277.69 284.63 L 281.31 284.92 L 284.93 285.22 L 288.55 285.53 L 292.17 285.85 L 295.79 286.17 L 299.41 286.51 L 303.03 286.85 L 306.65 287.20 L 310.27 287.55 L 313.89 287.91 L 317.51 288.27 L 321.13 288.65 L 324.75 289.02 L 328.37 289.40 L 331.99 289.78 L 335.61 290.17 L 339.23 290.56 L 342.85 290.95 L 346.47 291.35 L 350.09 291.75 L 353.71 292.15 L 357.33 292.55 L 360.95 292.95 L 364.57 293.35 L 368.19 293.75 L 371.81 294.16 L 375.43 294.56 L 379.05 294.96 L 382.67 295.36 L 386.29 295.77 L 389.91 296.17 L 393.53 296.56 L 397.15 296.96 L 400.77 297.35 L 404.39 297.75 L 408.01 298.13 L 411.63 298.52 L 415.25 298.90 L 418.87 299.28 L 422.49 299.66 L 426.11 300.03 L 429.73 300.40 L 433.35 300.77 L 436.97 301.13 L 440.59 301.49 L 444.21 301.84 L 447.83 302.18 L 451.45 302.53 L 455.07 302.86 L 458.69 303.20 L 462.31 303.52 L 465.93 303.84 L 469.55 304.16 L 473.17 304.47 L 476.79 304.77 L 480.41 305.07 L 484.03 305.36 L 487.65 305.65 L 491.27 305.93 L 494.89 306.20 L 498.51 306.46 L 502.13 306.72 L 505.75 306.98 L 509.37 307.23 L 512.99 307.47 L 516.61 307.70 L 520.23 307.93 L 523.85 308.15 L 527.47 308.36 L 531.09 308.57 L 534.72 308.77 L 538.34 308.96 L 541.96 309.15 L 545.58 309.33 L 549.20 309.50 L 552.82 309.66 L 556.44 309.82 L 560.06 309.98 L 563.68 310.12 L 567.30 310.26 L 570.92 310.40 L 574.54 310.52 L 578.16 310.65 L 581.78 310.76 L 585.40 310.87 L 589.02 
310.97 L 592.64 311.07 L 596.26 311.16 L 599.88 311.24 L 603.50 311.32 L 607.12 311.40 L 610.74 311.47 L 614.36 311.53 L 617.98 311.59 L 621.60 311.64 L 625.22 311.69 L 628.84 311.74 L 632.46 311.78 L 636.08 311.81 L 639.70 311.84 L 643.32 311.87 L 646.94 311.90 L 650.56 311.92 L 654.18 311.94 L 657.80 311.95 L 661.42 311.97 L 665.04 311.98 L 668.66 311.98 L 672.28 311.99 L 675.90 311.99 L 679.52 312.00 L 683.14 312.00 L 686.76 312.00 L 690.38 312.00 L 694.00 312.00 L 694 312 L 46 312 Z"></path><path class="viz-line" d="M 46.00 311.97 L 49.62 310.41 L 53.24 308.88 L 56.86 307.40 L 60.48 305.97 L 64.10 304.59 L 67.72 303.26 L 71.34 301.98 L 74.96 300.75 L 78.58 299.57 L 82.20 298.43 L 85.82 297.33 L 89.44 296.28 L 93.06 295.28 L 96.68 294.31 L 100.30 293.39 L 103.92 292.51 L 107.54 291.67 L 111.16 290.87 L 114.78 290.11 L 118.40 289.39 L 122.02 288.71 L 125.64 288.06 L 129.26 287.44 L 132.88 286.87 L 136.50 286.32 L 140.12 285.81 L 143.74 285.33 L 147.36 284.89 L 150.98 284.48 L 154.60 284.09 L 158.22 283.74 L 161.84 283.41 L 165.46 283.12 L 169.08 282.85 L 172.70 282.61 L 176.32 282.40 L 179.94 282.21 L 183.56 282.04 L 187.18 281.90 L 190.80 281.79 L 194.42 281.70 L 198.04 281.63 L 201.66 281.58 L 205.28 281.56 L 208.91 281.55 L 212.53 281.57 L 216.15 281.60 L 219.77 281.65 L 223.39 281.73 L 227.01 281.82 L 230.63 281.92 L 234.25 282.05 L 237.87 282.18 L 241.49 282.34 L 245.11 282.51 L 248.73 282.69 L 252.35 282.89 L 255.97 283.10 L 259.59 283.33 L 263.21 283.57 L 266.83 283.81 L 270.45 284.07 L 274.07 284.35 L 277.69 284.63 L 281.31 284.92 L 284.93 285.22 L 288.55 285.53 L 292.17 285.85 L 295.79 286.17 L 299.41 286.51 L 303.03 286.85 L 306.65 287.20 L 310.27 287.55 L 313.89 287.91 L 317.51 288.27 L 321.13 288.65 L 324.75 289.02 L 328.37 289.40 L 331.99 289.78 L 335.61 290.17 L 339.23 290.56 L 342.85 290.95 L 346.47 291.35 L 350.09 291.75 L 353.71 292.15 L 357.33 292.55 L 360.95 292.95 L 364.57 293.35 L 368.19 293.75 L 371.81 294.16 L 375.43 294.56 L 379.05 294.96 
L 382.67 295.36 L 386.29 295.77 L 389.91 296.17 L 393.53 296.56 L 397.15 296.96 L 400.77 297.35 L 404.39 297.75 L 408.01 298.13 L 411.63 298.52 L 415.25 298.90 L 418.87 299.28 L 422.49 299.66 L 426.11 300.03 L 429.73 300.40 L 433.35 300.77 L 436.97 301.13 L 440.59 301.49 L 444.21 301.84 L 447.83 302.18 L 451.45 302.53 L 455.07 302.86 L 458.69 303.20 L 462.31 303.52 L 465.93 303.84 L 469.55 304.16 L 473.17 304.47 L 476.79 304.77 L 480.41 305.07 L 484.03 305.36 L 487.65 305.65 L 491.27 305.93 L 494.89 306.20 L 498.51 306.46 L 502.13 306.72 L 505.75 306.98 L 509.37 307.23 L 512.99 307.47 L 516.61 307.70 L 520.23 307.93 L 523.85 308.15 L 527.47 308.36 L 531.09 308.57 L 534.72 308.77 L 538.34 308.96 L 541.96 309.15 L 545.58 309.33 L 549.20 309.50 L 552.82 309.66 L 556.44 309.82 L 560.06 309.98 L 563.68 310.12 L 567.30 310.26 L 570.92 310.40 L 574.54 310.52 L 578.16 310.65 L 581.78 310.76 L 585.40 310.87 L 589.02 310.97 L 592.64 311.07 L 596.26 311.16 L 599.88 311.24 L 603.50 311.32 L 607.12 311.40 L 610.74 311.47 L 614.36 311.53 L 617.98 311.59 L 621.60 311.64 L 625.22 311.69 L 628.84 311.74 L 632.46 311.78 L 636.08 311.81 L 639.70 311.84 L 643.32 311.87 L 646.94 311.90 L 650.56 311.92 L 654.18 311.94 L 657.80 311.95 L 661.42 311.97 L 665.04 311.98 L 668.66 311.98 L 672.28 311.99 L 675.90 311.99 L 679.52 312.00 L 683.14 312.00 L 686.76 312.00 L 690.38 312.00 L 694.00 312.00"></path><path class="viz-prior-line" d="M 46.00 312.00 L 49.62 312.00 L 53.24 312.00 L 56.86 312.00 L 60.48 312.00 L 64.10 312.00 L 67.72 312.00 L 71.34 312.00 L 74.96 312.00 L 78.58 312.00 L 82.20 312.00 L 85.82 312.00 L 89.44 312.00 L 93.06 312.00 L 96.68 312.00 L 100.30 312.00 L 103.92 312.00 L 107.54 312.00 L 111.16 312.00 L 114.78 312.00 L 118.40 312.00 L 122.02 312.00 L 125.64 312.00 L 129.26 312.00 L 132.88 312.00 L 136.50 312.00 L 140.12 312.00 L 143.74 312.00 L 147.36 312.00 L 150.98 312.00 L 154.60 312.00 L 158.22 312.00 L 161.84 312.00 L 165.46 312.00 L 169.08 312.00 L 172.70 312.00 L 
176.32 312.00 L 179.94 312.00 L 183.56 312.00 L 187.18 312.00 L 190.80 312.00 L 194.42 312.00 L 198.04 312.00 L 201.66 312.00 L 205.28 312.00 L 208.91 312.00 L 212.53 312.00 L 216.15 312.00 L 219.77 312.00 L 223.39 312.00 L 227.01 312.00 L 230.63 312.00 L 234.25 312.00 L 237.87 312.00 L 241.49 312.00 L 245.11 312.00 L 248.73 312.00 L 252.35 312.00 L 255.97 312.00 L 259.59 312.00 L 263.21 312.00 L 266.83 312.00 L 270.45 312.00 L 274.07 312.00 L 277.69 312.00 L 281.31 312.00 L 284.93 312.00 L 288.55 312.00 L 292.17 312.00 L 295.79 312.00 L 299.41 312.00 L 303.03 312.00 L 306.65 312.00 L 310.27 312.00 L 313.89 312.00 L 317.51 312.00 L 321.13 312.00 L 324.75 312.00 L 328.37 312.00 L 331.99 312.00 L 335.61 312.00 L 339.23 312.00 L 342.85 312.00 L 346.47 312.00 L 350.09 312.00 L 353.71 312.00 L 357.33 312.00 L 360.95 312.00 L 364.57 312.00 L 368.19 312.00 L 371.81 312.00 L 375.43 312.00 L 379.05 312.00 L 382.67 312.00 L 386.29 312.00 L 389.91 312.00 L 393.53 312.00 L 397.15 312.00 L 400.77 312.00 L 404.39 312.00 L 408.01 312.00 L 411.63 312.00 L 415.25 312.00 L 418.87 312.00 L 422.49 312.00 L 426.11 312.00 L 429.73 312.00 L 433.35 312.00 L 436.97 312.00 L 440.59 312.00 L 444.21 312.00 L 447.83 312.00 L 451.45 312.00 L 455.07 312.00 L 458.69 312.00 L 462.31 312.00 L 465.93 312.00 L 469.55 312.00 L 473.17 312.00 L 476.79 312.00 L 480.41 312.00 L 484.03 312.00 L 487.65 312.00 L 491.27 312.00 L 494.89 312.00 L 498.51 312.00 L 502.13 312.00 L 505.75 312.00 L 509.37 312.00 L 512.99 312.00 L 516.61 312.00 L 520.23 312.00 L 523.85 312.00 L 527.47 312.00 L 531.09 312.00 L 534.72 312.00 L 538.34 312.00 L 541.96 312.00 L 545.58 312.00 L 549.20 312.00 L 552.82 312.00 L 556.44 312.00 L 560.06 312.00 L 563.68 311.99 L 567.30 311.99 L 570.92 311.98 L 574.54 311.97 L 578.16 311.95 L 581.78 311.92 L 585.40 311.87 L 589.02 311.78 L 592.64 311.65 L 596.26 311.44 L 599.88 311.10 L 603.50 310.58 L 607.12 309.79 L 610.74 308.58 L 614.36 306.78 L 617.98 304.13 L 621.60 300.29 L 625.22 294.80 L 
628.84 287.11 L 632.46 276.56 L 636.08 262.41 L 639.70 243.93 L 643.32 220.53 L 646.94 192.00 L 650.56 158.78 L 654.18 122.38 L 657.80 85.74 L 661.42 53.52 L 665.04 31.94 L 668.66 28.00 L 672.28 47.64 L 675.90 92.84 L 679.52 158.16 L 683.14 228.84 L 686.76 284.02 L 690.38 309.03 L 694.00 312.00"></path><line class="viz-mean" x1="262" y1="28" x2="262" y2="312"></line><text class="viz-label" x="262" y="44" text-anchor="middle">posterior mean</text><line class="viz-prior-mean" x1="661.6" y1="28" x2="661.6" y2="312"></line><text class="viz-label" x="661.6" y="59" text-anchor="middle">prior mean</text><text class="viz-label" x="46" y="18">density, scaled to max 19.67</text><text class="viz-label" x="694" y="354" text-anchor="end">predicted probability</text></svg><dl class="viz-stats" aria-label="Plot parameters"><div><dt>posterior</dt><dd>Beta(2, 4)</dd></div><div><dt>prior</dt><dd>Beta(95, 5)</dd></div><div><dt>means</dt><dd>0.33 / 0.95</dd></div></dl></figure>
<p>Or you could have a very unconfident prediction of the outcome with very little model uncertainty — a tightly distributed posterior distribution that remains close to the prior distribution:</p>
<figure class="viz beta-viz" aria-labelledby="uncertain-prediction-confident-model-plot"><div class="viz-header"><div><figcaption id="uncertain-prediction-confident-model-plot">Uncertain prediction, confident model</figcaption><p>Posterior Beta(120, 30) and prior Beta(60, 15) densities over predicted probabilities.</p></div><code>posterior Beta(120, 30)</code></div><svg class="viz-plot" viewBox="0 0 720 360" role="img" aria-label="Posterior beta distribution with alpha 120 and beta 30, compared with prior beta distribution with alpha 60 and beta 15."><line class="viz-axis" x1="46" y1="28" x2="46" y2="312"></line><line class="viz-axis" x1="46" y1="312" x2="694" y2="312"></line><g><line class="viz-grid" x1="46" y1="28" x2="46" y2="312"></line><text class="viz-label" x="46" y="338" text-anchor="middle">.00</text></g><g><line class="viz-grid" x1="208" y1="28" x2="208" y2="312"></line><text class="viz-label" x="208" y="338" text-anchor="middle">.25</text></g><g><line class="viz-grid" x1="370" y1="28" x2="370" y2="312"></line><text class="viz-label" x="370" y="338" text-anchor="middle">.50</text></g><g><line class="viz-grid" x1="532" y1="28" x2="532" y2="312"></line><text class="viz-label" x="532" y="338" text-anchor="middle">.75</text></g><g><line class="viz-grid" x1="694" y1="28" x2="694" y2="312"></line><text class="viz-label" x="694" y="338" text-anchor="middle">1.00</text></g><line class="viz-grid" x1="46" y1="99" x2="694" y2="99"></line><line class="viz-grid" x1="46" y1="170" x2="694" y2="170"></line><line class="viz-grid" x1="46" y1="241" x2="694" y2="241"></line><path class="viz-fill" d="M 46.00 312.00 L 49.62 312.00 L 53.24 312.00 L 56.86 312.00 L 60.48 312.00 L 64.10 312.00 L 67.72 312.00 L 71.34 312.00 L 74.96 312.00 L 78.58 312.00 L 82.20 312.00 L 85.82 312.00 L 89.44 312.00 L 93.06 312.00 L 96.68 312.00 L 100.30 312.00 L 103.92 312.00 L 107.54 312.00 L 111.16 312.00 L 114.78 312.00 L 118.40 312.00 L 122.02 312.00 L 125.64 312.00 L 129.26 312.00 L 132.88 
312.00 L 136.50 312.00 L 140.12 312.00 L 143.74 312.00 L 147.36 312.00 L 150.98 312.00 L 154.60 312.00 L 158.22 312.00 L 161.84 312.00 L 165.46 312.00 L 169.08 312.00 L 172.70 312.00 L 176.32 312.00 L 179.94 312.00 L 183.56 312.00 L 187.18 312.00 L 190.80 312.00 L 194.42 312.00 L 198.04 312.00 L 201.66 312.00 L 205.28 312.00 L 208.91 312.00 L 212.53 312.00 L 216.15 312.00 L 219.77 312.00 L 223.39 312.00 L 227.01 312.00 L 230.63 312.00 L 234.25 312.00 L 237.87 312.00 L 241.49 312.00 L 245.11 312.00 L 248.73 312.00 L 252.35 312.00 L 255.97 312.00 L 259.59 312.00 L 263.21 312.00 L 266.83 312.00 L 270.45 312.00 L 274.07 312.00 L 277.69 312.00 L 281.31 312.00 L 284.93 312.00 L 288.55 312.00 L 292.17 312.00 L 295.79 312.00 L 299.41 312.00 L 303.03 312.00 L 306.65 312.00 L 310.27 312.00 L 313.89 312.00 L 317.51 312.00 L 321.13 312.00 L 324.75 312.00 L 328.37 312.00 L 331.99 312.00 L 335.61 312.00 L 339.23 312.00 L 342.85 312.00 L 346.47 312.00 L 350.09 312.00 L 353.71 312.00 L 357.33 312.00 L 360.95 312.00 L 364.57 312.00 L 368.19 312.00 L 371.81 312.00 L 375.43 312.00 L 379.05 312.00 L 382.67 312.00 L 386.29 312.00 L 389.91 312.00 L 393.53 312.00 L 397.15 312.00 L 400.77 312.00 L 404.39 312.00 L 408.01 312.00 L 411.63 312.00 L 415.25 312.00 L 418.87 312.00 L 422.49 312.00 L 426.11 312.00 L 429.73 312.00 L 433.35 312.00 L 436.97 312.00 L 440.59 312.00 L 444.21 312.00 L 447.83 312.00 L 451.45 312.00 L 455.07 311.99 L 458.69 311.98 L 462.31 311.97 L 465.93 311.95 L 469.55 311.92 L 473.17 311.86 L 476.79 311.76 L 480.41 311.59 L 484.03 311.34 L 487.65 310.93 L 491.27 310.32 L 494.89 309.38 L 498.51 308.00 L 502.13 306.00 L 505.75 303.15 L 509.37 299.18 L 512.99 293.78 L 516.61 286.58 L 520.23 277.21 L 523.85 265.30 L 527.47 250.55 L 531.09 232.78 L 534.72 211.99 L 538.34 188.43 L 541.96 162.66 L 545.58 135.58 L 549.20 108.43 L 552.82 82.75 L 556.44 60.23 L 560.06 42.62 L 563.68 31.48 L 567.30 28.00 L 570.92 32.82 L 574.54 45.86 L 578.16 66.35 L 581.78 92.80 L 585.40 123.25 L 
589.02 155.47 L 592.64 187.25 L 596.26 216.69 L 599.88 242.38 L 603.50 263.53 L 607.12 279.95 L 610.74 291.95 L 614.36 300.18 L 617.98 305.47 L 621.60 308.64 L 625.22 310.40 L 628.84 311.30 L 632.46 311.72 L 636.08 311.90 L 639.70 311.97 L 643.32 311.99 L 646.94 312.00 L 650.56 312.00 L 654.18 312.00 L 657.80 312.00 L 661.42 312.00 L 665.04 312.00 L 668.66 312.00 L 672.28 312.00 L 675.90 312.00 L 679.52 312.00 L 683.14 312.00 L 686.76 312.00 L 690.38 312.00 L 694.00 312.00 L 694 312 L 46 312 Z"></path><path class="viz-line" d="M 46.00 312.00 L 49.62 312.00 L 53.24 312.00 L 56.86 312.00 L 60.48 312.00 L 64.10 312.00 L 67.72 312.00 L 71.34 312.00 L 74.96 312.00 L 78.58 312.00 L 82.20 312.00 L 85.82 312.00 L 89.44 312.00 L 93.06 312.00 L 96.68 312.00 L 100.30 312.00 L 103.92 312.00 L 107.54 312.00 L 111.16 312.00 L 114.78 312.00 L 118.40 312.00 L 122.02 312.00 L 125.64 312.00 L 129.26 312.00 L 132.88 312.00 L 136.50 312.00 L 140.12 312.00 L 143.74 312.00 L 147.36 312.00 L 150.98 312.00 L 154.60 312.00 L 158.22 312.00 L 161.84 312.00 L 165.46 312.00 L 169.08 312.00 L 172.70 312.00 L 176.32 312.00 L 179.94 312.00 L 183.56 312.00 L 187.18 312.00 L 190.80 312.00 L 194.42 312.00 L 198.04 312.00 L 201.66 312.00 L 205.28 312.00 L 208.91 312.00 L 212.53 312.00 L 216.15 312.00 L 219.77 312.00 L 223.39 312.00 L 227.01 312.00 L 230.63 312.00 L 234.25 312.00 L 237.87 312.00 L 241.49 312.00 L 245.11 312.00 L 248.73 312.00 L 252.35 312.00 L 255.97 312.00 L 259.59 312.00 L 263.21 312.00 L 266.83 312.00 L 270.45 312.00 L 274.07 312.00 L 277.69 312.00 L 281.31 312.00 L 284.93 312.00 L 288.55 312.00 L 292.17 312.00 L 295.79 312.00 L 299.41 312.00 L 303.03 312.00 L 306.65 312.00 L 310.27 312.00 L 313.89 312.00 L 317.51 312.00 L 321.13 312.00 L 324.75 312.00 L 328.37 312.00 L 331.99 312.00 L 335.61 312.00 L 339.23 312.00 L 342.85 312.00 L 346.47 312.00 L 350.09 312.00 L 353.71 312.00 L 357.33 312.00 L 360.95 312.00 L 364.57 312.00 L 368.19 312.00 L 371.81 312.00 L 375.43 312.00 L 379.05 
312.00 L 382.67 312.00 L 386.29 312.00 L 389.91 312.00 L 393.53 312.00 L 397.15 312.00 L 400.77 312.00 L 404.39 312.00 L 408.01 312.00 L 411.63 312.00 L 415.25 312.00 L 418.87 312.00 L 422.49 312.00 L 426.11 312.00 L 429.73 312.00 L 433.35 312.00 L 436.97 312.00 L 440.59 312.00 L 444.21 312.00 L 447.83 312.00 L 451.45 312.00 L 455.07 311.99 L 458.69 311.98 L 462.31 311.97 L 465.93 311.95 L 469.55 311.92 L 473.17 311.86 L 476.79 311.76 L 480.41 311.59 L 484.03 311.34 L 487.65 310.93 L 491.27 310.32 L 494.89 309.38 L 498.51 308.00 L 502.13 306.00 L 505.75 303.15 L 509.37 299.18 L 512.99 293.78 L 516.61 286.58 L 520.23 277.21 L 523.85 265.30 L 527.47 250.55 L 531.09 232.78 L 534.72 211.99 L 538.34 188.43 L 541.96 162.66 L 545.58 135.58 L 549.20 108.43 L 552.82 82.75 L 556.44 60.23 L 560.06 42.62 L 563.68 31.48 L 567.30 28.00 L 570.92 32.82 L 574.54 45.86 L 578.16 66.35 L 581.78 92.80 L 585.40 123.25 L 589.02 155.47 L 592.64 187.25 L 596.26 216.69 L 599.88 242.38 L 603.50 263.53 L 607.12 279.95 L 610.74 291.95 L 614.36 300.18 L 617.98 305.47 L 621.60 308.64 L 625.22 310.40 L 628.84 311.30 L 632.46 311.72 L 636.08 311.90 L 639.70 311.97 L 643.32 311.99 L 646.94 312.00 L 650.56 312.00 L 654.18 312.00 L 657.80 312.00 L 661.42 312.00 L 665.04 312.00 L 668.66 312.00 L 672.28 312.00 L 675.90 312.00 L 679.52 312.00 L 683.14 312.00 L 686.76 312.00 L 690.38 312.00 L 694.00 312.00"></path><path class="viz-prior-line" d="M 46.00 312.00 L 49.62 312.00 L 53.24 312.00 L 56.86 312.00 L 60.48 312.00 L 64.10 312.00 L 67.72 312.00 L 71.34 312.00 L 74.96 312.00 L 78.58 312.00 L 82.20 312.00 L 85.82 312.00 L 89.44 312.00 L 93.06 312.00 L 96.68 312.00 L 100.30 312.00 L 103.92 312.00 L 107.54 312.00 L 111.16 312.00 L 114.78 312.00 L 118.40 312.00 L 122.02 312.00 L 125.64 312.00 L 129.26 312.00 L 132.88 312.00 L 136.50 312.00 L 140.12 312.00 L 143.74 312.00 L 147.36 312.00 L 150.98 312.00 L 154.60 312.00 L 158.22 312.00 L 161.84 312.00 L 165.46 312.00 L 169.08 312.00 L 172.70 312.00 L 176.32 
312.00 L 179.94 312.00 L 183.56 312.00 L 187.18 312.00 L 190.80 312.00 L 194.42 312.00 L 198.04 312.00 L 201.66 312.00 L 205.28 312.00 L 208.91 312.00 L 212.53 312.00 L 216.15 312.00 L 219.77 312.00 L 223.39 312.00 L 227.01 312.00 L 230.63 312.00 L 234.25 312.00 L 237.87 312.00 L 241.49 312.00 L 245.11 312.00 L 248.73 312.00 L 252.35 312.00 L 255.97 312.00 L 259.59 312.00 L 263.21 312.00 L 266.83 312.00 L 270.45 312.00 L 274.07 312.00 L 277.69 312.00 L 281.31 312.00 L 284.93 312.00 L 288.55 312.00 L 292.17 312.00 L 295.79 312.00 L 299.41 312.00 L 303.03 312.00 L 306.65 312.00 L 310.27 312.00 L 313.89 312.00 L 317.51 312.00 L 321.13 312.00 L 324.75 312.00 L 328.37 312.00 L 331.99 312.00 L 335.61 312.00 L 339.23 312.00 L 342.85 312.00 L 346.47 312.00 L 350.09 312.00 L 353.71 312.00 L 357.33 312.00 L 360.95 312.00 L 364.57 312.00 L 368.19 312.00 L 371.81 312.00 L 375.43 312.00 L 379.05 312.00 L 382.67 312.00 L 386.29 312.00 L 389.91 312.00 L 393.53 312.00 L 397.15 312.00 L 400.77 312.00 L 404.39 311.99 L 408.01 311.99 L 411.63 311.99 L 415.25 311.98 L 418.87 311.97 L 422.49 311.96 L 426.11 311.94 L 429.73 311.92 L 433.35 311.88 L 436.97 311.83 L 440.59 311.76 L 444.21 311.66 L 447.83 311.53 L 451.45 311.35 L 455.07 311.11 L 458.69 310.79 L 462.31 310.37 L 465.93 309.83 L 469.55 309.12 L 473.17 308.21 L 476.79 307.05 L 480.41 305.59 L 484.03 303.77 L 487.65 301.51 L 491.27 298.75 L 494.89 295.40 L 498.51 291.37 L 502.13 286.59 L 505.75 280.97 L 509.37 274.45 L 512.99 266.96 L 516.61 258.46 L 520.23 248.96 L 523.85 238.47 L 527.47 227.06 L 531.09 214.86 L 534.72 202.03 L 538.34 188.80 L 541.96 175.46 L 545.58 162.34 L 549.20 149.83 L 552.82 138.34 L 556.44 128.28 L 560.06 120.09 L 563.68 114.16 L 567.30 110.82 L 570.92 110.32 L 574.54 112.81 L 578.16 118.32 L 581.78 126.75 L 585.40 137.84 L 589.02 151.23 L 592.64 166.42 L 596.26 182.84 L 599.88 199.88 L 603.50 216.91 L 607.12 233.34 L 610.74 248.64 L 614.36 262.43 L 617.98 274.41 L 621.60 284.45 L 625.22 292.55 L 628.84 
298.82 L 632.46 303.47 L 636.08 306.75 L 639.70 308.95 L 643.32 310.34 L 646.94 311.16 L 650.56 311.61 L 654.18 311.84 L 657.80 311.94 L 661.42 311.98 L 665.04 311.99 L 668.66 312.00 L 672.28 312.00 L 675.90 312.00 L 679.52 312.00 L 683.14 312.00 L 686.76 312.00 L 690.38 312.00 L 694.00 312.00"></path><line class="viz-mean" x1="564.4" y1="28" x2="564.4" y2="312"></line><text class="viz-label" x="564.4" y="44" text-anchor="middle">posterior mean</text><line class="viz-prior-mean" x1="564.4" y1="28" x2="564.4" y2="312"></line><text class="viz-label" x="564.4" y="59" text-anchor="middle">prior mean</text><text class="viz-label" x="46" y="18">density, scaled to max 12.27</text><text class="viz-label" x="694" y="354" text-anchor="end">predicted probability</text></svg><dl class="viz-stats" aria-label="Plot parameters"><div><dt>posterior</dt><dd>Beta(120, 30)</dd></div><div><dt>prior</dt><dd>Beta(60, 15)</dd></div><div><dt>means</dt><dd>0.80 / 0.80</dd></div></dl></figure>
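<p>To put rough numbers on the plots above, here is a minimal sketch that computes the mean and standard deviation of each Beta distribution from its parameters. The helper names <code>beta_mean</code> and <code>beta_sd</code> are my own for illustration. I am treating the standard deviation as a simple summary of how spread out the distribution over predicted probabilities is, which is an assumption about how to measure model uncertainty rather than the only way to do it.</p>

```python
import math


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution over predicted probabilities."""
    return a / (a + b)


def beta_sd(a, b):
    """Standard deviation of Beta(a, b): a rough summary of its spread."""
    n = a + b
    return math.sqrt(a * b / (n * n * (n + 1)))


# (posterior, prior) parameters as listed under the two plots above
plots = {
    "Beta(2, 4) vs Beta(95, 5)": ((2, 4), (95, 5)),
    "Beta(120, 30) vs Beta(60, 15)": ((120, 30), (60, 15)),
}

for name, (posterior, prior) in plots.items():
    print(name)
    for label, (a, b) in (("posterior", posterior), ("prior", prior)):
        mean = beta_mean(a, b)
        sd = beta_sd(a, b)
        print(f"  {label}: mean={mean:.2f}, sd={sd:.3f}")
```

<p>The means reproduce the values shown under the plots (0.33 / 0.95 and 0.80 / 0.80). The Beta(120, 30) posterior has a standard deviation of about 0.03, compared with about 0.18 for Beta(2, 4), which is what "very little model uncertainty" looks like in numbers.</p>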
<h2 id="summing-up">Summing up</h2>
<p>The argument I’ve made here can be summed up as:</p>
<ol>
<li>A probability prediction is a full probability distribution and so it inherently quantifies uncertainty — it’s a misconception to think that you need a prediction interval to express uncertainty</li>
<li>Model uncertainty propagates through to produce an additional (but orthogonal) layer of uncertainty over the outcome</li>
</ol>
<p>This may seem obvious in retrospect, but it’s always good to gain clarity on the fundamentals, where confused intuitions may lurk unnoticed.
Here, intuitions such as “a scalar prediction must be a point estimate” and “you need a confidence/prediction interval to express uncertainty” are highly misleading.</p>
<p>In this case, understanding the distinct sources of uncertainty has resolved some confusion I had about prediction markets and machine learning model predictions.
The upshot is that, if I’m comfortable ignoring model uncertainty, I need only be concerned that the probability predictions are <a href="https://scikit-learn.org/stable/modules/calibration.html">calibrated</a>.
This is straightforward to check: for prediction markets, you just need some historical data on outcomes; for classifier models, you can check on a holdout dataset.
Then you can be comfortable that the predictions are accurately quantifying uncertainty.</p>]]></content:encoded>
</item>
<item>
  <title>Economic growth via effective regulation</title>
  <link>https://tobanwiebe.com/writing/economic-growth-effective-regulation/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/economic-growth-effective-regulation/</guid>
  <pubDate>Sat, 07 Nov 2015 08:30:00 GMT</pubDate>
  <description>I somehow ended up reading this long essay by economist John Cochrane on growth-oriented policy. I really enjoyed it because it takes a hard-headed approach to topics where...</description>
  <content:encoded><![CDATA[<p>I somehow ended up reading <a href="http://johnhcochrane.blogspot.com/2015/10/economic-growth.html">this long essay</a> by economist John Cochrane on growth-oriented policy.
I really enjoyed it because it takes a hard-headed approach to topics where reason is too often crowded out by <a href="https://philosophynow.org/issues/101/The_Righteous_Mind_by_Jonathan_Haidt">moral instinct</a>.
(Other examples of this approach include <a href="http://www.effectivealtruism.org/">Effective Altruism</a>, or the <a href="http://www.ecomodernism.org/">Eco-Modernist Manifesto</a>.)</p>
<p>It’s an idealistic vision — it ignores the political economy that sustains the gigantic inefficiencies he decries — yet it’s an overwhelmingly sensible one.
Perhaps I’m biased towards the views of pro-growth economists, but this essay was packed with an uncommon amount of common sense (which is a pretty low bar when it comes to the topic of policy).</p>
<p>The essay makes a two-part argument: that growth should be a very high policy priority, and that there’s a lot of inefficient regulation that hinders growth.
I’ll quote a bunch of highlights below if you don’t want to read the whole thing.</p>
<p><em><strong>In the long run, nothing but growth matters.</strong></em>
Small differences in growth rates lead to dramatically different outcomes, because of compounding.</p>
<blockquote>
<p>If the US economy had grown at 2% rather than 3.5% since 1950, income per person by 2000 would have been $23,000 not $50,000. That’s a huge difference. Nowhere in economic policy are we even talking about events that will double, or halve, the average American’s living standards in the next generation.</p>
</blockquote>
<p>The primacy of economic growth should be a really obvious point, but it’s easy to lose sight of the big picture.
If you think about it, economic growth is the <em>only</em> thing that has lifted nations out of poverty.</p>
<blockquote>
<p>Nothing other than productivity matters in the long run. A factor of three increase in income in 50 years, and the much larger rise in income and health since the dawn of the industrial age, dwarfs what unions bargaining for better wages, progressive taxes or redistribution, monetary, fiscal or other stimulus programs, minimum wage laws or other Federal regulation of labor markets, price caps and supports, subsidies, or much of anything else the government can do.</p>
</blockquote>
<p>And the primacy of growth should be non-controversial:</p>
<blockquote>
<p>38% more income — or 26% less income — drives just about any agenda one could wish for, from strong defense, to environmental protection, to the affordability of social programs, to the welfare of any segment of the population, to public investments, health, and fundamental research.</p>
</blockquote>
<p><em><strong>Dumb regulation hinders growth.</strong></em>
A lot of obviously wasteful regulation could be made much smarter, which would boost growth.
Unfortunately, that regulation is there because of problems of political economy, and so it’s not as easy to fix as it looks.</p>
<blockquote>
<p>When the average person (voter) expresses concern over inequality, what they really mean is that they are concerned that average people are not getting ahead economically. If the average person were getting ahead, whether some big shot CEOs fly on private jets or not would make little difference. Conversely, the average voter, if not the average left-wing pundit, does not support equality of misery. If the average person continues to do poorly, it would bring them little solace for the government to tax away the lifestyles of the rich and famous.</p>
</blockquote>
<p>On the problems of health insurance regulation:</p>
<blockquote>
<p>The central problem of preexisting conditions was an artifact of regulation. In the ideal form of health insurance, you buy cheap catastrophic insurance when young, but the insurance policy can follow you as you age, change jobs, and move from state to state, and does not radically increase premiums if you get sick.
[…]
We need to allow simple, portable, largely catastrophic, lifelong, guaranteed-renewable health insurance to emerge. Right now it’s illegal.</p>
</blockquote>
<p>On energy policy:</p>
<blockquote>
<p>The poster child for inefficiency may well be the mandate for gasoline producers to use ethanol. Corn ethanol, it turns out, does nothing to help the environment: It takes nearly as much petroleum energy to produce it as it contains, in the form of fertilizer, transport fuel and so on; it uses up valuable land, which directly emits greenhouse gases, and contributes to erosion and runoff; it drives up the price of food.</p>
</blockquote>
<blockquote>
<p>If you are serious about carbon, let the words “nuclear power” pass your lips. We have sitting before us a technology that can easily supply our electricity and many transport needs, with zero carbon or methane emissions. New designs, if only they could pass the immense regulatory hurdle, would be much safer than the 1950s Soviet technology that failed at Chernobyl or the 1960s technology that failed at Fukushima. We are now operating antiques. And even with this rate of accident, nuclear power has caused orders of magnitude less human or environmental suffering than any other fuel.</p>
</blockquote>
<blockquote>
<p>Similarly, the most environmentally friendly way for people to live is in tightly packed cities, fed by genetically modified foods which yield more per acre of farmland and require fewer fertilizers and pesticides, from laser-leveled fields run efficiently by large corporations in the highest productivity locations. Federal policies to the contrary are not just anti-growth, they’re anti-environment too. When Federal policy can say these things in public, it will have a bit more standing to invoke the name of “science.”</p>
</blockquote>
<p>On taxation:</p>
<blockquote>
<p>Often, however, tax reform proposals sacrifice too quickly the principles of what a good tax system should be with perceived political accommodations to powerful interest groups. Economists should not play politician. We should always start with “in a perfect world, here is what the tax code should look like,” and accommodate political constraints only when asked to. Political constraints change quickly. Economic fundamentals do not.</p>
</blockquote>
<blockquote>
<p>The right corporate tax rate is zero. Corporations never pay taxes. Every dollar of taxes that a corporation pays comes from higher prices of their products, lower wages to their workers, or lower returns to their owners.
[…]
For all these reasons, eliminating the corporate tax is as likely to be more rather than less progressive. The higher prices a corporation charges hurt everyone. The lower wages corporations pay hurt workers. The income it passes along to its owners is subject to our highly progressive tax system.</p>
</blockquote>
<blockquote>
<p>When we say broaden the base by removing deductions and credits, we should be serious about that. Thus, even the holy trinity of mortgage interest deduction, charitable donation deduction, and employer provided health insurance deduction should be scrapped. The extra revenue could finance a large reduction in marginal rates.
Why? Consider the mortgage interest deduction. Imagine that in the absence of the deduction, Congress proposes to send a check to each homeowner, in proportion to the interest he or she pays on money borrowed against the value of the house. Furthermore, rich people, people who buy more expensive houses, people who borrow lots of money, and people who refinance often to take cash out get bigger checks than poor people, people who buy smaller houses, people who save up and pay cash, or people who pay down their mortgages. A rich person buying a huge house in Palo Alto, who pays 40% marginal income tax rate, gets a check for 40% of his huge mortgage. A poor person buying a small house in Fresno, who pays a 10% income tax, gets a check for 10% of his much smaller mortgage. There would be riots in the streets before this bill would pass. Yet this is exactly what the mortgage interest deduction accomplishes.</p>
</blockquote>
<p>On labor markets:</p>
<blockquote>
<p>Start, of course, with taxes: income taxes and payroll taxes are primarily taxes on employment. But the regulatory burdens of employment are larger still, as anyone who has tried to get a nanny legal will attest.
Minimum wages, occupational licensing, anti-discrimination laws, laws regulating hours people can work, benefits they must receive, leave they must be given, fear of lawsuits if you fire someone, and so forth all impede the labor market.</p>
</blockquote>
<blockquote>
<p>The usual argument is that workers need protection of all these laws. Well, the supposed protections do cost economic growth, and they do reduce employment. How much do they actually protect workers? The strongest force for worker protection is a vibrant labor market — if you don’t like this job, go take another. The tightly regulated labor market makes it much harder to get a new job, and thus, paradoxically, lowers your bargaining power in the old one.</p>
</blockquote>
<p>An exceptional section on immigration. I had a hard time not quoting the whole thing:</p>
<blockquote>
<p>Immigrants contribute to economic growth. Even if income per capita is unchanged, imagine how much better off our social security system, our medicare system, our unfunded pension promises, and our looming deficits and debt would be, if America could attract a steady flow of young, hard-working people who want to come and pay taxes. Aha, we can attract them! They’re beating the doors down to come. But then we keep them out.</p>
</blockquote>
<blockquote>
<p>Allowing free migration is, by many estimates the single policy change that would raise world GDP the most. If you believe in free trade in goods, and free investment, then you have to believe that free movement of people has the same benefits.</p>
</blockquote>
<p>On education:</p>
<blockquote>
<p>The culprit is easy to find: awful public schools run by and for the benefit of politically powerful teachers’ and administrators’ unions. (Don’t forget the latter! Teachers account for only half of typical public school expenses.) Education poses a particularly large tradeoff between profits to incumbents and economic growth, since education lies at the foundation of higher productivity. In addition, the costs of awful schools fall primarily on lower-income people who cannot afford to get out of the system. It is one of the major contributors to inequality.
The solution is simple as well: widespread financing by vouchers and charter schools. As with health care, a vibrant market demands that people control their spending, and can move it to where they get better results. As with health care, the government does not have to directly provide a service in order to help people to pay for that service. But as with health care, a healthy market also demands supply competition, that new schools be allowed to start and compete for students.</p>
</blockquote>
<p>As exciting as this vision is, I’m not very optimistic about any of these prescriptions being implemented, at least not in the near term.
But it’s good to know that we have a lot of simple solutions if we ever get desperate.
And I hope that some of these proposals (like immigration) can gain enough momentum to be implemented in the medium term.</p>]]></content:encoded>
</item>
<item>
  <title>Double standard: caffeine vs nicotine</title>
  <link>https://tobanwiebe.com/writing/double-standard-caffeine-vs-nicotine/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/double-standard-caffeine-vs-nicotine/</guid>
  <pubDate>Fri, 16 Jan 2015 14:20:48 GMT</pubDate>
  <description>[...] there is a kind of puritanical view that everything relating to nicotine is bad and harmful and should be stamped on. - Richard West I&apos;m no expert on this topic, but...</description>
  <content:encoded><![CDATA[<blockquote><p>[...] there is a kind of puritanical view that everything relating to nicotine is bad and harmful and should be stamped on.</p>
<p>-<a href="http://www.newscientist.com/article/dn26160-who-vaping-report-is-misguided-say-tobacco-experts.html#.VLgL1jXL_7B">Richard West</a></p></blockquote>
<p>I'm no expert on this topic, but it seems to me that the harms of tobacco are from smoking (and chewing) and not from the nicotine itself (apart from the addictiveness). So nicotine administered with gum/patch/vaping would be somewhat more like caffeine in harmfulness (loads of people are addicted to caffeine).</p>
<p>Nicotine also seems to have similar useful effects. From <a href="https://en.wikipedia.org/wiki/Nicotine">Wikipedia</a>: "Nicotine appears to have significant performance enhancing effects, particularly in fine motor skills, attention, and memory." A Martian scientist comparing the harms and benefits (for humans) of drinking coffee vs <a href="http://www.engadget.com/2014/05/23/vaporizers-explainer/">vaping nicotine</a> might think that both drugs are safe and have productivity-boosting effects, and wouldn't see a reason for only one of them to be socially acceptable.</p>
<p>So why is nicotine held to a <a href="http://www.newscientist.com/article/dn26160-who-vaping-report-is-misguided-say-tobacco-experts.html#.VLgL1jXL_7B">much higher standard</a>? Well, the public health effort to end smoking became a moral crusade. As Jonathan Haidt puts it, morality binds and blinds. The anti-smoking crusade gained enough strength that it created <a href="https://www.youtube.com/watch?v=gUEjnoWpdao">a hated enemy</a>. Smokers in North America are now a low-status out-group from the perspective of most non-smokers. As usual, when activists morally charge their case by painting something as sacred or evil, they commit themselves to not giving up an inch. Smoking is pure evil, and nicotine is central to smoking, so nicotine must be evil too. Any admission that nicotine might not be horrible is helping the enemy.</p>
<p>Caffeine, on the other hand, never got loaded with negative affect, because it had a safe delivery system. Suppose nicotine had historically been consumed as a drink instead of being smoked: in such a world, I imagine it would be just another stimulant drink like coffee or tea.</p>
<p>It's a real shame that e-cigarettes (nicotine) are being morally condemned along with cigarettes (tobacco). If smokers can switch to vaping, they can avoid all the harm and still get their nicotine. From a public health perspective, getting smokers to switch is a huge win. Doctors should be prescribing e-cigs to smokers, and government should promote switching. Unfortunately, since nicotine has been painted evil by association, it's probably not going to happen.</p>]]></content:encoded>
</item>
<item>
  <title>A thought on methodology</title>
  <link>https://tobanwiebe.com/writing/a-thought-on-methodology/</link>
  <guid isPermaLink="true">https://tobanwiebe.com/writing/a-thought-on-methodology/</guid>
  <pubDate>Tue, 26 Aug 2014 01:30:50 GMT</pubDate>
  <description>I&apos;ve never felt comfortable with the logical positivists&apos; &quot;science is prediction&quot; characterization, primarily because it neglects what I intuitively think of as the heart o...</description>
  <content:encoded><![CDATA[<p>I've never felt comfortable with the logical positivists' "science is prediction" characterization, primarily because it neglects what I intuitively think of as the heart of science: explanation. For example, Darwin's theory of evolution – perhaps the greatest scientific discovery ever – is big on explanation, but not nearly as big on prediction, because evolution happens on such long timescales (although microbial evolution can be observed on very short timescales). Or consider the 'selfish gene' paradigm, which views the gene as the fundamental unit of selection and the organism as a mere tool fabricated by the genes for the purpose of propagating themselves into the future. Dawkins discusses (I think it was in The Extended Phenotype) criticisms of the idea as not providing any new predictions. My initial reaction to these criticisms is always: So what? They're wonderful <em>explanations</em> of the world. They make sense of the world. Isn't that pretty remarkable?</p>
<p>The point I want to make here is that the emphasis on prediction is just a convenient special case of a more general principle: that a theory should correspond to reality, just as a <a href="http://wiki.lesswrong.com/wiki/Map_and_Territory_(sequence)" target="_blank">map corresponds to the territory</a>. Observations of reality can come from the past, present, or future. It is just as valid to vet a theory against past observations as against future ones. The reason the scientific method favors prediction is that it prevents the scientist from concocting a 'just so' story that too neatly fits the existing facts ("overfitting"). An idealized honest scientist can test a theory against any empirical evidence.</p>
<p>Darwin's theory is so amazing because <a href="http://www.amazon.com/gp/product/1416594795/ref=as_li_tl?ie=UTF8&#x26;camp=1789&#x26;creative=390957&#x26;creativeASIN=1416594795&#x26;linkCode=as2&#x26;tag=highthou-20&#x26;linkId=V4OQUPRBW3EBZQ6E">it makes sense of so many existing facts</a>. (And you can still make predictions about historical facts (<a href="http://erdmannevolution.blogspot.com/2010/04/retrodictions.html" target="_blank">"retrodictions"</a>), such as the famous quip that evolution would be falsified by finding <a href="https://en.wikipedia.org/wiki/Precambrian_rabbit" target="_blank">fossil rabbits in the Precambrian</a>.) Of course, if you have a beautiful theory that has no connection to reality, then you're not doing science. Science is concerned with explaining reality, and so scientific theories must say things about reality – things which can (in principle) be empirically checked. It ultimately doesn't matter where that evidence is temporally located.</p>]]></content:encoded>
</item>
  </channel>
</rss>