Calibration Pipeline
Platt scaling and Gaussian CDF for well-calibrated bracket probabilities
Why Calibration Matters
Raw model outputs are not calibrated probabilities. A model that says 70% should be correct 70% of the time for optimal Kelly sizing. Without calibration, the Kelly criterion will systematically over- or under-size positions, destroying expected value even when the model has genuine edge.
Gaussian CDF Bracket Mapping
Given the predicted mean (mu) and standard deviation (sigma) from the LightGBM dual model, we compute the probability of temperature falling in each bracket using the Gaussian CDF. For a bracket [a, b], the probability is Phi((b - mu) / sigma) - Phi((a - mu) / sigma), where Phi is the standard normal CDF from scipy.stats.norm.
Platt Scaling Calibrator
After CDF mapping, a Platt scaling layer (logistic regression on log-odds) corrects for systematic biases. The calibrator is fit on a held-out validation set using sklearn's CalibratedClassifierCV equivalent. This brings Brier scores from ~0.08 uncalibrated down to ~0.05 calibrated, a meaningful improvement for trading.