Modern Statistical Arbitrage by Quantitativo
Sunday May 31st, 2026 at 12:25 PM

Quant Trading Rules

The idea

“If I have seen further it is by standing on the shoulders of Giants.” Sir Isaac Newton.

Isaac Newton was arguably the greatest scientist who ever lived. He effectively discovered gravity. He showed us how to predict the motion of the planets. He had every right to brag about his genius. Yet he chose humility. Why?

His critics paraded their ideas with hubris. Newton offered his with deference to those who came before. And that humility was no pose. It came from something he understood early and deeply: knowledge builds upon itself. Each idea improves on the last, little by little, until the small gains add up to something revolutionary. That is the essence of his most famous “standing on the shoulders of Giants” metaphor.

From a young age, Newton kept a commonplace book, a notebook inherited from his stepfather. In it, he copied passages from what he read and added his own notes, turning borrowed knowledge into original ideas. He called it his “Waste Book.” The name was a nod to the usefulness of useless knowledge and the combinatorial nature of creativity, what Einstein would later call “combinatory play.” Creating by connecting was the foundation of Newton’s mind. It was his real superpower.

This week, we will cover two articles. We will build on The Modern Spirit of Statistical Arbitrage, a great piece by SysLS. And we will implement a recent breakthrough paper that rigorously tested more than 190 signals in the US equity market.

Here’s our plan:

First, we will summarize the modern spirit of stat arb.
Next, we will construct a signal and show, in a few lines of code, how it performs across several large baskets.
We will then present the combined performance of the top ~20 signals and walk through them, summarizing the source paper along the way.
Finally, we will lay out a simple way to merge these signals into a portfolio that survives friction and costs.

Let’s get started.

What is statistical arbitrage?

At its core, statistical arbitrage is a class of strategy built from a portfolio of signals. Each signal assigns weights to instruments based on how much they outperform or underperform the rest of their basket, measured against some central point of all the other instruments. This can happen in any space: returns, price, volume, flow, dividends, and so on. The basket is then built so that its net factor exposures tend toward zero, which means most of its returns come from idiosyncratic moves rather than broad market or factor exposure. It’s a broad definition, and that’s the point. It covers every flavor of stat arb.

For more details, check the great post by SysLS.

Let’s see a concrete example. We will start with Factor 46 from the paper we are sourcing the signals from.

Definition. Factor46 is the paper’s Multi-Period Mean Reversion Ratio, computed as:

(MEAN(CLOSE,3) + MEAN(CLOSE,6) + MEAN(CLOSE,12) + MEAN(CLOSE,24)) / (4 * CLOSE)

The Python code is straightforward:

The inputs are clear: both are date-indexed, point-in-time panels covering every symbol that has ever belonged to a given universe (the Norgate “… Current & Past” watchlist, so it’s survivorship-bias free):

data: a wide DataFrame whose columns are a two-level ('Feature', 'Symbol') MultiIndex, so data['Close'] slices out a single date-×-symbol price matrix. Features available are Open, High, Low, Close, Amount, Volume, and VWAP. factor46 only touches data['Close'].
index: a same-shaped date-×-symbol integer DataFrame that is 1 on days a symbol was an actual index constituent and 0 otherwise. factor46’s final result.where(index == 1) uses it as a membership mask, blanking out the factor value for any stock/day that wasn’t in the universe at that time so it can’t be traded.
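
A minimal sketch of what factor46 could look like, reconstructed from the formula and the input description above. It is an assumption about the implementation, not the author's exact code, and the synthetic prices and the AAA/BBB symbols are made up purely for illustration.

```python
import numpy as np
import pandas as pd


def factor46(data: pd.DataFrame, index: pd.DataFrame) -> pd.DataFrame:
    """Multi-period mean reversion ratio, masked to index constituents."""
    close = data["Close"]  # date x symbol matrix of closing prices
    # Sum of the four trailing simple moving averages (3, 6, 12, 24 days).
    blended = sum(close.rolling(window).mean() for window in (3, 6, 12, 24))
    result = blended / (4 * close)
    # Keep values only where the symbol was an index member on that date.
    return result.where(index == 1)


# Synthetic illustration only (random walk prices, not market data).
dates = pd.bdate_range("2024-01-01", periods=30)
symbols = ["AAA", "BBB"]
rng = np.random.default_rng(0)
close = pd.DataFrame(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, size=(30, 2)), axis=0)),
    index=dates,
    columns=symbols,
)
data = pd.concat({"Close": close}, axis=1)  # two-level (Feature, Symbol) columns
data.columns.names = ["Feature", "Symbol"]
index = pd.DataFrame(1, index=dates, columns=symbols)
index.loc[dates[-5:], "BBB"] = 0  # BBB leaves the universe for the last 5 days

factor = factor46(data, index)
print(factor.tail())
```

The first 23 rows are empty because the 24-day average needs 24 observations, and the masked BBB rows come out empty as well, so neither can produce a tradeable weight.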

As we can see, Factor 46 takes four trailing moving averages of the closing price (3, 6, 12, and 24 days, each window roughly doubling, spanning a few days out to about a trading month), equal-weights them into a single blended reference price, and divides by today’s close. The result is a pure ratio centered near 1:

a value > 1 means the current price sits below its own multi-horizon average (the stock has recently sold off relative to its recent history), and
a value < 1 means it sits above it (recently run up).

It is, in essence, a smoothed, scale-free “distance from fair value” measure where “fair value” is the stock’s own recent average price rather than any fundamental anchor.

What it captures and why economically. The effect is short-horizon cross-sectional mean reversion / contrarian price correction. The paper grounds this in behavioral universality: overreaction, herding, and liquidity-seeking are cognitive traits common to all market participants, so when traders push a price away from its recent path (chasing news or dumping inventory), the dislocation tends to correct. Crucially, arbitrageurs are slow to close these gaps because of noise-trader risk and limits to arbitrage, which lets the reversal premium persist long enough to be harvested.

Now, let’s see the code that tests this signal. The most important lines are:

What is happening here?

The line signal = -(factor.subtract(factor.mean(axis=1), axis=0)) measures how far each stock sits from the center of its basket. For every day, it takes the cross-sectional mean of the factor across all stocks and subtracts it from each stock’s value. What’s left is each stock’s displacement from the group. The leading minus sign then flips the bet: stocks far below the average get positive weight, and stocks far above it get negative weight. That is the contrarian core of stat arb, betting that extremes revert toward the center.

The line signal.divide(signal.abs().sum(axis=1), axis=0) just normalizes. It divides every weight by the total gross exposure that day, so the book always sums to the same size and the longs roughly fund the shorts. This keeps the portfolio close to dollar-neutral and comparable across time.

The flip is optional. When flip_signal is true, it multiplies the whole book by -1, reversing the direction of every bet. This is useful because we don’t always know in advance which sign of a factor is the profitable one. The same formula can be tested both ways, longs and shorts swapped, without rewriting anything.

The last line runs the backtest: signal.shift(execution_lag).multiply(returns).sum(axis=1). It takes each day’s weights, shifts them forward by execution_lag days (two by default), and multiplies them by that day’s returns. The shift is the important part. It makes sure we only earn returns on positions we could have actually held, trading today on a signal computed execution_lag days earlier rather than peeking at information we wouldn’t have had in time. Summing across all stocks then collapses the basket into a single daily P&L for the whole portfolio.
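
The four lines discussed above can be assembled into one small function. This is a sketch under assumptions: the function name backtest_signal and the toy factor and return values are hypothetical, and only the quoted expressions come from the text.

```python
import numpy as np
import pandas as pd


def backtest_signal(factor, returns, flip_signal=False, execution_lag=2):
    """Turn a factor panel into normalized weights and a daily P&L series."""
    # Displacement of each stock from the cross-sectional mean, sign flipped.
    signal = -(factor.subtract(factor.mean(axis=1), axis=0))
    # Scale so that gross exposure (sum of absolute weights) is 1 each day.
    signal = signal.divide(signal.abs().sum(axis=1), axis=0)
    if flip_signal:
        signal = -signal  # reverse every bet
    # Weights formed on day t earn the returns of day t + execution_lag.
    pnl = signal.shift(execution_lag).multiply(returns).sum(axis=1)
    return signal, pnl


# Toy inputs, made up for illustration.
dates = pd.bdate_range("2024-01-01", periods=4)
symbols = ["AAA", "BBB", "CCC"]
factor = pd.DataFrame(
    [[0.9, 1.0, 1.1], [1.2, 1.0, 0.8], [1.0, 0.9, 1.1], [0.8, 1.1, 1.1]],
    index=dates,
    columns=symbols,
)
returns = pd.DataFrame(
    [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.01, 0.0, -0.02], [0.02, 0.01, -0.01]],
    index=dates,
    columns=symbols,
)

signal, pnl = backtest_signal(factor, returns)
print(signal)
print(pnl)
```

Because the weights are demeaned before scaling, each day's book nets to zero while its gross exposure is one. The first execution_lag days earn nothing, since no position exists yet.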

Let’s see the results:

Cumulative returns of the Factor 46 signal across six equity universes

Factor 46 performance across six equity universes

Factor 46 is encouraging, but on its own it’s not a finished strategy. The same formula behaves very differently depending on where we point it: a 0.53 Sharpe on the S&P 500, a 1.46 on the S&P ASX 300. The S&P 500 version is the clear laggard, with the weakest return and a brutal 55.7% drawdown that no one would want to sit through. And these numbers are gross of costs, so the real picture is worse. That’s the honest takeaway: a single signal, however clever, is rarely tradeable alone. The edge is real but thin, and one factor gives us no diversification when it goes through a bad stretch. The fix is to stop relying on any one signal. Combine many, each with its own small edge, and the weak spots start to cancel out. That is exactly where we go next.

From a single signal to a portfolio of signals

Let’s see what happens when we combine the strongest 17 signals according to the source paper:

Cumulative returns of the portfolio of signals across six equity universes

Portfolio performance across six equity universes

The effect of diversification is immediate. The single Factor 46 signal drew down as much as 55.7% on the S&P 500; the combined portfolio’s worst drawdown across all six universes is just 8.2%, and only 4.3% on the Russell 3000. Every Sharpe ratio now sits above 1.0, from 1.15 to 1.76, where the lone signal barely cleared 0.5 on the S&P 500. The annual returns look smaller, but that’s the point: the single signal earned big numbers by running enormous risk, while the portfolio earns steadier returns with a fraction of the pain. That’s diversification doing its job. Seventeen small, imperfect edges combine into something far smoother than any one of them.

So what are these seventeen signals? Let’s look at them next.

Inside the portfolio

The list below shows the strongest signals, the most pervasive cross-sectional drivers according to the paper:

Why LSTM is overrated for price prediction and when Gradient Boosting beats it by Nayab Bhutta
Friday May 15th, 2026 at 4:47 PM

InsiderFinance Wire - Medium

The Hard Truth About Deep Learning in Quant Trading

For years, LSTM neural networks have been marketed as the holy grail of financial prediction.

Search for:

AI stock prediction
deep learning trading
neural networks for finance
predictive trading systems

…and you’ll see endless claims that LSTMs can “learn market patterns” and predict future prices better than traditional models.

The narrative sounds compelling:

Markets are time series.
LSTMs are designed for time series.
Therefore, LSTMs should dominate trading.

But in real quantitative trading environments, things are far less glamorous.

Many professional quant researchers eventually discover something surprising:

Simpler gradient boosting models often outperform LSTMs in real-world financial prediction tasks.

Models like:

XGBoost
LightGBM
CatBoost

frequently beat deep learning systems in:

robustness
interpretability
training efficiency
out-of-sample stability
tabular financial feature prediction

This article breaks down:

why LSTMs became popular
why they often fail in live trading
where gradient boosting excels
when deep learning actually makes sense
how institutional quants approach the problem

This is not an anti-deep-learning argument.

It’s a realism argument.

Why LSTMs Became Popular in Finance

LSTM (Long Short-Term Memory) networks were designed to solve a specific problem in machine learning:

Sequential dependency modeling.

Unlike traditional neural networks, LSTMs maintain memory over time.

This makes them theoretically ideal for:

speech recognition
language modeling
sequential forecasting
time-series prediction

Naturally, traders thought:

“Markets are sequential data too.”

And thus began the explosion of LSTM-based trading research.

The Core Promise of LSTM Trading Models

The appeal was obvious.

LSTMs can theoretically:

capture temporal dependencies
model nonlinear dynamics
remember historical patterns
adapt to changing sequences

This sounded perfect for financial markets.

Especially compared to:

linear regression
moving averages
traditional indicators

The Problem: Financial Markets Are Not Normal Time Series

This is where theory collides with reality.

Financial markets differ dramatically from structured sequential domains like language.

Markets are:

noisy
adversarial
regime-dependent
non-stationary
reflexive
heavily stochastic

This changes everything.

Why LSTMs Struggle With Financial Data

1. Financial Signals Are Weak

In natural language:

"The cat sat on the..."

The next word is highly predictable.

In markets:

Price moved up yesterday...

The next move is barely predictable.

Financial signal-to-noise ratio is extremely low.

This is devastating for deep learning models.

2. Markets Constantly Change Regime

LSTMs assume historical relationships remain somewhat meaningful.

But markets shift between:

trending periods
mean-reverting phases
volatility expansions
macroeconomic shocks
liquidity crises

Patterns decay rapidly.

This reduces the usefulness of long sequential memory.

3. LSTMs Require Huge Amounts of Data

Deep learning thrives on massive datasets.

Examples:

Even decades of market data are relatively small by deep learning standards.

And most financial datasets are highly autocorrelated and noisy.

4. Overfitting Becomes Extremely Dangerous

LSTMs are highly expressive models.

They can memorize historical noise incredibly well.

This creates:

beautiful backtests
terrible live performance

Many traders mistake memorization for prediction.

The Backtest Illusion

An LSTM can produce:

smooth equity curves
high historical Sharpe ratios
strong in-sample accuracy

while actually learning:

noise
random structure
data artifacts

Instead of genuine market edge.
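
One common guard against this illusion is to evaluate only on data that comes strictly after the training window. The sketch below is an illustration, not something the article prescribes: walk_forward_splits and its parameters are hypothetical names, and the scheme shown is an expanding-window split.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) where test always follows train."""
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * fold_size  # training window expands
        test_end = train_end + fold_size          # next unseen block
        yield range(0, train_end), range(train_end, test_end)


splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=60))
for train, test in splits:
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

A model that has only memorized noise tends to look good inside each training window and fall apart on the block that follows it, which is exactly the gap a shuffled split would hide.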

5. Financial Features Are Often Tabular, Not Sequential

This is a massive insight many beginners miss.

Most useful trading information comes from:

volatility
spreads
momentum
factor exposures
volume anomalies
macro features
options positioning

These are tabular features.

And tabular data is where gradient boosting dominates.

Why Gradient Boosting Often Wins

Now we reach the important part.

Models like XGBoost and LightGBM excel because they align better with the structure of financial data.

What Is Gradient Boosting?

Gradient boosting combines multiple weak decision trees into a strong predictive system.

Instead of learning sequential memory…

It learns:

nonlinear interactions
feature relationships
probabilistic splits

This works remarkably well in financial prediction.
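
To make the idea concrete, here is a from-scratch sketch of boosting with one-split trees (stumps) under squared error. It is a teaching toy under stated assumptions, not how XGBoost or LightGBM are implemented: the function names and the synthetic data are invented for illustration, and it assumes at least one feature has two distinct values.

```python
def fit_stump(X, residuals):
    """Find the single (feature, threshold) split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = sum((r - left_mean) ** 2 for r in left)
            error += sum((r - right_mean) ** 2 for r in right)
            if best is None or error < best[0]:
                best = (error, j, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_boosting(X, y, n_rounds, learning_rate=0.1):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [target - pred for target, pred in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [
            pred + learning_rate * predict_stump(stump, row)
            for pred, row in zip(predictions, X)
        ]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, row):
    total = sum(predict_stump(stump, row) for stump in model["stumps"])
    return model["base"] + model["learning_rate"] * total


def mse(model, X, y):
    return sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)


# Synthetic data: a nonlinear target in feature 0, feature 1 is irrelevant.
X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [row[0] ** 2 for row in X]

baseline = fit_boosting(X, y, n_rounds=0)   # constant prediction only
weak = fit_boosting(X, y, n_rounds=10)
strong = fit_boosting(X, y, n_rounds=100)
print(mse(baseline, X, y), mse(weak, X, y), mse(strong, X, y))
```

Each stump is a weak learner on its own. The training error falls because every round corrects a little of what the previous rounds left behind.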

Why XGBoost Became a Quant Favorite

Because it handles financial data characteristics extremely well.

1. Excellent With Tabular Data

Financial datasets are usually structured like:

Gradient boosting thrives in this environment.

LSTMs do not naturally excel here.

2. Better Performance With Smaller Datasets

XGBoost can produce strong results with relatively limited data.

This is critical in finance where:

data is expensive
clean labels are limited
signal is weak

3. Less Overfitting Risk

Compared to deep neural networks:

tree ensembles generalize better
regularization is stronger
training stability is higher

This improves out-of-sample robustness.

4. Faster Training and Iteration

LSTMs require:

GPU acceleration
hyperparameter tuning
sequence preparation
long training cycles

XGBoost trains dramatically faster.

This matters enormously in quant research.

5. Better Interpretability

Institutional traders need to understand:

Why is the model making decisions?

Gradient boosting allows:

feature importance analysis
SHAP values
split interpretation

LSTMs are often black boxes.

Real-World Quant Workflow

Many professional firms use pipelines like:

Market Data
      ↓
Feature Engineering
      ↓
XGBoost / LightGBM
      ↓
Probability Forecast
      ↓
Execution Engine

Not:

Raw Prices
      ↓
Massive LSTM
      ↓
Magic Predictions

That distinction matters.

When Gradient Boosting Dominates LSTM

Gradient boosting usually performs better when:

1. Features Matter More Than Sequences

If your edge comes from:

volatility structure
order flow imbalance
factor combinations
sentiment signals

boosting models are often superior.

2. Data Size Is Limited

Smaller datasets strongly favor boosting.

3. You Need Fast Research Cycles

Quant firms test thousands of hypotheses.

Training speed matters enormously.

4. Explainability Matters

Especially in institutional environments.

5. Prediction Horizon Is Short

For many short-term signals:

recent engineered features matter more than long memory

When LSTM Actually Makes Sense

Now the important nuance:

LSTMs are not useless.

They simply get overused.

LSTMs Work Better When:

1. Sequential Structure Truly Matters

Examples:

order book dynamics
tick-level flow
high-frequency sequences

2. Massive Data Exists

Examples:

alternative datasets
market microstructure data
cross-asset sequences

3. You Model Complex Temporal Relationships

Examples:

volatility forecasting
intraday regime transitions

4. Combined With Other Architectures

Modern quant systems increasingly use:

CNN + LSTM hybrids
transformers
attention models
ensemble systems

Rarely standalone LSTMs.

Why Many “AI Trading Gurus” Mislead Beginners

Because deep learning sounds sophisticated.

Saying:

“I built an LSTM stock predictor”

sounds more impressive than:

“I used XGBoost on engineered volatility features”

But sophistication does not equal predictive power.

The Hidden Secret of Quant Trading

The real edge usually comes from:

better features
cleaner data
superior execution
regime awareness
risk management

Not from using the most complicated model.

Feature Engineering Beats Model Complexity

This is one of the biggest lessons in financial machine learning.

A strong feature set with XGBoost often outperforms:

poorly engineered deep learning systems
raw-price LSTMs
overfit neural networks

Because financial prediction is primarily a:

Feature engineering problem.

Not an architecture problem.

The Institutional Perspective

Professional quant firms rarely obsess over one model.

Instead they focus on:

data infrastructure
signal stability
execution quality
robustness testing
probabilistic forecasting

Models are just one component.

The Rise of Hybrid Quant Systems

Modern trading systems increasingly combine:

gradient boosting
deep learning
regime detection
ensemble forecasting

This hybrid approach is far more realistic.

Example Hybrid Architecture

Technical Features
      ↓
Volatility Features
      ↓
Sentiment Features
      ↓
XGBoost Layer
      ↓
Meta-Model
      ↓
Execution System

This often performs better than pure LSTM systems.

The Biggest Misconception About AI Trading

Many beginners assume:

More complex AI = better trading performance

In reality:

More complexity often increases fragility

Especially in noisy financial environments.

Final Thoughts

LSTMs became popular in trading because they seemed perfectly aligned with financial time series.

But real markets are not clean sequential systems.

They are noisy, adaptive, adversarial environments with weak predictive structure.

And in those environments:

simpler models
stronger features
better validation methods

often outperform deep neural architectures.

Gradient boosting models like XGBoost succeed because they match the true structure of most financial datasets:

tabular
sparse
nonlinear
noisy

The lesson is not:

“Deep learning is bad.”

The lesson is:

The best model is the one that matches the actual nature of the data.

And in quantitative trading, that distinction is everything.


Why LSTM is overrated for price prediction and when Gradient Boosting beats it was originally published in InsiderFinance Wire on Medium, where people are continuing the conversation by highlighting and responding to this story.


@adlrocha - Google's ZKP-hidden quantum attack by adlrocha
Sunday April 5th, 2026 at 9:32 AM

@adlrocha Beyond The Code

This week started with a bang. Anthropic accidentally leaked the source code for Claude Code, and within hours someone had kicked off a clean-room rewrite in Python. The internet, understandably, caught fire, and it seemed like the perfect topic to write about this week. As there were still lots of threads open, and people trying to make sense of the code base, I decided to leave it for when the dust settles (that way I could read the code base myself to draw my own conclusions before rushing into writing anything).

Fortunately, amidst the noise of Claude Code’s leak, Google Quantum AI made a release (Google featuring this newsletter again) that didn’t get the attention that I think it deserved. It was the perfect excuse to write again in this newsletter about quantum computing.

I’ve been fascinated by quantum computing since I was first introduced to it (at the time, I even wrote a patent that leveraged quantum information to reach consensus in distributed networks, but I’ll spare you the details for now). Of all the new fancy technologies coming up these days, quantum computing is, to me, one of the hardest technology timelines to read. Since I started following and studying it closely, there’s been an enormous amount of hype, a few winters, a lot of exciting progress, and no immediate use case to show off yet.

I’ve been studying the technology on the side for years, but never worked on it professionally. My only hands-on experience with the technology has been through a few Qiskit hackathons many years ago (I guess the barriers were high). I’ve been meaning to go back and get hands-on time with something like IBM’s publicly available quantum systems just to recalibrate my intuition, but I never find the time or motivation. This paper made me feel more acutely the urgency of recovering this rusty skill.

The TL;DR of what Google dropped this week is a whitepaper claiming to reduce the quantum resources needed to break Bitcoin’s cryptography by roughly 20-fold. Cryptocurrencies and quantum computing… you can imagine how this topic took precedence over Claude Code’s leak.

Shor’s algorithm and the hard problem underneath ECDSA

Before we get to the papers, let’s set the stage so everyone (regardless of how much you know about the space) is on the same page. This means taking a quick trip into the cryptographic primitives that currently protect every Bitcoin and Ethereum transaction.

When you sign a transaction on Bitcoin or Ethereum, you’re using a cryptographic primitive called ECDSA: the Elliptic Curve Digital Signature Algorithm. The security of ECDSA rests entirely on one hard problem: the Elliptic Curve Discrete Logarithm Problem (ECDLP). Here’s a high-level intuition of what this problem is all about.

An elliptic curve over a finite field forms a specific algebraic structure: a prime-order cyclic group. You’ll see that this really matters when we discuss how it can be attacked by quantum computers. The group is generated by a single distinguished point G (the generator), and every element of the group can be written as k·G for some integer k. Your private key is that integer k. Your public key is Q = k·G, the generator point “multiplied” by your private key, where multiplication means repeatedly applying a specific point-addition rule defined by the curve’s geometry.
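
As a sketch of what “multiplying” the generator means, the following plain Python implements the point-addition rule and double-and-add on secp256k1. The domain parameters are the curve’s public constants. The private key below is a made-up small number, and the code is for illustration only (it is neither constant-time nor hardened).

```python
# secp256k1 public domain parameters: curve y^2 = x^3 + 7 over the field mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)
INFINITY = None  # identity element of the group


def point_add(p1, p2):
    """Add two curve points using the chord-and-tangent rule."""
    if p1 is INFINITY:
        return p2
    if p2 is INFINITY:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return INFINITY  # p2 is the inverse of p1
    if p1 == p2:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P      # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P     # chord line
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add: compute k * point by scanning the bits of k."""
    result = INFINITY
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def on_curve(point):
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


private_key = 123456789  # hypothetical toy key; real keys are random 256-bit integers
public_key = scalar_mult(private_key, G)
print(hex(public_key[0]))
```

Going from private_key to public_key takes a few hundred point additions. Nothing in this code offers a comparably short route back from public_key to private_key.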

Given Q and G, recovering k classically (meaning with our current computing systems) requires roughly 2^128 operations on Bitcoin’s curve (secp256k1), even with the best known attacks. That’s a few hundred undecillion operations: at a billion operations per second it would take on the order of 10^22 years, hundreds of billions of times the age of the universe. The problem is hard in one direction only. Computing Q from k is instant. The reverse is infeasible. This asymmetry is what cryptographers call a hard problem, and it is why such problems are so appealing as building blocks for cryptographic primitives.
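
The scale of 2^128 is easy to check with a few lines of arithmetic. The age of the universe used below (about 13.8 billion years) is an approximate figure assumed for the comparison.

```python
OPERATIONS = 2 ** 128                    # work quoted in the text for secp256k1
OPS_PER_SECOND = 10 ** 9                 # one billion operations per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10          # approximate, assumed for illustration

years = OPERATIONS / OPS_PER_SECOND / SECONDS_PER_YEAR
ratio = years / AGE_OF_UNIVERSE_YEARS

print(f"{OPERATIONS:.2e} operations")
print(f"{years:.2e} years at a billion operations per second")
print(f"{ratio:.2e} times the age of the universe")
```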

Remember my post a few months ago about complexity theory and P=NP? This has a lot to do with that. Cryptographic primitives are built on the assumed hardness of certain problems. Technically, ECDLP sits in NP∩co-NP; it’s not known to be NP-hard in the strict complexity-theoretic sense, and most cryptographers believe it isn’t. It isn’t known to be in P either. Another hard problem commonly used for cryptographic primitives is integer factorisation, the hard problem underlying, for instance, RSA, which sits in exactly the same class: NP∩co-NP, not NP-complete, not known to be efficiently solvable. Both problems are “believed hard” without being provably hard in the complexity-theoretic sense.

Both problems resist classical attacks for the same reason: no efficient algorithm has been found after decades. And here is where Shor’s famous algorithm enters the scene.

Shor’s algorithm, published in 1994, exploits the cyclic structure of the group. Rather than brute-forcing the keyspace, it uses quantum Fourier transforms and period-finding on the multiplicative structure of the group to extract k from Q in polynomial time. The precise gate complexity is approximately O(n² log n log log n) in the bit-length n of the key (often cited as O(n²) for shorthand), though the full form matters when you’re counting Toffoli gates against a hardware budget. Toffoli gates are the quantum equivalent of a controlled-controlled-NOT, used to implement AND operations reversibly. Think of them as the universal reversible gate of quantum computing; they will be important when we discuss the contributions of the papers released. For a 256-bit key, that’s tractable, if you have a sufficiently large quantum computer.
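
The periodic structure that Shor’s algorithm exploits can be seen classically in a toy group. The sketch below uses a hypothetical stand-in, the order-11 subgroup generated by 2 modulo 23, instead of an elliptic curve. It finds the hidden relation by brute-force search, which is precisely the step the quantum Fourier transform replaces on real key sizes.

```python
p, r, g = 23, 11, 2      # toy group: g has prime order r modulo p
k_secret = 7             # the "private key" we pretend not to know
h = pow(g, k_secret, p)  # the "public key"


def f(a, b):
    """Two-variable function whose hidden period encodes the secret."""
    return (pow(g, a, p) * pow(h, b, p)) % p


def recover_k():
    # f(a, b) = g^(a + k*b), so f(a, b) == 1 means a + k*b = 0 (mod r).
    for a in range(r):
        for b in range(1, r):
            if f(a, b) == 1:
                return (-a * pow(b, -1, r)) % r
    return None


print(recover_k())
```

The function f repeats whenever (a, b) is shifted by (k, -1), so learning that shift is the same as learning k. The search above costs about r² evaluations, which is hopeless for a 256-bit r.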

The question has always been: how large is “sufficiently large”? I think you see where I am going with this. The papers released this week seem to have changed our existing intuitions about how many qubits are needed for Shor’s algorithm to break our existing cryptography.

The two papers released

The two papers that dropped this week have made some experts reevaluate their timelines for the security of blockchain systems that haven’t adopted post-quantum cryptography:

The Google Quantum AI whitepaper, “Securing Elliptic Curve Cryptocurrencies against Quantum Vulnerabilities: Resource Estimates and Mitigations”. Authored by Ryan Babbush and Craig Gidney at Google Quantum AI, alongside Thiago Bergamaschi (UC Berkeley), Justin Drake from the Ethereum Foundation, and Dan Boneh from Stanford. Google also published a blog post on the responsible disclosure methodology.

Let me give you some background on some of the authors so you can frame this contribution within the state of the art. Justin Drake is one of the primary researchers at the Ethereum Foundation responsible for Ethereum’s data-availability roadmap; he was a key architect behind EIP-4844 and the KZG trusted setup ceremony. Dan Boneh is a professor of computer science at Stanford, co-director of the Stanford Security Lab, and co-author of the most widely used applied cryptography textbook in the field. His free online cryptography course has been taken by over half a million people, and some of his papers were key for the development of Filecoin (another one that hits home). Finally, Craig Gidney has been responsible for a lot of the recent progress in the intersection of quantum and AI. He published a paper in May 2025 showing RSA-2048 breakable with under 1 million physical qubits, down from 20 million in his own 2019 estimate. You can imagine the weight that claims from these people can have in their respective fields.

On the other hand, the Oratomic paper, “Shor’s algorithm is possible with as few as 10,000 reconfigurable atomic qubits”, comes from Oratomic, a neutral-atom quantum computing company out of Pasadena, with John Preskill (Caltech) and Dolev Bluvstein as co-authors. Crucially, the Google whitepaper cites the Oratomic circuits as its own input: the two papers are cross-linked and share the same circuit design.

The papers present two circuit variants for attacking secp256k1:

Circuit 1: ≤1,200 logical qubits, ≤90 million Toffoli gates
Circuit 2: ≤1,450 logical qubits, ≤70 million Toffoli gates

Translated to physical hardware using surface codes on a superconducting architecture (planar degree-4 connectivity, consistent with Google’s Willow-class chips): fewer than 500,000 physical qubits. The previous best estimate, Litinski (2023), put this at roughly 9 million physical qubits. Google just moved that needle by nearly 20-fold.
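
A quick back-of-the-envelope check of those figures. The per-logical-qubit budget printed below is derived here by simple division and is not a number from the papers. It folds in everything the physical qubits are used for (error correction, factories, routing), so treat it only as a rough upper bound on the average.

```python
LITINSKI_2023_PHYSICAL = 9_000_000   # previous estimate quoted in the text
NEW_PHYSICAL_BOUND = 500_000         # "fewer than 500,000" from the text

# (logical qubits, Toffoli gates) upper bounds for the two circuit variants.
circuits = {
    "Circuit 1": (1_200, 90_000_000),
    "Circuit 2": (1_450, 70_000_000),
}

reduction = LITINSKI_2023_PHYSICAL / NEW_PHYSICAL_BOUND
print(f"reduction vs. Litinski (2023): about {reduction:.0f}x")

summary = {}
for name, (logical, toffolis) in circuits.items():
    summary[name] = {
        "physical_per_logical": NEW_PHYSICAL_BOUND / logical,
        "qubit_gate_product": logical * toffolis,
    }
    print(name, summary[name])
```

The two variants trade qubits for gates: Circuit 2 uses more logical qubits but fewer Toffoli gates, and its qubit-gate product comes out slightly smaller.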

That reduction didn’t come from a hardware breakthrough; it came from a better circuit. Running Shor’s on ECDLP isn’t just “run the algorithm” (this is something I learnt the hard way the first time I was tinkering with Qiskit and IBM’s quantum computers). The core computation is elliptic curve point multiplication, computing k·G with arithmetic on secp256k1, which Shor’s algorithm needs to evaluate in quantum superposition as part of its period-finding routine. That means implementing modular arithmetic (specifically Montgomery multiplication, the standard technique for efficient modular operations) entirely in reversible quantum gates.
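
For readers who have not met Montgomery multiplication, here is a classical sketch of it in Python. It shows only the arithmetic trick (replacing division by the modulus with shifts and masks), not the reversible quantum circuit from the papers. The function names and test operands are illustrative choices.

```python
def montgomery_setup(modulus, bits):
    """Precompute R = 2^bits and N' = -modulus^(-1) mod R."""
    r = 1 << bits
    assert modulus % 2 == 1 and modulus < r
    n_prime = (-pow(modulus, -1, r)) % r
    return r, n_prime


def redc(t, modulus, bits, n_prime):
    """Montgomery reduction: return t * R^(-1) mod modulus for 0 <= t < R * modulus."""
    mask = (1 << bits) - 1
    m = ((t & mask) * n_prime) & mask   # chosen so t + m * modulus is divisible by R
    u = (t + m * modulus) >> bits       # exact division by R, done as a shift
    return u - modulus if u >= modulus else u


def montgomery_multiply(a, b, modulus, bits=256):
    """Compute a * b mod modulus by going through Montgomery form."""
    r, n_prime = montgomery_setup(modulus, bits)
    a_bar = (a * r) % modulus           # convert operands to Montgomery form
    b_bar = (b * r) % modulus
    product_bar = redc(a_bar * b_bar, modulus, bits, n_prime)
    return redc(product_bar, modulus, bits, n_prime)  # convert back


P = 2**256 - 2**32 - 977  # the secp256k1 field prime
a = 0x1234567890ABCDEF1234567890ABCDEF
b = 0xFEDCBA0987654321FEDCBA0987654321
print(montgomery_multiply(a, b, P) == (a * b) % P)
```

The conversions into and out of Montgomery form cost ordinary reductions. The method pays off when many multiplications are chained, as in scalar multiplication, because the operands can stay in Montgomery form throughout.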

Every classical arithmetic operation has to be “uncomputed” after use to avoid accumulating garbage qubits that would corrupt the superposition. The dominant cost is Toffoli gates, and there are hundreds of millions of them in a naively constructed circuit.
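
The compute, use, uncompute pattern can be illustrated on classical bits, since a Toffoli gate acts on basis states exactly like a reversible AND. This is a sketch on plain bit lists, not a quantum simulation; the register layout is a hypothetical example.

```python
from itertools import product


def toffoli(bits, control_a, control_b, target):
    """Flip bits[target] when both control bits are 1 (a reversible AND)."""
    out = list(bits)
    out[target] ^= out[control_a] & out[control_b]
    return out


def run(circuit, bits):
    """Apply a list of (control_a, control_b, target) Toffoli gates in order."""
    for gate in circuit:
        bits = toffoli(bits, *gate)
    return bits


# Register layout: [a, b, c, ancilla, out]. Goal: out = a AND b AND c.
compute = [(0, 1, 3)]                 # ancilla ^= a AND b
use = [(3, 2, 4)]                     # out ^= ancilla AND c
uncompute = list(reversed(compute))   # a Toffoli gate is its own inverse
circuit = compute + use + uncompute

for a, b, c in product([0, 1], repeat=3):
    final = run(circuit, [a, b, c, 0, 0])
    print((a, b, c), "-> ancilla", final[3], "out", final[4])
```

The ancilla returns to 0 for every input, so it carries no leftover information, but cleaning it up cost an extra Toffoli gate. That overhead, repeated across a full arithmetic circuit, is why gate counts climb so quickly.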

Prior work optimised either qubit count or gate count, but not both simultaneously. The relevant figure of merit for real hardware is spacetime volume, i.e. the product of qubits × gates × cycle time, because that’s what determines wall-clock runtime on an actual machine.

Google’s contribution is a circuit that achieves the best spacetime volume ever published for ECDLP-256, through two main improvements. First, they applied improved windowing to Montgomery multiplication: rather than processing one bit of the scalar at a time, they process wider windows, amortising the Toffoli cost across more bits per round, reducing the total gate count substantially.
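
A classical analogue makes the windowing idea easier to picture. The sketch below uses modular exponentiation as a stand-in for scalar multiplication and counts one multiply step per bit versus one table lookup per window. It is an illustration of the general technique, not the circuit from the whitepaper, and the function names and the window width of 4 are arbitrary choices.

```python
def pow_bitwise(base, exponent, modulus, bits):
    """One (conditional) multiply step per bit of the exponent."""
    result, steps = 1, 0
    for i in reversed(range(bits)):
        result = result * result % modulus
        # A circuit must lay down a controlled multiply for every bit,
        # whether or not that bit happens to be set.
        steps += 1
        if (exponent >> i) & 1:
            result = result * base % modulus
    return result, steps


def pow_windowed(base, exponent, modulus, bits, window):
    """One table lookup and multiply per window of `window` bits."""
    table = [pow(base, j, modulus) for j in range(1 << window)]
    n_windows = -(-bits // window)  # ceiling division
    result, steps = 1, 0
    for w in reversed(range(n_windows)):
        for _ in range(window):
            result = result * result % modulus
        chunk = (exponent >> (w * window)) & ((1 << window) - 1)
        result = result * table[chunk] % modulus
        steps += 1
    return result, steps


modulus = 2**256 - 2**32 - 977
base = 5
exponent = (1 << 255) + 123456789

bitwise_value, bitwise_steps = pow_bitwise(base, exponent, modulus, 256)
windowed_value, windowed_steps = pow_windowed(base, exponent, modulus, 256, 4)
print(bitwise_steps, windowed_steps)
```

The price of fewer multiply steps is the lookup table, which grows as 2^window. Choosing the window width is therefore a trade-off rather than a free win.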

Second, they revised the T-state factory overhead: magic state distillation (the process for producing the high-fidelity ancilla states that Toffoli gates consume) is the dominant physical qubit cost in any surface-code implementation, and prior estimates were conservative. More careful accounting of distillation factory layout and scheduling cut the physical qubit estimate significantly. The combination brought the spacetime volume down far enough to cut the physical qubit requirement by nearly 20-fold relative to Litinski 2023, and Litinski 2023 had already improved substantially on everything before it.

But before going any further, I think it is worth stressing the distinction between theoretical, logical, and physical qubits and why it matters. Theoretical qubits are what algorithms assume: perfect, noiseless two-state quantum systems. Logical qubits are error-corrected abstractions built from many physical qubits using a quantum error-correcting code, typically a surface code. (I have to admit that, as someone who loves information theory, I find this field of error-corrected qubits fascinating. I actually leveraged some of these error-correction algorithms for my patent.)

Physical qubits are the actual noisy hardware. Today’s devices operate at error rates around 10^-3 per gate, which means you need roughly 1,000 physical qubits to sustain one reliable logical qubit. The overhead varies by architecture and target error rate, but it’s the dominant cost in any near-term hardware plan.

To put the current state in perspective: Google’s Willow chip has 105 physical qubits. IBM’s Condor processor reached 1,121 qubits in late 2023, the largest superconducting qubit count to date, though not all at useful error rates. The gap between today and an error-corrected machine with 500,000 physical qubits is still enormous. But the conceptual threshold has moved, and it’s moved faster than almost anyone expected.

The two papers cover different hardware architectures, and the distinction matters. Superconducting qubits, the technology behind Google Willow and IBM’s quantum systems, encode quantum information in tiny circuits cooled near absolute zero (i.e., close to 0 kelvin), where electrical resistance vanishes and quantum effects dominate. Gate operations run in nanoseconds to microseconds. Neutral-atom architectures, like those used by Oratomic, trap individual atoms using focused laser beams and manipulate their quantum states optically. They achieve extremely long coherence times and flexible qubit connectivity, but gate operations are around 1,000x slower. Ion trap systems (IonQ, Quantinuum) work on similar principles: individual ions levitated in electromagnetic fields and controlled with lasers. IonQ’s Forte system currently achieves around 29 “algorithmic qubits”, roughly the effective logical qubit count after accounting for noise. The Oratomic team reported 6,100 coherent atomic qubits trapped, with fault-tolerant operations demonstrated below the error threshold on around 500 qubits.

The Oratomic result is the more striking one in raw qubit count: the same computation runs with as few as 10,000–26,000 qubits on neutral-atom hardware. The catch: at current clock speeds (around 1ms/cycle), runtime is close to 10 days, not minutes. That limits the attack to at-rest targets, long-dormant wallets that have been sitting on-chain for years, not live transaction interception.

That clock speed difference is one of the genuinely novel framings in these papers. Superconducting hardware runs gate cycles in microseconds; neutral atoms and ion traps are 100–1,000x slower. This determines which kind of attack is feasible. The papers define three categories: on-spend (race Bitcoin’s block clock before the transaction confirms), at-rest (target publicly exposed keys on dormant wallets), and on-setup (recover secrets from one-time cryptographic ceremonies like KZG). Fast-clock architectures enable on-spend. Slow-clock ones are limited to the other two.
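The arithmetic behind the clock-speed argument is simple enough to sketch. The snippet below backs out the cycle count implied by "about 10 days at about 1 ms per cycle" and then asks how long the same number of cycles would take on faster clocks. Holding the cycle count fixed across architectures is my simplifying assumption (the real circuits differ per platform), and the faster cycle times are hypothetical illustration values.

```python
SECONDS_PER_DAY = 86_400


def implied_cycles(runtime_days: float, cycle_time_s: float) -> float:
    """Number of sequential cycles implied by a total runtime and a cycle time."""
    return runtime_days * SECONDS_PER_DAY / cycle_time_s


def runtime_hours(cycles: float, cycle_time_s: float) -> float:
    """Wall-clock hours needed to run `cycles` sequential cycles."""
    return cycles * cycle_time_s / 3600


# Figures quoted above: about 10 days at about 1 ms per cycle.
cycles = implied_cycles(runtime_days=10, cycle_time_s=1e-3)

# Assumption: the same cycle count at faster, hypothetical clock speeds.
for label, cycle_time_s in [("1 ms", 1e-3), ("10 us", 1e-5), ("1 us", 1e-6)]:
    print(f"{label:>5} per cycle -> {runtime_hours(cycles, cycle_time_s):8.2f} hours")
```

A 1,000x faster clock turns days into minutes, which is the whole difference between an at-rest attack and an on-spend one.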

The ZKP disclosure 😱

Here’s the part that really blew my mind about Google’s whitepaper (and that I think justifies even more having Justin Drake and Dan Boneh around for the paper). Google did not publish the attack circuits. Instead, they published a zero-knowledge proof that the circuits work.

The attack circuit, a sequence of quantum gate operations implementing Shor’s algorithm for secp256k1, was written as ordinary Rust code using a quantum circuit library that models qubits, gates (Hadamard, CNOT, Toffoli, phase rotation), and multi-qubit arithmetic operations. The program encodes the Montgomery modular multiplication routine at the core of the elliptic curve group arithmetic, the quantum Fourier transform used for period extraction, and the bookkeeping that wires those components into a complete Shor’s instance for ECDLP-256. The circuit itself is a classical description of a quantum computation, a directed graph of gate operations to be executed on hardware. It’s the blueprint, not the machine. (Side note: for those of you who have never seen one, the circuit in the image is the classic implementation of Shor’s algorithm.)

That Rust program was then fed into SP1, a zero-knowledge virtual machine built by Succinct Labs which targets the RISC-V architecture. For those unfamiliar with ZK-VMs, SP1 compiles Rust to RISC-V bytecode (using the standard RISC-V target), and then generates a cryptographic proof, specifically a STARK-based proof, that a given RISC-V program was executed correctly on specific inputs and produced a specific output. You get a proof of correct execution without anyone needing to see the program or the inputs.

In this case: Google ran the circuit program against 9,000 randomly sampled secp256k1 input points, verified that the circuit correctly performs the elliptic curve operations it claims to, and had SP1 generate a proof of that execution. The SHA-256 hash of the circuit was committed publicly so anyone can verify they’re talking about the same circuit. The SP1 proof attests: “this hash corresponds to a program that, when run on these inputs, produces these outputs consistently with a correct Shor’s implementation for ECDLP-256.”
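The hash commitment is the easy part to picture. Here is a minimal sketch, assuming the circuit is just a blob of bytes: publish its SHA-256 digest now, and anyone who is later handed the circuit can check it is the same one. The `secret_circuit` bytes are a made-up placeholder, not the real artefact, and the function names are mine.

```python
import hashlib
import hmac


def commit(circuit_bytes: bytes) -> str:
    """Return the public commitment: the SHA-256 hex digest of the circuit."""
    return hashlib.sha256(circuit_bytes).hexdigest()


def matches_commitment(candidate: bytes, commitment: str) -> bool:
    """Check whether a revealed circuit hashes to the published commitment."""
    return hmac.compare_digest(commit(candidate), commitment)


# Hypothetical stand-in for the withheld circuit description.
secret_circuit = b"placeholder circuit description"
published = commit(secret_circuit)

print(published)
print(matches_commitment(secret_circuit, published))          # same bytes
print(matches_commitment(secret_circuit + b"!", published))   # tampered bytes
```

On its own the hash says nothing about whether the circuit is correct. It only pins down which circuit is being talked about; the claim that the committed program behaves as advertised is what the SP1 proof adds on top.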

The inner SP1 proof is a STARK. STARKs have no trusted setup, but they’re large, hundreds of kilobytes to megabytes. So SP1 wraps the STARK in an outer Groth16 SNARK. Groth16 takes the STARK proof as a statement to be proved and generates a compact proof of it: roughly 200 bytes, regardless of the complexity of the original computation. The final artefact, code and proof, sits on Zenodo. Anyone can download it and verify Groth16’s 200-byte proof in milliseconds, without ever seeing the attack circuit.

What this means practically: the existence and correctness of the attack is publicly verifiable. The attack tool itself is not.

This is a genuinely new move in responsible disclosure. The standard practice for software vulnerabilities is to notify the vendor, wait a window, then publish. But there’s no vendor to notify here, no patch to deploy in 90 days. So Google found a different answer: prove the result is real, withhold the exploit.

Here’s where it gets funny, or uncomfortable, depending on your perspective. Groth16 is itself an elliptic curve construction. It operates over BN254, a pairing-friendly curve distinct from secp256k1, but it is still fundamentally an elliptic curve scheme. The pairings that make Groth16 work rely on the same class of hard problems, discrete logarithms on elliptic curves, that Shor’s algorithm can break. So Google used a cryptographic primitive that is also eventually threatened by sufficiently powerful quantum computers to prove the existence of the circuit that threatens elliptic curve cryptography. If CRQCs (Cryptographically Relevant Quantum Computers, the term the whitepaper uses for machines capable of running these attacks) ever arrive at scale, Groth16 and the broader ZKP ecosystem go down with the rest.

I don’t know if that’s elegant or just funny. Probably both.

But what is even crazier to me is that this could eventually become the standard model for future research and proprietary algorithms, where companies and researchers can show that “their algorithms do what they claim to be doing” without leaking anything about the underlying implementation. That’s enough for a post of its own. I’ve been saying it for a while, but ZKP primitives can have immediate use outside of blockchain networks and web3.

Post-quantum cryptography: what exists, what migration looks like

To understand why certain cryptographic schemes survive a quantum computer and others don’t, we need to understand why Shor’s algorithm works in the first place.

Shor’s algorithm is a period-finding machine. It exploits the fact that ECDLP and integer factorisation both reduce to finding the period of a function defined over a cyclic algebraic group. Quantum Fourier transforms make period-finding tractable on cyclic structures, and that’s the attack. The quantum speedup isn’t general; it’s specific to problems with this periodic structure. If you pick a hard problem that doesn’t have it, Shor’s doesn’t help.
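To make "period-finding" concrete, here is a classical toy for the factoring case. It brute-forces the period of a^x mod N, which is exactly the step a quantum computer speeds up, and then applies the standard classical post-processing to turn that period into factors. This is an illustration on tiny numbers, not the circuit from the papers, and the ECDLP variant (which works over the curve group) is not shown.

```python
from math import gcd


def find_period(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1, found by brute force.

    This loop is the exponentially slow classical step that the quantum
    Fourier transform replaces in Shor's algorithm.
    """
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n")
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_from_period(a: int, n: int):
    """Classical post-processing: turn the period of a mod n into factors of n."""
    r = find_period(a, n)
    if r % 2 == 1:
        return None  # odd period: retry with a different base a
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None  # trivial square root of 1: retry with a different base a
    return gcd(half - 1, n), gcd(half + 1, n)


print(find_period(7, 15))         # 4
print(factor_from_period(7, 15))  # (3, 5)
```

The powers of 7 mod 15 cycle through 7, 4, 13, 1, so the period is 4, and gcd(7² ± 1, 15) gives 3 and 5. Everything hard lives in `find_period`; the rest is cheap.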

That’s exactly what post-quantum cryptography does.

Lattice problems, specifically the Shortest Vector Problem (SVP) and its structured variant, Module Learning With Errors (MLWE), ask you to find the shortest non-zero vector in a high-dimensional lattice, or to distinguish a structured equation system from a random one. Neither problem has a cyclic group structure Shor’s can exploit. The best known quantum algorithm for SVP offers only a polynomial speedup over classical approaches, not the exponential gap that Shor’s gives against ECDLP.

SVP is NP-hard in the worst case, and lattice cryptography has an elegant property: worst-case hardness reduces to average-case hardness, which makes the security proofs unusually strong. The specific structured variants used in practice (MLWE, MSIS) sit slightly off the worst-case problem, so ongoing cryptanalysis remains active, but no quantum attack comes close to breaking them.
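For intuition on "distinguish a structured equation system from a random one", here is a toy Learning With Errors instance. It is plain LWE, not the module variant used in the standards, with a deliberately tiny and insecure dimension. The modulus 3329 is borrowed from ML-KEM only for flavour; the dimension, noise bound, and function names are my own choices.

```python
import random


def lwe_samples(secret, n_samples, q, noise_bound, rng):
    """Return pairs (a, b) with b = <a, secret> + e mod q for small noise e."""
    samples = []
    for _ in range(n_samples):
        a = [rng.randrange(q) for _ in secret]
        e = rng.randint(-noise_bound, noise_bound)
        b = (sum(ai * si for ai, si in zip(a, secret)) + e) % q
        samples.append((a, b))
    return samples


def centered(x, q):
    """Map x mod q to its representative closest to zero."""
    x %= q
    return x - q if x > q // 2 else x


rng = random.Random(0)   # fixed seed so the sketch is repeatable
q = 3329                 # modulus borrowed from ML-KEM
secret = [rng.randrange(q) for _ in range(8)]   # toy dimension, not secure
samples = lwe_samples(secret, n_samples=5, q=q, noise_bound=2, rng=rng)

# With the secret, the noise is easy to strip out and is visibly small.
residuals = [
    centered(b - sum(ai * si for ai, si in zip(a, secret)), q)
    for a, b in samples
]
print(residuals)
```

Whoever holds the secret sees tiny residuals. Without it, each `b` is meant to look like a uniformly random number mod q, and there is no period for Shor’s to latch onto.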

Hash-based problems rest on collision resistance alone. There is no algebraic structure, no group, no lattice. If SHA-256 or SHAKE-256 resist collision attacks, and there’s no known quantum or classical attack that breaks them, the scheme is secure. Grover’s algorithm gives a quadratic speedup for unstructured search, which halves the effective security level (256-bit security becomes 128-bit), but doubling the output size restores it. That’s a parameter choice, not a structural break.
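The simplest way to see a signature that rests on nothing but a hash function is a Lamport one-time signature. To be clear, this is not SLH-DSA, which is far more elaborate and can sign many messages; it is a minimal sketch of the underlying idea, and each key pair here must only ever sign one message.

```python
import hashlib
import secrets


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """256 pairs of random secrets; the public key is their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(s0), H(s1)) for s0, s1 in sk]
    return sk, pk


def message_bits(message: bytes):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, message: bytes):
    """Reveal one secret per digest bit. The key must never be reused."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    bits = message_bits(message)
    return all(H(s) == pk[i][bit] for i, (s, bit) in enumerate(zip(signature, bits)))


sk, pk = keygen()
sig = sign(sk, b"pay 1 BTC to Alice")
print(verify(pk, b"pay 1 BTC to Alice", sig))    # True
print(verify(pk, b"pay 1 BTC to Mallory", sig))  # False
```

Forging a signature means inverting SHA-256 on the unrevealed halves of the key. There is no group and no lattice for a quantum computer to exploit, only Grover’s quadratic speedup on the search.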

Code-based problems, specifically the Syndrome Decoding Problem (SDP), ask you to find a codeword in a random linear error-correcting code given a corrupted version. Berlekamp, McEliece, and van Tilborg showed in 1978 that SDP is NP-complete in the worst case. No quantum speedup beyond polynomial is known. The cost has historically been large key sizes (around 1MB for McEliece-based schemes), but newer constructions have reduced this substantially.

The NIST post-quantum standards (i.e. the schemes NIST has accepted so far) are a portfolio of bets across those three problem families:

ML-KEM (FIPS 203), key encapsulation, formerly CRYSTALS-Kyber. Lattice-based (MLWE). FIPS-finalised, production-ready.
ML-DSA / Dilithium (FIPS 204), digital signatures. Lattice-based (MLWE/MSIS). Signature size: ~2.5KB. FIPS-finalised, production-ready.
SLH-DSA / SPHINCS+ (FIPS 205), stateless hash-based signatures. Signature size: ~8KB. FIPS-finalised. Heavy, but the most conservative security assumption available.
HQC, key encapsulation, selected in March 2025 as the fifth algorithm in NIST’s post-quantum portfolio, full standard expected 2027. Code-based (syndrome decoding). Smaller keys than McEliece.

So why not migrate to these primitives immediately? The main issue is the size of the keys, which can break a lot of assumptions in some systems (including blockchain networks). Post-quantum keys can be 100-fold larger than existing ECDSA and even RSA keys.
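A quick back-of-the-envelope makes the size problem tangible. The sketch below uses the approximate signature sizes from the list above (~2.5KB and ~8KB) and compares them with a 64-byte ECDSA signature. The 64-byte baseline is my assumption (a compact r, s encoding on secp256k1), and I compare signatures rather than keys simply because those are the figures quoted here.

```python
ECDSA_SIGNATURE_BYTES = 64  # assumption: compact (r, s) encoding on secp256k1

# Approximate signature sizes as quoted in the list above.
SIGNATURE_BYTES = {
    "ML-DSA": 2_500,
    "SLH-DSA": 8_000,
}


def blowup(pq_bytes: int, baseline: int = ECDSA_SIGNATURE_BYTES) -> float:
    """How many times larger a post-quantum signature is than the baseline."""
    return pq_bytes / baseline


for name, size in SIGNATURE_BYTES.items():
    print(f"{name}: ~{size} bytes, about {blowup(size):.0f}x an ECDSA signature")
```

Roughly 39x for ML-DSA and 125x for SLH-DSA under these assumptions. For a system that pays for every byte it stores or gossips, as blockchains do, that is not a drop-in replacement.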

Has the timeline really changed?

What about all of these claims, and the statement in Google’s paper about this discovery making them “reevaluate” current quantum supremacy timelines? My immediate answer would be, “who knows?”

Here’s one thing that I think some people may be missing when reading these results: the dramatic reduction in resource counts is real, but the practical problem is not about how many qubits you need on paper. It’s about whether you can build qubits good enough to make those counts mean anything.

The Google whitepaper assumes a physical gate error rate of 10^-3 sustained uniformly across all qubits. That’s the modelling assumption. Where is hardware today?

The state of the art, as of 2024, is two-qubit gate fidelity of ~99.9%, which is exactly 10^-3. Multiple groups have now reported this number, including Google with Willow. So you might conclude the assumption is already met. Scott Aaronson (you probably remember him as being my favourite computer scientist alive :) ), who has been tracking this more carefully than most, made exactly this point in September 2024:

“Within the past year, multiple groups have reported 99.9% [two-qubit gate fidelity]. I’m now more optimistic than I’ve ever been that, if things continue at the current rate, either there are useful fault-tolerant QCs in the next decade, or else something surprising happens to stop that.”

But he also noted that 99.99%, a full order of magnitude better, is what you really need for sustained fault-tolerant operation where error correction delivers a net gain rather than just breaking even. That threshold hasn’t been reached.

There’s a version of the coverage that reads these papers as evidence the timeline itself has shortened. I don’t think that’s right, and the distinction matters. What these papers changed is the target: the number of qubits and gates required on paper to run the attack. What they didn’t change is the distance to that target, which is determined entirely by hardware, and hardware hasn’t moved much this past month. The Willow chip had the same error rates the day after the whitepaper dropped as it did the day before. A more efficient attack circuit doesn’t build better qubits. It lowers the bar you need to clear, but if you can’t clear the bar yet, lowering it isn’t the same as getting closer.

More critically: those fidelity numbers are measured on the best qubit pairs on a 100-qubit chip under carefully optimised conditions. Nobody has demonstrated 99.9% gate fidelity sustained uniformly across a million physical qubits.

Google’s own Willow error correction paper, the paper that demonstrated below-threshold surface code performance for the first time, achieved that milestone on 101 physical qubits. The target for a cryptographically relevant attack is somewhere between 500,000 and 1 million. The Willow paper itself notes that logical performance is limited by rare correlated error events, roughly once per hour, that fall outside the standard noise model fault-tolerance proofs assume. At million-qubit scale, the frequency and character of those events is unknown.

Then there’s inter-chip communication. Gidney’s estimates assume a planar grid of qubits with nearest-neighbour connectivity. At the million-qubit scale, that means stitching together many chips into a coherent quantum system, something that has not been demonstrated anywhere. Aaronson again: “eventually you’ll need communication of qubits between chips, which has yet to be demonstrated.”

There’s still a sentence near the end of the whitepaper that I think frames the risk correctly:

“It is conceivable that the existence of early CRQCs may first be detected on the blockchain rather than announced.”

That’s the authors acknowledging a tail scenario the “Nassim Taleb-way”: a nation-state or well-funded private effort builds this quietly, and the first public evidence of success is unexplained large wallet drains on-chain (my good friend Marko Vukolic always said that Bitcoin and Satoshi’s wallet was the biggest quantum computing bounty available, so this claim adds up).

So the honest position is: the resource count dropped dramatically, and that matters. But the real question for the timeline isn’t how many qubits you need on paper, it’s whether anyone can build a million qubits that are actually good enough.

We’ll have to wait and see… Until next week!


Which one are you? 😆 I am definitely feeling the tea + anxiety one right now.
Thursday October 9^th, 2025 at 11:55 PM

chibird


Short Walk by Doug
Tuesday September 16^th, 2025 at 1:21 PM

Savage Chickens – Cartoons on Sticky Notes by Doug Savage

Short Walk

And so begins my annual week of pirate chickens, leading up to September 19th’s Talk Like A Pirate Day!


David Splinter on how much tax billionaires pay by Tyler Cowen
Tuesday August 26^th, 2025 at 8:39 AM

Marginal REVOLUTION

Here is his comment on the paper presented here:

Summary: The U.S. tax system is highly progressive. Effective tax rates increase from 2% for the bottom quintile of income to 45% for the top hundredth of one percent. But rates may be lower among those with the highest wealth. This comment starts with the “top 400” tax rate estimates by wealth in Balkir, Saez, Yagan, and Zucman (2025, BSYZ), and adjusts these to account for Forbes family wealth being spread across multiple tax returns, to avoid double-counting capital income, to include missing taxes, and to apply standard tax and income definitions. This results in “top 400” effective tax rates exceeding overall tax rates by 13 percentage points. Still, the “top 400” tax rate is lower than for the top hundredth of one percent, suggesting a modest decline in effective tax rates at the very top when ranking by wealth. However, this is an unsurprising deviation from progressive rates because the tax system targets income, not wealth. Compared to the annual estimates in BSYZ, longer-run estimates are more appropriate for top wealth groups, which have volatile wealth and concentrate charitable giving into end-of-life bequests. End-of-life giving suggests long-run top 400 effective tax-and-giving rates could exceed 75%.

The full link.

The post David Splinter on how much tax billionaires pay appeared first on Marginal REVOLUTION.

