Characterizing Linear Alignment Across Language Models, by Matt Gorbett and 1 other author
Abstract: Language models increasingly appear to learn similar representations, despite differences in training objectives, architectures, and data modalities. This emerging compatibility between independently trained models introduces new opportunities for cross-model alignment to downstream objectives. Moreover, this capability unlocks new potential application domains, such as settings where security, privacy, or competitive constraints prohibit direct data or model sharing. In this work, we investigate the extent to which representational convergence enables practical linear alignment between large language models. Specifically, we learn affine transformations between the final hidden states of independent models and empirically evaluate these mappings across text generation, embedding classification, and out-of-distribution detection. We find that performance is largely preserved across model pairs, and show for the first time that linear alignment sometimes enables text generation across independently trained models. We further highlight a potential application of linear alignment for privacy-preserving cross-silo inference. The framework learns an affine transformation over a shared public dataset and uses homomorphic encryption to protect client queries. By encrypting only the linear classification operation, the method achieves sub-second inference latency.
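As a rough illustration of the alignment step described in the abstract, the sketch below fits an affine map between the final hidden states of two models using ordinary least squares over paired embeddings, then applies it to a new source-model embedding. The variable names (H_src, H_tgt, x_new) and the closed-form least-squares fit are assumptions for illustration, not necessarily the paper's exact procedure; the random arrays are placeholders for hidden states that would in practice be computed by running both models on the same shared public dataset.

```python
import numpy as np

# Hypothetical paired final hidden states from two independently trained
# models, computed on the same shared public dataset (n examples each).
rng = np.random.default_rng(0)
n, d_src, d_tgt = 1000, 768, 1024
H_src = rng.normal(size=(n, d_src))   # source model hidden states (placeholder)
H_tgt = rng.normal(size=(n, d_tgt))   # target model hidden states (placeholder)

# Fit an affine transformation  H_tgt ~= H_src @ W + b  by least squares.
# A bias column is appended so the intercept b is learned jointly with W.
X = np.hstack([H_src, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(X, H_tgt, rcond=None)
W, b = coef[:-1], coef[-1]

# Map a new source-model embedding into the target model's representation space.
x_new = rng.normal(size=(1, d_src))
x_aligned = x_new @ W + b

# Downstream, a linear classifier trained in the target space can score the
# aligned embedding; in the cross-silo setting described in the abstract,
# this single affine classification step is the part that would run under
# homomorphic encryption on the client's query.
```

The closed-form fit is only one way to learn such a map; a ridge penalty or gradient-based fitting would follow the same pattern while being more robust when the number of paired examples is small relative to the hidden dimension.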
Submission history
From: Matt Gorbett
[v1] Thu, 19 Mar 2026 13:43:32 UTC (2,229 KB)
[v2] Sat, 21 Mar 2026 20:00:35 UTC (610 KB)
[v3] Thu, 26 Mar 2026 15:17:05 UTC (2,577 KB)