Helping AI have long-term memory

The Transformer architecture revolutionized sequence modeling with its introduction of attention, a mechanism by which models look back at earlier inputs to prioritize relevant input data. However, because standard attention compares every token with every other token, its computational cost grows quadratically with sequence length, which limits the ability to scale Transformer-based models to extremely long contexts, such as those required for full-document understanding or genomic analysis.
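To make the scaling problem concrete, here is a minimal sketch in plain Python that builds the full table of dot-product attention scores and counts its entries. The function names (`attention_scores`, `count_scores`) and the toy all-ones token vectors are assumptions made for illustration; real implementations add softmax, scaling, and batching.

```python
def attention_scores(queries, keys):
    """Return one dot-product score per (query, key) pair as a nested list."""
    return [
        [sum(q_i * k_i for q_i, k_i in zip(q, k)) for k in keys]
        for q in queries
    ]


def count_scores(n, dim=4):
    """Count the scores computed for a toy sequence of n identical tokens."""
    tokens = [[1.0] * dim for _ in range(n)]  # hypothetical placeholder inputs
    scores = attention_scores(tokens, tokens)
    return sum(len(row) for row in scores)


if __name__ == "__main__":
    for n in (8, 16, 32):
        print(n, count_scores(n))
```

Doubling the sequence length quadruples the number of scores, which is the growth that makes very long contexts expensive.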

The research community has explored various alternatives, such as efficient linear recurrent neural networks (RNNs) and state space models (SSMs) like Mamba-2. These models offer fast, linear scaling by compressing context into a fixed-size state. However, this fixed-size compression cannot adequately capture the rich information in very long sequences.
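The idea of a fixed-size state can be sketched with a toy linear recurrence. This is not Mamba-2 or any specific published model; the update rule (an exponential moving average), the `decay` value, and the function name are assumptions chosen only to show that the state stays the same size however long the input is.

```python
def run_linear_recurrence(inputs, decay=0.9, state_size=4):
    """Fold a sequence of vectors into one fixed-size state vector.

    Each step costs O(state_size), so the whole pass is linear in the
    sequence length. Older inputs are progressively overwritten.
    """
    state = [0.0] * state_size
    for x in inputs:
        # Hypothetical update rule: blend the old state with the new input.
        state = [decay * s + (1.0 - decay) * x_i for s, x_i in zip(state, x)]
    return state


if __name__ == "__main__":
    short = run_linear_recurrence([[1.0] * 4] * 10)
    long = run_linear_recurrence([[1.0] * 4] * 1000)
    print(len(short), len(long))  # same state size for both lengths
```

Because the state never grows, everything the model remembers about 1,000 inputs has to fit in the same four numbers used for 10 inputs, which is the compression limit described above.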

In two new papers, Titans and MIRAS, we introduce an architecture and theoretical blueprint that combine the speed of RNNs with the accuracy of Transformers. Titans is the specific architecture (the tool), and MIRAS is the theoretical framework (the blueprint) for generalizing these approaches. Together, they advance the concept of test-time memorization, the ability of an AI model to maintain long-term memory by incorporating more powerful “surprise” metrics (i.e., measures of how unexpected a piece of information is) while the model is running and without dedicated offline retraining.

The MIRAS framework, as demonstrated by Titans, introduces a meaningful shift toward real-time adaptation. Instead of compressing information into a static state, this architecture actively learns and updates its own parameters as data streams in. This crucial mechanism enables the model to incorporate new, specific details into its core knowledge instantly.
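As a rough illustration of updating parameters while data streams in, the sketch below keeps a small linear memory and takes one gradient step per incoming (key, value) pair. Treating the squared prediction error as the "surprise" signal, the learning rate, and the names `predict` and `memorize` are all assumptions made for this example; the text above does not specify the actual update rule used by Titans or MIRAS.

```python
def predict(memory, key):
    """Read from the memory: a matrix-vector product memory @ key."""
    return [sum(w * k for w, k in zip(row, key)) for row in memory]


def memorize(memory, key, value, lr=0.1):
    """Take one online gradient step on 0.5 * ||memory @ key - value||^2.

    Updates `memory` in place and returns the squared error measured
    before the update, used here as a stand-in "surprise" score.
    """
    error = [p - v for p, v in zip(predict(memory, key), value)]
    for i, e in enumerate(error):
        for j, k in enumerate(key):
            memory[i][j] -= lr * e * k  # gradient of the loss w.r.t. memory[i][j]
    return sum(e * e for e in error)


if __name__ == "__main__":
    memory = [[0.0, 0.0], [0.0, 0.0]]
    key, value = [1.0, 0.0], [2.0, -1.0]
    for step in range(5):
        print(step, memorize(memory, key, value))
```

In this toy setup, an association the memory has not seen yields a large surprise and a large update; repeating it shrinks both, with no separate offline training phase.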
