Harnessing Alien Intelligence

This is about how we are now using the “harness” metaphor to describe how to work effectively with generative AI. Generative AI applications can at first appear familiarly “intelligent,” but with use they typically reveal themselves not to be, with practical implications. The “harness” concept can be used to clarify questions about AI capabilities, to create dramatic jumps in performance across a range of challenges, and to avert LLM risks we identified in our previous work (paper).

Harnessing alien intelligences is a hallmark achievement of humanity. Harness a horse, and rider and horse together can carry heavy loads over difficult terrain, or achieve feats of speed, agility, and competition unavailable to either alone. For example, just last weekend was the latest edition of the Kentucky Derby. We can also harness multiple horses to a cart, bulls to a plow, dogs to a sled. Each scenario is distinct, with different goals and details that shape how the harness is built and operated. Common to all of them, however, is a sufficient fit of the harness to the animal and the task, and a degree of domestication of the animal, which entails a sufficient capacity for communication. The detailed capabilities of the partners and the desired outcomes shape harnessing. It’s hard not to get lost considering this idea and the clever details of its implementation. And this even though the cross-species examples can fail to evoke the prime example, social existence itself, which harnesses the alien (to each other) intelligences of other humans through shared concepts, norms, and laws to achieve, among other things, the device I am writing this on and the network this message travels through.

Metaphors and abstraction aside, we are now learning (or paying attention to the fact) that effective use of generative models strongly depends on the scaffolding, the harness, that we build around them. In fact, important questions we ask about these models, such as how “intelligent” they may be, are misconstrued outside the concept of a harness. This is one among many ideas discussed in an analysis by Farrell et al., “Large AI models are cultural and social technologies” (paper), which contextualizes recent models in the history of information and collaboration technologies. Harnessing has been here all along, but sometimes it’s hard to see what’s too close: “this is water.” The chat interface is a harness, and so are the sampling or decoding algorithms; the recent agentic scaffolds are harnesses too, and finally also by name. Model weights in and of themselves are a kind of data extract, created from a pile of text sent through the digital distillery of training protocols and algorithms, but alone they do no work, make no decisions, are not intelligent. Seen as a kind of library, some may be more extensive than others, but they are inert nevertheless.
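To make the point concrete that the weights alone do no work: even the plainest decoding loop is already a harness. Here is a minimal sketch, assuming a `logits_fn` standing in for a forward pass over the weights; the temperature, the loop, and the stop condition all live outside the model.

```python
import numpy as np

# A minimal sketch of the simplest harness: a sampling loop around an
# otherwise inert model. `logits_fn` stands in for a forward pass over
# model weights; everything else here (temperature, the loop, the stop
# condition) is harness, not model.

def sample_next(logits_fn, tokens, temperature=0.8):
    logits = logits_fn(tokens)                     # the "library": a lookup, no decisions
    probs = np.exp((logits - logits.max()) / temperature)
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))

def decode(logits_fn, prompt_tokens, eos_id, max_new=256):
    tokens = list(prompt_tokens)
    for _ in range(max_new):
        nxt = sample_next(logits_fn, tokens)
        if nxt == eos_id:                          # the harness decides when to stop
            break
        tokens.append(nxt)
    return tokens
```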

It took the harnesses getting big and complex for us to begin to see the concept. The results have also come to be known under various names. Most famously we now have a menagerie of coding “agents”, the talking horses that will replace human software developers. Understanding efforts in the neighborhood of harnessing and agents is an area we are watching closely. Here is a brief look at three recent techniques that have stood out.

The first is “Recursive Language Models” (RLM) (paper). The technique is motivated as a response to the challenges of very long context computation, where attention becomes very resource intensive and yet we still experience “context rot”. Context rot means that task performance decays or collapses as the context gets larger. The risk of assuming that simply longer context is better when using LLMs was among the risks in our original assessment. The RLM, described as a new “inference paradigm” (not a harness), puts the context into a REPL environment and asks the LM to generate code and new prompts to “programmatically examine, decompose, and recursively call itself over snippets”. Intermediate results are stored as variables in the REPL workspace environment, and the final state goes into a named variable, signaling completion. This is the harness, and it engages the model in a particular set of behaviors (through code and prompt generation) in an environment defined by an arbitrarily large provided prompt. The result is effective handling of prompts multiple orders of magnitude larger, with dramatic outperformance on a set of relevant tasks. Incorporating this harness, tiny models can perform like “vanilla” frontier models! We often talk about WHAT and HOW machines, and identified a top “misuse” risk as trying to approach any computational task using auto-regressive text completion. The RLM approach uses code generation and execution in a REPL to move away from this, with great impact. Along the way, RLM also mitigates risks related to representational transparency and black-box behavior by executing inspectable code and storing intermediate results in its environment! It also turns out that these traces can be put to further use.
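A rough sketch of the shape of this, as I read the paper. The `llm()` call, the variable names, and the sentinel `FINAL_ANSWER` are my stand-ins, not the paper’s actual interface:

```python
# A rough sketch of the RLM loop under my assumptions: `llm(prompt)` stands
# in for any completion call, and FINAL_ANSWER is an arbitrary sentinel name.

def recursive_lm(long_context: str, question: str, llm, max_steps=20):
    # The huge prompt never goes to the model directly; it lives in the REPL.
    workspace = {"context": long_context, "llm": llm}
    for _ in range(max_steps):
        code = llm(
            "You can run Python. In scope: `context` (a very long string) and\n"
            "`llm(prompt)` for recursive calls over snippets of it. Store\n"
            "intermediate results as variables. Assign FINAL_ANSWER when done.\n"
            f"Question: {question}\n"
            f"Variables so far: {list(workspace)}"
        )
        exec(code, workspace)          # intermediate results persist as variables
        if "FINAL_ANSWER" in workspace:
            return workspace["FINAL_ANSWER"]
    return None
```

Note that the inspectable traces mentioned above fall out for free here: the generated code and the accumulated workspace variables are the record of how the answer was reached.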

The second example is “AutoHarness: improving LLM agents by automatically synthesizing a code harness” (paper), “harness” appearing twice in the title. This effort is motivated by the failure of language models, even frontier ones, to repeatedly generate valid moves in game environments. Again the word environment, its use here more direct: the formal world of a game. A familiar observation here is the presence of Potemkin understanding (paper): a model may describe and apparently understand the rules of a game but cannot reliably generate legal moves. The authors describe a harness as “the glue or plumbing between the model and the task that needs to be solved” and talk about harness use in two ways. One way, which they call “harness as verifier,” asks the model to generate validation code for proposed game moves and then interleaves the generated validator with auto-regressive answering. The second approach, which they call “harness as policy,” asks the model to generate code that produces valid moves directly. Both approaches attempt to constrain the behavior of the model to conform to the game rules, and in this way “harness” the behavior. But there is also a third form, the overarching “AutoHarness” strategy itself. The harnessed strategies again greatly outperform, even against vanilla frontier models. In the case of harness as policy, where we ask the WHAT machine to create a little HOW machine, the outperformance shows up both in task performance and, dramatically, in compute cost.
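Minimal sketches of the two patterns as I understand them. `llm()` is a stand-in completion call, and `make_validator` / `make_policy` stand in for the step where the model writes the validator or policy code; all of these names are mine, not the paper’s:

```python
# Sketches of the two AutoHarness patterns, under my assumptions about the
# interfaces; the model-generated code is represented by stand-in callables.

def harness_as_verifier(llm, state, make_validator, max_tries=5):
    # The model generates validation code once, e.g. an `is_legal(state, move)`
    # function; auto-regressive answers are then interleaved with it.
    is_legal = make_validator(llm, state)          # model-generated code
    for _ in range(max_tries):
        move = llm(f"Propose a move for this position:\n{state}")
        if is_legal(state, move):                  # check, retry if invalid
            return move
    return None

def harness_as_policy(llm, state, make_policy):
    # The model generates code that *produces* legal moves directly, so the
    # expensive model never answers move-by-move at all; hence the compute
    # savings mentioned above.
    legal_moves = make_policy(llm, state)          # model-generated code
    return legal_moves(state)[0]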

Full recognition of the centrality of the harness shows up in “Meta-Harness: End-to-End Optimization of Model Harnesses” (paper). The goal here is to use a strong coding model, called a proposer, to create new task-specific harnesses by examining the code, execution, and performance of other harnesses. This is where the traces described before are put to good use. Interestingly, the task-specific harness may be executed by a weaker model. The method is capable of dramatically improving upon existing harnesses with limited compute, in ways that transfer across the models that execute the harness and across a range of tasks.
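A rough sketch of the loop, under my own assumptions about the proposer/executor split and the scoring; none of these names come from the paper:

```python
# A rough sketch of the Meta-Harness idea under my assumptions: a strong
# `proposer` model reads the code and scores of prior harnesses and writes
# a new one, which a weaker `executor` model then runs on the tasks.

def meta_harness(proposer, executor, seed_harnesses, tasks, evaluate, rounds=5):
    pool = [(h, evaluate(h, executor, tasks)) for h in seed_harnesses]
    for _ in range(rounds):
        best = sorted(pool, key=lambda p: p[1], reverse=True)[:3]
        traces = "\n\n".join(f"# score={s}\n{h}" for h, s in best)
        new_harness = proposer(
            "Here are harnesses (code) with their evaluation scores:\n"
            f"{traces}\nWrite an improved task-specific harness."
        )
        pool.append((new_harness, evaluate(new_harness, executor, tasks)))
    return max(pool, key=lambda p: p[1])[0]
```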

In each of these cases, small models can vastly outperform “vanilla” frontier models, that is, models harnessed through nothing more than a sampling algorithm. Meta-Harness can improve upon harnesses similar to RLM, hybrids of HOW and WHAT machines, and beat hand-coded approaches. Omar Khattab, one of the authors here, is also an author on the RLM paper.

There is something fundamental happening here, with implications for performance, computational costs, interpretability, and risk management. We are also seeing these patterns in other work, so we will continue to revisit the topic.
