The BIML AI Lab
Since BIML’s founding in 2019, new developments have pushed the limits of AI in a series of relentless waves. In 2026, highly capable new foundation models were released, including Google’s Gemini 3, Anthropic’s Claude Opus 4.6, and OpenAI’s GPT 5.2, to name just a few.
Perhaps even more interesting is the sheer variety of AI tools that have been created on top of these base technologies. Take development tools: like it or not, LLMs have gained wide acceptance in the developer community. In Stack Overflow’s 2025 survey, a large majority of respondents (and over half of experienced developers) claimed to use AI tools consistently in their development workflow.
From our perch at BIML, AI’s capacity to write code has made enormous strides in the last few months (we’re writing this in mid-February 2026). We’re astounded by what is happening to the development process as AI agents—largely autonomous systems that rely on LLMs to break complex instructions into simple tasks, write code, and integrate with external tools to test and deploy software—come into actual production use (all while minimizing human oversight).
Introducing the BIML AI Lab: our in-house implementation of an Agentic AI workflow. Building and using the BIML AI Lab is necessarily an iterative process as we get our hands dirty and figure out what does and what doesn’t work. Our plan is to use many different models and explore their capabilities in a playground, refereed by a taskmaster that orchestrates the many separate agents.
We’re starting with this foundation: AI Agents communicate through JIRA (project management and issue-tracking software) to hand off tasks and report on results, while writing code and other artifacts to GitHub. We have integrated several closed-weight LLMs via their APIs, including Gemini, Claude Opus, and GPT, as well as open-weight models from Hugging Face.
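To make the handoff mechanism concrete, here is a minimal Python sketch of one agent opening a JIRA task for another. The project key, field values, instance URL, and agent names are hypothetical illustrations, not BIML's actual configuration.

```python
import json

# Assumed JIRA Cloud instance URL -- a placeholder, not a real endpoint.
JIRA_BASE = "https://example.atlassian.net"

def build_handoff_issue(project_key, summary, description, assignee_agent):
    """Build the JSON payload for JIRA's create-issue endpoint
    (POST /rest/api/2/issue). Field names follow the standard JIRA schema."""
    return {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Task"},
            "summary": summary,
            "description": f"{description}\n\nHandoff to agent: {assignee_agent}",
            "labels": ["agent-handoff"],  # lets the taskmaster filter agent traffic
        }
    }

# Hypothetical example: a coding agent hands finished work to a review agent.
payload = build_handoff_issue(
    project_key="LAB",
    summary="Review generated module",
    description="Code pushed to GitHub branch feat/parser",
    assignee_agent="reviewer-agent",
)
print(json.dumps(payload, indent=2))
# An agent would POST this payload to f"{JIRA_BASE}/rest/api/2/issue"
# with its authentication headers.
```

Routing handoffs through an issue tracker rather than direct agent-to-agent calls has a side benefit: every task, comment, and status change leaves an auditable trail a human can inspect.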
We welcome you to keep up to date on our progress standing up and using the BIML AI Lab. You can expect regular updates on our MLsec Musings blog, which we will link here in chronological order.
Letter Spirit Examiner: An example of emergent computation
Based on Gary McGraw’s thesis work with Doug Hofstadter, Letter Spirit (part one): Emergent High-Level Perception of Gridletters Using Fluid Concepts, the Letter Spirit Examiner hosted here categorizes gridfont shapes into letter categories by running hundreds of micro-agents stochastically in parallel. High-level perception emerges from the actions of swarms of sub-symbolic, bottom-up codelets under the influence of top-down concepts.
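The scheduling idea behind that swarm can be sketched in a few lines: codelets wait on a "coderack" and are chosen stochastically, biased by an urgency value, so higher-urgency pressures dominate statistically without any central controller. The Python below is a toy illustration of that selection mechanism only; the class and function names are ours, not taken from the actual Letter Spirit code.

```python
import random

class Coderack:
    """A pool of waiting codelets, selected at random weighted by urgency."""

    def __init__(self, seed=None):
        self.codelets = []  # list of (urgency, callable) pairs
        self.rng = random.Random(seed)

    def post(self, urgency, codelet):
        self.codelets.append((urgency, codelet))

    def step(self, workspace):
        """Pick one codelet, biased by urgency, remove it, and run it."""
        if not self.codelets:
            return
        weights = [u for u, _ in self.codelets]
        idx = self.rng.choices(range(len(self.codelets)), weights=weights)[0]
        _, codelet = self.codelets.pop(idx)
        codelet(workspace)

# Toy example: bottom-up codelets each cast a vote for a letter category.
# Running a limited number of steps means high-urgency codelets are run
# far more often, so a "perception" emerges from the vote statistics.
def propose(letter):
    def codelet(ws):
        ws["votes"][letter] = ws["votes"].get(letter, 0) + 1
    return codelet

workspace = {"votes": {}}
rack = Coderack(seed=42)
for _ in range(50):
    rack.post(urgency=5, codelet=propose("b"))  # strong evidence for 'b'
    rack.post(urgency=1, codelet=propose("d"))  # weak evidence for 'd'
for _ in range(40):  # run fewer steps than codelets posted
    rack.step(workspace)
print(workspace["votes"])
```

The real Examiner layers top-down conceptual pressure on top of this (concepts post and re-weight codelets as evidence accumulates), but the urgency-weighted lottery above is the core of how swarm behavior is orchestrated.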
Play with an interactive working version in your browser here: Letter Spirit Examiner

Code recreation credit to Paul Geiger: https://github.com/Paul-G2/letter-spirit-examiner-js
Learn about the inner workings of the Letter Spirit Examiner here.