Polyrhythm Software, LLC
18 April 2026 Polyrhythm Software, LLC Updated 25 May 2026

X-62A VISTA Shows the AI Verification Gap

Have Remy used the X-62A VISTA to test autonomous missile-evasion agents trained in simulation. The hard part is not only the AI model; it is proving that simulated behavior remains valid in live flight.

Polyrhythm’s staff have spent more than three decades supporting modeling and simulation for the Department of War and integrating test ranges for high-fidelity simulation. That experience is why we pay close attention when work on the X-62A VISTA has to prove that what an AI learned in simulation still holds in live flight.

The Have Remy Test Management Project did exactly that. Lockheed Martin Skunk Works and the U.S. Air Force Test Pilot School trained autonomous missile-evasion agents through billions of simulated missions, selected candidates for live flight, and flew them on the X-62A VISTA. Aerospace America covered the Have Remy X-62A VISTA testing and framed the central question as whether the technical method could cross from simulation to real life.

That is the right question. AI behavior trained in simulation is only as useful as the evidence that the simulation covered the conditions that matter. Missile-evasion behavior depends on geometry, timing, sensor assumptions, aircraft dynamics, threat models, control limits, and pilot safety constraints. If those assumptions are wrong, the agent can look strong in training and weak in flight.
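One way to make the coverage question concrete is to check whether each live test condition falls inside the range of values the simulation actually sampled. The sketch below is illustrative only: the parameter names (altitude_m, closure_rate_mps, aspect_deg, sensor_latency_ms), the Range type, and every number are hypothetical and are not drawn from Have Remy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval of values the simulation sampled for one parameter."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def uncovered_parameters(training_ranges: dict[str, Range],
                         live_condition: dict[str, float]) -> list[str]:
    """Return the live-condition parameters that fall outside simulated ranges.

    A parameter with no recorded training range is also reported as uncovered.
    """
    gaps = []
    for name, value in live_condition.items():
        sampled = training_ranges.get(name)
        if sampled is None or not sampled.contains(value):
            gaps.append(name)
    return sorted(gaps)


# Hypothetical numbers for illustration only.
TRAINING_RANGES = {
    "altitude_m": Range(3000.0, 9000.0),
    "closure_rate_mps": Range(200.0, 900.0),
    "aspect_deg": Range(0.0, 180.0),
}
LIVE_CONDITION = {
    "altitude_m": 9500.0,        # above the sampled altitude band
    "closure_rate_mps": 450.0,
    "aspect_deg": 95.0,
    "sensor_latency_ms": 40.0,   # never modeled in this sketch
}

print(uncovered_parameters(TRAINING_RANGES, LIVE_CONDITION))
# ['altitude_m', 'sensor_latency_ms']
```

A per-parameter check like this is necessary but not sufficient. Two parameters can each sit inside their sampled range while the combination was never simulated, so a real coverage argument also has to look at joint conditions.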

The X-62A VISTA matters because it provides a controllable bridge. It is a manned testbed with autonomy infrastructure, safety oversight, and the ability to test tactical AI in real airspace. That lets engineers compare simulated behavior with live performance without jumping straight to an operational platform.

The verification gap has several layers. First, the team has to show that the simulation generated useful training cases. Second, it has to show that the agent did not simply learn artifacts of the simulator. Third, it has to show that live-flight behavior stays inside safe bounds. Fourth, it has to define what evidence would be enough to expand the envelope.
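The second and third layers can be sketched as a comparison between a simulated trace and the matching live trace. The example below assumes both traces are sampled at the same time steps and uses a hypothetical signal (normal load factor in g), hypothetical safe bounds, and a hypothetical error tolerance. It is a minimal illustration of the idea, not the method used on Have Remy.

```python
import math


def rmse(sim: list[float], live: list[float]) -> float:
    """Root-mean-square error between two equally sampled traces."""
    if not sim or len(sim) != len(live):
        raise ValueError("traces must be non-empty and the same length")
    squared = sum((s - l) ** 2 for s, l in zip(sim, live))
    return math.sqrt(squared / len(sim))


def within_bounds(samples: list[float], low: float, high: float) -> bool:
    """True if every sample stays inside the closed interval [low, high]."""
    return all(low <= x <= high for x in samples)


def assess(sim: list[float], live: list[float],
           low: float, high: float, max_rmse: float) -> dict:
    """Summarize whether live behavior tracks simulation and stays in bounds."""
    error = rmse(sim, live)
    return {
        "rmse": error,
        "tracks_sim": error <= max_rmse,
        "inside_safe_bounds": within_bounds(live, low, high),
    }


# Hypothetical load-factor traces in g, for illustration only.
SIM_TRACE = [1.0, 2.5, 4.0, 5.5]
LIVE_TRACE = [1.0, 2.5, 4.0, 7.5]

result = assess(SIM_TRACE, LIVE_TRACE, low=-1.0, high=7.0, max_rmse=0.5)
print(result)
```

With these made-up numbers the live trace both diverges from the simulation and leaves the assumed bounds, which is the kind of result that should stop envelope expansion. The fourth layer is then a matter of agreeing on the tolerance and bounds before the flight rather than after it.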

The role of TPS students is important in that chain. They bring operational and test judgment into scenario design instead of treating the AI as a closed lab product. That helps connect the training setup, the live flight, and the evaluation criteria to behavior a pilot or test team can actually assess.

This is not just an algorithm problem. It is an M&S fidelity problem, a test design problem, and an integration problem. The agent, aircraft, range instrumentation, safety pilot, live virtual constructive setup, and post-flight analysis all become part of the evidence chain.

The important success is not that an AI flew a maneuver. The important success is that a team built a method for moving behavior from simulation to live test in a way that can be inspected. That method is what future autonomy programs will need.

Many AI programs stall in this verification gap. They can train a model. They can show a demo. They struggle to prove the behavior is valid under the operational conditions that matter. Have Remy is useful because it treats the gap as an engineering problem rather than a public-relations event.

That is precisely the space Polyrhythm’s experts have worked in for decades: simulation fidelity, test range integration, evidence design, and the hard boundary where modeled behavior has to earn trust in the real system.
