Submission + - Microsoft's Project Ire is an autonomous AI that reverse engineers malware (nerds.xyz)
BrianFagioli writes: Microsoft has revealed something genuinely exciting in the cybersecurity world. It's called Project Ire, and it might be one of the most ambitious attempts yet to automate malware classification. This isn't just a system that scans files or compares against known threats. It actually reverse engineers unknown software entirely on its own, analyzing it from the ground up without knowing where it came from or what it's supposed to do.
To be clear, this is very exciting. As someone who writes about security and tech regularly, I've seen my fair share of "AI-powered" tools, but this one feels different. Project Ire doesn't need hand-holding. It picks apart software like a real analyst would, using decompilers, control flow analysis, memory sandboxes, and more.
This thing came out of a collaboration between Microsoft Research, Defender Research, and Discovery & Quantum. Basically, all the big brains at Microsoft put their heads together and built a system that doesn't just guess; it investigates. And it does so using some of the same underlying tech behind GraphRAG and Microsoft Discovery, including a toolkit of reverse engineering utilities that it calls like a seasoned analyst.
Microsoft tested Project Ire against public datasets full of Windows drivers. Some were malicious, others totally clean. The system ended up with a precision of 0.98 and a recall of 0.83, which are both impressive numbers. In plain terms, nearly every file it flagged really was malware, and it caught most of the malicious samples in the set. Even better, it produced the first conviction case at Microsoft authored entirely by a machine, with no human in the loop. That malware sample is now blocked by Microsoft Defender.
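For readers who want the math behind those two numbers: precision is the share of flagged files that really were malware, and recall is the share of all malware that got flagged. Here's a quick sketch in Python, using hypothetical counts chosen only to roughly reproduce the reported figures (Microsoft hasn't published the raw confusion matrix):

```python
# Precision and recall from confusion-matrix counts (illustrative only).

def precision(tp: int, fp: int) -> float:
    """Of everything flagged as malware, what fraction really was?"""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Of all the actual malware, what fraction got flagged?"""
    return tp / (tp + fn)

# Hypothetical counts: 83 caught, 2 false alarms, 17 missed.
tp, fp, fn = 83, 2, 17
print(f"precision = {precision(tp, fp):.2f}")  # 0.98
print(f"recall    = {recall(tp, fn):.2f}")     # 0.83
```

The asymmetry matters: 0.98 precision means false alarms were rare, which is exactly what you want before letting a machine issue convictions on its own.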
Unlike traditional security systems, which rely heavily on signatures and rule-based filters, Project Ire goes in blind. It reconstructs software internals using tools like angr and Ghidra, then reasons through behavior to decide if a file is safe or not. It's not just making guesses. It's building a case, complete with an evidence chain that reviewers can look over.
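Microsoft hasn't published Project Ire's internals, but the kind of reconstruction described here is what a framework like angr exposes out of the box. A minimal sketch, with a placeholder filename standing in for whatever sample is under analysis:

```python
# Static reconstruction with angr -- a sketch of the *kind* of analysis
# the article describes, not Project Ire's actual pipeline.
import angr

# Hypothetical sample path; any PE or ELF binary would do here.
proj = angr.Project("suspicious_driver.sys", auto_load_libs=False)

# Recover a control-flow graph without ever executing the binary.
cfg = proj.analyses.CFGFast()
print(f"recovered {len(cfg.kb.functions)} functions")

# Walk the recovered functions -- the raw material an analyst (human
# or AI) reasons over when deciding whether behavior looks malicious.
for func in cfg.kb.functions.values():
    print(hex(func.addr), func.name)
```

The evidence chain the article mentions would sit on top of output like this, with each claim about behavior tied back to specific functions and call paths.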
One of the standout examples Microsoft shared was a rootkit called Trojan:Win64/Rootkit.EH!MTB. Project Ire picked up on behavior like hijacking Explorer.exe, injecting hooks, and reaching out to command and control servers. Another sample, HackTool:Win64/KillAV!MTB, was designed to kill antivirus software. The system correctly identified that too, including functions aimed at terminating specific security processes. These are the kinds of files that often sneak past basic scanners.
Now, Ire isn't perfect. It once misread a function as anti-debugging behavior, but what stood out was how it flagged the finding as questionable and used a built-in validator to double check itself. That's not something most AI tools do today. It shows that this system isn't blindly confident. It understands uncertainty and knows when to ask for a second opinion.
In tougher real-world testing, Ire took on nearly 4,000 hard-to-classify files that had been set aside for expert review. These weren't cherry-picked samples. They were unknowns. The system worked entirely on its own and still nailed about 9 out of 10 of the malware cases it flagged. Even though it caught only a quarter of all the bad files in this high-difficulty round, it barely triggered false alarms. That's a good tradeoff in real-world defense, where one wrong call can burn trust.
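Running the article's rough figures through the same two formulas makes the tradeoff concrete. The counts below are entirely hypothetical (assuming, say, 1,000 truly malicious files in the batch), chosen only to match "9 out of 10 flagged" and "a quarter of all the bad files":

```python
# Hypothetical counts matching the article's rough figures.
tp, fp, fn = 250, 28, 750
print(f"precision = {tp / (tp + fp):.2f}")  # ~0.90: few false alarms
print(f"recall    = {tp / (tp + fn):.2f}")  # 0.25: a quarter caught
```

High precision with modest recall is the conservative end of the dial: the system stays quiet unless it's sure, which is the right posture when a false conviction costs more than a miss.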
Microsoft says Project Ire will now be integrated into the Defender ecosystem under the name Binary Analyzer. The long-term plan is to scale it up and speed it up, making it possible to classify unknown files instantly, maybe even before they hit disk. That kind of capability could be a game-changer, especially as threats become faster, smarter, and harder to pin down.
To me, the most exciting part is that this isn't theoretical. Project Ire is already helping real analysts inside Microsoft. It's working alongside humans, not replacing them, and offering detailed, explainable reports that can stand up to scrutiny. That's the kind of AI we need more of, folks: not hype, not smoke and mirrors, but something that actually helps solve hard problems.