According to reports, Amazon Web Services suffered at least two outages linked to errors involving its internal AI tools. In one mid-December incident, a customer-facing system was disrupted for 13 hours after engineers allowed the Kiro assistant to make changes; the agent reportedly decided to delete and recreate the environment. Amazon later denied that account and attributed the incident to human error.
“Problems occur when AI-generated code looks correct but doesn’t fully understand how a complex system works. AI tools can write code quickly, but may miss hidden dependencies, system limits, or security concerns, leading to incorrect logic, configuration mistakes, security vulnerabilities, or unexpected overloads in large distributed systems. In large platforms like cloud services, even a small coding mistake can trigger cascading failures affecting many services,” said Naga Santhosh Josyula, Co-founder, Tablesprint, an AI platform for voice agents and enterprise software development.
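To make that failure mode concrete, consider a hypothetical retry helper of the kind an assistant might produce. The names and scenario below are invented for illustration: the first version looks correct in isolation, but its immediate, unbounded retries multiply traffic against an already struggling service, exactly the small mistake that can cascade through a distributed system.

```python
import random
import time

def fetch_naive(call):
    """Looks correct in isolation: retry until the call succeeds.

    Under a partial outage, every client doing this hammers the
    struggling service with immediate retries, amplifying load and
    turning a small fault into a cascading failure.
    """
    while True:
        try:
            return call()
        except ConnectionError:
            continue  # immediate, unbounded retry

def fetch_with_backoff(call, max_attempts=5, base_delay=0.1, cap=5.0):
    """Same intent, but bounded retries with jittered exponential
    backoff, so clients spread out their load instead of piling on."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            delay = min(cap, base_delay * 2 ** attempt)
            time.sleep(random.uniform(0, delay))  # full jitter
```

The jittered backoff in the second version is a standard defence against such retry storms, the kind of system-level consideration a human reviewer brings that a line-by-line reading of the naive version would not flag.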
While AI coding assistants are speeding up software development, their reliability depends on how they are used. These tools help developers write code faster and automate repetitive tasks, but if teams accept AI suggestions without reviewing and testing the code, problems can slip into production. AI is most effective as a productivity tool for engineers, not a replacement for careful design, testing, and review.
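As a minimal sketch of what that review-and-test step catches, here is an invented helper in the style of AI-generated code, together with the one-line test a human reviewer would add for the empty-input edge case.

```python
def success_rate(results):
    """Plausible AI-generated helper: fraction of truthy results.
    Crashes with ZeroDivisionError when results is empty."""
    return sum(1 for r in results if r) / len(results)

def success_rate_safe(results):
    """Reviewed version: handles the empty-input edge case explicitly."""
    if not results:
        return 0.0
    return sum(1 for r in results if r) / len(results)

def test_success_rate_empty_input():
    # Fails against the unreviewed version, passes against the fix.
    assert success_rate_safe([]) == 0.0
```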
At the same time, startups are shipping minimum viable products in days, developer output has increased 10-fold, and boards are asking teams to do more with less. “The energy is unmistakable. But a quiet consensus is forming among engineering leaders: speed without oversight is not velocity; it’s debt accumulation at scale. Bugs don’t disappear because AI wrote the code. They just arrive with more confidence,” Aurobinda Nanda, Chief Executive Officer of AppHelix, said.
“The most dangerous line of code is the one nobody reviewed because everyone assumed the machine got it right. This is where the Human-in-the-Loop (HITL) imperative becomes a boardroom concern, not just a developer checklist. When AI generates code that is deployed without structured human checkpoints, organisations inherit liabilities they didn’t author and may not even understand,” Nanda, who co-founded Happiest Minds Technologies, pointed out.
He asserted that the answer is not to slow down AI adoption but to introduce human judgment at the highest-leverage stages of the development lifecycle.
Experienced engineers must still check whether the code fits the broader architecture and whether it could create operational risks.
Paramdeep Singh, co-founder of Shorthills AI, said AI-assisted coding tends to be more reliable when used by senior developers who can clearly define the task and understand the code the AI generates. In such cases, AI speeds up coding while experienced developers review the output, keeping what is correct and fixing security gaps or other issues.
However, he noted that while junior developers may achieve faster code generation using AI tools, reliability can suffer if they rely heavily on AI-written code without fully understanding it. The challenge, he said, lies in the trade-off between speed and reliability, as less experienced developers may accept AI outputs at face value, increasing the risk of errors or vulnerabilities.
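A classic illustration, hypothetical here, is Python’s mutable default argument: the code reads naturally and works on the first call, so a developer taking it at face value may never notice that state leaks between calls.

```python
def add_tag_buggy(tag, tags=[]):
    """The default list is created once and shared across calls,
    so tags silently leak between unrelated invocations."""
    tags.append(tag)
    return tags

def add_tag(tag, tags=None):
    """Reviewed version: create a fresh list per call."""
    tags = [] if tags is None else tags
    tags.append(tag)
    return tags

# add_tag_buggy("a") -> ["a"]
# add_tag_buggy("b") -> ["a", "b"]   (surprise: state from the first call)
```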
“The only way to guarantee reliable outputs from AI-generated code is to choose a tool that produces outputs that are easy to review and properly governed. This is critical when working with complex integrations or legacy systems, or when there are regulatory obligations. Here, using AI assistance without proper governance introduces the highest level of risk and quietly builds technical debt until it’s too expensive to repair,” Deepak Visweswaraiah, SVP and MD of Pegasystems India, noted.
He added that in mission-critical systems such as aviation, healthcare, or real-time financial platforms, relying entirely on AI-generated code is risky if the system fails to account for edge cases. These sectors demand high reliability, and fully delegating coding to AI in such environments increases the risk of failure. Sensitive areas such as authentication, infrastructure, and other high-risk control layers are especially vulnerable if left entirely to AI-generated code.
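Authentication shows why. In the hypothetical sketch below, both functions “work”, but the naive string comparison short-circuits on the first mismatched byte, leaking timing information an attacker can exploit; the reviewed version uses the standard library’s constant-time comparison.

```python
import hmac

def verify_token_naive(supplied: str, expected: str) -> bool:
    # == returns as soon as one byte differs, so response timing
    # reveals how much of the token was guessed correctly.
    return supplied == expected

def verify_token(supplied: str, expected: str) -> bool:
    # Constant-time comparison from the standard library.
    return hmac.compare_digest(supplied.encode(), expected.encode())
```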
Slowing the pace at which AI-based development is adopted is not the solution. Instead, reliability should be built into the code as it is written, not patched in after something breaks.
Published on March 13, 2026