# AI reasoning failure detection