That doesn't sound right. For example, plenty of software has the correct observable behavior and still leaks credentials. So what needs to be captured goes beyond observable behavior.
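To make that concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the `USERS` table, the `login_safe` / `login_leaky` names, the in-memory log stream), and the unsalted SHA-256 is there only to keep it short. Both functions return the same result for every input, so a test that only checks return values can't tell them apart, yet one of them writes the plaintext password to the log.

```python
import hashlib
import hmac
import io
import logging

# Hypothetical in-memory user table: username -> SHA-256 hex digest of the password.
# Unsalted SHA-256 keeps the sketch short; it is not a storage recommendation.
USERS = {"alice": hashlib.sha256(b"s3cret").hexdigest()}

# Capture log output in memory so the side channel can be inspected.
log_stream = io.StringIO()
logger = logging.getLogger("auth_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(log_stream))
logger.propagate = False


def _check(username: str, password: str) -> bool:
    """Shared credential check used by both variants."""
    expected = USERS.get(username)
    if expected is None:
        return False
    actual = hashlib.sha256(password.encode()).hexdigest()
    return hmac.compare_digest(actual, expected)


def login_safe(username: str, password: str) -> bool:
    # Logs only the username.
    logger.info("login attempt for %s", username)
    return _check(username, password)


def login_leaky(username: str, password: str) -> bool:
    # Same return value as login_safe, but the plaintext password reaches the log.
    logger.info("login attempt for %s with password %s", username, password)
    return _check(username, password)
```

A test suite that asserts only on the boolean result passes for both versions; the difference shows up only if you also look at what ends up in `log_stream`.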