From a denotational/lambda-calculus perspective it's very natural to consider al...

From a denotational/lambda-calculus perspective it's very natural to consider all expressions that reduce to the same expression as equivalent, at which point you have that definition of equivalence. Of course in the pure lambda calculus all "programs" are just (equivalent to) values. To get any effects like input/output you have to add extra values, and at that point you decide what the equivalence semantics of these "special" values are. That's kind of the Haskell perspective, and where I think it falls down is in the interaction between sequenced I/O and pure functions; the pure subset of Haskell can be fully evaluated at compile time (if you work in e.g. Idris this becomes true in a very direct sense: your (total) pure functions can be used in types), but as soon as a value depends on user input that obviously ceases to be the case, whereas expect to be able to decompose their program into pure and impure fragments and reason about these separately.

From an operational perspective a good definition of equivalence would be very difficult. E.g. if your language allows making system calls there are dozens or hundreds of those with no real model of how they interact, so it's very hard to say when one sequence of system calls is equivalent to another. Possibly the most natural notion of operational equivalence is to say two programs are equivalent if they execute the same sequence of CPU instructions, but that would mean that nontrivial programs were virtually never equvialent to each other, even if they both just printed the same string.