Once I learned one can parameterize queries, not doing so began to feel kind of dirty to me.
To me, the big point was not even security, but the fact I no longer had to deal with properly escaping/cleaning messy input. Plus, somebody once told me parameterized queries can (potentially) give better performance. What's not to love?
> parameterized queries can (potentially) give better performance.
Yep! More accurately, parameterized queries almost always give better performance.
The simplest way parameterization helps (and there are many) is query plan caching. When a DB server sees a query for the first time in awhile, it has to turn the SQL text into a query plan (https://en.wikipedia.org/wiki/Query_plan). That takes time and resources (lexing, parsing, validation against schema, etc.).
In most RDBMSs I know of, query plans are cached by default for awhile after they are made and executed. The plans also contain placeholders for parameters rather than inlining the parameter values supplied with the query. This means that running the same query with different parameter values usually allows the database re-use the cached query plan and just substitute different parameters, which saves time on plan construction.
Plan caching + parameterization saves time on a ton of other stuff, too: re-running the same plan for multiple queries allows automatic tuning and heuristic analysis to get better performance of the query over time, facilitates predictive caching of result data (from index warming to full row caching), and a lot of other cool stuff.
Aside: you also occasionally see some silly tricks used to defeat plan caching. For example, say you have a complex query, hitting multiple indexes, that gets run all the time with different parameters. Most runs return very few results, and the RDBMS optimizes the cached query plan for that case. Now, say you as a developer know that one particular parameter value for is going to return a lot of results, leading to extremely inefficient index scans, and that the optimized-over-time, cached version of the query is probably going to do the wrong scans, taking an unnecessarily long time. Instead, the naïve "rebuild plan from scratch" approach would yield better performance. Now, if you're right about that (and second-guessing the query planner is a really bad idea on problems like this; the people who write it are almost always smarter than you), you have a few options:
The "correct" choice usually cited is to add RDBMS-specific "hints" to your extra-big query text instructing the database on which indexes to use, or to ignore cached plans. That might be a skill you don't have, or you might not know which ones it should use, just that the naïve approach is better.
There is usually a way to "flush" cached query plans from the database, but that is usually a very bad idea to do programmatically, may require administrator access, and very possibly compromises performance of the "fast" versions of the query running in parallel.
Instead of those, what I've seen a lot of people do is just defeat the plan cacher. Since the query plan cache is often keyed by parameterized text (e.g. somewhere in the RDBMS there's a hash table mapping "SELECT * FROM TABLE WHERE FIELD = '?'" to query plan objects), I have seen folks add pointless "1 = 1"-type clauses to their text, or just add "unique-ing" comments to the beginning. This has the advantage of leaving other cached plans alone and forcing a re-plan from scratch. It's also undeniably a bit of a hack, but a very useful one to be sure.
> I have seen folks add pointless "1 = 1"-type clauses to their text
I have seen something like that, too, and I have always wondered if that was just job protection or if there was some arcane reason to it.
The code in question was part of a view, though, so it ran unaltered countless of times. That view also contained numerous "1 <> 1" clauses, apparently to disable branches of the - rather intimidating WHERE clause. It was the one time I took over someone else's code and felt like swearing.
Let's assume you start with "SELECT * FROM ITEMS" like query. Now you're adding the WHERE clause. Do you keep track if you have the WHERE already present? Do you then keep track of if you need the "AND" or if it's the first entry that cannot take the "AND"?
With 1=1 you're guaranteed to have the WHERE clause and you're guaranteed to be able to use AND in your statement.
I agree that as a starting point for experimenting with an ad-hoc query, a "1 = 1" clause can make sense.
There were two reasons I was mad:
1. Using "1 <> 1" to disable a branch of a WHERE clause is not something I would have done. I would have commented it out. If you are used to this idiom, it might feel like the natural thing to do, but I think commenting it out makes the intention clearer.
2. I understand that queries can get a little convoluted when you start experimenting with them until you get the results you need. It is perfectly natural, and it is how the views I have written started out, as well. But before I put my views into production, I cleaned up the code, removed the parts I did not need or added a little comment to explain why the code was there.
Actually there were three:
3. The entire view was written in such a convoluted way that I was pretty sure the person who had written it was drunk at the time (he was an alcoholic, so this is probably true in the literal sense). When I inherited it, the view had about 700 lines and was mostly unreadable. When I was finished cleaning it up, I had reduced that to about 300 lines. I checked very carefully, but it still gave the exact same results. It still was not pretty, but if I could throw out more than 50% of the original code without affecting the result, you can probably imagine what a mess it was.
I think you hit the most likely reason for the "1 <> 1" clauses: it was basically "commenting out" branches of the query (or something programmatically generated the query in a way that needed to switch branches on and off without changing the query text so much that the query builder would have to understand more SQL syntax).
> It was the one time I took over someone else's code and felt like swearing.
Well, to be honest, I have taken over about four pieces of software over the years, and one of them was a one-and-a-half time maintenance job (adding one tiny little feature, later finding a bug that turned out to be not in the application after all).
I was exceedingly lucky, I guess, because except for that hairy SQL view, they were all gems of clarity.
To me, the big point was not even security, but the fact I no longer had to deal with properly escaping/cleaning messy input. Plus, somebody once told me parameterized queries can (potentially) give better performance. What's not to love?