- Mysql(i)_real_escape_string prevents SQL injection
- Second order SQL injection
- Escaping user input to prevent SQL injection
- Mysqli is for the newbies while PDO is for the advanced folks
- Extensive use of the number of rows returned by a SELECT query
- If (isset($var) && !empty($var))
- Try and catch to echo an error
- HTTP_X_FORWARDED_FOR, HTTP_X_CLIENT_IP etc. to get the "real" IP
- You name it?
- Comments (2)
Initially, this site was launched as a place where I can treat PHP delusions: disprove numerous fallacies and superstitions circulating in the PHP community. Eventually I learned that it's not enough to disclose a misconception or a wrong practice - the right way should be shown as well, so for the time being I switched to writing educational articles. But seeing the same delusions expressed again and again on Stack Overflow or other forums around the Net, I decided to compile a sort of Top 10 list, the most frequent PHP delusions. Some of them are quite important, whereas others are but nitpicks. But they all indicate a cargo cult programming, a practice then someone mindlessly uses a code due to common belief/practice but don't really get the idea of it.
Mysql(i)_real_escape_string prevents SQL injection
I'd say that up to this day most PHP folks still believe in this nonsense, all thanks to a stupid note from the PHP manual: "This function must always (with few exceptions) be used to make data safe". "To make data safe", my foot!
In reality this function has absolutely nothing to do with safety or injections, merely escaping special characters in SQL string literals, making them immune to SQL injection as a side effect, but being utterly useless for any other query part, from a table name to a numeric literal. To protect from SQL injection, one should follow two simple rules:
- Any variable data literal (i.e. a string or a number) should be substituted with a parameter, whereas actual value should be sent to the query separately, through bind/execute process.
- All other query parts that happen to be added through a variable, should be explicitly filtered through a hardcoded list of allowed values.
Second order SQL injection
Many people take second order injection wrong, thanks to this nonsense answer on Stack Overflow. It states that a prepared statement protects from the 1st order but not from 2nd. Which is a complete rubbish.
In reality, it's quite the opposite: second order SQL injection happens only if you neglect a prepared statement. It happens when you are foolish enough to separate the sheep from the goat, to decide whether some certain data is safe or not, and therefore whether to use a prepared statement or not. Eventually such an approach will lead to second order injection. Whereas if you're using prepared statements all over the place, no 2nd or 99nd order injection is ever possible.
Escaping user input to prevent SQL injection
A nasty one. OWASP up to this day lists it as one of its recommendations, let alone hordes of PHP users reciting this sermon on a daily basis. Whereas you can tell from the above, it's a complete rubbish, in the meaning both parts are wrong:
- escaping. As we learned above, escaping has nothing to do with safety or injections. What we need is a prepared statement, not "escaping" for the data (and a white list for the everything else)
- user input is a vague and misleading term, implying that only what is coming from the user side should be treated somehow. But as we learned above, it's a nasty delusion that could lead to the 2nd order SQL injection. There are many reasons why there should never be such a term as "user input" in regard of SQL injection:
- in a big layered application, a database layer just have no idea what the data source is, and have no means to tell "safe" data from "unsafe". All data should be processed uniformly.
- protection from SQL injection also protects from syntax errors as well, so there is again not a single reason to avoid a prepared statement for the input that is from the super-trusted source.
- there is a second order SQL injection when the data technically comes from within your own application
- user input is a vague term that anyone treats differently, according to their knowledge. There is a famous question On Stack Overflow based on the impression that a hardcoded form value is not from the user but from the server. Go figure how many other false interpretations this notorious term could have.
Mysqli is for the newbies while PDO is for the advanced folks
In reality it's the opposite. Given the main difference between the old mysql ext and mysqli is support for the prepared statements AND a whole lot of trouble when using prepared statements with mysqli (one can't even get a usual assoc array or num rows from a prepared statement without a certain trick) on one hand; and consistent, predictable and useful API that PDO offers on the other hand, mysqli should be only recommended for someone who certainly knows what are they doing and have a decent experience in programming.
Extensive use of the number of rows returned by a SELECT query
It can be seen in almost every PHP script, yet every time you have an idea to use it, it's either superfluous or harmful.
And for the only case when it can be of any use, it is not available at all.
- it is often used by newbies to get the number of rows matching some criteria. That's an awful idea that makes your database to actually select all the matched rows from the table and send them to PHP, along with this silly number. The right way would be to ask a database to return the only count, by issuing a
- it is used to see whether a query returned any data. Which is pointless as there is always the data itself:
- if you're selecting only one row, then just fetch that row and use it in any condition that's intended to tell whether there is any result or not
- in case you are expecting multiple rows, then fetch them all into an array and then use this array exactly the same way as above. If, for some reason, you need the actual number of rows returned, you can use
count()against this array, but I cannot think of any useful application for such a number.
- the only case I can think of where it can be of some use is when the amount of rows returned is so big that it would be unwise to store them all in array. But in such a case you are bound to use unbuffered query, and therefore the number of rows is simply unavailable.
This is a nasty one, still can be seen in too much code snippets. To prevent PHP from printing errors on-screen, people often disable error reporting at all. In fact, for the newly written code error reporting should be always E_ALL, whereas to prevent errors from appearing in the browser, one needs to set another configuration option:
and let PHP to log error instead. Further reading: Error reporting in PHP
If (isset($var) && !empty($var))
Is essentially a grammar tautology that reads as
if (isset($var) && isset($var) && $var). The full explanation is here: Do you really need to check for both isset() and empty() at the same time?
Try and catch to echo an error
Catching an exception only to report it is a sure overkill. Simply because uncaught exception is a fatal error already, and it will be reported by itself. Without that blunt try/catch/die sequence, making your code much cleaner.
Sadly, the PHP manual is especially bad at it, showing such examples all over the place. On the one hand, it is understandable as such an example is guaranteed to output an error message, regardless of the user settings. On the other hand, this when this approach gets mindlessly copy-pasted into the live code, it turns out to be superfluous, user-unfriendly and harmful .
HTTP_X_FORWARDED_FOR, HTTP_X_CLIENT_IP etc. to get the "real" IP
Just read that old story from Anthony Ferrara and never trust anything other than
REMOTE_ADDR in security matters.
TL;DR: everything beside
REMOTE_ADDR is just an HTTP header that can be easily spoofed by anyone.
There is an exception from this rule, though: if your PHP script is behind the trusted proxy, and you positively know which particular HTTP header/env variable set by that proxy contains an external IP, you can use that variable for sure. Better yet, configure your proxy to inject the remote IP directly into REMOTE_ADDR (i.e.
fastcgi_param REMOTE_ADDR $http_x_real_ip; for nginx).
You name it?
The above list is highly subjective, compiled from what I see every day on Stack Overflow. If you know a fallacy or a superstition widely shared in the community, please leave a comment below.