The application development manager at Rackspace comments on his
experiences with PHP:
Memory leaks, inconsistent interfaces, inconsistent internal data model,
randomly freed objects, multiple object copies despite explicit use of
references, internal PHP errors, and untraceable code failures all but made
the task impossible to accomplish in PHP.
PHP is a nice language for some tasks. Lots of good software uses it. No
other language makes it so convenient to mix code and html, which is great
for lone web developers who are also programmers. I've found it pretty
useful for running my site, mainly because I can so easily put code in the
middle of my content, and keep the overall per-page authoring overhead down.
However, from a pure programming or information theory standpoint, it's got
some serious problems:
- Namespaces don't exist at all. (this is similar to keeping all
your files in one directory) There have been discussions about adding
namespaces, but the proposed separator is \? because "there isn't any
other character left"...
- Exceptions didn't exist until PHP5, and aren't implemented in a useful
"deep" fashion.
- Built-in and library APIs are a disorganized mess.
- There are thousands of symbols in the PHP namespace. Cleaner
languages only have a few dozen. "Everything is built in" just means
it has way too many functions in its core, especially since many are
minor variations of each other.
- No consistent naming convention is used. Some functions are
verb_noun() and others are noun_verb(). Some are underscore_separated,
while others are CamelCase or runtogether. Some are
prefixed_byModuleName, and others use a module_suffix_scheme.
Some use "to" and others use "2". And if you take a random set
of ten library functions, chances are half a dozen different
conventions will be included.
- PHP tends to use a lot of similar functions, instead of just one,
powerful one. For example, PHP has
sort(), arsort(), asort(),
ksort(), natsort(), natcasesort(), rsort(), usort(), array_multisort(),
and uksort(). For comparison, Python covers the functionality
of all of those with list.sort().
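As a rough sketch of the "one powerful function" idea inside PHP itself, a single comparator-driven call can stand in for several of the variants above. The comparator names and sample file names are made up for this illustration; usort(), strcmp(), and strnatcasecmp() are real built-ins.

```php
<?php
// Hypothetical comparators; each returns <0, 0, or >0, like strcmp().
function by_descending($a, $b) { return strcmp($b, $a); }
function by_natural_nocase($a, $b) { return strnatcasecmp($a, $b); }

$files = array("img12.png", "IMG10.png", "img2.png");

// Reverse byte order, roughly what rsort() does for plain strings.
$descending = $files;
usort($descending, 'by_descending');

// Case-insensitive "natural" order, the ordering natcasesort() uses
// (usort() renumbers the keys, natcasesort() keeps them).
$natural = $files;
usort($natural, 'by_natural_nocase');

echo implode(", ", $descending), "\n";
echo implode(", ", $natural), "\n";
```

The only thing that changes between the two calls is the comparator, which is the part the separate built-ins hard-code.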
- PHP includes lots of cruft or bloat. Do we really need a built-in
str_rot13() function? Also, a lot of other built-ins are just trivial
combinations of each other. Users don't really need case-insensitive
variants of every string function, since there is already a
strtolower().
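A small sketch of that point, assuming plain ASCII text: the case-insensitive variants give the same answers as lowercasing both arguments and calling the case-sensitive function. The sample strings are invented for the example.

```php
<?php
$haystack = "Hello World";
$needle   = "WORLD";

// Built-in case-insensitive variant.
$direct = stripos($haystack, $needle);

// The same search composed from strtolower() and the case-sensitive strpos().
$composed = strpos(strtolower($haystack), strtolower($needle));

// Likewise for comparison: strcasecmp() versus strcmp() on lowercased copies.
$same_order = (strcasecmp("Apple", "apple") === 0)
           && (strcmp(strtolower("Apple"), strtolower("apple")) === 0);

var_dump($direct, $composed, $same_order);
```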
- Many parts of PHP either deviate from standards, or otherwise don't
do what users would expect.
For example, exec() returns the last line of text output from a
program. Why not return the program's return value, like every other
language does? And further, when would it ever be useful to get only
the last line of output?
Another example: PHP uses non-standard date format characters.
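For what it's worth, exec() can also hand back the complete output and the exit status, but only through its optional second and third by-reference arguments. The sketch below assumes a Unix-like shell; the command and the variable names are made up for the example.

```php
<?php
$lines  = array();
$status = null;

// The return value is only the last line of output...
$last = exec('echo one; echo two; exit 3', $lines, $status);

// ...while $lines collects every line and $status receives the exit code.
echo "last line: ", $last, "\n";
echo "all lines: ", implode(",", $lines), "\n";
echo "exit code: ", $status, "\n";
```

So the information is available, just not where a programmer coming from another language would look for it first.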
- The language was generally thrown together without any coherent design,
and has accreted in a messy and complex fashion.
- Functions...
- Functions cannot be redefined. If I want a set of includes which
all use the same interface, I can only use one of them per page load --
there's no way to include a, call a.display(), then include b and
execute b.display(). I also cannot transparently wrap existing
functions by renaming/replacing them.
- Functions cannot be nested. (actually, they can, but it has the
same effect as if they were not. All functions are global, period.)
- Anonymous functions (lambda) don't exist. create_function() is
not the same thing. Given two strings, it compiles them
into code, binds the code to a new global function, and returns the
new function name as a string.
$foo = create_function('$x', 'echo "hello $x!";');
$bar = "\0lambda_1";
$bar("bar"); // sometimes prints "hello bar!", sometimes fails
Note that the number after "\0lambda_" is not predictable. It
starts at one and increments each time create_function is called.
The number keeps incrementing as long as the web server process is
running, and the counter is different in each server process. The
memory for these new global functions is not freed, either, so you
can easily run out of memory if you try to make lambdas in a loop.
- Functions are case insensitive.
- No "doc strings". Documentation must either be maintained
separately from the code, or by (rather finicky) 3rd-party code-level
documentation interpreters.
- The documentation...
- ... is often incorrect or incomplete, and finding relevant information
tends to require reading pages and pages of disorganized
user-contributed notes (which are incorrect even more often) to find
the details the documentation left out. Sometimes really important
details are left out, such as "this function is deprecated -- use foo()
instead".
- ... is (as of PHP 5.1.2) not included with the source, nor
typically installed along with the binary packages. Downloadable
documentation is available, but does not match the docs on PHP.net.
Specifically, it leaves out all the user-contributed notes, which are
important because of reasons mentioned above.
- ... is not built in. You can't just point an introspection tool at a
PHP module and get usage information from it.
These issues are important because it's not very feasible to use PHP
without referring to the documentation frequently. There is very
little internal consistency, and even less consistency between modules,
so you'll probably spend a lot of time looking through the docs.
Simply guessing how things work, based on conventions, usually doesn't
work in PHP.
- Defaults to pass-by-value. (PHP5 now defaults to pass-by-reference for
objects, though I'm not sure if it's "real" references or
reference-by-name.)
- Default error behavior is to send cryptic messages to the browser,
mid-page, instead of logging a traceback for the developer to
investigate.
- Many errors are silent.
For example, accessing a nonexistent variable simply returns nothing.
Whether this is a Bad Thing is debatable (I believe it's bad), but it can
nevertheless interact badly with some other aspects of PHP -- such as the
inconsistent case sensitivity (variables are sensitive, but functions are
not):
function FUNC() { return 3; }
$VAR = 3;
print func(); // produces "3"
print $var; // produces nothing
- The combination list/hash "array" type causes problems by
oversimplifying, often resulting in unexpected/unintuitive behavior.
For example, PHP's weak type system interferes with hash keys:
Code:
    $a = array("1" => "foo", 1 => "bar");
    echo $a[1], " ", $a["1"], "<br />\n";
    print_r($a);
Result:
    bar bar
    Array
    (
        [1] => bar
    )
After a little experimentation, I see that hash keys cannot be
functions, classes, floats, or strings which look like integers.
There are likely other invalid types as well. The only usable key
types I've found so far are integers, and strings that do not parse
as integers. (note that the parsing used here is different than
the automatic str-to-int coercion used for the "+" operator)
For details, see akey.php (source).
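A minimal sketch of that key conversion follows. The helper name stored_key_type() is made up; it builds a one-element array and reports the type PHP actually stored for the key. The last line illustrates that "+" coercion follows a different rule from key conversion.

```php
<?php
// Hypothetical helper: report the type PHP stores for a given array key.
function stored_key_type($key) {
    $probe  = array($key => true);
    $stored = array_keys($probe);
    return gettype($stored[0]);
}

foreach (array("1", "-5", "01", "1.5") as $key) {
    // Only strings in canonical decimal-integer form are turned into integers.
    printf("%-6s stored as %s\n", "\"$key\"", stored_key_type($key));
}

// "01" stays a string when used as a key, yet "+" happily treats it as the number 1.
var_dump("01" + 0);
```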
- Awkward / overlapping names can exist... foo and $foo are completely
unrelated.
- Magic quotes (and related mis-features) make data input needlessly
complex and error-prone. Instead of fixing vulnerabilities (such as
malformed SQL query exploits), PHP tries to mangle your data to avoid
triggering known flaws.
- The server-wide settings in PHP's configuration add a lot of
complexity to app code, requiring all sorts of checks and workarounds.
Instead of simplifying or shortening code (which the features are
supposed to do), they actually make the code longer and more complex,
since it must check to make sure each setting has the right value and
handle situations when the expected values aren't there.
- PHP's database libraries are among
the worst in any language. This is partially due to a lack of any
consistent API for different databases, but mostly because the database
interaction model in PHP is broken.
The SQL injection issues in PHP deserve particular attention. This
amusing exchange explains a bit better...
"How can it be that hard for web developers to check data before
it is submitted? I wouldn't imagine trusting the data that an anonymous user
can enter into my website.. so maybe I'm just trained to check data. Of course,
I'm also glad I use MySQL with PHP where a simple mysql_real_escape_string can
prevent any popular SQL Injection attempt."
"You're glad that you use pretty much the only language where
this is not done automatically for you, but which instead forces you to use a
function with a name like mysql_real_escape_string()? And that actually has a
similarly-named function without the "_real_" that doesn't do the job right?
Just kidding with that other one, here's the real one!"
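One way to avoid hand-escaping altogether is a parameterized query, where the value never becomes part of the SQL text. The sketch below assumes PDO (bundled since PHP 5.1) with the SQLite driver available; the users table, its contents, and find_users() are invented for the example.

```php
<?php
// In-memory SQLite database, used here only so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");

// Hypothetical lookup: the ? placeholder is bound by the driver, not spliced into the SQL.
function find_users($pdo, $name) {
    $stmt = $pdo->prepare('SELECT name FROM users WHERE name = ?');
    $stmt->execute(array($name));
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

$honest = find_users($pdo, 'alice');
$attack = find_users($pdo, "' OR '1'='1");   // treated as a literal (and unmatched) name

echo count($honest), " row(s) for the honest lookup\n";
echo count($attack), " row(s) for the injection attempt\n";
```

The injection string is compared against the name column as ordinary data, so no escaping function is involved at all.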
- The performance is crippled for commercial reasons (Zend). Free
optimizers are available, but aren't default or standard.
- Bad recursion support. Browse bug 1901 for an example and some details.
BTW, ever heard of tail recursion? They might have mentioned it in the
"Intro to Computer Science" course.
- Not thread safe.
- No Unicode support. It's planned for PHP 6, but that could be a long
time away.
- Vague and unintuitive automatic coercion; "==" is unpredictable, and "==="
does not solve all the problems caused by "==". According to the manual,
"==" returns true if the operands are equal, and "===" returns true if
the operands are equal and of the same type. But that's not entirely
true. For example:
Two different strings are equal... sometimes.
"1e1" == "10" => True
"1e1.0" == "10" => False
So, they're "equal and of the same type", right?
"1e1" === "10" => False
Unexpected results:
"1 two 3" == 1 => True
1.0 === 1 => False
"11111111111111111117" == "11111111111111111118" => False
Equality is (apparently) not transitive:
$a = "foo"; $b = 0; $c = "bar";
$a == $b => True
$b == $c => True
$a == $c => False
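The string-versus-string cases above appear to follow one rule: when both operands are numeric strings, "==" compares them as numbers; otherwise it compares them as text. A hedged sketch of that rule follows, using is_numeric() as an approximation of PHP's internal numeric-string check. The helper name is made up, and the third pair is an extra example not taken from the list above.

```php
<?php
// Hypothetical helper: guess which kind of comparison == performs on two strings.
function loose_compare_mode($a, $b) {
    return (is_numeric($a) && is_numeric($b)) ? "numeric" : "textual";
}

$pairs = array(
    array("1e1", "10"),      // both numeric, so 10.0 is compared with 10
    array("1e1.0", "10"),    // "1e1.0" is not a number, so the text is compared
    array("10", "10.0"),     // different text, same number
);

foreach ($pairs as $pair) {
    list($a, $b) = $pair;
    printf("%-8s == %-7s mode=%-8s result=%s\n",
        "\"$a\"", "\"$b\"", loose_compare_mode($a, $b), var_export($a == $b, true));
}
```

Comparing with "===" or strcmp() sidesteps the numeric path when both values are known to be strings.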
Further, the coercion rules change depending on what you're doing. The
behavior for "==" is not the same as used for "+" or for making hash keys.
"22 cream puffs" == "22 bullfrogs" => False
"12 zombies" + "10 young ladies" + "bourbon" == "22 cream puffs" => True
Even though math asserts that, if A minus B equals zero, then A must equal B,
PHP disagrees:
"bourbon" - "scotch" => 0
"bourbon" == "scotch" => False
- Variable scoping is strange, inconsistent, and inconvenient --
particularly the notably unusual "global" scope which gave rise to
kludges like "superglobal" or "autoglobal" as workarounds.
Further, variables cannot be scoped beyond global or function-local.
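A short sketch of the scoping behavior described above. The variable and function names are made up; the global is set through $GLOBALS so the example behaves the same wherever the file is included.

```php
<?php
$GLOBALS['site_name'] = "example";

function title_without_global() {
    // $site_name here is a fresh, unset local; the global is not visible.
    return isset($site_name) ? $site_name : "(unset)";
}

function title_with_global() {
    global $site_name;             // explicitly import the global into this scope
    return $site_name;
}

function title_with_superglobal() {
    return $GLOBALS['site_name'];  // superglobals are visible everywhere
}

echo title_without_global(), "\n";
echo title_with_global(), "\n";
echo title_with_superglobal(), "\n";
```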
- The mixture of PHP code with HTML markup tends to make code difficult
to read. Readability is important.
- Various "features" cause very unusual behavior and add complexity.
This tends to cause bugs for programmers who expect it to behave like
other languages.
For example, this will fail sporadically: Open a file. Write to
it. Close it. Open it. Read from the file. To make this
actually work, the programmer must A) know it will fail, B) have some
clue why it fails, and C) call the correct function
(clearstatcache()) before re-opening the file. Note that
the online docs aren't much help -- searching for "cache" takes the
viewer to the docs for cosh(), but returns nothing at all
related to files or caches.
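As a minimal sketch of where clearstatcache() comes in: PHP caches the results of stat-based functions such as filesize(), so a size read after a write can be out of date until the cache is cleared. The temporary file and variable names below are arbitrary, and whether the middle reading is actually stale may depend on the PHP version and platform.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'stat');

file_put_contents($path, 'abc');
$before = filesize($path);                       // 3 bytes; the stat result is now cached

file_put_contents($path, 'defgh', FILE_APPEND);  // the file is now 8 bytes on disk
$maybe_stale = filesize($path);                  // may still report the cached size

clearstatcache();                                // drop the cached stat information
$fresh = filesize($path);                        // re-reads the real size

unlink($path);

echo "before: $before, possibly stale: $maybe_stale, after clearing: $fresh\n";
```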
- It provides no way to log errors verbosely while displaying only critical
errors to the user. Further, some of the most critical errors (such as running
out of memory) give absolutely no response to the user -- not even a blank page.
- Poor security, and poor response to security issues. This is a large
and detailed topic, but regardless of whether it's caused by
inexperienced programmers or by PHP itself, the number of PHP-related
exploits is rather high. And according to a PHP security insider, the
effort is futile.
- Its object model is (still) very lacking, compared to other
systems.
- Most of the development since v3 seems to be devoted to damage control,
and dealing with earlier mistakes... not a good sign.
- In general, it has a tendency to create more problems than it solves.
I would not recommend using PHP, except as a template language for HTML.
It's very good at that, so long as you keep the complexity of related code
down. It's more powerful and (IMHO) more convenient than strict template
languages like TAL, but cannot compete with "normal" scripting languages like
Python, Perl, Ruby, and Lisp. PHP is a language optimized for a purpose,
at the expense of all other uses. It's very good at what it was originally
designed for, but has become stretched way too far since then.
This is waxing philosophical, but in my experience, PHP has an uncomfortably
low ceiling. Programming isn't just about putting one instruction after
another; it's about building abstractions to better represent and solve
problems. The more complex the problem, the higher the level of abstraction
needed to solve it cleanly. With PHP, I often hit my head on its low
ceiling of abstraction, and it seems to require a great deal more effort and
discipline (than in other languages) to avoid ducking down into the details
of implementation when I should be focusing on the upper-level design.