PHP floats, localization and landmines
Since PHP lacks a decimal type, it only has floats and integers. Floating point numbers are reasonable once you stop expecting them to be exact. I’ve learned to deal with PHP’s floats, and floats in general. However, floats behaving totally differently depending on the current locale was something I didn’t expect. PHP has this horribly wonderful feature where calling setlocale() can butcher floats and turn otherwise sane SQL and casting expressions into an exercise in futility. Take the following:
$pi = 3.141593;
echo (float)(string)$pi; // should output '3.141593'
Now suppose you’ve used setlocale() to change your locale to ‘pl_PL’ because you’re building a site in Polish. The exact same code now behaves totally differently. Instead of the useful pi you are used to, you get 3. That’s right, no decimals for you. It’s very odd and unexpected that float/string casts do not round-trip: the cast to string writes the Polish decimal comma, giving '3,141593', while the cast back to float only understands a dot and stops parsing at the comma. Even worse is what happens if you try to run some SQL with what you think is a float.
$value = (float)'3.141593';
Interpolate $value into an INSERT statement and you’ll end up with a nice fat error like:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘(pi) VALUES (3,141593)
Having done a float cast, I expected the value to be safe to insert. Getting a SQL error back makes using setlocale() extremely dangerous if you intend to do anything with floats. Having floats magically mutate into SQL-destroying values is really annoying. It also creates difficult-to-debug issues, because they are caused by a magical global runtime property being changed somewhere else. I’m sure there was a good reason for this implementation, but I really question its usefulness when perfectly safe code transforms into SQL-destroying, logic-defying craziness because of a simple function call.
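One way to sidestep the problem when building SQL by hand, as a minimal sketch: format the float explicitly with sprintf()'s %F conversion, which the PHP manual documents as non-locale-aware. The helper name sqlFloat() and the table name numbers are made up for illustration and are not from the code above. On PHP 8.0 and later the float-to-string cast itself no longer follows the locale, so this mainly matters on older versions.

```php
<?php
// Hypothetical helper (not part of the original post): render a float
// with a dot as the decimal separator, whatever LC_NUMERIC is set to.
function sqlFloat($value, $decimals = 6)
{
    // %F is the non-locale-aware float conversion of sprintf().
    return sprintf('%.' . (int)$decimals . 'F', (float)$value);
}

$pi = 3.141593;

// "numbers" is an assumed table name, used only for this example.
$sql = 'INSERT INTO numbers (pi) VALUES (' . sqlFloat($pi) . ')';

echo $sql, "\n"; // INSERT INTO numbers (pi) VALUES (3.141593)
```

The number of decimals is fixed by the caller, so pick one that matches the column definition.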
OK, I get how annoying it is, but it sort of makes sense to me.
The thing is that your locale sets how your user represents numbers. Now you, in Canada, write your decimals using a dot. In most Latin countries (at least the Spanish- and Portuguese-speaking ones), the comma is used to separate the decimals.
When you set your locale you are telling PHP how YOU think a number is represented. Ergo, for Spanish:
$pi = '3,14159'
is a valid float number, while:
$pi = '3.14159'
is not. Just as PHP will show you a real float ($float = 3.14159) as a “Latin” float ($pi = "3,14159"), when you do the string -> float conversion, you are expected to specify the number in the current locale format. Now, internally, PHP ALWAYS understands the dot as a decimal separator.
You can find the same misunderstandings when working with timezones. Your MySQL is in a fixed timezone (say GMT). Now Mark logs in from New York, so the application timestamps are shown as GMT-5. When Mark enters a time, you have to convert it from his timezone to the MySQL timezone. When you are fetching timestamps from MySQL, you have to convert them to his timezone.
Mariano Iglesias on 8/19/10
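The timezone round trip described in the comment above could look roughly like this, as a sketch using PHP's DateTime and DateTimeZone classes. The zone names and the sample timestamp are assumptions chosen for the example.

```php
<?php
// Assumed zones for illustration: the user is in New York, storage is UTC.
$userZone    = new DateTimeZone('America/New_York');
$storageZone = new DateTimeZone('UTC');

// The user enters a wall-clock time in their own timezone.
$entered = new DateTime('2010-01-15 09:30:00', $userZone);

// Convert to the storage timezone before writing to the database.
$stored = clone $entered;
$stored->setTimezone($storageZone);
echo $stored->format('Y-m-d H:i:s'), "\n"; // 2010-01-15 14:30:00

// Convert back to the user's timezone when displaying the stored value.
$shown = new DateTime($stored->format('Y-m-d H:i:s'), $storageZone);
$shown->setTimezone($userZone);
echo $shown->format('Y-m-d H:i:s'), "\n"; // 2010-01-15 09:30:00
```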
The reason is likely the fact that European countries might use the comma as a delimiter for numbers with decimals in them.
Jose Diaz-Gonzalez on 8/19/10
maybe you can create a wrapper/helper class for all of your float casting needs, and that class can check for those oddities and return the valid format.
i.e.
good tip though since most would not think of this behavior happening at first!
jblotus on 8/19/10
sorry that comment is all kinds of screwed up in the formatting department, but the idea was to create a wrapper to handle these things.
jblotus on 8/19/10
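The code from the comment above was lost, so the following is not jblotus's code. It is only a sketch of what such a wrapper might look like, with a made-up function name toFloat(). It assumes the only difference from PHP's native format is the decimal separator, and it reads that separator from localeconv() unless the caller passes one in.

```php
<?php
// Hypothetical wrapper: cast to float after normalising the decimal
// separator to a dot, which is the only form the (float) cast parses.
function toFloat($value, $decimalPoint = null)
{
    if ($decimalPoint === null) {
        // Ask the C library which decimal separator the current locale uses.
        $info = localeconv();
        $decimalPoint = $info['decimal_point'];
    }
    // (string) on a float may already contain the locale's separator on
    // older PHP versions, so normalise both strings and floats the same way.
    $normalized = str_replace($decimalPoint, '.', trim((string)$value));
    return (float)$normalized;
}

var_dump(toFloat('3,141593', ',')); // float(3.141593)
```

Thousands separators are deliberately ignored here; input such as '1.234,5' would need extra handling.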
Sounds a bit irritated by #1029.
Michael Clark on 8/19/10
Mariano: I understand that setting locales modifies how you want floats to be represented. But when the language modifies itself and breaks common and basic things like casting because of locale settings, I get twitchy.
(float)(string)$value;
should always work the same regardless of the locale, which it currently doesn’t.
Michael Clark: Yeah, because it’s a totally silly way for a language to work :)
mark story on 8/19/10
had this problem a few days ago and it’s indeed very unexpected but also makes sense.
as the part of the code had no i18n needs, i simply solved it by setting the locale back to ‘eng’ for this part of the code.
thanks for the article!
Martin on 11/1/10
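A sketch of the approach described in the comment above: switch the numeric locale for one piece of code and restore it afterwards. It uses the portable 'C' locale rather than ‘eng’, and the helper name withNumericLocaleC() is invented for this example. It needs PHP 5.5 or later because of the finally block.

```php
<?php
// Hypothetical helper: run $callback with LC_NUMERIC set to "C",
// then put the previous numeric locale back.
function withNumericLocaleC(callable $callback)
{
    // Passing '0' queries the current setting without changing it.
    $previous = setlocale(LC_NUMERIC, '0');
    setlocale(LC_NUMERIC, 'C');
    try {
        return $callback();
    } finally {
        if ($previous !== false) {
            setlocale(LC_NUMERIC, $previous);
        }
    }
}

$pi = 3.141593;
$text = withNumericLocaleC(function () use ($pi) {
    return (string)$pi; // the cast runs under the "C" numeric locale
});

echo $text, "\n"; // 3.141593
```

Only LC_NUMERIC is touched, so date and currency formatting elsewhere keep the site's locale.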
@
echo “
greg on 12/2/10
I’m sorry. Your blog messed up my code; can you delete it? This is what I wanted to show you: http://bin.cakephp.org/view/1670706506
and the output (on PHP Version 5.2.10-2ubuntu6.5) is http://bin.cakephp.org/view/540688017
It looks like settype knows how to deal with the locale.
greg on 12/2/10