To cast or not to cast? - mysql

I am developing a system using MySQL queries written by another programmer, and am adapting his code.
I have three questions:
1.
One of the queries has this select statement:
SELECT
[...]
AVG(mytable.foo, 1) AS 'myaverage',
Is the 1 in AVG(mytable.foo, 1) AS 'myaverage' legitimate? I can find no documentation to support its usage.
2.
The result of this gives me average values to 2 decimal places. Why?
3.
I am using this to create a temp table. So:
(SELECT
[...]
AVG(`mytable`.`foo`, 1) AS `myaverage`,
FROM
[...]
WHERE
[...]
GROUP BY
[...])
UNION
(SELECT
[...]
FROM
[...]
WHERE
[...]
GROUP BY
[...])
) AS `tmptable`
ORDER BY
`tmptable`.`myaverage` DESC
When I sort the table on this column I get output which indicates that this average is being stored as a string, so the result is like:
9.3
11.1
In order to get around this what should I use?
Should I be using CAST or CONVERT, as DECIMAL (which I read is basically binary), BINARY itself, or UNSIGNED?
Or, is there a way to state that myaverage should be an integer when I name it in the AS statement?
Something like:
SELECT
AVG(myaverage) AS `myaverage`, INT(10)
Thanks.

On your last question: can you post the exact MySQL query that you are using?
The result type of a column from a UNION is determined by the values of that column in all the SELECT statements, not only the first one. See http://dev.mysql.com/doc/refman/5.0/en/union.html .
So, even if your AVG() function returns a DOUBLE, the other part of the UNION may still return a string. In which case the column type of the result will be a string.
See the following example:
mysql> select a from (select 19 as a union select '120') c order by a;
+-----+
| a |
+-----+
| 120 |
| 19 |
+-----+
2 rows in set (0.00 sec)
mysql> select a from (select 19 as a union select 120) c order by a;
+-----+
| a |
+-----+
| 19 |
| 120 |
+-----+
2 rows in set (0.00 sec)
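The same ordering effect can be reproduced outside MySQL. The following Python sketch is only an illustration (it does not talk to a database, and the variable names are made up here); it sorts the two values from the question in descending order, first as strings and then as numbers:

```python
# Values as they come back when the UNION column ends up typed as a string.
averages = ["11.1", "9.3"]

# String ordering compares character by character, so "9..." sorts above "1...".
as_strings_desc = sorted(averages, reverse=True)

# Numeric ordering compares magnitudes, which is what the asker expects.
as_numbers_desc = sorted(averages, key=float, reverse=True)

print(as_strings_desc)  # ['9.3', '11.1']
print(as_numbers_desc)  # ['11.1', '9.3']
```

The first result matches the unexpected order shown in the question, which is consistent with the column being compared as a string.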

Just for anyone who's interested, I must have deleted or changed my predecessor's code, so this AVG question was incorrect. The correct code was ROUND(AVG(myaverage),1). Apologies to those who scratched their heads over my stupidity.

on 1.
AVG() accepts exactly one argument, otherwise MySQL will raise an error:
mysql> SELECT AVG( id, 1 ) FROM anytable;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ' 1 )' at line 1
http://dev.mysql.com/doc/refman/5.1/en/group-by-functions.html#function_avg
Just because I'm curious - what should the second argument do?

Related

MySQL allows subqueries without IN or comparison operator in WHERE clauses. How does it work?

EDIT: I wrongly simplified the queries in the original version. This query is what I meant.
SELECT *
FROM A
LEFT JOIN B ON A.seq = B.seq
WHERE (select max(B.data) where B.data is not null group by B.seq);
I'll leave the wrong version, since it also helped me understand scalar subqueries.
My coworker asked why this statement works in MySQL.
SELECT *
FROM anytable
WHERE (SELECT 'asdf')
This returns 0 rows. I simplified the query, but basically, the where clause contains a subquery without IN or comparison operator; just itself. We expected it would throw an error about SQL syntax, just like SQL Server. But MySQL didn't.
Interestingly, if I change the query as below, it returns all rows from anytable.
SELECT *
FROM anytable
WHERE (SELECT 1);
I couldn't find documentation about this. How does this work?
This answers the original version of the question.
When you do:
WHERE (SELECT 'asdf')
You have a scalar subquery. This is equivalent to:
WHERE 'asdf'
Now, MySQL treats boolean values as integers and vice-versa, with 0 for false and anything else as true. MySQL does not treat strings as booleans, so it decides to convert the value to a number, using implicit conversion.
The implicit conversion uses only the leading digits of the string. There are none in 'asdf', so the value is converted to 0. Voila! It is treated as false.
When you use WHERE (SELECT 1) the same logic holds, except the value is already a number. And it is 1, so it is treated as true.
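The logic above can be approximated in a few lines of Python. This is a rough sketch, not MySQL's actual implementation: the helper names to_number and where_keeps_row are hypothetical, and the regular expression only models the "keep the leading numeric prefix, otherwise 0" rule described here.

```python
import re

# Hypothetical helper: approximates MySQL's implicit string-to-number
# conversion by keeping only the leading numeric prefix of the string.
_NUMERIC_PREFIX = re.compile(r"\s*[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

def to_number(value):
    if isinstance(value, (int, float)):
        return float(value)
    match = _NUMERIC_PREFIX.match(value)
    # No leading digits at all means the string converts to 0.
    return float(match.group(0)) if match else 0.0

def where_keeps_row(value):
    # A WHERE condition keeps the row when the value is non-zero.
    return to_number(value) != 0

print(where_keeps_row("asdf"))  # False
print(where_keeps_row(1))       # True
print(where_keeps_row("1abc"))  # True
```

Under these assumptions, WHERE (SELECT 'asdf') filters every row out and WHERE (SELECT 1) keeps every row, matching what the question observed.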
SELECT *
FROM A
LEFT JOIN B ON A.seq = B.seq
WHERE (select max(B.data) where B.data is not null group by B.seq);
That can't be your real query. It is not valid SQL. This is what I get when I test it:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'where B.data is not null group by B.seq)' at line 4
No doubt because it's not legal syntax to use a where clause or any clause that follows it if you don't have a from clause to define a table reference.
The test I did above was in MySQL 5.6, and I confirmed it returns the same error when I run the query in MySQL Workbench.
However, MySQL 8.0 supports new syntax. You can use WHERE and other clauses without a FROM clause. I didn't realize that before.
I apologize for being condescending, accusing you of not testing for errors.
Here's a demo:
mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.17 |
+-----------+
1 row in set (0.00 sec)
mysql> select 42;
+----+
| 42 |
+----+
| 42 |
+----+
1 row in set (0.00 sec)
mysql> select 42 where true;
+----+
| 42 |
+----+
| 42 |
+----+
1 row in set (0.00 sec)
mysql> select 42 where false;
Empty set (0.00 sec)
MySQL always supported selecting a single row by specifying a set of fixed expressions in the select-list. Now with the new syntax, you can make that single row stay or be filtered out, depending on whether the WHERE clause is true or not.
In your query, the subquery is evaluated for each row of the outer query, so columns B.data and B.seq in the subquery have a single value each time the subquery runs. Just as if you had done a constant expression. The GROUP BY is irrelevant, since there's only one value for B.seq anyway.
Gordon Linoff is correct that the WHERE clause of the outer query will be either false if max(B.data) is 0 or else true if it's non-zero. If the subquery's WHERE clause is false, then the subquery will return no rows, which acts the same as if it had returned false.
mysql> SELECT * FROM A LEFT JOIN B ON A.seq = B.seq WHERE (select true where true);
+-----+------+------+
| seq | seq | data |
+-----+------+------+
| 1 | 1 | data |
+-----+------+------+
1 row in set (0.00 sec)
mysql> SELECT * FROM A LEFT JOIN B ON A.seq = B.seq WHERE (select true where false);
Empty set (0.00 sec)
Or if max(b.data) returns a value that is interpreted as a zero/false, it also makes the subquery evaluate to false, so the outer query returns no rows.
mysql> SELECT * FROM A LEFT JOIN B ON A.seq = B.seq WHERE (select 0 where true);
Empty set (0.00 sec)
mysql> SELECT * FROM A LEFT JOIN B ON A.seq = B.seq WHERE (select 'abc' where true);
Empty set, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+-----------------------------------------+
| Level | Code | Message |
+---------+------+-----------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'abc' |
+---------+------+-----------------------------------------+

Select all the (varied number of) rows with the latest date

A MySQL database with data as follows:
project_id | updated | next_steps
1 | 2014-08-01 03:19:20 | new_com
2 | 2014-08-12 03:20:34 | NULL
3 | 2014-08-12 07:01:12 | NULL
4 | 2014-08-05 09:25:45 | comment
I want to select all the rows with the latest date in the 'updated' column. The difference in hours/minutes should be ignored. I expect to get rows 2 and 3 from this example, as follows:
2 | 2014-08-12 03:20:34 | NULL
3 | 2014-08-12 07:01:12 | NULL
Of course, for the real table, the number of rows meeting my criteria changes daily and could be 100, 200, 324, etc. (it is not a fixed number). I have tried the following code and always get errors.
SELECT * FROM `table` WHERE updated LIKE %DATE(MAX(updated))%;
or
SELECT * FROM `table` WHERE updated LIKE %CAST(DATE(MAX(updated)) AS CHAR)%;
Error message is
"#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '%CAST(DATE(MAX(updated)) AS CHAR)% LIMIT 0, 30' at line 1"
Use SELECT MAX(DATE(updated)) FROM `table` as a subquery; for the sample data it returns 2014-08-12. For example:
SELECT * FROM `table` WHERE DATE(updated) = (SELECT MAX(DATE(updated)) FROM `table`)
The subquery returns the latest date, and the outer query then selects every row that was updated on that date.
You need to use a WHERE clause with DATE(x): calculate the maximum date without the time part, then select all rows whose date (again without the time part) equals it.
Try this:
SELECT * FROM `table` WHERE DATE(`updated`) = (SELECT MAX(DATE(`updated`)) FROM `table`)
And if you still want them ordered
SELECT * FROM `table`
WHERE DATE(`updated`) = (SELECT MAX(DATE(`updated`)) FROM `table`) ORDER BY `updated` DESC
Happy Coding!
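To make the two steps of that query explicit, here is a small Python sketch that mirrors them on the sample rows from the question. It only illustrates the logic (no database is involved), and the names rows, latest_day and latest_rows are made up for this example.

```python
from datetime import datetime

# Sample rows copied from the question: (project_id, updated, next_steps).
rows = [
    (1, datetime(2014, 8, 1, 3, 19, 20), "new_com"),
    (2, datetime(2014, 8, 12, 3, 20, 34), None),
    (3, datetime(2014, 8, 12, 7, 1, 12), None),
    (4, datetime(2014, 8, 5, 9, 25, 45), "comment"),
]

# Inner step, like SELECT MAX(DATE(updated)): the latest calendar day.
latest_day = max(updated.date() for _, updated, _ in rows)

# Outer step, like WHERE DATE(updated) = latest_day ORDER BY updated DESC.
latest_rows = sorted(
    (row for row in rows if row[1].date() == latest_day),
    key=lambda row: row[1],
    reverse=True,
)

print(latest_day)                        # 2014-08-12
print([row[0] for row in latest_rows])   # [3, 2]
```

However many rows share the latest day, all of them are kept, which is the point of comparing on the date alone instead of using a fixed LIMIT.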
If you want just two rows:
SELECT * FROM `table` ORDER BY `updated` DESC LIMIT 2

mysql strange behavior when inserting data

When inserting data to mysql via the phpmyadmin page, or via python I've seen something I can't explain:
cur.execute("INSERT INTO 28AA507A0500009E (timestamp, temp) VALUES ('2014-01-04 15:36:30',24.44)")
cur.execute("INSERT INTO 28D91F7A050000D9 (timestamp, temp) VALUES ('2014-01-04 15:36:30',24.44)")
cur.execute("INSERT INTO `28012E7A050000F5` (timestamp, temp) VALUES ('2014-01-04 15:36:30',24.44)")
Notice the last entry with the ` around the table name.
The first 2 entries work fine without the backticks.
I can also put backticks around all the table names and it still works.
Why can I remove the backticks from the first 2 lines, and not the 3rd one?
The tables are all created equally.
Edit 1:
With due respect to the following comments:
Your explanation is not entirely accurate. There is no alias in
the INSERT statement. I think that the part of the identifier after
28012E7 is just discarded as MySQL tries convert the identifier to
an integer value! – ypercube
these are table names not column names. – Sly Raskal
Well, MySQL sure has discarded part of the table name identifier. My intention was to show how an identifier is interpreted when the system cannot find it in the list of accessible table names (I chose column/expression names in my examples). As the engine interpreted it as a valid number rather than as an identifier for a table, it threw an exception.
And I chose SELECT to clarify why the table identifier was rejected when it was not put in back quotes. Because it represents a number, it can't be used as an identifier directly, but must be surrounded with back quotes.
MySQL allows an alias to follow directly after a numeric literal, a parenthesized expression, or a string literal. Surprisingly, the space between them is optional.
In your case, 28012E7A050000F5 is read as the valid exponent form (28012E7) of the number 280120000000, suffixed with the alias A050000F5. Hence 28012E7A050000F5 can't be used as an identifier without back quotes. See the following observations:
mysql> -- select 28012E7 as A050000F5;
mysql> select 28012E7A050000F5;
+--------------+
| A050000F5 |
+--------------+
| 280120000000 |
+--------------+
1 row in set (0.00 sec)
Following are some valid examples:
mysql> -- select ( item_count * price ) as v from orders;
mysql> select ( item_count * price )v from orders;
+-----+
| v |
+-----+
| 999 |
+-----+
1 rows in set (0.30 sec)
mysql> -- select ( 3 * 2 ) as a, 'Ravinder' as name;
mysql> select ( 3 * 2 )a, 'Ravinder'name;
+---+----------+
| a | name |
+---+----------+
| 6 | Ravinder |
+---+----------+
1 row in set (0.00 sec)
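Why only the third table name is affected can be checked with a short Python sketch. It rests on a simplified assumption about the tokenizer, namely that leading digits followed by E, an optional sign and at least one more digit are read as a floating-point literal; the function name split_number_prefix is hypothetical.

```python
import re

# Simplified assumption about the lexer: digits followed by 'e'/'E', an
# optional sign and at least one digit are read as a floating-point literal.
_FLOAT_PREFIX = re.compile(r"(\d+[eE][+-]?\d+)(.*)")

def split_number_prefix(name):
    """Return (number, rest) if name starts like a float literal, else None."""
    match = _FLOAT_PREFIX.fullmatch(name)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)

# The three table names from the question.
for table in ("28AA507A0500009E", "28D91F7A050000D9", "28012E7A050000F5"):
    print(table, split_number_prefix(table))
```

Only 28012E7A050000F5 splits into a number (28012E7) plus a trailing A050000F5; in the other two names the leading digits are followed by A or D, so under this assumption they are read as ordinary identifiers and work without back quotes.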

Mysql Like + Wild Card vs Equals Operator

I recently just fixed a bug in some of my code and was hoping someone could explain to me why the bug occurred.
I had a query like this:
SELECT * FROM my_table WHERE my_field=13
Unexpectedly, this was returning rows where my_field was equal to either 13 or 13a. The fix was simple, I changed the query to:
SELECT * FROM my_table WHERE my_field='13'
My question is, is this supposed to be the case? I've always thought that to return a similar field, you would use something like:
SELECT * FROM my_table WHERE my_field LIKE '13%'
What is the difference between LIKE + a Wild Card vs an equals operator with no quotes?
This statement returns rows for my_field = '13a':
SELECT * FROM my_table WHERE my_field=13
This is because MySQL performs type conversion from string to number during the comparison, turning '13a' into 13. More on that in this documentation page.
Adding quotes makes the literal a string instead of an integer, so MySQL performs only a string comparison. Obviously, '13' cannot be equal to '13a'.
The LIKE clause always performs string comparison (unless either one of the operands is NULL, in which case the result is NULL).
My guess would be that since you didn't enclose it in quotes, and the column was a char/varchar column, MySQL tried to do an implicit conversion of the varchar column to an int.
If one of the rows in that table contained a value that couldn't be converted to a number, MySQL would not raise an error in a SELECT; it would treat that value as 0 and issue a warning. Also, because of the conversion, any index you might have on that column would not be used.
This has to do with types and type conversion. With my_field=13, 13 is an integer, while my_field is in your case likely some form of text/string. In such a case, MySQL will try to convert both to floating-point numbers and compare those.
So MySQL tries to convert e.g. "13a" to a float, which will be 13, and 13 = 13.
In my_field = '13' , both operands are text and will be compared as text using =
In my_field like '13%' both operands are also text and will be compared as such using LIKE, where the special % means a wildcard.
You can read about the type conversion mysql uses here.
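The three comparisons can be contrasted side by side in Python. This is a sketch under stated assumptions: numeric_prefix is a hypothetical helper that only approximates MySQL's string-to-number conversion, the sample values are invented, and collation details such as case-insensitivity are ignored.

```python
import re

def numeric_prefix(text):
    # Approximation of MySQL's string-to-number conversion: keep the
    # leading digits (with optional fraction), or fall back to 0.
    match = re.match(r"\s*[+-]?\d+(\.\d*)?", text)
    return float(match.group(0)) if match else 0.0

# Invented sample contents of a text column.
values = ["13", "13a", "130", "abc"]

# my_field = 13        -> both sides compared as numbers
equals_number = [v for v in values if numeric_prefix(v) == 13]
# my_field = '13'      -> plain string comparison
equals_string = [v for v in values if v == "13"]
# my_field LIKE '13%'  -> string prefix match
like_prefix = [v for v in values if v.startswith("13")]

print(equals_number)  # ['13', '13a']
print(equals_string)  # ['13']
print(like_prefix)    # ['13', '13a', '130']
```

The unquoted comparison matches '13a' but not '130', whereas LIKE '13%' matches both, so the two are not interchangeable.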
This is because the MySQL type conversion works this way. See here: http://dev.mysql.com/doc/refman/5.0/en/type-conversion.html
It raises a warning as well. See the code below:
mysql> select 12 = '12bibo';
+---------------+
| 12 = '12bibo' |
+---------------+
| 1 |
+---------------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+--------------------------------------------+
| Level | Code | Message |
+---------+------+--------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: '12bibo' |
+---------+------+--------------------------------------------+
1 row in set (0.00 sec)
Looks like someone raised a bug as well: http://bugs.mysql.com/bug.php?id=42241

ActiveRecord / MySQL Select Condition Comparing String Components

I have a string that is defined as one or more dot-separated integers like 12345, 543.21, 109.87.654, etc. I'm storing values in a MySQL database and then need to find the rows that compare with a provided value. What I want is to select rows by comparing each component of the string against the corresponding component of the input string. With standard string comparison in MySQL, here's where this breaks down:
mysql> SELECT '543.21' >= '500.21'
-> 1
mysql> SELECT '543.21' >= '5000.21'
-> 1
This is natural because the string comparison is a "dictionary" comparison that doesn't account for string length, but I want a 0 result on the second query.
Is there a way to provide some hint to MySQL on how to compare these? Otherwise, is there a way to hint to ActiveRecord how to do this for me? Right now, the best solution I have come up with is to select all the rows and then filter the results using Ruby's split and reject methods. (The entire data set is quite small and not likely to grow terribly much for the foreseeable future, so it is a reasonable option, but if there's a simpler way I'm not considering I'd be glad to know it.)
You can use REPLACE to remove dots and CAST to convert string to integer:
SELECT CAST(REPLACE("543.21", ".", "") AS SIGNED) >= CAST(REPLACE("5000.21", ".", "") AS SIGNED)
mysql> SELECT '543.21' >= '5000.21';
+-----------------------+
| '543.21' >= '5000.21' |
+-----------------------+
| 1 |
+-----------------------+
1 row in set (0.00 sec)
mysql> SELECT '543.21'+0 >= '5000.21'+0;
+---------------------------+
| '543.21'+0 >= '5000.21'+0 |
+---------------------------+
| 0 |
+---------------------------+
1 row in set (0.00 sec)
This indeed only works for valid floats. Doing it for more than one dot would require a lot of comparisons of SUBSTRING_INDEX(SUBSTRING_INDEX(field, '.', <position number you're comparing>), '.', -1), repeated manually for the maximum number of positions you are comparing.
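Since the question mentions filtering on the application side as a fallback, here is a minimal sketch of the component-wise comparison itself. It is written in Python rather than Ruby purely for illustration, and the function names components and at_least are hypothetical.

```python
def components(version):
    # "109.87.654" -> (109, 87, 654)
    return tuple(int(part) for part in version.split("."))

def at_least(value, threshold):
    # Tuples compare element by element, left to right; when one tuple is
    # a prefix of the other, the shorter one counts as smaller.
    return components(value) >= components(threshold)

print(at_least("543.21", "500.21"))   # True
print(at_least("543.21", "5000.21"))  # False
```

Comparing component by component also keeps values such as 1.23 and 12.3 distinct, whereas stripping the dots would turn both into 123.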