According to the MySQL documentation:
As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get
the results you expect, but this is not guaranteed.
http://dev.mysql.com/doc/refman/5.6/en/user-variables.html
However, the book High Performance MySQL contains a couple of examples that use this tactic anyway to improve query performance.
Is the following an anti-pattern and if so is there a better way to write the query while maintaining good performance?
set @last = null;
select tick, count-@last as delta, @last:=count from measurement;
For clarification, my goal is to find the difference between this row and the last. My table has a primary key on tick which is a datetime column.
Update:
After trying Shlomi's suggestion, I have reverted back to my original query. It turns out that using a case statement with aggregate functions produces unexpected behavior. See for example:
case when (@delta := (max(measurement.count) - @lastCount)) AND 0 then null
  when (@lastCount := measurement.count) AND 0 then null
  else @delta end
It appears that MySQL evaluates the expressions that don't contain aggregate functions on a first pass through the results, and then evaluates the aggregate expressions on a second (grouping) pass. It appears to evaluate the case expression during or after that second pass and to use the precalculated values from the first pass in that evaluation. The result is that @delta on the third line is always the initial value of @delta (because the assignment didn't happen until the grouping pass). I attempted to incorporate a group function into the line with @delta but couldn't get it to behave as expected. So I ultimately went back to my original query, which didn't have this problem.
I would still love to hear any more suggestions about how to better handle a query like this.
Update 2:
Sorry for the lack of response on this question, I didn't have a chance to investigate further until now.
Using Shlomi's solution, it looks like I had a problem because I was using an aggregate (group by) function when I read my @last variable but not when I set it. My code looked something like this:
CASE
  WHEN (@delta := count - @last) IS NULL THEN NULL
  WHEN (@last := count) IS NULL THEN NULL
  ELSE (CASE WHEN cumulative THEN @delta ELSE avg(count) END)
END AS delta
MySQL appears to process expressions that don't contain aggregate functions in a first pass and ones that do in a second pass. The strange thing in the code above is that even when cumulative evaluates to true, MySQL must see the AVG aggregate function in the ELSE clause and decide to evaluate the whole inner CASE expression in the second pass. Since @delta is set in an expression without an aggregate function, it seems to be set on the first pass, and by the time the second pass happens MySQL is done evaluating the lines that set @delta and @last.
Ultimately I seem to have found a fix by including aggregate functions in the first expressions as well. Something like this:
CASE
  WHEN (@delta := max(count) - @last) IS NULL THEN NULL
  WHEN (@last := max(count)) IS NULL THEN NULL
  ELSE (CASE WHEN cumulative THEN @delta ELSE avg(count) END)
END AS delta
My understanding of what MySQL is doing is purely based on testing and conjecture since I didn't read the source code, but hopefully this will help others who might run into similar problems.
I am going to accept Shlomi's answer because it really is a good solution. Just be careful how you use aggregate functions.
I've researched this issue in depth, and wrote a few improvements on the above.
I offer a solution in this post, which uses functions whose order can be expected. Also consider my talk last year.
Constructs such as CASE and functions such as COALESCE have known underlying behavior (at least until this is changed, right?).
For example, a CASE clause inspects the WHEN conditions one by one, by order of definition.
Consider a rewrite of the original query:
select
  tick,
  CASE
    WHEN (@delta := count-@last) IS NULL THEN NULL
    WHEN (@last := count) IS NULL THEN NULL
    ELSE @delta
  END AS delta
from
  measurement,
  (select @last := 0) s_init
;
The CASE expression has two WHEN conditions and an ELSE. It evaluates the WHEN conditions in order until it meets the first that succeeds. I've written them such that both will always fail. It therefore evaluates the first, then turns to the second, then finally returns the ELSE value. Always.
I thus overcome the problem of expecting order of evaluation, which is a real and true problem, mostly evident when you start adding more complex clauses such as GROUP BY, DISTINCT, ORDER BY and such.
As a final note, my solution differs from yours in the first row of the result set: yours returns NULL, mine returns the delta between 0 and count. Had I used NULL, I would have needed to change the WHEN conditions in some other way, making sure they would fail on NULL values.
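For comparison, servers with window functions (MySQL 8.0 and later) can compute the same delta without user variables at all, which sidesteps the evaluation-order question entirely. What follows is only a minimal sketch: it uses Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for MySQL, and the three sample rows are invented for illustration.

```python
import sqlite3

# In-memory stand-in for the measurement table; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurement (tick TEXT PRIMARY KEY, count INTEGER)")
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?)",
    [
        ("2013-01-01 00:00:00", 10),
        ("2013-01-01 00:01:00", 15),
        ("2013-01-01 00:02:00", 12),
    ],
)

# LAG(count) reads the previous row's count in tick order. The first row has
# no predecessor, so its delta is NULL, matching the original query's first row.
rows = conn.execute(
    """
    SELECT tick, count - LAG(count) OVER (ORDER BY tick) AS delta
    FROM measurement
    ORDER BY tick
    """
).fetchall()

deltas = [delta for _tick, delta in rows]
print(deltas)
```

Whether this performs as well as the user-variable version on a large table is something to measure rather than assume.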
Suppose I have a Post model with an is_verified column of smallint datatype. How can I get all records that are verified? One way to do this is:
Post::where('is_verified', true)->get();
The code above will produce the following query:
select * from `posts` where `posts`.`is_verified` = true
... which will get me all verified Post records. Note that is_verified on all existing records is either 0 or 1.
However, out of curiosity I manually changed the is_verified value of some records from 1 to another truthy number, e.g. 2, and the Eloquent query above no longer worked as expected: records with an is_verified value of 2 were not retrieved.
I tried to execute the sql query directly from HeidiSQL as well, but it was just the same. Then I tried to change the = in the sql query to is, and now it's working as expected i.e. all records with truthy is_verified get retrieved:
select * from `posts` where `posts`.`is_verified` is true
So my questions are:
Is the above behaviour correct and expected?
How can I execute the last SQL query in Eloquent? One thing I can think of is where('is_verified', '!=', 0), but that feels weird in terms of readability, especially when the query is long and a bit complicated.
As I stated before, the is_verified column is a smallint. Does this affect the behaviour? This conversation here states that the boolean column datatype is typically tinyint, not smallint.
And that's it. Thank you in advance!
That is not the correct way to handle boolean values. You shouldn't store boolean columns as smallint; you can use the explicit boolean column type as described in the documentation.
Once you set up the boolean field correctly, the logic you have in place will work. So Post::where('is_verified', true)->get(); will return the expected results.
Yes, the problem is the smallint column type. If you use tinyint, it should also work like the boolean column. You can read more about the differences here.
After doing some deeper digging, I would like to write down the things I've found:
I have updated MySQL to the newest version as of now (v8), and a boolean datatype defined in a migration still results in tinyint(1) in the database. It turns out this happens because in MySQL bool and boolean are just synonyms for tinyint(1), so it is completely normal behaviour and not a lower-version issue.
I found @dz0nika's answer, which states that smallint and tinyint behave differently in the query, to be incorrect. The two datatypes differ only in the number of bytes used to store the integer value.
According to the MySQL documentation:
A value of zero is considered false. Nonzero values are considered true.
But also that:
However, the values TRUE and FALSE are merely aliases for 1 and 0, respectively.
Meaning that:
select * from `posts` where `posts`.`is_verified` = true;
Is the same as
select * from `posts` where `posts`.`is_verified` = 1;
Thus the query will only get Post records with is_verified value of 1.
To get Post records with a truthy is_verified value, whether 1, 2, 3, etc., use is instead of = in the query:
select * from `posts` where `posts`.`is_verified` is true;
You can read more about this here and here (look for the "boolean" part).
So, what about the Eloquent query? How can we get Post records with a truthy is_verified using Eloquent?
I still don't know what's best. But instead of using where('is_verified', '!=', 0) as I suggested in my question, I believe it's better to use whereRaw():
Post::whereRaw('posts.is_verified is true')->get();
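To see the difference between the two comparisons on actual rows, here is a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, with three invented rows. Since the quoted documentation says TRUE is merely an alias for 1, `= true` is written as `= 1`, and `<> 0` plays the role of `is true` for non-null values.

```python
import sqlite3

# In-memory stand-in for the posts table; ids and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, is_verified SMALLINT)")
conn.executemany("INSERT INTO posts VALUES (?, ?)", [(1, 0), (2, 1), (3, 2)])

# "= true" is "= 1": only the row whose value is exactly 1 matches.
equals_one = [
    row[0]
    for row in conn.execute("SELECT id FROM posts WHERE is_verified = 1 ORDER BY id")
]

# A truthiness test matches every nonzero value, including 2.
non_zero = [
    row[0]
    for row in conn.execute("SELECT id FROM posts WHERE is_verified <> 0 ORDER BY id")
]

print(equals_one, non_zero)
```

The row with is_verified = 2 only appears in the second result, which matches what I saw in HeidiSQL.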
If you find anything here missing or incorrect, please reply. Your opinion is much appreciated.
I have a new Spring Boot project with SQL Server, and I need to replace the MySQL native query on a Repository method in my old project with a SQL Server native query. It's a complex query with a case when expression in the where condition. When I test that query in SQL Server Management Studio, it shows errors like in the image below.
[screenshot of the SQL Server Management Studio error; image not available]
And here's the old native query used with MySQL on the Repository method, which I want to replace with a SQL Server query:
[screenshot of the MySQL native query; image not available]
Please help me to find the solution.
Thank you in advance!!
This is what you have and what you should have posted as text within your question. As text it becomes searchable and copyable by people trying to help YOU.
case when @num = 1 then p.merchant_name = @query else 1=1 end
CASE is an expression in TSQL. It is not a control-of-flow construct like it is in many other languages. To use an "optional" filter, you need to construct a boolean expression using CASE which handles the "optional" attribute correctly. Often this is done with a bit more complexity using CASE like this:
case when @num = 1 and p.merchant_name <> @query then 0 else 1 end = 1
So here, CASE is used to return a value that can be tested in a comparison. There is no magic in using 0 or 1. Use any values of any type.
When @num is 1 and the values do NOT match, the THEN branch (0) is returned.
When @num is 1 and the values match, the ELSE branch (1) is returned.
When @num is anything but 1, the ELSE branch (1) is returned.
So when the CASE expression returns 0 (really - anything but 1), the row is ignored (removed from the resultset).
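The three cases above can be checked with a small runnable sketch. It uses Python's built-in sqlite3 module rather than SQL Server, so the named parameters :num and :query stand in for @num and @query, and the payment table and its rows are hypothetical (the original query only shows the alias p).

```python
import sqlite3

# Hypothetical table behind the alias "p"; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payment (id INTEGER PRIMARY KEY, merchant_name TEXT)")
conn.executemany(
    "INSERT INTO payment VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex"), (3, "Acme")],
)

# CASE returns 0 only when the filter is switched on (:num = 1) and the row
# does not match; comparing the result with 1 turns it into a predicate.
SQL = """
SELECT p.id
FROM payment p
WHERE CASE WHEN :num = 1 AND p.merchant_name <> :query THEN 0 ELSE 1 END = 1
ORDER BY p.id
"""


def search(num, query):
    """Return matching ids; the filter applies only when num is 1."""
    return [row[0] for row in conn.execute(SQL, {"num": num, "query": query})]


filtered = search(1, "Acme")    # filter active
unfiltered = search(0, "Acme")  # filter ignored
print(filtered, unfiltered)
```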
Given that your query is actually constructed in an application, you should consider building the query dynamically and adding parameters as needed. That will likely generate a more efficient query that can be better optimized by the database engine. Alternatively, you can review this kitchen sink discussion and Erland's discussion of dynamic search conditions. TBH it looks like someone used @num as a kludge to avoid adding parameters for the eight specific filter values. If I want to filter on both merchant name and store name, I can't with this approach.
According to the MySQL Manual, the output of the following query is not guaranteed to always be the same.
SET @a := 0;
SELECT
  @a AS first,
  @a := @a + 1 AS second,
  @a := @a + 1 AS third,
  @a := @a + 1 AS fourth,
  @a := @a + 1 AS fifth,
  @a := @a + 1 AS sixth;
Output:
first second third fourth fifth sixth
0 1 2 3 4 5
Quoting from the Manual:
However, the order of evaluation for expressions involving user
variables is undefined;
I want to know the story behind this.
So my question is: why is the order of evaluation for expressions involving user variables undefined?
The order of evaluation of expressions in the select is undefined. For the most part, you only notice this when you have variables, because that is when a different evaluation order produces different results.
Why? The SQL standard does not require the order of evaluation, so each database is free to decide how to evaluate the expressions. Typically such decisions are left to the optimizer.
TL;DR MySQL user-defined variables are not intended to be used that way. An SQL statement describes a result set, not a series of operations. The documentation isn't clear about what variable assignments even mean. But you can't both read and write a variable. And assignment order within SELECT clause is not defined. And all you can assume is that assignments in an outer SELECT clause are done for some one output row.
Almost all the code you see like yours has undefined behaviour. Some sensible people demonstrate via the implementation code for operators & optimization what a particular implementation actually does. But that behaviour can't be relied on for the next release.
Read the documentation. Reading and writing the same variable is undefined. When it's not done, any variable read is fixed within a statement. There is no order to assignments. For SELECTs with only DETERMINISTIC functions (whose values are determined by argument values) the result is defined by a conceptual evaluation execution. But there is no connection between that and user variables. What an assignment ever means is not clear: the documentation says "each select expression is evaluated only when sent to the client". This seems to be saying that there's no guarantee a row is even "selected" except in the sense of put into a result set per an outermost SELECT clause. The order of assignments in a SELECT is not defined. And even if assignments are conceptually done for every row, they can only depend on the row value, so that's the same as saying the assignment is done only once, for some row. And since assignment order is not defined, that row can be any row. So assuming that that is what the documentation means, all you can expect is that if you don't read and write from the same variable in a SELECT statement then each variable assignment in the outermost SELECT will have happened in some order for one output row.
It depends on the database optimizer's decision, which is why it's uncertain. In most cases, though, the optimizer evaluates the expressions in the order we would predict.
I'm trying to learn how to use CASE/WHEN in SQL, more specifically in MySQL.
I have two tables: adhoc_item and adhoc_vulnerability. Each adhoc_item has one adhoc_vulnerability related to it. An adhoc_vulnerability may or may not have a vulnerability (original_vulnerability_id) attached to it.
In case it doesn't, then I want to return the name on itself.
The tables: (or relevant parts thereof)
ADHOC_ITEM
ADHOC_VULNERABILITY
This is my query:
SELECT adi.name as item, CASE ahv.original_vulnerability_id
WHEN ISNULL(ahv.original_vulnerability_id) THEN ahv.name
ELSE 'foo'
END NAME
FROM adhoc_item adi
JOIN adhoc_vulnerability ahv ON ahv.id = adi.adhoc_vulnerability_id
and these are the results!
Now I know this is a little bit complex, but I've been thinking about it for about 2 hours and found no explanation for why that last item (which has an adhoc_vulnerability with NULL as original_vulnerability_id) shows the value that is supposed to appear precisely when that attribute is not null!
Thanks in advance.
The problem is that when ahv.original_vulnerability_id is NULL, the ISNULL() function returns TRUE (which in MySQL is the same as 1, but this is irrelevant). So, instead of comparing the column with NULL, you are comparing it with TRUE.
Rewrite the CASE expression as:
CASE
WHEN ISNULL(ahv.original_vulnerability_id) THEN ahv.name
ELSE 'foo'
END NAME
or as:
CASE
WHEN ahv.original_vulnerability_id IS NULL THEN ahv.name
ELSE 'foo'
END NAME
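The difference between the two CASE forms can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL; SQLite has no ISNULL() function, so `(x IS NULL)` is used in its place (an assumption of this sketch), and the two rows are invented.

```python
import sqlite3

# In-memory stand-in for adhoc_vulnerability; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE adhoc_vulnerability ("
    "id INTEGER PRIMARY KEY, name TEXT, original_vulnerability_id INTEGER)"
)
conn.executemany(
    "INSERT INTO adhoc_vulnerability VALUES (?, ?, ?)",
    [(1, "linked", 7), (2, "standalone", None)],
)

# Simple CASE: the column is compared for equality with the 0/1 result of the
# IS NULL test. For the NULL row that is NULL = 1, which is never true.
simple = [
    row[0]
    for row in conn.execute(
        """
        SELECT CASE original_vulnerability_id
                 WHEN (original_vulnerability_id IS NULL) THEN name
                 ELSE 'foo'
               END
        FROM adhoc_vulnerability ORDER BY id
        """
    )
]

# Searched CASE: the WHEN condition itself is tested, as intended.
searched = [
    row[0]
    for row in conn.execute(
        """
        SELECT CASE
                 WHEN original_vulnerability_id IS NULL THEN name
                 ELSE 'foo'
               END
        FROM adhoc_vulnerability ORDER BY id
        """
    )
]

print(simple, searched)
```

With the simple form, the NULL row falls through to 'foo'; only the searched form returns its name.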
I have a table of posts with titles that are in "human" alphabetical order but not in computer alphabetical order. These are in two flavors, numerical and alphabetical:
Numerical: Figure 1.9, Figure 1.10, Figure 1.11...
Alphabetical: Figure 1A ... Figure 1Z ... Figure 1AA
If I order by title, the result is that 1.10-1.19 come between 1.1 and 1.2, and 1AA-1AZ come between 1A and 1B. But this is not what I want; I want "human" alphabetical order, in which 1.10 comes after 1.9 and 1AA comes after 1Z.
I am wondering if there's still a way in SQL to get the order that I want using string manipulation (or something else I haven't thought of).
I am not an expert in SQL, so I don't know if this is possible, but if there were a way to do conditional replacement, then it seems I could impose the order I want by doing this:
delete the period (which can be done with replace, right?)
if the remaining figure number is fewer than three characters, add a 0 (zero) after the first character.
This would seem to give me the outcome I want: 1.9 would become 109, which comes before 110; 1Z would become 10Z, which comes before 1AA. But can it be done in SQL? If so, what would the syntax be?
Note that I don't want to modify the data itself—just to output the results of the query in the order described.
This is in the context of a Wordpress installation, but I think the question is more suitably an SQL question because various things (such as pagination) depend on the ordering happening at the MySQL query stage, rather than in PHP.
My first thought is to add an additional column that is updated by a trigger or other outside mechanism.
1) Use that column to do the order by
2) Whatever mechanism updates the column will have the logic to create an acceptable order by surrogate (e.g. it would turn 1.1 into AAA or something like that).
Regardless... this is going to be a pain. I do not envy you.
You can create a function that contains the logic for the human sort order, like this:
CREATE FUNCTION [dbo].[GetHumanSortOrder] (@ColumnName VARCHAR(50))
RETURNS VARCHAR(20)
AS
BEGIN
  DECLARE @HumanSortOrder VARCHAR(20)
  SELECT @HumanSortOrder =
    CASE
      WHEN (LEN(replace(replace(<Column_Name>,'.',''),'Figure ',''))) = 2
      THEN
        CONCAT (SUBSTRING(replace(replace(<Column_Name>,'.',''),'Figure ',''),1,1),'0',SUBSTRING(replace(replace(<Column_Name>,'.',''),'Figure ',''),2,2))
      ELSE
        replace(replace(<Column_Name>,'.',''),'Figure ','')
    END
  FROM <Table_Name> AS a (NOLOCK)
  WHERE <Column_Name> = @ColumnName
  RETURN @HumanSortOrder
END
This function gives you values like 104, 107, 119, 10A, 10B etc., as desired.
You can then use the function in the order by (in SQL Server a scalar function must be called with its schema name):
SELECT * FROM <Table_Name> ORDER BY dbo.GetHumanSortOrder(<Column_Name>)
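The same padding idea can also be written inline in the ORDER BY, without a function and without touching the stored data. The sketch below is only an illustration: it runs through Python's built-in sqlite3 module as a stand-in, the posts table and title column are hypothetical names, and it assumes every title has the form "Figure " plus a one-character chapter, as in the examples from the question.

```python
import sqlite3

# Strip "Figure " and the period; when two characters remain, insert a 0 after
# the first so that 1.9 sorts as 109 (before 110) and 1Z as 10Z (before 1AA).
SQL = """
SELECT title
FROM (
  SELECT title, replace(replace(title, 'Figure ', ''), '.', '') AS fig
  FROM posts
) AS t
ORDER BY CASE WHEN length(fig) = 2
              THEN substr(fig, 1, 1) || '0' || substr(fig, 2)
              ELSE fig
         END
"""


def human_order(titles):
    """Load titles into a throwaway table and return them in 'human' order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (title TEXT)")
    conn.executemany("INSERT INTO posts VALUES (?)", [(t,) for t in titles])
    rows = conn.execute(SQL).fetchall()
    conn.close()
    return [row[0] for row in rows]


numeric = human_order(["Figure 1.10", "Figure 1.9", "Figure 1.2", "Figure 1.1"])
alpha = human_order(["Figure 1AA", "Figure 1Z", "Figure 1B", "Figure 1A"])
print(numeric)
print(alpha)
```

In MySQL the string functions are spelled slightly differently (CONCAT instead of ||, for example), so treat the SQL here as a shape to adapt rather than something to paste in.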
Hope this helps