Missing values in a query - mysql

I get some strange results from the following query:
SET @indi_id = 768;
SET @generations = 8;
SELECT num, sosa, seq, len, dernier, ful_ful_nom
FROM fullindi
LEFT JOIN lignee_new
ON ((ful_indi_id = dernier) AND (len BETWEEN 1 AND @generations))
RIGHT JOIN numbers
ON ((sosa = num) AND (premier = @indi_id))
WHERE num BETWEEN 1 AND pow(2, @generations)
GROUP BY num
ORDER BY num;
The result looks like this:
Why doesn't the row just before a fully NULL one display the existing values ('sosa', 'len', 'dernier', 'ful_ful_nom'), but only the 'seq' value (see rows 43 and 47 in this example)?
What am I missing?
As requested, here is the data:
table lignee_new :
table fullindi :

The problem is that MySQL does really dumb things when a GROUP BY or an aggregate function is used but not every selected column is either listed in the GROUP BY or wrapped in an aggregate function.
You are asking it to GROUP BY num, but none of the other columns in your SELECT are included in the GROUP BY, nor are they aggregated with a function (SUM, MAX, MIN, AVG, etc.).
In most other RDBMSs (and in MySQL itself when the ONLY_FULL_GROUP_BY SQL mode is enabled) this query wouldn't run and would throw an error, but MySQL just carries on. For each field that isn't num, it decides which value to show by grabbing the first value it finds in its data storage, which may differ between InnoDB and other storage engines.
My guess is that in your case you have more than one record in lignee_new that has a num of 43. Since you GROUP BY num and nothing else, it just grabs values randomly from your multiple records where num=43 and displays them... which is reasonable. By not including them in an aggregate function you are pretty much saying "I don't care what you display for these other fields, just bring something back" and so MySQL does.
Remove your GROUP BY clause completely and you'll see data that makes sense. Perhaps use WHERE to further filter your records to get rid of nulls or other things you don't need (don't use GROUP BY to filter).
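To check the guess above, it can help to look for num values that appear on more than one joined row, and to see what aggregating every non-grouped column does. The following is only a minimal sketch: it uses Python's sqlite3 as a stand-in for MySQL, and the table `joined` and its rows are hypothetical, made up to mimic a join result where num 43 appears twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the joined result; not the asker's real data.
conn.execute("CREATE TABLE joined (num INTEGER, sosa INTEGER, dernier INTEGER)")
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [(42, 42, 10), (43, 43, 11), (43, None, None), (44, None, None)],
)

# Which num values occur on more than one joined row?
duplicates = conn.execute(
    "SELECT num, COUNT(*) FROM joined GROUP BY num HAVING COUNT(*) > 1"
).fetchall()

# Aggregate every non-grouped column so the output no longer depends on
# which row the engine happens to pick; MAX() skips NULLs.
collapsed = conn.execute(
    "SELECT num, MAX(sosa), MAX(dernier) FROM joined GROUP BY num ORDER BY num"
).fetchall()

print(duplicates)
print(collapsed)
```

Whether MAX() is the right aggregate depends on what the duplicate rows mean in the real data; it is used here only because it ignores NULLs.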

Related

DENSE_RANK() OVER and IFNULL()

Let's say I have a table like this -
id  number
1   1
2   1
3   1
I want to return the second largest number, and if there isn't, return NULL instead. In this case, since all the numbers in the table are the same, there isn't the second largest number, so it should return NULL.
This query works -
SELECT IFNULL((
SELECT number
FROM (SELECT *, DENSE_RANK() OVER(ORDER BY number DESC) AS ranking
FROM test) r
WHERE ranking = 2), NULL) AS SecondHighestNumber;
However, after I changed the order of the query, it doesn't work anymore -
SELECT IFNULL(number, NULL) AS SecondHighestNumber
FROM (SELECT *, DENSE_RANK() OVER(ORDER BY number DESC) AS ranking
FROM test) r
WHERE ranking = 2;
It returns blank instead of NULL. Why?
Explanation
This is something of a byproduct of the way you are using subquery in your SELECT clause, and really without a FROM clause.
It is easy to see with a very simple example. We create an empty table. Then we select from it where id = 1 (no results as expected).
CREATE TABLE #foo (id int)
SELECT * FROM #foo WHERE id = 1; -- Empty results
But now if we take a left turn and turn that into a subquery in the select statement - we get a result!
CREATE TABLE #foo (id int)
SELECT (SELECT * FROM #foo WHERE id = 1) AS wtf; -- one record in results with value NULL
I'm not sure what else we could ask our SQL engine to do for us. Perhaps cough up an error and say "I can't do this"? Maybe return no results? We are telling it to select an empty result set as a value in the SELECT clause, in a query that doesn't have any FROM clause (personally I would like SQL to cough up an error and say "I can't do this" ... but it's not my call).
I hope someone else can explain this better, more accurately or technically - or even just give a name to this behavior. But in a nutshell there it is.
tldr;
So your first query has SELECT clause with an IFNULL function in it that uses a subquery ... and otherwise is a SELECT without a FROM. So this is a little weird but does what you want, as shown above. On the other hand, your second query is "normal" sql that selects from a table, filters the results, and lets you know it found nothing -- which might not be what you want but I think actually makes more sense ;)
Footnote: my "sql" here is T-SQL, but I believe this simple example would work the same in MySQL. And for what it's worth, I believe Oracle (back when I learned it years ago) actually would cough up errors here and say you can't have a SELECT clause with no FROM.
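The difference between the two query shapes can be reproduced on an empty table. This is a small sketch using Python's sqlite3 rather than T-SQL or MySQL, so it only illustrates the general behavior; the table `foo` mirrors the example above and the alias `second_highest` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER)")  # deliberately left empty

# Ordinary query: the filter matches nothing, so zero rows come back.
plain_rows = conn.execute("SELECT id FROM foo WHERE id = 1").fetchall()

# Scalar subquery in the SELECT list: the outer SELECT always yields one row,
# and an empty subquery result shows up as NULL (None in Python).
scalar_rows = conn.execute(
    "SELECT (SELECT id FROM foo WHERE id = 1) AS second_highest"
).fetchall()

print(plain_rows)
print(scalar_rows)
```

An empty list versus a one-row list holding None is the same "blank versus NULL" distinction the question describes.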

MySQL returns all rows when field=0 from SECOND Select query

This case is similar to the S.O. question "mySQL returns all rows when field=0", where the accepted answer was a very simple trick: surround the zero with single quotes.
FROM:
SELECT * FROM table WHERE email=0
TO:
SELECT * FROM table WHERE email='0'
However, my case is slightly different in that my Query is something like:
SELECT * FROM table WHERE email=(
SELECT my_column_value FROM myTable WHERE my_column_value=0 AND user_id =15 LIMIT 1 )
Which in a sense, becomes like simply saying: SELECT * FROM table WHERE email=0, but now with a Second Query.
PLEASE NOTE: It is a MUST that I use the SECOND QUERY.
When I tried: SELECT * FROM table WHERE email='( SELECT my_column_value FROM myTable WHERE my_column_value=0 LIMIT 1 )' (notice the single quotes around the second query)
MySQL SCREAMED errors near '(.
How can this be achieved?
Any suggestion is highly appreciated.
EDIT1: For a visual perspective of the Query
See the STEN_TB here: http://snag.gy/Rq8dq.jpg
Now, the main aim is to get the sten_h where rawscore_h = 0;
The CURRENT QUERY as a whole.
SELECT sten_h
FROM sten_tb
WHERE rawscore_h = (
SELECT `for_print_stens_rowscore`
FROM `for_print_stens_tb`
WHERE `for_print_stens_student_id` =3
AND `for_print_stens_factor_name` = 'Factor H' )
The result of the Second Query can be any number including ZERO.
Any number >= 1 works and returns a single corresponding value from sten_h. Only 0 does not work: it returns all rows.
That's the issue.
CORRECT ANSWER OR SOLUTION FOR THIS
Just in case someone ends up in this paradox, the Accepted answer has it all.
SEE STEN_TB: http://snag.gy/Rq8dq.jpg
SEE The desired Query result here: http://snag.gy/wa4yA.jpg
I believe your issue is with implicit datatype conversions. You can make those datatype conversions explicit, to gain control.
(The "trick" with wrapping a literal 0 in single quotes, that makes the literal a string literal, rather than a numeric.)
In the more general case, you can use a CAST or CONVERT function to explicitly specify a datatype conversion. You can use an expression in place of a column name, wherever you need to...
For example, to get the value returned by my_column_value to match the datatype of the email column, assuming email is character type, something like:
... email = (SELECT CONVERT(my_column_value,CHAR(255)) FROM myTable WHERE ...
or, to get a literal integer value to be a string value:
... FROM myTable WHERE my_column_value = CONVERT(0,CHAR(30)) ...
If email and my_column_value are just indicating true or false then they should almost certainly be both BIT NOT NULL or other two-value type that your schema uses for booleans. (Your ORM may use a particular one.) Casting is frequently a hack made necessary by a poor design.
If it should be a particular user then you shouldn't use LIMIT because tables are unordered and that doesn't return a particular user. Explain in your question what your query is supposed to return including exactly what you mean by "15th".
(Having all those similar columns is bad design: rawscore_a, sten_a, rawscore_b, sten_b,... . Use a table with two columns: rawscore, sten.)

Table statistics (aka row count) over time

i'm preparing a presentation about one of our apps and was asking myself the following question: "based on the data stored in our database, how much growth has happened over the last couple of years?"
so i'd like to basically show in one output/graph, how much data we're storing since beginning of the project.
my current query looks like this:
SELECT DATE_FORMAT(created,'%y-%m') AS label, COUNT(id) FROM table GROUP BY label ORDER BY label;
the example output would be:
11-03: 5
11-04: 200
11-05: 300
unfortunately, this query is missing the accumulation. i would like to receive the following result:
11-03: 5
11-04: 205 (200 + 5)
11-05: 505 (200 + 5 + 300)
is there any way to solve this problem in mysql without the need of having to call the query in a php-loop?
Yes, there's a way to do that. One approach uses MySQL user-defined variables (and behavior that is not guaranteed)
SELECT s.label
, s.cnt
, @tot := @tot + s.cnt AS running_subtotal
FROM ( SELECT DATE_FORMAT(t.created,'%y-%m') AS `label`
, COUNT(t.id) AS cnt
FROM articles t
GROUP BY `label`
ORDER BY `label`
) s
CROSS JOIN ( SELECT @tot := 0 ) i
Let's unpack that a bit.
The inline view aliased as s returns the same resultset as your original query.
The inline view aliased as i returns a single row. We don't really care what it returns (except that we need it to return exactly one row because of the JOIN operation); what we care about is the side effect: a value of zero gets assigned to the @tot user variable.
Since MySQL materializes the inline view as a derived table before the outer query runs, that variable is initialized before the outer query runs.
For each row processed by the outer query, the value of cnt is added to @tot.
The return of s.cnt in the SELECT list is entirely optional, it's just there as a demonstration.
N.B. The MySQL reference manual specifically states that this behavior of user-defined variables is not guaranteed.
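Because the user-variable behavior is not guaranteed, an alternative worth considering on MySQL 8.0 or later is a window function, SUM(...) OVER (ORDER BY ...). The sketch below is not the answer's method. It runs the same idea in Python's sqlite3 (which needs SQLite 3.25 or newer for window functions), substitutes SQLite's strftime for DATE_FORMAT, and uses made-up sample dates chosen to reproduce the counts 5, 200 and 300 from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, created TEXT)")

# Hypothetical sample data: 5, 200 and 300 rows in three consecutive months.
sample = ["2011-03-15"] * 5 + ["2011-04-10"] * 200 + ["2011-05-20"] * 300
conn.executemany("INSERT INTO articles (created) VALUES (?)", [(d,) for d in sample])

rows = conn.execute(
    """
    SELECT label, cnt, SUM(cnt) OVER (ORDER BY label) AS running_total
    FROM (
        SELECT strftime('%Y-%m', created) AS label, COUNT(id) AS cnt
        FROM articles
        GROUP BY label
    ) AS s
    ORDER BY label
    """
).fetchall()

for label, cnt, running_total in rows:
    print(label, cnt, running_total)
```

The inner query is the per-month count from the question; the window function only adds the accumulated column, so no loop in application code is needed.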

Stored procedure to execute a query and return selected values if the query returns only 1 result

So my query is the following, which may return many results:
SELECT P_CODE, NAME FROM TEST.dbo.PEOPLE
WHERE NAME LIKE '%JA%'
AND P_CODE LIKE '%003%'
AND DOB LIKE '%1958%'
AND HKID = ''
AND (MOBILE LIKE '%28%' OR TEL LIKE '%28%')
I would like to integrate this into a Stored Procedure (or View?) so that it will only return a result if the query results in exactly 1 row. If there's 0 or > 1, then it should return no results.
If you just want to return an empty resultset in cases other than 1:
;WITH x AS
(
SELECT P_CODE, NAME, c = COUNT(*) OVER()
FROM TEST.dbo.PEOPLE
WHERE NAME LIKE '%JA%'
AND P_CODE LIKE '%003%'
AND DOB LIKE '%1958%'
AND HKID = ''
AND (MOBILE LIKE '%28%' OR TEL LIKE '%28%')
)
SELECT P_CODE, NAME FROM x WHERE c = 1;
Otherwise, you'll have to run the query twice (or dump the results to intermediate storage, such as a #temp table) - once to get the count, and once to decide based on the count whether to run the SELECT or not.
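The COUNT(*) OVER() idea can be tried out on a small table. This is a rough sketch in Python's sqlite3 (SQLite 3.25 or newer) instead of SQL Server; the helper name `only_if_unique`, the simplified `people` table and its rows are all hypothetical, and only the NAME filter from the question is kept.

```python
import sqlite3


def only_if_unique(conn, name_pattern):
    """Return the matching row only when exactly one row matches."""
    return conn.execute(
        """
        WITH x AS (
            SELECT p_code, name, COUNT(*) OVER () AS c
            FROM people
            WHERE name LIKE ?
        )
        SELECT p_code, name FROM x WHERE c = 1
        """,
        (name_pattern,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (p_code TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("A003", "JANE"), ("B003", "JACK"), ("C003", "MARY")],
)

one = only_if_unique(conn, "%MAR%")   # exactly one match
many = only_if_unique(conn, "%JA%")   # two matches, so nothing is returned
zero = only_if_unique(conn, "%ZZ%")   # no match at all

print(one, many, zero)
```

Every row in the CTE carries the total match count, so the outer filter c = 1 passes the single row through and drops everything otherwise.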
Effectively you want something akin to FirstOrDefault() from the Linq-to-SQL implementation, but done on the server side. That means you will need to execute the query in a stored procedure, dump the results into a temp table variable, and then check @@ROWCOUNT afterwards to get the number of rows that were returned and decide whether or not to forward the results on to the caller. If you do, be sure to use TOP 1 in the query from the temp table so that you only get a single result out, as you desire.
UPDATE:
I described the alternate solution from what Aaron describes in his answer (which I like better).
Removed unnecessary TOP specifier in solution specification.

IsNumeric in SQL Server JOIN

My problem seems to be very simple but I'm stuck here. I have a table which has an "nvarchar" column called "SrcID" and I store both numbers and strings in that. Now, when I try to check for "IsNumeric" on that column in a "Join" condition, something like below,
ISNUMERIC(SrcID) = 1 AND SrcID > 15
I am getting the following error:
Msg 245, Level 16, State 1, Line 47
Conversion failed when converting the nvarchar value 'Test' to data type int.
Amazingly, when I remove the check "SrcID > 15", my query is running properly. Should I include anything else in this statement?
Please help me in fixing the issue. Thanks in advance!!
You can't count on the order in which a database will evaluate filtering expressions. There is a query optimizer that will evaluate your SQL and build a plan to execute the query based on what it perceives will yield the best performance.
In this context, IsNumeric() cannot be used with an index, and it means running a function against every row in the table. Therefore, it will almost never provide the best perceived performance. Compare this with the SrcID > 15 expression, which can be matched with an index (if one exists), and is just a single operator expression even if one doesn't. It can also be used to filter down the number of potential rows where the IsNumeric() function needs to run.
You can likely get around this with a view, a subquery, a CTE, a CASE statement, or a computed column. Here's a CTE example:
With NumericOnly As
(
SELECT <columns> FROM MyTable WHERE IsNumeric(SrcID) = 1
)
SELECT <columns> FROM NumericOnly WHERE SrcID > 15
And here's a CASE statement option:
SELECT <columns> FROM MyTable WHERE CASE WHEN IsNumeric(SrcID) = 1 THEN Cast(SrcID As Int) ELSE 0 END > 15
The filters in a WHERE clause are not evaluated in any particular order.
This is a common misconception with SQL Server - the optimizer will check whichever conditions it thinks it can the fastest/easiest, and try to limit the data in the most efficient way possible.
In your example, you probably have an index on SrcID, and the optimizer thinks it will be quicker to FIRST limit the results to where the SrcID > 15, then run the function on all those rows (since the function will need to check every single row otherwise).
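You can try grouping the conditions with parentheses, although SQL Server does not guarantee that this changes the order in which they are evaluated:
WHERE (ISNUMERIC(SrcID) = 1) AND SrcID > 15
Or, more reliably, with a CASE expression that only converts the value once it has passed the ISNUMERIC check:
WHERE CASE WHEN ISNUMERIC(SrcID) = 1 THEN CAST(SrcID AS int) ELSE 0 END > 15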