SQL: Show a column only if it has data in it (MySQL)

Suppose I have a table like this:
name | age | experience
abc | 18 | 0
def | 19 | 0
efg | 20 | 0
I want to select the experience column only if at least one value is greater than zero.
In this case, my SQL query should return only name and age, and not experience.
If the experience of, let's say, "efg" is greater than 0, then the query should return name, age and experience.
I have tried the following query:
SELECT EXISTS (SELECT name,age,experience FROM emp_info )
AND NOT
EXISTS (SELECT experience FROM emp_info WHERE experience=0 );
But it is not working.

Try this:-
IF ((select max(experience)FROM emp_info) > 0)
SELECT name,age,experience FROM emp_info
ELSE
SELECT name,age FROM emp_info

In almost all relational databases, your queries have to return a fixed number of columns, i.e., the same number of columns for all rows. So what you are asking for isn't reasonable. You could probably get something like this to work on Informix due to jagged tables support, but that's the only one I can think of.
Other options you have include serializing to JSON in your query, or generating XML but that's a bit advanced for this and it is not clear this is what you want.
Normally we handle this on the front end, not in the database query.
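A minimal sketch of handling this in application code, as suggested above. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the helper name fetch_emp_info is hypothetical, and the table layout is taken from the question.

```python
import sqlite3

def fetch_emp_info(conn):
    """Return (columns, rows); experience is included only if any value is > 0."""
    cur = conn.cursor()
    # One cheap aggregate decides which column list to use.
    (max_exp,) = cur.execute("SELECT MAX(experience) FROM emp_info").fetchone()
    columns = ["name", "age"]
    if max_exp is not None and max_exp > 0:
        columns.append("experience")
    # The column names come from a fixed list, never from user input.
    query = f"SELECT {', '.join(columns)} FROM emp_info ORDER BY name"
    return columns, cur.execute(query).fetchall()

# Sample data from the question (sqlite3 used here only for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_info (name TEXT, age INTEGER, experience INTEGER)")
conn.executemany(
    "INSERT INTO emp_info VALUES (?, ?, ?)",
    [("abc", 18, 0), ("def", 19, 0), ("efg", 20, 0)],
)
cols_before, rows_before = fetch_emp_info(conn)

# Once any row has experience > 0, the third column appears.
conn.execute("UPDATE emp_info SET experience = 2 WHERE name = 'efg'")
cols_after, rows_after = fetch_emp_info(conn)
```

Each individual query still returns a fixed set of columns; the application simply picks which of the two queries to run.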

SELECT name, age, NULLIF(experience, 0) from emp_info
Your question is tricky because returning a different set of columns depending on the data is probably not something you want to do in SQL. You could do it directly in your application code rather than in the SQL projection.

This may work :
if((select max(experience) from emp_info) > 0)
begin
SELECT name,age,experience FROM emp_info
end
else
begin
SELECT name,age FROM emp_info
end

Related

How to sum daily resetting data using MySQL

I am attempting to plot data cumulatively from a MySQL table which logs a value, resetting to 0 every day. After selecting the values using select * from table where DateTime BETWEEN DateA AND DateB, the data looks like this: current data. I would like the output to look like this: preferred data, ignoring the daily resets.
As I am a novice in SQL I was unable to find a solution to this. I did, however, obtain the correct output in Matlab using a for loop:
output = data;
for k=1:(size(data, 1)-1)
% check if next value is smaller than current
if data(k+1)<data(k)
% add current value to all subsequent values
output = output + (1:size(data, 1)>k)'.*data(k);
end
end
I would like the final product to connect to a web page, so I am curious whether it would be possible to obtain a similar result using only SQL. While I have tried using SUM(), I have only been able to sum all values, but I need to add the last value of each day to all subsequent values.
Using a CTE and a self-join that compares dates, you can compute the running sum of the daily totals.
Let's say that table1 below is defined.
create table table1 (col_date date, col_value int);
insert into table1 values
('2020-07-15',1000),
('2020-07-15',2000),
('2020-07-16',1000),
('2020-07-16',3000),
('2020-07-16',4000),
('2020-07-17',1000),
('2020-07-18',2000),
('2020-07-19',1000),
('2020-07-19',1000),
('2020-07-19',2000),
('2020-07-19',3000),
('2020-07-20',4000),
('2020-07-20',5000),
('2020-07-21',6000)
;
In this case, the query looks like this:
with cte1 as (
select col_date, sum(col_value) as col_sum from table1
where col_date between '2020-07-16' and '2020-07-20'
group by col_date
)
select a.col_date, max(a.col_sum), sum(b.col_sum)
from cte1 a inner join cte1 b on a.col_date >= b.col_date
group by a.col_date;
The output is below:
col_date |max(a.col_sum) |sum(b.col_sum)
2020-07-16 |8000 | 8000
2020-07-17 |1000 | 9000
2020-07-18 |2000 |11000
2020-07-19 |7000 |18000
2020-07-20 |9000 |27000
The max() column is included only for reference.
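As an alternative sketch, the same running total can be written with a window function instead of the self-join. This assumes an engine with window function support (MySQL 8.0 or later; here SQLite 3.25 or later through Python's sqlite3 module, used only so the example is runnable). The table and data are the ones defined above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col_date TEXT, col_value INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    ("2020-07-15", 1000), ("2020-07-15", 2000),
    ("2020-07-16", 1000), ("2020-07-16", 3000), ("2020-07-16", 4000),
    ("2020-07-17", 1000), ("2020-07-18", 2000),
    ("2020-07-19", 1000), ("2020-07-19", 1000),
    ("2020-07-19", 2000), ("2020-07-19", 3000),
    ("2020-07-20", 4000), ("2020-07-20", 5000),
    ("2020-07-21", 6000),
])

running_total_sql = """
WITH cte1 AS (
    -- Daily totals, same as in the self-join version.
    SELECT col_date, SUM(col_value) AS col_sum
    FROM table1
    WHERE col_date BETWEEN '2020-07-16' AND '2020-07-20'
    GROUP BY col_date
)
SELECT col_date,
       col_sum,
       -- Running sum of the daily totals up to and including this date.
       SUM(col_sum) OVER (ORDER BY col_date) AS running_total
FROM cte1
ORDER BY col_date
"""
rows = conn.execute(running_total_sql).fetchall()
for row in rows:
    print(row)
```

The window version avoids joining the CTE to itself, which may matter as the date range grows, though that depends on the engine.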

Finding LEAST/GREATEST values from combined COLUMNS, ignore 0 & NULL- MYSQL

I've got a dataset with a bunch of rows for monthly salary payments for each account. We have 6 columns for this:
Salary_1, Salary_2, Salary_3, Salary_4, Salary_5 and Salary_6.
Sometimes salaries 3, 4, 5, 6 and occasionally 2 aren't populated, sometimes none are populated because they're unemployed. In this case, we have 0 in the field.
What I need to do is combine all salaries and find the MAX and MIN from these columns ---
Select
Greatest(Salary_1, Salary_2, Salary_3, Salary_4, Salary_5, Salary_6) as MaxSal,
Least(COALESCE(Salary_1, Salary_2, Salary_3, Salary_4, Salary_5, Salary_6),0) as MinSal
from
(select
sal1 as Salary_1, sal2 as Salary_2, Sal3 as Salary_3, sal4 as Salary_4, Sal5 as Salary_5, Sal6 as Salary_6
from ....)a
The problem is that this returns the correct value for MaxSal but 0.00 for MinSal. Zero is technically the minimum, but it is not a real salary, so I need to ignore the 0s and return the lowest non-zero value instead.
I've tried wrapping the original Sal1-Sal6 values in NULLIF(..., 0), and it returns NULL for the max and 0 for the min.
What else could I look at? COALESCE combined with NULLIF, which was recommended on previous questions, has not worked for me. Thanks!
Greatest and Least do not ignore nulls like aggregation functions do; you'll need to do something to avoid them. One option is something like this:
Greatest(IFNULL(Salary_1 ,0), ...)
Least(
CASE WHEN Salary_1 IS NULL OR Salary_1 = 0 THEN /*some huge value*/ ELSE Salary_1 END
, CASE WHEN Salary_2
....)
It might be simplest to unpivot and aggregate the data:
select id, max(salary), min(salary)
from ((select id, salary_1 as salary from t) union all
(select id, salary_2 as salary from t) union all
. . .
) t
group by id;
This is definitely more expensive than a giant case expression. On the other hand, it is less prone to error.
The real suggestion is to fix your data model. Trying to store an array in multiple columns is generally a sign of a poor data model. The more appropriate method would have one row per salary rather than putting them in separate columns.
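A minimal sketch of the unpivot approach, extended to ignore zeros as the question asks. It converts each 0 to NULL with NULLIF before aggregating, since MIN and MAX skip NULLs. The table t, its three salary columns, and the sample rows are made up for illustration, and sqlite3 stands in for MySQL so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, salary_1 REAL, salary_2 REAL, salary_3 REAL)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    (1, 2500.0, 0.0, 1800.0),   # one unpopulated salary stored as 0
    (2, 0.0, 0.0, 0.0),         # unemployed: nothing populated
    (3, 3000.0, None, 0.0),     # mix of NULL and 0
])

sql = """
SELECT id, MAX(salary) AS max_sal, MIN(salary) AS min_sal
FROM (
    -- Unpivot: one row per (id, salary); zeros become NULL so aggregates skip them.
    SELECT id, NULLIF(salary_1, 0) AS salary FROM t
    UNION ALL
    SELECT id, NULLIF(salary_2, 0) FROM t
    UNION ALL
    SELECT id, NULLIF(salary_3, 0) FROM t
) AS unpivoted
GROUP BY id
ORDER BY id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

An account with no populated salaries comes back with NULL for both values, which the caller can map to 0 if that is the desired display.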

Selecting value corresponding with MAX value of a group

I am trying to get the OXSEOURL of my OXSEO table.
Structure:
oxobjectid | oxseourl | oxparams
Data:
http://imageshack.com/a/img268/7443/3xr4.png
http://imageshack.com/a/img42/315/8bdu.png
My deepest SEO URL always has the highest value in the OXPARAMS field.
Only the numeric values count; the others should be ignored.
Return should be:
http://imageshack.com/a/img29/8404/4jbv.png
I found a solution yesterday, but it was very slow, so now I am looking for a faster way to do it.
So I would like to get the oxseourl with the max oxparams value for each oxobjectid.
I have more than 330,000 rows, so every ms counts.
I only have to select the URLs for products, whose objectid starts with "tbproduct_".
My query:
SELECT seo2.oxseourl, seo2.oxobjectid, seo2.oxparams
FROM oxseo AS seo2
JOIN (
SELECT oxobjectid,
MAX(oxparams) AS maxparam
FROM oxseo
GROUP BY
oxobjectid
) AS usm
ON usm.maxparam = seo2.oxparams
WHERE seo2.oxobjectid LIKE '%tbproduct_%'
AND seo2.oxparams REGEXP '^-?[0-9]+$'
But this returns the same rows for the products.
Thanks for any help.
A bit optimized, and a lot faster:
SELECT seo.oxseourl, seo.oxobjectid, MAX(seo.oxparams)
FROM oxseo AS seo
WHERE seo.oxobjectid LIKE 'tbproduct_%' AND seo.oxparams REGEXP '^-?[0-9]+$'
GROUP BY seo.oxseourl, seo.oxobjectid
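For comparison, here is a hedged sketch of the usual groupwise-maximum pattern: compute the maximum per oxobjectid in a subquery, then join back on both oxobjectid and the maximum value so that only the matching row is returned. The sample rows are invented, oxparams is assumed to be stored as text (hence the cast), and sqlite3 stands in for MySQL. Because SQLite has no built-in REGEXP, the numeric check uses GLOB and accepts only non-negative integers; in MySQL the original REGEXP condition could be used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oxseo (oxobjectid TEXT, oxseourl TEXT, oxparams TEXT)")
conn.executemany("INSERT INTO oxseo VALUES (?, ?, ?)", [
    ("tbproduct_1", "shop/", "1"),
    ("tbproduct_1", "shop/cat/", "2"),
    ("tbproduct_1", "shop/cat/item-1/", "10"),
    ("tbproduct_1", "shop/tag/", "abc"),      # non-numeric, must be ignored
    ("tbproduct_2", "shop/item-2/", "3"),
    ("tbcategory_9", "shop/cat/", "5"),       # not a product
])

sql = """
SELECT seo.oxobjectid, seo.oxseourl, seo.oxparams
FROM oxseo AS seo
JOIN (
    -- Highest numeric oxparams per product.
    SELECT oxobjectid, MAX(CAST(oxparams AS INTEGER)) AS maxparam
    FROM oxseo
    WHERE oxobjectid LIKE 'tbproduct_%'
      AND oxparams <> '' AND oxparams NOT GLOB '*[^0-9]*'
    GROUP BY oxobjectid
) AS usm
  ON usm.oxobjectid = seo.oxobjectid          -- join on the group key as well
 AND usm.maxparam = CAST(seo.oxparams AS INTEGER)
WHERE seo.oxparams <> '' AND seo.oxparams NOT GLOB '*[^0-9]*'
ORDER BY seo.oxobjectid
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The cast matters when the values are text: compared as strings, "2" sorts above "10". Joining on oxobjectid as well as the maximum is what prevents one product's maximum from matching rows of another product.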

How to use the result of a subquery multiple times in a query

A MySQL query needs the results of a subquery in different places, like this:
SELECT COUNT(*),(SELECT hash FROM sets WHERE ID=1)
FROM sets
WHERE hash=(SELECT hash FROM sets WHERE ID=1)
and XD=2;
Is there a way to avoid the double execution of the subquery (SELECT hash FROM sets WHERE ID=1)?
The subquery always returns a valid hash value.
It is important that the result of the main query also includes the HASH.
First I tried a JOIN like this:
SELECT COUNT(*), m.hash FROM sets s INNER JOIN sets AS m
WHERE s.hash=m.hash AND id=1 AND xd=2;
If XD=2 doesn't match a row, the result is:
+----------+------+
| count(*) | HASH |
+----------+------+
| 0 | NULL |
+----------+------+
Instead of something like (what I need):
+----------+------+
| count(*) | HASH |
+----------+------+
| 0 | 8115e|
+----------+------+
Any ideas? Please let me know! Thank you in advance for any help.
//Edit:
Finally, the query only has to count all the entries in the table that have the same hash value as the entry with ID=1 and where XD=2. If no rows match (which happens when XD is set to another number), it should return 0 and the hash value.
SELECT SUM(xd = 2), hash
FROM sets
WHERE id = 1
If id is a PRIMARY KEY (which I assume it is, since you are using a single-record query against it), then you can just drop the SUM:
SELECT xd = 2 AS cnt, hash
FROM sets
WHERE id = 1
Update:
Sorry, got your task wrong.
Try this:
SELECT si.hash, COUNT(so.hash)
FROM sets si
LEFT JOIN
sets so
ON so.hash = si.hash
AND so.xd = 2
WHERE si.id = 1
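A small runnable check of the LEFT JOIN approach above, using sqlite3 in place of MySQL and invented sample rows. The GROUP BY si.hash is an addition for portability; with id as the primary key there is only one group. The XD value is passed as a parameter so both the matching and the non-matching case can be shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sets (id INTEGER PRIMARY KEY, hash TEXT, xd INTEGER)")
conn.executemany("INSERT INTO sets VALUES (?, ?, ?)", [
    (1, "8115e", 1),
    (2, "8115e", 2),
    (3, "8115e", 2),
    (4, "ffff0", 2),   # different hash, must not be counted
])

sql = """
SELECT si.hash, COUNT(so.hash) AS cnt
FROM sets AS si
LEFT JOIN sets AS so
       ON so.hash = si.hash
      AND so.xd = ?          -- filter in the ON clause keeps the si row alive
WHERE si.id = 1
GROUP BY si.hash
"""
match = conn.execute(sql, (2,)).fetchone()      # rows with xd = 2 exist
no_match = conn.execute(sql, (7,)).fetchone()   # no row has xd = 7
print(match, no_match)
```

Because the XD condition sits in the ON clause rather than the WHERE clause, the row for ID=1 survives even when nothing matches, so the hash is still returned alongside a count of 0.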
I normally nest the statements like the following
SELECT Count(ResultA.Hash2) AS Hash2Count,
ResultA.Hash1
FROM (SELECT S.Hash AS Hash2,
(SELECT s2.hash
FROM sets AS s2
WHERE s2.ID = 1) AS Hash1
FROM sets AS S
WHERE S.XD = 2) AS ResultA
WHERE ResultA.Hash2 = ResultA.Hash1
GROUP BY ResultA.Hash1
(this one is hand typed and not tested but you should get the point)
Hash1 is your subquery, once its nested, you can reference it by its alias in the outer query. It makes the query a little larger but I don't see that as a biggy.
If I understand correctly what you are trying to get, query should look like this:
select count(case xd when 2 then 1 else null end), hash from sets where id = 1 group by hash
I agree with the other answers, that the GROUP BY may be better, but to answer the question as posed, here's how to eliminate the repetition:
SELECT COUNT(*), h.hash
FROM sets, (SELECT hash FROM sets WHERE ID=1) h
WHERE sets.hash=h.hash
and sets.ID=1 and sets.XD=2;

Need Help streamlining a SQL query to avoid redundant math operations in the WHERE and SELECT

Hey everyone, I am working on a query and am unsure how to make it run as quickly as possible and with as little redundancy as possible. I am really hoping someone here can help me come up with a good way of doing this.
Thanks in advance for the help!
Okay, so here is what I have as best I can explain it. I have simplified the tables and math to just get across what I am trying to understand.
Basically I have a smallish table that never changes and will always only have 50k records like this:
Values_Table
ID Value1 Value2
1 2 7
2 2 7.2
3 3 7.5
4 33 10
….50000 44 17.2
And a couple tables that constantly change and are rather large, eg a potential of up to 5 million records:
Flags_Table
Index Flag1 Type
1 0 0
2 0 1
3 1 0
4 1 1
….5,000,000 1 1
Users_Table
Index Name ASSOCIATED_ID
1 John 1
2 John 1
3 Paul 3
4 Paul 3
….5,000,000 Richard 2
I need to tie all 3 tables together. The most results that are likely to ever be returned from the small table is somewhere in the neighborhood of 100 results. The large tables are joined on the index and these are then joined to the Values_Table ON Values_Table.ID = Users_Table.ASSOCIATED_ID …. That part is easy enough.
Where it gets tricky for me is that I need to return, as quickly as possible, a list limited to 10 results where value1 and value2 are mathematically operated on to return a new_value, where that new_value is less than 10, the result is sorted by that new_value, and any other WHERE conditions I need can be applied to the flags. I also need to be able to page through the results with the limit, e.g. LIMIT 0,10 / 11,10 / 21,10 etc.
In a subsequent (or the same if possible) query I need to get the top 10 count of all types that matched that criteria before the limit was applied.
So for example I want to join all of these and return anything where Value1 + Value2 < 10 AND I also need the count.
So what I want is:
Index Name Flag1 New_Value
1 John 0 9
2 John 0 9
5000000 Richard 1 9.2
The second response would be:
ID (not index) Count
1 2
2 1
I tried this a few ways and ultimately came up with the following somewhat ugly query:
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
ORDER BY New_Value
LIMIT 0,10
And then for the count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
Being able to filter on the different flags and such in my WHERE clause is important. That may sound stupid to comment on, but I mention it because, from what I could see, a quicker method would have been to use the HAVING clause, and I don't believe that will work in certain instances, depending on what I want my WHERE clause to filter against.
And when filtering using the flags table :
SELECT INDEX, NAME, Flag1, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
ORDER BY New_Value
LIMIT 0,10
...filtered count:
SELECT ID, COUNT(TYPE) as Count, (Value1 * some_variable + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 * some_variable + Value2) < 10 AND Flag1 = 0
GROUP BY TYPE
ORDER BY New_Value
LIMIT 0,10
That works fine but has to run the math multiple times for each row, and I get the nagging feeling that it is also running the math multiple times on the same row in the Values_table table. My thought was that I should just get only the valid responses from the Values_table first and then join those to the other tables to cut down on the processing; with how SQL optimizes things though I wasn't sure if it might not already be doing that. I know I could use a HAVING clause to only run the math once if I did it that way but I am uncertain how I would then best join things.
My questions are:
1. Can I avoid running that math twice and still make the query work? (Or, I suppose, if there is a good way to make the first one work as well, that would be great.)
2. What is the fastest way to do this, as this is something that will be running very often?
It seems like this should be painfully simple but I am just missing something stupid.
I contemplated pulling into a temp table then joining that table to itself but that seems like I would trade math for iterations against the table and still end up slow.
Thank you all for your help in this and please let me know if I need to clarify anything here!
** To clarify in response to a question: I can't use a 3rd column with the values pre-calculated because in reality the math is much more complex than addition; I just simplified it for illustration's sake.
Do you have a benchmark query to compare against? Usually it doesn't work to try to outsmart the optimizer. If you have acceptable performance from a starting query, then you can see where extra work is being expended (indicated by disk reads, cache consumption, etc.) and focus on that.
Avoid the temptation to break it into pieces and solve those. That's an antipattern. That includes temp tables especially.
Redundant math is usually ok - what hurts is disk activity. I've never seen a query that needed CPU work reduction on pure calculations.
Gather your results and put them in a temp table
CREATE TEMPORARY TABLE TempTable AS
SELECT INDEX, NAME, Type, ID, Flag1, (Value1 + Value2) as New_Value
FROM Values_Table
JOIN Users_Table ON ASSOCIATED_ID = ID
JOIN Flags_Table ON Flags_Table.Index = Users_Table.Index
WHERE (Value1 + Value2) < 10
ORDER BY New_Value
LIMIT 0,10
Return Result for First Query
SELECT INDEX, NAME, Flag1, New_Value
FROM TempTable
Return Results for count of Types
Select ID, Count(Type)
FROM TempTable
GROUP BY TYPE
Is there any chance that you can add a third column to the values_table with the pre-calculated value? Even if the result of your calculation is dependent on other variables, you could run the calculation for the whole table but only when those variables change.
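If the goal is mainly to write the expression only once, one option is a derived table over the small Values_Table that computes New_Value, with the filter, sort, and grouping applied outside it. The sketch below is an illustration under assumptions: it uses the simplified Value1 + Value2 math and sample rows from the question, renames the Index columns to idx (a made-up name, since INDEX is a reserved word), and runs on sqlite3 rather than MySQL. Whether the engine actually evaluates the expression once per row is up to its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE values_table (id INTEGER PRIMARY KEY, value1 REAL, value2 REAL);
CREATE TABLE users_table (idx INTEGER PRIMARY KEY, name TEXT, associated_id INTEGER);
CREATE TABLE flags_table (idx INTEGER PRIMARY KEY, flag1 INTEGER, type INTEGER);
INSERT INTO values_table VALUES (1, 2, 7), (2, 2, 7.2), (3, 3, 7.5), (4, 33, 10);
INSERT INTO users_table VALUES
    (1, 'John', 1), (2, 'John', 1), (3, 'Paul', 3), (4, 'Richard', 2);
INSERT INTO flags_table VALUES (1, 0, 0), (2, 0, 1), (3, 1, 0), (4, 1, 1);
""")

# The expression appears once, inside the derived table "v".
calc = "(SELECT id, value1 + value2 AS new_value FROM values_table) AS v"

page_sql = f"""
SELECT u.idx, u.name, f.flag1, v.new_value
FROM {calc}
JOIN users_table AS u ON u.associated_id = v.id
JOIN flags_table AS f ON f.idx = u.idx
WHERE v.new_value < 10            -- extra flag filters can be added here
ORDER BY v.new_value, u.idx
LIMIT 10 OFFSET 0
"""
count_sql = f"""
SELECT v.id, COUNT(*) AS cnt
FROM {calc}
JOIN users_table AS u ON u.associated_id = v.id
JOIN flags_table AS f ON f.idx = u.idx
WHERE v.new_value < 10
GROUP BY v.id
ORDER BY cnt DESC, v.id
LIMIT 10
"""
page = conn.execute(page_sql).fetchall()
counts = conn.execute(count_sql).fetchall()
print(page)
print(counts)
```

The count query groups by ID to match the second result set described in the question, and it is computed over all matching rows rather than over the limited page.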