I have a table that has records
id_queue | user_id | id_book | status
69 | 5 | 4 | 1
133 | 3 | 4 | 2
142 | 1 | 4 | 0
I want a query that will give me this result
id_queue | id_queue
69 | 142
133 | null
I've tried something like this
SELECT s1.`id_queue`,s2.`id_queue` FROM `second` as s1
LEFT JOIN `second` as s2 ON s2.`book_id`=4 AND s2.`status` IN (0)
WHERE s1.`book_id`=4 AND s1.`status` IN (1,2,4)
but it keeps bringing me this result.
id_queue | id_queue
69 | 142
133 | 142
I think this is because I don't have anything identical for conditions. What can I do?
The issue is that your join condition never relates s2 to s1: it only filters s2 on the book and the status, so every s1 row matches the same s2 row.
If you want to hide the second instance of the second id_queue then you can use user defined variables:
select Queue1,
  case when seq = 1 then Queue2 else null end Queue2
from
(
  select
    Queue1,
    Queue2,
    @row:=(case when @prev=Queue2 then @row else 0 end) +1 as Seq,
    @prev:=Queue2
  from
  (
    SELECT
      s1.`id_queue` Queue1,
      s2.`id_queue` Queue2
    FROM `second` as s1
    LEFT JOIN `second` as s2
      ON s1.`id_book` = s2.`id_book`
      AND s2.`status` = 0
    WHERE s1.`id_book`=4
      AND s1.`status` IN (1,2,4)
  ) src, (SELECT @row:=0, @prev:=null) r
  order by Queue1, Queue2
) s1
order by Queue1, Queue2
The result is:
| QUEUE1 | QUEUE2 |
-------------------
| 69 | 142 |
| 133 | (null) |
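For comparison, here is a minimal sketch of the same result without user-defined variables. It is not the method used above: it swaps in a correlated subquery that shows Queue2 only on the row with the lowest Queue1, and it runs on Python's standard-library sqlite3 module rather than MySQL (both are assumptions made so the example is self-contained). The table and data are copied from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "second" '
    "(id_queue INTEGER, user_id INTEGER, id_book INTEGER, status INTEGER)"
)
conn.executemany(
    'INSERT INTO "second" VALUES (?, ?, ?, ?)',
    [(69, 5, 4, 1), (133, 3, 4, 2), (142, 1, 4, 0)],
)

# Show the status-0 queue id only next to the lowest matching id_queue;
# CASE without ELSE yields NULL for every other row.
rows = conn.execute("""
    SELECT s1.id_queue AS queue1,
           CASE
               WHEN s1.id_queue = (SELECT MIN(x.id_queue)
                                     FROM "second" AS x
                                    WHERE x.id_book = s1.id_book
                                      AND x.status IN (1, 2, 4))
               THEN s2.id_queue
           END AS queue2
      FROM "second" AS s1
      LEFT JOIN "second" AS s2
        ON s2.id_book = s1.id_book
       AND s2.status = 0
     WHERE s1.id_book = 4
       AND s1.status IN (1, 2, 4)
     ORDER BY queue1
""").fetchall()

print(rows)  # [(69, 142), (133, None)]
```

This only behaves like the variable-based query when there is a single status-0 row per book, as in the sample data.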
I believe this happens because the same s2 row is matched to every s1 row, so its value is repeated on each row of the result.
Putting DISTINCT right after SELECT would not make any value disappear here, because DISTINCT compares all the columns of a row and these two rows differ in the first column.
You can still try putting DISTINCT just before the column you don't want repeated and check whether it works, though MySQL normally only accepts DISTINCT immediately after SELECT (or inside an aggregate), so expect a syntax error.
Wish I could help you more.
Example:
SELECT s1.`id_queue`, DISTINCT (s2.`id_queue`)
FROM `second` as s1
LEFT JOIN `second` as s2 ON s2.`book_id`=4 AND s2.`status` IN (0)
WHERE s1.`book_id`=4 AND s1.`status` IN (1,2,4)
Apologies if this has been answered elsewhere and I'm just not seeing it; this is the closest I've found, but it isn't quite what I'm trying to do.
MySQL - updating all records to match max value in group
I have a table on a production web server with about 15,000 rows. There are many records sharing an item_name with another record, but item_meters is (normally, but not always) a unique value for each row. item_id is always unique. Currently every record has a value of "0" in the item_flag column.
I would like to update all records with the largest item_meters value within each item_name group to have an item_flag value of "1".
Here is a simplified version of the table ordered by item_id ASC:
----------------------------------------------
mytable
----------------------------------------------
item_id | item_name | item_meters | item_flag
--------+-----------+-------------+-----------
001 | aaa | 224 | 0
002 | aaa | 359 | 0
003 | aaa | 456 | 0
004 | bbb | 489 | 0
005 | bbb | 327 | 0
006 | bbb | 215 | 0
007 | ccc | 208 | 0
008 | ccc | 756 | 0
009 | ccc | 756 | 0
--------+-----------+-------------+-----------
The desired result would be a table with "1" in the item_flag column for each "aaa" having the largest item_meters, each "bbb" having the largest item_meters, each "ccc" having the largest item_meters, etc. like this:
----------------------------------------------
mytable
----------------------------------------------
item_id | item_name | item_meters | item_flag
--------+-----------+-------------+-----------
001 | aaa | 224 | 0
002 | aaa | 359 | 0
003 | aaa | 456 | 1
004 | bbb | 489 | 1
005 | bbb | 327 | 0
006 | bbb | 215 | 0
007 | ccc | 208 | 0
008 | ccc | 756 | 1
009 | ccc | 756 | 0
--------+-----------+-------------+-----------
(In case there are 2 or more records having the same item_name and the same item_meters (e.g. item_id 008 and 009 above), the desired result would be for the record with the numerically lower item_id (item_id is always unique), to have an item_flag value of "1" while the row with a numerically higher item_id would still have an item_flag value of "0")
Also of note, even though this database is running behind a production web server with new rows added every day, there will be no need to update the table every time a new row is added. It is something that will only be required once, regardless of whether new rows are later added outside of the parameters. The reason I mention this, is because execution speed is not a big concern since the query will only be executed once.
Thank you in advance! Please let me know if I can provide more info or clarify my question in any way.
The approach I take is to first write a query (SELECT statement) that will return the item_id values of the rows we want to update.
As a starting point, get the maximum value for item_meters, a simple query like this:
SELECT m.item_name
, MAX(m.item_meters) AS max_item_meters
FROM my_table m
GROUP BY m.item_name
We can use that query as an inline view in another query, to get the lowest item_id for each of those item_name
SELECT MIN(o.item_id) AS min_item_id
FROM ( SELECT m.item_name
, MAX(m.item_meters) AS max_item_meters
FROM my_table m
GROUP BY m.item_name
) n
JOIN my_table o
ON o.item_name = n.item_name
AND o.item_meters = n.max_item_meters
GROUP BY o.item_name, o.item_meters
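As a quick check of this step, the sketch below runs the same query against the sample rows from the question. It assumes Python's standard-library sqlite3 module as a stand-in for MySQL, and the column types are guesses; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table ("
    "item_id INTEGER PRIMARY KEY, item_name TEXT, "
    "item_meters INTEGER, item_flag INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO my_table (item_id, item_name, item_meters) VALUES (?, ?, ?)",
    [
        (1, "aaa", 224), (2, "aaa", 359), (3, "aaa", 456),
        (4, "bbb", 489), (5, "bbb", 327), (6, "bbb", 215),
        (7, "ccc", 208), (8, "ccc", 756), (9, "ccc", 756),
    ],
)

# Inner view: largest item_meters per item_name.
# Outer query: lowest item_id among the rows that hit that maximum.
flag_ids = [
    row[0]
    for row in conn.execute("""
        SELECT MIN(o.item_id) AS min_item_id
          FROM (SELECT m.item_name,
                       MAX(m.item_meters) AS max_item_meters
                  FROM my_table m
                 GROUP BY m.item_name) n
          JOIN my_table o
            ON o.item_name = n.item_name
           AND o.item_meters = n.max_item_meters
         GROUP BY o.item_name, o.item_meters
         ORDER BY min_item_id
    """)
]

print(flag_ids)  # [3, 4, 8]
```

The tie between item_id 8 and 9 resolves to 8, which matches the desired result in the question.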
And we can use that query as an inline view that gets the whole row associated with the item_id values we returned...
SELECT t.item_id
, t.item_name
, t.item_meters
, t.item_flag
FROM my_table t
JOIN ( SELECT p.min_item_id
FROM ( SELECT MIN(o.item_id) AS min_item_id
FROM ( SELECT m.item_name
, MAX(m.item_meters) AS max_item_meters
FROM my_table m
GROUP BY m.item_name
) n
JOIN my_table o
ON o.item_name = n.item_name
AND o.item_meters = n.max_item_meters
GROUP BY o.item_name, o.item_meters
) p
) q
ON q.min_item_id = t.item_id
Once the SELECT query is working, convert it to an UPDATE statement: replace the SELECT ... FROM with UPDATE, and add a SET clause. (Sometimes it's necessary to wrap the inline view in yet another SELECT to avoid MySQL error 1093, which disallows selecting from the table we are updating.)
UPDATE my_table t
JOIN ( SELECT p.min_item_id
FROM ( SELECT MIN(o.item_id) AS min_item_id
FROM ( SELECT m.item_name
, MAX(m.item_meters) AS max_item_meters
FROM my_table m
GROUP BY m.item_name
) n
JOIN my_table o
ON o.item_name = n.item_name
AND o.item_meters = n.max_item_meters
GROUP BY o.item_name, o.item_meters
) p
) q
ON q.min_item_id = t.item_id
SET t.item_flag = '1'
If the intent is not to UPDATE the existing table, but to return a resultset, we can write a query and do an outer join to that same inline view, and return either a 0 or a 1 for item_flag, testing whether the item_id matches one we want flagged as 1...
SELECT t.item_id
, t.item_name
, t.item_meters
, IF(q.min_item_id IS NULL,0,1) AS `item_flag`
FROM my_table t
LEFT
JOIN ( SELECT p.min_item_id
FROM ( SELECT MIN(o.item_id) AS min_item_id
FROM ( SELECT m.item_name
, MAX(m.item_meters) AS max_item_meters
FROM my_table m
GROUP BY m.item_name
) n
JOIN my_table o
ON o.item_name = n.item_name
AND o.item_meters = n.max_item_meters
GROUP BY o.item_name, o.item_meters
) p
) q
ON q.min_item_id = t.item_id
I have 2 tables that I am trying to join but I am not sure how to make it the most time efficient.
Tasks Table:
nid | created_by | claimed_by | urgent
1 | 11 | 22 | 1
2 | 22 | 33 | 1
3 | 33 | 11 | 1
1 | 11 | 43 | 0
1 | 11 | 44 | 1
Employee Table:
userid | name
11 | EmployeeA
22 | EmployeeB
33 | EmployeeC
Result I am trying to get:
userid | created_count | claimed_count | urgent_count
11 | 3 | 1 | 3
22 | 1 | 1 | 2
33 | 1 | 1 | 2
created_count column will show total # of tasks created by that user.
claimed_count column will show total # of tasks claimed by that user.
urgent_count column will show total # of urgent tasks (created or claimed) by that user.
Thanks in advance!
I would start by breaking this up into pieces and then putting them back together. You can get the created_count and claimed_count using simple aggregation like this:
SELECT created_by, COUNT(*) AS created_count
FROM task
GROUP BY created_by;
SELECT claimed_by, COUNT(*) AS claimed_count
FROM task
GROUP BY claimed_by;
To get the urgent count for each employee, I would join the two tables on the condition that the employee is either the created_by or claimed_by column, and group by employee. Instead of counting, however, I would use SUM(). I am doing this because it appears each row will be either 0 or 1, so SUM() will effectively count all non-zero rows:
SELECT e.userid, SUM(t.urgent)
FROM employee e
JOIN task t ON e.userid IN (t.created_by, t.claimed_by)
GROUP BY e.userid;
Now that you have all the bits of data you need, you can use an outer join to join all of those subqueries to the employees table to get their counts. You can use the COALESCE() function to replace any null counts with 0:
SELECT e.userid, COALESCE(u.urgent_count, 0) AS urgent_count, COALESCE(crt.created_count, 0) AS created_count, COALESCE(clm.claimed_count, 0) AS claimed_count
FROM employee e
LEFT JOIN(
SELECT e.userid, SUM(t.urgent) AS urgent_count
FROM employee e
JOIN task t ON e.userid IN (t.created_by, t.claimed_by)
GROUP BY e.userid) u ON u.userid = e.userid
LEFT JOIN(
SELECT claimed_by, COUNT(*) AS claimed_count
FROM task
GROUP BY claimed_by) clm ON clm.claimed_by = e.userid
LEFT JOIN(
SELECT created_by, COUNT(*) AS created_count
FROM task
GROUP BY created_by) crt ON crt.created_by = e.userid;
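To check the combined query against the sample data, here is a minimal sketch. It assumes Python's standard-library sqlite3 module in place of MySQL and the table names task and employee; the inner employee alias e2, the column order and the ORDER BY are small changes made for readability.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task "
    "(nid INTEGER, created_by INTEGER, claimed_by INTEGER, urgent INTEGER)"
)
conn.execute("CREATE TABLE employee (userid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?, ?)",
    [(1, 11, 22, 1), (2, 22, 33, 1), (3, 33, 11, 1),
     (1, 11, 43, 0), (1, 11, 44, 1)],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(11, "EmployeeA"), (22, "EmployeeB"), (33, "EmployeeC")],
)

# One pre-aggregated subquery per count, outer-joined to employee so that
# employees with no matching tasks still appear with 0.
rows = conn.execute("""
    SELECT e.userid,
           COALESCE(crt.created_count, 0) AS created_count,
           COALESCE(clm.claimed_count, 0) AS claimed_count,
           COALESCE(u.urgent_count, 0)    AS urgent_count
      FROM employee e
      LEFT JOIN (SELECT e2.userid, SUM(t.urgent) AS urgent_count
                   FROM employee e2
                   JOIN task t
                     ON e2.userid IN (t.created_by, t.claimed_by)
                  GROUP BY e2.userid) u
        ON u.userid = e.userid
      LEFT JOIN (SELECT claimed_by, COUNT(*) AS claimed_count
                   FROM task
                  GROUP BY claimed_by) clm
        ON clm.claimed_by = e.userid
      LEFT JOIN (SELECT created_by, COUNT(*) AS created_count
                   FROM task
                  GROUP BY created_by) crt
        ON crt.created_by = e.userid
     ORDER BY e.userid
""").fetchall()

print(rows)  # [(11, 3, 1, 3), (22, 1, 1, 2), (33, 1, 1, 2)]
```

Aggregating each count in its own subquery before joining avoids the row multiplication you would get from joining task to employee twice and counting afterwards.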
I have a huge table where a new row could be an "adjustment" to a previous row.
TableA:
Id | RefId | TransId |Score
----------------------------------
101 | null | 3001 | 10
102 | null | 3002 | 15
103 | null | 3003 | 15
104 | 101 | | -5
105 | null | 3004 | 5
106 | 105 | | -10
107 | null | 3005 | 15
TableB:
TransId | Person
----------------
3001 | Harry
3002 | Draco
3003 | Sarah
3004 | Ron
3005 | Harry
In the table above, Harry was given 10 points in TableA.Id=101, deducted 5 of those points in TableA.Id=104, and then given another 15 points in TableA.Id=107.
What I want to do here, is return all the rows where Harry is the person connected to the score. The problem is that there is no name attached to a row where points are deducted, only to the rows where scores are given (through TableB). However, scores are always deducted from a previously given score, where the original transaction's Id is referred to in the tables as "RefId".
SELECT
SUM(TableA.Score)
FROM TableA
LEFT JOIN TableB ON TableA.TransId=TableB.TransId
WHERE 1
AND TableB.Person='Harry'
GROUP BY TableA.Score
That only gives me the points given to Harry, not the deducted ones. I would like to get the total score returned, which would be 20 for Harry. (10-5+15=20)
How do I get MySQL to include the negative scores as well? I feel like it should be possible using the TableA.RefId. Something like "if there is a RefId, get the score from this row, but look at the corresponding TableA.Id for the rest of the data".
Select sum(total) AS total
From tableb
Join
(
Select t1.transid, t1.score + coalesce(sum(t2.score), 0) AS total
From tablea t1
Left Join tablea t2 on t1.id = t2.refid
group by t1.transid, t1.score
) x on x.transid = tableb.transid
Where TableB.Person='Harry'
Try this:
select sum(sum1 + sums) as sum_all from (
SELECT t1.id, t1.Score sum1, coalesce(t2.Score,0) sums
FROM TableA t1
inner JOIN TableB ON t1.TransId=TableB.TransId
left JOIN TableA t2 ON t2.RefId = t1.Id
WHERE TableB.Person='Harry'
)c
Output:
SUM_ALL
20
If you assume that adjustments don't modify adjustments, you can do this without aggregating all the data. Join each row to the row it adjusts (if any), keep the rows where either one belongs to Harry, and sum the row's own score:
select sum(a.score) as HarryScore
from tableA a left outer join
tableA aref
on a.refId = aref.id left outer join
tableB b
on a.TransId = b.Transid left outer join
tableB bref
on aref.TransId = bref.TransId
where b.Person = 'Harry' or bref.Person = 'Harry';
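The self-join idea can be checked against the sample data with the minimal sketch below. It assumes Python's standard-library sqlite3 module in place of MySQL, and it sums only each row's own score (a.score), so that an adjustment row contributes just its negative amount and the row it refers to is not counted twice.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA "
    "(Id INTEGER, RefId INTEGER, TransId INTEGER, Score INTEGER)"
)
conn.execute("CREATE TABLE TableB (TransId INTEGER, Person TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?)",
    [
        (101, None, 3001, 10), (102, None, 3002, 15), (103, None, 3003, 15),
        (104, 101, None, -5), (105, None, 3004, 5), (106, 105, None, -10),
        (107, None, 3005, 15),
    ],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?)",
    [(3001, "Harry"), (3002, "Draco"), (3003, "Sarah"),
     (3004, "Ron"), (3005, "Harry")],
)

# a    = every row; aref = the row that a adjusts (NULL for original rows).
# b    = owner of a;  bref = owner of the adjusted row.
harry_score = conn.execute("""
    SELECT SUM(a.Score)
      FROM TableA a
      LEFT JOIN TableA aref ON a.RefId = aref.Id
      LEFT JOIN TableB b    ON a.TransId = b.TransId
      LEFT JOIN TableB bref ON aref.TransId = bref.TransId
     WHERE b.Person = 'Harry' OR bref.Person = 'Harry'
""").fetchone()[0]

print(harry_score)  # 20
```

Rows 101, 104 and 107 pass the filter, giving 10 - 5 + 15 = 20.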
Update #1: query gives me syntax error on Left Join line (running the query within the left join independently works perfectly though)
SELECT b1.company_id, ((sum(b1.credit)-sum(b1.debit)) as 'Balance'
FROM MyTable b1
JOIN CustomerInfoTable c on c.id = b1.company_id
#Filter for Clients of particular brand, package and active status
where c.brand_id = 2 and c.status = 2 and c.package_id = 3
LEFT JOIN
(
SELECT b2.company_id, sum(b2.debit) as 'Current_Usage'
FROM MyTable b2
WHERE year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
)
b3 on b3.company_id = b1.company_id
group by b1.company_id;
Original Post:
I keep track of debits and credits in the same table. The table has the following schema:
| company_id | timestamp | credit | debit |
| 10 | MAY-25 | 100 | 000 |
| 11 | MAY-25 | 000 | 054 |
| 10 | MAY-28 | 000 | 040 |
| 12 | JUN-01 | 100 | 000 |
| 10 | JUN-25 | 150 | 000 |
| 10 | JUN-25 | 000 | 025 |
As my result, I want to see:
| Grouped by: company_id | Balance* | Current_Usage (in June) |
| 10 | 185 | 25 |
| 12 | 100 | 0 |
| 11 | -54 | 0 |
Balance: Calculated by (sum(credit) - sum(debits))* - timestamp does not matter
Current_Usage: Calculated by sum(debits) - but only for debits in JUN.
The problem: If I filter by JUN timestamp right away, it does not calculate the balance of all time but only the balance of any transactions in June.
How can I calculate the current usage by month but the balance on all transactions in the table. I have everything working, except that it filters only the JUN results into the current usage calculation in my code:
SELECT b.company_id, ((sum(b.credit)-sum(b.debit))/1024/1024/1024/1024) as 'BW_remaining', sum(b.debit/1024/1024/1024/1024/28*30) as 'Usage_per_month'
FROM mytable b
#How to filter this only for the current_usage calculation?
WHERE month(b.timestamp) = 'JUN' and b.credit = 0
#Group by company in order to sum all entries for balance
group by b.company_id
order by b.balance desc;
What you need here is a join with a subquery that filters by month.
SELECT T1.company_id,
((sum(T1.credit)-sum(T1.debit))/1024/1024/1024/1024) as 'BW_remaining',
MAX(T3.DEBIT_PER_MONTH)
FROM MYTABLE T1
LEFT JOIN
(
SELECT T2.company_id, SUM(T2.debit) AS DEBIT_PER_MONTH
FROM MYTABLE T2
WHERE month(T2.timestamp) = 6
GROUP BY T2.company_id
)
T3 ON T1.company_id = T3.company_id
GROUP BY T1.company_id
I haven't tested the query. The point I am trying to make is how you can join a subquery to your existing query to get the usage per month.
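A minimal sketch for trying the join-with-subquery idea on the sample data follows. Several things in it are assumptions: it uses Python's standard-library sqlite3 module instead of MySQL, the column is named ts, the dates are given the year 2012, and SQLite's strftime replaces MySQL's YEAR() and MONTH(). The 1024 divisions are dropped so the numbers match the expected result table.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(company_id INTEGER, ts TEXT, credit INTEGER, debit INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (10, "2012-05-25", 100, 0), (11, "2012-05-25", 0, 54),
        (10, "2012-05-28", 0, 40), (12, "2012-06-01", 100, 0),
        (10, "2012-06-25", 150, 0), (10, "2012-06-25", 0, 25),
    ],
)

# The balance is summed over all rows; the June usage comes from a
# subquery that is filtered by month before it is joined in.
rows = conn.execute("""
    SELECT t1.company_id,
           SUM(t1.credit) - SUM(t1.debit)       AS balance,
           COALESCE(MAX(t3.debit_per_month), 0) AS current_usage
      FROM mytable t1
      LEFT JOIN (SELECT t2.company_id,
                        SUM(t2.debit) AS debit_per_month
                   FROM mytable t2
                  WHERE strftime('%Y-%m', t2.ts) = '2012-06'
                  GROUP BY t2.company_id) t3
        ON t1.company_id = t3.company_id
     GROUP BY t1.company_id
     ORDER BY t1.company_id
""").fetchall()

print(rows)  # [(10, 185, 25), (11, -54, 0), (12, 100, 0)]
```

Because the subquery returns one row per company, MAX() just picks up that single value and the month filter never touches the balance.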
Alright, thanks to @Kshitij I got it working. In case somebody else runs into the same issue, this is how I solved it:
SELECT b1.company_id, (sum(b1.credit)-sum(b1.debit)) as 'Balance',
(
SELECT sum(b2.debit)
FROM MYTABLE b2
WHERE b2.company_id = b1.company_id and year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
) AS 'Usage_June'
FROM MYTABLE b1
#Group by company in order to add sum of all zones the company is using
group by b1.company_id
order by Usage_June desc;
I have the following (simplified) result from SELECT * FROM table ORDER BY tick,refid:
tick refid value
----------------
1 1 11
1 2 22
1 3 33
2 1 1111
2 3 3333
3 3 333333
Note the "missing" rows for refid 1 (tick 3) and refid 2 (ticks 2 and 3)
If possible, how can I make a query to add these missing rows using the most recent prior value for that refid? "Most recent" means the value for the row with the same refid as the missing row and largest tick such that the tick is less than the tick for the missing row. e.g.
tick refid value
----------------
1 1 11
1 2 22
1 3 33
2 1 1111
2 2 22
2 3 3333
3 1 1111
3 2 22
3 3 333333
Additional conditions:
All refids will have values at tick=1.
There may be many 'missing' ticks for a refid in sequence, (as above for refid 2).
There are many refids and it's not known which will have sparse data where.
There will be many ticks beyond 3, but all sequential. In the correct result, each refid will have a result for each tick.
Missing rows are not known in advance - this will be run on multiple databases, all with the same structure, and different "missing" rows.
I'm using MySQL and cannot change db just now. Feel free to post answer in another dialect, to help discussion, but I'll select an answer in MySQL dialect over others.
Yes, I know this can be done in the code, which I've implemented. I'm just curious if it can be done with SQL.
What value should be returned when a given tick-refid combination does not exist? In this solution, I simply returned the lowest value for that given refid.
Revision
I've updated the logic to determine what value to use in the case of a null. It should be noted that I'm assuming that ticks+refid is unique in the table.
Select Ticks.tick
, Refs.refid
, Case
When Table.value Is Null
Then (
Select T2.value
From Table As T2
Where T2.refid = Refs.refId
And T2.tick = (
Select Max(T1.tick)
From Table As T1
Where T1.tick < Ticks.tick
And T1.refid = T2.refid
)
)
Else Table.value
End As value
From (
Select Distinct refid
From Table
) As Refs
Cross Join (
Select Distinct tick
From Table
) As Ticks
Left Join Table
On Table.tick = Ticks.tick
And Table.refid = Refs.refid
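The same idea can be tried on the sample data with the minimal sketch below. It assumes Python's standard-library sqlite3 module instead of MySQL and a table named t, and it is a slight variation on the query above: it looks up the latest tick at or before the wanted tick (<=), which covers both existing and missing rows and so needs no CASE or outer join.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (tick INTEGER, refid INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 11), (1, 2, 22), (1, 3, 33),
     (2, 1, 1111), (2, 3, 3333), (3, 3, 333333)],
)

# Cross join every refid with every tick, then fetch the value stored at
# the most recent tick that is not later than the wanted tick.
filled = conn.execute("""
    SELECT ticks.tick,
           refs.refid,
           (SELECT t2.value
              FROM t AS t2
             WHERE t2.refid = refs.refid
               AND t2.tick = (SELECT MAX(t1.tick)
                                FROM t AS t1
                               WHERE t1.refid = refs.refid
                                 AND t1.tick <= ticks.tick)) AS value
      FROM (SELECT DISTINCT refid FROM t) AS refs
     CROSS JOIN (SELECT DISTINCT tick FROM t) AS ticks
     ORDER BY ticks.tick, refs.refid
""").fetchall()

for row in filled:
    print(row)
# (1, 1, 11), (1, 2, 22), (1, 3, 33),
# (2, 1, 1111), (2, 2, 22), (2, 3, 3333),
# (3, 1, 1111), (3, 2, 22), (3, 3, 333333)
```

Like the query above, this relies on (tick, refid) being unique and on every refid having a value at tick 1.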
If you know in advance what your 'tick' and 'refid' values are, make a helper table that contains all possible tick and refid values. Then left join from the helper table on tick and refid to your data table.
If you don't know exactly what your 'tick' and 'refid' values are, you could maybe still use this method, but instead of a static helper table, it would have to be dynamically generated.
The following has too many sub-selects for my taste, but it generates the desired result in MySQL, as long as every tick and every refid occurs separately at least once in the table.
Start with a query that generates every pair of tick and refid. The following uses the table to generate the pairs, so if any tick never appears in the underlying table, it will also be missing from the generated pairs. The same holds true for refids, though the restriction that "All refids will have values at tick=1" should ensure the latter never happens.
SELECT tick, refid FROM
(SELECT refid FROM chadwick WHERE tick=1) AS r
JOIN
(SELECT DISTINCT tick FROM chadwick) AS t
Using this, generate every missing tick, refid pair, along with the largest tick that exists in the table by equijoining on refid and θ≥-joining on tick. Group by the generated tick, refid since only one row for each pair is desired. The key to filtering out existing tick, refid pairs is the HAVING clause. Strictly speaking, you can leave out the HAVING; the resulting query will return existing rows with their existing values.
SELECT tr.tick, tr.refid, MAX(c.tick) AS ctick
FROM
(SELECT tick, refid FROM
(SELECT refid FROM chadwick WHERE tick=1) AS r
JOIN
(SELECT DISTINCT tick FROM chadwick) AS t
) AS tr
JOIN chadwick AS c ON tr.tick >= c.tick AND tr.refid=c.refid
GROUP BY tr.tick, tr.refid
HAVING tr.tick > MAX(c.tick)
One final select from the above as a sub-select, joined to the original table to get the value for the given ctick, returns the new rows for the table.
INSERT INTO chadwick
SELECT missing.tick, missing.refid, c.value
FROM (SELECT tr.tick, tr.refid, MAX(c.tick) AS ctick
FROM
(SELECT tick, refid FROM
(SELECT refid FROM chadwick WHERE tick=1) AS r
JOIN
(SELECT DISTINCT tick FROM chadwick) AS t
) AS tr
JOIN chadwick AS c ON tr.tick >= c.tick AND tr.refid=c.refid
GROUP BY tr.tick, tr.refid
HAVING tr.tick > MAX(c.tick)
) AS missing
JOIN chadwick AS c ON missing.ctick = c.tick AND missing.refid=c.refid
;
Performance on the sample table, along with (tick, refid) and (refid, tick) indices:
+----+-------------+------------+-------+-------------------+----------+---------+----------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+-------------------+----------+---------+----------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 1 | PRIMARY | c | ALL | tick_ref,ref_tick | NULL | NULL | NULL | 6 | Using where; Using join buffer |
| 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 9 | Using temporary; Using filesort |
| 2 | DERIVED | c | ref | tick_ref,ref_tick | ref_tick | 5 | tr.refid | 1 | Using where; Using index |
| 3 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 3 | |
| 3 | DERIVED | <derived5> | ALL | NULL | NULL | NULL | NULL | 3 | Using join buffer |
| 5 | DERIVED | chadwick | index | NULL | tick_ref | 10 | NULL | 6 | Using index |
| 4 | DERIVED | chadwick | ref | tick_ref | tick_ref | 5 | | 2 | Using where; Using index |
+----+-------------+------------+-------+-------------------+----------+---------+----------+------+---------------------------------+
As I said, too many sub-selects. A temporary table may help matters.
To check for missing ticks:
SELECT clo.tick+1 AS missing_tick
FROM chadwick AS chi
RIGHT JOIN chadwick AS clo ON chi.tick = clo.tick+1
WHERE chi.tick IS NULL;
This will return at least one row with tick equal to 1 + the largest tick in the table. Thus, the largest value in this result can be ignored.
To get the list of (tick, refid) pairs to insert, first generate every possible pair:
SELECT a.tick, b.refid
FROM ( SELECT DISTINCT tick FROM t) a
CROSS JOIN ( SELECT DISTINCT refid FROM t) b
Now subtract the existing pairs from that query (MINUS is Oracle syntax; in MySQL you would use a LEFT JOIN ... IS NULL or NOT EXISTS instead):
SELECT a.tick tick, b.refid refid
FROM ( SELECT DISTINCT tick FROM t) a
CROSS JOIN ( SELECT DISTINCT refid FROM t) b
MINUS
SELECT DISTINCT tick, refid FROM t
Now you can join with t to obtain the final query (note that I use an inner join plus a left join to find the most recent previous value, but you could adapt this):
INSERT INTO t(tick, refid, value)
SELECT c.tick, c.refid, t1.value
FROM ( SELECT a.tick tick, b.refid refid
FROM ( SELECT DISTINCT tick FROM t) a
CROSS JOIN ( SELECT DISTINCT refid FROM t) b
MINUS
SELECT DISTINCT tick, refid FROM t
) c
INNER JOIN t t1 ON t1.refid = c.refid and t1.tick < c.tick
LEFT JOIN t t2 ON t2.refid = c.refid AND t1.tick < t2.tick AND t2.tick < c.tick
WHERE t2.tick IS NULL