Explaining results out of JOIN clause - mysql

I have a mycash table in MySQL with the following data:
Test Case 1 :
When I run the following query,
SELECT t.id, SUM(prev.cash) AS cash_sum FROM mycash t JOIN mycash prev ON (t.id > prev.id)
I get :
id | cash_sum
2 | 1303.00
Summing the cash values from all the rows gives 1302, not 1303.
When I change the comparison operator in the ON condition, I get the following results:
Test Case 2 :
For ON (t.id < prev.id) , result is:
id | cash_sum
1 | 2603.00
Test Case 3 :
For ON (t.id >= prev.id) , result is:
id | cash_sum
1 | 2605.00
Test Case 4 :
For ON (t.id <= prev.id) , result is:
id | cash_sum
1 | 3905.00
What is the calculation behind each of the results? A step-by-step explanation would help most.

The results are correct.
200 + 200 + 200 + 301 + 301 + 101 = 1303

Yes, I got the answer myself. It is quite simple.
For any comparison operator in the ON condition, each row of table t is compared with every row of table prev, and every pair that satisfies the condition becomes one joined row. The query then outputs the sum of prev.cash over all of those joined rows, so a single cash value can be counted several times (the join can produce more than 4 rows).
Let me explain the test case 1:
SELECT t.id, SUM(prev.cash) AS cash_sum FROM mycash t JOIN mycash prev ON (t.id > prev.id)
We have to select those rows from table prev whose id is less than a particular id in table t.
In the first table, i.e. table t, when the id is 1, we get no corresponding rows from the second table, i.e. table prev. When the id is 2 in the first table, we get id 1 (cash = 200) from the second table. When the id is 3 in the first table, we get ids 1 and 2 (cash = 200 and 301 respectively) from the second table. When the id is 4 in the first table, we get ids 1, 2, 3 (cash = 200, 301, 101 respectively). So after adding all the cash values, cash_sum becomes 200+200+301+200+301+101 = 1303.
Other test cases can be explained in a similar way.
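To check the other three test cases the same way, here is a minimal sketch in Python with an in-memory SQLite database standing in for MySQL. The table data is not shown in the question, so the rows below are an assumption: the cash values for ids 1 to 3 are the ones quoted in the answer, and the value for id 4 (700) is inferred from the stated total of 1302. Note also that the original query has no GROUP BY, so the whole join collapses into a single row, and as far as I know the id shown next to the sum is just whichever t.id MySQL happens to pick.

```python
import sqlite3

# Assumed data: ids 1-3 use the cash values quoted in the answer; the value
# for id 4 (700) is inferred from the stated total of 1302, not from the post.
rows = [(1, 200), (2, 301), (3, 101), (4, 700)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mycash (id INTEGER PRIMARY KEY, cash INTEGER)")
conn.executemany("INSERT INTO mycash VALUES (?, ?)", rows)


def joined_sum(op):
    """Sum prev.cash over every (t, prev) pair that satisfies t.id <op> prev.id."""
    sql = (
        "SELECT SUM(prev.cash) "
        f"FROM mycash t JOIN mycash prev ON t.id {op} prev.id"
    )
    return conn.execute(sql).fetchone()[0]


# One total per comparison operator used in test cases 1 to 4.
totals = {op: joined_sum(op) for op in (">", "<", ">=", "<=")}
print(totals)
```

With these assumed rows the four totals match the four results in the question: each >= or <= variant adds the plain table total (1302) on top of the strict variant, because every row also pairs with itself.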

Related

MySQL - select product_id from 2 different tables and group results

situation:
table 1 - #__virtuemart_products
virtuemart_product_id | product_special
PRODUCTS_IDS | 0 or 1
table 2 - #__virtuemart_product_badges
virtuemart_product_id | product_badge
PRODUCTS_IDS | for this situation code 3
I have a default SQL
SELECT p.`virtuemart_product_id`
FROM `#__virtuemart_products` as p
WHERE p.`product_special` = 1;
The result is a list of product IDs like 2,3,225,...
I need to modify this query to select IDs from 2 different tables and return them in one column.
If I modify the query like this:
SELECT p.`virtuemart_product_id`, badges_table.`virtuemart_product_id`
FROM `#__virtuemart_products` as p, `#__virtuemart_product_badges` as badges_table
WHERE p.`product_special` = 1 OR badges_table.`badge` = 3
Result is:
virtuemart_product_id | virtuemart_product_id
1 | 123
1 | 321
1 | 231
....
Why is the first column 1,1,1,...? It should contain the product_id, not the product_special code.
I need to group these results into one column, virtuemart_product_id.
What am I doing wrong?
I think what you are looking for is UNION of the IDs fetched from two different tables.
SELECT p.`virtuemart_product_id`, badges_table.`virtuemart_product_id`
FROM `#__virtuemart_products` as p, `#__virtuemart_product_badges` as badges_table
WHERE p.`product_special` = 1 OR badges_table.`badge` = 3
This query performs a join between the two tables with the condition that product_special is 1 or badge is 3. Because the condition does not relate the two tables to each other, each row from one table is paired with every row of the other table for which the condition holds.
To get IDs from both the tables you can get the results from each table according to condition and then perform a UNION on them. So for example
(SELECT `virtuemart_product_id` FROM `#__virtuemart_products` WHERE `product_special` = 1)
UNION
(SELECT `virtuemart_product_id` FROM `#__virtuemart_product_badges` WHERE `badge` = 3)
I hope this helps.
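To see the difference between the two queries, here is a small sketch in Python using an in-memory SQLite database. The table names products and product_badges and all sample rows are hypothetical stand-ins for the two VirtueMart tables, chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for #__virtuemart_products and
# #__virtuemart_product_badges, with made-up sample rows.
conn.executescript("""
CREATE TABLE products (virtuemart_product_id INTEGER, product_special INTEGER);
CREATE TABLE product_badges (virtuemart_product_id INTEGER, badge INTEGER);
INSERT INTO products VALUES (1, 1), (2, 0), (3, 1);
INSERT INTO product_badges VALUES (3, 3), (4, 3), (5, 1);
""")

# Comma join with an OR condition: rows from both tables get paired up.
cross = conn.execute("""
    SELECT p.virtuemart_product_id, b.virtuemart_product_id
    FROM products p, product_badges b
    WHERE p.product_special = 1 OR b.badge = 3
""").fetchall()

# UNION: one column of IDs, duplicates removed.
union_ids = [r[0] for r in conn.execute("""
    SELECT virtuemart_product_id FROM products WHERE product_special = 1
    UNION
    SELECT virtuemart_product_id FROM product_badges WHERE badge = 3
    ORDER BY virtuemart_product_id
""")]

print(len(cross), union_ids)
```

The comma join returns many pairs of unrelated IDs, while the UNION returns each matching ID once in a single column.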

MySQL Self-Join Clause (Incrementing ID)

I have a table called "stocks" with the following records:
ID| Date | Qty
1 | 2017-01-03 | 10
2 | 2017-02-11 | 15
3 | 2017-03-15 | 16
4 | 2017-04-25 | 30
5 | 2017-06-20 | 40
I want to find the difference between the "Qty" of each successive rows. For that purpose, I use the query:
SELECT first_table.id as "First Table ID"
, first_table.date AS "From"
, first_table.qty AS "First Table Qty"
, second_table.id as "Second Table ID"
, second_table.date AS "To"
, second_table.qty AS "Second Table Qty"
, (second_table.qty - first_table.qty) AS Quantity_Difference
FROM stocks first_table
JOIN stocks second_table
ON first_table.id + 1 = second_table.id
The following depicts the result that I got from the above query.
My questions are:
1) In the above query, what does the clause first_table.id + 1 = second_table.id mean?
2) In the JOIN clause, I add 1 to the first_table ID (i.e. first_table.id + 1).
But in the result that I got, why is it the second_table ID that gets incremented? I thought that, by adding 1 to the first_table ID, the first_table ID would be incremented instead of the second_table ID.
In the above query, what does the clause first_table.id + 1 = second_table.id mean?
It means to join rows in the table whose IDs differ by 1.
But in the result that I got, why is it the second_table ID that gets incremented?
It's not incrementing IDs, it's adding 1 to the ID of one row and comparing that with the ID of another row. When first_table.id = 2, first_table.id + 1 is 3, so it joins that row with second_table.id = 3.
The addition is only done in the ON clause; you're not returning its result in the SELECT list. So the query selects the original first_table.id, not first_table.id + 1.
As mentioned in the comments, this query will only work properly when IDs all increment by 1. If there are any gaps in the ID sequence, you'll skip the first_table.id before the gap and second_table.id after the gap. See Subtract Quantity From Previous Row MySQL for a better way to subtract values from adjacent rows that doesn't depend on IDs being sequential.
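Both points can be checked with a short sketch in Python, using an in-memory SQLite database in place of MySQL. The rows are the ones from the question; the aliases a and b are shortened stand-ins for first_table and second_table, and deleting id 3 to create a gap is my own illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stocks (id INTEGER PRIMARY KEY, date TEXT, qty INTEGER)")
conn.executemany("INSERT INTO stocks VALUES (?, ?, ?)", [
    (1, "2017-01-03", 10), (2, "2017-02-11", 15), (3, "2017-03-15", 16),
    (4, "2017-04-25", 30), (5, "2017-06-20", 40),
])

# a.id is returned unchanged; a.id + 1 is only used for the comparison in ON.
QUERY = """
    SELECT a.id, b.id, b.qty - a.qty
    FROM stocks a JOIN stocks b ON a.id + 1 = b.id
    ORDER BY a.id
"""
pairs = conn.execute(QUERY).fetchall()

# Introduce a gap in the ID sequence and run the same query again.
conn.execute("DELETE FROM stocks WHERE id = 3")
pairs_with_gap = conn.execute(QUERY).fetchall()

print(pairs)
print(pairs_with_gap)
```

After the gap is introduced, the pairs on either side of the missing ID disappear from the result, which is the limitation described above.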

How to find rows with exact same value in one or more columns

Hi I'd like to know how can I find in PHP-MySQL all the rows that have the same value in one column or more than one. The value doesn't have to be specified, so I wanna find ALL the rows that have a value that is not unique in the table (meaning there is at least another row with the same value).
For example, the table columns are ID | number | value
And the rows are:
1 | 5 | hello
2 | 6 | goodbye
3 | 7 | see you
4 | 6 | hello
5 | 6 | goodbye
I would like a query to find all the rows that have the same value in the field value, so in this case the results would be rows 1,4 and 2,5
Also I would like a query to find all the rows that have the same value in both fields number and value, so in the example the result would be just rows 2,5
I need to retrieve all the rows that I find so SELECT DISTINCT doesn't fit since I would only retrieve the common value and not the entire row.
Try this:
SELECT ID, number, value
FROM mytable
WHERE value IN (SELECT value
                FROM mytable
                GROUP BY value
                HAVING COUNT(*) >= 2)
and this for the second case:
SELECT ID, number, value
FROM mytable
WHERE (number, value) IN (SELECT number, value
                          FROM mytable
                          GROUP BY number, value
                          HAVING COUNT(*) >= 2)
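As a quick check of both queries against the sample rows from the question, here is a sketch in Python with an in-memory SQLite database standing in for MySQL. It assumes an SQLite version that supports row values in IN, and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER PRIMARY KEY, number INTEGER, value TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, 5, "hello"), (2, 6, "goodbye"), (3, 7, "see you"),
    (4, 6, "hello"), (5, 6, "goodbye"),
])

# Rows whose value occurs at least twice in the table.
same_value = [r[0] for r in conn.execute("""
    SELECT ID FROM mytable
    WHERE value IN (SELECT value FROM mytable
                    GROUP BY value HAVING COUNT(*) >= 2)
    ORDER BY ID
""")]

# Rows whose (number, value) combination occurs at least twice.
same_both = [r[0] for r in conn.execute("""
    SELECT ID FROM mytable
    WHERE (number, value) IN (SELECT number, value FROM mytable
                              GROUP BY number, value HAVING COUNT(*) >= 2)
    ORDER BY ID
""")]

print(same_value, same_both)
```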
One more way to do it, using EXISTS. This returns every row for which some other row has the same val or the same num (which includes rows whose val-num combination is repeated).
select *
from tablename t
where exists (select 1 from tablename t1
where t.id <> t1.id and (t.val = t1.val or t.num = t1.num)
)
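A small sketch of the EXISTS approach, again in Python with SQLite as a stand-in for MySQL and the sample rows from the question. The second query, which swaps OR for AND to require both columns to match, is my own variation and not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, num INTEGER, val TEXT)")
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", [
    (1, 5, "hello"), (2, 6, "goodbye"), (3, 7, "see you"),
    (4, 6, "hello"), (5, 6, "goodbye"),
])

TEMPLATE = """
    SELECT t.id FROM tablename t
    WHERE EXISTS (SELECT 1 FROM tablename t1
                  WHERE t.id <> t1.id
                    AND (t.val = t1.val {joiner} t.num = t1.num))
    ORDER BY t.id
"""

# OR: another row shares val or num. AND (my variation): it shares both.
either = [r[0] for r in conn.execute(TEMPLATE.format(joiner="OR"))]
both = [r[0] for r in conn.execute(TEMPLATE.format(joiner="AND"))]

print(either, both)
```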

Mysql count with case when statement

Consider:
SELECT count(c.id),
case when(count(c.id) = 0)
then 'loser'
when(count(c.id) BETWEEN 1 AND 4)
then 'almostaloser'
when(count(c.id) >= 5)
then 'notaloser'
end as status,
...
When all is said and done, the query as a whole produces a set of results that look similar to this:
Count | status
--------|-------------
2 | almostaloser //total count is between 1 and 4
--------|-------------
0 | loser // loser because total count = 0
--------|-------------
3 | almostaloser //again, total count between 1 and 4
--------|-------------
What I would like to achieve:
a method to retain the information from the above table, but add a third column that will give a total count of each status, something like
select count(c.id)
case when(count(c.id) = 0 )
then loser as status AND count how many of the total count does this apply to
results would look similar to:
Count | status |total_of each status |
--------|-------------|---------------------|
2 | almostaloser| 2 |
--------|-------------|---------------------|
0 | loser | 1 |
--------|-------------|---------------------|
3 | almostaloser| 2 |
--------|-------------|----------------------
I've been told this could be achieved using a derived table, but I've not yet been able to get both, only one or the other.
This can be achieved with the following query (you must place your original query as a subquery in two places):
SELECT t1.*, t2.total_of_each_status
FROM (
    -- put here your query --
) t1
INNER JOIN (
    SELECT status, count(*) AS total_of_each_status
    FROM (
        -- put here your query --
    ) q
    GROUP BY status
) t2 ON t2.status = t1.status
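Since the original query is elided, here is a sketch of the same pattern in Python with an in-memory SQLite database. The table per_row_counts and its three rows are hypothetical stand-ins for whatever the real query produces; only the derived-table join itself follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the elided query: one pre-computed count per row.
conn.execute("CREATE TABLE per_row_counts (cnt INTEGER)")
conn.executemany("INSERT INTO per_row_counts VALUES (?)", [(2,), (0,), (3,)])

# Plays the role of "put here your query": a count plus its status label.
BASE = """
    SELECT cnt,
           CASE WHEN cnt = 0 THEN 'loser'
                WHEN cnt BETWEEN 1 AND 4 THEN 'almostaloser'
                WHEN cnt >= 5 THEN 'notaloser'
           END AS status
    FROM per_row_counts
"""

# The base query is used twice: once for the detail rows (t1) and once,
# grouped by status, for the per-status totals (t2).
result = conn.execute(f"""
    SELECT t1.cnt, t1.status, t2.total_of_each_status
    FROM ({BASE}) t1
    INNER JOIN (
        SELECT status, COUNT(*) AS total_of_each_status
        FROM ({BASE}) q
        GROUP BY status
    ) t2 ON t2.status = t1.status
    ORDER BY t1.cnt
""").fetchall()

print(result)
```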

Deleting cross referenced data

I have the following MySQL table:
id rid
----- ------
1 2
2 1
2 3
3 2
1 3
3 1
I want to change this so only one row per relation exists.
e.g:
id rid
----- ------
1 2
2 3
1 3
If you always have pairs (as in your example):
delete from table
where id > rid;
This keeps the record where id is smaller.
If there is the possibility that not all pairs exist, then:
delete t
from table t left outer join
     (select least(id, rid) as lid, greatest(id, rid) as gid, count(*) as cnt
      from table t2
      group by least(id, rid), greatest(id, rid)
     ) t2
     on least(t.id, t.rid) = t2.lid and greatest(t.id, t.rid) = t2.gid
where t.id > t.rid and t2.cnt > 1;
EDIT (explanation):
How does the second query work? Let me be honest, what I want to write is this:
delete t from table t
where id > rid and
      exists (select 1 from table t2 where t2.id = t.rid and t2.rid = t.id);
That is, I want to keep all records where id < rid. But then, I also want to keep all singleton records where id > rid, so a row with id > rid is deleted only when its mirror row exists. I don't think MySQL allows this, since the subquery in the where clause refers to the table being deleted from.
Instead, the query in the answer counts the number of times that a pair exists, by looking at the smallest value and the largest value of each pair. For the data in the question, the result of the subquery is:
lid gid cnt
1 2 2
2 3 2
1 3 2
So, for all of these pairs, the row with id < rid is the one that is kept. If you had one more row, say 4, 1, the result would look like:
lid gid cnt
1 2 2
2 3 2
1 3 2
1 4 1
In this case, the first three pairs would keep the row with id < rid. But the new row would also be kept, because its cnt is 1.
If you had duplicates in the table and a primary key, there would be a slight variation on the query that would do the same thing.
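The intended outcome (keep the id < rid row of every mirrored pair, and keep singletons untouched) can be sketched in Python with an in-memory SQLite database. This is not the MySQL join form from the answer: SQLite accepts a correlated EXISTS on the table being deleted from, so the sketch uses that simpler form. The table name relations is hypothetical, and the extra row 4, 1 is the singleton from the explanation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE relations (id INTEGER, rid INTEGER)")  # hypothetical name
conn.executemany("INSERT INTO relations VALUES (?, ?)", [
    (1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1),
    (4, 1),  # singleton: its mirror row (1, 4) does not exist
])

# Delete a row only if id > rid AND its mirror row (rid, id) is present.
conn.execute("""
    DELETE FROM relations
    WHERE id > rid
      AND EXISTS (SELECT 1 FROM relations m
                  WHERE m.id = relations.rid AND m.rid = relations.id)
""")

remaining = sorted(conn.execute("SELECT id, rid FROM relations").fetchall())
print(remaining)
```

One row per relation remains, and the singleton 4, 1 survives even though its id is larger than its rid.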