MySQL using JOIN and reports - mysql

So, I've got a MySQL query that looks like this:
SELECT rewards.name, redemptions.points, redemptions.value
FROM rewards
INNER JOIN redemptions ON rewards.id = redemptions.reward_id;
The problem is that it spits out a table like this:
-----------------------------
| Reward 1 | 500 | 30 |
-----------------------------
| Reward 1 | 500 | 30 |
-----------------------------
| Reward 1 | 500 | 30 |
-----------------------------
| Reward 2 | 100 | 10 |
-----------------------------
| Reward 2 | 100 | 10 |
-----------------------------
| Reward 3 | 250 | 20 |
-----------------------------
and so on. Ideally, what I would actually like it to do is to only list each one once, but sum certain columns. So for example, it would look something like this:
-----------------------------
| Reward 1 | 500 | 90 |
-----------------------------
| Reward 2 | 100 | 20 |
-----------------------------
| Reward 3 | 250 | 20 |
-----------------------------
Here the third column is summed, while the first two columns are listed just once. I thought a UNION might do it, because I know it removes duplicates, but I don't think that works in combination with a JOIN.

Use SUM, and remember to add GROUP BY for every non-aggregated column:
SELECT rewards.name, redemptions.points, SUM(redemptions.value)
FROM rewards
INNER JOIN redemptions ON rewards.id = redemptions.reward_id
GROUP BY rewards.name, redemptions.points;

SELECT a.name, b.points, SUM(b.value)
FROM rewards a
INNER JOIN redemptions b
ON a.id = b.reward_id
GROUP BY a.name, b.points
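
A minimal sketch of the grouped query, run against an in-memory SQLite database from Python rather than MySQL (an assumption made only so the example is self-contained). The column types and the `total_value` alias are illustrative and not taken from the question:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rewards (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE redemptions (reward_id INTEGER, points INTEGER, value INTEGER);
    INSERT INTO rewards VALUES (1, 'Reward 1'), (2, 'Reward 2'), (3, 'Reward 3');
    INSERT INTO redemptions VALUES
        (1, 500, 30), (1, 500, 30), (1, 500, 30),
        (2, 100, 10), (2, 100, 10),
        (3, 250, 20);
""")

# One row per (name, points) pair, with the value column summed.
rows = conn.execute("""
    SELECT a.name, b.points, SUM(b.value) AS total_value
    FROM rewards a
    INNER JOIN redemptions b ON a.id = b.reward_id
    GROUP BY a.name, b.points
    ORDER BY a.name
""").fetchall()

for row in rows:
    print(row)
```

Grouping by both name and points keeps every selected column either grouped or aggregated, which is what stricter SQL modes require.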

Related

How to calculate the remaining amount per row?

I have a wallet table like this:
// wallet
+----+----------+--------+
| id | user_id | amount |
+----+----------+--------+
| 1 | 5 | 1000 |
| 2 | 5 | -200 |
| 3 | 5 | -100 |
| 4 | 5 | 500 |
+----+----------+--------+
I want to make a view that calculates the remaining amount per row. Something like this:
+----+----------+--------+------------------+
| id | user_id | amount | remaining_amount |
+----+----------+--------+------------------+
| 1 | 5 | 1000 | 1000 |
| 2 | 5 | -200 | 800 |
| 3 | 5 | -100 | 700 |
| 4 | 5 | 500 | 1200 |
+----+----------+--------+------------------+
Any idea how can I do that?
MySQL 8 has window functions for this purpose, such as SUM() OVER.
For your sample data, the query below calculates the running sum for every user_id.
Both the PARTITION BY and the ORDER BY are vital for the function to produce the right amount.
The PARTITION BY restricts each sum to a single user_id, so if you had users 5, 6, 7 and 8, it would correctly add (or subtract) only the amounts that each user produced.
The ORDER BY is needed to get the right amount at the right position. Tables are unsorted by nature, so the ORDER BY defines the sequence in which rows are added to the running sum; if the ids were different, the rows would be added in a different order and the intermediate amounts would change.
SELECT
`id`, `user_id`, `amount`
, SUM(`amount`) OVER(PARTITION BY `user_id` ORDER BY `id`) run_sum
FROM wallet
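+----+---------+--------+---------+
| id | user_id | amount | run_sum |
+----+---------+--------+---------+
| 1  | 5       | 1000   | 1000    |
| 2  | 5       | -200   | 800     |
| 3  | 5       | -100   | 700     |
| 4  | 5       | 500    | 1200    |
+----+---------+--------+---------+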
I don't know whether this meets your needs, but a correlated subquery also works:
SELECT
t1.id, t1.user_id, t1.amount,
(
SELECT SUM(t2.amount) FROM wallet t2 WHERE t2.id <= t1.id AND t1.user_id = t2.user_id
) AS remaining_amount
FROM wallet t1
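
A small sketch comparing the two approaches on the sample wallet data. It uses Python's sqlite3 module in place of MySQL (an assumption for runnability; window functions need SQLite 3.25 or newer), and the variable names are illustrative:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wallet (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER);
    INSERT INTO wallet VALUES (1, 5, 1000), (2, 5, -200), (3, 5, -100), (4, 5, 500);
""")

# Running sum with a window function, restarted for each user_id.
window_rows = conn.execute("""
    SELECT id, user_id, amount,
           SUM(amount) OVER (PARTITION BY user_id ORDER BY id) AS remaining_amount
    FROM wallet
    ORDER BY id
""").fetchall()

# Same result with a correlated subquery, for servers without window functions.
subquery_rows = conn.execute("""
    SELECT t1.id, t1.user_id, t1.amount,
           (SELECT SUM(t2.amount)
            FROM wallet t2
            WHERE t2.user_id = t1.user_id AND t2.id <= t1.id) AS remaining_amount
    FROM wallet t1
    ORDER BY t1.id
""").fetchall()

for row in window_rows:
    print(row)
```

Both queries should return the same rows. The correlated subquery rescans the table for every row, so the window function is usually the better choice where it is available.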

Selecting related rows in MySQL

Let me elaborate. I have a table like this (updated to include more examples)
| id | date | cust | label | paid | due |
+----+-----------+------+-------------------------+------+-------+
| 1 |2016-02-02 | 1 | SALE: Acme Golf Balls | 0 | 1000 |
| 20 |2016-03-01 | 1 | PAYMENT: transaction #1 | 700 | 0 |
| 29 |2016-03-02 | 1 | PAYMENT: transaction #1 | 300 | 0 |
| 30 |2016-03-02 | 3 | SALE: Acme Large Anvil | 500 | 700 |
| 32 |2016-03-02 | 3 | PAYMENT: transaction #30| 100 | 0 |
| 33 |2016-03-03 | 2 | SALE: Acme Rockets | 0 | 2000 |
Now I need to output a table that displays sales that haven't been paid in full and the remaining amount. How do I do that? There's not much info out there on how to relate rows from the same table.
EDIT: Here's the output table I'm thinking of making
Table: debts_n_loans
| cust | label | amount |
==========================================
| 3 | SALE: Acme Large Anvil | 100 |
| 2 | SALE: Acme Rockets | 2000 |
If cust is the key that ties them together, then you can just use aggregation and a having clause:
select cust, sum(paid), sum(due)
from t
group by cust
having sum(paid) <> sum(due);
If you want the details, you can use a join, in or exists to get the details.
EDIT:
If you need to do this using the transaction number at the end of the label:
select t.id, t.due, coalesce(sum(tpay.paid), 0) as paid
from t left join
t tpay
on tpay.label like concat('%#', t.id) and
tpay.label like 'PAYMENT:%'
where t.label like 'SALE:%'
group by t.id, t.due
having t.due <> coalesce(sum(tpay.paid), 0);
So you only need the rows with a due greater than 0
SELECT * FROM <table> WHERE due > 0;
Try this:
SELECT
cust,
SUM(due) - SUM(paid) AS remaining
FROM t1
GROUP BY cust
HAVING SUM(due) > SUM(paid);
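
One possible way to reproduce the desired debts_n_loans output is to self-join each SALE row to the PAYMENT rows whose label ends in its id, then subtract both the amount paid on the sale row itself and the later payments. The sketch below runs on SQLite via Python (an assumption for runnability; in MySQL the pattern would be built with CONCAT('%#', s.id) instead of `||`), and the aliases `s` and `p` are illustrative:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, date TEXT, cust INTEGER,
                    label TEXT, paid INTEGER, due INTEGER);
    INSERT INTO t VALUES
        (1,  '2016-02-02', 1, 'SALE: Acme Golf Balls',    0,   1000),
        (20, '2016-03-01', 1, 'PAYMENT: transaction #1',  700, 0),
        (29, '2016-03-02', 1, 'PAYMENT: transaction #1',  300, 0),
        (30, '2016-03-02', 3, 'SALE: Acme Large Anvil',   500, 700),
        (32, '2016-03-02', 3, 'PAYMENT: transaction #30', 100, 0),
        (33, '2016-03-03', 2, 'SALE: Acme Rockets',       0,   2000);
""")

# s = sale rows, p = payment rows that reference the sale id in their label.
# The LEFT JOIN keeps sales that have no payments at all.
debts = conn.execute("""
    SELECT s.cust, s.label,
           s.due - s.paid - COALESCE(SUM(p.paid), 0) AS amount
    FROM t s
    LEFT JOIN t p
           ON p.label LIKE 'PAYMENT:%'
          AND p.label LIKE ('%#' || s.id)
    WHERE s.label LIKE 'SALE:%'
    GROUP BY s.id, s.cust, s.label, s.due, s.paid
    HAVING s.due - s.paid - COALESCE(SUM(p.paid), 0) > 0
    ORDER BY s.id
""").fetchall()

for row in debts:
    print(row)
```

Matching on a label suffix is fragile; a dedicated column holding the referenced sale id would make the join simpler and indexable.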

How do I get multiple COUNT with multiple JOINS and multiple conditions?

I have an SQL (MySQL) query that I can't figure out. The application uses uploaded photos, where many participants can be tagged in a photo and each photo can be given a vote from 1 to 5.
The original query gets all the votes for a photo and orders them by amount of votes and the average of those votes.
Now I need to limit the returned photos by the ones with more than 1 participant. So photos with only 1 participant should not be accounted for.
Simplified schema looks like this.
PHOTOS
----------------------
| id | title |
----------------------
| 1 | Fun stuff |
| 2 | Crazy girls |
| 3 | Single boy |
PHOTO_VOTES
-------------------------------------------
| photo_id | grade | date | user_id |
-------------------------------------------
| 1 | 3 | … | 12 |
| 1 | 3 | … | 12 |
| 2 | 5 | … | 14 |
| 2 | 4 | … | 14 |
| 3 | 4 | … | 15 |
| 3 | 4 | … | 18 |
PHOTO_PARTICIPANTS
-------------------------
| photo_id | user_id |
-------------------------
| 1 | 12 |
| 1 | 21 |
| 1 | 33 |
| 2 | 14 |
| 2 | 33 |
| 3 | 12 |
This is how far I got:
SELECT vote.photo_id,
COUNT(vote.photo_id) AS vote_count,
AVG(vote.grade) AS vote_average,
COUNT(pp.photo_id) AS participant_count
FROM photo_votes vote
LEFT JOIN photos p ON (vote.photo_id = p.id)
LEFT JOIN photo_participants pp ON (pp.photo_id = p.id)
GROUP BY vote.photo_id
HAVING vote_count >= 2
AND vote_average >= 3
AND participant_count > 1
ORDER BY vote_count DESC, vote_average DESC;
Basically, this is what I'm looking to end up with, excluding the photo with only one participant:
VOTES
-----------------------------------------------------------
| photo_id | vote_count | average | participant_count
-----------------------------------------------------------
| 1 | 2 | 3 | 3
| 2 | 2 | 4.5 | 2
Update
It turned out that this is a very inefficient way of doing what I want. Gordon's answer below did solve the problem, but as soon as I wanted to join fields from the photos table as well, the cartesian product became a real problem: the query became very heavy and slow.
The solution I finally ended up with is a cache field in the photos table that keeps track of how many participants are in the photo. In other words, I added a 'participant_count' field to 'photos' that is updated every time a change is made to the participants table. I also run a cron job regularly to make sure every photo's 'participant_count' is up to date.
First, you don't need left joins for this. But that shouldn't affect the results. The problem is that you have a cartesian product, because you have two 1-n relationships to photos: votes and participants.
The proper way to fix this is by using subqueries:
SELECT pv.photo_id, pv.vote_count, pv.vote_average, pp.participant_count
FROM (SELECT pv.photo_id, count(*) AS vote_count, avg(grade) AS vote_average
FROM photo_votes pv
GROUP BY pv.photo_id
) pv
JOIN
(SELECT pp.photo_id, count(*) AS participant_count
FROM photo_participants pp
GROUP BY pp.photo_id
) pp
ON pv.photo_id = pp.photo_id
WHERE pv.vote_count >= 2 AND
pv.vote_average >= 3 AND
pp.participant_count > 1
ORDER BY pv.vote_count DESC, pv.vote_average DESC;
Note that you don't even need the photos table, because you are not using any fields in it.
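
To illustrate the cartesian product, the sketch below runs a direct join and the pre-aggregated version side by side on the sample data. It uses SQLite via Python instead of MySQL (an assumption for runnability) and leaves out the date column, whose values are elided in the question:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photo_votes (photo_id INTEGER, grade INTEGER, user_id INTEGER);
    CREATE TABLE photo_participants (photo_id INTEGER, user_id INTEGER);
    INSERT INTO photo_votes VALUES
        (1, 3, 12), (1, 3, 12), (2, 5, 14), (2, 4, 14), (3, 4, 15), (3, 4, 18);
    INSERT INTO photo_participants VALUES
        (1, 12), (1, 21), (1, 33), (2, 14), (2, 33), (3, 12);
""")

# Joining both 1-n tables directly multiplies votes by participants.
naive = conn.execute("""
    SELECT v.photo_id, COUNT(*) AS joined_rows
    FROM photo_votes v
    JOIN photo_participants pp ON pp.photo_id = v.photo_id
    GROUP BY v.photo_id
    ORDER BY v.photo_id
""").fetchall()

# Aggregating each table first gives one row per photo before the join.
fixed = conn.execute("""
    SELECT pv.photo_id, pv.vote_count, pv.vote_average, pp.participant_count
    FROM (SELECT photo_id, COUNT(*) AS vote_count, AVG(grade) AS vote_average
          FROM photo_votes
          GROUP BY photo_id) pv
    JOIN (SELECT photo_id, COUNT(*) AS participant_count
          FROM photo_participants
          GROUP BY photo_id) pp
      ON pv.photo_id = pp.photo_id
    WHERE pv.vote_count >= 2
      AND pv.vote_average >= 3
      AND pp.participant_count > 1
    ORDER BY pv.vote_count DESC, pv.vote_average DESC
""").fetchall()

print(naive)
print(fixed)
```

In the direct join, photo 1 yields 2 votes x 3 participants = 6 joined rows, so both counts are inflated. Aggregating first avoids that.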

select query to calculate number of occurrence as well as total cost

I have a report page that displays summarized data from another report. I am using PHP and mysqli. Let me explain in detail.
I have a web application for a store, where you can add product details. Using these product details you can generate a packaging list report of products. Based on the generated packaging list report, I need to generate another report that contains summarized data from the packaging list.
Below are my tables:
product table:
id | name | desc_id | purity | style_no | type | duty
1 | ABC | 1 | 18 | TEST123 | R | 100
2 | XYZ | 2 | 14 | TEST456 | B | 80
3 | DEF | 1 | 14 | TEST122 | R | 80
4 | PQR | 1 | 18 | TEST124 | R | 120
5 | HJK | 3 | 18 | TEST134 | B | 300
Description table:
id | descrip
1 | Gold Diamond Ring
2 | Gold Diamond Pendant
3 | Gold Diamond Earring
packaging_master table
id | name
1 | pkg_1
2 | pkg_2
packaging_details table
id | pkg_id | prod_id
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 1 | 4
5 | 1 | 5
I have used below query to generate the packaging list report for specific id, which works correctly.
SELECT id, (SELECT descrip FROM description WHERE id = desc_id ) AS descrip,
style_no, type , purity, duty FROM product WHERE id IN ( SELECT prod_id FROM packaging_list_details WHERE pkg_id =1 ) ORDER BY descrip ASC , purity ASC
which displays below result:
id | descrip | style_no | type | purity | duty
1 |Gold Diamond Ring | TEST123 | R | 18 | 100
4 |Gold Diamond Ring | TEST124 | R | 18 | 120
3 |Gold Diamond Ring | TEST122 | R | 14 | 80
2 |Gold Diamond Pendant| TEST456 | B | 14 | 80
5 |Gold Diamond Earring| TEST134 | B | 18 | 300
Now I want summarized data of above result using query.
Like:
id | descrip | purity | qty | duty
1 |Gold Diamond Ring | 18 | 2 | 220
2 |Gold Diamond Ring | 14 | 1 | 80
3 |Gold Diamond Pendant| 14 | 1 | 80
4 |Gold Diamond Earring| 18 | 1 | 300
How can I achieve this?
You need to use the GROUP BY clause - see the MySQL docs for more info.
The query then becomes:
SELECT d.descrip, p.purity, count(p.purity) as qty, sum(p.duty)
FROM product p
INNER JOIN Description d ON p.desc_id = d.id
LEFT OUTER JOIN packaging_details pg on pg.prod_id = p.id
GROUP BY d.descrip, p.purity
ORDER BY d.descrip desc, p.purity desc
You can also use the subselect approach you were using, but I prefer joins. INNER JOIN returns only the rows that have a match in both tables. LEFT OUTER JOIN returns all rows from the table on the LEFT of the statement and matches them to rows from the table on the RIGHT, filling in NULLs where there is no match.
NOTE: I am not sure where you are getting the values for Id in your sample - Are they simply row numbers?
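
A short sketch of the difference between the two join types, using SQLite via Python in place of MySQL. The fourth description row ('Gold Diamond Bracelet') is hypothetical and added only so that one description has no matching product; the product table is reduced to the two columns the join needs:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE description (id INTEGER PRIMARY KEY, descrip TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, desc_id INTEGER);
    INSERT INTO description VALUES
        (1, 'Gold Diamond Ring'), (2, 'Gold Diamond Pendant'),
        (3, 'Gold Diamond Earring'),
        (4, 'Gold Diamond Bracelet');  -- hypothetical row with no products
    INSERT INTO product VALUES (1, 1), (2, 2), (3, 1), (4, 1), (5, 3);
""")

# INNER JOIN: only descriptions that have at least one matching product.
inner_rows = conn.execute("""
    SELECT d.descrip, COUNT(p.id) AS products
    FROM description d
    INNER JOIN product p ON p.desc_id = d.id
    GROUP BY d.id, d.descrip
    ORDER BY d.id
""").fetchall()

# LEFT JOIN: every description; COUNT(p.id) is 0 where nothing matched.
left_rows = conn.execute("""
    SELECT d.descrip, COUNT(p.id) AS products
    FROM description d
    LEFT JOIN product p ON p.desc_id = d.id
    GROUP BY d.id, d.descrip
    ORDER BY d.id
""").fetchall()

print(inner_rows)
print(left_rows)
```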
I think you should rewrite your query using JOINs:
SELECT
P.id
,D.descrip
,P.style_no
,P.type
,P.purity
,P.duty
FROM
packaging_list_details PLD
JOIN
product P ON
(P.id = PLD.prod_id)
LEFT JOIN
description D on
(D.id = P.desc_id)
WHERE
(PLD.pkg_id = 1)
That should give you the same result you already have. To get the totals, you can write a new query, similar to the above:
SELECT
D.descrip
,P.type
,P.purity
,COUNT(P.id) as total_products
,SUM(P.duty) as total_duty
FROM
packaging_list_details PLD
JOIN
product P ON
(P.id = PLD.prod_id)
LEFT JOIN
description D on
(D.id = P.desc_id)
WHERE
(PLD.pkg_id = 1)
GROUP BY
D.descrip
,P.type
,P.purity
The second query gives you the totals you are looking for.
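
A sketch of the summary grouped only by description and purity, checked against the sample data with SQLite via Python (an assumption for runnability). The question names the link table both packaging_details and packaging_list_details; the sketch assumes the former:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, desc_id INTEGER,
                          purity INTEGER, style_no TEXT, type TEXT, duty INTEGER);
    CREATE TABLE description (id INTEGER PRIMARY KEY, descrip TEXT);
    CREATE TABLE packaging_details (id INTEGER PRIMARY KEY, pkg_id INTEGER,
                                    prod_id INTEGER);
    INSERT INTO product VALUES
        (1, 'ABC', 1, 18, 'TEST123', 'R', 100),
        (2, 'XYZ', 2, 14, 'TEST456', 'B', 80),
        (3, 'DEF', 1, 14, 'TEST122', 'R', 80),
        (4, 'PQR', 1, 18, 'TEST124', 'R', 120),
        (5, 'HJK', 3, 18, 'TEST134', 'B', 300);
    INSERT INTO description VALUES
        (1, 'Gold Diamond Ring'), (2, 'Gold Diamond Pendant'),
        (3, 'Gold Diamond Earring');
    INSERT INTO packaging_details VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4), (5, 1, 5);
""")

# One row per (description, purity) for package 1, with quantity and total duty.
summary = conn.execute("""
    SELECT d.descrip, p.purity, COUNT(*) AS qty, SUM(p.duty) AS duty
    FROM packaging_details pd
    JOIN product p ON p.id = pd.prod_id
    JOIN description d ON d.id = p.desc_id
    WHERE pd.pkg_id = 1
    GROUP BY d.descrip, p.purity
    ORDER BY d.descrip DESC, p.purity DESC
""").fetchall()

for row in summary:
    print(row)
```

The id column in the desired output looks like a row number, so it is left out here; it could be generated in the application code while rendering the report.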

GROUP BY with aggregate and an INNER JOIN

I tried to narrow down the problem as much as possible, it is still quite something. This is the query that doesn't work the way I want it:
SELECT *, MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY id_edge
Here is what I get:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|2 | 23 | 2 | 23 | 35 | 1 | 9 |
|4 | 24 | 5 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
and this is what I would want:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|3 | 23 | 9 | 23 | 35 | 1 | 9 |
|5 | 24 | 9 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
But let me elaborate a bit...
I have a graph, let's say as such:
node1
|
node2
/ \
node3 node4
| |
node5 node6
Therefore I have a table I call tbl_edges like this:
| id | nodeA | nodeB |
------------------------
| 12 | 1 | 2 |
| 23 | 2 | 3 |
| 24 | 2 | 4 |
| 35 | 3 | 5 |
| 46 | 4 | 6 |
------------------------
Now each edge has "stop_overs" at a certain distance (to nodeA). Therefore I have a table tbl_stopover like this:
| id | edge | dist |
------------------------
| 1 | 12 | 5 |
| 2 | 23 | 2 |
| 3 | 23 | 9 |
| 4 | 24 | 5 |
| 5 | 24 | 9 |
| 6 | 35 | 5 |
| 7 | 46 | 5 |
------------------------
Why this query?
Let's assume I want to calculate the distance between the stop_overs. Within one edge that is no problem. Across edges it gets more difficult. But if two edges are connected and there is no other connection, I can also calculate the distance. Here is an example, assuming all edges have a length of 10:
edge23 has a stop_over(id=3) at dist=9, edge35 has a stop_over(id=6) at dist=5. Therefore the distance between these two stop_overs is:
dist = (length - dist_id3) + dist_id6 = (10 - 9) + 5 = 6
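
The same arithmetic as a tiny Python sketch. The function and parameter names are hypothetical and only restate the formula above:

```python
def stopover_distance(edge_length, dist_on_first_edge, dist_on_second_edge):
    """Distance between a stop_over on one edge and a stop_over on the
    directly connected next edge, both measured from their edge's nodeA."""
    remaining_on_first_edge = edge_length - dist_on_first_edge
    return remaining_on_first_edge + dist_on_second_edge


# stop_over id=3 (edge 23, dist=9) to stop_over id=6 (edge 35, dist=5),
# assuming every edge has length 10.
distance = stopover_distance(10, 9, 5)
print(distance)  # 6
```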
I am not sure if I made myself clear. If this is not understandable, feel free to ask questions and I will do my best to explain it better.
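MySQL allows you to do something silly (unless the ONLY_FULL_GROUP_BY SQL mode is enabled): display fields in an aggregate query that are not part of the GROUP BY or of an aggregate function like MAX. When you do this, you get random (as you said) results for the remaining fields, because the server is free to pick the value from any row in the group.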
In your query you are doing this twice - once in your inner query (id2 is not part of a GROUP BY or aggregate) and once in the outer.
Prepare for random results!
To fix it, try something like this:
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn,
MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id, edges2.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
The major changes are the explicit field list (note that I removed id_edge, since you are joining on id1 and already have that field) and the extra fields added to both the inner and outer GROUP BY clauses.
If this gives you more rows than you want then you may need to explain more about your desired result set. Something like this is the only way to ensure you get appropriate groupings.
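
If the goal is the whole stop_over row that holds the maximum dist on each singly connected edge, one common alternative is to join back to a per-edge MAX subquery instead of relying on non-grouped columns. The sketch below is a swapped-in technique, not the query from the answer above. It runs on SQLite via Python (an assumption for runnability) and assumes the stop_over column is named `edge`, as in the table listing, rather than `id_edge` as in the queries:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_edges (id INTEGER PRIMARY KEY, nodeA INTEGER, nodeB INTEGER);
    CREATE TABLE tbl_stopover (id INTEGER PRIMARY KEY, edge INTEGER, dist INTEGER);
    INSERT INTO tbl_edges VALUES
        (12, 1, 2), (23, 2, 3), (24, 2, 4), (35, 3, 5), (46, 4, 6);
    INSERT INTO tbl_stopover VALUES
        (1, 12, 5), (2, 23, 2), (3, 23, 9), (4, 24, 5),
        (5, 24, 9), (6, 35, 5), (7, 46, 5);
""")

rows = conn.execute("""
    SELECT s.id, s.edge, s.dist, c.id2
    FROM tbl_stopover s
    -- m: the largest dist per edge; joining on it keeps only that row.
    JOIN (SELECT edge, MAX(dist) AS max_dist
          FROM tbl_stopover
          GROUP BY edge) m
      ON m.edge = s.edge AND m.max_dist = s.dist
    -- c: edges with exactly one follow-up edge; MIN() is safe because
    -- the HAVING clause guarantees a single candidate.
    JOIN (SELECT e1.id AS id1, MIN(e2.id) AS id2
          FROM tbl_edges e1
          JOIN tbl_edges e2 ON e1.nodeB = e2.nodeA
          GROUP BY e1.id
          HAVING COUNT(*) = 1) c
      ON c.id1 = s.edge
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Every selected column is either grouped, aggregated, or comes from a fully determined row, so the result does not depend on which row the server happens to pick. If two stop_overs on one edge share the maximum dist, both are returned.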
Okay. This seems to be the answer to my question. I will do some further "investigation" though, because I'm not sure if this is reliable. If anybody has some thoughts on this, please leave a comment.
SELECT tbl.id, tbl.dist, tbl.id1, tbl.id2, MAX(dist) maxDist
FROM
(
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.dist, tbl_conn.id1
ORDER BY dist DESC) AS tbl
GROUP BY tbl.id1, tbl.id2
Thanks to JNK (my colleague at work) without whom I wouldn't have gotten this far.