Get minimum from result with GROUP BY in MySQL

I have a table that stores hierarchy data in MySQL. The table stores the stable relations, but any user who has bought less than 1000 in total should be removed, and the user at the lower level takes their place. My code works fine up to a point: after the GROUP BY it contains all ancestors of each descendant, and COUNT(*) AS level could then count the level of each user. I have this SQL code to compress the data according to the minimum buy for each user:
+-------------+---------------+-------------+
| ancestor_id | descendant_id | path_length |
+-------------+---------------+-------------+
| 1 | 1 | 0 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 2 |
| 1 | 5 | 3 |
| 1 | 6 | 4 |
| 2 | 2 | 0 |
| 2 | 4 | 1 |
| 2 | 5 | 2 |
| 2 | 6 | 3 |
| 3 | 3 | 0 |
| 4 | 4 | 0 |
| 4 | 5 | 1 |
| 4 | 6 | 2 |
| 5 | 5 | 0 |
| 5 | 6 | 1 |
| 6 | 6 | 0 |
+-------------+---------------+-------------+
This is the buy table:
+--------+--------+
| userid | amount |
+--------+--------+
| 2 | 2000 |
| 4 | 6000 |
| 6 | 7000 |
| 1 | 7000 |
+--------+--------+
SQL code
SELECT a.*
FROM
( SELECT userid
FROM webineh_user_buys
GROUP BY userid
HAVING SUM(amount) >= 1000
) AS buys_d
JOIN
webineh_prefix_nodes_paths AS a
ON a.descendant_id = buys_d.userid
JOIN
(
SELECT userid
FROM webineh_user_buys
GROUP BY userid
HAVING SUM(amount) >= 1000
) AS buys_a on (a.ancestor_id = buys_a.userid )
JOIN
( SELECT descendant_id
, MAX(path_length) path_length
FROM webineh_prefix_nodes_paths
where a.ancestor_id = ancestor_id
GROUP
BY descendant_id
) b
ON b.descendant_id = a.descendant_id
AND b.path_length = a.path_length
GROUP BY a.descendant_id, a.ancestor_id
I need to get the max path_length where the ancestor_id has at least 1000 in total buys, but the WHERE clause in the subquery (where a.ancestor_id = ancestor_id) raises this error:
1054 - Unknown column 'a.ancestor_id' in 'where clause'
I have added an SQL Fiddle demo.

You could use this query:
select m.userid as descendant,
p.ancestor_id,
p.path_length
from (
select b1.userid,
min(case when b2.amount >= 1000
then p.path_length
end) as path_length
from (select userid, sum(amount) amount
from webineh_user_buys
group by userid
having sum(amount) >= 1000
) as b1
left join webineh_prefix_nodes_paths p
on p.descendant_id = b1.userid
and p.path_length > 0
left join (select userid, sum(amount) amount
from webineh_user_buys
group by userid) as b2
on p.ancestor_id = b2.userid
group by b1.userid
) as m
left join webineh_prefix_nodes_paths p
on p.descendant_id = m.userid
and p.path_length = m.path_length
order by m.userid
Output for sample data in the question:
| descendant | ancestor_id | path_length |
|--------|-------------|-------------|
| 1 | (null) | (null) |
| 2 | 1 | 1 |
| 4 | 2 | 1 |
| 6 | 4 | 2 |
SQL fiddle
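For readers who want to re-check the output above without a MySQL server, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module and runs the same query. Using SQLite instead of MySQL is an assumption made only for convenience; the query text itself is unchanged apart from whitespace.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE webineh_prefix_nodes_paths (
    ancestor_id INT, descendant_id INT, path_length INT);
CREATE TABLE webineh_user_buys (userid INT, amount INT);
INSERT INTO webineh_prefix_nodes_paths VALUES
 (1,1,0),(1,2,1),(1,3,1),(1,4,2),(1,5,3),(1,6,4),
 (2,2,0),(2,4,1),(2,5,2),(2,6,3),
 (3,3,0),
 (4,4,0),(4,5,1),(4,6,2),
 (5,5,0),(5,6,1),
 (6,6,0);
INSERT INTO webineh_user_buys VALUES (2,2000),(4,6000),(6,7000),(1,7000);
""")

QUERY = """
select m.userid as descendant, p.ancestor_id, p.path_length
from (
    -- shortest path from each qualifying user to a qualifying ancestor
    select b1.userid,
           min(case when b2.amount >= 1000 then p.path_length end) as path_length
    from (select userid, sum(amount) amount
          from webineh_user_buys
          group by userid
          having sum(amount) >= 1000) as b1
    left join webineh_prefix_nodes_paths p
           on p.descendant_id = b1.userid and p.path_length > 0
    left join (select userid, sum(amount) amount
               from webineh_user_buys
               group by userid) as b2
           on p.ancestor_id = b2.userid
    group by b1.userid
) as m
-- look the ancestor up again by that shortest path length
left join webineh_prefix_nodes_paths p
       on p.descendant_id = m.userid and p.path_length = m.path_length
order by m.userid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

User 5 has no buys, so for user 6 the CASE expression skips the ancestor at distance 1 and the minimum falls on user 4 at distance 2.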

Related

MySQL query with sum fields from other table, with a twist

Sorry for the vague title, but I don't know how to word this type of problem better. Here is a simple example to explain it. I have two tables: OrderItemList and OrderHistoryLog.
OrderItemList:
|------------------------------|
| OrderNo | ItemNo | Loc | Qty |
|------------------------------|
| 100 | A | 1 | 1 |
| 101 | A | 1 | 2 |
| 102 | A | 1 | 1 |
| 103 | A | 2 | 1 |
| 104 | A | 2 | 1 |
OrderHistoryLog:
|------------------------------|
| OrderNo | ItemNo | Loc | Qty |
|------------------------------|
| 50 | A | 1 | 5 |
| 51 | A | 1 | 2 |
| 100 | A | 1 | 1 |
| 102 | A | 1 | 3 |
| 103 | A | 2 | 1 |
I need to show the records in the OrderItemList along with a LocHistQty field, which is the sum(Qty) from the OrderHistoryLog table for a given Item and Location, but only for the orders that are present in the OrderItemList.
For the above example, the result should be:
Result:
|------------------------------------------------------
| OrderNo | ItemNo | Loc | Qty | HistQty | LocHistQty |
|------------------------------|-----------------------
| 100 | A | 1 | 1 | 1 | 4 |
| 101 | A | 1 | 2 | 0 | 4 |
| 102 | A | 1 | 1 | 3 | 4 |
| 103 | A | 2 | 1 | 1 | 1 |
| 104 | A | 2 | 1 | 0 | 1 |
It is the last field, LocHistQty that I could use some help with. Here is what I started with (does not work):
select OI.OrderNo, OI.ItemNo, OI.Loc, OI.Qty, IFNULL(OL.Qty, 0) as HistQty, OL2.LocHistQty
from OrderItemList OI
left join OrderHistoryLog OL on OL.OrderNo = OI.OrderNo and OL.ItemNo = OI.ItemNo
join
(
select ItemNo, Loc, sum(qty) as LocHistQty
from OrderHistoryLog
group by ItemNo, Loc
) as OL2
on OL2.ItemNo = OI.ItemNo and OL2.Loc = OI.Loc
order by OrderNo
The issue with the above SQL is that LocHistQty contains the sum of Qty for all orders in the log (11 for Loc 1 and 1 for Loc 2), not only for the ones in OrderItemList.
Lastly, the real data is voluminous and query performance is important.
Help would be much appreciated.
The subquery can join with OrderItemList to restrict the order numbers that it sums.
select OI.OrderNo, OI.ItemNo, OI.Loc, OI.Qty, IFNULL(OL.Qty, 0) as HistQty, OL2.LocHistQty
from OrderItemList OI
left join OrderHistoryLog OL on OL.OrderNo = OI.OrderNo and OL.ItemNo = OI.ItemNo
join
(
select OL.ItemNo, OL.Loc, sum(OL.qty) as LocHistQty
from OrderHistoryLog AS OL
JOIN OrderItemList AS OI ON OL.OrderNo = OI.OrderNo
group by OL.ItemNo, OL.Loc
) as OL2
on OL2.ItemNo = OI.ItemNo and OL2.Loc = OI.Loc
order by OrderNo
DEMO
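As a quick check of the idea above (restricting the summing subquery by joining it to OrderItemList), here is a minimal sketch run against the question's sample rows in an in-memory SQLite database via Python. It assumes the log table is named OrderHistoryLog, as in the question's table listing, and uses the hypothetical aliases H and L inside the subquery to keep them distinct from the outer ones.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderItemList   (OrderNo INT, ItemNo TEXT, Loc INT, Qty INT);
CREATE TABLE OrderHistoryLog (OrderNo INT, ItemNo TEXT, Loc INT, Qty INT);
INSERT INTO OrderItemList VALUES
 (100,'A',1,1),(101,'A',1,2),(102,'A',1,1),(103,'A',2,1),(104,'A',2,1);
INSERT INTO OrderHistoryLog VALUES
 (50,'A',1,5),(51,'A',1,2),(100,'A',1,1),(102,'A',1,3),(103,'A',2,1);
""")

rows = conn.execute("""
SELECT OI.OrderNo, OI.ItemNo, OI.Loc, OI.Qty,
       IFNULL(OL.Qty, 0) AS HistQty,
       OL2.LocHistQty
FROM OrderItemList OI
LEFT JOIN OrderHistoryLog OL
       ON OL.OrderNo = OI.OrderNo AND OL.ItemNo = OI.ItemNo
JOIN (
    -- only history rows whose order also appears in OrderItemList are summed
    SELECT H.ItemNo, H.Loc, SUM(H.Qty) AS LocHistQty
    FROM OrderHistoryLog AS H
    JOIN OrderItemList AS L ON H.OrderNo = L.OrderNo
    GROUP BY H.ItemNo, H.Loc
) AS OL2
  ON OL2.ItemNo = OI.ItemNo AND OL2.Loc = OI.Loc
ORDER BY OI.OrderNo
""").fetchall()

for row in rows:
    print(row)
```

Orders 50 and 51 are absent from OrderItemList, so they drop out of the inner join and Loc 1 sums to 4 instead of 11.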
Option 1
SELECT
OrderNo,
ItemNo,
Loc,
Qty,
IFNULL((SELECT
Qty
FROM
OrderHistoryLog AS A
WHERE
A.OrderNo = B.OrderNo AND A.Loc = B.Loc), 0) AS HistQty,
(SELECT
SUM(D.Qty)
FROM
OrderHistoryLog AS D
WHERE
D.ItemNo = B.ItemNo AND D.Loc = B.Loc
AND D.OrderNo IN (SELECT OrderNo FROM OrderItemList)) AS LocHistQty
FROM
OrderItemList AS B;
Option 2
SELECT
B.OrderNo,
B.ItemNo,
B.Loc,
B.Qty,
IFNULL(C.Qty, 0) AS HistQty,
(SELECT
SUM(A.Qty)
FROM
OrderHistoryLog AS A
WHERE
A.ItemNo = B.ItemNo AND A.Loc = B.Loc
AND A.OrderNo IN (SELECT OrderNo FROM OrderItemList)) AS LocHistQty
FROM
OrderItemList AS B
LEFT JOIN
OrderHistoryLog AS C
ON C.OrderNo = B.OrderNo AND C.Loc = B.Loc;

Selecting COUNT and MAX columns with 2 tables and a bridge table

What I am trying to do involves 3 tables (pictures, collections, and bridge) with the following columns:
Collections Table:
| id | name |
------------------
| 1 | coll1 |
| 2 | coll2 |
------------------
Pictures Table: (timestamps are unix timestamps)
| id | name | timestamp |
-------------------------
| 5 | Pic5 | 1 |
| 6 | Pic6 | 19 |
| 7 | Pic7 | 3 |
| 8 | Pic8 | 892 |
| 9 | Pic9 | 4 |
-------------------------
Bridge Table:
| id | collection | picture |
-----------------------------
| 1 | 1 | 5 |
| 2 | 1 | 6 |
| 3 | 1 | 7 |
| 4 | 1 | 8 |
| 5 | 2 | 5 |
| 6 | 2 | 9 |
| 7 | 2 | 7 |
-----------------------------
And the result should look like this:
| collection_name | picture_count | newest_picture |
----------------------------------------------------
| coll1 | 4 | 8 |
| coll2 | 3 | 9 |
----------------------------------------------------
newest_picture should always be the picture with the highest timestamp in that collection, and I also want to sort the result by it. picture_count is the number of pictures in that collection.
Can this be done in a single statement with table joins, and if yes,
what is the best way to do it?
A simple method uses correlated subqueries:
select c.*,
(select count(*)
from bridge b
where b.collection = c.id
) as pic_count,
(select p.id
from bridge b join
pictures p
on b.picture = p.id
where b.collection = c.id
order by p.timestamp desc
limit 1
) as most_recent_picture
from collections c;
A more common approach would use window functions (available in MySQL 8.0 and later):
select c.id, c.name, count(bp.collection), bp.most_recent_picture
from collections c left join
(select b.*,
first_value(p.id) over (partition by b.collection order by p.timestamp desc) as most_recent_picture
from bridge b join
pictures p
on b.picture = p.id
) bp
on bp.collection = c.id
group by c.id, c.name, bp.most_recent_picture;
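The question also asks whether this can be done with plain table joins and sorted by the newest picture. A minimal sketch of one join-only variant follows, run against the sample rows in an in-memory SQLite database via Python. It is not the approach from the answers above, and it assumes timestamps are unique within a collection; with ties, a collection would appear once per tied picture. The alias s and the column newest_ts are names introduced here for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE collections (id INT, name TEXT);
CREATE TABLE pictures    (id INT, name TEXT, timestamp INT);
CREATE TABLE bridge      (id INT, collection INT, picture INT);
INSERT INTO collections VALUES (1,'coll1'),(2,'coll2');
INSERT INTO pictures VALUES
 (5,'Pic5',1),(6,'Pic6',19),(7,'Pic7',3),(8,'Pic8',892),(9,'Pic9',4);
INSERT INTO bridge VALUES
 (1,1,5),(2,1,6),(3,1,7),(4,1,8),(5,2,5),(6,2,9),(7,2,7);
""")

rows = conn.execute("""
SELECT c.name AS collection_name,
       s.picture_count,
       p.id   AS newest_picture
FROM collections c
JOIN (
    -- one row per collection: picture count and highest timestamp
    SELECT b.collection,
           COUNT(*)         AS picture_count,
           MAX(p.timestamp) AS newest_ts
    FROM bridge b
    JOIN pictures p ON p.id = b.picture
    GROUP BY b.collection
) s ON s.collection = c.id
-- join back to find which picture carries that highest timestamp
JOIN bridge b   ON b.collection = c.id
JOIN pictures p ON p.id = b.picture AND p.timestamp = s.newest_ts
ORDER BY s.newest_ts DESC
""").fetchall()

for row in rows:
    print(row)
```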

Left join with unique values

I am looking to get all values from the first table along with the joined values from the second table.
Table 1 is fee_category with fields:
id | Category
1 | A
2 | B
3 | C
4 | D
Table 2 is fee_charge with fields:
id | std_id | particularID | CategoryID | assign | amount
1 | 1 | 1 | 1 | 0 | 1000
2 | 1 | 1 | 2 | 1 | 12000
3 | 1 | 2 | 3 | 0 | 3000
4 | 1 | 2 | 4 | 0 | 10
5 | 2 | 1 | 2 | 0 | 100
6 | 2 | 2 | 3 | 0 | 120
The base table is "fee_category", from which I need all rows, left joined with "fee_charge", from which I need either the values or NULL for a particular std_id and particularID:
SELECT fee_category.id, fee_category.Category, fee_charge.std_id
, fee_charge.particularID, fee_charge.CategoryID, fee_charge.assign, fee_charge.amount FROM fee_category
LEFT join fee_charge on fee_category.id=fee_charge.CategoryID
where (fee_charge.std_id = 1 OR fee_charge.std_id IS NULL)
AND (fee_charge.particularID = 1 OR fee_charge.particularID IS NULL)
group By fee_category.id
order By fee_charge.assign DESC
Here I am trying to get all categories of std_id=1 and particularID=1
Correct result should be
id | Category | std_id | particularID | CategoryID | assign | amount
1 | A | 1 | 1 | 1 | 0 | 1000
1 | B | 1 | 1 | 2 | 1 | 12000
1 | C | 1 | NULL | NULL | NULL | NULL
1 | D | 1 | NULL | NULL | NULL | NULL
I am trying various versions of the above query but not getting proper result. Please help
SELECT fee_category.id
, fee_category.Category
, X.std_id
, X.particularID
, X.CategoryID
, X.assign
, X.amount
FROM fee_category
LEFT JOIN
(SELECT * FROM fee_charge
WHERE fee_charge.std_id = 1
AND fee_charge.particularID = 1) AS X
ON X.CategoryID = fee_category.id
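The reason the filter has to move out of the WHERE clause can be seen on the sample data. The sketch below, run in an in-memory SQLite database via Python as a stand-in for MySQL, compares the question's WHERE-based filter (without its GROUP BY) against the same conditions placed in the ON clause, which is equivalent to joining the pre-filtered derived table. The aliases c and f are introduced here for brevity.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fee_category (id INT, Category TEXT);
CREATE TABLE fee_charge (
    id INT, std_id INT, particularID INT, CategoryID INT, assign INT, amount INT);
INSERT INTO fee_category VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D');
INSERT INTO fee_charge VALUES
 (1,1,1,1,0,1000),(2,1,1,2,1,12000),(3,1,2,3,0,3000),
 (4,1,2,4,0,10),(5,2,1,2,0,100),(6,2,2,3,0,120);
""")

# Filter applied after the join: categories C and D do match some charge
# rows, so they are never NULL-extended, and the WHERE then discards them.
where_rows = conn.execute("""
SELECT c.id, c.Category, f.amount
FROM fee_category c
LEFT JOIN fee_charge f ON c.id = f.CategoryID
WHERE (f.std_id = 1 OR f.std_id IS NULL)
  AND (f.particularID = 1 OR f.particularID IS NULL)
ORDER BY c.id
""").fetchall()

# Filter applied as part of the join: every category survives, with NULLs
# where no charge row matches std_id = 1 and particularID = 1.
on_rows = conn.execute("""
SELECT c.id, c.Category, f.amount
FROM fee_category c
LEFT JOIN fee_charge f
       ON c.id = f.CategoryID
      AND f.std_id = 1
      AND f.particularID = 1
ORDER BY c.id
""").fetchall()

print(where_rows)
print(on_rows)
```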
It's very hard to follow when the fiddle doesn't match the question, so I may have misunderstood, but perhaps you're after something like this...
SELECT x.id
, z.category
, x.std_id
, y.particularID
, y.categoryID
, y.assign
, y.amount
FROM fee_charge x
LEFT
JOIN fee_charge y
ON y.id = x.id
AND y.particularID = 1
JOIN fee_category z
ON z.id = x.categoryID
WHERE x.std_id = 1;

group Items by column and order by other column

I have the table below, and I want to take the latest rating for the client.
Basically, whenever a user updates a rating, count is incremented and a new entry is made in the table. The table looks like this:
-----------------------------------------------------
|_id| name | client_id | user_id | rating | count |
-----------------------------------------------------
|1 | Four | 1 | 1 | 4 | 1 |
|2 | three | 1 | 1 | 3 | 2 |
|3 | two | 1 | 1 | 2 | 3 |
|4 | five | 1 | 1 | 5 | 4 |
|5 | two | 1 | 2 | 2 | 1 |
|6 | three | 1 | 2 | 3 | 2 |
|7 | two | 2 | 1 | 2 | 1 |
|8 | three | 2 | 1 | 3 | 2 |
-----------------------------------------------------
For the ratings of client_id 1 I want output like this:
-----------------------------------------------------
|_id| name | client_id | user_id | rating | count |
-----------------------------------------------------
|4 | five | 1 | 1 | 5 | 4 |
|6 | three | 1 | 2 | 3 | 2 |
-----------------------------------------------------
So far I have tried:
SELECT * FROM test
where client_id = 1 group by client_id order by count desc;
but I am not getting the expected result. Any help?
You can use a left join of the table to itself:
select t1.* from test t1
left join test t2 on t1.user_id = t2.user_id
and t1.client_id = t2.client_id
and t1._id < t2._id
where
t2._id is null
and t1.client_id = 1
order by t1.`count` desc;
Using an uncorrelated subquery, you can do it like this:
select t1.* from test t1
join (
select max(_id) as _id,
client_id,
user_id
from test
where client_id = 1
group by client_id,user_id
)t2
on t1._id = t2._id
and t1.client_id = t2.client_id
order by t1.`count` desc;
UPDATE: A comment asked how to join another table into the query above. Here is an example:
mysql> select * from users ;
+------+------+
| _id | name |
+------+------+
| 1 | AAA |
| 2 | BBB |
+------+------+
2 rows in set (0.00 sec)
mysql> select * from test ;
+------+-------+-----------+---------+--------+-------+
| _id | name | client_id | user_id | rating | count |
+------+-------+-----------+---------+--------+-------+
| 1 | four | 1 | 1 | 4 | 1 |
| 2 | three | 1 | 1 | 3 | 2 |
| 3 | two | 1 | 1 | 2 | 3 |
| 4 | five | 1 | 1 | 5 | 4 |
| 5 | two | 1 | 2 | 2 | 1 |
| 6 | three | 1 | 2 | 3 | 2 |
| 7 | two | 2 | 1 | 2 | 1 |
| 8 | three | 2 | 1 | 3 | 2 |
+------+-------+-----------+---------+--------+-------+
select t1.*,u.name from test t1
join users u on u._id = t1.user_id
left join test t2 on t1.user_id = t2.user_id
and t1.client_id = t2.client_id
and t1._id < t2._id
where
t2._id is null
and t1.client_id = 1
order by t1.`count` desc;
Will give you
+------+-------+-----------+---------+--------+-------+------+
| _id | name | client_id | user_id | rating | count | name |
+------+-------+-----------+---------+--------+-------+------+
| 4 | five | 1 | 1 | 5 | 4 | AAA |
| 6 | three | 1 | 2 | 3 | 2 | BBB |
+------+-------+-----------+---------+--------+-------+------+
Note that the join to the users table is an inner join, which requires every user referenced in the test table to be present in the users table.
If some users are missing from the users table, use a left join instead; the columns selected from users will then be NULL for those rows.
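To confirm that the self left join and the MAX(_id) subquery pick the same rows, here is a minimal sketch that runs both against the sample data in an in-memory SQLite database via Python. SQLite is used only as a convenient stand-in for MySQL, so the count column is quoted with double quotes here rather than backticks.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (
    _id INT, name TEXT, client_id INT, user_id INT, rating INT, "count" INT);
INSERT INTO test VALUES
 (1,'four',1,1,4,1),(2,'three',1,1,3,2),(3,'two',1,1,2,3),(4,'five',1,1,5,4),
 (5,'two',1,2,2,1),(6,'three',1,2,3,2),(7,'two',2,1,2,1),(8,'three',2,1,3,2);
""")

# Anti-join: keep a row only if no later row exists for the same
# (client_id, user_id) pair.
anti_join = conn.execute("""
SELECT t1._id, t1.name, t1.user_id, t1.rating
FROM test t1
LEFT JOIN test t2
       ON t1.user_id = t2.user_id
      AND t1.client_id = t2.client_id
      AND t1._id < t2._id
WHERE t2._id IS NULL
  AND t1.client_id = 1
ORDER BY t1."count" DESC
""").fetchall()

# Derived table: compute the highest _id per (client_id, user_id) first,
# then join back to fetch the full row.
max_id = conn.execute("""
SELECT t1._id, t1.name, t1.user_id, t1.rating
FROM test t1
JOIN (
    SELECT MAX(_id) AS _id
    FROM test
    WHERE client_id = 1
    GROUP BY client_id, user_id
) t2 ON t1._id = t2._id
ORDER BY t1."count" DESC
""").fetchall()

print(anti_join)
print(max_id)
```

Both variants rely on _id increasing with each new rating, which holds for the sample data but is an assumption about the real table.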
You may try something like
select _id, name, client_id, user_id, rating, max(count)
from clients
group by client_id
Try it
SELECT * FROM test
where client_id = 1
group by user_id
order by count desc

Advanced MySQL: Find correlations between poll responses

I've got four MySQL tables:
users (id, name)
polls (id, text)
options (id, poll_id, text)
responses (id, poll_id, option_id, user_id)
Given a particular poll and a particular option, I'd like to generate a table that shows which options from other polls are most strongly correlated.
Suppose this is our data set:
TABLE users:
+------+-------+
| id | name |
+------+-------+
| 1 | Abe |
| 2 | Bob |
| 3 | Che |
| 4 | Den |
+------+-------+
TABLE polls:
+------+-----------------------+
| id | text |
+------+-----------------------+
| 1 | Do you like apples? |
| 2 | What is your gender? |
| 3 | What is your height? |
| 4 | Do you like polls? |
+------+-----------------------+
TABLE options:
+------+----------+---------+
| id | poll_id | text |
+------+----------+---------+
| 1 | 1 | Yes |
| 2 | 1 | No |
| 3 | 2 | Male |
| 4 | 2 | Female |
| 5 | 3 | Short |
| 6 | 3 | Tall |
| 7 | 4 | Yes |
| 8 | 4 | No |
+------+----------+---------+
TABLE responses:
+------+----------+------------+----------+
| id | poll_id | option_id | user_id |
+------+----------+------------+----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 3 | 1 | 2 | 3 |
| 4 | 1 | 2 | 4 |
| 5 | 2 | 3 | 1 |
| 6 | 2 | 3 | 2 |
| 7 | 2 | 3 | 3 |
| 8 | 2 | 4 | 4 |
| 9 | 3 | 5 | 1 |
| 10 | 3 | 6 | 2 |
| 11 | 3 | 5 | 3 |
| 12 | 3 | 6 | 4 |
| 13 | 4 | 7 | 1 |
| 14 | 4 | 7 | 2 |
| 15 | 4 | 7 | 3 |
| 16 | 4 | 7 | 4 |
+------+----------+------------+----------+
Given the poll ID 1 and the option ID 2, the generated table should be something like this:
+----------+------------+-----------------------+
| poll_id | option_id | percent_correlated |
+----------+------------+-----------------------+
| 4 | 7 | 100 |
| 2 | 3 | 66.66 |
| 3 | 6 | 66.66 |
| 2 | 4 | 33.33 |
| 3 | 5 | 33.33 |
| 4 | 8 | 0 |
+----------+------------+-----------------------+
So basically, we're identifying all of the users who responded to poll ID 1 and selected option ID 2, and we're looking through all the other polls to see what percentage of them also selected each other option.
Don't have an instance handy to test, can you see if this gets proper results:
select
poll_id,
option_id,
((psum - (sum1 * sum2 / n)) / sqrt((sum1sq - pow(sum1, 2.0) / n) * (sum2sq - pow(sum2, 2.0) / n))) AS r,
n
from
(
select
poll_id,
option_id,
SUM(score) AS sum1,
SUM(score_rev) AS sum2,
SUM(score * score) AS sum1sq,
SUM(score_rev * score_rev) AS sum2sq,
SUM(score * score_rev) AS psum,
COUNT(*) AS n
from
(
select
responses.poll_id,
responses.option_id,
CASE
WHEN user_resp.user_id IS NULL THEN 0
ELSE 1
END as score,
CASE
WHEN user_resp.user_id IS NULL THEN 1
ELSE 0
END as score_rev
from responses left outer join
(
select
user_id
from
responses
where
poll_id = 1 and
option_id = 2
)user_resp
ON (user_resp.user_id = responses.user_id)
) temp1
group by
poll_id,
option_id
)components
After a few hours of trial and error, I managed to put together a query that works correctly:
SELECT poll_id AS p_id,
option_id AS o_id,
COUNT(*) AS optCount,
(SELECT COUNT(*) FROM response WHERE option_id = o_id AND user_id IN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2')) /
(SELECT COUNT(*) FROM response WHERE poll_id = p_id AND user_id IN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2'))
AS percentage
FROM response
INNER JOIN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2') AS user_ids
ON response.user_id = user_ids.user_id
WHERE poll_id != '1'
GROUP BY option_id DESC
ORDER BY percentage DESC, optCount DESC
Based on tests with a small data set, this query looks to be reasonably fast, but I'd like to modify it so the "IN" subquery is not repeated three times. Any suggestions?
This seems to give the right results for me:
select poll_stats.poll_id,
option_stats.option_id,
(100 * option_responses / poll_responses) as percent_correlated
from (select response.poll_id,
count(*) as poll_responses
from response selecting_response
join response on response.user_id = selecting_response.user_id
where selecting_response.poll_id = 1 and selecting_response.option_id = 2
group by response.poll_id) poll_stats
join (select options.poll_id,
options.id as option_id,
count(response.id) as option_responses
from options
left join response on response.poll_id = options.poll_id
and response.option_id = options.id
and exists (
select 1 from response selecting_response
where selecting_response.user_id = response.user_id
and selecting_response.poll_id = 1
and selecting_response.option_id = 2)
group by options.poll_id, options.id
) as option_stats
on option_stats.poll_id = poll_stats.poll_id
where poll_stats.poll_id <> 1
order by 3 desc, option_responses desc
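Another way to avoid repeating the selecting-users subquery is to name it once in a common table expression. The sketch below is a variant of the query above rather than the same text: it assumes a database with CTE support (MySQL 8.0 or later; SQLite is used here via Python as a stand-in), the table name responses from the question's schema, and at most one response per user per poll. The CTE names selected, poll_stats and option_stats are introduced here for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE options   (id INT, poll_id INT, text TEXT);
CREATE TABLE responses (id INT, poll_id INT, option_id INT, user_id INT);
INSERT INTO options VALUES
 (1,1,'Yes'),(2,1,'No'),(3,2,'Male'),(4,2,'Female'),
 (5,3,'Short'),(6,3,'Tall'),(7,4,'Yes'),(8,4,'No');
INSERT INTO responses VALUES
 (1,1,1,1),(2,1,2,2),(3,1,2,3),(4,1,2,4),
 (5,2,3,1),(6,2,3,2),(7,2,3,3),(8,2,4,4),
 (9,3,5,1),(10,3,6,2),(11,3,5,3),(12,3,6,4),
 (13,4,7,1),(14,4,7,2),(15,4,7,3),(16,4,7,4);
""")

raw = conn.execute("""
WITH selected AS (
    -- users who chose option 2 in poll 1, written only once
    SELECT user_id FROM responses WHERE poll_id = 1 AND option_id = 2
),
poll_stats AS (
    -- how many of those users answered each poll
    SELECT r.poll_id, COUNT(*) AS poll_responses
    FROM responses r
    JOIN selected s ON s.user_id = r.user_id
    GROUP BY r.poll_id
),
option_stats AS (
    -- how many of those users picked each option (0 if none did)
    SELECT o.poll_id, o.id AS option_id, COUNT(s.user_id) AS option_responses
    FROM options o
    LEFT JOIN responses r ON r.poll_id = o.poll_id AND r.option_id = o.id
    LEFT JOIN selected s  ON s.user_id = r.user_id
    GROUP BY o.poll_id, o.id
)
SELECT os.poll_id, os.option_id,
       100.0 * os.option_responses / ps.poll_responses AS percent_correlated
FROM option_stats os
JOIN poll_stats ps ON ps.poll_id = os.poll_id
WHERE os.poll_id <> 1
ORDER BY percent_correlated DESC, os.poll_id, os.option_id
""").fetchall()

# Round in Python for display only.
rows = [(poll_id, option_id, round(pct, 2)) for poll_id, option_id, pct in raw]
for row in rows:
    print(row)
```

The extra poll_id and option_id sort keys only make the order of tied percentages deterministic.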