MySQL nested query - mysql

I have 2 tables, one containing products, one purchases. I'm trying to find the average of the last 3 purchase prices for each product. So in the example below, for the product 'beans' I would want to return the average of the last 3 purchase prices before the product time 1230854764, i.e. the average for customers B, C and D (239).
Products
+-------+------------+
| Name | time |
+-------+------------+
| beans | 1230854764 |
+-------+------------+
Purchases
+----------+------------+-------+
| Customer | time | price |
+----------+------------+-------+
| B | 1230854661 | 207 |
| C | 1230854662 | 444 |
| D | 1230854663 | 66 |
| E | 1230854764 | 88 |
| A | 1230854660 | 155 |
+----------+------------+-------+
I've come up with a nested select query which nearly gets me there i.e. it works if I hard code the time:
SELECT products.name,(SELECT avg(temp.price) FROM (select purchases.price from purchases WHERE purchases.time < 1230854764 order by purchases.time desc limit 3) temp) as av_price
from products products
But if the query references products.time rather than a hard-coded time, as below, I get an error that column products.time does not exist.
SELECT products.name,(SELECT avg(temp.price) FROM (select purchases.price from purchases WHERE purchases.time < products.time order by purchases.time desc limit 3) temp) as av_price from products products
I'm not sure if I'm making a simple mistake with nested queries, or if I'm going about this in totally the wrong way and should be using joins or some other construct. Either way I'm stuck. Any help would be gratefully received.

The only problem in your query is that you haven't mentioned the products table in your inner query.
SELECT products.name,(SELECT avg(temp.price)
FROM (select purchases.price from purchases,products
WHERE purchases.time < products.time order by purchases.time desc limit 3) temp) as
av_price from products products
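For comparison, here is a minimal sketch of a different formulation that avoids referencing the outer products row from inside a derived table, which many MySQL versions do not allow. It is not the query from the answer above: it joins each product to its earlier purchases and keeps a purchase only if fewer than 3 qualifying purchases are newer than it. The sketch runs on SQLite through Python's sqlite3 module purely so it can be executed as-is. The helper names are made up for illustration, and it assumes purchase times are unique and, as in the question, that purchases carry no product column.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, time INTEGER)")
    conn.execute("CREATE TABLE purchases (customer TEXT, time INTEGER, price INTEGER)")
    conn.execute("INSERT INTO products VALUES ('beans', 1230854764)")
    conn.executemany(
        "INSERT INTO purchases VALUES (?, ?, ?)",
        [
            ("B", 1230854661, 207),
            ("C", 1230854662, 444),
            ("D", 1230854663, 66),
            ("E", 1230854764, 88),
            ("A", 1230854660, 155),
        ],
    )
    conn.commit()
    return conn


def average_of_last_three(conn):
    """Return (name, av_price) per product over its 3 latest earlier purchases."""
    query = """
        SELECT p.name, AVG(pu.price) AS av_price
        FROM products p
        JOIN purchases pu ON pu.time < p.time
        -- keep a purchase only if fewer than 3 qualifying purchases are newer
        WHERE (SELECT COUNT(*)
               FROM purchases newer
               WHERE newer.time < p.time
                 AND newer.time > pu.time) < 3
        GROUP BY p.name
    """
    return conn.execute(query).fetchall()


print(average_of_last_three(build_demo_db()))
```

With the sample data this averages B, C and D (E is excluded because its time is not strictly earlier than the product time). A product with no earlier purchases drops out of the result because of the inner join; a LEFT JOIN would be needed to keep it.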

Related

MySQL GROUP_CONCAT with SUM() and multiple JOINs inside subquery

I'm very average with MySQL, but usually I can write all the needed queries after reading documentation and searching for examples. Now, I'm in the situation where I spent 3 days re-searching and re-writing queries, but I can't get it to work the exact way I need. Here's the deal:
1st table (mpt_companies) contains companies:
| company_id | company_title |
------------------------------
| 1 | Company A |
| 2 | Company B |
2nd table (mpt_payment_methods) contains payment methods:
| payment_method_id | payment_method_title |
--------------------------------------------
| 1 | Cash |
| 2 | PayPal |
| 3 | Wire |
3rd table (mpt_payments) contains payments for each company:
| payment_id | company_id | payment_method_id | payment_amount |
----------------------------------------------------------------
| 1 | 1 | 1 | 10.00 |
| 2 | 2 | 3 | 15.00 |
| 3 | 1 | 1 | 20.00 |
| 4 | 1 | 2 | 10.00 |
I need to list each company along with many stats. One of stats is the sum of payments in each payment method. In other words, the result should be:
| company_id | company_title | payment_data |
--------------------------------------------------------
| 1 | Company A | Cash:30.00,PayPal:10.00 |
| 2 | Company B | Wire:15.00 |
Obviously, I need to:
Select all the companies;
Join payments for each company;
Join payment methods for each payment;
Calculate sum of payments in each method;
GROUP_CONCAT payment methods and sums;
Unfortunately, SUM() doesn't work inside GROUP_CONCAT. Some solutions I found on this site suggest using CONCAT, but that doesn't produce the list I need. Other solutions suggest using CAST(), but maybe I'm doing something wrong, because that doesn't work either. This is the closest query I wrote. It returns each company and the unique list of payment methods used by each company, but it doesn't return the sum of payments:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
Then I tried:
SELECT *,
(some other sub-queries I need...),
(SELECT GROUP_CONCAT(DISTINCT(mpt_payment_methods.payment_method_title), ':', CAST(SUM(mpt_payments.payment_amount) AS CHAR))
FROM mpt_payments
JOIN mpt_payment_methods
ON mpt_payments.payment_method_id=mpt_payment_methods.payment_method_id
WHERE mpt_payments.company_id=mpt_companies.company_id
ORDER BY mpt_payment_methods.payment_method_title) AS payment_data
FROM mpt_companies
...and many other variations, but all of them either returned query errors or didn't return/format the data I need.
The closest answer I could find was MySQL one to many relationship: GROUP_CONCAT or JOIN or both? but after spending 2 hours re-writing the provided query to work with my data, I couldn't do it.
Could anyone give me a suggestion, please?
You can do that by aggregating twice. First for the sum of payments per method and company and then to concatenate the sums for each company.
SELECT x.company_id,
x.company_title,
group_concat(payment_amount_and_method) payment_data
FROM (SELECT c.company_id,
c.company_title,
concat(pm.payment_method_title, ':', sum(p.payment_amount)) payment_amount_and_method
FROM mpt_companies c
INNER JOIN mpt_payments p
ON p.company_id = c.company_id
INNER JOIN mpt_payment_methods pm
ON pm.payment_method_id = p.payment_method_id
GROUP BY c.company_id,
c.company_title,
pm.payment_method_id,
pm.payment_method_title) x
GROUP BY x.company_id,
x.company_title;
db<>fiddle
Here you go
SELECT company_id,
company_title,
GROUP_CONCAT(
CONCAT(payment_method_title, ':', payment_amount)
) AS payment_data
FROM (
SELECT c.company_id, c.company_title, pm.payment_method_id, pm.payment_method_title, SUM(p.payment_amount) AS payment_amount
FROM mpt_payments p
JOIN mpt_companies c ON p.company_id = c.company_id
JOIN mpt_payment_methods pm ON pm.payment_method_id = p.payment_method_id
GROUP BY p.company_id, p.payment_method_id
) distinct_company_payments
GROUP BY distinct_company_payments.company_id
;
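To make the "aggregate twice" idea easy to try, here is a small sketch against an in-memory SQLite database with the sample rows from the question. It is an adaptation, not the MySQL from the answers: SQLite has no CONCAT(), so the sketch uses the || operator and printf() for the two-decimal formatting, and the function names and the per_method alias are invented for the example.

```python
import sqlite3


def build_payments_db():
    """Create the three tables from the question with their sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mpt_companies (company_id INTEGER PRIMARY KEY, company_title TEXT);
        CREATE TABLE mpt_payment_methods (payment_method_id INTEGER PRIMARY KEY,
                                          payment_method_title TEXT);
        CREATE TABLE mpt_payments (payment_id INTEGER PRIMARY KEY, company_id INTEGER,
                                   payment_method_id INTEGER, payment_amount REAL);
        INSERT INTO mpt_companies VALUES (1, 'Company A'), (2, 'Company B');
        INSERT INTO mpt_payment_methods VALUES (1, 'Cash'), (2, 'PayPal'), (3, 'Wire');
        INSERT INTO mpt_payments VALUES (1, 1, 1, 10.00), (2, 2, 3, 15.00),
                                        (3, 1, 1, 20.00), (4, 1, 2, 10.00);
    """)
    return conn


def payment_summary(conn):
    """Inner query sums per company and method; outer query concatenates per company."""
    query = """
        SELECT per_method.company_id,
               per_method.company_title,
               GROUP_CONCAT(per_method.method_total) AS payment_data
        FROM (SELECT c.company_id,
                     c.company_title,
                     pm.payment_method_title || ':' ||
                         printf('%.2f', SUM(p.payment_amount)) AS method_total
              FROM mpt_companies c
              JOIN mpt_payments p ON p.company_id = c.company_id
              JOIN mpt_payment_methods pm
                ON pm.payment_method_id = p.payment_method_id
              GROUP BY c.company_id, c.company_title,
                       pm.payment_method_id, pm.payment_method_title) per_method
        GROUP BY per_method.company_id, per_method.company_title
        ORDER BY per_method.company_id
    """
    return conn.execute(query).fetchall()


for row in payment_summary(build_payments_db()):
    print(row)
```

The order of the items inside each concatenated string is not guaranteed here. In MySQL, GROUP_CONCAT accepts its own ORDER BY clause if a fixed order matters.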

select query performs wrong results on parallel execution

Following is my scenario
I have tables named
Products
id | name | count | Price
-------------------------
1 | meat | 1 | 10
Users
id | name | balance
-----------------
1 | Tim | 10
2 | Joe | 10
Work flow
select products if count >= 1,
reduce user's balance and count = count - 1
if no_balance or count < 1 throw error
Let's say both users place an order for 1 product at exactly the same time. The count in the products table is then updated to -1, which means the query executes for both users.
Products
id | name | count | Price
-------------------------
1 | meat | -1 | 10
While placing an order, I use the query below to select matching products:
Select * from products where count >= 1 and price >= 10
Also, if users place orders with even a small time difference, the expected output is produced.
Is there any solution to this?
You should consider locking each row, for example:
Select * from products where count >= 1 and price >= 10 FOR UPDATE
But in your scenario, I advise you to use Redis for that.
How to design a second kill system for online shop
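Besides row locks, another common way to keep the count from going negative is to put the check into the UPDATE itself, so that checking and decrementing happen in a single statement. The sketch below shows that guarded-update idea. It is a different technique from the FOR UPDATE suggestion above, and it uses SQLite via Python's sqlite3 module only so that it can be run as-is. The function names are hypothetical, and a single connection only demonstrates the logic, not real concurrency.

```python
import sqlite3


def build_shop_db():
    """Create the products and users tables from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT,"
                 " count INTEGER, price INTEGER)")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, balance INTEGER)")
    conn.execute("INSERT INTO products VALUES (1, 'meat', 1, 10)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)",
                     [(1, "Tim", 10), (2, "Joe", 10)])
    conn.commit()
    return conn


def place_order(conn, user_id, product_id):
    """Buy one unit; return True on success, False if stock or balance is short."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            row = conn.execute("SELECT price FROM products WHERE id = ?",
                               (product_id,)).fetchone()
            if row is None:
                raise LookupError("unknown product")
            price = row[0]
            # The WHERE clause is the stock check: no row matches when count < 1.
            stock = conn.execute(
                "UPDATE products SET count = count - 1 WHERE id = ? AND count >= 1",
                (product_id,))
            if stock.rowcount == 0:
                raise LookupError("out of stock")
            # Same idea for the balance: only pay if enough money is left.
            paid = conn.execute(
                "UPDATE users SET balance = balance - ? WHERE id = ? AND balance >= ?",
                (price, user_id, price))
            if paid.rowcount == 0:
                raise LookupError("insufficient balance")
        return True
    except LookupError:
        return False


shop = build_shop_db()
print(place_order(shop, 1, 1), place_order(shop, 2, 1))
```

The second order is rejected because its guarded UPDATE matches no row, so the count stays at 0 instead of dropping to -1. The same two UPDATE statements should carry over to MySQL, where the affected-row count plays the role of rowcount.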

Combining MySQL queries

This SQL tells me the maximum value in the last hour and when it occurred, and it is easily modified to show the same for the minimum.
SELECT
mt.mB as Hr_mB_Max,
mt.UTC as Hr_mB_Max_when
FROM
thundersense mt
WHERE
mt.mB =(
SELECT
MAX(mB)
FROM
thundersense mt2
WHERE
mt2.UTC >(UNIX_TIMESTAMP() -3600))
ORDER BY
utc
DESC
LIMIT 1
How do I modify it so it returns both maximum & minimum and their respective times?
Yours Simon M.
Based on my understanding of your question, you are looking to create a 4 column and 1 row answer where it looks like:
+-------+-----------------+----------+-----------------+
| event | time_it_occured | event | time_it_occured |
+-------+-----------------+----------+-----------------+
| fun | 90000 | homework | 12000 |
+-------+-----------------+----------+-----------------+
Below is a similar situation/queries you can adapt for your situation.
So, given a table called 'people' that looks like:
+----+------+--------+
| ID | name | salary |
+----+------+--------+
| 1 | bob | 40000 |
| 2 | cat | 12000 |
| 3 | dude | 50000 |
+----+------+--------+
You can use this query:
SELECT * FROM
(SELECT name, salary FROM people WHERE salary = (SELECT MAX(salary) FROM people)) t JOIN
(SELECT name, salary FROM people WHERE salary = (SELECT MIN(salary) FROM people)) a;
to generate:
+------+--------+------+--------+
| name | salary | name | salary |
+------+--------+------+--------+
| dude | 50000 | cat | 12000 |
+------+--------+------+--------+
Some things to note:
You can change the WHERE clauses to the ones you mentioned in your question (for MAX and MIN).
Please be careful with the above query: I am using a cartesian join (a cross join in MySQL) in order to get the 4 columns. To be honest, it doesn't make much sense to me to get the data back in one row, but you said that's what you're looking for.
Here is what I would work with instead, getting two tuples/rows back:
+----------+--------+
| name | salary |
+----------+--------+
| dude | 50000 |
| cat | 12000 |
+----------+--------+
And to generate this, you would use:
(SELECT name, salary FROM people WHERE salary = (SELECT MAX(salary) FROM people))
UNION
(SELECT name, salary FROM people WHERE salary = (SELECT MIN(salary) FROM people));
Also: A JOIN without a ON clause is just a CROSS JOIN.
How to use mysql JOIN without ON condition?
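Here is a small runnable sketch of the two shapes discussed in this answer, using the three-row people table (bob, cat, dude). It runs on SQLite through Python's sqlite3 module rather than MySQL, so the UNION is written without parentheses around each SELECT, which SQLite does not accept. The function names and the hi/lo aliases are made up for the example.

```python
import sqlite3


def build_people_db():
    """Create the sample people table from the answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)",
                     [(1, "bob", 40000), (2, "cat", 12000), (3, "dude", 50000)])
    conn.commit()
    return conn


def extremes_one_row(conn):
    """Cross join the max row with the min row: one row, four columns."""
    query = """
        SELECT hi.name, hi.salary, lo.name, lo.salary
        FROM (SELECT name, salary FROM people
              WHERE salary = (SELECT MAX(salary) FROM people)) hi
        CROSS JOIN
             (SELECT name, salary FROM people
              WHERE salary = (SELECT MIN(salary) FROM people)) lo
    """
    return conn.execute(query).fetchall()


def extremes_two_rows(conn):
    """UNION the max row and the min row: two rows, two columns."""
    query = """
        SELECT name, salary FROM people
        WHERE salary = (SELECT MAX(salary) FROM people)
        UNION
        SELECT name, salary FROM people
        WHERE salary = (SELECT MIN(salary) FROM people)
        ORDER BY salary DESC
    """
    return conn.execute(query).fetchall()


people = build_people_db()
print(extremes_one_row(people))
print(extremes_two_rows(people))
```

If several people share the highest or lowest salary, the cross join returns one row per combination, which is one more reason to be careful with the one-row form.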
One method uses a join:
SELECT mt.mB as Hr_mB_Max, mt.UTC as Hr_mB_Max_when
FROM thundersense mt JOIN
(SELECT MAX(mB) as max_mb, MIN(mb) as min_mb
FROM thundersense mt2
WHERE mt2.UTC >(UNIX_TIMESTAMP() - 3600)
) mm
ON mt.mB IN (mm.max_mb, mm.min_mb)
ORDER BY utc DESC;
My only concern is your limit 1. Presumably, the mBs should be unique. If not, there is a bit of a challenge. One possibility would be to use an auto-incremented id rather than mB.
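If a single row holding the maximum, the minimum and both timestamps is what's wanted, one possible sketch is to pick each extreme with its own ORDER BY ... LIMIT 1 subquery and cross join the two one-row results. This is an alternative to the join above, not a rewrite of it. It runs on SQLite via Python's sqlite3 module, takes the current time as a parameter instead of calling UNIX_TIMESTAMP() so the result is reproducible, and breaks ties on mB by taking the most recent reading. The function names and sample readings are invented.

```python
import sqlite3


def build_readings_db():
    """Create a thundersense table with a few made-up readings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE thundersense (UTC INTEGER, mB REAL)")
    conn.executemany("INSERT INTO thundersense VALUES (?, ?)", [
        (5000, 1050.0),   # older than one hour, must be ignored
        (7000, 1010.5),
        (8000, 1015.2),
        (9000, 1008.1),
        (9500, 1015.2),   # ties with the 8000 reading; the later one should win
    ])
    conn.commit()
    return conn


def hourly_extremes(conn, now):
    """Return (max_mB, max_when, min_mB, min_when) for the hour before `now`."""
    query = """
        SELECT hi.mB, hi.UTC, lo.mB, lo.UTC
        FROM (SELECT mB, UTC FROM thundersense
              WHERE UTC > :since
              ORDER BY mB DESC, UTC DESC LIMIT 1) hi
        CROSS JOIN
             (SELECT mB, UTC FROM thundersense
              WHERE UTC > :since
              ORDER BY mB ASC, UTC DESC LIMIT 1) lo
    """
    return conn.execute(query, {"since": now - 3600}).fetchone()


print(hourly_extremes(build_readings_db(), now=10000))
```

Because both subqueries apply the one-hour filter themselves, an older row that happens to have the same mB as the current extreme cannot slip into the result. If there are no readings in the window, the function returns None.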

MySQL - count multiple tables by day

I have various tables that track stuff being created within the system, sales, customer accounts, etc, and they all have created times on them. I can summarize any one of these on a per day basis with the following query:
select date(created_time), count(*) from customers group by date(created_time)
Which produces output like:
+--------------------+----------+
| date(created_time) | count(*) |
+--------------------+----------+
| 2012-10-12 | 15 |
| 2012-10-13 | 4 |
That gets the job done although it does skip over days when nothing happened.
However what I'd like to do is generate the same thing for multiple tables at once, producing something like:
+--------------------+--------------+------------------+
| date(created_time) | count(sales) | count(customers) |
+--------------------+--------------+------------------+
| 2012-10-12 | 15 | 1 |
| 2012-10-13 | 4 | 3 |
I could run the query separately for each table and join them by hand, but the skipping 0 days makes that join difficult.
Is there a way I can do this in a single mysql query?
Try this:
SELECT created_time, SUM(customers), SUM(sales)
FROM (SELECT DATE(created_time) created_time, COUNT(*) customers, 0 sales
FROM customers
GROUP BY DATE(created_time) -- group by the day, not the raw timestamp column
UNION ALL -- keep every row; plain UNION would drop identical rows
SELECT DATE(created_time) created_time, 0 customers, COUNT(*) sales
FROM sales
GROUP BY DATE(created_time)
) as A
GROUP BY created_time;
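Here is a runnable sketch of the same per-day idea, assuming two tables named sales and customers that each have a created_time column. It uses SQLite through Python's sqlite3 module, names the date column day to avoid any clash with the underlying created_time column, and combines the two per-table counts with UNION ALL. The sample rows and function names are invented.

```python
import sqlite3


def build_activity_db():
    """Create sales and customers tables with a few made-up timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, created_time TEXT)")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, created_time TEXT)")
    conn.executemany("INSERT INTO sales (created_time) VALUES (?)", [
        ("2012-10-12 09:00:00",), ("2012-10-12 17:30:00",), ("2012-10-14 08:00:00",)])
    conn.executemany("INSERT INTO customers (created_time) VALUES (?)", [
        ("2012-10-12 10:00:00",), ("2012-10-13 11:00:00",)])
    conn.commit()
    return conn


def daily_counts(conn):
    """Return (day, sales, customers) for every day with activity in either table."""
    query = """
        SELECT day, SUM(sales) AS sales, SUM(customers) AS customers
        FROM (SELECT DATE(created_time) AS day, COUNT(*) AS sales, 0 AS customers
              FROM sales
              GROUP BY DATE(created_time)
              UNION ALL
              SELECT DATE(created_time) AS day, 0 AS sales, COUNT(*) AS customers
              FROM customers
              GROUP BY DATE(created_time)) AS per_table
        GROUP BY day
        ORDER BY day
    """
    return conn.execute(query).fetchall()


for row in daily_counts(build_activity_db()):
    print(row)
```

A day shows up as soon as either table has a row for it, with 0 for the other table. Days on which nothing happened in any table are still missing; filling those in needs a separate calendar of dates to join against.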

Score algorithm in multiple join

I have a list of publications stored in publications table. Each publication has a many-to-many relation with categories and also a many-to-many relation with keywords.
Given a publication I'd like to find related ones based on a score value computed with the following algorithm:
each shared category with other publications counts as one point
each shared keyword with other publications counts as one point
the score value is the sum of the points computed with previous steps
I want to retrieve with a single query the list of related publications ordered by this score.
Now I have these two queries which compute the score for both categories and keyword
SELECT c.publication_id, (COUNT(c.category_id)) AS cscore
FROM cat_pub c
WHERE c.category_id IN <list of category ids obtained from the current publication>
GROUP BY c.publication_id
ORDER BY cscore DESC
and for the keyword score
SELECT k.publication_id, (COUNT(k.keyword_id)) AS kscore
FROM key_pub k
WHERE k.keyword_id IN <list of keyword ids obtained from the current publication>
GROUP BY k.publication_id
ORDER BY kscore DESC
Finally I need to JOIN the resulting query with a SELECT query which should retrieve publications data (title, intro, etc,) ordering them by score and with a limit clause to get the most relevant publications related to the selected one.
Currently I tried to use these two queries as subtables in a join:
SELECT p.*, (q1.cscore + q2.kscore) AS score
FROM publications p
INNER JOIN (<cscore query>) q1 ON p.id = q1.publication_id
INNER JOIN (<kscore query>) q2 ON p.id = q2.publication_id
ORDER BY score DESC
LIMIT 5
EXPLAIN shows me that a couple of temporary tables will be used. Could that be a performance problem? Is there a better way to implement this?
Update
To answer to Johan's comment
Your solution is wrong. Using a LIMIT clause in the subqueries can lead to inconsistent results, whatever value is chosen for the limit. What if I have the following results for the subqueries? (I'll show 12 records, but your query will fetch only the first ten.)
+-------+--------+ +-------+--------+
| p.id | cscore | | p.id | kscore |
+-------+--------+ +-------+--------+
| 27854 | 100 | | 27865 | 100 |
| 27853 | 100 | | 27864 | 100 |
| 27852 | 100 | | 27863 | 100 |
| 27851 | 100 | | 27862 | 100 |
| 27850 | 100 | | 27861 | 100 |
| 27849 | 100 | | 27860 | 100 |
| 27848 | 100 | | 27859 | 100 |
| 27847 | 100 | | 27858 | 100 |
| 27846 | 100 | | 27857 | 100 |
| 27845 | 100 | | 27856 | 100 |
| 27844 | 100 | | 27855 | 100 |
| 1000 | 99 | | 1000 | 99 |
+-------+--------+ +-------+--------+
If I have ten records with 100 as cscore and ten different records with 100 as kscore, the join will produce an empty set. So I'm not getting any result, while the publication with id 1000 should be the answer, and it's left out of the result set.
Furthermore I could consider your solution with a LEFT JOIN, in this case only records from the left table will be fetched, and each record will get a total score of 100 (because of the NULL given by the empty kscore field in the second table). Again, the result is wrong because the highest scored record should be p1000 with a total score of 198 (= 99 + 99)
Your solution cannot produce reliable results.
You only want 5 results each from the subqueries.
I think it is best to only select 5 from them and use that in the query.
Rewrite q1 as:
SELECT c.publication_id, COUNT(*) AS cscore
FROM cat_pub c
WHERE c.publication_id = p.id
AND c.category_id IN <list of category ids obtained from the current publication>
GROUP BY c.publication_id
ORDER BY cscore DESC
LIMIT 10
Rewrite q2 as:
SELECT k.publication_id, COUNT(*) AS kscore
FROM key_pub k
WHERE p.id = k.publication_id
AND k.keyword_id IN <list of keyword ids obtained from the current publication>
GROUP BY k.publication_id
ORDER BY kscore DESC
LIMIT 10
Leave the join as is:
SELECT p.*, (q1.cscore + q2.kscore) AS score
FROM publications p
INNER JOIN (<cscore query>) q1 ON p.id = q1.publication_id
INNER JOIN (<kscore query>) q2 ON p.id = q2.publication_id
ORDER BY score DESC
LIMIT 5
Note that count(*) is usually the faster choice, because it does not test for NULL. If you can have NULL values and don't want to include them in the count, then name the field explicitly with count(field).
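For reference, here is a sketch of a different way to compute the combined score, with no LIMIT inside the subqueries: emit one row per shared category and one row per shared keyword, stack them with UNION ALL, and count the rows per publication. This is not the approach from the answer above, just one possible alternative. It runs on SQLite via Python's sqlite3 module. The column names follow the question (cat_pub.category_id, key_pub.keyword_id), the publications table is reduced to id and title, the sample data and function names are invented, and each (publication, category) or (publication, keyword) pair is assumed to appear only once.

```python
import sqlite3

RELATED_SQL = """
    SELECT p.id, p.title, COUNT(*) AS score
    FROM (SELECT c2.publication_id          -- one row per shared category
          FROM cat_pub c1
          JOIN cat_pub c2 ON c2.category_id = c1.category_id
          WHERE c1.publication_id = :pid AND c2.publication_id <> :pid
          UNION ALL
          SELECT k2.publication_id          -- one row per shared keyword
          FROM key_pub k1
          JOIN key_pub k2 ON k2.keyword_id = k1.keyword_id
          WHERE k1.publication_id = :pid AND k2.publication_id <> :pid) points
    JOIN publications p ON p.id = points.publication_id
    GROUP BY p.id, p.title
    ORDER BY score DESC, p.id
    LIMIT :limit
"""


def build_publications_db():
    """Create a tiny data set: publication 1 is the 'current' one."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE publications (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("CREATE TABLE cat_pub (publication_id INTEGER, category_id INTEGER)")
    conn.execute("CREATE TABLE key_pub (publication_id INTEGER, keyword_id INTEGER)")
    conn.executemany("INSERT INTO publications VALUES (?, ?)",
                     [(i, "Pub %d" % i) for i in range(1, 6)])
    conn.executemany("INSERT INTO cat_pub VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (5, 3)])
    conn.executemany("INSERT INTO key_pub VALUES (?, ?)",
                     [(1, 10), (1, 11), (2, 10), (4, 10), (4, 11)])
    conn.commit()
    return conn


def related_publications(conn, publication_id, limit=5):
    """Return (id, title, score) for the publications most related to the given one."""
    params = {"pid": publication_id, "limit": limit}
    return conn.execute(RELATED_SQL, params).fetchall()


for row in related_publications(build_publications_db(), 1):
    print(row)
```

A publication that shares only categories or only keywords still gets a score, so nothing is lost the way it can be with an INNER JOIN of two separately limited subqueries. Publications with no shared category or keyword simply don't appear.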