MySQL: getting first x records of group by

I have a table with products from shops. These products have a valid_from and valid_to date. In my query, I want to have only the first x records of each shop which are currently valid, ordered by last insert desc.
I am currently using the following query:
SELECT * FROM
(
SELECT
s.id as shopid, s.name as shopname, p.name as productname, p.validfrom, p.validto
FROM
product p
JOIN
shop s ON p.shopid = s.id
WHERE
s.status = 'Active' AND
(now() BETWEEN p.validfrom and p.validto)
ORDER BY p.insert DESC
) as a
GROUP BY
shopid
ORDER BY
shopname asc
This obviously only gives me the latest record of each shop, but I want to have latest 2 or 3 records. How can I achieve this?
Bonus question: I want to differ per shop. So for shop A I'd like to have only the first record and for shop B the first two records. You may assume I have some database field for each shop that holds this number, like s.num_of_records.
The similar issue (possible duplicate) got me in the right direction, but it does not completely solve my problem (see latest comment).

It works by giving each record a rank based on the previous shopid. If the shopid is the same, I rank it +1. Otherwise I rank it 1.
There still is a problem however. Even though I'm using order by shopid, it is not ranking correctly for all the shops. I added the lastshopid in the query to check and although the records are ordered by shop in the result, the lastshopid sometimes has another id. I think it must be because the ordering is done at the end instead of the beginning.
Does anyone have an idea how to solve this? I need the shops in the right order to get this rank solution working.

You can use an additional LEFT JOIN with the product table to count the number of products from the same shop that have been inserted later. That number can be compared with your num_of_records column.
SELECT s.id as shopid, s.name as shopname,
p.name as productname, p.validfrom, p.validto
FROM shop s
JOIN product p
ON p.shopid = s.id
LEFT JOIN product p1
ON p1.shopid = p.shopid
AND p1.validfrom <= NOW()
AND p1.validto >= NOW()
AND p1.`insert` > p.`insert`
WHERE s.status = 'Active'
AND p.validfrom <= NOW()
AND p.validto >= NOW()
GROUP BY p.id
HAVING COUNT(p1.id) + 1 <= s.num_of_records
ORDER BY shopname asc
Indexes that might help: shop(status, name), product(shopid, validfrom) or product(shopid, validto) (probably the second one).
Note 1: If two products were inserted at the same time (same second) for the same shop and both are candidates for the last slot in that shop's limited list, both will be selected. That will not happen if you use the AUTO_INCREMENT column instead of the insert column.
Note 2: Depending on the group size (number of products per shop) this query can be slow.
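On MySQL 8.0 or later, the same per-shop limit can also be written with a window function instead of the self-join. This is only a sketch under assumptions: it reuses the table and column names from the question (including s.num_of_records), and it assumes product has an id column that can serve as a tie-breaker when two rows share the same insert timestamp.

```sql
-- Sketch for MySQL 8.0+; column names are taken from the question,
-- p.id as a tie-breaker is an assumption.
SELECT shopid, shopname, productname, validfrom, validto
FROM (
    SELECT s.id   AS shopid,
           s.name AS shopname,
           p.name AS productname,
           p.validfrom,
           p.validto,
           s.num_of_records,
           -- Number the valid products of each shop, newest first.
           ROW_NUMBER() OVER (PARTITION BY s.id
                              ORDER BY p.`insert` DESC, p.id DESC) AS rn
    FROM shop s
    JOIN product p ON p.shopid = s.id
    WHERE s.status = 'Active'
      AND NOW() BETWEEN p.validfrom AND p.validto
) AS ranked
WHERE rn <= num_of_records   -- per-shop limit from the shop row
ORDER BY shopname ASC, rn ASC;
```

Because ROW_NUMBER() assigns distinct numbers even to tied rows, this version never returns more than num_of_records rows per shop.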

Related

Need the list of products and orders

This is my homework task:
Products which were ordered along with the 5 most ordered products more than once and the count of orders they were included in. (Do not include the 5 most ordered products in the final result)
Products and orders are in the same table. Order details contain the order detail ID, order ID, product ID, and quantity.
I've tried everything but I'm struggling with "along with" statement in the query.
Here is a query I have tried:
select
productid,
count
(
(select productid from orderdetails)
and
(select productid from orderdetails order by quantity desc limit 5)
) as ORDERS
from orderdetails
group by productid
order by ORDERS desc
You select from orderdetails, aggregate to get one result row per product, and count. It is very common to count rows with COUNT(*), but you can also count expressions, e.g. COUNT(mycolumn), where you count only the rows in which the expression is not null. You are counting an expression (because you are using COUNT(something else) rather than COUNT(*)). The expression to test for null and count is
(select productid from orderdetails)
and
(select productid from orderdetails order by quantity desc limit 5)
This, however, is not an expression that yields a single value to be counted (when it is not null) or skipped (when it is null). You are selecting all product IDs from the orderdetails table, and you are selecting the five product IDs from the orderdetails rows with the highest quantity. Then you apply AND as if these were two booleans, but they are not; they are data sets. Apart from the inappropriate use of AND, which is an operator on booleans and not on data sets, you are missing the point that you should be looking for products in the same order, i.e. comparing the order number somehow.
So all in all: This is completely wrong. Sorry to say that. However, the task is not at all easy in my opinion and in order to solve it, you should go slowly, step by step, to build your query.
Products which were ordered along with the 5 most ordered products more than once
Dammit; such a short sentence, but that is deceiving ;-) There is a lot to do for us...
First we must find the 5 products that got ordered most. That means sum up all sales and find the five top ones:
select productid
from orderdetails
group by productid
order by sum(quantity) desc
limit 5
(The problem with this: What if six products got ordered most, e.g. products A, B, and C with a quantity of 200 and products D, E, and F with a quantity of 100? We would get the top three plus two of the top 4 to 6. In standard SQL we would solve this with a ties clause, but MySQL's LIMIT doesn't feature this.)
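One possible way to include ties on MySQL 8.0 or later is to rank the products by total quantity and keep every product whose rank is 5 or better. This is a sketch assuming the orderdetails columns named in the question; the alias names are made up for illustration.

```sql
-- Sketch for MySQL 8.0+: top 5 products by total quantity, ties included.
select productid
from (
    select productid,
           -- RANK() gives tied products the same rank and then skips ahead,
           -- so A, B, C at 200 get rank 1 and D, E, F at 100 get rank 4.
           rank() over (order by sum(quantity) desc) as rnk
    from orderdetails
    group by productid
) as ranked_products
where rnk <= 5;
```

With the example above this returns all six products A to F, because D, E and F share rank 4.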
Anyway. Now we are looking for products that got ordered along with these five products. Does this mean with all five at once? Probably not. We are rather looking for products that were in the same order with at least one of the top five.
with top_5_products as
(query above)
, orders_with_top_5 as
(select orderid
from orderdetails
where productid in (select productid from top_5_products)
)
, other_products_in_order as
(select productid, orderid
from orderdetails
where orderid in (select orderid from orders_with_top_5)
and productid not in (select productid from top_5_products)
)
And once we've got there, we must still find the products that got ordered with some of the top 5 "more than once", which I interpret as appearing in at least two orders containing top 5 products.
with <all the above>
select productid
from other_products_in_order
group by productid
having count(*) > 1;
And while we have counted how many orders the products share with top 5 products, we are still not there, because we are supposed to show the number of orders the products were included in, which I suppose refers to all orders, not only those containing top 5 products. That is another count, that we can get in the select clause for instance. The query then becomes:
with <all the above>
select
productid,
(select count(*) from orderdetails od where od.productid = opio.productid)
from other_products_in_order opio
group by productid
having count(*) > 1;
That's quite a lot for homework seeing that you are struggling with the syntax still. And we haven't even addressed that top-5-or-more ties problem yet (for which analytic functions come in handy).
The WITH clause has been available since MySQL 8 and helps keep a query that is built up step by step readable. Older MySQL versions don't support it. If you are working with an old version, I suggest you upgrade :-) Otherwise you can use subqueries directly instead.
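Put together, the steps above might look like the following single statement. Treat it as a sketch for MySQL 8.0 or later, not a checked solution: the column names come from the question, the DISTINCT keywords and the order_count alias are my additions (they assume a product listed on several lines of one order should count once), and the ties problem is still left open.

```sql
with top_5_products as (
    -- Step 1: the five products with the highest total quantity.
    select productid
    from orderdetails
    group by productid
    order by sum(quantity) desc
    limit 5
),
orders_with_top_5 as (
    -- Step 2: orders containing at least one top 5 product.
    select distinct orderid
    from orderdetails
    where productid in (select productid from top_5_products)
),
other_products_in_order as (
    -- Step 3: the other products found in those orders.
    select distinct productid, orderid
    from orderdetails
    where orderid in (select orderid from orders_with_top_5)
      and productid not in (select productid from top_5_products)
)
select opio.productid,
       -- Step 5: number of all orders the product appears in.
       (select count(distinct od.orderid)
        from orderdetails od
        where od.productid = opio.productid) as order_count
from other_products_in_order opio
group by opio.productid
having count(*) > 1;   -- Step 4: shared at least two orders with a top 5 product
```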

MYSQL select all item numbers where the timestamp is highest and quantity sold is highest

SELECT *
FROM table
INNER JOIN
(SELECT itemno, MAX(last_updated) as TopDate
FROM table
WHERE userID = 'user'
GROUP BY itemno) AS EachItem ON
EachItem.TopDate = table.last_updated
AND EachItem.itemno = table.itemno
I have taken the solution above from a previous post and modified it to work with one of the functions that I have created but I now want to use this same query but adapt it to order the result by max(last_updated) (which is a timestamp in my table) and also max(qty_sold).
Basically I have multiple duplicates of itemnos in the table but only want to return the rows with the latest date and highest qty_sold for every row where a certain user ID is specified.
Many thanks in advance, I have spent hours searching and can't figure this out as I am fairly new to mysql.
Solved my own question after more trying by adding ORDER BY qty_sold DESC to the end.
SELECT *
FROM table
INNER JOIN
(SELECT itemno, MAX(last_updated) as TopDate
FROM table
WHERE userID = 'user'
GROUP BY itemno) AS EachItem ON
EachItem.TopDate = table.last_updated
AND EachItem.itemno = table.itemno
ORDER BY qty_sold DESC

MySQL: using main query variable on sub-sub-query

I'm trying to create a query that shows a table of stocks: the name of the stock, id, date, url, price, and a list of prices from the last 2 weeks.
For the 14-day history I used a sub-query with group_concat in the select.
But when I use group_concat, it returns all results and ignores my limit, so I created another sub-query that returns the 14 prices, and the group_concat turns it into a list.
The table 'record_log' is records for all stocks:
parent_stock_id - the actual stock this line belongs
price - the price
search_date - date of the price
The second table is 'stocks':
id - id of the stock
name, market_volume....
Here is the problem:
In the sub-sub-query (last line of the SELECT), when I filter on parent_stock_id=stocks.id, it doesn't recognize stocks.id because it belongs to the main query.
How can I take the stock_id from the top level and pass it to the sub-sub-query? Or is there maybe another approach?
SELECT
stocks.id AS stock_id,
record_log.price AS price,
record_log.search_date,
(SELECT GROUP_CONCAT(price) FROM (SELECT price FROM record_log WHERE parent_stock_id=stocks.id ORDER BY id DESC LIMIT 14) AS nevemind) AS history
FROM stocks
INNER JOIN record_log ON stocks.id = record_log.parent_stock_id
WHERE
record_log.another_check !=0
Thank you!
--- I'm not really using it for stocks, it was just the easiest way to explain :)
One method is to use substring_index() and eliminate the extra subquery:
SELECT s.id AS stock_id, rl.price AS price, rl.search_date,
(SELECT SUBSTRING_INDEX(GROUP_CONCAT(price ORDER BY id DESC), ',', 14)
FROM record_log rl2
WHERE rl2.parent_stock_id = s.id
) AS history
FROM stocks s INNER JOIN
record_log rl
ON s.id = rl.parent_stock_id
WHERE rl.another_check <> 0;
Note that MySQL has a settable limit on the length of the group_concat() intermediate result (group_concat_max_len). This parameter defaults to 1024.
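If MySQL 8.0 or later is available, another option is to number the rows per stock with a window function and concatenate only the newest 14, so no correlated sub-sub-query is needed and only 14 prices per stock are ever concatenated. A sketch assuming the table and column names from the question; the aliases ranked, h and rn are made up for illustration.

```sql
-- Sketch for MySQL 8.0+: build the 14-price history once per stock, then join it.
SELECT s.id AS stock_id, rl.price AS price, rl.search_date, h.history
FROM stocks s
INNER JOIN record_log rl
        ON s.id = rl.parent_stock_id
LEFT JOIN (
    SELECT parent_stock_id,
           GROUP_CONCAT(price ORDER BY id DESC) AS history
    FROM (
        SELECT id, parent_stock_id, price,
               -- 1 = newest record of the stock, 2 = second newest, ...
               ROW_NUMBER() OVER (PARTITION BY parent_stock_id
                                  ORDER BY id DESC) AS rn
        FROM record_log
    ) AS ranked
    WHERE rn <= 14
    GROUP BY parent_stock_id
) AS h ON h.parent_stock_id = s.id
WHERE rl.another_check <> 0;
```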

MySQL getting SUM of balance grouped by user's job location

I'm trying to get the SUM of all user balances in a specific month, grouped by the user's region, which depends on the Point of Sale they work at.
balance: id_balance, date, id_user, value ($$$)
user: id_user, id_pos, name (not relevant)
pos (Point of Sale): id_pos, id_region, name (not relevant)
location_region: id_region, name (Florida, Texas, etc)
Basically, I would need it to present this data (filtered by month):
location_region.name | SUM(balance.value)
---------------------|-------------------
Florida | 45730
Texas | 43995
I've tried a few approaches with no luck. This was my closest attempt.
SELECT location_region.name, SUM(balance.value) AS money
FROM balance
LEFT JOIN user ON user.id_user
LEFT JOIN pos ON pos.id_pos = user.id_pos
LEFT JOIN location_region ON location_region.id_region = pos.id_region
WHERE balance.date BETWEEN '2014-02-01' AND DATE_ADD('2014-02-01', INTERVAL 1 MONTH)
GROUP BY location_region.id_region
ORDER BY money DESC
Any ideas? Thanks!
Your current query has a logical error: the JOIN condition between the balance and user tables is incomplete (balance.id_user is missing). Instead of balance LEFT JOIN user ON user.id_user you should have balance LEFT JOIN user ON user.id_user=balance.id_user. Because ON user.id_user is true for every non-zero id, each balance row is joined to every user row (number of rows in balance times number of rows in user), so the final SUM comes out far too high.
I tried the following query on your sample data (I changed some values) and it seems to be working fine:
SELECT location_region.name, SUM(balance.value) AS money
FROM balance
LEFT JOIN user USING(id_user)
LEFT JOIN pos USING(id_pos)
LEFT JOIN location_region USING(id_region)
WHERE balance.date BETWEEN '2014-02-01' AND DATE_ADD('2014-02-01', INTERVAL 1 MONTH)
GROUP BY location_region.id_region
ORDER BY money DESC
Working demo: http://sqlfiddle.com/#!2/dda28/3
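The row multiplication caused by the incomplete ON clause can be reproduced on two tiny tables. The demo_user and demo_balance tables below are hypothetical and exist only for this illustration.

```sql
-- Hypothetical miniature tables, only to show the effect of the incomplete ON clause.
CREATE TEMPORARY TABLE demo_user (id_user INT PRIMARY KEY);
CREATE TEMPORARY TABLE demo_balance (id_balance INT PRIMARY KEY, id_user INT, value INT);

INSERT INTO demo_user VALUES (1), (2);
INSERT INTO demo_balance VALUES (1, 1, 100), (2, 2, 50);

-- Incomplete condition: u.id_user is non-zero (true) for every user,
-- so each balance row is paired with both users: 2 x 2 = 4 rows.
SELECT COUNT(*) AS joined_rows, SUM(b.value) AS money
FROM demo_balance b
LEFT JOIN demo_user u ON u.id_user;
-- joined_rows = 4, money = 300

-- Complete condition: each balance row matches exactly one user.
SELECT COUNT(*) AS joined_rows, SUM(b.value) AS money
FROM demo_balance b
LEFT JOIN demo_user u ON u.id_user = b.id_user;
-- joined_rows = 2, money = 150
```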
Having taken a detailed look at your table structure and the query you gave, I think this mismatch could be caused by duplicate rows produced by the JOIN. What I suggest in this case is to find the DISTINCT rows and sum them so that you get an exact result. SUM(DISTINCT ...) removes duplicate values rather than duplicate rows, so it is not suitable here; you could try this different approach to accomplish what you want:
SELECT location_region.name,
SUM(balance.value)*COUNT(DISTINCT balance.id_user)/COUNT(balance.id_user) AS money
FROM balance
LEFT JOIN user ON user.id_user = balance.id_user
LEFT JOIN pos ON pos.id_pos = user.id_pos
LEFT JOIN location_region ON location_region.id_region = pos.id_region
WHERE balance.date BETWEEN '2014-02-01' AND DATE_ADD('2014-02-01', INTERVAL 1 MONTH)
GROUP BY location_region.id_region
ORDER BY money DESC
In my comment, I was wondering why you did not JOIN the user table with an ON clause of user.id_user = balance.id_user. I have added that in my query. Hope this helps.

How do I retrieve a set number of records in date order using joins

I'm having a bit of trouble getting the right results from a query.
At the moment I have two tables, main_cats and products.
The result I am after is 6 records, in date order, with only one unique main_cat_id.
The basic table structures are
Main_cats: main_cat_id, main_cat_title
Products: product_id, main_cat_id, product_name, date_added.
I am hitting problems when I join the main_cat table to the products table. It seems to totally ignore the ORDER BY clause.
SELECT date_added, product_name,main_cat_title FROM ic_products p
JOIN ic_main_cats icm on icm.main_cat_id=p.main_cat_id
WHERE p.main_cat_id IN (1,2,12,22,6,8)
GROUP BY p.main_cat_id
ORDER BY date_added ASC
LIMIT 6
If I leave the join out the query works but shows more than one main_cat_id and I cannot display the main_cat_title as needed.
Your question is (at heart) a "select min/max date per group and associated fields" question.
SELECT p.date_added, p.product_name, icm.main_cat_title
FROM ic_products p
LEFT JOIN ic_products p2
ON p.main_cat_id=p2.main_cat_id
AND p.date_added > p2.date_added
LEFT JOIN ic_main_cats icm ON icm.main_cat_id=p.main_cat_id
WHERE p2.date_added IS NULL
AND p.main_cat_id IN (1,2,12,22,6,8)
Let me explain: look at this table, being the first LEFT JOIN of the query above:
SELECT p.date_added, p.product_name
FROM ic_products p
LEFT JOIN ic_products p2
ON p.main_cat_id=p2.main_cat_id
AND p.date_added > p2.date_added
WHERE p2.date_added IS NULL
This joins products to itself: it produces a table with every combination of date_added pairs within each category, where the date in the first column is always greater than the date in the second.
Since this is a left join, when the date in the first column is the smallest for that category, the date in the second will be NULL.
So this basically selects the minimum date for each category (I assume you want the minimum date, i.e. the earliest occurrence, based on the ORDER BY date_added ASC in your question; if you wanted the newest date_added you'd change the > to a < in the above join).
The second LEFT JOIN to icm is just the one in your original question, so that we can retrieve main_cat_title.
There is no need to LIMIT 6 here because firstly, only one row is retrieved per main_cat_id thanks to the first LEFT JOIN, and secondly, your AND p.main_cat_id IN (1,2,12,22,6,8) only selects 6 categories. So 6 categories at one row per category retrieves 6 rows. (Or at most 6; if you have no items in a particular category of course no rows will be retrieved).
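On MySQL 8.0 or later, the earliest product per category can also be picked with ROW_NUMBER(). The sketch below assumes the ic_products and ic_main_cats columns listed in the question and uses product_id as a tie-breaker, which is my assumption.

```sql
-- Sketch for MySQL 8.0+: earliest product per category, exactly one row each.
SELECT date_added, product_name, main_cat_title
FROM (
    SELECT p.date_added, p.product_name, icm.main_cat_title,
           -- 1 = earliest product in the category; product_id breaks ties.
           ROW_NUMBER() OVER (PARTITION BY p.main_cat_id
                              ORDER BY p.date_added ASC, p.product_id ASC) AS rn
    FROM ic_products p
    JOIN ic_main_cats icm ON icm.main_cat_id = p.main_cat_id
    WHERE p.main_cat_id IN (1, 2, 12, 22, 6, 8)
) AS ranked
WHERE rn = 1
ORDER BY date_added ASC;
```

Unlike the self-join, which returns every product sharing the minimum date_added in a category, this keeps exactly one row per category even when dates tie.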
This should work ...
SELECT p.main_cat_id as cat_id, product_name,main_cat_title, MAX(date_added) FROM ic_products p
JOIN ic_main_cats icm on icm.main_cat_id=p.main_cat_id
WHERE p.main_cat_id IN (1,2,12,22,6,8)
GROUP BY p.main_cat_id
ORDER BY date_added ASC
LIMIT 6