Group by two values - mysql

I have the following query:
SELECT
items.*
FROM
`items`
INNER JOIN
`users` ON `items`.`owner` = `users`.`id`
GROUP BY
`items`.`owner`
LIMIT
10
It ensures the result is grouped by user (only one item is fetched per user), but I also want to ensure that items in a given category, say "1", appear only once.
But that does not work: the query succeeds, but it does not group by category, and the same category still shows up multiple times. Any ideas?
I have created a SQLFiddle here: http://sqlfiddle.com/#!2/0a4bad/1
Instead of outputting:
+----+----------+-------+
| ID | CATEGORY | OWNER |
+----+----------+-------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 4 |
| 5 | 2 | 5 |
+----+----------+-------+
It should be outputting:
+----+----------+-------+
| ID | CATEGORY | OWNER |
+----+----------+-------+
| 1 | 1 | 1 |
| 4 | 2 | 2 |
| 5 | 2 | 4 |
| 5 | 2 | 5 |
| 8 | 3 | 3 |
+----+----------+-------+
(notice category 1 is only shown ONCE).
I want to ensure that only one item per owner is shown, and then additionally ensure that specific categories (say 1 and 5) are only shown once. Categories 1 and 5 are overpopulated, and if they are not limited, they will make up 90% of the output.

You can use DISTINCT to retrieve unique data:
SELECT DISTINCT items.category

select * from items t1
where category not in (1,2)
or not exists (
select 1 from items t2
where t2.id < t1.id
and t2.category = t1.category
)
group by owner
http://sqlfiddle.com/#!2/0a4bad/27
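To see what the query above does, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows and the function name one_item_per_owner are made up for illustration (they are not the fiddle's data), and SQLite only stands in for MySQL here. The sketch selects MIN(id) per owner so the chosen row is deterministic; a plain "select * ... group by owner" relies on MySQL picking an arbitrary row and is rejected when ONLY_FULL_GROUP_BY is enabled.

```python
import sqlite3

# Hypothetical sample rows (id, category, owner); not the data from the fiddle.
ITEMS = [
    (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 2),
    (5, 2, 4), (6, 2, 5), (7, 3, 3), (8, 1, 4),
]

def one_item_per_owner(limited_categories):
    """Return one row per owner; each limited category appears at most once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, category INTEGER, owner INTEGER)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", ITEMS)
    marks = ", ".join(["?"] * len(limited_categories))
    sql = f"""
        SELECT MIN(t1.id) AS item_id, t1.category, t1.owner
        FROM items t1
        -- keep a row if its category is unrestricted,
        -- or if it is the first (lowest id) row of a restricted category
        WHERE t1.category NOT IN ({marks})
           OR NOT EXISTS (
                SELECT 1 FROM items t2
                WHERE t2.id < t1.id AND t2.category = t1.category)
        GROUP BY t1.owner
        ORDER BY item_id
    """
    rows = conn.execute(sql, list(limited_categories)).fetchall()
    conn.close()
    return rows

rows = one_item_per_owner([1])
for row in rows:
    print(row)
```

With this sample data category 1 shows up once, and every owner still has a row. Note that an owner whose only items are later rows of a limited category would drop out of the result entirely.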

Related

Merge references to duplicate rows in mysql

This feels very simple and complex at the same time, but I can't quite get my head around an appropriate way of going about this as a MySQL query.
I have a table of tags called categories that should only have unique titles for the field cat_title. However, I've noticed that there are multiple rows with the same cat_title field name.
I want to delete all but the first instance of any duplicates. Simple enough, yes. But another table, tagging has a field called tagging_cat_id that references the identifier field, cat_id in the categories table. Deleting duplicates will break these references and point to nothing.
So, the more complex aspect is finding any tagging_cat_id field that references a duplicate row that's about to be deleted and change it to reference the (soon to be unique, single) first row of this cat_title
I am a novice at mysql and this is a bit out of my depth. I was almost tempted to do this manually by hand in a gui. Is there a simple enough method of doing this as a query that I could run on occasion to perform the above? (until what's causing duplicates to be created is resolved). Distrib version is 5.7.21.
Sample Data
Categories
+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 4 | green |
| 5 | green |
| 6 | red |
| 7 | white |
+--------+-----------+
Tagging
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 5 |
| 5 | 4 | 6 |
| 6 | 5 | 4 |
| 7 | 5 | 3 |
| 8 | 6 | 5 |
+------------+-------------------+----------------+
I want to convert the above to the following:
Categories
+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 7 | white |
+--------+-----------+
Tagging
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 1 |
| 5 | 4 | 2 |
| 6 | 5 | 1 |
| 7 | 5 | 3 |
| 8 | 6 | 1 |
+------------+-------------------+----------------+
If your version of MySQL is 8.0+ you can use this query:
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
to identify for each cat_id the minimum cat_id with the same cat_title so you can update the table:
WITH ids AS (
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
)
UPDATE tagging t
INNER JOIN ids i ON i.cat_id = t.tagging_cat_id
SET t.tagging_cat_id = i.min_id
Then you can delete the duplicates:
WITH ids AS (
SELECT cat_id, MIN(cat_id) OVER (PARTITION BY cat_title) min_id
FROM categories
)
DELETE c
FROM categories c INNER JOIN ids i
ON i.cat_id = c.cat_id AND i.min_id < c.cat_id
See the demo.
For previous versions of MySQL that do not support window functions and CTEs:
UPDATE tagging t
INNER JOIN categories c ON c.cat_id = t.tagging_cat_id
INNER JOIN (
SELECT cat_title, MIN(cat_id) min_id
FROM categories
GROUP BY cat_title
) m ON m.cat_title = c.cat_title
SET t.tagging_cat_id = m.min_id
and:
DELETE c1
FROM categories c1 INNER JOIN categories c2
ON c2.cat_title = c1.cat_title
WHERE c1.cat_id > c2.cat_id
See the demo.
Results:
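+--------+-----------+
| cat_id | cat_title |
+--------+-----------+
| 1 | green |
| 2 | red |
| 3 | blue |
| 7 | white |
+--------+-----------+
and:
+------------+-------------------+----------------+
| tagging_id | tagging_record_id | tagging_cat_id |
+------------+-------------------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 7 |
| 4 | 3 | 1 |
| 5 | 4 | 2 |
| 6 | 5 | 1 |
| 7 | 5 | 3 |
| 8 | 6 | 1 |
+------------+-------------------+----------------+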
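If you want to try the two steps (re-point the references, then delete the duplicates) without touching a real database, here is a rough sketch using Python's built-in sqlite3 module with the sample data from the question. This is only a stand-in: SQLite has no UPDATE ... JOIN, so the re-pointing is written as a correlated subquery, and the exact statements are my adaptation rather than the MySQL queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (cat_id INTEGER PRIMARY KEY, cat_title TEXT);
    CREATE TABLE tagging (
        tagging_id INTEGER PRIMARY KEY,
        tagging_record_id INTEGER,
        tagging_cat_id INTEGER
    );
    INSERT INTO categories VALUES
        (1,'green'),(2,'red'),(3,'blue'),(4,'green'),(5,'green'),(6,'red'),(7,'white');
    INSERT INTO tagging VALUES
        (1,1,1),(2,1,2),(3,2,7),(4,3,5),(5,4,6),(6,5,4),(7,5,3),(8,6,5);
""")

# Run both steps in one transaction so a failure cannot leave dangling references.
with conn:
    # Step 1: point every tagging row at the lowest cat_id sharing its title.
    conn.execute("""
        UPDATE tagging
        SET tagging_cat_id = (
            SELECT MIN(c2.cat_id)
            FROM categories c1
            JOIN categories c2 ON c2.cat_title = c1.cat_title
            WHERE c1.cat_id = tagging.tagging_cat_id)
        WHERE tagging_cat_id IN (SELECT cat_id FROM categories)
    """)
    # Step 2: delete every category that is not the first one for its title.
    conn.execute("""
        DELETE FROM categories
        WHERE cat_id NOT IN (
            SELECT MIN(cat_id) FROM categories GROUP BY cat_title)
    """)

categories = conn.execute(
    "SELECT cat_id, cat_title FROM categories ORDER BY cat_id").fetchall()
tagging = conn.execute(
    "SELECT tagging_id, tagging_record_id, tagging_cat_id "
    "FROM tagging ORDER BY tagging_id").fetchall()
print(categories)
print(tagging)
```

The order matters: the update has to run while the duplicate rows still exist, otherwise there is nothing left to map the old cat_id values to their titles.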

Select all values that also exists with specific other value

I have a table in mysql like e.g.
--------------------------
Line | category | product |
==========================
1 | 1 | 500 |
2 | 10 | 500 |
3 | 1 | 510 |
4 | 11 | 510 |
5 | 2 | 520 |
6 | 10 | 520 |
--------------------------
Now I was wondering if it's possible to select the category from lines 2 and 4, because their products also exist with the category value 1 in the table.
I tried some stuff like
select
max(category),
product
from
products
group by
product
but this brings up all results, even those for a product that has 2 as its category.
Expected output is:
| category |
|==========|
| 10 |
| 11 |
------------
I think the easiest way would be to self-join the table, so that each row gets matched with the rows of the same product that have 1 as category:
select t1.category
from yourTable t1
join yourTable t2
on t1.product = t2.product and
t1.category <> t2.category
where t2.category = 1
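As a quick check, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database (standing in for MySQL) and runs the self-join. The table name products is taken from the question's attempt; the DISTINCT and ORDER BY are my additions so that a category is listed once even if several products pair it with category 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (line INTEGER, category INTEGER, product INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, 1, 500), (2, 10, 500), (3, 1, 510), (4, 11, 510), (5, 2, 520), (6, 10, 520)],
)

# t2 is the "category 1" row of a product; t1 is any other row of the same product.
companions = [
    row[0]
    for row in conn.execute("""
        SELECT DISTINCT t1.category
        FROM products t1
        JOIN products t2
          ON t1.product = t2.product
         AND t1.category <> t2.category
        WHERE t2.category = 1
        ORDER BY t1.category
    """)
]
print(companions)
```

Product 520 has categories 2 and 10 but no category 1 row, so its line 6 does not contribute to the result.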

mysql return count = 0 without joins using group by

I have a table
Image
| ImageId | UserId | SourceId |
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 2 | 2 |
| 5 | 3 | 1 |
| 6 | 3 | 1 |
| 7 | 3 | 1 |
| 8 | 3 | 1 |
| 9 | 3 | 1 |
I then have a query:
SELECT UserId, IFNULL( COUNT( ImageId ) , 0 ) AS ImageCount, SourceId
FROM Image
GROUP BY UserId, SourceId
When I do the query, I get
| UserId | SourceId | ImageCount
| 1 | 1 | 1
| 1 | 2 | 1
| 2 | 1 | 1
| 2 | 2 | 1
| 3 | 1 | 5
However, the one row that I do NOT get back (which I want) is:
| 3 | 2 | 0
How do I go about fetching the row, even if the count is 0?
All of the questions I've seen have had to deal with joins (usually left joins) but as this doesn't require a join I'm a little confused as to how to go about this.
This does require a left join and a bit more. You need to start with all possible combinations, then use left join to bring in the existing values:
SELECT u.UserId, COUNT(i.ImageId ) AS ImageCount, s.SourceId
FROM (select distinct UserId from Image) u cross join
(select distinct SourceId from Image) s left join
Image i
on i.UserId = u.UserId and i.SourceId = s.SourceId
GROUP BY u.UserId, s.SourceId;
The count() will return 0 if there are no matches. There is no need for an if, coalesce, or case statement.
What this does is create every possible combination of UserId and SourceId (based on the values in the Image table). It then uses a left outer join to connect back to the image. What does this do? Well, for existing records, it actually does nothing. But for combinations that don't appear in the table, it will add a new row with u.UserId, s.SourceId, and NULL in the other fields. This is the basis for the final aggregation.
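To make the effect concrete, the following sketch runs the same cross join plus left join on the question's sample rows, using Python's sqlite3 module as a stand-in for MySQL. The column order and the ORDER BY are my choice for readable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Image (ImageId INTEGER PRIMARY KEY, UserId INTEGER, SourceId INTEGER)"
)
conn.executemany(
    "INSERT INTO Image VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 2, 2),
     (5, 3, 1), (6, 3, 1), (7, 3, 1), (8, 3, 1), (9, 3, 1)],
)

counts = conn.execute("""
    SELECT u.UserId, s.SourceId, COUNT(i.ImageId) AS ImageCount
    FROM (SELECT DISTINCT UserId FROM Image) u
    CROSS JOIN (SELECT DISTINCT SourceId FROM Image) s
    LEFT JOIN Image i
      ON i.UserId = u.UserId AND i.SourceId = s.SourceId
    GROUP BY u.UserId, s.SourceId
    ORDER BY u.UserId, s.SourceId
""").fetchall()

for user_id, source_id, image_count in counts:
    print(user_id, source_id, image_count)
```

COUNT(i.ImageId) skips the NULL that the left join produces for the (3, 2) combination, which is why that row comes back with 0 rather than 1.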

Mysql select row based on multiple rows in same table

I have the following table structure:
item_id | value |
==================
1 | 1 |
1 | 3 |
1 | 4 |
2 | 2 |
2 | 3 |
2 | 4 |
2 | 5 |
3 | 1 |
3 | 5 |
3 | 6 |
4 | 1 |
4 | 3 |
4 | 4 |
4 | 5 |
I have a query that returns those item_id whose value matches with 1, 3 and 4.
So here, the item_ids that should be returned are 1 and 4.
My query:
select item_id from table t
where exists (select item_id from table t1 where value = 1 and t1.item_id = t.item_id)
and exists (select item_id from table t1 where value = 3 and t1.item_id = t.item_id)
and exists (select item_id from table t1 where value = 4 and t1.item_id = t.item_id) group by item_id
This query works fine, but here I am matching only 3 values. What if I want to match 50 such values from the table? (All 50 values are stored in a PHP array.) The query would be huge, and I also want to do the same thing against two different tables in the same query, which would double the size of an already huge query. Please suggest another way.
Edited::
table 2
--------
item_id | user_id |
==================
1 | 1 |
1 | 5 |
1 | 7 |
2 | 2 |
2 | 3 |
2 | 4 |
2 | 5 |
3 | 1 |
3 | 5 |
3 | 6 |
4 | 1 |
4 | 3 |
4 | 4 |
4 | 5 |
Now, I want the item_id where the values from table1 are 1,3,4 and the user_id values from table2 are 1,5,7
This problem is called Relational Division.
SELECT item_ID
FROM tableName
WHERE value IN (1,3,4)
GROUP BY item_ID
HAVING COUNT(*) = 3
If uniqueness is not enforced on column value for every item_id, DISTINCT is required to count only unique values:
SELECT item_ID
FROM tableName
WHERE value IN (1,3,4)
GROUP BY item_ID
HAVING COUNT(DISTINCT value) = 3
SQLFiddle Demo (both queries included)
SQL of Relational Division
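For the edited part of the question (a list of values from an array, checked against two tables), one possible sketch is to build the IN list and the expected count from the array and join the two division queries on item_id. The code below uses Python and SQLite instead of PHP and MySQL, and the names table1, table2, division and items_matching are assumptions for illustration. Table and column names are interpolated into the SQL, so they must come from your own code, never from user input; the values themselves go through placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (item_id INTEGER, value INTEGER)")
conn.execute("CREATE TABLE table2 (item_id INTEGER, user_id INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, 1), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (2, 5),
    (3, 1), (3, 5), (3, 6), (4, 1), (4, 3), (4, 4), (4, 5),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (1, 1), (1, 5), (1, 7), (2, 2), (2, 3), (2, 4), (2, 5),
    (3, 1), (3, 5), (3, 6), (4, 1), (4, 3), (4, 4), (4, 5),
])

def division(table, column, wanted):
    """Build the relational-division query for one table (trusted identifiers only)."""
    wanted = sorted(set(wanted))            # duplicates would break the count check
    marks = ", ".join(["?"] * len(wanted))  # one placeholder per wanted value
    sql = (f"SELECT item_id FROM {table} WHERE {column} IN ({marks}) "
           f"GROUP BY item_id HAVING COUNT(DISTINCT {column}) = ?")
    return sql, wanted + [len(wanted)]

def items_matching(values, user_ids):
    """Items that have all `values` in table1 AND all `user_ids` in table2."""
    sql_a, params_a = division("table1", "value", values)
    sql_b, params_b = division("table2", "user_id", user_ids)
    sql = (f"SELECT a.item_id FROM ({sql_a}) a "
           f"JOIN ({sql_b}) b ON b.item_id = a.item_id")
    return sorted(row[0] for row in conn.execute(sql, params_a + params_b))

by_value_only = sorted(row[0] for row in conn.execute(*division("table1", "value", [1, 3, 4])))
both_tables = items_matching([1, 3, 4], [1, 5, 7])
print(by_value_only, both_tables)
```

The query text stays the same size no matter how many values are in the array; only the number of placeholders and the expected count change.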

Mysql process data by groups

I have a few groups of data. Each group has a property field.
For example:
_________________________
| id | value | property |
--------------------------
| 1 | 2 | 3 |
--------------------------
| 2 | 2 | 3 |
--------------------------
| 3 | 2 | 3 |
--------------------------
| 4 | 2 | 4 |
-------------------------
| 5 | 2 | 4 |
--------------------------
| 6 | 2 | 4 |
--------------------------
How can I update two rows ordered by id ASC with property = 3, and two rows ordered by id ASC with property = 4, in one query?
I want to update 2 of the 3 rows with property = 3 and 2 of the 3 rows with property = 4. For example: the rows with id 1 and 2, and the rows with id 4 and 5.
i.e. I want to update groups of data with different conditions in one query.
You can do it using calculated rank field, e.g. -
SELECT p1.*, COUNT(*) rank FROM properties p1
LEFT JOIN properties p2
ON p2.property = p1.property AND p2.id <= p1.id
GROUP BY p1.property, p1.id
This query will return dataset with row-number by property:
+------+-------+----------+------+
| id | value | property | rank |
+------+-------+----------+------+
| 1 | 2 | 3 | 1 |
| 2 | 2 | 3 | 2 |
| 3 | 2 | 3 | 3 |
| 4 | 2 | 4 | 1 |
| 5 | 2 | 4 | 2 |
| 6 | 2 | 4 | 3 |
+------+-------+----------+------+
Then you should update records with rank < 3:
UPDATE properties p
JOIN (SELECT p1.*, COUNT(*) rank FROM properties p1
LEFT JOIN properties p2
ON p2.property = p1.property AND p2.id <= p1.id
GROUP BY p1.property, p1.id) r
ON p.id = r.id
SET p.value = 100 -- set new value here
WHERE r.rank < 3
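Here is a rough, runnable sketch of the same rank idea on the question's sample data, using Python's sqlite3 module as a stand-in for MySQL. SQLite has no UPDATE ... JOIN, so the ranked ids are fed through an IN subquery instead, and the per-group row count is passed as a parameter; ROWS_PER_GROUP is a name I made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE properties (id INTEGER PRIMARY KEY, value INTEGER, property INTEGER)"
)
conn.executemany(
    "INSERT INTO properties VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 2, 3), (3, 2, 3), (4, 2, 4), (5, 2, 4), (6, 2, 4)],
)

ROWS_PER_GROUP = 2  # hypothetical setting: how many rows to update per property

with conn:
    conn.execute("""
        UPDATE properties
        SET value = 100
        WHERE id IN (
            -- rank of a row = number of rows in the same group with id <= its id
            SELECT p1.id
            FROM properties p1
            JOIN properties p2
              ON p2.property = p1.property AND p2.id <= p1.id
            GROUP BY p1.id
            HAVING COUNT(*) <= ?
        )
    """, (ROWS_PER_GROUP,))

after = conn.execute(
    "SELECT id, value, property FROM properties ORDER BY id").fetchall()
for row in after:
    print(row)
```

The self-join compares every row with every other row of its group, so on large groups this gets slow; it is fine for small tables like the one in the question.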
Here's the solution, and see discussion following:
update
t,
(select GROUP_CONCAT(ids) as matching_ids from (
select
SUBSTRING_INDEX(GROUP_CONCAT(id order by id), ',', 2) AS ids
from
t
where
property in (3,4)
group by
property
) s1
) s2
set value=12345
where
FIND_IN_SET(id, matching_ids) > 0
;
To illustrate, and assuming your table is called t, and the initial state is:
root#mysql-5.1.51> select * from t;
+----+-------+----------+
| id | value | property |
+----+-------+----------+
| 1 | 2 | 3 |
| 2 | 2 | 3 |
| 3 | 2 | 3 |
| 4 | 2 | 4 |
| 5 | 2 | 4 |
| 6 | 2 | 4 |
+----+-------+----------+
The result of running this query is:
root#mysql-5.1.51> select * from t;
+----+-------+----------+
| id | value | property |
+----+-------+----------+
| 1 | 12345 | 3 |
| 2 | 12345 | 3 |
| 3 | 2 | 3 |
| 4 | 12345 | 4 |
| 5 | 12345 | 4 |
| 6 | 2 | 4 |
+----+-------+----------+
A brief explanation of the query:
I pick up the first two ids for each property using the SUBSTRING_INDEX(GROUP_CONCAT(id order by id), ',', 2) statement.
I combine the above using GROUP_CONCAT(ids) as matching_ids to get all valid ids.
Finally, I update all rows in the table where the id is within combined matching_ids text.
Notes:
You should verify your group_concat_max_len variable is long enough. Default is 1024. You most probably want to have this in the millions, anyhow (regardless of my answer).
The query is far from being optimal. It answers your question, but you can't have an optimal query here.
You are most probably better off with a transaction containing two or three queries.
Good luck!
I'm assuming you mean to limit your two updates to two rows each. You can use ORDER BY and LIMIT in your update statements:
UPDATE yourtable
SET property = 'new_value'
WHERE value=2 AND property = 4
ORDER BY id ASC LIMIT 2;
UPDATE yourtable
SET property = 'new_value'
WHERE value=2 AND property = 3
ORDER BY id ASC LIMIT 2;
Update:
To force this into one query, you would need to JOIN against a subquery which retrieves the ids to update via UNION. I think this is legal:
UPDATE yourtable
JOIN (
(SELECT id FROM yourtable WHERE value=2 AND property=4 ORDER BY id ASC LIMIT 2)
UNION ALL
(SELECT id FROM yourtable WHERE value=2 AND property=3 ORDER BY id ASC LIMIT 2)
) updaterows ON yourtable.id = updaterows.id
SET property = 'new value'
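To check the single-statement idea without a MySQL server, here is a sketch that builds one "first N ids" branch per property and glues them together with UNION ALL, run against SQLite from Python. Several things are assumptions on my part: the helper name update_first_rows, updating value rather than property (so the groups stay intact), and the extra SELECT wrapper around each branch, which SQLite needs because it does not allow ORDER BY / LIMIT directly on a UNION member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER PRIMARY KEY, value INTEGER, property INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 2, 3), (3, 2, 3), (4, 2, 4), (5, 2, 4), (6, 2, 4)],
)

def update_first_rows(properties, per_group, new_value):
    """Set `value` on the first `per_group` rows (by id) of each listed property."""
    branch = ("SELECT id FROM (SELECT id FROM yourtable "
              "WHERE property = ? ORDER BY id ASC LIMIT ?)")
    id_query = " UNION ALL ".join([branch] * len(properties))
    params = [new_value]
    for prop in properties:
        params += [prop, per_group]
    with conn:
        conn.execute(f"UPDATE yourtable SET value = ? WHERE id IN ({id_query})", params)

update_first_rows([3, 4], per_group=2, new_value=100)
updated = conn.execute(
    "SELECT id, value, property FROM yourtable ORDER BY id").fetchall()
for row in updated:
    print(row)
```

Each branch can carry its own condition and its own limit, which is the main reason to prefer this shape over a single ranked subquery when the groups need different treatment.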