Multiple Columns with Duplicate Values - mysql

Currently, I have a MySQL table that has a few columns in it. The following is a sample of the table with data:
+----------+---------+-----------+
| hospt_id | file_id | clinic_id |
+----------+---------+-----------+
|   212837 |       9 |      NULL |
|   123837 |      14 |   2134319 |
|   345567 |       9 |      NULL |
|   123456 |      14 |   2134320 |
|   123456 |      14 |   2134320 |
+----------+---------+-----------+
What I am trying to do is write a query that returns all records where all three columns are repeated.
For example, the last two rows are repeated, so I would want those returned. I know how to search for duplicates in a single column, but I'm not sure how to do it for multiple columns.

You just need to group by all three columns to get a count of how many rows are in each group. You can then use the HAVING clause to filter that down to the groups that contain more than one row.
select hospt_id, file_id, clinic_id, count(*)
from <table>
group by hospt_id, file_id, clinic_id
having count(*) > 1;
Here's a demo: http://sqlfiddle.com/#!9/91bf9/2
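For anyone who wants to try this locally, here is a minimal sketch of the same GROUP BY / HAVING query run against the sample data. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the table name visits is hypothetical (the question does not name the table).

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "visits" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (hospt_id INTEGER, file_id INTEGER, clinic_id INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (212837, 9, None),
        (123837, 14, 2134319),
        (345567, 9, None),
        (123456, 14, 2134320),
        (123456, 14, 2134320),
    ],
)

# Group on all three columns and keep only groups with more than one row.
duplicates = conn.execute(
    """
    SELECT hospt_id, file_id, clinic_id, COUNT(*) AS occurrences
    FROM visits
    GROUP BY hospt_id, file_id, clinic_id
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicates)  # [(123456, 14, 2134320, 2)]
```

The two rows with file_id 9 and a NULL clinic_id are not reported, because their hospt_id values differ.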

Related

How can I merge two strings of comma-separated numbers in MySQL?

For example, there are three rooms.
1|gold_room|1,2,3
2|silver_room|1,2,3
3|brown_room|2,4,6
4|brown_room|3
5|gold_room|4,5,6
Then, I'd like to get
gold_room|1,2,3,4,5,6
brown_room|2,3,4,6
silver_room|1,2,3
How can I achieve this?
I've tried: select * from room group by name; And it only prints the first row. And I know CONCAT() can combine two string values.
Please use the query below:
select col2, GROUP_CONCAT(col3) from data group by col2;
Here is a test case:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ab35e8d66ffe3ac6436c17faf97ee9af
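One thing to keep in mind with this approach: GROUP_CONCAT on the stored strings joins whole lists, so the element order and any values repeated across rows are left as they are. A rough sketch of tidying that up in application code follows. It assumes Python's sqlite3 module in place of MySQL, and the table and column names (rooms, room, csv) and the normalize helper are made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, room TEXT, csv TEXT)")
conn.executemany(
    "INSERT INTO rooms VALUES (?, ?, ?)",
    [
        (1, "gold_room", "1,2,3"),
        (2, "silver_room", "1,2,3"),
        (3, "brown_room", "2,4,6"),
        (4, "brown_room", "3"),
        (5, "gold_room", "4,5,6"),
    ],
)

# Join the stored lists per room; this keeps duplicates and arbitrary order.
rows = conn.execute(
    "SELECT room, GROUP_CONCAT(csv) FROM rooms GROUP BY room"
).fetchall()


def normalize(joined):
    """Split a joined list, drop repeated numbers, and sort numerically."""
    numbers = {int(part) for part in joined.split(",")}
    return ",".join(str(n) for n in sorted(numbers))


merged = {room: normalize(joined) for room, joined in rows}
print(merged)
```

With the sample data this yields 1,2,3,4,5,6 for gold_room, 2,3,4,6 for brown_room and 1,2,3 for silver_room.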
I'm not making an assumption that the lists don't have elements in common on separate rows.
First create a table of integers.
mysql> create table n (n int primary key);
mysql> insert into n values (1),(2),(3),(4),(5),(6);
You can join this to your rooms table using the FIND_IN_SET() function. Note that this cannot be optimized. It will execute N full table scans. But it does create an interim set of rows.
mysql> select * from n inner join rooms on find_in_set(n.n, rooms.csv) order by rooms.room, n.n;
+---+----+-------------+-------+
| n | id | room | csv |
+---+----+-------------+-------+
| 2 | 3 | brown_room | 2,4,6 |
| 3 | 4 | brown_room | 3 |
| 4 | 3 | brown_room | 2,4,6 |
| 6 | 3 | brown_room | 2,4,6 |
| 1 | 1 | gold_room | 1,2,3 |
| 2 | 1 | gold_room | 1,2,3 |
| 3 | 1 | gold_room | 1,2,3 |
| 4 | 5 | gold_room | 4,5,6 |
| 5 | 5 | gold_room | 4,5,6 |
| 6 | 5 | gold_room | 4,5,6 |
| 1 | 2 | silver_room | 1,2,3 |
| 2 | 2 | silver_room | 1,2,3 |
| 3 | 2 | silver_room | 1,2,3 |
+---+----+-------------+-------+
Use GROUP BY to reduce these rows to one row per room. Use GROUP_CONCAT() to put the integers together into a comma-separated list.
mysql> select room, group_concat(distinct n.n order by n.n) as csv
from n inner join rooms on find_in_set(n.n, rooms.csv) group by rooms.room
+-------------+-------------+
| room | csv |
+-------------+-------------+
| brown_room | 2,3,4,6 |
| gold_room | 1,2,3,4,5,6 |
| silver_room | 1,2,3 |
+-------------+-------------+
I think this is a lot of work, and impossible to optimize. I don't recommend it.
The problem is that you are storing comma-separated lists of numbers, and then you want to query it as if the elements in the list are discrete values. This is a problem for SQL.
It would be much better if you did not store your numbers in a comma-separated list. Store multiple rows per room, with one number per row. You can run a wider variety of queries if you do this, and it will be more flexible.
For example, the query you asked about, which produces a result with the numbers in a comma-separated list, becomes simpler, and you don't need the extra n table:
select room, group_concat(n order by n) as csv from rooms group by room
See also my answer to Is storing a delimited list in a database column really that bad?
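To illustrate the one-number-per-row layout, here is a small sketch under some assumptions: Python's sqlite3 module stands in for MySQL, and the table name room_numbers is hypothetical. The per-room lists are assembled in Python from an ordered SELECT, because older SQLite versions do not accept ORDER BY inside group_concat the way MySQL does.

```python
import sqlite3
from itertools import groupby

# Hypothetical normalized table: one row per (room, number) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_numbers (room TEXT, n INTEGER, PRIMARY KEY (room, n))"
)
pairs = (
    [("gold_room", n) for n in (1, 2, 3, 4, 5, 6)]
    + [("silver_room", n) for n in (1, 2, 3)]
    + [("brown_room", n) for n in (2, 4, 6, 3)]
)
conn.executemany("INSERT INTO room_numbers VALUES (?, ?)", pairs)

# Element-level queries become plain equality tests.
rooms_with_4 = [
    room
    for (room,) in conn.execute(
        "SELECT room FROM room_numbers WHERE n = 4 ORDER BY room"
    )
]

# Rebuild the comma-separated view from rows ordered by room, then number.
ordered = conn.execute(
    "SELECT room, n FROM room_numbers ORDER BY room, n"
).fetchall()
csv_by_room = {
    room: ",".join(str(n) for _, n in group)
    for room, group in groupby(ordered, key=lambda row: row[0])
}

print(rooms_with_4)  # ['brown_room', 'gold_room']
print(csv_by_room)
```

The primary key on (room, n) also prevents the same number from being stored twice for a room, so no DISTINCT is needed when building the list.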

Properly SQL query

I need to skip results with high price per day. I've got a table like this:
+------+-------------+-------+
| days | return_date | value |
+------+-------------+-------+
| 2 | 2017-12-27 | 15180 |
| 3 | 2017-12-28 | 14449 |
| 4 | 2017-12-29 | 13081 |
| 5 | 2017-12-30 | 11203 |
| 6 | 2017-12-31 | 9497 |
| 6 | 2017-12-31 | 9442 |
+------+-------------+-------+
How can I print only the lowest price for 6 days (9442 in this example)?
We can use a GROUP BY clause and an aggregate function. For example:
SELECT t.days
, t.return_date
, MIN(t.value) AS min_value
FROM mytable t
GROUP
BY t.days
, t.return_date
This doesn't really "skip" rows. It accesses all the rows that satisfy the conditions in the WHERE clause (this example has no WHERE clause, so that is every row in the table). Then MySQL collapses the rows into groups (in this example, rows with identical values of days and return_date are put into the same group). The MIN(t.value) aggregate function returns the lowest value within each group.
The query above is just an example of one approach of satisfying a particular specification.
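A quick way to check the behaviour on the sample rows is sketched below. It assumes Python's sqlite3 module as a stand-in for MySQL. The ORDER BY is added only to make the output order predictable and is not part of the query above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "mytable" follows the query above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (days INTEGER, return_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (2, "2017-12-27", 15180),
        (3, "2017-12-28", 14449),
        (4, "2017-12-29", 13081),
        (5, "2017-12-30", 11203),
        (6, "2017-12-31", 9497),
        (6, "2017-12-31", 9442),
    ],
)

# One output row per (days, return_date) group, carrying the lowest value.
cheapest = conn.execute(
    """
    SELECT t.days, t.return_date, MIN(t.value) AS min_value
    FROM mytable t
    GROUP BY t.days, t.return_date
    ORDER BY t.days
    """
).fetchall()

for row in cheapest:
    print(row)
```

The two rows for 6 days collapse into a single row with min_value 9442, and the other groups each contain one row, so they pass through unchanged.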

Only return an ordered subset of the rows from a joined table

Given a structure like this in a MySQL database
#data_table
(id) | user_id | time | (...)
#relations_table
(id) | user_id | user_coach_id | (...)
we can select all data_table rows belonging to a certain user_coach_id (let's say 1) with
SELECT rel.`user_coach_id`, dat.*
FROM `relations_table` rel
LEFT JOIN `data_table` dat ON rel.`user_id` = dat.`user_id`
WHERE rel.`user_coach_id` = 1
ORDER BY dat.`time` DESC
returning something like
| user_coach_id | id | user_id | time | data1 | data2 | ...
| 1 | 9 | 4 | 15 | foo | bar | ...
| 1 | 7 | 3 | 12 | oof | rab | ...
| 1 | 6 | 4 | 11 | ofo | abr | ...
| 1 | 4 | 4 | 5 | foo | bra | ...
(And so on. Of course, the time values are not integers in reality; this is just to keep it simple.)
But now I would like to query (ideally) only up to an arbitrary number of rows from data_table per distinct user_id but still have those ordered (i.e. newest first). Is that even possible?
I know I can use GROUP BY user_id to only return 1 row per user, but then the ordering doesn't work and it seems kind of unpredictable which row will be in the result. I guess it's doable with a subquery, but I haven't figured it out yet.
Limiting the number of rows in each group is complicated. It is probably best done with an @variable to count, plus an outer query to throw out the rows beyond the limit.
My blog on Groupwise Max gives some hints on how to do this.
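As an alternative to the @variable counter, and not what the answer above describes, a window function can do the per-group numbering on engines that support it (MySQL 8.0+, SQLite 3.25+). Below is a sketch under those assumptions, using Python's sqlite3 module in place of MySQL. The sample rows and the two parameters (coach id 1, at most 2 rows per user) are invented for the illustration.

```python
import sqlite3

# In-memory SQLite (3.25+ for window functions) stands in for MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data_table (id INTEGER PRIMARY KEY, user_id INTEGER, time INTEGER)"
)
conn.execute(
    "CREATE TABLE relations_table "
    "(id INTEGER PRIMARY KEY, user_id INTEGER, user_coach_id INTEGER)"
)
conn.executemany(
    "INSERT INTO data_table VALUES (?, ?, ?)",
    [(9, 4, 15), (7, 3, 12), (6, 4, 11), (4, 4, 5), (8, 5, 13)],
)
conn.executemany(
    "INSERT INTO relations_table VALUES (?, ?, ?)",
    [(1, 3, 1), (2, 4, 1), (3, 5, 2)],
)

# Number each user's rows newest-first, then keep only the first N per user.
query = """
    SELECT user_coach_id, id, user_id, time
    FROM (
        SELECT rel.user_coach_id, dat.id, dat.user_id, dat.time,
               ROW_NUMBER() OVER (
                   PARTITION BY dat.user_id ORDER BY dat.time DESC
               ) AS rn
        FROM relations_table rel
        JOIN data_table dat ON rel.user_id = dat.user_id
        WHERE rel.user_coach_id = ?
    ) ranked
    WHERE rn <= ?
    ORDER BY time DESC
"""
latest = conn.execute(query, (1, 2)).fetchall()

for row in latest:
    print(row)
```

With these rows, user 4 keeps the entries at time 15 and 11 (the one at time 5 is cut off), user 3 keeps the single entry at time 12, and the overall result is still ordered newest first.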

How can I show the counts of distinct values and include zeros?

I have a simple MySQL DB with the following fields:
mysql> SELECT * from table;
+----+-----------+------+
| id | location | name |
+----+-----------+------+
| 1 | NJ | Gary |
| 2 | MN | Paul |
| 3 | AZ | |
| 4 | MI | Adam |
| 5 | NJ | |
| 6 | MN | Dave |
+----+-----------+------+
6 rows in set (0.00 sec)
I need to retrieve a list of how many people are from each state, excluding those who don't have a name. In other words, I'm trying to reproduce the following result:
+----------+-------+
| location | count |
+----------+-------+
| AZ | 0 |
| MI | 1 |
| MN | 2 |
| NJ | 1 |
+----------+-------+
I'm able to get close with
SELECT location, COUNT(*) AS count FROM table WHERE name!='' GROUP BY location;
However, COUNT(*) excludes the zero counts. I attempted to use JOIN along with the table produced by
SELECT DISTINCT location, null as count from table;
but a LEFT JOIN throws out the count column from the right table, and a RIGHT JOIN doesn't seem to include the zero rows or the actual counts for some reason.
I feel as though there's a MySQL command or something simple that I'm missing. I just need to find a way to merge the two tables based on location.
Can anybody point me in the right direction?
COUNT(expr) returns the number of non-NULL values, so you need a way to convert the empty strings to NULLs to get the 0s.
SELECT location, COUNT(NULLIF(name,'')) AS count FROM table GROUP BY location;
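Here is a small sketch of that query against the sample rows, assuming Python's sqlite3 module as a stand-in for MySQL. The table is called people here, a hypothetical name chosen because table is a reserved word, and the ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "people" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, location TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "NJ", "Gary"),
        (2, "MN", "Paul"),
        (3, "AZ", ""),
        (4, "MI", "Adam"),
        (5, "NJ", ""),
        (6, "MN", "Dave"),
    ],
)

# NULLIF turns '' into NULL, and COUNT(expr) ignores NULLs, so every
# location keeps its row even when no named person lives there.
counts = conn.execute(
    """
    SELECT location, COUNT(NULLIF(name, '')) AS people_count
    FROM people
    GROUP BY location
    ORDER BY location
    """
).fetchall()

print(counts)  # [('AZ', 0), ('MI', 1), ('MN', 2), ('NJ', 1)]
```

Because no WHERE clause filters rows out before grouping, the AZ group survives and simply counts zero non-NULL names.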

getting the count of distinct duplicate ids in mysql

This is the query:
select count(*),
ss.pname,
ttu.user_id,
ttl.location_name ,
group_concat(em.customer_id),
count(em.customer_id)
from seseal as ss,
track_and_trace_user as ttu,
track_and_trace_location as ttl,
eseal_mapping as em
where ss.real_id=em.e_id
and em.user_id=ttu.user_id
and ttu.location_id=ttl.location_id
group by ss.pname, ttu.user_id, ttl.location_name
having count(em.customer_id)>1 ;
and these are the results:
+----------+----------------+---------+---------------+------------------------------+-----------------------+
| count(*) | pname | user_id | location_name | group_concat(em.customer_id) | count(em.customer_id) |
+----------+----------------+---------+---------------+------------------------------+-----------------------+
| 6 | Nokia N91 | 1 | Malad | 60,51,60,51,58,58 | 6 |
| 2 | SUPERIA 1000gm | 4 | Raichur | 51,46 | 2 |
| 5 | SUPERIA 1000gm | 5 | west bengal | 51,46,51,51,46 | 5 |
| 2 | SUPERIA 500gm | 4 | Raichur | 59,59 | 2 |
| 3 | SUPERIA 500gm | 5 | west bengal | 59,46,59 | 3 |
+----------+----------------+---------+---------------+------------------------------+-----------------------+
The problem is that, as you can see in the result set, the second-to-last column contains duplicate customer_ids in some rows and only unique ones in others. The last column gives the count of them.
What I want is this: take the 3rd row. It has two customer ids, 51 and 46, and both are duplicated in that row, so the last column for this row should contain 2.
Similarly, for the last row the last column should contain 1, as only one customer id is duplicated there, i.e. 59.
So, if you understand the exact problem, the 2nd row should not be part of this result set, as it doesn't contain any duplicate customer ids.
How about:
group_concat(distinct em.customer_id)
and
count(distinct em.customer_id)
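To see what DISTINCT changes inside the aggregates, here is a reduced sketch. It assumes Python's sqlite3 module in place of MySQL and a single hypothetical table mapping that holds the already-joined rows for user 5 from the result set above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "mapping" is a hypothetical,
# pre-joined table holding only the two "west bengal" groups.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mapping (pname TEXT, user_id INTEGER, customer_id INTEGER)")
conn.executemany(
    "INSERT INTO mapping VALUES (?, ?, ?)",
    [("SUPERIA 1000gm", 5, c) for c in (51, 46, 51, 51, 46)]
    + [("SUPERIA 500gm", 5, c) for c in (59, 46, 59)],
)

# COUNT(x) counts every non-NULL value; COUNT(DISTINCT x) counts each
# different value once per group.
summary = conn.execute(
    """
    SELECT pname,
           COUNT(customer_id) AS all_ids,
           COUNT(DISTINCT customer_id) AS distinct_ids
    FROM mapping
    GROUP BY pname
    ORDER BY pname
    """
).fetchall()

print(summary)  # [('SUPERIA 1000gm', 5, 2), ('SUPERIA 500gm', 3, 2)]
```

Note that on this sample the distinct count for the SUPERIA 500gm group is 2 (59 and 46), not the 1 the question asks for, because DISTINCT counts every different id rather than only the ids that repeat. Counting just the repeated ids would probably need an extra grouping step per customer_id.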