SQL: multiple groupings needed to track duplicates - mysql

I have a table, node_saved_data, setup like this:
set_id - node_id - node_value
The values will look something like
2 - 1 - some text
2 - 2 - more text
2 - 3 - a bit more text
2 - 4 - some more text
2 - 5 - even more text
I want to see if there are any duplicate node ids within each set. I tried
SELECT set_id, node_id, COUNT(node_id) c FROM node_saved_data GROUP BY node_id, set_id
but the results were clearly not what I wanted. Any advice? Let me know if you need more information.

I determined that I only had to reverse the grouping order I was using. After that, it gave me the results I was expecting.

I think you're pretty close. Drop a HAVING clause at the end to check for those combinations with more than 1 entry.
select
set_id, node_id, count(1)
from
node_saved_data
group by
set_id,
node_id
having
count(1) > 1
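A minimal sketch of the GROUP BY ... HAVING check, run against an in-memory SQLite database as a stand-in for MySQL. The sample rows, including the duplicated node, are made up for illustration. Note that the order of the columns in GROUP BY affects at most the order of the output rows, not which groups are formed.

```python
import sqlite3

def find_duplicate_nodes(rows):
    """Return (set_id, node_id, count) for every pair that occurs more than once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node_saved_data (set_id INTEGER, node_id INTEGER, node_value TEXT)"
    )
    conn.executemany("INSERT INTO node_saved_data VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT set_id, node_id, COUNT(*) AS c
        FROM node_saved_data
        GROUP BY set_id, node_id
        HAVING COUNT(*) > 1
        ORDER BY set_id, node_id
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical data: node 2 appears twice in set 2, once in set 3.
sample_rows = [
    (2, 1, "some text"),
    (2, 2, "more text"),
    (2, 2, "duplicate of node 2"),
    (3, 2, "same node id, different set"),
]
print(find_duplicate_nodes(sample_rows))
```

Only the (set 2, node 2) pair is reported; node 2 in set 3 is not, because the grouping is per set.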

Related

Select all rows contains same value in a column

I want to select all package_id that contain product_id 2.
In this case, package_id 1,3,5 has product_id 2
Table: product_package
package_id package_name product_id
---------------------------------------------
1 Gold 1,2,3
2 Platinum 4,5,12
3 Diamond 2,11,5
4 Titanium 3,5
5 Basic 2
I tried:
SELECT
*
FROM
product_package
WHERE product_id IN(2)
It is outputting package_id 3 and 5 only. How do I output this properly?
product_id structure is varchar(256). Should I change the structure or add Foreign keys?
Storing delimited lists in a column is not recommended; see "Is storing a delimited list in a database column really that bad?"
That said, you can use FIND_IN_SET, but it is slow because it cannot use an index on product_id:
SELECT
*
FROM
product_package
WHERE FIND_IN_SET(2,product_id)
package_id package_name product_id
---------------------------------------------
1 Gold 1,2,3
3 Diamond 2,11,5
5 Basic 2
First, let me explain what is happening in your query.
You have WHERE product_id IN(2), but product_id is a misnomer and should rather be product_ids, because it is multiple IDs unfortunately stored in a string. IN is made to look up a value in a list. Your list, however, only consists of one element, so you can just as well use the equality operator: WHERE product_id = 2.
What you have is WHERE string = number, so the DBMS has to convert one of the values in order to compare the two. It converts the string to a number (so '2' matches 2 and '002' matches 2, too, as it should). But your strings are not numbers. The DBMS should raise an error on '1,2,3', for instance, because '1,2,3' is not a number. MySQL, however, has a design flaw here and converts the string regardless. It takes as many characters from the left as still represent a number. '1' does, but the comma is not considered numeric (yes, MySQL cannot deal with a thousands separator when converting strings to numbers implicitly). So converting '1,2,3' to a number results in 1. Equally, '2,11,5' results in 2, so, rather surprisingly, '2,11,5' = 2 in MySQL. This is why you are getting that row.
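To make the conversion rule concrete, here is a rough Python emulation of it. The helper `leading_number` is hypothetical and deliberately simplified (integers only, no decimals or exponents); it is not MySQL's actual implementation, just the "take the leading numeric prefix" behaviour described above applied to the rows from the question.

```python
import re

def leading_number(text):
    """Mimic the lenient string-to-number cast: keep the leading integer, else 0."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else 0

# package_id -> product_id string, as stored in the question's table
packages = {1: "1,2,3", 2: "4,5,12", 3: "2,11,5", 4: "3,5", 5: "2"}

# Equivalent of WHERE product_id = 2 under that cast
matched = [pid for pid, ids in packages.items() if leading_number(ids) == 2]
print(matched)
```

Packages 3 and 5 match because their strings start with 2; package 1 is missed because '1,2,3' is cast to 1.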
You ask "Should I change the structure", and the answer to this is yes. So far your table doesn't comply with the first normal form and should thus not exist in a relational database. You'll want two tables instead forming the 1:n relation:
Table: package
package_id package_name
-----------------------
1 Gold
2 Platinum
3 Diamond
4 Titanium
5 Basic
Table: product_package
package_id product_id
---------------------
1 1
1 2
1 3
2 4
2 5
2 12
3 2
3 11
3 5
4 3
4 5
5 2
You ask "or add Foreign keys?", and the answer is: and add foreign keys. With the changed structure, you want product_package(product_id) to reference product(product_id) and product_package(package_id) to reference package(package_id).
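A sketch of the normalized layout with both foreign keys, using SQLite in place of MySQL. The `product` table is not shown in the question, so its single-column definition here is an assumption, as is the composite primary key on the link table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE package (package_id INTEGER PRIMARY KEY, package_name TEXT NOT NULL);
CREATE TABLE product (product_id INTEGER PRIMARY KEY);  -- assumed minimal definition
CREATE TABLE product_package (
    package_id INTEGER NOT NULL REFERENCES package(package_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    PRIMARY KEY (package_id, product_id)
);
""")

source = {
    1: ("Gold", [1, 2, 3]),
    2: ("Platinum", [4, 5, 12]),
    3: ("Diamond", [2, 11, 5]),
    4: ("Titanium", [3, 5]),
    5: ("Basic", [2]),
}
all_products = sorted({p for _, products in source.values() for p in products})
conn.executemany("INSERT INTO product VALUES (?)", [(p,) for p in all_products])
for package_id, (name, products) in source.items():
    conn.execute("INSERT INTO package VALUES (?, ?)", (package_id, name))
    conn.executemany(
        "INSERT INTO product_package VALUES (?, ?)",
        [(package_id, p) for p in products],
    )

# The original question becomes a plain equality lookup.
packages_with_product_2 = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.package_id
        FROM package AS p
        JOIN product_package AS pp ON pp.package_id = p.package_id
        WHERE pp.product_id = 2
        ORDER BY p.package_id
        """
    )
]
print(packages_with_product_2)
```

With one product per row, the lookup no longer depends on string parsing and can use an index on product_id.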
Disregarding that you should not be storing multiple values in a single field, you can use the LIKE operator to achieve what you are looking for. I'm making the following assumptions:
all values are delimited with commas
all values are integers
there is no whitespace (or any other character besides digits and commas)
select * from product_package
where product_id like '2,%'
or product_id like '%,2,%'
or product_id like '%,2'
or product_id like '2'
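The four patterns can be checked quickly against the question's data. This sketch uses SQLite, whose LIKE wildcards behave the same way for this case; row 6 is a made-up row added only to show that values such as 12, 21 and 22 do not produce false matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product_package (package_id INTEGER, product_id TEXT)")
conn.executemany(
    "INSERT INTO product_package VALUES (?, ?)",
    [
        (1, "1,2,3"),
        (2, "4,5,12"),
        (3, "2,11,5"),
        (4, "3,5"),
        (5, "2"),
        (6, "12,21,22"),  # hypothetical row: contains the digit 2 but not the value 2
    ],
)

def packages_containing(value):
    """Match value as first, middle, last, or only element of the list."""
    v = str(value)
    patterns = (v, v + ",%", "%," + v + ",%", "%," + v)
    rows = conn.execute(
        """
        SELECT package_id FROM product_package
        WHERE product_id = ?
           OR product_id LIKE ?
           OR product_id LIKE ?
           OR product_id LIKE ?
        ORDER BY package_id
        """,
        patterns,
    )
    return [row[0] for row in rows]

print(packages_containing(2))
```

Each pattern anchors the value to a comma or to the start or end of the string, which is what keeps 12 and 21 from matching.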
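Alternatively, you can use the REGEXP operator, anchoring the value to a comma or to the start or end of the string so that values such as 12 or 21 do not match:
select * from product_package
where product_id regexp '(^|,)2(,|$)'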
References:
MySQL LIKE
MySQL REGEXP

How to make an inner join while maintaining unique rows

I have a ternary relationship in which I establish the relation between Offers, Profiles, and Skills. The ternary relationship table, called ternary for example, has the IDs of the three tables as its primary key. It could look something like this:
id_Offer - id_Profile - id_Skill
1 - 1 - 1
1 - 1 - 2
1 - 1 - 3
1 - 2 - 1
2 - 1 - 1
2 - 3 - 2
2 - 1 - 3
2 - 5 - 1
[and so on; there would be more rows for each id_Offer from Offer, but I want to keep the example short]
So I have 2 offers in total, with a number of profiles in each one.
The table Offer looks something like this:
Offer - business_name
1 - business-1
2 - business-1
3 - business-1
4 - business-1
5 - business-2
6 - business-2
7 - business-2
8 - business-3
So when I do a query like
select distinct id_offer, business_name, COUNT(*)
FROM Offer
GROUP BY business_name
Order by COUNT(*);
I get that for business-1 I have 4 offers.
Now if I want to take into account the offers for some Profile, I have to make a join with my ternary relationship. But even if I do something as simple as the following
select distinct business_name
from Offer
INNER JOIN ternary ON Offer.id_Offer = ternary.id_Offer
WHERE business_name = 'business-1'
GROUP BY business_name
No matter what I put in the GROUP BY, and whether or not I write DISTINCT, I do not get what I want. In reality, business-1 has 4 offers, but only two of them currently appear in the ternary table. So the query should return 2 unique offers for this name when no profile filter is applied.
But instead I get 8 offers, because that is how many times the matching id_Offer values appear in the ternary table.
How should this be done? If I need no filters, I can simply look at the Offer table alone. But what if I need to filter by id_Skill or id_Profile AND want to return the business_name?
I have seen solutions such as this one, but I cannot make them work. I do not understand what the ? is or what it is called, so I cannot search for more information about it, and I do not know whether MariaDB works the same way in this respect. When I try to build that query for my data I get:
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near '? ORDER BY COUNT(*) DESC' at line 1
But as I said, it is kind of hard to look for '?' as an... Operator? Function?
There are two basic solutions.
SELECT
o.business_name,
COUNT(DISTINCT o.id_offer) AS unique_offers
FROM
Offer AS o
INNER JOIN
ternary AS t
ON t.id_Offer = o.id_Offer
WHERE
o.business_name = 'business-1'
AND t.id_profile IN (1, 2, 3, 5)
GROUP BY
o.business_name
That's the simplest to write and think about. But it can also be quite expensive, because you're still joining each matching row in Offer to 4 rows in ternary, creating 8 rows to aggregate and process through DISTINCT.
The "better" (in my opinion) route is to filter then aggregate the ternary table in a sub-query.
SELECT
o.business_name,
COUNT(*) AS unique_offers
FROM
Offer AS o
INNER JOIN
(
SELECT id_Offer
FROM ternary
WHERE id_profile IN (1, 2, 3, 5)
GROUP BY id_Offer
)
AS t
ON t.id_Offer = o.id_Offer
WHERE
o.business_name = 'business-1'
GROUP BY
o.business_name
This ensures that t only ever has one row for any given offer. This in turn means that each row in Offer only ever joins to one row in t; no duplication. That in turn means there is no need to use COUNT(DISTINCT), which removes some overhead (by moving it to the inner query's GROUP BY).
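Both variants can be compared on the question's sample data. This is a sketch against in-memory SQLite rather than MariaDB, with the Offer key column named id_Offer as in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Offer (id_Offer INTEGER PRIMARY KEY, business_name TEXT);
CREATE TABLE ternary (
    id_Offer INTEGER, id_Profile INTEGER, id_Skill INTEGER,
    PRIMARY KEY (id_Offer, id_Profile, id_Skill)
);
""")
conn.executemany("INSERT INTO Offer VALUES (?, ?)", [
    (1, "business-1"), (2, "business-1"), (3, "business-1"), (4, "business-1"),
    (5, "business-2"), (6, "business-2"), (7, "business-2"), (8, "business-3"),
])
conn.executemany("INSERT INTO ternary VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1),
    (2, 1, 1), (2, 3, 2), (2, 1, 3), (2, 5, 1),
])

# Raw join: one output row per matching ternary row, hence the inflated count.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM Offer AS o
    INNER JOIN ternary AS t ON t.id_Offer = o.id_Offer
    WHERE o.business_name = 'business-1'
""").fetchone()[0]

# Variant 1: deduplicate after the join.
count_distinct = conn.execute("""
    SELECT o.business_name, COUNT(DISTINCT o.id_Offer)
    FROM Offer AS o
    INNER JOIN ternary AS t ON t.id_Offer = o.id_Offer
    WHERE o.business_name = 'business-1' AND t.id_Profile IN (1, 2, 3, 5)
    GROUP BY o.business_name
""").fetchall()

# Variant 2: collapse ternary to one row per offer before joining.
pre_aggregated = conn.execute("""
    SELECT o.business_name, COUNT(*)
    FROM Offer AS o
    INNER JOIN (
        SELECT id_Offer FROM ternary
        WHERE id_Profile IN (1, 2, 3, 5)
        GROUP BY id_Offer
    ) AS t ON t.id_Offer = o.id_Offer
    WHERE o.business_name = 'business-1'
    GROUP BY o.business_name
""").fetchall()

print(joined_rows, count_distinct, pre_aggregated)
```

The raw join yields 8 rows, while both variants report 2 unique offers for business-1.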
Are you saying that you want to see offers for a particular business, but you want to limit these according to certain profiles or skills?
We limit query results in the WHERE clause. If we want to look up data in another table, we use IN or EXISTS. For instance:
select *
from offer
where business_name = 'business-1'
and id_offer in
(
select id_offer
from ternary
where id_profile = 1
and id_skill = 2
);
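As for the `?` in the error message: it is most likely a parameter placeholder from a prepared statement. The client library (or PREPARE/EXECUTE in MariaDB) substitutes a value for it at execution time, so a query pasted into the command-line client with a literal `?` is a syntax error. A minimal sketch using Python's sqlite3 driver, which uses the same `?` marker; the function name `offers_for` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Offer (id_Offer INTEGER PRIMARY KEY, business_name TEXT);
CREATE TABLE ternary (id_Offer INTEGER, id_Profile INTEGER, id_Skill INTEGER);
""")
conn.executemany("INSERT INTO Offer VALUES (?, ?)", [
    (1, "business-1"), (2, "business-1"), (3, "business-1"), (4, "business-1"),
    (5, "business-2"), (6, "business-2"), (7, "business-2"), (8, "business-3"),
])
conn.executemany("INSERT INTO ternary VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1),
    (2, 1, 1), (2, 3, 2), (2, 1, 3), (2, 5, 1),
])

def offers_for(business_name, id_profile, id_skill):
    """Each ? is bound, in order, to one value of the tuple passed to execute()."""
    sql = """
        SELECT id_Offer
        FROM Offer
        WHERE business_name = ?
          AND id_Offer IN (
              SELECT id_Offer FROM ternary
              WHERE id_Profile = ? AND id_Skill = ?
          )
        ORDER BY id_Offer
    """
    return [row[0] for row in conn.execute(sql, (business_name, id_profile, id_skill))]

print(offers_for("business-1", 1, 2))
print(offers_for("business-1", 1, 1))
```

Because the filter lives in an IN subquery, each offer is returned at most once regardless of how many ternary rows match.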

select records in given ids sorting order

I have a table, let's say Students, with 5 records whose ids are 1 to 5. I want to select the records so that the result comes back in a given order of the id column.
The id column should come back as 5,2,1,3,4.
Is there any way to do this other than separate DB calls for each id?
Can it be done in a single DB call?
I guess if you really want a hard-coded order, you could do something like this:
order by case id
when 5 then 0
when 2 then 1
when 1 then 2
when 3 then 3
when 4 then 4
else 999
end
Or more simply (as @Strawberry points out in the comments):
order BY FIELD(id,4,3,1,2,5) desc
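When the desired order arrives as a list from application code, the CASE expression can be generated instead of typed by hand. A sketch using SQLite (which has no FIELD function, so CASE is the portable choice here); the helper `case_order_expr` is hypothetical, and the column name passed to it must be a trusted identifier because it is interpolated into the SQL text.

```python
import sqlite3

def case_order_expr(column, ordered_ids):
    """Build 'CASE col WHEN ? THEN 0 WHEN ? THEN 1 ... ELSE n END'."""
    whens = " ".join(f"WHEN ? THEN {pos}" for pos, _ in enumerate(ordered_ids))
    return f"CASE {column} {whens} ELSE {len(ordered_ids)} END"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Students VALUES (?)", [(i,) for i in range(1, 6)])

wanted = [5, 2, 1, 3, 4]
sql = f"SELECT id FROM Students ORDER BY {case_order_expr('id', wanted)}"
# The ids themselves are bound as parameters, one per WHEN branch.
students_in_order = [row[0] for row in conn.execute(sql, wanted)]
print(students_in_order)
```

This keeps everything in a single database call, which is what the question asks for.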

Make query to copy everything from one table to another one ORDERING BY

I'm curious how to create table2 with the same structure and the same data as table1, but ordered by the frequency column.
Or, equivalently: how to renumber the ids of the rows in the table accordingly.
It doesn't matter whether the order is ASC or DESC.
As result, the table1:
**id - name - frequency**
1 - John - 33
2 - Paul - 127
3 - Andy - 74
Should become table2:
**id - name - frequency**
1 - Paul - 127
2 - Andy - 74
3 - John - 33
What's the shortest way to do that?
Also, I would be interested in the query that's fastest for huge tables (although performance is not so important for me).
Like this?
CREATE TABLE b SELECT col FROM a ORDER BY col
Be aware, there is no way to guarantee row order in a database (other than physically). You must always use an ORDER BY.
For this, you need to create the new id. Here is a MySQL way to do it:
create table table2 as
select @rn := @rn + 1 as id, name, frequency
from table1 cross join (select @rn := 0) const
order by frequency desc
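On engines with window functions (MySQL 8.0 and later, and SQLite 3.25 and later), ROW_NUMBER() is an alternative to the user-variable counter. The sketch below runs it on SQLite with the question's three rows; it assumes the bundled SQLite is recent enough to support window functions. Note that CREATE TABLE ... AS SELECT copies the data but not keys or indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, frequency INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "John", 33), (2, "Paul", 127), (3, "Andy", 74)],
)

# Number the rows by descending frequency and store the result as table2.
conn.execute("""
    CREATE TABLE table2 AS
    SELECT ROW_NUMBER() OVER (ORDER BY frequency DESC) AS id, name, frequency
    FROM table1
""")

# Reading back still needs an explicit ORDER BY to guarantee row order.
table2_rows = conn.execute("SELECT id, name, frequency FROM table2 ORDER BY id").fetchall()
print(table2_rows)
```

The new ids follow the frequency ranking: Paul, then Andy, then John.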

MySql order by specific ID values

Is it possible to sort in MySQL by "order by" using a predefined set of column values (ID) like order by (ID=1,5,4,3) so I would get records 1, 5, 4, 3 in that order out?
UPDATE: Why I need this...
I want my records to change sort randomly every 5 minutes. I have a cron task to update the table to put different, random sort order in it.
There is just one problem! PAGINATION.
I will have visitors who come to my page, and I will give them the first 20 results. They will wait 6 minutes, go to page 2 and have the wrong results as the sort order has already changed.
So I thought that if I put all the IDs into a session on page 2, we get the correct records even if the sorting had already changed.
Is there any other better way to do this?
You can use ORDER BY and FIELD function.
See http://lists.mysql.com/mysql/209784
SELECT * FROM table ORDER BY FIELD(ID,1,5,4,3)
It uses the FIELD() function, which "Returns the index (position) of str in the str1, str2, str3, ... list. Returns 0 if str is not found" according to the documentation. So you actually sort the result set by the return value of this function, which is the index of the field value in the given list.
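One consequence of the "returns 0 if not found" rule is that, in ascending order, rows whose id is not in the list sort before the listed ones. The Python function below is only an emulation of the documented behaviour quoted above, used to show that effect and one common way around it (sorting on `FIELD(...) = 0` first).

```python
def field(value, *candidates):
    """Emulate FIELD(): 1-based position of value in candidates, 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if value == candidate:
            return position
    return 0

ids = [1, 2, 3, 4, 5, 6]

# ORDER BY FIELD(id, 1, 5, 4, 3): ids 2 and 6 get 0 and therefore come first.
plain = sorted(ids, key=lambda i: field(i, 1, 5, 4, 3))

# ORDER BY FIELD(id, 1, 5, 4, 3) = 0, FIELD(id, 1, 5, 4, 3): unlisted ids go last.
listed_first = sorted(ids, key=lambda i: (field(i, 1, 5, 4, 3) == 0, field(i, 1, 5, 4, 3)))

print(plain)
print(listed_first)
```

If only the listed ids are wanted at all, adding WHERE id IN (1, 5, 4, 3) to the query avoids the issue entirely.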
You should be able to use CASE for this:
ORDER BY CASE id
WHEN 1 THEN 1
WHEN 5 THEN 2
WHEN 4 THEN 3
WHEN 3 THEN 4
ELSE 5
END
On the official documentation for mysql about ORDER BY, someone has posted that you can use FIELD for this matter, like this:
SELECT * FROM table ORDER BY FIELD(id,1,5,4,3)
This is untested code that in theory should work.
SELECT * FROM table ORDER BY id='8' DESC, id='5' DESC, id='4' DESC, id='3' DESC
If I had 10 records, for example, the IDs 8, 5, 4 and 3 would appear first in this way, and the other records would appear after them.
Default order
1
2
3
4
5
6
7
8
9
10
With this ORDER BY
8
5
4
3
1
2
6
7
9
10
There's another way to solve this. Add a separate table, something like this:
CREATE TABLE `new_order` (
`my_order` BIGINT(20) UNSIGNED NOT NULL,
`my_number` BIGINT(20) NOT NULL,
PRIMARY KEY (`my_order`),
UNIQUE KEY `my_number` (`my_number`)
) ENGINE=INNODB;
This table will now be used to define your own order mechanism.
Add your values in there:
my_order | my_number
---------+----------
1 | 1
2 | 5
3 | 4
4 | 3
...and then modify your SQL statement while joining this new table.
SELECT *
FROM your_table AS T1
INNER JOIN new_order AS T2 on T1.id = T2.my_number
WHERE ....whatever...
ORDER BY T2.my_order;
This solution is slightly more complex than the other solutions, but with it you don't have to change your SELECT statement whenever your ordering criteria change; you just change the data in the order table.
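A small end-to-end sketch of the order-table approach, using SQLite in place of MySQL. The contents of `your_table` (ids 1 to 5) are made up; `new_order` holds the mapping shown above. The second query run uses the same SELECT after only the order table's data has changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (id INTEGER PRIMARY KEY);
CREATE TABLE new_order (
    my_order  INTEGER PRIMARY KEY,
    my_number INTEGER NOT NULL UNIQUE
);
""")
conn.executemany("INSERT INTO your_table VALUES (?)", [(i,) for i in range(1, 6)])

SELECT_ORDERED = """
    SELECT T1.id
    FROM your_table AS T1
    INNER JOIN new_order AS T2 ON T1.id = T2.my_number
    ORDER BY T2.my_order
"""

def set_order(ids):
    """Replace the stored ordering; position in the list becomes my_order."""
    conn.execute("DELETE FROM new_order")
    conn.executemany(
        "INSERT INTO new_order (my_order, my_number) VALUES (?, ?)",
        list(enumerate(ids, start=1)),
    )

set_order([1, 5, 4, 3])
first_run = [row[0] for row in conn.execute(SELECT_ORDERED)]

set_order([3, 4, 5, 1])
second_run = [row[0] for row in conn.execute(SELECT_ORDERED)]

print(first_run, second_run)
```

Because this is an INNER JOIN, rows whose id is missing from the order table (id 2 here) are dropped; a LEFT JOIN would keep them, with a NULL my_order.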
If you need to order a single id first in the result, use the id.
select id,name
from products
order by case when id=5 then -1 else id end
If you need to start with a sequence of multiple ids, specify a collection, similar to what you would use with an IN statement.
select id,name
from products
order by case when id in (30,20,10) then -1 else id end,id
If you want to order a single id last in the result, put it in its own sort bucket with CASE. (E.g., you want the "Other" option last and all other cities in alphabetical order.)
select id,city
from city
order by case
when id = 2 then 1 else 0
end, city ASC
If I have 5 cities, for example, and want to show them in alphabetical order in a dropdown with the "Other" option last, I can use this query.
In my table, "Other" is stored under the second id (id:2), so I use "when id = 2" in the query above.
record in DB table:
Bangalore - id:1
Other - id:2
Mumbai - id:3
Pune - id:4
Ambala - id:5
my output:
Ambala
Bangalore
Mumbai
Pune
Other
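The same result can be reproduced with a numeric CASE bucket, which avoids mixing text and numbers in one sort key. A sketch on SQLite with the five rows listed above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO city VALUES (?, ?)",
    [(1, "Bangalore"), (2, "Other"), (3, "Mumbai"), (4, "Pune"), (5, "Ambala")],
)

# Bucket 0 holds every real city, bucket 1 holds "Other"; sort by bucket, then name.
dropdown = [
    row[0]
    for row in conn.execute(
        """
        SELECT city
        FROM city
        ORDER BY CASE WHEN id = 2 THEN 1 ELSE 0 END, city ASC
        """
    )
]
print(dropdown)
```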
SELECT * FROM table ORDER BY FIELD(columnname,1,2) ASC -- or DESC