Remove duplicate rows, except one, ** where rows contain NULL values ** - mysql

This question is similar to but different from other SO questions involving removal of duplicate rows in MySQL.
Chosen solutions in the referenced questions(below) fail where one or more of the columns contain NULL values. I'm including schema as well as two chosen answers in a sqlfiddle below, to illustrate where the previous solutions are not working.
Source schema:
CREATE TABLE IF NOT EXISTS `deldup` (
`or_id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` varchar(5) NOT NULL,
`txt_value` varchar(20) NOT NULL,
`date_of_revision` date NOT NULL,
`status` int(3) DEFAULT NULL,
PRIMARY KEY (`or_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=16 ;
INSERT INTO `deldup` (`or_id`, `order_id`, `txt_value`, `date_of_revision`, `status`) VALUES
(1, '10001', 'ABC', '2003-03-06', NULL),
(2, '10001', 'RFE', '2003-03-11', NULL),
(3, '10002', 'ASE', '2009-08-05', NULL),
(4, '10003', 'PEF', '2001-11-03', NULL),
(5, '10004', 'OIU', '1999-10-29', NULL),
(6, '10005', 'FOO', '2002-03-01', NULL),
(7, '10006', 'RTY', '2005-08-19', NULL),
(8, '10001', 'NND', '2003-03-20', NULL),
(9, '10005', 'VBN', '2002-02-19', NULL),
(10, '10002', 'AAQ', '2009-08-13', NULL),
(11, '10002', 'EEW', '2009-08-07', NULL),
(12, '10001', 'ABC', '2003-03-06', 3),
(13, '10001', 'ABC', '2003-03-06', 3),
(14, '10001', 'ABC', '2003-03-06', NULL),
(15, '10001', 'ABC', '2003-03-06', NULL);
Solution Example 1:
http://sqlfiddle.com/#!2/983f3/1
create temporary table tmpTable (or_id int);
insert tmpTable
(or_id)
select or_id
from deldup yt
where exists
(
select *
from deldup yt2
where yt2.txt_value = yt.txt_value
and yt2.order_id = yt.order_id
and yt2.date_of_revision = yt.date_of_revision
and yt2.status = yt.status
and yt2.or_id > yt.or_id
);
delete
from deldup
where or_id in (select or_id from tmpTable);
Note that the rows containing non-null row values are successfully deleted, however rows containing null values are not removed (See row 14 and 15 in the resulting SELECT query).
Solution Example 2:
http://sqlfiddle.com/#!2/8a4f8/1
DELETE
n1
FROM
deldup n1,
deldup n2
WHERE
n1.or_id < n2.or_id AND
n1.order_id = n2.order_id AND
n1.txt_value = n2.txt_value AND
n1.date_of_revision = n2.date_of_revision AND
n1.status = n2.status
This solution involves less code and works just the same as Example 1, including the exclusion of rows containing null values.
How can I remove the duplicate rows where one of the column values contains NULL values?
References:
Delete all Duplicate Rows except for One in MySQL?
Remove duplicate rows in MySQL
How to remove duplicate entries from a mysql db?

How about simply adjusting the second solution with something like this:
WHERE
...
(n1.status IS NULL AND n2.status IS NULL
OR n1.status = n2.status)
I assume you consider the record a duplicate only if all values are repeated.

Related

How to implement IF condition in two relational tables?

I have two relational tables, and I would like to filter data using IF condition. The problem is that using LEFT JOIN I got records that cannot be grouped.
The tables that I have are:
calendar
bookers
The first table consists of lessons that can be booked by more people, and the second table contains data who booked each lesson. The IF condition that I would like to implement is: return '2' if lesson is booked by specific user, return '1' if lesson is booked, but by another user, and return '0' if lesson is not booked.
What I would like to get according to above tables is given in the figure below.
Expected result
But, when I use LEFT JOIN to link those tables, I got record for every user that booked specific lesson.
SELECT calendar.id, calendarId, lessonType, description,
CASE
WHEN bookedBy then IF(bookedBy = 8, '2', '1')
ELSE '0'
END AS bb,
(select count(bookedBy) from bookers where calendar.id = bookers.lessonId) as nOfBookers
FROM calendar
LEFT JOIN bookers ON calendar.id = bookers.lessonId
WHERE `calendarId`= 180
Without the LEFT JOIN (fiddle), counts are shown properly, but I cannot include IF condition, because the table bookers is not defined.
I would appreciate any help. Thank you very much in advance.
Here is the Fiddle.
CREATE TABLE `calendar` (
`id` int(11) NOT NULL,
`calendarId` varchar(50) NOT NULL,
`lessonType` varchar(255) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `calendar`
(`id`, `calendarId`, `lessonType`, `description`)
VALUES
(1, '180', 'A', ''),
(2, '180', 'A', ''),
(3, '180', 'A', ''),
(4, '180', 'B', ''),
(5, '180', 'B', ''),
(6, '180', 'B', ''),
(7, '180', 'B', ''),
(8, '180', 'B', ''),
(9, '180', 'B', '');
CREATE TABLE `bookers` (
`id` int(11) NOT NULL,
`lessonId` int(11) DEFAULT NULL,
`bookedBy` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
--
-- Dumping data for table `bookers`
--
INSERT INTO `bookers` (`id`, `lessonId`, `bookedBy`) VALUES
(4, 1, 8),
(5, 2, 8),
(6, 2, 28),
(7, 2, 17),
(8, 3, 11);
--
-- Indexes for dumped tables
--
ALTER TABLE `calendar`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `id` (`id`);
--
-- Indexes for table `bookers`
--
ALTER TABLE `bookers`
ADD PRIMARY KEY (`id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `bookers`
--
ALTER TABLE `bookers`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=9;
COMMIT;
select version();
Try this:
SELECT id, calendarid, lessontype, description,
CASE WHEN FIND_IN_SET(8,vbb)>0 THEN 2
WHEN vbb IS NOT NULL THEN 1
ELSE 0 END AS bb,
nOfBookers
FROM
(SELECT c.id, calendarId, lessonType, GROUP_CONCAT(bookedby) AS vbb, description,
(SELECT COUNT(bookedby) FROM bookers WHERE c.id = bookers.lessonId) AS nOfBookers
FROM calendar c
LEFT JOIN bookers b ON c.id = b.lessonId
WHERE `calendarId`= 180
GROUP BY c.id, calendarId, lessonType, description) A;
In addition to your original LEFT JOIN attempt, I've added GROUP_CONCAT(bookedby) AS vbb which will return a comma separated bookedby value; which is 17,28,8. After that, I make the query as a sub-query and do CASE expression with FIND_IN_SET function on vbb to look for specific bookedby.
Here's an update fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=0933e9fc3cb7445311c34c6705d11637

SQL Weighted averages of multiple rows -

How would you go about to find the real weighted average of multiple rows:
By real weighted, I mean like this calculator: https://www.rapidtables.com/calc/math/weighted-average-calculator.html (and not by multiplying value with weight).
The weight for each answer is set in answers and values for each question in child-table answer_items. We want the weighted average for each question (a,b,c,d). We know what questions to look for in advance.
The query will include between 2 and 500k answers (so preferably a speedy solution :) )
CREATE TABLE `answers` (
`id` int(10) NOT NULL,
`weight` varchar(255) NOT NULL
);
INSERT INTO `answers` (`id`, `weight`) VALUES
(1, '0.7'),
(2, '1'),
(3, '0.7'),
(4, '0.9');
CREATE TABLE `answer_items` (
`id` int(11) NOT NULL,
`answer_id` int(11) NOT NULL,
`question` varchar(5) NOT NULL,
`value` int(11) NOT NULL
);
INSERT INTO `answer_items` (`id`, `answer_id`, `question`, `value`) VALUES
(1, 1, 'a', 2),
(2, 1, 'b', 4),
(3, 1, 'c', 2),
(4, 1, 'd', 3),
(5, 2, 'a', 4),
(6, 2, 'b', 2),
(7, 2, 'c', 4),
(8, 2, 'd', 1),
(9, 3, 'a', 3),
(10, 3, 'b', 4),
(11, 3, 'c', 1),
(12, 3, 'd', 5),
(13, 4, 'a', 5),
(14, 4, 'b', 2),
(15, 4, 'c', 3),
(16, 4, 'd', 3);
ALTER TABLE `answers`
ADD PRIMARY KEY (`id`);
ALTER TABLE `answer_items`
ADD PRIMARY KEY (`id`);
ALTER TABLE `answers`
MODIFY `id` int(10) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=5;
ALTER TABLE `answer_items`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=17;
You can multiply the value times the weight and then divide by the sum of the weights. For the weighted average by question:
select question, sum(ai.value * a.weight) / sum(a.weight)
from answer_items ai join
answers a
on ai.answer_id = a.id
group by question;
Here is a db<>fiddle.
Somethin like:
SELECT
SUM(value * weight) / SUM (weight)
FROM
answers
JOIN
answer_items
ON
answer_items.answer_id = answers.id
(edit: and obviously include the question and GROUP BY question, as in the previous answer for the SUM aggregation, as per the previous answer. D'oh)

mysql query on an audit table to get latest row where no delete row exists for each user and type

so im sure this won't tax anyone but new to all this! looking for a way to pull out data from an audit table which gives a view of all the created rows where they have not subsequently been deleted.
so by example...
CREATE TABLE `test1` (
`tID` INT(11) NULL DEFAULT NULL,
`tDate` DATE NULL DEFAULT NULL,
`tName` VARCHAR(50) NULL DEFAULT NULL,
`tType` VARCHAR(50) NULL DEFAULT NULL,
`tAction` VARCHAR(50) NULL DEFAULT NULL
)
INSERT INTO `test1` VALUES (1, '2019-02-01', 'Bob', 'a', 'Create');
INSERT INTO `test1` VALUES (2, '2019-02-02', 'Frank', 'a', 'Create');
INSERT INTO `test1` VALUES (3, '2019-02-03', 'Jim', 'b', 'Create');
INSERT INTO `test1` VALUES (4, '2019-02-04', 'Frank', 'a', 'Delete');
INSERT INTO `test1` VALUES (5, '2019-02-05', 'Bob', 'b', 'Create');
INSERT INTO `test1` VALUES (6, '2019-02-06', 'Bob', 'a', 'Delete');
INSERT INTO `test1` VALUES (7, '2019-02-07', 'Bob', 'a', 'Create');
INSERT INTO `test1` VALUES (8, '2019-02-08', 'Frank', 'b', 'Create');
INSERT INTO `test1` VALUES (9, '2019-02-09', 'Bob', 'b', 'Delete');
INSERT INTO `test1` VALUES (10, '2019-02-10', 'Bob', 'b', 'Create');
INSERT INTO `test1` VALUES (11, '2019-02-11', 'kate', 'a', 'Create');
INSERT INTO `test1` VALUES (12, '2019-02-12', 'kate', 'a', 'Delete');
Want the output to be something like...
"Bob" "2019-02-07" "a"
"Jim" "2019-02-03" "b"
"Bob" "2019-02-10" "b"
"Frank" "2019-02-08" "b"
so i can can get max where create easy enough but how should i exclude the ones that have been deleted after creation?
this gets me close... but i need to remove the ones with a later deleted row..
SELECT
max(tDate),
tname,
tType
FROM test1
WHERE tAction = 'create'
GROUP BY tname, ttype
thanks for any help! j.
select *
from test1 t1
where t1.tAction <> 'Delete'
and not exists
(
select *
from test1 t2
where t2.tName = t1.tName -- same user
and t2.tType = t1.tType -- and same type
and t2.tAction = 'Delete' -- deleted
and t1.tDate < t2.tDate -- at a later time
)
Try using correlated subquery
DEMO
select * from test1 a where a.taction='Create' and not exists
(
select 1 from test1 b where a.tname=b.tname and a.ttype=b.ttype
and a.tdate<b.tdate and b.taction<>'Create'
)
OUTPUT:
tID tDate tName tType tAction
3 2019-02-03 Jim b Create
7 2019-02-07 Bob a Create
8 2019-02-08 Frank b Create
10 2019-02-10 Bob b Create
SELECT
tDate,
tname,
tType
FROM test1
WHERE tAction <> 'Delete'
I hope this is the query you are looking for.
Reference
SQL: How to perform string does not equal

Condition based selecting from group_concat in Mysql query

Here is my table and sample data.
CREATE TABLE `articles`
(
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `tags`
(
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `article_tags`
(
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL
);
INSERT INTO `tags` (`id`, `name`) VALUES
(1, 'Wap Stories'),
(2, 'App Stories');
INSERT INTO `articles` (`id`, `title`) VALUES
(1, 'USA'),
(2, 'England'),
(3, 'Germany'),
(4, 'India'),
(5, 'France'),
(6, 'Dubai'),
(7, 'Poland'),
(8, 'Japan'),
(9, 'China'),
(10, 'Australia');
INSERT INTO `article_tags` (`article_id`, `tag_id`) VALUES
(1, 1),
(1, 2),
(4, 1),
(5, 1),
(2, 2),
(2, 1),
(6, 2),
(7, 2),
(8, 1),
(9, 1),
(3, 2),
(9, 2),
(10, 2);
How can I get the below output I have tried using group_concat function. It gives all the results. But my requirement is I need to groupconcat values as
a. Combination of 1,2 can be there, only 1 can be there but 2 alone cannot be there.
b. Combination of 2,1 can be there, only 2 can be there but 1 alone cannot be there
Below is the output I need
id, title, groupconcat
--------------------------
1, USA, 1,2
2, England, 1,2
4, India, 1
5, France, 1
8, Japan, 1
9, China, 1,2
SqlFiddle Link
The query which I am using is
select id, title, group_concat(tag_id order by tag_id) as 'groupconcat' from articles a
left join article_tags att on a.id = att.article_id
where att.tag_id in (1,2)
group by article_id order by id
You can try like this
SELECT id, title, GROUP_CONCAT(tag_id ORDER BY tag_id) AS 'groupconcat'
FROM articles a
LEFT join article_tags att on a.id = att.article_id
WHERE att.tag_id in (1,2)
GROUP BY article_id
HAVING SUBSTRING_INDEX(groupconcat,',',1) !='2'
ORDER BY id

Mysql - Help me alter this search query to get desired results

Following is a dump of the tables and data needed to answer understand the system:-
The system consists of tutors and classes.
The data in the table All_Tag_Relations stores tag relations for each tutor registered and each class created by a tutor. The tag relations are used for searching classes.
CREATE TABLE IF NOT EXISTS `Tags` (
`id_tag` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(255) default NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Tags` (`id_tag`, `tag`) VALUES
(1, 'Sandeepan'),
(2, 'Nath'),
(3, 'first'),
(4, 'class'),
(5, 'new'),
(6, 'Bob'),
(7, 'Cratchit');
CREATE TABLE IF NOT EXISTS `All_Tag_Relations` (
`id_tag` int(10) unsigned NOT NULL default '0',
`id_tutor` int(10) default NULL,
`id_wc` int(10) unsigned default NULL,
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `All_Tag_Relations` (`id_tag`, `id_tutor`, `id_wc`) VALUES
(1, 1, NULL),
(2, 1, NULL),
(3, 1, 1),
(4, 1, 1),
(6, 2, NULL),
(7, 2, NULL),
(5, 2, 2),
(4, 2, 2),
(8, 1, 3),
(9, 1, 3);
Following is my query:-
This query searches for "first class" (tag for first = 3 and for class = 4, in Tags table) and returns all those classes such that both the terms first and class are present in the class name.
SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS
key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =4)) AS
key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =3
OR wtagrels.id_tag =4 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
LIMIT 0, 20
And it returns the class with id_wc = 1.
But, I want the search to show all those classes such that all the search terms are present in the class name or its tutor name
So that searching "Sandeepan class" (wtagrels.id_tag = 1,4) or "Sandeepan Nath" also returns the class with id_wc=1. And Searching. Searching "Bob First" should not return any classes.
Please modify the above query or suggest a new query, if possible using MyIsam - fulltext search, but somehow help me get the result.
I think this query would help you:
SET #tag1 = 1, #tag2 = 4; -- Setting some user variables to see where the ids go. (you can put the values in the query)
SELECT wtagrels.id_wc,
SUM(DISTINCT( wtagrels.id_tag =#tag1 OR wtagrels.id_tutor =#tag1)) AS key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =#tag2 OR wtagrels.id_tutor =#tag2)) AS key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =#tag1 OR wtagrels.id_tag =#tag2 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1 AND key_2_total_matches = 1
LIMIT 0, 20
It returns id_wc = 1.
For (6, 3) the query returns nothing.