Need help writing a query (restructuring the table) - MySQL

I need to write a select statement that will rewrite the table in the following manner... I'm not sure how to go about this using MySQL.
Example of table
user_id  date        a  b  c
123456   2020-01-01  1  1  1
234567   2020-03-04  1  0  0
453576   2020-05-05  1  0  1
Desired result
user_id  date        results
123456   2020-01-01  a
123456   2020-01-01  b
123456   2020-01-01  c
234567   2020-03-04  a
453576   2020-05-05  a
453576   2020-05-05  c

In MySQL you can unpivot with union all, while filtering on 1 values:
select user_id, date, 'a' as result from mytable where a = 1
union all select user_id, date, 'b' from mytable where b = 1
union all select user_id, date, 'c' from mytable where c = 1
order by user_id, date, result

If you have a large amount of data or your "table" is really a complex query (say a subquery or view), then unpivoting is usually faster with cross join than with union all, because the source is read once instead of once per unpivoted column:
select t.user_id, t.date, r.result
from t cross join
(select 'a' as result union all
select 'b' as result union all
select 'c' as result
) r
where (t.a = 1 and r.result = 'a') or
(t.b = 1 and r.result = 'b') or
(t.c = 1 and r.result = 'c') ;
For a single smallish table, performance probably doesn't matter.
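As a quick sanity check that the two unpivot queries above return the same rows, here is a minimal sketch in Python. It uses the standard-library sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption for illustration only), and it names the table mytable in both queries.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (user_id INTEGER, date TEXT, a INTEGER, b INTEGER, c INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (123456, "2020-01-01", 1, 1, 1),
        (234567, "2020-03-04", 1, 0, 0),
        (453576, "2020-05-05", 1, 0, 1),
    ],
)

# Unpivot with one SELECT per column, keeping only the rows where the flag is 1.
union_all_sql = """
SELECT user_id, date, 'a' AS result FROM mytable WHERE a = 1
UNION ALL SELECT user_id, date, 'b' FROM mytable WHERE b = 1
UNION ALL SELECT user_id, date, 'c' FROM mytable WHERE c = 1
ORDER BY user_id, date, result
"""

# Unpivot by cross joining with a three-row list of column labels.
cross_join_sql = """
SELECT t.user_id, t.date, r.result
FROM mytable t
CROSS JOIN (
    SELECT 'a' AS result UNION ALL
    SELECT 'b' UNION ALL
    SELECT 'c'
) r
WHERE (t.a = 1 AND r.result = 'a')
   OR (t.b = 1 AND r.result = 'b')
   OR (t.c = 1 AND r.result = 'c')
ORDER BY t.user_id, t.date, r.result
"""

union_rows = conn.execute(union_all_sql).fetchall()
cross_rows = conn.execute(cross_join_sql).fetchall()

for row in union_rows:
    print(row)
```

Both result lists contain the six rows from the desired output in the question. Performance on SQLite says nothing about MySQL, so this only checks that the two queries are equivalent on the sample data.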

Related

Select duplicates while concatenating every one except the first

I am trying to write a query that will select all of the numbers in my table, but for numbers with duplicates I want to append something to the end that marks them as duplicates. However, I am not sure how to do this.
Here is an example of the table
TableA
ID Number
1 1
2 2
3 2
4 3
5 4
SELECT statement output would be like this.
Number
1
2
2-dup
3
4
Any insight on this would be appreciated.
If your MySQL version doesn't support window functions, you can write a correlated subquery that produces a row number per Number, then use CASE WHEN to check whether rn > 1 and mark the row as a dup.
create table T (ID int, Number int);
INSERT INTO T VALUES (1,1);
INSERT INTO T VALUES (2,2);
INSERT INTO T VALUES (3,2);
INSERT INTO T VALUES (4,3);
INSERT INTO T VALUES (5,4);
Query 1:
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,(SELECT COUNT(*)
FROM T tt
where tt.Number = t1.Number and tt.id <= t1.id
) rn
FROM T t1
)t1
Results:
| id | Number |
|----|--------|
| 1 | 1 |
| 2 | 2 |
| 3 | 2-dup |
| 4 | 3 |
| 5 | 4 |
If you can use window functions (MySQL 8.0 and later), you can use row_number() partitioned by Number to generate the row number.
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,row_number() over(partition by Number order by id) rn
FROM T t1
)t1
sqlfiddle
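The row-number idea used in both queries above can also be sketched outside the database. This is a minimal illustration in Python, assuming the rows have already been fetched as (ID, Number) tuples; the function name mark_duplicates is made up for the example.

```python
from collections import defaultdict

# (ID, Number) rows from TableA in the question.
rows = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 4)]


def mark_duplicates(rows):
    """Label every occurrence of a Number after the first one with '-dup'."""
    seen = defaultdict(int)  # Number -> occurrences so far (plays the role of rn)
    labelled = []
    for row_id, number in sorted(rows):  # process in ID order, like ORDER BY id
        seen[number] += 1
        label = f"{number}-dup" if seen[number] > 1 else str(number)
        labelled.append((row_id, label))
    return labelled


result = mark_duplicates(rows)
for row_id, label in result:
    print(row_id, label)
```

The counter per Number does the same job as rn in the SQL: the first row of each group gets 1 and stays unmarked, every later row gets a higher number and the suffix.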
I made a list of the first (non-duplicate) ID for each Number (the left-joined subquery) and then compared every row against it (the CASE expression):
select
case when a.id <> b.min_id then cast(a.Number as varchar(6)) + '-dup' else cast(a.Number as varchar(6)) end as Number
from table_a a
left join (select MIN(b.id) min_id, Number from table_a b group by b.number) b on b.number = a.number
I did this in MS SQL 2016, hope it works for you. In MySQL, replace the + concatenation with CONCAT() and cast to CHAR instead of varchar.
This populates the table used:
insert into table_a (ID, Number)
select 1,1
union all
select 2,2
union all
select 3,2
union all
select 4,3
union all
select 5,4

MySQL Group where any 3 of 5 columns match

I am searching an addresses table for duplicates, using SOUNDEX to find the duplicates. This works fine, and it requires all 5 SOUNDEX columns to match in order to group.
However, I want to GROUP where ANY 3 of my 5 SOUNDEX columns match.
Here is my current query:
SELECT `Address`.`id`,
SOUNDEX(`Address`.`address_company_name`) as soundex_address_company_name,
SOUNDEX(`Address`.`contact_name`) as soundex_contact_name,
SOUNDEX(`Address`.`street_address`) as soundex_street_address,
SOUNDEX(`Address`.`suburb`) as soundex_suburb,
SOUNDEX(`Address`.`city`) as soundex_city,
`Address`.`address_country_id`,
`Address`.`address_zone_id`,
`Address`.`postcode`,
COUNT(*)
FROM
`addresses` AS `Address`
WHERE
((`Address`.`address_company_name` IS NOT NULL)
OR (`Address`.`contact_name` IS NOT NULL))
GROUP BY
SOUNDEX(address_company_name),
SOUNDEX(contact_name),
SOUNDEX(street_address),
SOUNDEX(suburb),
SOUNDEX(city),
address_country_id,
address_zone_id,
postcode
HAVING
COUNT(*) > 1
I understand how to do this with multiple queries, i.e. loop through each address in our database and then re-query the database for addresses that match any 3 of the 5 columns. However, I am hoping to do this in fewer queries, as the above query executes very quickly.
I also understand that were this possible, some records may be grouped multiple times, I don't mind if this is the case but I am unsure whether this flies in the face of MySQL logic?
You can try something like this
SELECT a.id, b.id id2, COUNT(*) no_matches
FROM
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) a CROSS JOIN
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) b
WHERE a.column_value = b.column_value
AND a.id < b.id
GROUP BY a.id, b.id
HAVING COUNT(*) > 2
Sample output:
| ID | ID2 | NO_MATCHES |
|----|-----|------------|
| 1 | 2 | 4 |
| 4 | 5 | 3 |
Here is SQLFiddle demo
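To make the pair-counting logic easier to follow, here is a minimal sketch in Python. It assumes the five phonetic keys per address have already been computed; the key values and the function name matching_pairs are made up for illustration. Note that this sketch compares keys position by position (company with company, city with city), whereas the query above joins on column_value alone, so it would also count a match between different columns.

```python
from itertools import combinations

# Hypothetical, pre-computed phonetic keys per address id, in the order
# (company, contact, street, suburb, city). The values are invented.
keys = {
    1: ("A250", "J500", "M500", "N000", "A245"),
    2: ("A250", "J500", "M500", "S000", "A245"),
    3: ("B600", "K000", "Q000", "W000", "C000"),
}


def matching_pairs(keys, min_matches=3):
    """Return (id_a, id_b, matches) for every pair sharing enough keys."""
    pairs = []
    for (id_a, key_a), (id_b, key_b) in combinations(sorted(keys.items()), 2):
        # Count the positions where both addresses have the same key.
        matches = sum(1 for x, y in zip(key_a, key_b) if x == y)
        if matches >= min_matches:
            pairs.append((id_a, id_b, matches))
    return pairs


pairs = matching_pairs(keys)
print(pairs)
```

Like the SQL, this looks at every pair once (the equivalent of a.id < b.id), so the work grows quadratically with the number of addresses.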

Count Duplicates with same id passing in one column

Hi there, I'm trying to calculate the row count for the same value,
id,value
1 | a
2 | b
3 | c
4 | d
5 | e
and my query is
select value, count(*) as Count from mytable where id in('4','2','4','1','4') group by value having count(*) > 1
for which my expected output will be,
value,Count
d | 3
b | 1
a | 1
Thanks, any help will be appreciated
Try that:
SELECT value, count(value) AS Count
FROM mytable m
WHERE value = m.value
GROUP BY value
SELECT t.id, t.value, COUNT(t.id)
FROM
test t
JOIN
( SELECT 1 AS id
UNION ALL SELECT 3
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 1
UNION ALL SELECT 1 ) AS tmp
ON t.id = tmp.id
GROUP BY t.id
Sample on sqlfiddle.com
See also: Force MySQL to return duplicates from WHERE IN clause without using JOIN/UNION?
Of course, your IN parameter will be dynamic, and thus you will have to generate the corresponding SQL statement for the tmp table.
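Generating that statement could look roughly like the following sketch in Python. It uses the standard-library sqlite3 module as a stand-in for MySQL, so the placeholder is ?, while most MySQL drivers for Python use %s instead. The function name build_count_query and the column alias cnt are assumptions for the example; the table is the mytable from the question.

```python
import sqlite3


def build_count_query(ids):
    """Build the derived-table join with one bound placeholder per requested id."""
    if not ids:
        raise ValueError("ids must not be empty")
    # One "SELECT ?" per id, so repeated ids produce repeated rows in tmp.
    tmp = " UNION ALL ".join(["SELECT ? AS id"] * len(ids))
    sql = (
        "SELECT t.value, COUNT(*) AS cnt "
        "FROM mytable t "
        f"JOIN ({tmp}) AS tmp ON t.id = tmp.id "
        "GROUP BY t.id, t.value "
        "ORDER BY cnt DESC, t.id DESC"
    )
    return sql, list(ids)


# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
)

sql, params = build_count_query([4, 2, 4, 1, 4])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Only the number of placeholders is interpolated into the SQL text; the ids themselves are passed as bound parameters, which avoids building the statement from raw user input.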
That's the SQL-only way to do it. Another possibility is to have the query like you have it in your question and afterwards programmatically associate the rows to the count passed to the IN parameter.
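The programmatic alternative can be sketched like this in Python, assuming the rows come from the plain IN query (which returns each matching row once) and the requested ids are still available in application code:

```python
from collections import Counter

# The ids as passed to the IN (...) list, duplicates included.
requested_ids = [4, 2, 4, 1, 4]

# Rows as a plain "SELECT id, value FROM mytable WHERE id IN (4, 2, 1)"
# would return them for the sample table: each matching row exactly once.
fetched_rows = [(1, "a"), (2, "b"), (4, "d")]

# Count how often each id was requested, then attach that count to its row.
occurrences = Counter(requested_ids)
counts = [(value, occurrences[row_id]) for row_id, value in fetched_rows]
counts.sort(key=lambda item: item[1], reverse=True)

for value, count in counts:
    print(value, count)
```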

Parse in MySQL using comma as delimiter

I have two tables
Table1 with list of users comma separated
Name UserID
abc A,B,C,D
def A,B,C
Table2
Name UserID
abc A
abc B
abc C
def A
def B
I need to find the users that are in table1 for each Name but not in table2 (There won't ever be an instance when a UserID to Name pair is present in table2 but not in table1 as CSV).
The output should be
Name UserID
abc D
def C
I can do this with PHP but is there a way this can be done through a query? I am not sure where to begin in case I'm doing this as a query. Can I parse in MySQL using comma as delimiter?
I plugged your test data into a test schema in SQLFiddle and ran the query that follows.
Here's the link to SQLFiddle with the test and positive results:
http://sqlfiddle.com/#!2/83dfd/4/0
Here's the query:
SELECT
COALESCE(NORMALIZED_TABLE1.NAME, TABLE2.NAME) AS NAME,
COALESCE(NORMALIZED_TABLE1.USERID, TABLE2.USERID) AS USERID
FROM (
SELECT NAME,
SUBSTRING(
USERID
FROM CASE
WHEN INDEX_TABLE.POS = 1 THEN 1
ELSE INDEX_TABLE.POS + 1
END
FOR CASE LOCATE(',', USERID, INDEX_TABLE.POS + 1)
WHEN 0 THEN CHARACTER_LENGTH(USERID) + 1
ELSE LOCATE(',', USERID, INDEX_TABLE.POS + 1)
END
- CASE
WHEN INDEX_TABLE.POS = 1 THEN 1
ELSE INDEX_TABLE.POS + 1
END
) AS USERID
FROM TABLE1
INNER JOIN (
SELECT @rownum:=@rownum+1 POS
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6
UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) a, (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6
UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) b, (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3
UNION SELECT 4 UNION SELECT 5 UNION SELECT 6
UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) c, (SELECT @rownum:=0) r
) INDEX_TABLE
ON INDEX_TABLE.POS <= CHAR_LENGTH(TABLE1.USERID)
AND (
INDEX_TABLE.POS = 1
OR SUBSTRING(USERID FROM INDEX_TABLE.POS FOR 1) = ','
)
) AS NORMALIZED_TABLE1
LEFT OUTER JOIN TABLE2
ON NORMALIZED_TABLE1.NAME = TABLE2.NAME
AND NORMALIZED_TABLE1.USERID = TABLE2.USERID
WHERE TABLE2.NAME IS NULL;
The "INDEX_TABLE" subquery generates positions 1 to 1000, so if the UserID column in table1 can hold longer strings you will need to expand it. You can copy and paste over it with the code at this link to do so:
http://www.experts-exchange.com/Database/MySQL/A_3573-A-MySQL-Tidbit-Quick-Numbers-Table-Generation.html
If your tables are fixed (that is, you have to work with them as they are), design a class that reads the data from them rather than working with the stored sets directly, and let that object process the data wherever your application needs it. :-D
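For comparison, the application-side approach mentioned in the question (there in PHP) can be sketched in a few lines of Python. This assumes both tables have been read into lists of (Name, UserID) tuples; the function name missing_pairs is made up for the example.

```python
# Rows of Table1 (comma-separated UserID) and Table2 (one UserID per row).
table1 = [("abc", "A,B,C,D"), ("def", "A,B,C")]
table2 = [("abc", "A"), ("abc", "B"), ("abc", "C"), ("def", "A"), ("def", "B")]


def missing_pairs(table1, table2):
    """Return (Name, UserID) pairs listed in table1's CSV but absent from table2."""
    existing = set(table2)
    missing = []
    for name, csv_ids in table1:
        for user_id in csv_ids.split(","):
            user_id = user_id.strip()  # tolerate "A, B" style spacing
            if user_id and (name, user_id) not in existing:
                missing.append((name, user_id))
    return missing


result = missing_pairs(table1, table2)
for name, user_id in result:
    print(name, user_id)
```

This trades the single query for two plain reads plus a set lookup, which may be easier to maintain than the string-splitting SQL when the CSV column cannot be normalized.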

SQL: finding differences between rows

I want to count how many times each user has rows within '5' of each other.
For example, Don - 501 and Don - 504 should be counted, while Don - 501 and Don - 1600 should not be counted.
Start:
Name value
_________ ______________
Don 1235
Don 6012
Don 6014
Don 6300
James 9000
James 9502
James 9600
Sarah 1110
Sarah 1111
Sarah 1112
Sarah 1500
Becca 0500
Becca 0508
Becca 0709
Finish:
Name difference_5
__________ _____________
Don 1
James 0
Sarah 2
Becca 0
Use the ABS() function in conjunction with a self-join in a subquery, something like:
SELECT name, COUNT(*) / 2 AS difference_5
FROM (
SELECT a.name name, ABS(a.value - b.value)
FROM tbl a JOIN tbl b USING(name)
WHERE ABS(a.value - b.value) BETWEEN 1 AND 5
) AS t GROUP BY name
edited as per Andreas' comment.
Assuming that each name -> value pair is unique, this will get you the count of times the value is within 5 per name:
SELECT a.name,
COUNT(b.name) / 2 AS difference_5
FROM tbl a
LEFT JOIN tbl b ON a.name = b.name AND
a.value <> b.value AND
ABS(a.value - b.value) <= 5
GROUP BY a.name
As you'll notice, we also have to exclude the pairs that are equal to themselves.
But if you wanted to count the number of times each name's values came within 5 of any value in the table, you can use:
SELECT a.name,
COUNT(b.name) / 2 AS difference_5
FROM tbl a
LEFT JOIN tbl b ON NOT (a.name = b.name AND a.value = b.value) AND
ABS(a.value - b.value) <= 5
GROUP BY a.name
See the SQLFiddle Demo for both solutions.
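The self-join queries count every pair of values within 5 of each other, which gives 3 for Sarah (1110, 1111 and 1112 form three pairs), while the expected output in the question lists 2. One possible reading is that the asker only wants consecutive values compared. The following sketch in Python computes both interpretations on the sample data; the function names are made up for the example.

```python
from itertools import combinations

rows = [
    ("Don", 1235), ("Don", 6012), ("Don", 6014), ("Don", 6300),
    ("James", 9000), ("James", 9502), ("James", 9600),
    ("Sarah", 1110), ("Sarah", 1111), ("Sarah", 1112), ("Sarah", 1500),
    ("Becca", 500), ("Becca", 508), ("Becca", 709),
]


def group_values(rows):
    """Collect the values per name, keeping the names in first-seen order."""
    by_name = {}
    for name, value in rows:
        by_name.setdefault(name, []).append(value)
    return by_name


def count_all_pairs(rows, max_gap=5):
    """Count every unordered pair within max_gap (what the self-joins compute)."""
    return {
        name: sum(1 for x, y in combinations(values, 2) if abs(x - y) <= max_gap)
        for name, values in group_values(rows).items()
    }


def count_neighbours(rows, max_gap=5):
    """Count only consecutive values (after sorting) within max_gap."""
    result = {}
    for name, values in group_values(rows).items():
        ordered = sorted(values)
        result[name] = sum(
            1 for x, y in zip(ordered, ordered[1:]) if y - x <= max_gap
        )
    return result


all_pairs = count_all_pairs(rows)
neighbours = count_neighbours(rows)
print(all_pairs)
print(neighbours)
```

On this data only Sarah differs between the two readings, so it is worth confirming which one is intended before choosing a query.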
Because the OP also wants the zero counts, we'll need a self left join. Extra logic is needed if one person has two exactly equal values; these should also be counted only once.
WITH cnts AS (
WITH pair AS (
SELECT t1.zname,t1.zvalue
FROM ztable t1
JOIN ztable t2
ON t1.zname = t2.zname
WHERE ( t1.zvalue < t2.zvalue
AND t1.zvalue >= t2.zvalue - 5 )
OR (t1.zvalue = t2.zvalue AND t1.ctid < t2.ctid)
)
SELECT DISTINCT zname
, COUNT(*) AS znumber
FROM pair
GROUP BY zname
)
, names AS (
SELECT distinct zname AS zname
FROM ztable
GROUP BY zname
)
SELECT n.zname
, COALESCE(c.znumber,0) AS znumber
FROM names n
LEFT JOIN cnts c ON n.zname = c.zname
;
RESULT:
DROP SCHEMA
CREATE SCHEMA
SET
CREATE TABLE
INSERT 0 14
zname | znumber
-------+---------
Sarah | 3
Don | 1
Becca | 0
James | 0
(4 rows)
NOTE: sorry for the CTE, I had not seen the mysql tag, I just liked the problem ;-)
SELECT
A.Name,
SUM(CASE WHEN (A.Value < B.Value) AND (A.Value >= B.Value - 5) THEN 1 ELSE 0 END) Difference_5
FROM
tbl A INNER JOIN
tbl B USING(Name)
GROUP BY
A.Name