Join TEXT and VARCHAR columns together in MySQL - mysql

I have the following query and error message:
WITH full_table AS (
SELECT TRANSACTION_ID AS Transaction,USER_ID AS USER, TRANSACTION_TIMESTAMP AS TIMESTAMP, ORDER_LINE_ITEM, TIME_TAKEN, QUANTITY, TIME_STANDARD, transactions_department.DEPARTMENT, (TIME_STANDARD*QUANTITY) AS NORMALIZED_TIME_STANDARD
FROM pivoted_standards
INNER JOIN
transactions_department
ON
pivoted_standards.DEPARTMENT = transactions_department.DEPARTMENT
ORDER BY
USER, TIMESTAMP, ORDER_LINE_ITEM
) SELECT * FROM full_table
Error:
Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='
This error refers to the join condition pivoted_standards.DEPARTMENT = transactions_department.DEPARTMENT. Both of these have the same collation, but one is VARCHAR while the other is TEXT. How can I join them together? Is it possible to do that in this query?

You should be able to work around that by adding COLLATE utf8mb4_general_ci to transactions_department.DEPARTMENT, like this:
WITH full_table AS (
SELECT TRANSACTION_ID AS Transaction,USER_ID AS USER, TRANSACTION_TIMESTAMP AS TIMESTAMP, ORDER_LINE_ITEM, TIME_TAKEN, QUANTITY, TIME_STANDARD, transactions_department.DEPARTMENT, (TIME_STANDARD*QUANTITY) AS NORMALIZED_TIME_STANDARD
FROM pivoted_standards
INNER JOIN
transactions_department
ON
pivoted_standards.DEPARTMENT = transactions_department.DEPARTMENT COLLATE utf8mb4_general_ci
ORDER BY /*here ^^^^^ */
USER, TIMESTAMP, ORDER_LINE_ITEM
) SELECT * FROM full_table;
Here's a fiddle example
However, this is not a recommended long-term solution. There are legitimate reasons for tables to have different collations, so consider adding something like a dept_id column with an INT datatype and joining the tables on that instead.
Example fiddle using INT datatype for join
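A minimal sketch of the integer-key idea, since the linked fiddle is not reproduced here. The departments lookup table and the dept_id columns are assumptions for illustration; they do not exist in the question's schema.

```sql
-- Hypothetical lookup table: one row per department, keyed by an integer.
CREATE TABLE departments (
  dept_id INT NOT NULL PRIMARY KEY,
  name    VARCHAR(100) NOT NULL
);

-- Assumes both tables have been given a dept_id INT column that
-- references departments.dept_id. Integers have no collation, so the
-- comparison cannot raise an "Illegal mix of collations" error.
SELECT td.DEPARTMENT
FROM pivoted_standards AS ps
INNER JOIN transactions_department AS td
  ON ps.dept_id = td.dept_id;
```

The text columns can keep their different collations; they are simply no longer part of the join condition.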

Related

PHPMYADMIN: Access field in EXISTS clause

I am sure it's just a typo, but how do I write the following query correctly in phpMyAdmin?
SELECT DISTINCT `email_address` as tmp1
FROM `already_customer_checks`
WHERE `is_customer` = 0
AND NOT EXISTS (
SELECT *
FROM `already_customer_checks`
WHERE `email_address` = tmp1
AND `is_customer` = 1
)
Error: #1054 - Unknown table field 'tmp1' in where clause
Background: I want to get all e-mail addresses that exist with 'is_customer' = 0 and do not also appear in the table with 'is_customer' = 1.
Thank you very much in advance!
To do it with a subquery you need to put the alias tmp1 on the table, not on the column. And then:
SELECT DISTINCT `email_address`
FROM `already_customer_checks` as tmp1
WHERE `is_customer` = 0
AND NOT EXISTS (
SELECT *
FROM `already_customer_checks`
WHERE `email_address` = tmp1.`email_address`
AND `is_customer` = 1
)
You might also consider the approach proposed in the comment by @kmoser, which could be more efficient, if less clear. According to the MySQL docs:
A LEFT [OUTER] JOIN can be faster than an equivalent subquery because the server might be able to optimize it better—a fact that is not specific to MySQL Server alone.
But if you use the SQL proposed by @kmoser, you probably don't want to alias the email_address column as tmp1.
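The comment itself is not reproduced here, so the following is only a sketch of what a LEFT JOIN version of the same anti-join could look like. The aliases a and b are illustrative.

```sql
-- Keep the is_customer = 0 rows for which no is_customer = 1 row
-- with the same e-mail address exists.
SELECT DISTINCT a.`email_address`
FROM `already_customer_checks` AS a
LEFT JOIN `already_customer_checks` AS b
       ON b.`email_address` = a.`email_address`
      AND b.`is_customer` = 1
WHERE a.`is_customer` = 0
  AND b.`email_address` IS NULL;  -- NULL here means the join found no match
```

The is_customer = 1 test has to sit in the ON clause rather than in WHERE; moving it to WHERE would discard the unmatched rows the query is looking for.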

Can we use FIND_IN_SET() function for multiple column in same table

NOTE: I tried many SF solutions, but none worked for me. This is a bit challenging for me; any help will be appreciated.
Below is my SQL-Fiddle link : http://sqlfiddle.com/#!9/6daa20/9
I have tables below:
CREATE TABLE `tbl_pay_chat` (
nId int(11) NOT NULL AUTO_INCREMENT,
npayid int(11) NOT NULL,
nSender int(11) NOT NULL,
nTos varchar(255) binary DEFAULT NULL,
nCcs varchar(255) binary DEFAULT NULL,
sMailBody varchar(500) binary DEFAULT NULL,
PRIMARY KEY (nId)
)
ENGINE = INNODB,
CHARACTER SET utf8,
COLLATE utf8_bin;
INSERT INTO tbl_pay_chat
(nId,npayid,nSender,nTos,nCcs,sMailBody)
VALUES
(0,1,66,'3,10','98,133,10053','Hi this test maail'),
(0,1,66,'3,10','98,133,10053','test mail received');
_____________________________________________________________
CREATE TABLE `tbl_emp` (
empid int(11) NOT NULL,
fullname varchar(45) NOT NULL,
PRIMARY KEY (empid)
)
ENGINE = INNODB,
CHARACTER SET utf8,
COLLATE utf8_bin;
INSERT INTO `tbl_emp` (empid,fullname)
VALUES
(3, 'Rio'),
(10, 'Christ'),
(66, 'Jack'),
(98, 'Jude'),
(133, 'Mike'),
(10053, 'James');
What I want:
JOIN the two tables above to show fullname values in the nTos and nCcs columns.
Also, I want the total COUNT() of rows.
What I tried is the query below, but each fullname appears multiple times in the nTos and nCcs columns. Please also suggest how to get the correct row count.
SELECT a.nId, a.npayid, e1.fullname AS nSender, sMailBody, GROUP_CONCAT(b.fullname ORDER BY b.empid)
AS nTos, GROUP_CONCAT(e.fullname ORDER BY e.empid) AS nCcs
FROM tbl_pay_chat a
INNER JOIN tbl_emp b
ON FIND_IN_SET(b.empid, a.nTos) > 0
INNER JOIN tbl_emp e
ON FIND_IN_SET(e.empid, a.nCcs) > 0
JOIN tbl_emp e1
ON e1.empid = a.nSender
GROUP BY a.nId ORDER BY a.nId DESC;
I hope I made my point clear. Please help.
You have a horrible data model. You should not be storing lists of ids in strings. Why? Here are some reasons:
Numbers should be stored as numbers not strings.
Relationships between tables should be declared using foreign key relationships.
SQL has pretty poor string manipulation capabilities.
The use of functions and type conversion in ON often prevents the use of indexes.
No doubt there are other good reasons. Your data model should be using properly declared junction tables for the n-m relationships.
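As a rough sketch of what such a junction table might look like for this schema: the table name tbl_pay_chat_recipient and the sKind column are assumptions for illustration, not part of the original design.

```sql
-- Hypothetical junction table: one row per (message, recipient, kind).
CREATE TABLE tbl_pay_chat_recipient (
  nChatId int NOT NULL,
  empid   int NOT NULL,
  sKind   enum('to', 'cc') NOT NULL,
  PRIMARY KEY (nChatId, empid, sKind),
  FOREIGN KEY (nChatId) REFERENCES tbl_pay_chat (nId),
  FOREIGN KEY (empid)   REFERENCES tbl_emp (empid)
) ENGINE = INNODB;

-- Each recipient row joins to exactly one employee, so names are not
-- multiplied. GROUP_CONCAT skips NULLs, which lets one pass build both lists.
SELECT pc.nId,
       GROUP_CONCAT(CASE WHEN r.sKind = 'to' THEN e.fullname END
                    ORDER BY e.empid) AS nTos,
       GROUP_CONCAT(CASE WHEN r.sKind = 'cc' THEN e.fullname END
                    ORDER BY e.empid) AS nCcs
FROM tbl_pay_chat pc
JOIN tbl_pay_chat_recipient r ON r.nChatId = pc.nId
JOIN tbl_emp e ON e.empid = r.empid
GROUP BY pc.nId;
```

With this layout the joins are plain integer equalities, so the primary key and foreign key indexes can be used.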
That said, sometimes we are stuck with other people's really, really, really, really bad design decisions. There are some ways around this. I think the query that you want can be expressed as:
SELECT pc.nId, pc.npayid, s_e.fullname AS nSender, pc.sMailBody,
GROUP_CONCAT(DISTINCT to_e.fullname ORDER BY to_e.empid)
AS nTos,
GROUP_CONCAT(DISTINCT cc_e.fullname ORDER BY cc_e.empid) AS nCcs
FROM tbl_pay_chat pc INNER JOIN
tbl_emp to_e
ON FIND_IN_SET(to_e.empid, pc.nTos) > 0 INNER JOIN
tbl_emp cc_e
ON FIND_IN_SET(cc_e.empid, pc.nCcs) > 0 JOIN
tbl_emp s_e
ON s_e.empid = pc.nSender
GROUP BY pc.nId
ORDER BY pc.nId DESC;
Here is a db<>fiddle.

Mysql deduplicate records in single query

I have the following table:
CREATE TABLE `relations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`relationcode` varchar(25) DEFAULT NULL,
`email_address` varchar(100) DEFAULT NULL,
`firstname` varchar(100) DEFAULT NULL,
`latname` varchar(100) DEFAULT NULL,
`last_contact_date` varchar(25) DEFAULT NULL,
PRIMARY KEY (`id`)
)
This table contains duplicates: relations with exactly the same relationcode and email_address. They can be in there twice or even 10 times.
I need a query that selects the ids of all records but excludes the ones that are in there more than once. Of those duplicated records, I would only like to select the one with the most recent last_contact_date.
I'm more into Oracle than MySQL. In Oracle I would be able to do it this way:
select * from (
select row_number () over (partition by relationcode order by to_date(last_contact_date,'dd-mm-yyyy')) rank,
id,
relationcode,
email_address ,
last_contact_date
from RELATIONS)
where rank = 1
But I can't figure out how to modify this query to work in MySQL. I'm not even sure it's possible to do the same thing in a single query in MySQL.
Any ideas?
The normal way to do this is to use a sub query to get the latest record and then join that against the table:
SELECT RELATIONS.id, RELATIONS.relationcode, RELATIONS.email_address, RELATIONS.firstname, RELATIONS.latname, RELATIONS.last_contact_date
FROM RELATIONS
INNER JOIN
(
SELECT relationcode, email_address, MAX(last_contact_date) AS latest_contact_date
FROM RELATIONS
GROUP BY relationcode, email_address
) Sub1
ON RELATIONS.relationcode = Sub1.relationcode
AND RELATIONS.email_address = Sub1.email_address
AND RELATIONS.last_contact_date = Sub1.latest_contact_date
It is possible to manually generate the kind of rank that your Oracle query uses using variables. Bit messy though!
SELECT id, relationcode, email_address, firstname, latname, last_contact_date
FROM
(
SELECT id, relationcode, email_address, firstname, latname, last_contact_date, @seq:=IF(@relationcode = relationcode AND @email_address = email_address, @seq + 1, 1) AS seq, @relationcode := relationcode, @email_address := email_address
FROM
(
SELECT id, relationcode, email_address, firstname, latname, last_contact_date
FROM RELATIONS
CROSS JOIN (SELECT @seq:=0, @relationcode := '', @email_address :='') Sub1
ORDER BY relationcode, email_address, last_contact_date DESC
) Sub2
) Sub3
WHERE seq = 1
This uses a sub query to initialise the variables. The sequence number is incremented if the relation code and email address are the same as on the previous row; if not, it is reset to 1. The result is stored in a field. The outer select then checks the sequence number (as a field, not as the variable) and returns only the records where it is 1.
Note that I have done this as multiple sub queries, partly to make it clearer to you, but also to try to force the order in which MySQL executes it. There are a couple of possible issues with how MySQL says it may order the execution of things that could cause a problem. They never have for me, but with sub queries I would hope to force the order.
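On MySQL 8.0 or later, window functions are available, so the Oracle-style ranking can be written directly. The sketch below assumes last_contact_date holds strings in dd-mm-yyyy form (as the to_date call in the question suggests) and breaks ties on id, which is an arbitrary choice.

```sql
-- Requires MySQL 8.0+ for ROW_NUMBER().
SELECT id, relationcode, email_address, last_contact_date
FROM (
    SELECT id, relationcode, email_address, last_contact_date,
           ROW_NUMBER() OVER (
               PARTITION BY relationcode, email_address
               -- Parse the string so rows sort by date, newest first.
               ORDER BY STR_TO_DATE(last_contact_date, '%d-%m-%Y') DESC,
                        id DESC
           ) AS seq
    FROM relations
) AS ranked   -- MySQL requires an alias on a derived table
WHERE seq = 1;
```

Rows that are not duplicated get seq = 1 as well, so they are returned alongside the most recent row of each duplicated group.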
Here is a method that will work in both MySQL and Oracle. It rephrases the question as: Get me all rows from relations where the relationcode has no larger last_contact_date.
It works something like this:
select r.*
from relations r
where not exists (select 1
from relations r2
where r2.relationcode = r.relationcode and
r2.last_contact_date > r.last_contact_date
);
With the appropriate indexes, this should be pretty efficient in both databases.
Note: This assumes that last_contact_date is stored as a date, not as a string (as in your table example). Storing dates as strings is just a really bad idea, and you should fix your data structure.

Simple MySQL query runs very slow

I found some strange (for me) behaviour in MySQL. I have a simple query:
SELECT CONVERT( `text`.`old_text`
USING utf8 ) AS stext
FROM `text`
WHERE `text`.`old_id` IN
(
SELECT `revision`.`rev_text_id`
FROM `revision`
WHERE `revision`.`rev_id`
IN
(
SELECT `page_latest`
FROM `page`
WHERE `page_id` = 108
)
)
When I run it, phpMyAdmin shows an execution time of 77.0446 seconds.
But when I replace
WHERE `text`.`old_id` IN
by
WHERE `text`.`old_id` =
its execution time falls to about 0.001 sec. The result of this query
SELECT `revision`.`rev_text_id`
FROM `revision`
WHERE `revision`.`rev_id`
IN
(
SELECT `page_latest`
FROM `page`
WHERE `page_id` = 108
)
is
+------------+
|rev_text_id |
+------------+
|6506 |
+------------+
Can somebody please explain this behaviour?
Try adding an index on the following columns:
ALTER TABLE `text` ADD INDEX idx_text (old_id);
ALTER TABLE `revision` ADD INDEX idx_revision (rev_text_id);
and then execute the following query:
SELECT DISTINCT CONVERT(a.`old_text` USING utf8 ) AS stext
FROM `text` a
INNER JOIN `revision` b
ON a.`old_id` = b.`rev_text_id`
INNER JOIN `page` c
ON b.`rev_id` = c.`page_latest`
WHERE c.`page_id` = 108
PS: Can you also run the following statements and post their respective results?
DESC `text`;
DESC `revision`;
DESC `page`;
There are two primary ways you can increase your query performance here:
Add indexes (as Kuya mentioned)
Rid yourself of the subqueries where possible
For indexes, add an index on each column you are matching on:
text.old_id, revision.rev_text_id & page.page_id
ALTER TABLE `text` ADD INDEX idx_text (old_id);
ALTER TABLE `revision` ADD INDEX idx_revision (rev_text_id);
ALTER TABLE `page` ADD INDEX idx_page (page_id);
Your next issue is that nested sub-selects are hell on your query execution plan. Here is a good thread discussing JOIN vs Subquery. Here is an article on how to get execution plan info from MySQL.
A first look at an execution plan can be confusing, but it will be your best friend when you have to concern yourself with query optimization.
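As a small illustration, the plan for the original query can be requested by prefixing it with EXPLAIN. The output depends on the MySQL version, the indexes and the data, so none is shown here.

```sql
-- Ask MySQL how it intends to execute the slow query.
EXPLAIN
SELECT CONVERT(`text`.`old_text` USING utf8) AS stext
FROM `text`
WHERE `text`.`old_id` IN (
    SELECT `revision`.`rev_text_id`
    FROM `revision`
    WHERE `revision`.`rev_id` IN (
        SELECT `page_latest`
        FROM `page`
        WHERE `page_id` = 108
    )
);
```

In the result, the select_type column shows whether a subquery is treated as a DEPENDENT SUBQUERY (re-evaluated per outer row), while type, key and rows indicate the access method, the index chosen and the estimated number of rows examined for each table.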
Here is an example of your same query with just joins (you could use inner or left and get pretty much the same result). I don't have your tables or data, so forgive syntax issues (there is no way I can verify the code works verbatim in your environment, but it should give you a good starting point).
SELECT
CONVERT( `text`.`old_text` USING utf8 ) AS stext
FROM `text`
-- inner join only returns rows when it can find a
-- matching `revision`.`rev_text_id` row to `text`.`old_id`
INNER JOIN `revision`
ON `text`.`old_id` = `revision`.`rev_text_id`
-- inner join only returns rows when it can find a
-- matching `page`.`page_latest` row to `revision`.`rev_id`
INNER JOIN `page`
ON `revision`.`rev_id` = `page`.`page_latest`
WHERE `page`.`page_id` = 108
MySQL is looping through each result of the inner query and comparing it with each record in the outer query.
In the second inner query,
WHERE `revision`.`rev_id`
IN
( SELECT `page_latest`
FROM `page`
WHERE `page_id` = 108 )
you should definitely use '=' instead of IN. Since you're selecting a single record, there is no point in looping through a result set when you know only one record will be returned each time.
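A sketch of that suggestion applied to the whole query. It assumes each subquery returns exactly one row, as the single rev_text_id result in the question suggests; MySQL raises an error if a subquery compared with '=' returns more than one row.

```sql
SELECT CONVERT(`text`.`old_text` USING utf8) AS stext
FROM `text`
WHERE `text`.`old_id` = (
    -- Scalar subquery: must yield at most one row.
    SELECT `revision`.`rev_text_id`
    FROM `revision`
    WHERE `revision`.`rev_id` = (
        SELECT `page_latest`
        FROM `page`
        WHERE `page_id` = 108
    )
);
```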

MySQL: improve performance by creating a new table

The code below works, but it takes a long time to run. Sometimes the PHP script stops before it finishes, although the query works fine in a MySQL graphical tool (because it can wait). Is there any way to make it faster? I already use indexes on some columns, and that helps. Would it help to create a new table from this result? I think it would improve performance, but that table would have to be updated maybe once a week. What is good practice here? Please advise.
SELECT member.own, member.provincecode, province.PROVINCE_NAME, member.amphurecode,
amphur.AMPHUR_NAME, member.Sname, member.Ssurname, member.Hno, member.Moo,
member.Sex, member.tambol, member.dateofbirth, member.migratedate,
Year( Current_Date( ) ) - Year( member.dateofbirth ) AS y,
Year( Current_Date( ) ) - Year( member.migratedate ) AS m
FROM member
LEFT JOIN amphur ON ( member.amphurecode
COLLATE utf8_general_ci = amphur.AMPHUR_CODE )
LEFT JOIN province ON member.provincecode
COLLATE utf8_general_ci = province.PROVINCE_CODE
Collate is an expensive operation. So, assuming that member has more rows than province, try applying COLLATE to the table with the smaller number of rows:
LEFT JOIN province ON member.provincecode
= province.PROVINCE_CODE COLLATE <collation of member.provincecode>
It would be even better to give all columns in all tables the same collation.
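A sketch of how that could be checked and fixed. The target collation utf8_general_ci is an assumption, inferred from the COLLATE clauses in the question's query; verify the actual collations first and take a backup before altering a table.

```sql
-- List the character set and collation of every text column involved.
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('member', 'amphur', 'province')
  AND COLLATION_NAME IS NOT NULL;

-- Assumed fix: convert member to the collation the other tables use.
-- CONVERT TO changes every character column in the table, not just the
-- join columns, and rebuilds the table.
ALTER TABLE member
  CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
```

Once the join columns share a collation, the COLLATE clauses can be dropped from the query and both sides of each join can be compared directly.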