mysql join if string contains similar values - mysql

Assuming I have a table that has column Description with below values:
His name was Jacob King
One of the guy was Jacob. He was taller than them
How do I join these two rows (in MySQL), since they both contain the word Jacob? There will be more rows with other words too, so Jacob is not the only word that can appear in more than one row. What I want is a way of joining rows that have similar words in them.
I tried using a LEFT JOIN with the LIKE keyword as shown below, but it didn't work, since I am only looking for a similar word within the sentences:
select * from (SELECT id,description FROM `statement`) f1
left JOIN (SELECT id,description FROM `statement`) f2
on f1.description like concat('%' ,f2.description,'%')
The above doesn't work, I think because I am looking for a word as opposed to the entire sentence

You can try using a join
SELECT concat(s1.description, ' ', s2.description)
FROM `statement` s1
INNER JOIN `statement` s2 ON s1.description like ('%Jacob%')
AND s2.description like ('%Jacob%')
or using an input
set @my_word = 'jacob';
SELECT concat(s1.description, ' ', s2.description)
FROM `statement` s1
INNER JOIN `statement` s2 ON lower(s1.description) like concat('%', @my_word, '%')
AND lower(s2.description) like concat('%', @my_word, '%')
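If the word is not known in advance, one possible generalisation is to keep the candidate words in a table and join through it. This is only a sketch: the search_words table and its word column are hypothetical and not part of the question, and the s1.id < s2.id condition is an added assumption so that a row is not paired with itself or reported twice.

```sql
-- Hypothetical helper table holding the words to look for (not in the original question).
CREATE TABLE search_words (word VARCHAR(50) NOT NULL);
INSERT INTO search_words (word) VALUES ('Jacob'), ('King');

-- Pair every two distinct rows of `statement` that contain the same word.
SELECT w.word,
       s1.id AS first_id,
       s2.id AS second_id,
       CONCAT(s1.description, ' ', s2.description) AS joined_description
FROM search_words w
INNER JOIN `statement` s1 ON s1.description LIKE CONCAT('%', w.word, '%')
INNER JOIN `statement` s2 ON s2.description LIKE CONCAT('%', w.word, '%')
                         AND s1.id < s2.id;  -- assumption: skip self-pairs and mirrored pairs
```

Keep in mind that LIKE '%word%' matches substrings, so 'King' would also match 'Kingston', and whether the match is case-sensitive depends on the column's collation.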

Related

Complex SQL substr comparison for dups

Slightly complex problem here that I'd like to solve in SQL:
I have duplicate person records like these:
There are many examples like this where the name was misspelled, so my inbound ETL code didn't detect them as duplicates.
I have a deduping workflow that culls suspected duplicates with the same first/last names and lets the user collapse them. The query for this page is below:
SELECT * FROM
((address
INNER JOIN ((person AS a
INNER JOIN (SELECT
idperson, last_name, first_name, middle, suffix
FROM
person
GROUP BY last_name , first_name
HAVING Count(*) > 1) AS b ON (a.first_name = b.first_name)
AND (a.last_name = b.last_name))
INNER JOIN constituent ON a.constituent_idconstituent = constituent.idconstituent
INNER JOIN constituent_address ON a.constituent_idconstituent = constituent_address.constituent_idconstituent) ON address.idaddress = constituent_address.address_idaddress)
INNER JOIN city ON address.city_idcity = city.idcity)
INNER JOIN
state ON city.state_idstates = state.idstates
WHERE
a.last_name = 'Cascarano'
ORDER BY a.last_name , a.first_name , a.middle , address.line_1 ASC
However, my example above, where the first names are spelled differently, isn't caught by this query.
Is there a substring or some other SQL trick I can apply here, to somehow maybe chop up the first_name field and look for, maybe 75% letter match? I know I'm reaching...
Thanks!!!!!
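One built-in option that might help here is MySQL's SOUNDEX() function, which compares names by an approximation of how they sound rather than by exact spelling. The query below is a minimal sketch and not a percentage letter match: it reuses the person table and the idperson, first_name and last_name columns from the query above, and the a.idperson < b.idperson condition is an added assumption to avoid self-pairs and mirrored pairs.

```sql
-- Suspected duplicates: same last name, first names that sound alike.
SELECT a.idperson,
       a.first_name,
       b.idperson   AS dup_idperson,
       b.first_name AS dup_first_name,
       a.last_name
FROM person AS a
INNER JOIN person AS b
        ON a.last_name = b.last_name
       AND SOUNDEX(a.first_name) = SOUNDEX(b.first_name)
       AND a.idperson < b.idperson   -- assumption: report each pair once
ORDER BY a.last_name, a.first_name;
```

SOUNDEX() is designed around English pronunciation, so it will miss some misspellings and flag some names that are genuinely different; the result is best treated as a list of candidates for the user to review.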

Working LIKE statement with % on column names but rows repeated

My SELECT query with a LIKE condition works, but I am surprised that the rows it fetches are repeated, and I don't know why.
SELECT *
FROM questions, counts
WHERE counts.test_coursecode LIKE '%' || questions.coursecode || '%'
You must include a join condition between the two tables (here q.id and c.fk stand in for the actual key columns)
SELECT *
FROM questions q inner join counts c on q.id = c.fk
WHERE c.test_coursecode LIKE CONCAT('%', q.coursecode, '%')
or
SELECT *
FROM questions q , counts c
where q.id = c.fk
and c.test_coursecode LIKE CONCAT('%', q.coursecode, '%')
I'm assuming you want to use
SELECT *
FROM questions, counts
WHERE counts.test_coursecode LIKE CONCAT('%', questions.coursecode, '%')
instead.
In MySQL's SQL dialect, || is by default the logical OR, not the concatenation operator.
Your query matches every row because the first expression is true for any non-null value. It evaluates as
SELECT *
FROM questions, counts
WHERE counts.test_coursecode LIKE '%' -- that's true if test_coursecode is not null
OR questions.coursecode
OR '%'
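The difference is easy to check in a MySQL client. The statements below are a small sketch; 'CS101' is a made-up course code used only for illustration.

```sql
-- CONCAT builds the LIKE pattern as intended.
SELECT CONCAT('%', 'CS101', '%');   -- '%CS101%'

-- With the default sql_mode, || is a logical OR: each string is cast to a
-- number (0 here), so the whole expression is 0.
SELECT '%' || 'CS101' || '%';       -- 0

-- PIPES_AS_CONCAT makes || behave as string concatenation.
-- Note: assigning sql_mode like this replaces the session's other modes.
SET SESSION sql_mode = 'PIPES_AS_CONCAT';
SELECT '%' || 'CS101' || '%';       -- '%CS101%'
```

Using CONCAT() keeps the query independent of the server's sql_mode, which is why it is usually the safer choice in MySQL.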

In MySQL, How can get the intersection of the 9 select results?

I am writing an SQL query and have hit a barrier: it contains a great many SELECT statements. In the end I want to get the intersection of 9 SELECT results.
My SQL code is shown below, just the first SELECT statement. The other 8 SELECT statements differ only in the search word, e.g. cholera, diarrhea, fever, vomit, nausea, etc.
Here is the first SELECT statement. Don't be surprised: the code is simple and repetitive.
(SELECT code_co.code, code_co.disease_co, code_en.disease_en
FROM code_co
LEFT JOIN code_en ON code_en.code = code_co.code
LEFT JOIN note ON note.code = code_co.code
LEFT JOIN inclusion ON inclusion.code = code_co.code
LEFT JOIN exclusion ON exclusion.code = code_co.code
LEFT JOIN ds ON code_co.code = ds.code
LEFT JOIN tx ON code_co.code = tx.code
LEFT JOIN sx ON code_co.code = sx.code
WHERE
note LIKE CONCAT( '%', (
SELECT ds_word.ds_en
FROM ds_word
WHERE ds_co LIKE '%cholera%'
LIMIT 0 , 1
), '%' )
or
ds_content LIKE CONCAT( '%', (
SELECT ds_word.ds_en
FROM ds_word
WHERE ds_co LIKE '%cholera%'
LIMIT 0 , 1
), '%' )
...
inclusion LIKE CONCAT( '%', (
SELECT ds_word.ds_en
FROM ds_word
WHERE ds_co LIKE '%cholera%'
LIMIT 0 , 1
), '%' )
)
Below is a screenshot from phpMyAdmin.
The code really works!
The 2nd SELECT statement is the same as the first except for the search word, cholera.
In this way, I have 9 SELECT statements.
I want to get their intersection, but how can I do that in MySQL?
INTERSECT or MINUS can only be used with 2 statements. (Right?)
(1st select sentence)
intersect
(2nd select sentence)
intersect
(3rd select sentence)
...
Is this the right way?
Please help me.
Thank you for your advice
You do the "intersect" by using AND in the WHERE clause. Using OR is equivalent to a "union".
Also, you can simplify your expression by doing:
LEFT JOIN sx ON code_co.code = sx.code
CROSS JOIN (SELECT concat('%', ds_word.ds_en, '%') as pattern
FROM ds_word
WHERE ds_co LIKE '%cholera%'
LIMIT 0 , 1
) const
WHERE note LIKE const.pattern and
ds_content like const.pattern and
. . .
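If the nine SELECT statements have to stay separate, another way to emulate an intersection is to inner join them as derived tables on the shared key. The sketch below is simplified to three search words and to the note table only; it assumes that note has a code column and a text column also named note, as the query in the question suggests, and it is not the full query from the question.

```sql
-- Codes whose note text matches all three search words.
SELECT q1.code
FROM       (SELECT DISTINCT code FROM note WHERE note LIKE '%cholera%') AS q1
INNER JOIN (SELECT DISTINCT code FROM note WHERE note LIKE '%fever%')   AS q2 USING (code)
INNER JOIN (SELECT DISTINCT code FROM note WHERE note LIKE '%vomit%')   AS q3 USING (code);
```

MySQL 8.0.31 and later also support the INTERSECT operator directly, and it can be chained across more than two SELECT statements; older versions need a rewrite like the ones shown here.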

mysql GROUP_CONCAT DISTINCT multiple columns

I have a tags field for blog posts. Tags have a unique id, but their displayName might be duplicated. What I want is a query that selects posts and, in an all_tags field, returns pairs of (id,displayName) in this way:
id1,name1;id2,name2;id3,name3
My query looks like:
select ....
CONCAT_WS(';', DISTINCT (CONCAT_WS(',',tags.id,tags.displayName))) AS all_tags
Join ...post content ...
Join ...post_tags ...
Join ...tags ...
ORDER BY posts.id
This line causes the problem:
CONCAT_WS(';', DISTINCT (CONCAT_WS(',',tags.id,tags.displayName))) AS all_tags
How should I modify it?
Some people use an inner (SELECT .. FROM), but from what I have heard, it is very inefficient.
SELECT `posts`.*,`categories`.*,`creators`.*,`editors`.*
CONCAT_WS(';', DISTINCT GROUP_CONCAT(CONCAT_WS(',',tags.id,tags.displayName))) AS all_ids
FROM (`posts`)
LEFT JOIN `languages` ON `posts`.`language_id`=`languages`.`id`
LEFT JOIN `users` as creators ON `posts`.`creatorUser_id`=`creators`.`id`
LEFT JOIN `users` as editors ON `posts`.`lastEditorUser_id`=`editors`.`id`
LEFT JOIN `userProfiles` as editors_profile ON `editors`.`profile_id`=`editors_profile`.`id`
LEFT JOIN `categories` ON `posts`.`category_id`=`categories`.`id`
LEFT JOIN `postTags` ON `postTags`.`post_id`=`posts`.`id`
LEFT JOIN `tags` ON `postTags`.`tag_id`=`tags`.`id`
LEFT JOIN `postTags` as `nodetag_checks` ON `nodetag_checks`.`post_id`=`posts`.`id`
LEFT JOIN `tags` as `tag_checks` ON `nodetag_checks`.`tag_id`=`tag_checks`.`id`
WHERE ( 9 IN(`tag_checks`.`id`,`tag_checks`.`cached_parents`) OR 10 IN(`tag_checks`.`id`,`tag_checks`.`cached_parents`) OR 11 IN(`tag_checks`.`id`,`tag_checks`.`cached_parents`))
GROUP BY `posts`.`id` ORDER BY `posts`.`created` desc LIMIT 0, 20
Try this:
GROUP_CONCAT(
DISTINCT CONCAT(tags.id,',',tags.displayName)
ORDER BY posts.id
SEPARATOR ';'
)
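Placed in a complete statement, the expression could look like the sketch below. It is cut down to the three tables needed for the tags (posts, postTags and tags, with the column names from the question) and leaves out the other joins and the WHERE clause.

```sql
-- One row per post, with its distinct tags as "id,name;id,name;...".
SELECT posts.id,
       GROUP_CONCAT(
           DISTINCT CONCAT_WS(',', tags.id, tags.displayName)
           SEPARATOR ';'
       ) AS all_tags
FROM posts
LEFT JOIN postTags ON postTags.post_id = posts.id
LEFT JOIN tags     ON tags.id = postTags.tag_id
GROUP BY posts.id;
```

The result of GROUP_CONCAT is truncated at group_concat_max_len, which defaults to 1024 bytes, so posts with many tags may need that session variable raised.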
As advised by @Willa, I am adding my comment as an answer:
GROUP_CONCAT allows you to concatenate multiple fields:
GROUP_CONCAT(tags.id, ',', tags.displayName)
The only difference from Stephan's answer arises when your code allows the same tag to be assigned several times to one post, or when your JOIN sequence selects the same record in the tag table more than once. In those cases, my solution will return the same tag multiple times.
On top of @Stephan's great answer: to prevent the same content from showing up multiple times because of multiple JOINs in your query, when you don't want the id to show in the output, you can use:
GROUP_CONCAT(
DISTINCT
tags.displayName,
'||__', tags.id, '__||'
SEPARATOR '\n'
)
Then loop over the result at the end, removing everything between ||__ and __|| (delimiters included).
This example is for PHP:
$data = preg_replace("/\|\|__.*__\|\|/", '', $data);

return results of same ID from 2 tables

I'm using an open-source database, so its setup is a bit over my head.
It's basically like this:
A person's normal information is in the table 'person_per'.
There is custom information in the table 'person_custom'.
Both use 'per_ID' to organize.
select per_ID from person_custom where c3 like '2';
gives me the IDs of the people who fit my search. I want to "join" (I think) their name, phone, etc. from the 'person_per' table, using the ID as the "key" (terms I read that seem to fit).
How can I do that in a single query?
select per.*
from person_per per
inner join person_custom cus on cus.per_id = per.per_id
where cus.c3 = 2
You can retrieve columns from both tables with a single query:
SELECT p.name
, p.phone
, p.ect
, c.custom_col
FROM person_per p
JOIN person_custom c
ON c.per_ID = p.per_ID
WHERE c.c3 LIKE '2'
Use a JOIN operator between the table names, and include the "matching" criteria (predicate) in the ON clause.