I am using this MySQL query to fetch data from the database:
SELECT DISTINCT CONCAT( streetObj.street_type, ' ',streetObj.street_name, ', ', neighborhoodObj.name , ', ', cityObj.name, ', ', stateObj.abbreviation ) namet
FROM street streetObj
LEFT
JOIN cep cepObj1
ON cepObj1.street_id = streetObj.street_id
LEFT
JOIN neighborhood neighborhoodObj
ON neighborhoodObj.neighborhood_id = cepObj1.start_neighborhood_id
LEFT
JOIN city cityObj
ON streetObj.city_id = cityObj.city_id
LEFT
JOIN state stateObj
ON stateObj.state_id = cityObj.state_id
WHERE CONCAT(streetObj.street_type,streetObj.street_name) LIKE '%rua%'
AND CONCAT(streetObj.street_type,streetObj.street_name) LIKE '%Gomes%'
AND CONCAT(streetObj.street_type,streetObj.street_name) LIKE '%de%'
AND CONCAT(streetObj.street_type,streetObj.street_name) like '%Ca%'
AND cityObj.city_id = '9668'
ORDER
BY namet ASC
LIMIT 10;
This query runs when I type
rua Gomes de Ca
and it returns:
Rua Baltazar Gomes de Alarcão, Jardim Miriam, São ...
Rua Cabo José Gomes de Barros, Conjunto Habitacion...
Rua Cabo Luís Gomes de Quevedo, Parque Novo Mundo,...
Rua Gomes de Carvalho, Vila Olímpia, São Paulo, SP
Rua João Gomes de Mendonça, Jaraguá, São Paulo, SP
Rua João Gomes de Mendonça, Jardim Taipas, São Pau...
Rua Pedro Gomes de Camargo, Vila Rio Branco, São P...
As you can see, I want the results that match the search phrase exactly to appear first, but that is not happening.
For this query I want
Rua Gomes de Carvalho, Vila Olímpia, São Paulo, SP
in the top position.
You need to rank the results by the strength of the match, and sort by that. You will have to define the sort yourself. For example:
select ..
from...
ORDER BY
case
when text like "%all my search phrase%" then 1
when text like "%all my%" then 2
when text like "%search phrase%" then 2
when text like "%phrase%" then 3
else 1000 end
ASC -- rank 1 is the strongest match, so sort ascending
or
ORDER BY
case when text like "%word%" then 1 else 0 end
+
case when text like "%second_word%" then 1 else 0 end
+
.....
DESC
Specifically for your example
select namet from
(select 'Rua Baltazar Gomes de Alarcão, Jardim Miriam, São ...' as namet
union all select 'Rua Cabo José Gomes de Barros, Conjunto Habitacion...'
union all select 'Rua Cabo Luís Gomes de Quevedo, Parque Novo Mundo,...'
union all select 'Rua Gomes de Carvalho, Vila Olímpia, São Paulo, SP'
union all select 'Rua João Gomes de Mendonça, Jaraguá, São Paulo, SP'
union all select 'Rua João Gomes de Mendonça, Jardim Taipas, São Pau...'
union all select 'Rua Pedro Gomes de Camargo, Vila Rio Branco, São P...')tbl
order by
case when namet like "%rua gomes de ca%" then 100 else 0 end+ #high score for full match
case when namet like "%rua%" then 1 else 0 end+ #lower score for partial matches
case when namet like "%Gomes%" then 1 else 0 end+
case when namet like "%de%" then 1 else 0 end+
case when namet like "%ca%" then 1 else 0 end desc LIMIT 10
You will probably want to write something that splits your search phrase into words, searches for every word, and ranks on the number of words matched. You could also look into SOUNDEX or Levenshtein distance for ranking by similarity. Doing this in SQL, though, is harder than doing it programmatically.
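As a rough illustration of the programmatic route, here is a minimal Python sketch that splits the phrase into words, gives one point per word found and a large bonus when the whole phrase appears. The function name rank_matches and the bonus of 100 are assumptions made for this example (the bonus mirrors the score used in the ORDER BY above); the street list is the result set from the question.

```python
def rank_matches(phrase, candidates):
    """Order candidates by how strongly they match the search phrase."""
    phrase_l = phrase.casefold()
    words = phrase_l.split()

    def score(candidate):
        text = candidate.casefold()
        total = sum(1 for word in words if word in text)  # one point per word found
        if phrase_l in text:
            total += 100  # large bonus when the whole phrase matches
        return total

    # sorted() is stable, so candidates with equal scores keep their input order
    return sorted(candidates, key=score, reverse=True)


streets = [
    "Rua Baltazar Gomes de Alarcão, Jardim Miriam, São ...",
    "Rua Cabo José Gomes de Barros, Conjunto Habitacion...",
    "Rua Cabo Luís Gomes de Quevedo, Parque Novo Mundo,...",
    "Rua Gomes de Carvalho, Vila Olímpia, São Paulo, SP",
    "Rua João Gomes de Mendonça, Jaraguá, São Paulo, SP",
    "Rua João Gomes de Mendonça, Jardim Taipas, São Pau...",
    "Rua Pedro Gomes de Camargo, Vila Rio Branco, São P...",
]

ranked = rank_matches("rua Gomes de Ca", streets)
print(ranked[0])
```

Because the sort is stable, feeding it rows that are already in alphabetical order keeps that order among equally scored results.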
I am making a join with two tables, tab_usuarios (users) and tab_enderecos (address).
tab_usuarios structure:
id_usuario  nome           usuario
1           Administrador  admin
2           Novo Usuário   teste
3           Joao Silva     jao
tab_enderecos structure:
id_endereco  id_usuario  cidade    uf
2            1           cidade    SP
20           2           Lorena    SP
22           2           Lorena    SP
24           3           Campinas  SP
28           4           Lorena    SP
I have this simple query which brings me the following result:
Select
u.id_usuario,
u.usuario,
u.nome,
e.id_endereco,
e.cidade,
e.uf
From
tab_usuarios u Left Join
tab_enderecos e On u.id_usuario = e.id_usuario
id_usuario  usuario  nome           id_endereco  cidade    uf
1           admin    Administrador  2            cidade    SP
2           user 2   Novo Usuário   22           Lorena    SP
2           user 2   Novo Usuário   20           Lorena    SP
3           jao      Joao Silva     24           Campinas  SP
4           teste    fabio          28           Lorena    SP
What I want is, for example, for id_usuario = 2, to return only id_endereco = 20, which is the first address that was inserted into the database.
I tried MIN and a couple of other approaches.
This should do it, assuming you have MySQL 8.0 or later rather than an older 5.x version, since window functions such as ROW_NUMBER() require 8.0:
SELECT *
FROM (
SELECT u.id_usuario, u.usuario, u.nome, e.id_endereco, e.cidade, e.uf,
row_number() over (partition by u.id_usuario order by e.id_endereco) rn
FROM tab_usuarios u
LEFT JOIN tab_enderecos e On u.id_usuario = e.id_usuario
) t
WHERE rn = 1
See it work here:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=c506baf8157f82390bb335d074e7614c
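For servers without window functions, one possible alternative is to join each user only to the address with the smallest id_endereco. The sketch below is not part of the answer above; it runs the query through Python's sqlite3 module purely as a stand-in engine so it can be tried without a MySQL server, and it relies on the same assumption as the answer, namely that the lowest id_endereco is the first address inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab_usuarios (id_usuario INTEGER, nome TEXT, usuario TEXT);
    CREATE TABLE tab_enderecos (id_endereco INTEGER, id_usuario INTEGER,
                                cidade TEXT, uf TEXT);
    INSERT INTO tab_usuarios VALUES
        (1, 'Administrador', 'admin'),
        (2, 'Novo Usuário', 'teste'),
        (3, 'Joao Silva', 'jao');
    INSERT INTO tab_enderecos VALUES
        (2, 1, 'cidade', 'SP'), (20, 2, 'Lorena', 'SP'), (22, 2, 'Lorena', 'SP'),
        (24, 3, 'Campinas', 'SP'), (28, 4, 'Lorena', 'SP');
""")

# Join each user only to the address row holding that user's smallest id_endereco.
first_address_sql = """
    SELECT u.id_usuario, u.usuario, u.nome, e.id_endereco, e.cidade, e.uf
    FROM tab_usuarios u
    LEFT JOIN tab_enderecos e
           ON e.id_usuario = u.id_usuario
          AND e.id_endereco = (SELECT MIN(e2.id_endereco)
                               FROM tab_enderecos e2
                               WHERE e2.id_usuario = u.id_usuario)
    ORDER BY u.id_usuario
"""
rows = conn.execute(first_address_sql).fetchall()
for row in rows:
    print(row)
```

With the sample rows from the question, user 2 is paired with address 20 only.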
I am trying to find the words that appear most often across different articles.
For example :
Table : Articles
Id Article
1 <b>Une santé digitale au plus près des besoins des patients et des soignants ? Direction Medidays </b> <u> <br> </u><br/>Paris, le mercredi 29 mai 2019 – Si l’on en croit l’ensemble des programmes de santé publique et tous les projets publics et privés dédiés à l’organisation des soins, les outils digitaux seront demain incontournables pour faciliter la pratique des professionnels de santé et améliorer le quotidien des patients. Pourtant, aujourd’hui, un nombre non négligeable des outils qui ont déjà été développés ne se différencient guère de gadgets au pire ou ne présentent pas de valeur ajoutée fondamentale par rapport aux systèmes classiques au mieux. <br/><b>Quarante-huit heures d’effervescence</b><br/>Inclure les professionnels de santé et les représentants de patients dans la conception des projets digitaux est sans doute la voie à suivre pour corriger cet écueil. Aussi, étaient-ils des participants de premier plan lors des Medidays, premier hackaton e-santé organisé par l’Assistance publique – hôpitaux de Paris (AP)-(HP) et Doctolib le week-end dernier. Pendant quarante-huit heures, dans une belle effervescence, vingt-deux équipes comptant des professionnels de santé, des cadres de santé, des patients, des développeurs, des designers ou encore des graphistes ont travaillé sans relâche pour présenter à un jury de spécialistes des projets innovants mais également adaptés à la pratique quotidienne. <br/><b>De la dépression du post partum au coaching des infirmières hospitalières</b><br/>Cinq programmes sur les trente-cinq présentés ont retenu l’attention. Ils ont tous en commun de promouvoir une amélioration directe de la prise en charge des patients ou de la vie pratique des professionnels de santé. Ainsi, « <i>Docteur Simone</i> » est une application proposée par Anne-Charlotte Dimmy pour améliorer la prévention de la dépression post-partum. 
« Chat marche » imaginée par Flavien Quijoux promet grâce à un système de reconnaissance d’image de lutter contre la chute des personnes âgées. Quant à « <i>Post hop</i> », coup de cœur de l’AP-HP présentée par Romain Laurent, elle est dédiée à la rééducation améliorée après chirurgie. <br/>Du côté de l’amélioration de la vie pratique des professionnels de santé, deux applications ont été saluées : Supply Med, dessinée par Rubin Soudry, une marketplace digitale dédiée aux fournitures médicales dentaires et Coach My Nurse, programme de coaching destiné aux infirmières hospitalières produite par Martin Louvel. L’ensemble de ces applications bénéficieront de soutiens technologiques afin d’assurer leur développement. « <i>Nous sommes heureux de pouvoir faire émerger et d’accompagner des projets qui permettront, demain, de contribuer à la transformation du système de santé. En 48 heures, des premières solutions extrêmement prometteuses ont émergé. C’est la preuve que lorsque plusieurs acteurs de la santé se mettent en commun pour réfléchir au futur de la santé en France, des projets utiles et innovants peuvent voir le jour. Chez Doctolib, nous sommes très fiers d’avoir rendu cela possible</i> », a observé Stanislas Niox-Chateau, co-fondateur et président de Doctolib et membre du jury de la 1ère édition de Medidays. <br/> <b>Léa Crébat </b> </p>
I use this SQL query:
select DISTINCT val, cnt as result from(
select (substring_index(substring_index(t.article, ' ', n.n), ' ', -1)) val,count(*) as cnt
from articles t cross join(
select a.n + b.n * 10 + 1 n
from
(select 0 as n union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9) a,
(select 0 as n union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9) b
order by n
) n
where n.n <= 1 + (length(t.article) - length(replace(t.article, ' ', '')))
AND (substring_index(substring_index(t.article, ' ', n.n), ' ', -1)) NOT REGEXP '^[0-9]+$'
AND (substring_index(substring_index(t.article, ' ', n.n), ' ', -1)) > ''
group by val
order by cnt desc
) as x
ORDER BY `result` DESC LIMIT 5
For the moment I can get :
val result
des 8
de 4
et 4
santé 3
au 2
But I think there is a problem because if I search by hand in the article, I see that "des" appears 26 times, "de" appears 34 times, "et" appears 11 times, "santé" 12 times and "au" appears 5 times.
How can I get the exact number of times each word appears in the text?
You are only counting among the first 100 words. You can extend this to 1000:
(select a.n + b.n * 10 + c.n * 100 + 1 as n
from (select 0 as n union all select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
) a cross join
(select 0 as n union all select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
) b cross join
(select 0 as n union all select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
) c
) n
I have to use REPLACE() to remove some special characters
SQL DEMO
SELECT val, count(*)
FROM (
SELECT
DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(message, ' ', n.digit + m.digit*10 + o.digit*100 + p.digit*1000 +1), ' ', -1) val,
n.digit + m.digit*10 + o.digit*100 + p.digit*1000 as word
FROM
(SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(message, '<b>', ' '), '</p>', ' '), '</b>', ' '), '<br/>', ''), '<br>', ''), ',', ' '), '<i>', ' '), '</i>', ' '), '.', ' '), '<u>', ' '), '</u>', ' ') as message
FROM Table1
) as Table1
CROSS JOIN (SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) n
CROSS JOIN (SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) m
CROSS JOIN (SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) o
CROSS JOIN (SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) p
-- keep only word positions that exist: position index <= number of spaces in the message
ON LENGTH(REPLACE(message, ' ' , '')) <= LENGTH(message) - (n.digit + m.digit*10 + o.digit*100 + p.digit*1000)
) as T
WHERE val <> ' '
GROUP BY val
ORDER BY COUNT(*) DESC
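If the counting can happen outside the database, the same idea (strip the markup, split into words, count) is much shorter in application code. This is a minimal Python sketch, not a translation of the query above: count_words is a hypothetical helper, and it treats any run of letters as a word, so elided forms such as l’on are split in two and totals may differ slightly from a split on spaces.

```python
import re
from collections import Counter


def count_words(html_text):
    """Count case-insensitive word frequencies in a fragment of HTML."""
    text = re.sub(r"<[^>]+>", " ", html_text)          # replace tags with spaces
    words = re.findall(r"[^\W\d_]+", text.lower())     # runs of letters only
    return Counter(words)


sample = ("<b>Une santé digitale</b> au plus près des besoins "
          "des patients et des soignants ?<br/>")
counts = count_words(sample)
print(counts.most_common(3))
```

Counter.most_common(n) returns the n most frequent words, which plays the role of ORDER BY COUNT(*) DESC LIMIT n.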
I have the following query:
SELECT * FROM (
SELECT codigo, protocolo, status, nome
FROM protocolo
GROUP BY protocolo.protocolo
UNION ALL
SELECT codigo, protocolo, status, nome
FROM simulador
) tabela
It returns:
codigo protocolo status nome
559 2016000026 1 ALESSANDRO CAMPOS BONIFACIO
0 2016000026 0 ALESSANDRO CAMPOS BONIFACIO
0 2016000008 0 MARIA DE JESUS F. DA SILVA ***
0 2016000007 0 MARGARIDA BORGES DA SILVA
558 2016000008 1 MARIA DE JESUS F. DA SILVA ***
556 2015014035 1 MARIA DALVA DA SILVA
There are two rows with the same protocolo (2016000008) but different status (0 and 1). I want to display only one of the repeated rows, the one that has status = 1.
Is this what you want?
SELECT codigo, protocolo, MAX(status) as stat, nome
FROM (
SELECT codigo, protocolo, status, nome
FROM protocolo
GROUP BY protocolo.protocolo
UNION ALL
SELECT codigo, protocolo, status, nome
FROM simulador
) tabela
GROUP BY codigo, protocolo, nome ;
Note: In a GROUP BY query, all columns in the SELECT should be either in the GROUP BY or in aggregation functions, unless you really, really know what you are doing.
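One thing to watch for: in the sample output the two rows for a repeated protocolo carry different codigo values (0 and 558), so a grouping that includes codigo would likely still return both rows. A possible variant, sketched here under stated assumptions, keeps for each protocolo only the row whose status equals the highest status for that protocolo. The table tabela is a hypothetical stand-in for the UNION ALL derived table, loaded with the rows shown in the question, and Python's sqlite3 is used only so the sketch can be run without a MySQL server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'tabela' stands in for the UNION ALL result shown in the question
    CREATE TABLE tabela (codigo INTEGER, protocolo TEXT, status INTEGER, nome TEXT);
    INSERT INTO tabela VALUES
        (559, '2016000026', 1, 'ALESSANDRO CAMPOS BONIFACIO'),
        (0,   '2016000026', 0, 'ALESSANDRO CAMPOS BONIFACIO'),
        (0,   '2016000008', 0, 'MARIA DE JESUS F. DA SILVA ***'),
        (0,   '2016000007', 0, 'MARGARIDA BORGES DA SILVA'),
        (558, '2016000008', 1, 'MARIA DE JESUS F. DA SILVA ***'),
        (556, '2015014035', 1, 'MARIA DALVA DA SILVA');
""")

# Keep the row whose status is the maximum status seen for its protocolo.
best_status_sql = """
    SELECT t.codigo, t.protocolo, t.status, t.nome
    FROM tabela t
    WHERE t.status = (SELECT MAX(x.status)
                      FROM tabela x
                      WHERE x.protocolo = t.protocolo)
    ORDER BY t.protocolo
"""
rows = conn.execute(best_status_sql).fetchall()
for row in rows:
    print(row)
```

A protocolo that only exists with status = 0 (2016000007 in the sample) is still returned, and a repeated protocolo is reduced to its status = 1 row. If two rows shared the same highest status, both would come back.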
I have a table of documents where each document has a doc_id, but for the same case_id on the same date there may be two different language versions:
doc_id case_id date lang
001-89259 1012/02 2008-11-04 FRA
001-144945 10122/04 2014-06-19 ENG
001-57558 10126/82 1988-06-21 ENG
001-62116 10126/82 1988-06-21 FRA
001-91708 10129/04 2009-03-10 FRA
001-116955 10131/11 2013-03-07 FRA
001-102676 10143/07 2011-01-11 FRA
001-104520 10145/07 2011-04-12 FRA
001-72756 10162/02 2006-03-09 FRA
001-72757 10162/02 2006-03-09 ENG
001-82198 10163/02 2007-09-06 ENG
001-57555 10208/82 1988-05-26 ENG
001-62113 10208/82 1988-05-26 FRA
What I want to do is select the English version, if available, per case_id and date, and otherwise keep the French one. My output would then look like:
doc_id case_id date lang
001-89259 1012/02 2008-11-04 FRA
001-144945 10122/04 2014-06-19 ENG
001-57558 10126/82 1988-06-21 ENG -- keep only the english version
001-91708 10129/04 2009-03-10 FRA
001-116955 10131/11 2013-03-07 FRA
001-102676 10143/07 2011-01-11 FRA
001-104520 10145/07 2011-04-12 FRA
001-72757 10162/02 2006-03-09 ENG -- keep only the english version
001-82198 10163/02 2007-09-06 ENG
001-57555 10208/82 1988-05-26 ENG -- keep only the english version
How can I do it with MySQL?
UPDATE:
All answers give the correct result, but I marked Görkem's as correct because, in my opinion, it is the most elegant and the most straightforward about why it works.
I initially accepted Görkem's answer, but it returned one wrong result, which Strawberry pointed out. That leaves Strawberry's answer as the most elegant and correct one.
SELECT DISTINCT COALESCE(e.doc_id,f.doc_id) doc_id
, f.case_id
, f.date
, COALESCE(e.lang,f.lang) lang
FROM my_table f
LEFT
JOIN my_table e
ON e.case_id = f.case_id
AND e.date = f.date
AND e.lang = 'ENG';
SELECT
sorted.doc_id,
sorted.case_id,
sorted.date,
sorted.lang
FROM (
SELECT
doc_id,
case_id,
date,
lang
FROM tbl
ORDER BY FIELD(lang, 'ENG', 'FRA')
) sorted
GROUP BY sorted.case_id
If this SQL is required for some research, there is a way to get the expected result set:
-- table is a reserved word in MySQL, so the table is called tbl here
SELECT SUBSTRING_INDEX(GROUP_CONCAT(doc_id ORDER BY lang), ',', 1) doc_id,
case_id,
date,
SUBSTRING_INDEX(GROUP_CONCAT(lang ORDER BY lang), ',', 1) lang
FROM tbl
GROUP BY case_id, date
SELECT
doc_id,
case_id,
date,
lang,
max(case lang when 'ENG' then 1 else 0 end)
FROM tbl
GROUP BY case_id
I have a table with the records of different defects in a company. The table is something like this
ITMNBR Defect Reference_Designator RepairCenter
8800RTO001700 Componente / Placa abierto U1U2 FG
8800HIB001075V Componente Equivocado (NumeroParte) R53 SB
8800HIB001075V Ensamble Incorrecto (produccion) R19 SB
8800RTO000400 Componente / Placa abierto U1 SB
8800RTO003200 Componente Polaridad Inversa ZD2 SB
8800HIB001048 NO SOLDADURA T1 SB
8800HIB001048 Componente / Placa abierto U2 SB
8800HIB001048 Componente / Placa abierto U2 SB
Etc.
I want to query only the three most frequent manufacturing defects, so I wrote this:
SELECT defect, COUNT(*) FROM reportefallas WHERE RepairCenter ='SB'
AND (CREADT BETWEEN NOW() - INTERVAL 7 DAY AND NOW()) #Select the Dates
AND (Defect IN ('Componente / Placa dañada X alto voltaje','Pin / Patita Quebrado','Componente / Placa Quemada','Componente Defecto Cosmetico','Falla no Duplicada','Soldadura Crackeada','Soldadura Fria','Parametros Incorrectos en la torre','Parametros Incorrectos en el dibujo','Componente dañado fisicamente','Conector mal colocado (inclinado)','Tornillo / Rondana Suelto','Pista Levantada (dañada)','Componente Ausente','Soldadura Derretida','Componente Equivocado (NumeroParte)','NO SOLDADURA','Componente/Placa no programada','Conector mal ensamblado','No se encontro problema','Tornillo / Rondana Flojo','Componente / Placa abierto','Pin Hole','Pin / Pata levantada (no Soldadura)','Componente Polaridad Inversa','Puente de Soldadura','Componente Desfasado Pad','Componente / Placa en corto','Splash de Soldadura','LEDs con VF diferente / equivocado','LEDs con VF alto','LEDs con VF bajo','Ensamble Incorrecto (produccion)','Componente posicion Equivocada (referencia)','Cable ensamblado posicion incorrecta'))
GROUP BY defect
ORDER BY COUNT(*) DESC
LIMIT 3;
And I get the following result:
Defect COUNT(*)
Componente/ Placa abierto 5
Componente / Placa dañada X alto voltaje 4
Componente dañado fisicamente 3
Now I need a query on the same table that returns the individual rows for only the three most frequent defects I already obtained. This is the result I want:
ITMNBR Defect Reference_Designator
8800ITH001700 Componente / Placa abierto F2-U1(SHORT)-U2(SHORT)
8800ITH001700 Componente / Placa abierto F2-U1(SHORT)-U2(SHORT)
8800ITH001700 Componente / Placa abierto F2-R29-R22-R19-R32-R13-U1(SHORT)-U2(SHORT)
8800ITH001700 Componente / Placa abierto F2-R29-R22-R19-R32-R13-U1(SHORT)-U2(SHORT)
8800ITH001700 Componente / Placa abierto F2
8850HZL0015EX Componente / Placa dañada X alto voltaje C6-C7
8800HIB001084 Componente / Placa dañada X alto voltaje R7-C20-MOV1
8850HIB004205 Componente / Placa dañada X alto voltaje C21-C42
8800HIB004220 Componente / Placa dañada X alto voltaje R22 SWITH-R44 SWITH
8850HIB004206 Componente dañado fisicamente C42
8850HIB004202 Componente dañado fisicamente F1
8800HIB0131EX Componente dañado fisicamente R37
I tried the code below, but it doesn’t accept the LIMIT.
SELECT ITMNBR, Defect, Reference_Designator FROM reportefallas
WHERE Defect IN (SELECT defect FROM reportefallas WHERE RepairCenter='SB'
AND(CREADT BETWEEN NOW() - INTERVAL 7 DAY AND NOW()) AND (Defect IN ('Componente / Placa dañada X alto voltaje','Pin / Patita Quebrado','Componente / Placa Quemada','Componente Defecto Cosmetico','Falla no Duplicada','Soldadura Crackeada','Soldadura Fria','Parametros Incorrectos en la torre','Parametros Incorrectos en el dibujo','Componente dañado fisicamente','Conector mal colocado (inclinado)','Tornillo / Rondana Suelto','Pista Levantada (dañada)','Componente Ausente','Soldadura Derretida','Componente Equivocado (NumeroParte)','NO SOLDADURA','Componente/Placa no programada','Conector mal ensamblado','No se encontro problema','Tornillo / Rondana Flojo','Componente / Placa abierto','Pin Hole','Pin / Pata levantada (no Soldadura)','Componente Polaridad Inversa','Puente de Soldadura','Componente Desfasado Pad','Componente / Placa en corto','Splash de Soldadura','LEDs con VF diferente / equivocado','LEDs con VF alto','LEDs con VF bajo','Ensamble Incorrecto (produccion)','Componente posicion Equivocada (referencia)','Cable ensamblado posicion incorrecta'))
GROUP BY defect
ORDER BY COUNT(*) DESC
LIMIT 3)
Does anyone have any ideas?
Sorry for the Spanglish and the bad English; I hope you can understand.
There are several options. Previous questions on this topic have suggested using JOIN to trim down your result set instead of IN, which would look something like this:
SELECT rf.ITMNBR, rf.Defect, rf.Reference_Designator
FROM (SELECT defect FROM reportefallas WHERE RepairCenter='SB'
AND(CREADT BETWEEN NOW() - INTERVAL 7 DAY AND NOW()) AND (Defect IN ('Componente / Placa dañada X alto voltaje','Pin / Patita Quebrado','Componente / Placa Quemada','Componente Defecto Cosmetico','Falla no Duplicada','Soldadura Crackeada','Soldadura Fria','Parametros Incorrectos en la torre','Parametros Incorrectos en el dibujo','Componente dañado fisicamente','Conector mal colocado (inclinado)','Tornillo / Rondana Suelto','Pista Levantada (dañada)','Componente Ausente','Soldadura Derretida','Componente Equivocado (NumeroParte)','NO SOLDADURA','Componente/Placa no programada','Conector mal ensamblado','No se encontro problema','Tornillo / Rondana Flojo','Componente / Placa abierto','Pin Hole','Pin / Pata levantada (no Soldadura)','Componente Polaridad Inversa','Puente de Soldadura','Componente Desfasado Pad','Componente / Placa en corto','Splash de Soldadura','LEDs con VF diferente / equivocado','LEDs con VF alto','LEDs con VF bajo','Ensamble Incorrecto (produccion)','Componente posicion Equivocada (referencia)','Cable ensamblado posicion incorrecta'))
GROUP BY defect
ORDER BY COUNT(*) DESC # sort in descending order
LIMIT 3) AS subquery
JOIN
reportefallas AS rf USING (Defect)
Alternatively, you could create a separate table to track the three most common defects, and periodically update that table (e.g. via a cron job). Then you would SELECT ... WHERE Defect IN this other table.
Either of these methods could provide better performance, depending on the situation. If you try one and have poor performance, try the other and see if it's an improvement.
(For that matter, you could also store that enormous list of defects in another table, to make your query cleaner.)
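To see the JOIN form in isolation, here is a reduced sketch that runs it through Python's sqlite3 module as a stand-in engine. It uses only the eight sample rows from the question and, as simplifying assumptions, leaves out the CREADT date filter and the long defect list; the extra Defect sort key is added only so that ties between equally frequent defects are resolved the same way on every run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reportefallas (ITMNBR TEXT, Defect TEXT,
                                Reference_Designator TEXT, RepairCenter TEXT);
    INSERT INTO reportefallas VALUES
        ('8800RTO001700',  'Componente / Placa abierto',          'U1U2', 'FG'),
        ('8800HIB001075V', 'Componente Equivocado (NumeroParte)', 'R53',  'SB'),
        ('8800HIB001075V', 'Ensamble Incorrecto (produccion)',    'R19',  'SB'),
        ('8800RTO000400',  'Componente / Placa abierto',          'U1',   'SB'),
        ('8800RTO003200',  'Componente Polaridad Inversa',        'ZD2',  'SB'),
        ('8800HIB001048',  'NO SOLDADURA',                        'T1',   'SB'),
        ('8800HIB001048',  'Componente / Placa abierto',          'U2',   'SB'),
        ('8800HIB001048',  'Componente / Placa abierto',          'U2',   'SB');
""")

# The LIMIT lives in a derived table, which is then joined back to the detail rows.
top_defect_rows_sql = """
    SELECT rf.ITMNBR, rf.Defect, rf.Reference_Designator
    FROM (SELECT Defect
          FROM reportefallas
          WHERE RepairCenter = 'SB'
          GROUP BY Defect
          ORDER BY COUNT(*) DESC, Defect
          LIMIT 3) AS top_defects
    JOIN reportefallas AS rf USING (Defect)
"""
rows = conn.execute(top_defect_rows_sql).fetchall()
for row in rows:
    print(row)
```

As in the query above, the outer reportefallas is not filtered by RepairCenter, so detail rows from other repair centers are returned too whenever they share one of the top defects.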
Just like AirThomas said, you can use a subquery. You should also be able to do a simple SELECT inside your IN() instead of listing each defect individually. Here is another way to write the subquery, though:
SELECT rf.ITMNBR, rf.Defect, rf.Reference_Designator
FROM(
SELECT ITMNBR as itm_number, defect, COUNT(*) as top_three FROM reportefallas WHERE RepairCenter ='SB'
AND (CREADT BETWEEN NOW() - INTERVAL 7 DAY AND NOW()) -- Select the Dates
AND (Defect IN (SELECT defect from reportefallas))
GROUP BY defect
ORDER BY top_three DESC
LIMIT 3
)as t
JOIN reportefallas rf ON rf.Defect = t.defect -- join on the defect so every row with a top-three defect is returned