I'm loading text files into my db and trying to do some quick matching between a table that lists names of organizations, and a table that holds the text file and potential matches to those organizations.
I load the file using LOAD INFILE CONCURRENT and don't have any problems with that.
The twist comes from the fact that the field I'm trying to match in the raw text table (occupationoraffiliation) has more than just organization names in it. So I'm trying to use LIKE with wildcards to match the strings.
To match the text, I'm trying to use this query:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE CONCAT('%',o.org_name,'%')
SET raw_faca.org_id = o.org_id;
I've also tried without CONCAT:
UPDATE raw_faca JOIN orgs AS o
ON raw_faca.org_id IS NULL AND raw_faca.occupationoraffiliation LIKE ('%' + o.org_name + '%')
SET raw_faca.org_id = o.org_id;
The raw_faca table has ~40,000 rows and the orgs table has ~ 20,000 rows. I have indexes on all the The query has been running for a couple of hours or so -- this seems like way too long for the operation. Is the comparison I'm trying to run just that inefficient or am I doing something spectacularly stupid here? I was hoping to avoid going line-by-line with an external php or python script.
In response to comments below about using MATCH ... AGAINST, I've tried the following query as well:
UPDATE raw_faca JOIN orgs AS o ON raw_faca.org_id IS NULL AND MATCH(raw_faca.occupationoraffiliation) AGAINST (o.org_name IN NATURAL LANGUAGE MODE)
SET raw_faca.org_id = o.org_id;
And it's giving me this error:
incorrect arguments to AGAINST
Any thoughts?
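A LIKE pattern with a leading wildcard cannot take advantage of any index, so every row of raw_faca has to be compared against every row of orgs: roughly 40,000 x 20,000 = 800 million string comparisons. Note also that in MySQL + is arithmetic addition, not string concatenation, so the second query does not build the pattern you intend.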
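To see the effect on a small scale, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. That substitution is an assumption: the two planners differ in detail, but both can only use an ordinary index for a LIKE pattern that starts with a fixed prefix. The index name idx_org_name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOCASE collation lets SQLite's case-insensitive LIKE use the index.
conn.execute(
    "CREATE TABLE orgs (org_id INTEGER PRIMARY KEY, org_name TEXT COLLATE NOCASE)"
)
conn.execute("CREATE INDEX idx_org_name ON orgs (org_name)")


def query_plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


prefix_plan = query_plan("SELECT org_name FROM orgs WHERE org_name LIKE 'Acme%'")
leading_plan = query_plan("SELECT org_name FROM orgs WHERE org_name LIKE '%Acme%'")

print(prefix_plan)   # expected to describe an index SEARCH
print(leading_plan)  # expected to describe a SCAN of every entry
```

In the join from the question the pattern comes from another column rather than from a literal, so no prefix is known in advance either way; the sketch only illustrates why the leading % rules out an index lookup.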
I would like to select the rows in my table (I'm using Google Sheets for that purpose) whose content is included in the input string.
For example, rows included in table called Jobportal, column Test:
How to find work
Work permit
Jobs
Temporary jobs
I want to select all the rows that contain any word of my input, so if I write "i'm looking for a job", I need to select rows Jobs and Temporary jobs. If I write "where is my work?", I need to select How to find work and Work permit.
I've tried this query, but it's returning wrong/unexpected results.
select * from Jobportal where 'im looking for a job' LIKE CONCAT('%',Test,'%');
You can use regular expressions. Assuming that what the user types does not have special characters:
where test regexp replace('im looking for a job', ' ', '|')
That said, for performance you might want to consider using full text search capabilities.
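The same idea can be sketched outside the database. The following is a rough Python illustration, not the answer's exact method: it splits the input into words, escapes each one so that characters such as ? are not treated as regex operators, and joins them with |. The function name rows_matching_any_word is hypothetical.

```python
import re


def rows_matching_any_word(user_input, rows):
    """Return the rows that contain at least one word of user_input."""
    # Keep letters and apostrophes only, so "work?" becomes "work".
    words = re.findall(r"[A-Za-z']+", user_input)
    if not words:
        return []
    # re.escape guards against characters that are special in a regex.
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    return [row for row in rows if pattern.search(row)]


rows = ["How to find work", "Work permit", "Jobs", "Temporary jobs"]
print(rows_matching_any_word("i'm looking for a job", rows))
print(rows_matching_any_word("where is my work?", rows))
```

Because this is substring matching, a very short word such as "a" matches any row that contains that letter, so filtering out short or common words may be worth considering.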
I'm trying to get results when both tables have the same machine number and there are entries that have the same number in both tables.
Here is what I've tried:
SELECT fehler.*,
'maschine.Maschinen-Typ',
maschine.Auftragsnummer,
maschine.Kunde,
maschine.Liefertermin_Soll
FROM fehler
JOIN maschine
ON ltrim(rtrim('maschine.Maschinen-Nr')) = ltrim(rtrim(fehler.Maschinen_Nr))
The field I'm joining on is a varchar in both cases. I tried without the trims, but it still returns an empty result.
I'm using MariaDB (if that's important).
ON ltrim(rtrim('maschine.Maschinen-Nr')) = ltrim(rtrim(fehler.Maschinen_Nr)) seems wrong...
Is fehler.Maschinen_Nr really the string 'maschine.Maschinen-Nr'?
SELECT fehler.*, maschine.`Maschinen-Typ`, maschine.Auftragsnummer, maschine.Kunde, maschine.Liefertermin_Soll
FROM fehler
JOIN maschine
ON ltrim(rtrim(maschine.`Maschinen-Nr`)) = ltrim(rtrim(fehler.Maschinen_Nr))
The last line of your query compared the column to a string literal instead of to the other column. This should do it.
Also, use backticks to delimit column names that need it, and put them around the column name only, not around the table qualifier.
The single quotes are string delimiters. You are comparing fehler.Maschinen_Nr with the string 'maschine.Maschinen-Nr'. In standard SQL you would use double quotes for names (MariaDB allows this too when the ANSI_QUOTES SQL mode is enabled). In MariaDB the commonly used identifier delimiter is the backtick, placed around the column name itself rather than around the table-qualified name:
SELECT fehler.*,
maschine.`Maschinen-Typ`,
maschine.Auftragsnummer,
maschine.Kunde,
maschine.Liefertermin_Soll
FROM fehler
JOIN maschine
ON trim(maschine.`Maschinen-Nr`) = trim(fehler.Maschinen_Nr)
(It would be better of course not to use names with a minus sign or other characters that force you to use name delimiters in the first place.)
As you see, you can use TRIM instead of LTRIM and RTRIM. It would be better, though, not to allow space at the beginning or end when inserting data. Then you wouldn't have to remove them in every query.
Moreover, it seems Maschinen_Nr should be the primary key of table maschine and, naturally, a foreign key in table fehler. That would ensure that fehler cannot contain any Maschinen_Nr that does not exist in exactly that form in maschine.
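The difference between a quoted string and a delimited identifier can be reproduced in a few lines. This is a minimal sketch using Python's sqlite3 module in place of MariaDB (an assumption; SQLite also accepts backticks as identifier delimiters). The sample rows and the Beschreibung column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE maschine (`Maschinen-Nr` TEXT, Kunde TEXT);
    CREATE TABLE fehler (Maschinen_Nr TEXT, Beschreibung TEXT);
    INSERT INTO maschine VALUES (' M100 ', 'Kunde A'), ('M200', 'Kunde B');
    INSERT INTO fehler VALUES ('M100', 'Fehler 1'), ('M300', 'Fehler 2');
""")

# Single quotes: the join compares against the literal text 'maschine.Maschinen-Nr'.
quoted_as_string = conn.execute("""
    SELECT COUNT(*) FROM fehler JOIN maschine
    ON trim('maschine.Maschinen-Nr') = trim(fehler.Maschinen_Nr)
""").fetchone()[0]

# Backticks around the column name only: the join compares the two columns.
quoted_as_identifier = conn.execute("""
    SELECT COUNT(*) FROM fehler JOIN maschine
    ON trim(maschine.`Maschinen-Nr`) = trim(fehler.Maschinen_Nr)
""").fetchone()[0]

print(quoted_as_string, quoted_as_identifier)
```

With this sample data the first query finds no pairs, because no Maschinen_Nr equals the literal text, while the second finds the one machine number present in both tables once the padding is trimmed.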
To avoid these problems in future, note that the usual convention for database identifiers is snake case (lowercase_lowercase).
Besides that, posting your DB schema would be really helpful, since I can't guess your data structures.
(For developer-friendly code, it is useful to write variable, table and column names in English.)
So, with that said, what error do you get? If table "maschine" has a column named "Maschinen-Nr", table "fehler" has a column named "Maschinen_Nr", and the values match each other, the query should be correct.
Be careful with Maschinen-Nr and Maschinen_Nr. Do they have - and _ on purpose?
A very blind solution, because you don't really say what your problem is or what your schema looks like:
SELECT table1Alias.*, table2Alias.column_name, table2Alias.column_name
FROM table1 [table1Alias]
JOIN table2 [table2Alias]
ON ltrim(rtrim(table1Alias.matching_column)) = ltrim(rtrim(table2Alias.matching_column))
where the matching columns are the PK and FK respectively, or at least hold matching data. The aliases in [] are optional; if they are omitted, the table name is used.
I have set up a job to run reports that uses multiple tables with joins. I am joining two tables on a string field, and if the field contains an apostrophe, it does not return any matches. This is weird, and I am not sure why it is happening now when it never did before. I am perhaps not identifying the exact cause, but I would appreciate any help here:
Example query: "today's deals"
SET @TITLE = (SELECT MAX(B.DATEADDED) as 'td','',
(C.CLIENT + CHAR(10) + B.CLIENTKEY) as 'td','',
B.BADQ as 'td','',A.FULLQ as 'td','', B.BADERROR as 'td',
''
FROM BADQUERY AS B
LEFT JOIN QDATA AS A ON B.BADQ = A.QUERYT
LEFT JOIN Clients AS C ON C.clientKey = B.clientKey
WHERE DATEDIFF(minute,CAST(B.DATEADDED as datetime),GETDATE())<=420 AND
DAY(GETDATE()) = DAY(B.DATEADDED)
GROUP BY B.BADT,A.FULLQ, B.CLIENTKEY,C.CLIENT, B.BADERROR
FOR XML PATH ('tr'), ELEMENTS XSINIL)
For some reason A.FULLQ is being returned as NULL. When I run it separately as a plain query, the result is also NULL, but I know the matching record is in QDATA (aliased as A). So if the apostrophe in the query is the cause, how can I get the matching field? Or is SQL Server matching the data and something else is wrong?
If I try and match with a like it returns results but this is not accurate.
If B.BADQ and A.QUERYT don't exactly match, you won't get any records back. The fact that it works with a LIKE makes me wonder whether one of them has additional characters, either before or after the matching data (depending on how you set up the LIKE).
Michael Green is right, below, that trailing blanks by themselves don't prevent a match, but, depending on where your data originates, you might have some other character (such as an embedded CHAR(0) or a TAB character) that doesn't appear when you view the data in the record but which is enough to prevent the records from matching. You might use the CHECKSUM() function on the two strings to verify that they do represent the same data.
Another, similar possibility is that if there is a string of blanks in the values (something like "A, B, ' '") the number of blanks might be different between the two instances. They'd look the same in HTML (which it looks like you're generating) but they'd be different in reality and be enough to prevent a match.
Finally, the fact that you're generating XML and observing trouble with apostrophes made me think of this: if the content of an XML tag has an apostrophe, it will be converted to &apos;. That ought to affect only the output, not the functioning, of the query, but I don't know what your data actually looks like.
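The hidden characters and differing runs of blanks mentioned above are easy to miss on screen. As a rough illustration in Python (not T-SQL), the sketch below makes such characters visible and compares two strings after normalizing them. The sample values and the helper names show_hidden and normalize are hypothetical.

```python
import re


def show_hidden(text):
    """Render a string so that control characters and blanks become visible."""
    return ascii(text)


def normalize(text):
    """Drop NUL characters and collapse every run of whitespace to one space."""
    without_nul = text.replace("\x00", "")
    return re.sub(r"\s+", " ", without_nul).strip()


bad_q = "today's  deals\t"   # hypothetical value with two blanks and a trailing TAB
query_t = "today's deals"    # hypothetical value as it looks on screen

print(show_hidden(bad_q))                       # the TAB shows up as \t
print(bad_q == query_t)                         # the raw values differ
print(normalize(bad_q) == normalize(query_t))   # the cleaned values agree
```

If the normalized values match while the raw ones do not, the join is failing on invisible characters rather than on the apostrophe.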
I am trying to retrieve a list of database records which have specific 'interest codes' inside of the 'custom_fields' table. So for example, right now there are 100 records, and I need the Name, Email and Interest Code from each of those records.
I've tried with the following statement:
SELECT * FROM `subscribers` WHERE list = '27' AND custom_fields LIKE 'CV'
But with no luck, the response was:
MySQL returned an empty result set (i.e. zero rows). ( Query took 0.0003 sec )
You can see in this screenshot that at least two rows have 'CV' inside custom_fields. Within the database it's not called 'Interest Code', but that's what these values are, which is why I am referring to them this way.
You need to enclose your "search string" inside some wildcards:
select * from subscribers where list=27 and custom_fields like '%CV%';
The % wildcard means "zero or more characters at this position". The "_" wildcard means "exactly one character at this position". Please read the reference manual on the topic. Also, you may want to read about regular expressions in MySQL for more complex string comparisons.
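To make the two wildcards concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption; % and _ behave the same way in both for this purpose). The rows and the name column are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscribers (name TEXT, list INTEGER, custom_fields TEXT)")
conn.executemany(
    "INSERT INTO subscribers VALUES (?, ?, ?)",
    [
        ("Ann", 27, "London, CV"),
        ("Bob", 27, "Paris, CV"),
        ("Cy", 27, "Rome, DX"),
        ("Dee", 27, "Oslo, CX"),
    ],
)


def count_matches(pattern):
    """Count subscribers on list 27 whose custom_fields match the LIKE pattern."""
    sql = "SELECT COUNT(*) FROM subscribers WHERE list = 27 AND custom_fields LIKE ?"
    return conn.execute(sql, (pattern,)).fetchone()[0]


exact = count_matches("CV")        # no wildcard: the whole value must be 'CV'
contains = count_matches("%CV%")   # 'CV' anywhere in the value
one_char = count_matches("%C_")    # ends in 'C' plus exactly one more character

print(exact, contains, one_char)
```

Without wildcards LIKE behaves like an equality test on the whole value, which is why the original query returned no rows.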
I am struggling with this query and want to know if I am wasting my time and need to write a php script or is something like the following actually possible?
UPDATE my_table
SET @userid = user_id
AND SET filename('http://pathto/newfilename_'@userid'.jpg')
FROM my_table
WHERE filename
LIKE '%_%' AND filename
LIKE '%jpg'AND filename
NOT LIKE 'http%';
Basically I have 700-odd files that need renaming in the database: because I am changing systems, the names stored in the database no longer match the actual filenames.
The format is 2_gfhgfhf.jpg which translates to userid_randomjumble.jpg
But not all files in the database are in this format, only about 700 out of thousands. So I want to identify names that contain _ but don't contain http (that's the correct format that I don't want to touch).
I can do that fine but now comes the tricky bit!!
I want to replace that file name userid_randomjumble.jpg with http://pathto/filename_userid.jpg So I want to set the column user_id in that row to a variable and insert it into my new filename.
The above doesn't work for obvious reasons but I am not sure if there is a way round what I'm trying to do. I have no idea if it's possible? Am I wasting my time with this and should I turn to PHP with mysql and stop being lazy? Or is there a way to get this to work?
Yes it is possible without the php. Here is a simple example
SET @a:=0;
SELECT * FROM table WHERE field_name = @a;
Yes you can do it using straightforward SQL:
UPDATE my_table
SET filename = CONCAT('http://pathto/newfilename_', user_id, '.jpg')
WHERE filename LIKE '%\_%jpg'
AND filename NOT LIKE 'http%';
Notes:
No need for variables. Any columns of rows being updated may be referenced
In MySQL, use CONCAT() to join text values together
With LIKE, an underscore (_) has a special meaning - it means "any single character". If you want to match a literal underscore, you must escape it with a backslash (\)
Your two LIKE predicates may be safely merged into one for a simpler query
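The notes above can be tried out on sample data. This is a minimal sketch with Python's sqlite3 module standing in for MySQL, which is an assumption with two visible differences: SQLite spells concatenation as || instead of CONCAT(), and it needs an explicit ESCAPE clause before a backslash escapes the underscore. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_id INTEGER, filename TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (2, "2_gfhgfhf.jpg"),                     # old format: should be rewritten
        (3, "http://pathto/newfilename_3.jpg"),   # already correct: left alone
        (4, "plain.jpg"),                         # no underscore: left alone
    ],
)

# The escaped underscore matches a literal "_" only, so "plain.jpg" is skipped.
conn.execute(r"""
    UPDATE my_table
    SET filename = 'http://pathto/newfilename_' || user_id || '.jpg'
    WHERE filename LIKE '%\_%jpg' ESCAPE '\'
      AND filename NOT LIKE 'http%'
""")

result = dict(conn.execute("SELECT user_id, filename FROM my_table").fetchall())
print(result)
```

Running the WHERE clause as a SELECT first is a cheap way to check which rows would be touched before committing to the UPDATE.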