Is there a way to take a value like "300, 400, 500, 300", check each comma-separated number, and delete it if it is duplicated? The value would then look like this: "300, 400, 500".
I could do it in a PHP script, but I wonder whether it is possible using MySQL.
Create a temporary table with a unique index, insert the values while ignoring duplicate-key errors, select all records from the temporary table, then drop the table.
Quick play, but to get the unique values for each row you could use something like this:
-- TRIM removes the space that follows each comma in values like "300, 400"
SELECT Id, GROUP_CONCAT(DISTINCT aWord ORDER BY aWord ASC)
FROM (SELECT SomeTable.Id, TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(SomeColumn, ','), ',', aCnt), ',', -1)) AS aWord
FROM SomeTable
CROSS JOIN (
SELECT a.i+b.i*10+c.i*100 + 1 AS aCnt
FROM integers a, integers b, integers c) Sub1
WHERE (LENGTH(SomeColumn) + 1 - LENGTH(REPLACE(SomeColumn, ',', ''))) >= aCnt) Sub2
GROUP BY Id
This relies on having a table called integers with a single column called i with 10 rows with the values 0 to 9. It copes with up to ~1000 words but can easily be altered to cope with more
To write the de-duplicated values back, it is probably easiest to feed this query into an INSERT ... ON DUPLICATE KEY UPDATE.
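For comparison, the application-side route the question mentions (there in PHP) can be sketched in Python. This is only an illustration: the function name dedupe_csv is hypothetical, and it assumes the items are separated by commas with optional spaces and that the first-seen order should be kept (the GROUP_CONCAT query above sorts the values instead).

```python
def dedupe_csv(value: str) -> str:
    """Remove repeated items from a comma-separated string, keeping first-seen order."""
    items = [item.strip() for item in value.split(",")]
    # dict.fromkeys preserves insertion order (Python 3.7+), so later duplicates drop out
    unique = dict.fromkeys(item for item in items if item)
    return ", ".join(unique)


print(dedupe_csv("300, 400, 500, 300"))  # 300, 400, 500
```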
I inherited a MySQL database and am trying to migrate it to MongoDB. There is a field called details that contains some key/value "pairs" I want to split up. There could be a single key/value pair, or multiple pairs split by multiple delimiters. I put pairs in quotes because they are formatted strangely: the pairs are delimited by a colon (:), and within each pair the key and value are separated by a comma (,). For example, here is the value of one such field:
Normal Duty,5min:Heavy Duty,10min:Riser,10max:
These are 3 key/value pairs, delimited by colons. I want to get them into a JSON object if possible, like this:
{
'Normal Duty': '5min',
'Heavy Duty': '10min',
'Riser': '10max'
}
I think I could do it using substring_index if it were only a single key/value pair with a single delimiter, but I get lost trying to think of a way to extract multiple key/value pairs with multiple delimiters. I'm able to get a count of the number of delimiters with SELECT id, details, LENGTH(details) - LENGTH(REPLACE(details, ':', '')) AS COUNT FROM type, but I'm not sure how I could use that number in a loop or something.
SELECT test.id,
JSON_OBJECTAGG(SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(test.value, ':', numbers.num), ':', -1), ',', 1),
SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX(test.value, ':', numbers.num), ':', -1), ',', -1))
FROM test
CROSS JOIN ( SELECT 1 num UNION SELECT 2 UNION SELECT 3 UNION
SELECT 4 UNION SELECT 5 UNION SELECT 6 ) numbers
WHERE numbers.num <= LENGTH(test.value) - LENGTH(REPLACE(test.value, ':', ''))
GROUP BY test.id
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=d10261e21a0fb1c1060091e8b4e58d80
Adjust the numbers subquery if the number of key/value pairs may exceed 6.
PS. To work with JSON, I'd strongly recommend upgrading your server.
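If the split is done in the migration script rather than in MySQL, the same parsing can be sketched in Python. The function name parse_details is hypothetical, and the sketch assumes the format shown in the question: pairs end with a colon, and the first comma in a pair separates key from value (the query above takes the text after the last comma instead, which differs only if a value contains a comma).

```python
import json


def parse_details(details: str) -> dict:
    """Turn 'key,value:key,value:' into a dict, ignoring empty pieces."""
    pairs = {}
    for chunk in details.split(":"):
        if not chunk.strip():
            continue  # the trailing colon leaves an empty final piece
        key, _, value = chunk.partition(",")  # split on the first comma only
        pairs[key.strip()] = value.strip()
    return pairs


doc = parse_details("Normal Duty,5min:Heavy Duty,10min:Riser,10max:")
print(json.dumps(doc))
# {"Normal Duty": "5min", "Heavy Duty": "10min", "Riser": "10max"}
```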
I am trying to grab a number using an SQL query. I need the number that appears right before 'LEADS'.
Sample entries I might encounter:
PDIP300MIL-14LEADS
QFN6X6-40LEADS
QFN6X6-240LEADS
WSOIC/16LEADS
So as you can see, the prefix can be any length. Also, the delimiter is sometimes / and sometimes -. But the suffix is always LEADS.
On a side note, other entries look like ICL7665 BCSA, which has no LEADS suffix and so has to be skipped.
Edit: I am very sorry if I was not clear. What I am trying to grab is the number between the delimiter and LEADS.
So in the four examples I am trying to grab: 14, 40, 240, 16.
You can do something like this using substring_index:
select
substring_index(
substring_index(
replace(col,'/','-')
,'LEADS'
,1),
'-'
,-1
)
from table1
To skip entries without a number, you can filter the result with a HAVING clause:
select
substring_index(
substring_index(
replace(col,'/','-')
,'LEADS'
,1),
'-'
,-1
) num
from table1
having num * 1 > 0
https://dev.mysql.com/doc/refman/5.1/en/regexp.html
SELECT * FROM table WHERE field REGEXP '[0-9]+LEADS$'
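The REGEXP query above only selects the matching rows; it does not return the digits themselves. As a rough illustration of the same pattern with a capture group, here is a Python sketch. The function name lead_count is hypothetical, and the sample list simply reuses the entries from the question.

```python
import re

# Same pattern as the REGEXP above, with the digits captured in a group
LEADS_RE = re.compile(r"(\d+)LEADS$")


def lead_count(part: str):
    """Return the number directly before a trailing 'LEADS', or None if absent."""
    match = LEADS_RE.search(part)
    return int(match.group(1)) if match else None


samples = [
    "PDIP300MIL-14LEADS",
    "QFN6X6-40LEADS",
    "QFN6X6-240LEADS",
    "WSOIC/16LEADS",
    "ICL7665 BCSA",  # no LEADS suffix, so it is skipped (None)
]
print([lead_count(s) for s in samples])  # [14, 40, 240, 16, None]
```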
I have a table called contacts and in that table there is a field called contact_type.
contact_type is a varchar and stores comma-separated values in a string like this:
^Media^,^Historical^
However, only a few rows out of thousands have more than one value stored, and I need to run a query that returns only the rows with more than one. A row that stores just ^Historical^ should be ignored.
I’m pretty much stumped on how to build a query like this. I assume it will contain something like this:
SELECT LENGTH(#css) - LENGTH( REPLACE( #css, ',', '') ) + 1;
Basically, you need to select the records where contact_type contains a comma:
select * from your_table
where instr(contact_type, ',') > 0
I have a MySQL table with this sort of data:
TACOMA, Washington, 98477
I have thousands of such rows. I want the data to be changed so that it appears like this:
TACOMA, Washington
Is it possible through MySQL, or do I have to do it manually?
You can use:
SELECT SUBSTRING_INDEX('TACOMA, Washington, 98477', ',', 2)
You can read more in the MySQL documentation for SUBSTRING_INDEX.
And the update statement:
UPDATE my_table
SET my_col = SUBSTRING_INDEX(my_col, ',', 2)
Replace my_table with your table name and my_col with the column you need to update.
Possibly this way. Count the number of commas (by comparing the length of the string against its length with all the commas removed) and then use SUBSTRING_INDEX to get the string up to that number of commas:
SELECT SUBSTRING_INDEX(col, ',', LENGTH(col) - LENGTH(REPLACE(col, ',', '')))
FROM SomeTable
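To make the counting trick concrete, here is a small Python model of it. It is a sketch, not MySQL's implementation: substring_index only approximates SUBSTRING_INDEX for a non-empty delimiter, drop_last_field is a hypothetical helper name, and the length-difference count assumes a single-character delimiter.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Approximate MySQL SUBSTRING_INDEX(s, delim, count) for a non-empty delimiter."""
    if count == 0 or not delim:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # everything before the count-th delimiter
    return delim.join(parts[count:])      # everything after the count-th delimiter from the right


def drop_last_field(s: str, delim: str = ",") -> str:
    # Same idea as LENGTH(col) - LENGTH(REPLACE(col, ',', '')):
    # the length difference equals the number of single-character delimiters
    n_delims = len(s) - len(s.replace(delim, ""))
    return substring_index(s, delim, n_delims)


print(drop_last_field("TACOMA, Washington, 98477"))  # TACOMA, Washington
```

Note that a value with no comma at all comes back as an empty string here, which mirrors what a count of 0 does in the query above.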
substring_index(col, ',',-1)
returns the part of the string after the last comma. Removing that part, together with the comma in front of it, leaves the rest of the string:
replace(col,concat(',',substring_index(col, ',',-1)),'')
I'm trying to figure out how to determine the most used words in a MySQL dataset.
I'm not sure how to go about this or whether there's a simpler approach. I read a couple of posts where some people suggest an algorithm.
Example:
From 24,500 records, find out the top 10 used words.
Right, this runs like a dog and is limited to working with a single delimiter, but hopefully will give you an idea.
SELECT aWord, COUNT(*) AS WordOccuranceCount
FROM (SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(concat(SomeColumn, ' '), ' ', aCnt), ' ', -1) AS aWord
FROM SomeTable
CROSS JOIN (
SELECT a.i+b.i*10+c.i*100 + 1 AS aCnt
FROM integers a, integers b, integers c) Sub1
WHERE (LENGTH(SomeColumn) + 1 - LENGTH(REPLACE(SomeColumn, ' ', ''))) >= aCnt) Sub2
WHERE Sub2.aWord != ''
GROUP BY aWord
ORDER BY WordOccuranceCount DESC
LIMIT 10
This relies on having a table called integers with a single column called i with 10 rows with the values 0 to 9. It copes with up to ~1000 words but can easily be altered to cope with more (but will slow down even more).
Why not do it all in PHP? The steps would be:
Create a dictionary (word => count)
Read your data in PHP
Split it into words
Add each word to the dictionary (you might want to lowercase and trim them first)
If already in the dictionary, increment its count. If not already in the dictionary, set 1 as its value (count = 1)
Iterate your dictionary elements to find the highest 10 values
I wouldn't do it in SQL mainly because it'd end up more complex.
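The steps above can be sketched as follows. The answer suggests PHP; this sketch uses Python purely for illustration, top_words is a hypothetical name, and the tokenising rule (lowercase, split on anything that is not a letter, digit or apostrophe) is an assumption rather than something the answer specifies.

```python
import re
from collections import Counter


def top_words(texts, n=10):
    """Count words across all texts and return the n most frequent as (word, count)."""
    counts = Counter()  # the word => count dictionary
    for text in texts:
        # lowercase first, then split on any run of non-word characters
        words = re.split(r"[^a-z0-9']+", text.lower())
        counts.update(w for w in words if w)  # skip empty pieces
    return counts.most_common(n)


rows = ["The quick brown fox", "the lazy dog", "The fox"]
print(top_words(rows, 2))  # [('the', 3), ('fox', 2)]
```

In practice rows would be the column values fetched from the table, ideally streamed in chunks rather than loaded all at once.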
The general idea would be to figure out how many delimiters (e.g. spaces) are in each field, and run SUBSTRING_INDEX() in a loop for each such field. Populating the results into a temporary table has the added benefit that you can run this in chunks, in parallel, etc. It shouldn't be too cumbersome to put some stored procedures together to do this.
SELECT `COLUMNNAME`, COUNT(*) FROM `TABLENAME` GROUP BY `COLUMNNAME`
It's very simple and it worked... :)
A small improvement: remove stop words from the list by adding AND Sub2.aWord NOT IN (list of stop words)
SELECT aWord, COUNT(*) AS WordOccuranceCount
FROM (SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(concat(txt_msg, ' '), ' ', aCnt), ' ', -1) AS aWord
FROM mensagens
CROSS JOIN (
SELECT a.i+b.i*10+c.i*100 + 1 AS aCnt
FROM integers a, integers b, integers c) Sub1
WHERE (LENGTH(txt_msg) + 1 - LENGTH(REPLACE(txt_msg, ' ', ''))) >= aCnt) Sub2
WHERE Sub2.aWord != '' AND Sub2.aWord not in ('a','about','above', .....)
GROUP BY aWord
ORDER BY WordOccuranceCount DESC
LIMIT 10