I have a column within my database that holds text similar to this
CNEWS # Trinidad : "By Any Means Necessary" Watson Duke Swims And Sails To Toco http://somewebsitehere.com
What can I do to remove the entire http address from the column? Please note that some links may be broken so it may have http:// somewebsitehere.com
I was thinking of using a substring index but not sure that would work.
You could use whichever your favorite programming language is to iterate through the rows in the table, pluck out that column, apply a regular expression replacement rule to it, then update the row in the table with the new value.
Here is some pseudo-code:
theRows = SELECT * FROM TheTable WHERE 1;
foreach row in theRows
BEGIN
oldColumnValue = row[theColumnName]
// Removes any link appearing at the end of the column, including broken ones such as "http:// somewebsitehere.com"
newColumnValue = oldColumnValue.replace(/\s*https?:\/\/\s*\S*\s*$/, '')
UPDATE TheTable SET theColumnName = newColumnValue WHERE id = row[id]
END
For something as small and specific as this, you could use Perl with the DBI library to connect to MySQL. Here's a useful resource on regular expressions if you want to go deeper: http://www.regular-expressions.info/perl.html
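As a rough illustration of the replacement step in the pseudo-code above, here is a minimal Python sketch. The function name strip_trailing_link is hypothetical, and the pattern assumes the link always sits at the very end of the value, with at most one stray space after "http://". Reading and updating the rows is left to whatever database driver you use.

```python
import re

# Matches a link at the end of the text. It tolerates "https" and a stray
# space after "http://" (for example "http:// somewebsitehere.com").
TRAILING_LINK = re.compile(r"\s*https?://\s*\S*\s*$")


def strip_trailing_link(text: str) -> str:
    """Return text with a trailing link removed; other text is left untouched."""
    return TRAILING_LINK.sub("", text)
```

A link in the middle of the text is deliberately not matched, so only values that end with a link are changed.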
I'm using Lucee 5.x and Maria DB (MySQL).
I have a user supplied comma delimited list. I need to query the database and if the item isn't in the database, I need to add it.
user supplied list
green
blue
purple
white
database items
black
white
red
blue
pink
orange
lime
It is not expected that the database list would grow to more than 30 items but end-users always find 'creative' ways to use the tools we provide them.
So using the user supplied list above, only green and purple should be added to the database.
Do I compare the user supplied list against the database items or vice versa? Would the process change if the user supplied list count exceeds what is in the database (meaning if the user submits 10 items and the database only contains 5 items)? I'm not sure which loop is the better way to determine which items are new. Needs to be in cfscript and I'm looking at the looping options as outlined here (https://www.petefreitag.com/cheatsheets/coldfusion/cfscript/)
FOR Loop
FOR IN Loop (Array)
FOR IN Loop (Query)
I tried MySQL's NOT IN, but that left me with the existing database values in addition to the new ones. I know this should be simple; I'm overcomplicating it somewhere and/or am too close to the problem to see the solution.
You could do this:
get a list with existing items from database
append user supplied list
remove duplicates
update db if items were added
<cfscript>
var userItems = '"green","blue","purple","white"';
var dbItems = '"black","white","red","blue","pink","orange","lime"';
var result = ListRemoveDuplicates( ListAppend(dbItems, userItems));
if (ListLen(result) neq ListLen(dbItems)) {
// update db
}
</cfscript>
Update (only new items)
<cfscript>
var userItems = '"green","blue","purple","white"';
var dbItems = '"black","white","red","blue","pink","orange","lime"';
var newItems = '';
ListEach(userItems, function (item) {
if (not ListFind(dbItems, item)) {
newItems = ListAppend(newItems, item);
}
})
</cfscript>
trycf.com gist:
(https://trycf.com/gist/f6a44821165338b3c10b7808606979e6/lucee5?theme=monokai)
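The "only new items" comparison is not specific to CFML. As a language-neutral illustration, here is a minimal Python sketch of the same idea. The function name find_new_items is hypothetical. It loops over the user-supplied list and checks each item against a set built from the database items, so it does not matter which of the two lists is longer.

```python
from typing import Iterable, List


def find_new_items(user_items: Iterable[str], db_items: Iterable[str]) -> List[str]:
    """Return the user items that are not in the database yet, in input order."""
    existing = set(db_items)  # set lookup avoids a nested loop
    new_items: List[str] = []
    for item in user_items:
        if item not in existing:
            new_items.append(item)
            existing.add(item)  # also skips duplicates within the user list
    return new_items
```

With the lists from the question, this returns green and purple.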
Again, since this is an operation that the database can do, I'd feed the input data to the database and then let it decide how to deal with multiple keys. I don't recommend using CF to loop through your values to check them and then doing the INSERT. This will require multiple trips to the database and then processing on the application server that isn't really needed.
My suggestion is to use MariaDB's INSERT....ON DUPLICATE KEY UPDATE... syntax. This also requires that whatever field you are trying to insert on actually has a UNIQUE constraint on it. Without that constraint, the database itself doesn't care if you have duplicate data, which can cause its own set of issues.
For the database, we have
CREATE TABLE t1 (mycolor varchar(50)
, CONSTRAINT constraint_mycolor UNIQUE (mycolor)
) ;
INSERT INTO t1(mycolor)
VALUES ('black'),('white'),('red'),('blue'),('pink'),('orange'),('lime')
;
The ColdFusion is:
<cfscript>
myInputValues = "green,blue,purple,white" ;
myQueryValues = "" ;
function sanitizeValue ( String inVal required ) {
// do sanitization stuff here
var sanitizedInVal = arguments.inVal ;
return sanitizedInVal ;
}
myQueryValues = myInputValues.listMap(
function(i) {
return "('" & sanitizeValue(i) & "')" ;
}
) ;
// This will take parameterization out of the cfquery tag and
// perform sanitization and validation before building the
// query string.
myQuery = new query();
myQuery.name = "myQuery";
myQuery.setDataSource("dsn");
sqlString = "INSERT INTO t1(mycolor) VALUES "
& myQueryValues
& " ON DUPLICATE KEY UPDATE mycolor=mycolor;"
;
myQuery.setSQL(sqlString);
myQueryResult = myQuery.execute().getResult();
</cfscript>
First, build up your input values (myInputValues). You'll want to do validation and sanitization on them to prevent nastiness from entering your database. I created a sanitizeValue function to be the placeholder for the sanitization and validation operations.
myQueryValues will become a string list of the values in the proper format that we will use to insert into the database.
Then we just build up a new query(), using myQueryValues in the sqlString to get our query. Again, since we are building a string for multiple values to INSERT, I don't think there's a way to use queryparam for those VALUES. But since we cleaned up our string earlier, it should do much of what cfqueryparam does anyway.
We use MariaDB's INSERT INTO .... ON DUPLICATE KEY UPDATE ... syntax to only insert unique values. Again, this requires that the database itself has a constraint to prevent duplicates in whatever column we're inserting.
For a demo: https://dbfiddle.uk/?rdbms=mariadb_10.2&fiddle=4308da3addb9135e49eeee451c6e9e58
This should do what you're looking to do without beating up on your database too much. I don't have a Lucee or MariaDB server set up to test, so you'll have to give it a shot and see how it performs. I don't know how big your database is or will become, but this should still query pretty quickly.
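For comparison, here is a hedged sketch of how the same multi-row statement can be built with one placeholder per value instead of concatenating the values into the SQL text. It is written in Python rather than CFML, the function name build_insert is hypothetical, and the "%s" placeholder style is an assumption that depends on the database driver in use. Only the statement text and the parameter list are produced; nothing is executed here.

```python
from typing import List, Sequence, Tuple


def build_insert(colors: Sequence[str], placeholder: str = "%s") -> Tuple[str, List[str]]:
    """Build a multi-row INSERT ... ON DUPLICATE KEY UPDATE with placeholders.

    The values travel separately from the SQL text, so the driver binds them
    instead of the application quoting them by hand.
    """
    if not colors:
        raise ValueError("at least one value is required")
    # One "(placeholder)" group per row to insert.
    rows = ",".join(f"({placeholder})" for _ in colors)
    sql = (
        "INSERT INTO t1(mycolor) VALUES "
        + rows
        + " ON DUPLICATE KEY UPDATE mycolor=mycolor"
    )
    return sql, list(colors)
```

The returned pair would typically be handed to the driver's execute call as the statement and its parameters.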
Device_Users
R90FDJMVAdministrator
PG04373Administrator
I only want Administrator in this field. What update query should I run?
You can use CHARINDEX, which is practical because it lets you set a starting position for the search in the string. That is useful if you know how long your prefix is (I have used 6, but it depends on your data).
UPDATE table
SET column = 'Administrator'
WHERE CHARINDEX('Administrator',column,6) > 0
If needed you can refine the WHERE clause to make sure you're not replacing entries which have something after Administrator
AND LEN(column) = CHARINDEX('Administrator',column,6) + LEN('Administrator') - 1
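Because CHARINDEX returns a 1-based position, a value ends exactly with the search term when its length equals position + term length - 1. The following Python sketch emulates that arithmetic so it can be checked against the sample data. The helpers charindex and ends_with_term are hypothetical stand-ins, not SQL Server functions.

```python
def charindex(needle: str, haystack: str, start: int = 1) -> int:
    """1-based position of needle at or after start, or 0 if it is absent."""
    return haystack.find(needle, max(start, 1) - 1) + 1


def ends_with_term(value: str, term: str, start: int = 6) -> bool:
    """True when term is found from start onward and nothing follows it."""
    pos = charindex(term, value, start)
    return pos > 0 and len(value) == pos + len(term) - 1
```

For R90FDJMVAdministrator the position is 9, the term is 13 characters long, and the value is 21 characters long, so 9 + 13 - 1 matches.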
Maybe this is good enough:
update Device_Users
set field = 'Administrator'
where field like '%Administrator'
I have a column in MySQL table which has 'messy' data stored as text like this:
**SIZE**
2
2-5
6-25
2-10
26-100
48
50
I want to create a new column "RevTextSize" that rewrites the data in this column to a pre-defined range of values.
If Size=2, then "RevTextSize"= "1-5"
If Size=2-5, then "RevTextSize"= "1-5"
If Size=6-25, then "RevTextSize"="6-25"
...
This is easy to do in Excel, SPSS and other such tools, but how can I do it in the MySQL table?
You can add a column like this:
ALTER TABLE messy_data ADD revtextsize VARCHAR(30);
To populate the column:
UPDATE messy_data
SET revtextsize
= CASE
WHEN size = '2' THEN '1-5'
WHEN size = '2-5' THEN '1-5'
WHEN size = '6-25' THEN '6-25'
ELSE size
END
This is a brute-force approach, identifying each distinct value of size and specifying a replacement.
You could use another SQL statement to help you build the CASE expression
SELECT CONCAT(' WHEN size = ''',d.size,''' THEN ''',d.size,'''') AS stmt
FROM messy_data d
GROUP BY d.size
Save the result from that into your favorite SQL text editor, and hack away at the replacement values. That would speed up the creation of the CASE expression for the statement you need to run to set the revtextsize column (the first statement).
If you want to build something "smarter" that dynamically evaluates the contents of size and makes an intelligent choice, that would be more involved. If I were going to do that, I'd do it in the second statement, the one generating the CASE expression. I'd prefer to review that before I run the update statement. I prefer to have the update statement do something that's easy to understand and easy to explain.
Use INSTR() to locate the "-" in your string and SUBSTRING(str, pos, len) to extract the start and end numbers. Then use BETWEEN to build your CASE expression.
Hope this will help in building your solution.
Thanks
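To make the "locate the dash, extract both numbers, compare against ranges" idea concrete, here is a small Python sketch of the logic. The bucket boundaries in BUCKETS and the function name rev_text_size are assumptions for illustration, since the question only shows the first few target ranges. Values that straddle two buckets (such as 2-10) return None so they can be reviewed by hand.

```python
from typing import List, Optional, Tuple

# Hypothetical target ranges; replace with the real pre-defined ranges.
BUCKETS: List[Tuple[int, int]] = [(1, 5), (6, 25), (26, 100)]


def rev_text_size(size: str) -> Optional[str]:
    """Map a size such as '2' or '2-5' to its bucket label, or None if unclear."""
    parts = size.strip().split("-")
    if len(parts) > 2:
        return None
    try:
        low, high = int(parts[0]), int(parts[-1])
    except ValueError:
        return None  # not numeric, needs manual review
    for start, end in BUCKETS:
        # Both ends of the value must fall inside the same bucket.
        if start <= low and high <= end:
            return f"{start}-{end}"
    return None
```

The same comparisons could then be written as WHEN ... BETWEEN ... branches of the CASE expression.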
In some cases we use an IN clause in the filters of our SSRS reports. Many of them cause performance problems because the IN clause contains hundreds of items,
such as:
WHERE TableA.School IN (@School)
Multi-value parameters can also be tricky to handle: you might need to use =Join(Mypara.Value,",") in the RDL and write a SQL function that converts the string into a set of rows to feed the stored procedure (especially in some older versions of SSRS).
FYI, here is a function that breaks a comma-delimited string into a record set:
CREATE FUNCTION [dbo].[fnSpark_BreakUpList] (
    @List VARCHAR(MAX)
)
RETURNS @csvlist TABLE (Item VARCHAR(MAX))
AS
BEGIN
    DECLARE @Item VARCHAR(MAX)
    -- Loop through each item in the comma delimited list
    WHILE (LEN(@List) > 0)
    BEGIN
        IF CHARINDEX(',',@List) > 0
        BEGIN
            SET @Item = SUBSTRING(@List,1,(CHARINDEX(',', @List)-1))
            SET @List = SUBSTRING(@List,(CHARINDEX(',', @List) + DATALENGTH(',')),DATALENGTH(@List))
        END
        ELSE
        BEGIN
            SET @Item = @List
            SET @List = NULL
        END
        -- Insert each item into the csvlist table
        INSERT INTO @csvlist (Item) VALUES (@Item)
    END
    RETURN
END
GO
I will post answers shortly to show how to increase the performance by using CHARINDEX. (So you don't have to do anything like the above....)
If you do not actually need to retrieve the items from the long delimited string, but only need to filter on it, CHARINDEX is the better way to go.
Instead of using the IN clause, you could just use:
WHERE CHARINDEX(','+TableA.School+',',','+@School+',') > 0
NOTE:
1. I pad an extra comma at the end of the target string 'TableA.School' so that it cannot match the start of a longer item in the list.
(For example, with schools 'AB' and 'ABC', a row for 'AB' must not be picked up when the list only contains 'ABC'.)
2. I pad an extra comma at the end of the source string '@School' so that a single item or the last item (which end without a comma) is still picked up. The leading comma does the same for the first item.
3. I pad an extra comma at the beginning of the target string 'TableA.School' so that it cannot match the end of a longer item in the list.
(For example, with schools 'AB' and 'CAB', a row for 'AB' must not be picked up when the list only contains 'CAB'.)
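The effect of the comma padding can be checked with a few lines of Python. This is only an illustration of the string logic, not of SSRS or SQL Server behaviour, and the function name in_delimited_list is hypothetical.

```python
def in_delimited_list(item: str, csv: str) -> bool:
    """True when item is one whole entry of the comma-separated string csv."""
    # Pad both sides with commas so only complete entries can match.
    return f",{item}," in f",{csv},"
```

Without the padding, a plain substring test would report 'AB' as present in 'ABC,CAB'. With the padding it is only found when it is a complete entry, including when it is the first, last, or only one.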
Example:
I am using:
WHERE
CHARINDEX(','+CAST(DENTIST4.wStudentYear AS VARCHAR(10))+',',','+@StudentYear+',') > 0
to replace:
WHERE
DENTIST4.wStudentYear IN (@StudentYear)
in one of the reports I was working on. Rendering its 4000+ pages improved from about 10 minutes to under a minute on a large database (11 GB).
IMPORTANT NOTES:
Please make sure the report parameter you pass into the dataset as the filter uses the Join function:
=Join(Parameters!MyParameter.Value,",")
Hope this helps....
IMPORTANT: This approach ONLY improves performance if the filter has a large set of items. For filters with a small set of items, the IN clause will do a better job.
So I have a bunch of users in a column that get refreshed as:
Bill@test.comXYZ
Tom@test.comXYZ
John@test.comXYZ
We refresh the database each week and I need to update these emails to:
Bill@domain.com
Tom@domain.com
John@domain.com
I figured I can use concat to do the latter, but I am stuck on the former issue. Is there a way to split the values (like splitting Bill@test.comXYZ into Bill and @test.comXYZ) and then remove the @TEXT part?
Anyways, any help will be much appreciated.
You can use the MySQL replace function, i.e.
UPDATE mytable
SET myfield = REPLACE(myfield, '@test.comXYZ', '@domain.com')
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_replace
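If the part after the user name is not always the same text, the "split" idea from the question may be closer to what is needed. Here is a minimal Python sketch of that logic. The function name to_domain and the default target domain are assumptions for illustration.

```python
def to_domain(email: str, domain: str = "domain.com") -> str:
    """Keep the part before '@' and replace everything after it with domain."""
    local, sep, _ = email.partition("@")
    if not sep:
        return email  # no '@' found, leave the value unchanged
    return f"{local}@{domain}"
```

Unlike a fixed-string replace, this does not depend on the exact suffix that the weekly refresh appends.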