Working Around SQL Replace Wildcards - mysql

I know that I cannot use a wildcard in a MySQL replace query through phpMyAdmin. But, I need some kind of workaround. I'm very open to ideas. Here's the skinny:
I have about 2,000 pages in a MySQL database that need to have their image URLs updated. Some are local, some are hotlinked. Each one is different: the URL lengths vary, the old image and the new image are unique per page id number, and each one occurs at a different spot in the page.
I basically need to do the following:
UPDATE pages SET body = replace(body, 'src=\"%\"', 'src=\"http://newdomain/newimage.jpg\"') WHERE id="{page_number}"
But I know that the 'src=\"%\"' component doesn't work, because replace() does not treat % as a wildcard.
So I fall at the feet of your collective knowledge to come up with some way to take the src="%" and replace it with a set URL for a set page id number. Thanks in advance.

If there's only one image per page, a quick solution would be like this:
UPDATE pages
SET
body = CONCAT(
SUBSTRING_INDEX(body, 'src="', 1),
'src=\"http://newdomain/newimage.jpg\"',
SUBSTRING(
SUBSTRING_INDEX(body, 'src="', -1)
FROM LOCATE('"', SUBSTRING_INDEX(body, 'src="', -1))+1)
)
WHERE
id="{page_number}" AND
body NOT LIKE '%<img%<img%';
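The first SUBSTRING_INDEX extracts the part of body to the left of the first src=". The nested SUBSTRING_INDEX takes everything to the right of the last src=", and the surrounding SUBSTRING with LOCATE then drops the old URL up to and including its closing ", leaving the rest of the body.
The last condition is a very crude check that only one image is present in the string. It could fail under some circumstances (for example, when another element on the page also has a src= attribute), but it might help.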
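To see what the CONCAT expression above does to a page body, here is a minimal sketch of the same splice in Python, outside MySQL. The function name splice_single_src and the sample HTML are made up for illustration; only the string logic is meant to mirror the SQL.

```python
NEW_SRC = 'src="http://newdomain/newimage.jpg"'


def splice_single_src(body, new_src=NEW_SRC):
    """Replace the only src="..." attribute in body, mirroring the SQL expression."""
    marker = 'src="'
    if marker not in body:
        return body  # nothing to replace (the SQL expression itself has no such guard)
    # SUBSTRING_INDEX(body, 'src="', 1): text before the first marker
    left = body.split(marker, 1)[0]
    # SUBSTRING_INDEX(body, 'src="', -1): text after the last marker
    right = body.rsplit(marker, 1)[1]
    # SUBSTRING(... FROM LOCATE('"', ...) + 1): drop the old URL and its closing quote
    tail = right[right.find('"') + 1:]
    return left + new_src + tail


if __name__ == "__main__":
    sample = '<p>Hi</p><img src="http://old.example/a.png" alt="x"><p>Bye</p>'
    print(splice_single_src(sample))
```

With more than one src=" in the body, left and right come from different attributes and everything between them is lost, which is why the UPDATE restricts itself to single-image pages.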

My suggestion would be to build a table with your replace strings that would look like this:
page_id replace
1 src="..."
Then you can update across a JOIN like this (REPLACE is a reserved word in MySQL, so a table or column with that name has to be quoted with backticks):
UPDATE pages AS p
INNER JOIN `replace` AS r
ON p.id = r.page_id
SET p.body = REPLACE(p.body, CONCAT('src="', SUBSTRING_INDEX(SUBSTRING_INDEX(p.body, 'src="', -1), '"', 1), '"'), r.`replace`);
This takes the last src="..." value found in the body and replaces every occurrence of it with the new value in the same format, so it works for all records with a single src value.
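The behaviour of the nested SUBSTRING_INDEX calls in that UPDATE can be checked outside the database. The following is a rough sketch in Python; substring_index is a hand-written stand-in for the MySQL function (not a library call), and replace_last_src is a hypothetical helper name.

```python
def substring_index(s, delim, count):
    """Rough stand-in for MySQL SUBSTRING_INDEX (assumes a non-empty delim)."""
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delim
    return delim.join(parts[count:])       # everything after the count-th delim from the right


def replace_last_src(body, new_src):
    """Mirror REPLACE(body, CONCAT('src="', <last url>, '"'), new_src)."""
    # URL of the last src="..." attribute in the body
    old_url = substring_index(substring_index(body, 'src="', -1), '"', 1)
    # str.replace, like SQL REPLACE, substitutes every occurrence of that exact string
    return body.replace('src="' + old_url + '"', new_src)


if __name__ == "__main__":
    two_images = '<img src="a.png"><img src="b.png">'
    print(replace_last_src(two_images, 'src="http://newdomain/newimage.jpg"'))
```

On a body with two different images only the last one changes; if the same URL appears twice, both copies are replaced.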

Related

Filling a Word template with data from Exact Online query returns itgendps155: Publication failed

When filling a Word template through SQL on Exact Online the following error occurs:
Publication failed.
Context:
value-of expression
16Hjjhhhasdhfjhasjhfjha;jsfhsahfdahskj;dhkhsdkjhskjhkKashdhasdjhjsahdjhjsadJashdkaskjdjsakdkjhDocumentnrKlantnrOffertedatum<invantive:value-of expression="$F{qtk.quotationnumber}" ***/><invantive:foreach> <invantive:value-of expression="$F{qtk.orderaccountcode}" /></invantive:foreach><invantive:foreach> <invantive:value-of expression="$F{qtk.quotationdate}
The location of the error is indicated by the marker '***'.
Evaluation of expression "$F{qtk.quotationnumber}" failed.
Cannot find field with the name 'quotationnumber'. Check that you have inserted a surrounding repeating block and that the field exists in that block.
The underlying SQL of the block in Composition for Word is:
select qtk.quotationnumber
, qtk.versionnumber
, qtk.quotationdate
, qtk.orderaccountcode
, acc.name
, acc.addressline1
, acc.postcode
, acc.city
, acc.countryname
, acc.phone
, acc.fax
, acc.vatnumber
, itm.code
, qtl.itemdescription
, qtl.notes
, qtl.quantity
, qtl.netprice
, qtl.amountdc
, qtl.vatpercentage*100
from exactonlinerest..Quotations Qtk
left outer join exactonlinerest..Accounts Acc
on acc.id = qtk.orderaccount
left outer join exactonlinerest..QuotationLines Qtl
on qtl.quotationid = qtk.quotationid
and qtl.quotationnumber = $P{P_OFFERTE}
and qtl.versionnumber = $P{P_VERSIE}
left outer join exactonlinerest..Items Itm
on itm.id = qtl.item
where qtk.status = 40
and qtk.quotationnumber = $P{P_OFFERTE}
and qtk.versionnumber = $P{P_VERSIE}
The SQL returns a list of quotations and their lines and items (articles) from Exact Online.
The content of the document is:
I've tried various options, but the error keeps appearing. What am I doing wrong?
I guess you want to create a Word document with quotation header information, plus some quotation lines (just like traditional Order -> Order Lines master - detail table).
In this case I would first recommend splitting the query in two pieces:
One outer Composition block with query that retrieves the quotation information.
An inner Composition block with query that handles the quotation lines from Exact Online, including article information.
But that is not your current problem. In your Word template you have no repeating block specified. This is the area that should be repeated for every row in a block. You can easily insert a repeating block by clicking on Building Block in the Modeler menu, then choosing the right block and then "Create repeating block".
As an alternative, you can also put
<invantive:foreach block="BLOCK-CODE">
before the text and other tags that you want to repeat in the resulting Word document including layout and pictures.
And put
</invantive:foreach>
after it.
The <invantive:foreach> tags that you already have do not contain all tags. The <invantive:value-of expression="$F{qtk.quotationnumber}"/> is not contained in any of them. Also, the other foreach tags are not at a usable position. It is better to put the repeater outside the table.

Extract text from column in select of MySql query

I have a table named sentEmails where the body column contains the body text of an email.
In the body text, there is a substring like:
some link: <a href="https://somelink#somesite.com/somePage.php?someVar=someVal&sentby=agent">Random link text
Using MySql, I need to extract the url from this column like https://somelink#somesite.com/somePage.php?someVar=someVal&sentby=agent
I was thinking something like the below would work by finding the starting location and returning the next 150 chars, of course it actually just returns the first 150 chars.
SELECT LEFT(body, LOCATE('some link: <a href="', body)+150) AS link
FROM sentEmails
WHERE sent between date_sub(now(),INTERVAL 1 WEEK) and now()
AND body like '%some link:%'
AND toEmail = 'email#gmail.com'
Additional info:
the link will always be preceded by the text some link:
Random link text at the end will change
I can live with getting a bit more of the text than needed if I have to; for example, getting https://somelink#somesite.com/somePage.php">Random link text would be acceptable
the text shown above is a substring of the full body column which contains much more text
This isn't something I'm going to be doing often. I'm researching an issue and I need the links from 40-50 of these rows; I'm just hoping to avoid having to pull the link manually from each row.
I can only use MySQL Query Browser to access this DB. If I could connect with PHP, this would be trivial
The url in question, can have 6-25 parameters in it
The url in question will always end with this parameter &sentby=agent
If you had two unique delimiters around the URL, then you could just use SUBSTRING_INDEX() to isolate it. One approach would be to replace the two sides of the URL in the anchor tag with a delimiter:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(
REPLACE(REPLACE(body, '<a href="', '~'), '&sentby=agent">', '&sentby=agent~'), '~', -2),
'~', 1)
FROM sentEmails
WHERE sent BETWEEN DATE_SUB(NOW(), INTERVAL 1 WEEK) AND NOW() AND
body LIKE '%some link:%' AND
toEmail = 'email#gmail.com'
I replaced <a href=" and "> with ~. If ~ does not occur anywhere in the body column, and if you only have one HTML tag in the body, then this should work.
If the body column is just a big chunk of HTML, then you should consider using xpath and handling this in your app layer.
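As a rough illustration of handling this in the app layer, the sketch below pulls href values out of a body string with Python's standard html.parser module (a simpler stand-in for XPath, not the approach any answer here actually used). The class and function names and the example.com URL are made up for illustration; the only detail taken from the question is that the wanted link ends with &sentby=agent.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def agent_links(body):
    """Return the hrefs in body that end with the sentby=agent parameter."""
    parser = HrefCollector()
    parser.feed(body)
    parser.close()
    return [href for href in parser.hrefs if href.endswith("&sentby=agent")]


if __name__ == "__main__":
    body = ('Hello, some link: <a href="https://example.com/somePage.php'
            '?someVar=someVal&sentby=agent">Random link text</a> and '
            '<a href="https://example.com/other">another link</a>')
    print(agent_links(body))
```

Unlike the delimiter trick, this does not depend on ~ being absent from the body or on there being only one tag.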
If you're just trying to extract the link, you can use the INSTR() and MID() functions. Something like this:
select mid(body, instr(body,'="') + 2, instr(body,'">') - instr(body,'="') - 2) from email...
instr(body,'="') is the position of the =" just before the link (the + 2 skips past those two characters), and instr(body,'">') is the position just after the end of the link.
The MID function takes (str, pos, len), and len = end position - starting position. This assumes that the first =" and the first "> in the body belong to the link.
Thanks to Tim's help, I was able to get this working with the below query:
SELECT SUBSTRING_INDEX( SUBSTRING_INDEX(body, 'some link: <a href="', -1) , 'sentby=agent">', 1) AS link
FROM sentEmails
where sent between date_sub(now(),INTERVAL 1 WEEK) and now()
AND body like '%some link:%'
AND toEmail = 'email#gmail.com'
Doing this kind of search is not convenient. As the table with emails grows in size, the query will be less and less performant.
If this is a new application you're building, you're better off keeping a separate table with the list of URLs used in each sent email. You'd write the URLs to the DB as you send the emails.
The reasoning behind this is that the app will search the DB more often than it sends emails. Therefore, by doing a little extra work when sending emails, you help a lot with the most expensive usage of the feature, which is the search.
If you still decide to keep the current approach, you'll want to have an index containing the columns (toEmail, sent) in this order.
Other than that, your approach makes sense and will work. Did you actually try it? Does it work for you?

Get tabledata from html, JSOUP

What is the best way to extract data from a table from an url?
In short, I need to get the actual data from these 2 tables at: http://www.oddsportal.com/sure-bets/
In this example the data would be "Paddy power" and "3.50"
See this image:
(Sorry for posting image like this, but I still need reputation, i will edit later)
http://img837.imageshack.us/img837/3219/odds2.png
I have tried with Jsoup, but I don't know if this is the best way.
I also can't seem to navigate correctly down the tables. I have tried things like this:
tables = doc.getElementsByAttributeValueStarting("class", "center");
link = doc.select("div#col-content > title").first();
String text1 = doc.select("div.odd").text();
The tables selection seems to get some data, but it doesn't include the text in the table.
Sorry, man. The second field you want to retrieve is filled by JavaScript. Jsoup does not execute JavaScript.
To select title of first row you can use:
Document doc = Jsoup.connect("http://www.oddsportal.com/sure-bets/").get();
Elements tables = doc.select("table.table-main").select("tr:eq(2)").select("td:eq(2)");
System.out.println(tables.select("a").attr("title"));
The select calls are chained only to make each step visible.

MySQL Replace query

I have one table, and I need to remove a specific text from a specific field. This field contains a full URL of an image, I need to remove the URL and just keep the image filename.
So:
Current data: fieldname: www.example.com/photo.jpg
What I want to do is remove www.example.com/ from all of the entries for this field.
I know how to use the search and replace function, but I don't know how to leave part of the data intact.
This is what I've used but can't modify it to make it work the way I want:
UPDATE table SET oc_upload1 = REPLACE(oc_upload1,'newtext') WHERE oc_upload1 LIKE "oldtext"
Is this possible? If so, how? Thank you!
This should do:
UPDATE table
SET image = REPLACE(image, 'www.example.com/','')
But it's possible that image contains 'www.example.com/' as part of the image file name, so to be extra safe and remove only the leading www.example.com/:
UPDATE table
SET image = SUBSTRING(image, LENGTH('www.example.com/') + 1)
WHERE image LIKE 'www.example.com/%'
But if you really, really just want the file name and not the path to the file, you can also use:
UPDATE table
SET image = SUBSTRING_INDEX(image,'/',-1)
Note that the above statement will change 'www.example.com/images/01/02/daisy.jpg' to 'daisy.jpg', not 'images/01/02/daisy.jpg'. It also won't change rows that do not contain '/' in image.
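The difference between the three statements is easiest to see on concrete values. Here is a small sketch in Python that mimics each one on a single string; the function names are invented for this illustration and the logic is only a stand-in for the SQL.

```python
PREFIX = "www.example.com/"


def strip_everywhere(image):
    """Like REPLACE(image, 'www.example.com/', ''): removes every occurrence."""
    return image.replace(PREFIX, "")


def strip_leading(image):
    """Like SUBSTRING(image, LENGTH(prefix) + 1) WHERE image LIKE 'prefix%'."""
    return image[len(PREFIX):] if image.startswith(PREFIX) else image


def keep_filename(image):
    """Like SUBSTRING_INDEX(image, '/', -1): keeps only what follows the last slash."""
    return image.rsplit("/", 1)[-1]


if __name__ == "__main__":
    for value in ("www.example.com/photo.jpg",
                  "www.example.com/www.example.com/photo.jpg",
                  "www.example.com/images/01/02/daisy.jpg"):
        print(value, strip_everywhere(value), strip_leading(value), keep_filename(value))
```

All three agree on the simple case and diverge only when the prefix repeats or the path has subdirectories.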

SQL query - Replace/Move some parts of content

I need to update about 2000 records in MySQL
I have a column 'my_content' in table 'my_table' with the following value
Title: some title here<br />Author: John Smith<br />Size: 2MB<br />
I have created 3 new columns (my_title, my_author and my_size) and now I need to separate the content of 'my_content' like this
'my_title'
some title here
'my_author'
John Smith
'my_size'
2MB
As you can imagine the title, author and size are always different for each row.
What I'm thinking is to query the following, but I'm not great at SQL queries and I'm not sure what the actual query would look like.
This is what I'm trying to do:
Within 'my_content' find everything that starts with "title:..." and ends with "...<br />au" and move it to 'my_title'
Within 'my_content' find everything that starts with "thor:..." and ends with "...<br />s" and move it to 'my_author'
Within 'my_content' find everything that starts with "ize:..." and ends with "...<br />" and move it to 'my_size'
I just don't know how to write a query to do this.
Once all the content is in the new columns, I can just find and delete the content that's not needed any more, for example 'thor:' , etc.
You can use INSTR to find the index of your delimiters and SUBSTRING to select out the part you want. So, for instance, the author would be
SUBSTR(my_content,
INSTR(my_content, "Author: ") + 8,
INSTR(my_content, "Size: ") - INSTR(my_content, "Author: ") - 8)
You'd need a bit more work to trim the trailing <br /> and any surrounding whitespace.
Please try the below:
SELECT SUBSTRING(SUBSTRING_INDEX(my_content,'<br />',1),LOCATE('Title: ',my_content)+7) as my_title,
SUBSTRING(SUBSTRING_INDEX(my_content,'<br />',2),LOCATE('Author: ',my_content)+8) as my_author,
SUBSTRING(SUBSTRING_INDEX(my_content,'<br />',3),LOCATE('Size: ',my_content)+6) as my_size
FROM my_table;
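To sanity-check what the three columns should end up containing, the same split can be sketched outside MySQL. This Python version is not a translation of the query above: it assumes every field has the form "Label: value" terminated by <br />, splits on <br /> and then on the first colon. The function name split_content is made up.

```python
def split_content(my_content):
    """Split 'Title: ...<br />Author: ...<br />Size: ...<br />' into a dict."""
    fields = {}
    for chunk in my_content.split("<br />"):
        # Split on the first colon only, so values may themselves contain colons
        label, sep, value = chunk.partition(":")
        if sep:  # skip empty chunks such as the one after the trailing <br />
            fields[label.strip().lower()] = value.strip()
    return fields


if __name__ == "__main__":
    row = "Title: some title here<br />Author: John Smith<br />Size: 2MB<br />"
    print(split_content(row))
```

Comparing a few rows of the SELECT against this output before running an UPDATE is a cheap way to catch rows whose my_content does not follow the expected layout.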