Regular expression in Visual Studio find Replace - html

Hi, I have an HTML file with hundreds of lines like this:
<tr>
<td class="text-column">
Risk
</td>
<td>
7,848,705
</td>
<td>
7,828,750
</td>
<td>
19,955
</td>
</tr>
To save time formatting it, does anyone know a Visual Studio find/replace regular expression that will produce:
<tr>
<td class="text-column">Risk</td>
<td>7,848,705</td>
<td>7,828,750</td>
<td>19,955</td>
</tr>
I plan to fill in the figures with Razor later, and this will ease readability.

Find: {\<[^\>]+\>}[:b\n]*{[^\n]*}[:b\n]*{\</[^\>]+\>}
Replace: \1\2\3
Explanation:
{\<[^\>]+\>} -- capture open tag
[:b\n]* -- discard whitespace
{[^\n]*} -- get contents (assuming no line breaks)
[:b\n]* -- discard whitespace
{\</[^\>]+\>} -- capture closing tag
Not perfect, but it produces the expected output on your sample.
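For anyone who would rather do this in code, here is a minimal Python sketch of the same idea. It is not the asker's eventual solution and not a Visual Studio feature: the pattern is a rough translation of the Find expression above into Python's `re` syntax, and the names `CELL` and `collapse_cells` are made up for illustration. It assumes, like the answer, that each cell's content sits on a single line and contains no nested tags.

```python
import re

# Opening tag, surrounding whitespace/newlines, one line of content, closing tag.
# "[^>/]" after "<" keeps a closing tag from being mistaken for an opening one.
CELL = re.compile(r"(<[^>/][^>]*>)\s*([^\n<]*?)\s*(</[^>]+>)")


def collapse_cells(html: str) -> str:
    """Join <tag>, content and </tag> onto one line, dropping the padding."""
    return CELL.sub(r"\1\2\3", html)


if __name__ == "__main__":
    sample = '<tr>\n<td class="text-column">\nRisk\n</td>\n<td>\n7,848,705\n</td>\n</tr>'
    print(collapse_cells(sample))
```

The `<tr>` lines are left alone because the text directly after them is another opening tag, which the content group refuses to match.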

Did it with code in the end, but thanks to Devon for taking the time.

Related

Linearize specific node of XML/HTML document for vertical editing on Notepad++

I want to modify a huge number of table cells at once, using a column (vertical) selection and a single vertical paste. (The pasted text comes from a list that has a different value on each line.)
I would like to do it inside Notepad++ only without the need to program anything.
What I start with is the normal <tbody> content of a table, like this:
<tbody>
<tr>
<td>
MachineType
</td>
<td>
Yt_GP_MachineType
</td>
<td/>
<td>
MyMachine.MachineType
</td>
</tr>
<tr>
<td>
Variant
</td>
<td>
Yt_GP_Variant
</td>
<td>
</td>
<td>
MyMachine.Variant
</td>
</tr>
<tr>
<td>
Emulation
</td>
<td>
Yt_GP_Emulation
</td>
<td>
</td>
<td>
MyMachine.Emulation
</td>
</tr>
I would like a macro that linearizes each <tr> node and everything below it onto a single line, with the cells aligned, like this:
<tbody>
<tr><td> MachineType </td><td> Yt_GP_MachineType </td><td></td><td> MyMachine.MachineType </td></tr>
<tr><td> Variant </td><td> Yt_GP_Variant </td><td></td><td> MyMachine.Variant </td></tr>
<tr><td> Emulation </td><td> Yt_GP_Emulation </td><td></td><td> MyMachine.Emulation </td></tr>
Note: the automatic alignment of each <td> and </td> node is important, and the "Code alignment" plugin for Notepad++ doesn't work for me when I use Align by... (CTRL+SHIFT+=) with "<". Currently I'm doing this manually...
Note 2: Linearize and Pretty print from the XML Tools plugin don't solve my issue.
The short story - IMHO you should write a script in your favorite scripting language.
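As a rough sketch of that scripted route, the Python below puts each row on one line and pads the cells so the columns line up. It is only an illustration under stated assumptions: the tags are exactly <tr>, <td>, </td>, </tr> with no attributes (as in the posted snippet), and the function name `linearize_rows` is hypothetical. A regex is good enough for markup this regular, but not for arbitrary HTML.

```python
import re


def linearize_rows(html: str) -> str:
    """Return one aligned line per <tr>, assuming attribute-free <tr>/<td> tags."""
    html = html.replace("<td/>", "<td></td>")  # normalize self-closed cells
    rows = []
    for row in re.findall(r"<tr>(.*?)</tr>", html, flags=re.S):
        cells = re.findall(r"<td>(.*?)</td>", row, flags=re.S)
        rows.append([cell.strip() for cell in cells])
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Widest cell text per column; columns that are empty everywhere get width 0.
    widths = [max((len(row[i]) for row in rows if i < len(row)), default=0)
              for i in range(ncols)]
    lines = []
    for row in rows:
        parts = []
        for i, cell in enumerate(row):
            if widths[i]:
                parts.append(f"<td> {cell.ljust(widths[i])} </td>")
            else:
                parts.append("<td></td>")
        lines.append("<tr>" + "".join(parts) + "</tr>")
    return "\n".join(lines)
```

Paste the <tbody> content into a string, call `linearize_rows` on it, and paste the result back; the <tbody> tags themselves are not emitted.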
BTW: spaces within <tr>...</tr> only make manual editing easier; for the rendered layout, use CSS instead, e.g. <tbody style="vertical-align:middle;text-align:center"><tr style="height:100px">
But if you really want to do a special linearize with Notepad++, here is the manual way on Windows. For other operating systems you will have to adapt this.
1. Tidy up inconsistent (!) tags like <td/>. Find what: <td/>, Replace with: <td></td>
2. Tidy trailing spaces (see the HTML snippet you posted): from the main menu, Edit > Blank Operations > Trim Trailing Space
3. Tidy tabs vs. spaces (see the HTML snippet you posted): from the main menu, Edit > Blank Operations > TAB to Space
4. From the main menu, Plugins > XML Tools > Linearize
5. RegEx, make single lines: Find what: <tr>, Replace with: \r\n<tr>
6. RegEx, expand lines (only change tr to td): Find what: <td>, Replace with: \r\n<td>
7. Go to the beginning of the first line, e.g. with Ctrl+Home (Strg+Pos1 on a German keyboard)
8. Align: Plugins > Code alignment > Align by... </td> (You may need to install the plugin first.)
9. If really necessary (for your needs only), insert two spaces. Uncheck Wrap around and try a single replace first! Find what: <td>, Replace with: <td> followed by two spaces. Please note the two spaces here!
10. Repeat (4): from the main menu, Plugins > XML Tools > Linearize
11. Repeat (5): RegEx, make single lines: Find what: <tr>, Replace with: \r\n<tr>
All this is resulting in:

jupyter notebook html table cannot display

Why can't Jupyter Notebook display an HTML table when other HTML elements render fine, and how can I solve this problem?
Below are my source code and the result.
source code
<table style="width:20%">
<tr>
<td> **L1** </td>
<td> 1.1 </td>
</tr>
</table>
result
<tr>
<td> **L1** </td>
<td> 1.1 </td>
</tr>
It only removes the <table> tag and does nothing else.
That's interesting, because the table (created by pasting your code) comes out just fine for me:
From my personal experience, I can definitely say that Markdown in jupyter notebooks is somehow volatile/unpredictable when it comes to HTML. For example, I myself have been looking for a fix for Jupyter just arbitrarily doing line breaks in the table header (esp. in formulas) and I have found this fix, but it doesn't work for me at all.
I know this is not particularly helpful, but at least it demonstrates that the table not being rendered correctly in your case was either some unexpected behaviour or some bug which maybe, or maybe not, has been fixed in the course of the last year.
You can also add the magic command before the table tag, which worked for me.

laravel dompdf not rendering complex html with rowspan and colspan correctly

I have a very complex dynamic table that I need to output to pdf in laravel 5.6. The project I inherited had Dompdf installed and is already rendering all other content. Therefore, I use it as well for compatibility.
My issue is I have a table to render consisting of 13 columns and undefined number of rows, where intermittently a column may span 13 columns for a heading or a row may span several rows at any given time or a colspan within the rowspan that spans 11 columns from the 3rd row. No html is hardcoded except the <table>, <thead>, <th> and <tbody> tags. The html within the tbody tag is dynamically generated depending on the array data.
Everything looks great in the browser, and when I view() the PDF Blade template and print it with Ctrl+P it creates a nice PDF, although for some reason rowspan cells that span onto the next page do not carry over their markup and content. As soon as I try to stream() the PDF, the table becomes warped and looks like a toppled building built by Picasso.
Here are links to the PDFs; the one I printed with Ctrl+P lost its colour because I removed names.
File to view pdf printed with ctrl + p
Pdf streamed with Dompdf
Image of viewing pdf in browser
Image of pdf when streaming via Dompdf:
Html sample rendered in browser:
<tr style="background-color: #5b8969;">
<td rowspan="2" style="background-color: #F8C293; color: black;">Spray 4</td>
<td>Pollinate</td>
<td>7-10 days later</td>
<td>BENOMYL WP 25KG </td>
<td>benomyl 500g/kg</td>
<td> </td>
<td>1000</td>
<td>2.00</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Full bloom</td>
<td>Black Spot</td>
<td>WETCIT DUO 20L </td>
<td>borax 10g/orange oil 50g/l</td>
<td> </td>
<td>1000</td>
<td>25.00</td>
<td>100.0000</td>
<td>120.0000L</td>
<td>2500.0000</td>
<td></td>
<td></td>
</tr>
<tr>
<td colspan="13" style="background-color: #9fb5d3;" class="h3 font-weight-bold">ANOTHER ONE</td>
</tr>
<tr>
<td rowspan="7" style="background-color: #F8C293; color: black;">Spray 7</td>
<td>20 cm</td>
<td>African Armyworm</td>
<td>CERATO 250 EC 5L </td>
<td>pyraclostrobin 250g/l</td>
<td> </td>
<td>1000</td>
<td>2.00</td>
<td>10.0000</td>
<td></td>
<td>20.0000</td>
<td></td>
<td></td>
</tr>
Can someone please help and give me a clue on how to output such a complex table with Dompdf? As I would really want to keep on using only one PDF rendering library in this project.
Otherwise I am open to suggestions to use another pdf library that can handle rowspan that span pages and this complex layout?
Update
Based on a comment by Don't panic, which he subsequently deleted (he suggested validating the HTML and filling empty td tags with &nbsp;):
I rewrote the HTML as a template in my pdf.blade.php view. Now I only output the values in a loop in my view. This makes it easier to maintain and lets me skip the validation he suggested. I also filled every empty <td> tag with a hardcoded '&nbsp;', to see more easily why certain rows end where they should and others do not. The result is sadly still the same: a warped table. But it does seem to be a rowspan issue, not a colspan one. The 'rowspan' rows stack one after another, so maybe a td or tr is missing.
Solved rowspan stacking issue
After two weeks of testing, the only problem was that certain rows' opening tags were not being output, which led to rows not knowing where to begin. Now the only problem left is rowspan across pages.
Update on update
So I have really tried everything I can to get Dompdf to do what it is supposed to do, which is render PDFs. I have read a bit more and found that this library has a long-standing issue of not being able to render rowspans across pages. Therefore, it is on to the next rendering library, wkhtmltopdf, or I could logically divide rowspans so they stop at the end of a page and start again on the new page. Will have to check my watch on this one.
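The "divide rowspans at the page end" idea can be sketched independently of any PDF library. The Python below is only an illustration of the bookkeeping, not something Dompdf provides: it assumes every row has the same height so that a fixed `rows_per_page` is known in advance (a big assumption for a real layout), and both that parameter and the function name `split_rowspans` are hypothetical. The same loop would port directly to PHP before the Blade view emits the rows.

```python
def split_rowspans(groups, rows_per_page):
    """Split (label, row_count) groups so that no rowspan crosses a page boundary.

    Returns a list of (label, rowspan) segments in output order.
    """
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    segments = []
    used = 0  # rows already placed on the current page
    for label, count in groups:
        while count > 0:
            room = rows_per_page - used      # rows left on this page
            take = min(count, room)          # as much of the group as fits
            segments.append((label, take))
            count -= take
            used = (used + take) % rows_per_page  # wraps to 0 on a full page
    return segments


if __name__ == "__main__":
    # Two groups like the "Spray 4" (2 rows) and "Spray 7" (7 rows) cells above,
    # with an assumed 5 rows per page.
    print(split_rowspans([("Spray 4", 2), ("Spray 7", 7)], rows_per_page=5))
```

Each returned segment would become its own label cell with the reduced rowspan, so the label repeats at the top of the next page instead of relying on the renderer to carry it over.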

AMPScript: trying to insert hidden fields

I'm working on some emails that will be deployed via ExactTarget. We have a lot of AMPscript dictating what goes on within the emails. The content blocks of the email are filled dynamically, and when a field is left empty a call is still made to that table section, which then inserts a blank space in the email, throwing the design out of whack.
My question is: is there any way I can have those empty cells completely removed from the page when not in use?
Here is the code sample:
Set @SendLog_blockC1 = lookup("RaceDataSendLog","BLK_C1","SubID",@SubLookup,"JobID",@JobLookup,"BatchID",@BatchLookup)
...
...
...
Set @blockC1 = Concat("My Contents\Newsletter\",@SendLog_blockC1)
....
....
....
....
%%[IF empty(@blockC1) THEN]%%
%%[ELSE]%%
<tr>
<td align="left" valign="top" >
%%=ContentAreaByName(@blockC1,"",0)=%%
</td>
</tr>
%%[ENDIF]%%
Thank you in advance.
Assuming you're referring to the space above your ELSE, this should work:
<!--%%[
IF empty(@blockC1) THEN
ELSE]%%-->
<tr>
<td align="left" valign="top" >
%%=ContentAreaByName(@blockC1,"",0)=%%
</td>
</tr>
<!--%%[ENDIF]%%-->
This will hide the AMPscript in the HTML. In addition, you don't really need the IF to produce the space; you can just have the ELSE right after it.
Thank you for your responses. I was finally able to resolve the issue of the extra spacing.
What I did was remove the <tr> and <td> tags from around the if/else statements. I then placed those <tr> and <td> tags around the content blocks that the PMs bring in when they decide which blocks to use. This solved the problem of the extra spacing. Client is happy!!!
Thanks again guys!!!

Looking for an HTML parser to do search/replaces on text nodes

I need to do a lot of various search and replaces within A LOT of static HTML files. One issue I'm running into is that I'm getting matches in URLs when really all I want to search/replace are text nodes.
So that makes regular expressions more difficult and most likely more error prone since you're parsing html with them now.
What's the easiest way to do search/replaces on only text nodes? I'm talking like you can be up and running within a couple minutes with no Master's required in Python-Java-Ruby-Headless-Phantom-PHP-Node-FluxCapacitor.
Please give advice as though you're speaking to a moron.
I'm on Windows 7.
What I'm looking for is something like the search/replace functionality in Notepad++. You give it a directory to start searching, and it searches recursively, hitting every type of file you specify (like .html or .shtml). You tell it what to search for and what to replace it with. It runs, and 10 or 15 seconds later you might have edited hundreds of files in one fell swoop. You know, dead simple stuff.
So that's what I want to do, but just searching/replacing within text nodes.
Sublime Text 2 has some very powerful text-searching features that should let you do what you describe. I think I can point you in the right direction, although I'm still learning how to use it myself: it has a "Find in Files" option, which means you can find the selected word across many different files and replace it. I haven't found a way to exclude the irrelevant matches that don't need changing, though. Hopefully someone else will come along and enlighten you.
You may want to add the tag "Sublime Text 2" to your original post to broaden the audience.
You can use Python and HTQL at http://htql.net. Some examples:
page="<html> <body> <table> <tr><td id='cell1'> test1 </td></tr> <tr> <td id='cell2'> test2 </td> </tr> </table> </body> </html>"
import htql
print(htql.query(page, "<td (id='cell1')>:tx &replace('XXXX') "))
#[("<html> <body> <table> <tr><td id='cell1'>XXXX</td></tr> <tr> <td id='cell2'> test2 </td> </tr> </table> </body> </html>",)]
print(htql.query(page, "<td (id='cell1')>:id &replace('ZZZZ') "))
#[("<html> <body> <table> <tr><td id='ZZZZ'> test1 </td></tr> <tr> <td id='cell2'> test2 </td> </tr> </table> </body> </html>",)]
print(htql.query(page, "<td (id like 'cell%')>:tx &replace('YYYY') "))
#[("<html> <body> <table> <tr><td id='cell1'>YYYY</td></tr> <tr> <td id='cell2'>YYYY</td> </tr> </table> </body> </html>",)]
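If installing HTQL is not an option, a similar text-node-only replace can be sketched with nothing but Python's standard library. This is an illustrative alternative, not part of HTQL: the class and function names are made up, and the sketch re-emits start tags verbatim via `HTMLParser.get_starttag_text()` so attributes such as URLs are never touched. Known limitations: end tags are re-emitted in lowercase, and the contents of <script> and <style> are passed through without replacement.

```python
from html.parser import HTMLParser


class TextNodeReplacer(HTMLParser):
    """Re-emit HTML as parsed, applying a search/replace to text nodes only."""

    def __init__(self, old: str, new: str):
        super().__init__(convert_charrefs=False)  # keep &amp; etc. as written
        self.old, self.new = old, new
        self.out = []
        self.raw = False  # True while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        self.raw = tag in ("script", "style")
        self.out.append(self.get_starttag_text())  # raw tag, attributes untouched

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.raw = False
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data if self.raw else data.replace(self.old, self.new))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def replace_in_text_nodes(html: str, old: str, new: str) -> str:
    parser = TextNodeReplacer(old, new)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

To get the Notepad++-style directory run, loop over `pathlib.Path(root).rglob("*.html")`, read each file, pass its text through `replace_in_text_nodes`, and write it back; try it on a copy of the directory first.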