Hopefully I can make this understandable as I am not a programmer.
I have an email whose header information I believe is correct, but whose content I believe has been altered.
In the content portion of the header of every Gmail message I have sent or forwarded, there is a plain-text version of the email content and an HTML version. The HTML version is limited to 76 characters per line, and each line ends with an =.
This happens whether the line ended on a complete sentence or the email servers cut a word into 2 sections, like this:
"little boy blue ju=
mped over the moon"=
When the line hit 75 characters, it inserted the = sign and moved the rest of the word to the next line.
This creates a perfectly square box of text.
Now, I have an email where I believe the almost clever individual attempted to spoof the content part of the header data, but instead ended each sentence with =20 and then moved to the next line. No words were cropped, and the content is not in a perfect square as it is in every other email header I have inspected.
Also, when an email is forwarded, each line is prefaced with a > or a >>, which is not the case in this particular email.
Additionally, in the HTML portion of the emails, the paragraph will end with additional coding such as...
In the HTML code used in Gmail's headers, what does the =20 designate?
Also, based on the minimal amount of information I have supplied, am I correct in believing the content may be spoofed?
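For reference, the trailing = and the =20 described above are both part of the quoted-printable transfer encoding that mail software commonly applies to message bodies: a bare = at the end of a line is a "soft line break" (the line continues), and =20 is the encoded form of a space character (hex 20). The sketch below uses Python's standard quopri module to decode two sample lines; the second sample string is made up for illustration. It only shows what the markers mean and says nothing about whether a particular message was altered, since different mail clients wrap lines differently.

```python
import quopri

# A bare "=" at the end of a line is a soft line break:
# the decoder joins the two lines back into one.
wrapped = b"little boy blue ju=\nmped over the moon"

# "=20" is an encoded space (hex 20). Encoders use it to protect a
# space that would otherwise sit at the very end of a line.
trailing_space = b"little boy blue jumped=20\nover the moon"

decoded_wrapped = quopri.decodestring(wrapped)
decoded_trailing = quopri.decodestring(trailing_space)

print(decoded_wrapped)   # the cut word is rejoined
print(decoded_trailing)  # the real line break is kept, preceded by a space
```

In the first case the line break was inserted by the encoder and disappears on decoding; in the second the line break was in the original text and survives.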
I am generating Word docs from html. Basically, I build a file with html and save it as a .doc. Then I open it in Word and apply a template. All good so far.
I would like to automatically generate a custom TOC via the HTML, i.e. when I am building the document. I need to insert a field code to do that, in the same way I do to add page numbering via the HTML, e.g.:
<span style="mso-field-code: PAGE " class="page-field"></span>
If I save my html doc as docx and apply a template, I can make a TOC based on the styles in the way one would normally create a TOC in Word. I customised the TOC so the Title style is the top level, followed by H1, H2, then H3. If I then toggle the field code on the TOC, the field code looks like this:
{ TOC \t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1" }
Now, I can add HTML like this to insert the TOC:
<div style="mso-field-code: TOC " class="toc-field">TOC goes HERE</div>
When I do that, if I right click the text "TOC goes HERE" I get the option to "Update field" and if I do that a TOC is generated using the default H1,H2,H3 tags.
But, what I can't work out is how to include the
\t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"
part so my custom style sequence is applied. I have tried all sorts of combinations and it seems that adding anything after TOC causes Word to not make a field code.
Does anyone have any suggestions?
Update:
Based on the essential help from #slightlysnarky below, I thought I would summarise the outcome here because the information I needed was in a Microsoft chm file that was taken down many years ago. If you read the following extract from that help manual and compare it to the solution below you will see how this all works.
Word marks and stores information for simple fields by means of the Span element with the mso-field-code style. The mso-field-code value represents the string value of the field code. Formatting in the original field code might be lost when saving as HTML if only the string value of the code is necessary for its calculation.
Word has a different way of storing field information to HTML for more complex fields, such as ones that have formatted text or long values. Word marks these fields with <!--[if supportFields]> so the data is not displayed in the browser. Word uses the Span element with the mso-element: field-begin, mso-element: field-separator, and mso-element: field-end attributes to contain the three respective parts of the field code: the field start, the separator between field code and field results, and the field end. Whenever possible, Word will save the field to HTML in the method that uses the least file space.
So, basically, add tags as shown below to your HTML at the point you wish the TOC to appear.
:-)
Word recognises a "complex field format" in HTML, along the same lines as it does in the Office Open XML format. So you can use
<span style='mso-element:field-begin'></span>TOC \t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"
<span style='mso-element:field-separator'></span>This text will show but the user will need to update the field
<span style='mso-element:field-end'></span>
This construct is outlined in a Microsoft document called "Microsoft Office HTML and XML Reference". It's a Windows .exe that unpacks to a .chm Help file. You can get it here
The info. on encoding fields is in Getting Started with Microsoft Office 2000 HTML and XML->Microsoft Word->Fields
There may be a later version but that's the only one I could find.
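Since the document is being generated as HTML anyway, the three-span construct above could be produced by a small helper. This is only a sketch in Python: the function name word_field_html and the placeholder text are hypothetical, and it assumes Word accepts the spans written back to back on one line exactly as it accepts them on separate lines.

```python
import html


def word_field_html(field_code: str, placeholder: str) -> str:
    """Build the begin / code / separator / result / end markup for a Word field.

    field_code is written out unchanged (it may contain quotes and
    backslash switches); placeholder is the text shown until the user
    updates the field, escaped so it cannot break the surrounding HTML.
    """
    return (
        "<span style='mso-element:field-begin'></span>"
        + field_code
        + "<span style='mso-element:field-separator'></span>"
        + html.escape(placeholder, quote=False)
        + "<span style='mso-element:field-end'></span>"
    )


# The custom TOC switch from the question: Title first, then Heading 1-3.
toc_code = 'TOC \\t "Heading 1,2,Heading 2,3,Heading 3,4,Title,1"'
toc_html = word_field_html(toc_code, "Right-click and choose Update field")
print(toc_html)
```

The same helper would work for other field codes, such as PAGE, by passing a different field_code string.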
I would like to replace the text in a google doc. At the moment I have place markers as follows
Invoice ##invoiceNumber##
I replace the invoice number with
body.replaceText('##invoiceNumber##',invoiceNumber);
This works, but I can only run the script once, because ##invoiceNumber## is then no longer in the document. I was thinking I could replace the text after "Invoice" instead, as that will stay the same. appendParagraph looks like it might do the trick, but I can't figure it out. I think something like body.appendParagraph("Invoice") would select the area? I'm not sure how to append to it after that.
You could try something like this I think:
body.replaceText('InvoiceNumber \\w{1,9} ','InvoiceNumber ' + invoiceNumber + ' ');
I don't know how big your invoice numbers are, but that will accept from 1 to 9 word characters preceded by a space and followed by a space. That pattern might have to be modified depending upon your textual needs.
Word Characters [A-Za-z0-9_]
If your invoice numbers are unique enough perhaps you could just replace them.
Reference
Regular Expression Syntax
Note: the regex pattern is passed as a string rather than a regular expression
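The idea of matching the fixed label plus whatever currently follows it can be tried outside Google Docs. The sketch below uses Python's re module rather than Apps Script, purely to show the pattern logic; the function name set_invoice_number and the sample text are hypothetical, and it assumes the label is "Invoice " as in the question and that invoice numbers are 1 to 9 word characters.

```python
import re


def set_invoice_number(text: str, invoice_number: str) -> str:
    # Match the fixed label followed by either the original placeholder
    # or a previously inserted number of 1 to 9 word characters.
    pattern = r"Invoice (?:##invoiceNumber##|\w{1,9}\b)"
    # A function replacement avoids surprises if the number contains
    # characters that re.sub would treat as escapes.
    return re.sub(pattern, lambda match: "Invoice " + invoice_number, text)


doc = "Invoice ##invoiceNumber##\nThank you for your business."
first = set_invoice_number(doc, "A1001")     # first run replaces the placeholder
second = set_invoice_number(first, "A1002")  # second run replaces the old number
print(second)
```

Because the label itself is never removed, the replacement can be run any number of times. Note that any other "Invoice " followed by a short word elsewhere in the text would also match, so the label may need to be more specific.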
I'm just wondering how to delete the same, yet different html code in an html page. Or to be able to do this for multiple pages at once too if that's possible.
What I mean by this is html code that has the same beginning, and same end, but not the same middle content between them compared to other .html pages.
The middle may seem similar, but is actually different across all html documents, such as a slight change in a link from page to page.
Or, the middle's code can be entirely different compared to other documents, diverging.
Is there any tool out there where you can specify delete from <span STYLE= to </span> where the middle content is "font-size: x-small; color: #90c040">example link with the middle content varying between different html pages?
It would be great if you could specify the beginning and end code to be deleted, delete everything in between, and do this with one button push, with the parameters saved so you don't have to enter them every time.
Ideally it would also handle multiple html pages at once, either by selecting a whole bunch manually or by specifying a folder, looking at every html page in that folder, and deleting the html code where it exists (if it doesn't exist, it moves on to the next file).
Thanks! I'm just wondering. Any help is much appreciated! ^_^~
~Update! I've found a program that works!~
I found a link with the programs, Notepad++, TextCrawler, Search & Replace Master, Ecobyte Replace Text, and InfoRapid Search & Replace. I also found multiple file search and replace.
− Notepad++ didn't allow wildcard * or start/end functions.
− TextCrawler as well as InfoRapid Search & Replace didn't work.
− Search & Replace Master was finicky. It didn't work at first, then it did after re-opening the program.
− Ecobyte Replace Text worked the best. This deleted everything beginning to end that I didn't want across many different .html files. I could specify what I wanted with the 'range function'.
− Multiple file search and replace worked too, but functioned differently. If you're looking to keep the beginning and end code, but not what's in the middle, then this one would work for you.
Examples:
Ecobyte will delete everything from <span STYLE= through the middle content to </span>,
leaving you with none of that code remaining.
Multiple File Search will not delete <span STYLE= and </span>, but it will delete all of the middle content in between.
This leaves you with <span STYLE= and </span>, with nothing remaining between the beginning and end code you specified.
I hope this helps anyone else looking to delete code with the same beginning and end, but different middle code. Cheers! ^_^~
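For anyone who would rather script this than install a tool, the two behaviours described above (delete the markers and the middle, or keep the markers and delete only the middle) can be sketched in Python. The function names and the keep_markers flag are hypothetical, the files are assumed to be UTF-8, and the match is non-greedy, so it stops at the first end marker and does not handle nested spans. Files are overwritten in place, so work on a backup copy.

```python
import re
from pathlib import Path


def strip_block(html: str, start: str, end: str, keep_markers: bool = False) -> str:
    # Non-greedy match from the start marker to the first end marker,
    # across line breaks, whatever the middle content is.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    replacement = start + end if keep_markers else ""
    return pattern.sub(lambda match: replacement, html)


def strip_block_in_folder(folder: str, start: str, end: str, keep_markers: bool = False) -> int:
    # Apply strip_block to every .html file in the folder and
    # return how many files were actually changed.
    changed = 0
    for path in Path(folder).glob("*.html"):
        original = path.read_text(encoding="utf-8")
        cleaned = strip_block(original, start, end, keep_markers)
        if cleaned != original:
            path.write_text(cleaned, encoding="utf-8")
            changed += 1
    return changed


sample = 'before <span STYLE="font-size: x-small; color: #90c040">example link</span> after'
removed = strip_block(sample, "<span STYLE=", "</span>")
kept = strip_block(sample, "<span STYLE=", "</span>", keep_markers=True)
print(removed)
print(kept)
```

Files that do not contain the markers are left untouched, which matches the "move to the next file" behaviour asked for in the question.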
I'm having issues with an email signature I'm creating with HTML. I've linked an image from another topic (dated 2013) with the same issue, which never got a fix.
When replying to an email, the quoted text appears beside the signature, instead of underneath. I've tried 100% width on numerous elements, to the same issue. Breaks also don't work (trimmed in Outlook?).
I've created the signature with tables. Formatting the html doc in Word (setting table word wrap to none) will fix the issue - but I don't want the mass amounts of fluff associated with the Word created html.
Any help would be greatly appreciated.
Source Topic
Edit: Fixed - I had to nest tables. Surrounded my signature table with a <table><tr><td width="100%"> ... </td></tr></table>
I was having the same problem and realized I had caused it by using the
<table align="left">
attribute, which forced the quoted e-mail to appear to the right of my signature table. Once I removed the align attribute it worked perfectly, without using nested tables.
To resolve the issue, I selected the full signature in the Word document:
Right click > Table Properties> Text Wrapping> None.
This moves the original message below the signature as intended.
I am posting HTML data from an input text field called "textbox" to a backend application. The backend application (a Django view) receives the data, but it is garbled with random equals "=" characters in between, even though the html content in "textbox" was perfectly fine before posting.
I suspect this is a problem with the encoding of POST data, but I am not able to figure out a solution to avoid this.
The textbox data can have any html data, and special characters like <, >, {, }.
To summarize the problem:
The text data like:
<p>This is a <b>sample</b> text<p>
<p> This is the second line </p>
becomes something like this when I check the request.POST["textbox"] value in the Django view:
<p>Th=is is a <b>sample</b> tex=t<p>
<p> This i=s the se=cond line </p>
Is anyone facing a similar problem? I did not find any related questions on Stack Overflow. AFAIK this problem might not be specific to Django, but I am adding that information in case it's useful.