Validation/Format of display-name in from header - smtp

I need to know the rules for validating and formatting the From (name-addr) field in an email.
The RFC explains the format of name-addr, but does not go into detail about the display-name.
Like this:
From: John Q. Public <JQP@bar.com>
I want to know the characters and length allowed.
How do I know that John Q. Public has valid characters?
Should I allow only printable US-ASCII characters ?
I consulted RFC 2822 and did not find anything on the specific format of a display name.

This is all defined in the RFC you linked to in your question (btw, the newer version of this document is RFC 5322):
display-name    =       phrase
phrase          =       1*word / obs-phrase
word            =       atom / quoted-string
atom            =       [CFWS] 1*atext [CFWS]
atext           =       ALPHA / DIGIT / ; Any character except controls,
                        "!" / "#" /     ;  SP, and specials.
                        "$" / "%" /     ;  Used for atoms
                        "&" / "'" /
                        "*" / "+" /
                        "-" / "/" /
                        "=" / "?" /
                        "^" / "_" /
                        "`" / "{" /
                        "|" / "}" /
                        "~"
specials        =       "(" / ")" /     ; Special characters used in
                        "<" / ">" /     ;  other parts of the syntax
                        "[" / "]" /
                        ":" / ";" /
                        "@" / "\" /
                        "," / "." /
                        DQUOTE
You have to jump around in the document a bit to find the definitions of each of these token types, but they are all there.
Once you have the definitions, all you need to do is scan over your name string and see if it consists only of the valid characters.
According to the definitions, a display-name is a phrase and a phrase is 1-or-more word tokens (or an obs-phrase, which I'll ignore for now to make this explanation simpler).
A word token can be either an atom or a quoted-string.
In your example, John Q. Public contains a special character, ".", which cannot appear within an atom token. What about a quoted-string token? Well, let's see...
quoted-string   =       [CFWS]
                        DQUOTE *([FWS] qcontent) [FWS] DQUOTE
                        [CFWS]
qcontent        =       qtext / quoted-pair
qtext           =       NO-WS-CTL /     ; Non white space controls
                        %d33 /          ; The rest of the US-ASCII
                        %d35-91 /       ;  characters not including "\"
                        %d93-126        ;  or the quote character
Based on this, we can tell that a "." is allowed within a quoted-string, so... the correct formatting for your display-name can be any of the following:
From: "John Q. Public" <JQB@bar.com>
or
From: John "Q." Public <JQB@bar.com>
or
From: "John Q." Public <JQB@bar.com>
or
From: John "Q. Public" <JQB@bar.com>
Any one of those will work.
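The scan described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `format_display_name` is made up for this example, input is restricted to printable US-ASCII, and the obsolete syntax (obs-phrase) and comments (CFWS) are ignored. If every whitespace-separated word is a valid atom the name is returned unchanged; otherwise the whole name is wrapped in a quoted-string.

```python
import re

# atext from the grammar above: ALPHA / DIGIT / the listed punctuation.
ATOM = re.compile(r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]+")


def format_display_name(name):
    """Return `name` as a display-name, quoting it only when needed.

    Hypothetical helper for illustration; not part of any library.
    """
    if not name.strip():
        raise ValueError("a phrase needs at least one word")
    # Assumption: only printable US-ASCII (32..126) is accepted here.
    if any(not 32 <= ord(ch) <= 126 for ch in name):
        raise ValueError("only printable US-ASCII is handled in this sketch")

    # Every word is an atom: the name is already a valid phrase.
    if all(ATOM.fullmatch(word) for word in name.split()):
        return name

    # Otherwise emit one quoted-string; backslash and DQUOTE are not
    # allowed as qtext, so they are written as quoted-pairs.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print("From: " + format_display_name("John Q. Public") + " <JQP@bar.com>")
```

Quoting the whole name is just the simplest of the equivalent forms listed above. Python's standard `email.utils.formataddr` applies similar quoting when a name contains specials.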

Why is isspace() returning false for strings from the docx python library that are empty?

My objective is to extract strings from numbered/bulleted lists in multiple Microsoft Word documents, then to organize those strings into a single, one-line string where each string is ordered in the following manner: 1.string1 2.string2 3.string3 etc. I refer to these one-line strings as procedures, consisting of 'steps' 1., 2., 3., etc.
The reason it has to be in this format is because the procedure strings are being put into a database, the database is used to create Excel spreadsheet outputs, a formatting macro is used on the spreadsheets, and the procedure strings in question have to be in this format in order for that macro to work properly.
The numbered/bulleted lists in MSword are all similar in format, but some use numbers, some use bullets, and some have extra line spaces before the first point, or extra line spaces after the last point.
The following text shows three different examples of how the Word documents are formatted:
Paragraph Keyword 1: arbitrary text
1. Step 1
2. Step 2
3. Step 3
Paragraph Keyword 2: arbitrary text
Paragraph Keyword 3: arbitrary text
• Step 1
• Step 2
• Step 3
Paragraph Keyword 4: arbitrary text
Paragraph Keyword 5: arbitrary text
Step 1
Step 2
Step 3
Paragraph Keyword 6: arbitrary text
(For some reason the first two lists didn't get indented in the formatting of the post, but in my word document all the indentation is the same)
When the numbered/bulleted list is formatted without extra blank lines, my code works fine, e.g. between "Paragraph Keyword 1:" and "Paragraph Keyword 2:".
I was trying to use isspace() to isolate the instances where there are extra line spaces that aren't part of the list that I want to include in my procedure strings.
Here is my code:
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
def extractStrings(file):
    doc = file
    for i in range(len(doc.paragraphs)):
        str1 = doc.paragraphs[i].text
        if "Paragraph Keyword 1:" in str1:
            start1=i
        if "Paragraph Keyword 2:" in str1:
            finish1=i
        if "Paragraph Keyword 3:" in str1:
            start2=i
        if "Paragraph Keyword 4:" in str1:
            finish2=i
        if "Paragraph Keyword 5:" in str1:
            start3=i
        if "Paragraph Keyword 6:" in str1:
            finish3=i
    print("----------------------------")
    procedure1 = ""
    y=1
    for x in range(start1 + 1, finish1):
        temp = str((doc.paragraphs[x].text))
        print(temp)
        if not temp.isspace():
            if y > 1:
                procedure1 = (procedure1 + " " + str(y) + "." + temp)
            else:
                procedure1 = (procedure1 + str(y) + "." + temp)
            y=y+1
        print(procedure1)
    print("----------------------------")
    procedure2 = ""
    y=1
    for x in range(start2 + 1, finish2):
        temp = str((doc.paragraphs[x].text))
        print(temp)
        if not temp.isspace():
            if y > 1:
                procedure2 = (procedure2 + " " + str(y) + "." + temp)
            else:
                procedure2 = (procedure2 + str(y) + "." + temp)
            y=y+1
        print(procedure2)
    print("----------------------------")
    procedure3 = ""
    y=1
    for x in range(start3 + 1, finish3):
        temp = str((doc.paragraphs[x].text))
        print(temp)
        if not temp.isspace():
            if y > 1:
                procedure3 = (procedure3 + " " + str(y) + "." + temp)
            else:
                procedure3 = (procedure3 + str(y) + "." + temp)
            y=y+1
        print(procedure3)
    print("----------------------------")
    del doc
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
import docx
doc1 = docx.Document("docx_isspace_experiment_042420.docx")
extractStrings(doc1)
del doc1
Unfortunately I have no way of putting the output into this post, but the problem is that whenever there is a blank line in the word doc, isspace() returns false, and a number "x." is assigned to empty space, so I end up with something like: 1. 2.Step 1 3.Step 2 4.Step 3 5. 6. (that's the last iteration of print(procedure3) from the code)
The problem is that isspace() is returning false even when my python console output shows that the string is just a blank line.
Am I using isspace() incorrectly? Is there something in the string I am not detecting that is causing isspace() to return false? Is there a better way to accomplish this?
Use the test:
# --- for s a str value, like paragraph.text ---
if s.strip() == "":
    print("s is a blank line")
str.isspace() returns True if the string contains only whitespace. An empty str contains nothing, and so therefore does not contain whitespace.
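A small sketch of that difference, and of how the test could be folded into the step numbering. The helper name `build_procedure` and the sample paragraph texts are made up for illustration; in the real script the texts would come from `paragraph.text`.

```python
# An empty string has no characters at all, so isspace() is False for it,
# while a string made only of spaces, tabs, or newlines gives True.
print("".isspace())      # False
print("  \t".isspace())  # True


def build_procedure(paragraph_texts):
    """Join the non-blank paragraph texts as '1.step 2.step ...'.

    Hypothetical helper; strip() treats both empty and
    whitespace-only paragraphs as blank.
    """
    steps = [text.strip() for text in paragraph_texts if text.strip()]
    return " ".join(
        str(number) + "." + step for number, step in enumerate(steps, start=1)
    )


# Blank lines before and after the list, as in the third example.
sample = ["", "Step 1", "Step 2", "Step 3", "", "   "]
print(build_procedure(sample))
```

Because blank paragraphs are filtered out before numbering, the counter only advances for real steps.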

SSRS adding extra characters when exporting to CSV

I am creating an expression combining multiple fields into a single field in SSRS. However when I am exporting to CSV, some of the records are being appended with additional double quotes. How do I fix this?
Original Data:
SCode|BuildingNumber|StreetName|City|State|
---------------------------------------------
ABC| |123 Street|New York |NY|
DEF| |456 Street|Los Angeles|CA|
IJK|123|7th Ave |Chicago |IL|
XYZ| |789 Cir |Atlanta |GA|
Expression I am using:
=Fields!SCode.Value & "#" & IIF(IsNothing(Fields!BuildingNumber.Value), Fields!StreetName.Value, Fields!BuildingNumber.Value & "\," & Fields!StreetName.Value) & "#" & Fields!City.Value & "#" & Fields!State.Value
Data after exporting to CSV:
ABC#123 Street#New York#NY
DEF#456 Street#Los Angeles#CA
"IJK#123, 7th Ave#Chicago#IL"
XYZ#789 Cir#Atlanta#GA
Thanks!
The CSV export should only add the text qualifier around a field if the field contains a delimiter character (a comma) or some sort of return character.
Text qualifiers are added only when the value contains the delimiter
character or when the value has a line break.
MS Docs
Check your text for commas, return characters, and line feeds.
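In your sample, the IJK row is the only one with a comma (the one your expression inserts between BuildingNumber and StreetName), and it is the one that gets quoted. Rows without a comma can still be quoted if they contain a return or line feed.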
SELECT *
FROM TABLE
WHERE FIELD LIKE '%' + CHAR(13) + '%'
OR FIELD LIKE '%' + CHAR(10) + '%'
The line feed and return characters are character numbers 10 and 13 in ASCII.
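The same qualifier rule can be seen outside SSRS. The sketch below uses Python's standard `csv` module with minimal quoting as an analogy only; it is not the SSRS renderer, and `render_row` is a made-up helper name. A field is wrapped in quotes only when it contains the delimiter, the quote character, or a line break.

```python
import csv
import io


def render_row(fields):
    """Render one CSV row using minimal quoting (illustrative helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, quoting=csv.QUOTE_MINIMAL).writerow(fields)
    return buffer.getvalue()


# No comma and no line break: written as-is.
print(repr(render_row(["ABC#123 Street#New York#NY"])))

# Contains a comma: wrapped in double quotes.
print(repr(render_row(["IJK#123, 7th Ave#Chicago#IL"])))

# Contains a line feed (character 10): also wrapped in double quotes.
print(repr(render_row(["XYZ#789 Cir\n#Atlanta#GA"])))
```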

Appendheader needs to append string, variable and then string again

Hopefully this is an easy one for someone out there. I need to append a long command that has strings and variables in it.
this->AppendHeader("Content-Range", "bytes " + offset "-" + (offset + part_size - 1) "/" + file_size);
This is not acceptable in C++. How can I format the above so the Header looks like
Content-Range: bytes 0-19/40
(just a fyi - offset is 0, part_size is 20 and file_size is 40)

Is "fr, en; q=0.3" a valid Accept-Language value?

According to the RFC, it's okay to have *LWS between words and separators. However, if you look at the specific ABNF for matching the Accept-Language field, it doesn't allow for whitespace around the ; character.
Here is the exact LWS specification:
implied *LWS: The grammar described by this specification is
word-based. Except where noted otherwise, linear white space (LWS) can
be included between any two adjacent words (token or quoted-string),
and between adjacent words and separators, without changing the
interpretation of a field. At least one delimiter (LWS and/or
separators) MUST exist between any two tokens (for the definition of
"token" below), since they would otherwise be interpreted as a single
token.
Here is the ABNF grammar:
Accept-Language = "Accept-Language" ":"
1#( language-range [ ";" "q" "=" qvalue ] )
language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )
I found out that there is a more recent RFC which clarifies this.
weight = OWS ";" OWS "q=" qvalue
qvalue = ( "0" [ "." 0*3DIGIT ] )
/ ( "1" [ "." 0*3("0") ] )
https://greenbytes.de/tech/webdav/rfc7231.html#quality.values
Thus the answer is YES, it is valid.
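As a rough illustration of the newer grammar, here is a Python sketch that accepts optional whitespace (OWS) around the ";" before the weight. The function name `parse_accept_language` is made up, the language-range pattern is a simplified reading of the grammar, and this is not a complete header parser.

```python
import re

OWS = r"[ \t]*"
# qvalue = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
QVALUE = r"(?:0(?:\.[0-9]{0,3})?|1(?:\.0{0,3})?)"
# Simplified language-range: 1-8 letters, optional "-" subtags, or "*".
LANGUAGE_RANGE = r"(?:[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*|\*)"
ELEMENT = re.compile(
    "(" + LANGUAGE_RANGE + ")(?:" + OWS + ";" + OWS + "q=(" + QVALUE + "))?"
)


def parse_accept_language(value):
    """Return a list of (language-range, weight) pairs.

    Hypothetical helper; raises ValueError on an element that does not
    match the simplified grammar above.
    """
    result = []
    for item in value.split(","):
        item = item.strip(" \t")
        if not item:
            continue  # tolerate empty list elements
        match = ELEMENT.fullmatch(item)
        if match is None:
            raise ValueError("invalid element: " + repr(item))
        weight = float(match.group(2)) if match.group(2) else 1.0
        result.append((match.group(1), weight))
    return result


print(parse_accept_language("fr, en; q=0.3"))
```

A missing weight is treated as 1, which is the default quality value.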

What characters have to be escaped to prevent (My)SQL injections?

I'm using MySQL API's function
mysql_real_escape_string()
Based on the documentation, it escapes the following characters:
\0
\n
\r
\
'
"
\Z
Now, I looked into OWASP.org's ESAPI security library and in the Python port it had the following code (http://code.google.com/p/owasp-esapi-python/source/browse/esapi/codecs/mysql.py):
"""
Encodes a character for MySQL.
"""
lookup = {
    0x00 : "\\0",
    0x08 : "\\b",
    0x09 : "\\t",
    0x0a : "\\n",
    0x0d : "\\r",
    0x1a : "\\Z",
    0x22 : '\\"',
    0x25 : "\\%",
    0x27 : "\\'",
    0x5c : "\\\\",
    0x5f : "\\_",
}
Now, I'm wondering whether all of those characters really need to be escaped. I understand why % and _ are there (they are metacharacters in the LIKE operator), but I can't understand why they added the backspace and tab characters (\b \t). Is there a security issue if you run a query like:
SELECT a FROM b WHERE c = '...user input ...';
Where user input contains tabulators or backspace characters?
My question is here: Why did they include \b \t in the ESAPI security library? Are there any situations where you might need to escape those characters?
A guess concerning the backspace character: Imagine I send you an email "Hi, here's the query to update your DB as you wanted" and an attached textfile with
INSERT INTO students VALUES ("Bobby Tables",12,"abc",3.6);
You cat the file, see it's okay, and just pipe the file to MySQL. What you didn't know, however, was that I put
DROP TABLE students;\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b\b
before the INSERT STATEMENT which you didn't see because on console output the backspaces overwrote it. Bamm!
Just a guess, though.
Edit (couldn't resist):
The MySQL manual page for strings says:
\0   An ASCII NUL (0x00) character.
\'   A single quote (“'”) character.
\"   A double quote (“"”) character.
\b   A backspace character.
\n   A newline (linefeed) character.
\r   A carriage return character.
\t   A tab character.
\Z   ASCII 26 (Control-Z). See note following the table.
\\   A backslash (“\”) character.
\%   A “%” character. See note following the table.
\_   A “_” character. See note following the table.
Blacklisting (identifying bad characters) is never the way to go, if you have any other options.
You need to use a combination of whitelisting and, more importantly, bound-parameter approaches.
Whilst this particular answer has a PHP focus, it still helps plenty and will help explain that just running a string through a char filter doesn't work in many cases. Please, please see "Do htmlspecialchars and mysql_real_escape_string keep my PHP code safe from injection?"
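A minimal sketch of the bound-parameter approach, using Python's built-in `sqlite3` as a stand-in so the example is self-contained. MySQL drivers follow the same idea, though their placeholder syntax may differ. The table and column names reuse the `b`, `a`, `c` from the question, and `find_by_c` is a made-up helper.

```python
import sqlite3


def find_by_c(conn, user_input):
    """Look up rows by c, passing user input as a bound parameter."""
    # The value never becomes part of the SQL text, so quotes, tabs,
    # or backspaces in it cannot change the structure of the query.
    cursor = conn.execute("SELECT a FROM b WHERE c = ?", (user_input,))
    return [row[0] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (a TEXT, c TEXT)")
conn.execute("INSERT INTO b VALUES (?, ?)", ("first", "plain"))
conn.execute("INSERT INTO b VALUES (?, ?)", ("second", "tab\there' OR '1'='1"))

# A classic injection attempt is just compared as a literal string.
print(find_by_c(conn, "' OR '1'='1"))
print(find_by_c(conn, "tab\there' OR '1'='1"))
```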
Where user input contains tabulators or backspace characters?
It is quite remarkable that to this day most users still believe that it is user input that has to be escaped, and that such escaping "prevents injections".
Java solution:
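public static String filter( String s ) {
    StringBuilder buffer = new StringBuilder();
    // getBytes() uses the platform charset; bytes outside printable ASCII are dropped below.
    for( byte b : s.getBytes() ) {
        int i = (int) b;
        switch( i ) {
            case 9  : buffer.append( " " ); break;     // tab becomes a space
            case 10 : buffer.append( "\\n" ); break;   // line feed
            case 13 : buffer.append( "\\r" ); break;   // carriage return
            case 34 : buffer.append( "\\\"" ); break;  // double quote
            case 39 : buffer.append( "\\'" ); break;   // single quote
            case 92 : buffer.append( "\\" );           // escaping backslash, then fall through
            default :
                // keep printable ASCII only (this also appends the backslash itself)
                if( i > 31 && i < 127 ) buffer.append( (char) i );
        }
    }
    return buffer.toString();
}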
Couldn't one just delete the single and double quotes from user input?
e.g.: $input =~ s/\'|\"//g;