How do I alter email headers using procmail?

We use JIRA Cloud for our ticketing system, which does not support email aliases. We now have two domains in our system, with the second domain added as an alias in G Suite (same usernames across both). Management decided to use this new domain, domain2, as the primary FROM address for all users, which has caused issues in several places, such as in JIRA: we cannot change the main domain in G Suite OR in JIRA, and emails can come from either domain1 or domain2.
So I'd like to set up a procmail (or equivalent) filter that checks the helpdesk# email account via POP3 and, for emails sent from domain1, adds "inc" at the end so it matches domain2 in the email headers and the email FROM field, then sends that message on to a second email address that JIRA listens to. It would also need to appear as coming FROM user#domain1, not from the actual account doing the sending (which I know requires additional work on the G Suite end to allow).
Since JIRA doesn't allow any of this email processing internally, this would let JIRA work properly without add-ons that may not do what we need them to and can get expensive, since they're charged monthly, per user.
So I'm trying to see whether procmail is even the easiest (or best) thing to set up for this (considering it's not maintained anymore), and which combination of agents would be easiest. There are so many options, and I'm not sure which would be simplest to set up, or quite how to do it.
Once I know which direction to go, I should be able to figure out how to make it work; I'm just not sure where to begin, which agents to use, or how best to approach this.
Thank you!

Your question is really not about programming; maybe try https://serverfault.com/ or https://unix.stackexchange.com/ for the infrastructure parts. I'll focus on answering the question in the title, though the details on that are also rather muddy.
# f = filter the message through the pipe; H = match the condition against the headers
:0fH
* domain1
| sed 's/domain1/domain2/g'
I'm guessing from your description that domain1 is actually a substring of domain2. If that's the case, the regexes need to be sharpened a bit (or you'll end up replacing domain1inc with domain1incinc, etc). As a quick first approximation, domain1($|[^i]) will match domain1 when it is followed by nothing, or by a character which isn't i. When substituting, you will want to keep that character, which in sed is usually done by capturing it with \( ... \) and substituting it back with \1. Or you can switch to Perl, which supports a much richer regex dialect.
# Same idea, but leave alone any occurrence already followed by "inc"
:0fH
* domain1($|[^i])
| perl -pe 's/domain1(?!inc)/domain2/g'
Though of course, perhaps your real use case looks more like s/domain1.com/domain2.com/g, in which case the additional context of the .com suffix is quite sufficient to avoid substituting strings which should remain unchanged, and you can safely stay with the simpler, and thus faster and probably more secure, sed.
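If you want to prototype either substitution outside procmail first, here's a quick sketch in Python (the header string is a made-up example; substitute your real domains):
import re
header = "From: helpdesk@domain1 and already-converted helpdesk@domain1inc"
# Keep the character after "domain1" by capturing it and substituting it
# back in (the sed equivalent captures with \(...\) and reuses it as \1):
print(re.sub(r"domain1($|[^i])", r"domain2\1", header))
# Or use the negative lookahead, as in the Perl one-liner above:
print(re.sub(r"domain1(?!inc)", "domain2", header))
# Both print: From: helpdesk@domain2 and already-converted helpdesk@domain1inc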
Again, how exactly to run Procmail on your incoming email in the first place is a separate topic which isn't really programming-related. If you have Postfix and Procmail on the mail server, simply creating a .procmailrc in the helpdesk account's home directory should suffice.
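If you do end up wanting a "procmail equivalent" instead, and since you're polling via POP3 anyway, a small cron-driven Python script can do the fetch-rewrite-resend loop. This is only a sketch: every host, credential, domain, and address below is a placeholder for whatever your environment actually uses, and there's no error handling.
# Sketch only: poll the helpdesk mailbox over POP3, rewrite domain1 to
# domain2 in the From header, and re-send to the address JIRA watches.
import poplib
import smtplib
from email import message_from_bytes
from email.utils import formataddr, parseaddr
POP_HOST = "pop.example.com"            # placeholder
SMTP_HOST = "smtp.example.com"          # placeholder
USER, PASSWORD = "helpdesk", "secret"   # placeholders
JIRA_ADDRESS = "jira-intake@domain2"    # placeholder
pop = poplib.POP3_SSL(POP_HOST)
pop.user(USER)
pop.pass_(PASSWORD)
with smtplib.SMTP(SMTP_HOST) as smtp:
    for i in range(1, len(pop.list()[1]) + 1):
        msg = message_from_bytes(b"\r\n".join(pop.retr(i)[1]))
        name, addr = parseaddr(msg.get("From", ""))
        if addr.endswith("@domain1"):
            addr = addr.replace("@domain1", "@domain2")
            msg.replace_header("From", formataddr((name, addr)))
        smtp.sendmail(addr, [JIRA_ADDRESS], msg.as_bytes())
        pop.dele(i)  # delete the original after forwarding
pop.quit()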

Related

HTML - Required "bug" [duplicate]

Using a simple tool like Firebug, anyone can change JavaScript parameters on the client side. If anyone takes the time to study your application for a while, they can learn how to change JS parameters and end up hacking your site.
For example, a simple user could delete entities which they can see but are not allowed to change. I know a good developer must check everything on the server side, but this means more overhead: to validate a request you must first check it against data from the DB. That takes a lot of time; someone must validate every action, and can only do so by fetching the needed data from the DB.
What would you do to minimize hacking in that case?
A simpler way to validate would be to add another parameter to every JavaScript function; this parameter would be a signature computed from the other parameters and a secret key.
How good does the solution above sound to you?
Our team uses teamworkpm.net to organize our work. I just discovered that I can edit someone else's tasks by changing a JavaScript function (which initially edits my own tasks).
Whenever a function calls the server, the server side must check, before performing the action, whether this user is allowed to perform it.
It is necessary to build a server-side permissions mechanism to prevent unwanted actions. You may want to define groups of users rather than individual user-level permissions; it makes things easier.
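A minimal sketch of such a group-based check (the data model and names are illustrative only; a real mechanism would live next to the database):
# Illustrative only: map groups to allowed actions and check on the server
# before doing anything -- never trust the client's say-so.
PERMISSIONS = {
    "admin":  {"task.view", "task.edit", "task.delete"},
    "member": {"task.view", "task.edit"},
    "viewer": {"task.view"},
}
def is_allowed(group, action):
    return action in PERMISSIONS.get(group, set())
def delete_task(user, task):
    if not is_allowed(user.group, "task.delete"):
        raise PermissionError("You do not have permission to delete this task")
    task.delete()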
Anything on the client side could be spoofed. If you use some type of secret key + parameter signature, your signature algorithm must be sufficiently random/secure that it cannot be reverse engineered.
The overhead created by adding client-side complexity is better spent crafting proper server-side validations.
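For what it's worth, the signature idea itself is easy to sketch with an HMAC (names illustrative); the key never leaves the server, and, as noted above, this only detects tampering with the parameters -- the server still has to check that the action is permitted at all.
# Sketch: sign request parameters with a server-side secret (HMAC).
# Detects tampering only; authorization must still be checked server-side.
import hashlib
import hmac
SECRET_KEY = b"server-only-secret"  # never shipped to the client
def sign(params):
    return hmac.new(SECRET_KEY, params.encode(), hashlib.sha256).hexdigest()
def verify(params, signature):
    return hmac.compare_digest(sign(params), signature)
token = sign("task_id=42&action=edit")                 # issued with the page
assert verify("task_id=42&action=edit", token)         # unmodified: accepted
assert not verify("task_id=99&action=delete", token)   # tampered: rejected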
What would you do to minimize hacking in that case?
You can't get around using validation methods on the server side.
A simpler way to validate would be to add another parameter to every JavaScript function; this parameter would be a signature computed from the other parameters and a secret key.
How good does the solution above sound to you?
And how do you use the secret key without the client seeing it? As you yourself mentioned, the user can easily manipulate your JavaScript, and they can read everything in the JavaScript too, including the secret key!
You can't hide anything in JavaScript; the only thing you can do is obscure things in JavaScript and hope nobody tries to find out what you are trying to hide.
This is why you must validate everything on the server. You can never guarantee that the user won't mess about with things on the client.
Everything, even your JavaScript source code, is visible to the client and can be changed by them; there's no way around this.
There's really no way to do this completely client-side. If the person has a valid auth cookie, they can craft any sort of request they want, regardless of the code on the page, and send it to your server. You can do things with other, encrypted cookies that must be sent back with the request and must also match the inputs on the page, but you still need to check this server-side. Server-side security is essential to protecting your application from unauthorized access, and you must ensure, server-side, that every action being performed is one that the user is authorized to perform.
You certainly cannot hide anything client side, so there is little point in trying to do so.
If what you are saying is that you are sending something like a user ID and you want to ensure that the returned value has not been illicitly changed, then the simplest way of doing so is probably to generate and send a UUID alongside it, and check on return that the UUID matches the one stored on the server for that user ID before doing any further processing. The space of UUIDs is so large that you can discount any false hits ever occurring.
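Sketched out (names illustrative), it is just an unguessable token issued alongside the ID and checked on return:
# Pair each outgoing user ID with a server-stored UUID and require the
# pair to match on the way back in.
import uuid
issued = {}  # server-side store: user_id -> token
def issue_token(user_id):
    issued[user_id] = str(uuid.uuid4())
    return issued[user_id]
def verify_token(user_id, token):
    return issued.get(user_id) == token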
As to actual server-side processing vulnerabilities: you should simply always build in your security/permissions as close to the database as you can, and definitely not in the client. There's nothing different in the scenario you outline from any normal client-server design.
Peter from Teamworkpm.net here; I'm one of the main developers and was concerned to come across this report of a security problem. I checked into it and am happy that it is not possible to delete a task that you shouldn't have access to.
You get a message saying "You do not have permission to delete this task".
I think it is just the confusion between being a Project Administrator and being an overall Administrator that is the problem here: you may not be a member of a project, but as an overall administrator you still have permission to delete any task within your Teamwork site. This is by design.
We take security very seriously, and it's all implemented server side because, as Jens F says, we can't rely on client-side security.
If you do come across any issues in TeamworkPM that you would like to discuss, we'd encourage any of you to just hit the feedback link and you'll typically get an answer within a few hours.

How to get the ID of the logged-in Google user from a Chrome extension?

Without using OAuth2.
Because I don't want to get any user data or do authentication, only get the ID.
And I want to monitor login/logout (chrome.identity.onSignInChanged does not work).
P.S. I need the ID for storing data on my server (chrome.storage.sync is too small).
You say you don't want to use OAuth because you don't need any user data. However, the ID or email IS user data, and there's an OAuth scope just for that. Use it, or else the other alternatives might break in the future; or wait until chrome.identity is out for all.
Another way, if you really don't want OAuth, is to store a random number in Chrome sync and use that as your ID. If the random number is large enough you will avoid collisions in practice. Prepend it with the current milliseconds since 1970 and I bet there will be no collisions.
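In the extension itself you'd write this in JavaScript, but the construction is trivial in any language; sketched in Python:
# Sketch of the "milliseconds since 1970 + large random" ID construction.
# Generate once, store it in chrome.storage.sync, and reuse it thereafter.
import random
import time
def make_id():
    millis = int(time.time() * 1000)
    return "%d-%d" % (millis, random.getrandbits(64))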
chrome.identity.onSignInChanged is not working (as you mentioned), because it is currently available on the dev channel only (according to this and other sources online).
So, with a little bit of luck, it will be available on the stable channel soon...
A little hacky, but (as suggested in this answer) you could make an AJAX request to https://www.google.com/settings/account and parse the content to extract whether the user is logged in and the user's e-mail.
(That's not very robust, of course, since the e-mail might change, but maybe good enough for a temporary work-around.)

What is opencounter, and why is my email client trying to use it in a bulk email?

We found some code being inserted into emails sent by our proprietary email system and have no idea of its provenance.
My company sends a lot of bulk email for clients. (We follow all the best practice protocols to ensure we're not spammers.) The system is proprietary, based on open source code. Customers have a GUI to enter content, similar to the big guys like MailChimp and the like.
A staff member brought a UI challenge with the GUI to me, using a client's bulk email as an example. I dug into the source to see if they had some exotic CSS that might be affecting my interface, when I noticed the following tag:
<custom name="opencounter" type="tracking"> </custom>
My interface certainly doesn't insert that code into an email.
What is opencounter? Whose technology is it? Does it have a valid reason for being used on our (proprietary) email system?
It appears that "opencounter" is a proprietary counting mechanism used by ExactTarget's bulk mailing system. Apparently, the client was copy/pasting from an old campaign done on ExactTarget to move the design to our system. It is therefore safe for me to remove.
My best guess is that it is something that is auto-substituted to put tracking information into the individual e-mails. I'd suggest doing some tests on "bulk" e-mails you've set up just to yourself. Put some known content immediately on either side of it, then send yourself the e-mail and view the source to see if it has been substituted with anything, e.g.:
XXX<custom name="opencounter" type="tracking"> </custom>YYY
If the final output has XXXYYY or similar then you'll know it's a tracker in the bulk e-mailer. If it comes out as-is, you can probably safely assume you can get rid of it. If it is removed completely then it may be used for some kind of processing on the server, but I'm not sure what that might be...
The other thing you can do is to do a search of your entire codebase for "opencounter" to see if there are any references to it.
One final thought: does your customer interface allow them to put in HTML directly, or is it just a GUI? It occurs to me that if they used a previous bulk e-mailer, this might be something specific to that one that got copied over, if it's not in yours.
We had a similar situation and traced it back to a user who was utilizing a third-party WYSIWYG tool to develop code that they then pasted into our CMS. It's a harmless issue, but it points to the need to improve our tool so that others don't feel like they have to use another editor.

any way to "ping" a phone number?

We have a customer who wants to go through their CRM database and somehow determine phone numbers which are valid, without actually having someone sit there and try calling them all.
Is there any way to do something akin to a "ping" on a phone number (including landlines)?
You will need to go through a third party. I have used Melissa Data for address verification with good success; they also offer phone verification, but I have not used it.
http://www.melissadata.com/listservices/resphoneverify.htm
If getting a 100% correct phone number is crucial, I'd look into a service which actually calls the number, gives a verification code, and makes the user confirm that code with the site. It is a PITA from the user's perspective, but it is the most complete route you can take. A quick bit of googling turned up http://www.phoneconfirm.com, which seems to do what I mentioned. I am sure there are others, though.
If you can't or don't want to go through a third party, writing something like this yourself shouldn't be impossible. Scaling it would be the biggest issue.
You could always go with the good ole war dialer.
I believe a CTI system using an ISDN-based calling service can quickly return a status code indicating that the number is valid or invalid before the destination begins to ring.
One vendor is Katalina Systems; their product is called VoiceGuide, and they have a dial-out module that may give you what you want. See www.voiceguide.com.
Just export the calling list to the dialler (CSV file) and review the call status after processing.
If the list is very large, it may justify purchasing a system to do this. The rate of calling depends upon the number of lines installed/available. You might require some custom modifications to abort the call after obtaining the status; Katalina should be able to help. I am not sure if VoIP trunks can give you full access to the line status.
I once did something like that. Yeah, for telemarketers. And yeah, it haunts my conscience to this day.
It was based on a module called app_amd.c (Answering Machine Detection) which was a third party add-on for Asterisk and, AFAIK, can be found in their main tree now. With an E1/T1, you can also distinguish between bad numbers, busy, and many other status codes. Look that up, it may help.

Detecting a (naughty or nice) URL or link in a text string

How can I detect (with regular expressions or heuristics) a web site link in a string of text such as a comment?
The purpose is to prevent spam. HTML is stripped so I need to detect invitations to copy-and-paste. It should not be economical for a spammer to post links because most users could not successfully get to the page. I would like suggestions, references, or discussion on best-practices.
Some objectives:
The low-hanging fruit like well-formed URLs (http://some-fqdn/some/valid/path.ext)
URLs but without the http:// prefix (i.e. a valid FQDN + valid HTTP path)
Any other funny business
Of course, I am blocking spam, but the same process could be used to auto-link text.
Ideas
Here are some things I'm thinking.
The content is native-language prose so I can be trigger-happy in detection
Should I strip out all whitespace first, to catch "www .example.com"? Would common users know to remove the space themselves, or do any browsers "do-what-I-mean" and strip it for you?
Maybe multiple passes is a better strategy, with scans for:
Well-formed URLs
All non-whitespace followed by '.' followed by any valid TLD
Anything else?
Related Questions
I've read these and they are now documented here, so you can just reference the regexes in those questions if you want.
replace URL with HTML Links javascript
What is the best regular expression to check if a string is a valid URL
Getting parts of a URL (Regex)
Update and Summary
Wow, there are some very good heuristics listed here! For me, the best bang-for-the-buck is a synthesis of the following:
@Jon Bright's technique of detecting TLDs (a good defensive chokepoint)
For those suspicious strings, replace the dot with a dot-looking character, as per @capar
A good dot-looking character is @Sharkey's subscripted · (i.e. "·"). · is also a word boundary, so it's harder to casually copy & paste.
That should make a spammer's CPM low enough for my needs; the "flag as inappropriate" user feedback should catch anything else. Other solutions listed are also very useful:
Strip out all dotted-quads (@Sharkey's comment to his own answer)
@Sporkmonger's requirement for client-side JavaScript which inserts a required hidden field into the form.
Pinging the URL server-side to establish whether it is a web site. (Perhaps I could run the HTML through SpamAssassin or another Bayesian filter, as per @Nathan.)
Looking at Chrome's source for its smart address bar to see what clever tricks Google uses
Calling out to OWASP AntiSAMY or other web services for spam/malware detection.
I'm concentrating my answer on trying to avoid spammers. This leads to two sub-assumptions: the people using the system will be actively trying to contravene your check, and your goal is only to detect the presence of a URL, not to extract the complete URL. This solution would look different if your goal were something else.
I think your best bet is going to be the TLD. There are the two-letter ccTLDs and the (currently) comparatively small list of others. These need to be prefixed by a dot and suffixed by either a slash or some word boundary. As others have noted, this isn't going to be perfect. There's no way to get "buyfunkypharmaceuticals . it" without disallowing the legitimate "I tried again. it doesn't work" or similar. All of that said, this would be my suggestion:
\S\.([a-zA-Z]{2}|aero|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|tel|travel)(/|\b)
Things this will get:
buyfunkypharmaceuticals.it
google.com
http://stackoverflow.com/questions/700163/
It will of course break as soon as people start obfuscating their URLs, replacing "." with " dot ". But, again assuming spammers are your goal here, if they start doing that sort of thing, their click-through rates are going to drop another couple of orders of magnitude toward zero. The set of people informed enough to deobfuscate a URL and the set of people uninformed enough to visit spam sites have, I think, a minuscule intersection. This solution should let you detect all URLs that are copy-and-pasteable to the address bar, whilst keeping collateral damage to a bare minimum.
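As a quick sanity check, here is that pattern run over the examples above from Python (note that \b inside a character class would mean backspace, hence the (/|\b) group at the end):
# Quick test of the TLD heuristic against the examples above.
import re
TLD_RE = re.compile(
    r"\S\.([a-zA-Z]{2}|aero|asia|biz|cat|com|coop|edu|gov|info|int|jobs"
    r"|mil|mobi|museum|name|net|org|pro|tel|travel)(/|\b)")
for text in ["buyfunkypharmaceuticals.it",
             "google.com",
             "http://stackoverflow.com/questions/700163/",
             "I tried again. it doesn't work"]:
    print(bool(TLD_RE.search(text)), text)
# True for the first three, False for the last.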
I'm not sure if detecting URLs with a regex is the right way to solve this problem. Usually you will miss some sort of obscure edge case that spammers will be able to exploit if they are motivated enough.
If your goal is just to filter spam out of comments, then you might want to think about Bayesian filtering. It has proven to be very accurate in flagging email as spam, and it might be able to do the same for you, depending on the volume of text you need to filter.
I know this doesn't help with auto-linking text, but what if you searched for and replaced all full-stop periods with a character that looks the same, such as the Unicode character for Hebrew point hiriq (U+05B4)?
The following paragraph is an example:
This might workִ The period looks a bit odd but it is still readableִ The benefit of course is that anyone copying and pasting wwwִgoogleִcom won't get too farִ :)
Well, obviously the low hanging fruit are things that start with http:// and www. Trying to filter out things like "www . g mail . com" leads to interesting philosophical questions about how far you want to go. Do you want to take it the next step and filter out "www dot gee mail dot com" also? How about abstract descriptions of a URL, like "The abbreviation for world wide web followed by a dot, followed by the letter g, followed by the word mail followed by a dot, concluded with the TLD abbreviation for commercial".
It's important to draw the line of what sorts of things you're going to try to filter before you continue with trying to design your algorithm. I think that the line should be drawn at the level where "gmail.com" is considered a url, but "gmail. com" is not. Otherwise, you're likely to get false positives every time someone fails to capitalize the first letter in a sentence.
Since you are primarily looking for invitations to copy and paste into a browser address bar, it might be worth taking a look at the code used in open source browsers (such as Chrome or Mozilla) to decide if the text entered into the "address bar equivalent" is a search query or a URL navigation attempt.
Ping the possible URL
If you don't mind a little server side computation, what about something like this?
urls = []
for possible_url in extracted_urls(comment):
    if pingable(possible_url):
        urls.append(possible_url)  # you could do this as a list comprehension, but OP may not know Python
Here:
extracted_urls takes in a comment and uses a conservative regex to pull out possible candidates
pingable actually uses a system call to determine whether the hostname exists on the web. You could have a simple wrapper parse the output of ping.
[ramanujan:~/base]$ping -c 1 www.google.com
PING www.l.google.com (74.125.19.147): 56 data bytes
64 bytes from 74.125.19.147: icmp_seq=0 ttl=246 time=18.317 ms
--- www.l.google.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 18.317/18.317/18.317/0.000 ms
[ramanujan:~/base]$ping -c 1 fooalksdflajkd.com
ping: cannot resolve fooalksdflajkd.com: Unknown host
The downside is that if the host gives a 404 you won't detect it, but this is a pretty good first cut; the ultimate way to verify that an address is a website is to try to navigate to it. You could also try wget'ing the URL, but that's more heavyweight.
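If it helps, here is one hedged way to flesh out those two helpers; pingable() here just checks DNS resolution, which is really what the ping transcript above demonstrates ("cannot resolve ... Unknown host"):
# Possible implementations of the helpers in the pseudocode above.
import re
import socket
def extracted_urls(comment):
    # Deliberately conservative: dotted names ending in a 2-6 letter TLD.
    return re.findall(r"(?:[\w-]+\.)+[a-zA-Z]{2,6}", comment)
def pingable(hostname):
    try:
        socket.gethostbyname(hostname)  # same lookup ping does before sending
        return True
    except socket.gaierror:
        return False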
Having made several attempts at writing this exact piece of code, I can say unequivocally, you won't be able to do this with absolute reliability, and you certainly won't be able to detect all of the URI forms allowed by the RFC. Fortunately, since you have a very limited set of URLs you're interested in, you can use any of the techniques above.
However, the other thing I can say with a great deal of certainty, is that if you really want to beat spammers, the best way to do that is to use JavaScript. Send a chunk of JavaScript that performs some calculation, and repeat the calculation on the server side. The JavaScript should copy the result of the calculation to a hidden field so that when the comment is submitted, the result of the calculation is submitted as well. Verify on the server side that the calculation is correct. The only way around this technique is for spammers to manually enter comments or for them to start running a JavaScript engine just for you. I used this technique to reduce the spam on my site from 100+/day to one or two per year. Now the only spam I ever get is entered by humans manually. It's weird to get on-topic spam.
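The server-side half of that technique is tiny. Here's a hedged sketch; the choice of calculation and the names are illustrative, and the operands would be kept in the session between rendering and submission:
# Sketch: the page's JavaScript computes a * b and copies the product into
# a hidden form field; on submit, the server repeats the calculation.
import random
def issue_challenge():
    a, b = random.randint(100, 999), random.randint(100, 999)
    return a, b  # embed in the page for the script; also keep server-side
def verify_submission(a, b, hidden_field_value):
    try:
        return int(hidden_field_value) == a * b
    except ValueError:
        return False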
Of course you realize that if spammers decide to use TinyURL or similar services to shorten their URLs, your problem just got worse. You might have to write some code to look up the actual URLs in that case, using a service like TinyURL decoder.
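Rather than relying on a third-party decoder, you could also follow the redirect chain yourself. A minimal sketch (no error handling, and it does fetch the final page):
# Resolve a shortened URL by following its redirects; urllib follows
# 301/302 chains by default, and geturl() reports the final address.
import urllib.request
def unshorten(url):
    with urllib.request.urlopen(url) as resp:
        return resp.geturl()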
Consider incorporating the OWASP AntiSAMY API...
I like capar's answer best so far, but dealing with Unicode fonts can be a bit fraught, with older browsers often displaying a funny thing or a little box... and the location of the U+05B4 is a bit odd; for me, it appears outside the pipes here |ִ| even though it's between them.
There's a handy &middot; (·) though, which breaks cut-and-paste in the same way. Its vertical alignment can be corrected by <sub>ing it, e.g.:
stackoverflow·com
Perverse, but effective in FF3 anyway; it can't be cut-and-pasted as a URL. The <sub> is actually quite nice, as it makes it visually obvious why the URL can't be pasted.
Dots which aren't in suspected URLs can be left alone, so for example you could do
s/\b\.\b/<sub>·<\/sub>/g
Another option is to insert some kind of zero-width entity next to suspect dots, but things like &zwj; and &zwnj; and &zwsp; don't seem to work in FF3.
There's already some great answers in here, so I won't post more. I will give a couple of gotchas, though. First, make sure to test for known protocols; anything else may be naughty. As someone whose hobby concerns telnet links, you will probably want to include more than http(s) in your search, but may want to prevent, say, aim: or some other URLs. Second, many people will delimit their links in angle brackets like <http://theroughnecks.net> or in parens "(url)", and there's nothing worse than clicking a link and having the closing > or ) go along with the rest of the URL.
P.S. sorry for the self-referencing plugs ;)
I needed just the detection of simple HTTP URLs with or without a protocol, assuming that either the protocol is given or a 'www' prefix. I found the above-mentioned link quite helpful, but in the end I came out with this:
http(s?)://(\S+\.)+\S+|www\d?\.(\S+\.)+\S+
This does, obviously, not test compliance with the DNS standard.
Given the messes of "other funny business" that I see in Disqus comment spam in the form of look-alike characters, the first thing you'll want to do is deal with that.
Luckily, the Unicode people have you covered. Dig up an implementation of the TR39 Skeleton Algorithm for Unicode Confusables in your programming language of choice and pair it with some Unicode normalization and Unicode-aware upper/lower-casing.
The skeleton algorithm uses a lookup table maintained by the Unicode people to do something conceptually similar to case-folding.
(The output may not use sensible characters, but, if you apply it to both sides of the comparison, you'll get a match if the characters are visually similar enough for a human to get the intent.)
Here's an example from this Java implementation:
// Skeleton representations of unicode strings containing
// confusable characters are equal
skeleton("paypal").equals(skeleton("paypal")); // true
skeleton("paypal").equals(skeleton("𝔭𝒶ỿ𝕡𝕒ℓ")); // true
skeleton("paypal").equals(skeleton("ρ⍺у𝓅𝒂ן")); // true
skeleton("ρ⍺у𝓅𝒂ן").equals(skeleton("𝔭𝒶ỿ𝕡𝕒ℓ")); // true
skeleton("ρ⍺у𝓅𝒂ן").equals(skeleton("𝔭𝒶ỿ𝕡𝕒ℓ")); // true
// The skeleton representation does not transform case
skeleton("payPal").equals(skeleton("paypal")); // false
// The skeleton representation does not remove diacritics
skeleton("paypal").equals(skeleton("pàỳpąl")); // false
(As you can see, you'll want to do some other normalization first.)
Given that you're doing URL detection for the purpose of judging whether something's spam, this is probably one of those uncommon situations where it'd be safe to start by normalizing the Unicode to NFKD and then stripping codepoints declared to be combining characters.
(You'd then want to normalize the case before feeding them to the skeleton algorithm.)
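That pre-processing step is short enough to show. A sketch in Python (the skeleton step itself would come from a TR39 implementation such as the Java one above):
# Sketch: NFKD-decompose, drop combining characters (diacritics), and
# case-fold before handing the strings to the skeleton comparison.
import unicodedata
def preprocess(s):
    decomposed = unicodedata.normalize("NFKD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()
print(preprocess("pàỳpąl"))  # -> paypal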
I'd advise that you do one of the following:
Write your code to run a confusables check both before and after the characters get decomposed, in case things are considered confusables before being decomposed but not after, and check both uppercased and lowercased strings in case the confusables tables aren't symmetrical between the upper and lowercase forms.
Investigate whether #1 is actually a concern (no need to waste CPU time if it isn't) by writing a little script to inspect the Unicode tables and identify any codepoints where decomposing or lowercasing/uppercasing a pair of characters changes whether they're considered confusable with each other.