Extract mail from Exchange and load into MySQL: Perl Win32::OLE or Net::POP3, or try it in Ruby?

My problem is this: I need to determine the timestamps of the first and last email sent from an Exchange account for every day for which mail exists. Also, for each day I need to rank the words that appear in each email so that I can report trend words for that day.
I'm considering two approaches, and would welcome comments and suggestions on either of them, or on something entirely different.
I've ruled out exporting the file from Outlook as CSV, because the CSV output does not include the timestamp fields, which are crucial for me.
Approach #1 is:
Use Perl and Net::POP3 to pull the messages out of the inbox, munge them, and then insert them into a MySQL database.
Approach #2 is:
Use Win32::OLE to attempt to act like a proper Exchange client, to the same end.

If you use Win32::OLE you'll have to either use Outlook automation or the CDO libraries. I've done both in a previous life, and it works, but it's a bit painful.
I'd suggest approach #1, except that I can't imagine Exchange would actually let you fetch sent mail through POP. However, Exchange can be configured to expose an IMAP interface, and IMAP should certainly let you get at the Sent mail without running into any of the problems associated with POP (for example, replacing deleted messages on the server). I haven't used it myself, but Mail::IMAPClient appears to be the recommended module for this.
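For what it's worth, a minimal sketch of the IMAP route might look like the following, assuming IMAP is enabled on the server, the sent-mail folder uses Exchange's usual "Sent Items" name, and a suitable messages table already exists (the host, credentials, and schema below are all placeholders):

    use strict;
    use warnings;
    use Mail::IMAPClient;
    use DBI;

    # Placeholder host, credentials, and schema -- adjust to your setup.
    my $imap = Mail::IMAPClient->new(
        Server   => 'exchange.example.com',
        User     => 'username',
        Password => 'secret',
        Ssl      => 1,
    ) or die "IMAP connect failed: $@";

    my $dbh = DBI->connect('dbi:mysql:database=mail', 'dbuser', 'dbpass',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('INSERT INTO messages (sent_at, body) VALUES (?, ?)');

    $imap->select('Sent Items') or die "Cannot select folder: $@";
    for my $msgid ($imap->messages) {
        # internaldate() is the server's timestamp for the message, which is
        # exactly what the first/last-per-day report needs.
        $sth->execute($imap->internaldate($msgid), $imap->body_string($msgid));
    }
    $imap->logout;

From there, the first/last timestamp per day and the word ranking are ordinary GROUP BY and tokenising jobs against the table.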

Related

Clean, low-overhead way to send an email when a MySQL/MariaDB table is updated?

I need to send an email when a record is added to a table.
A bunch of googling has left me with the impression that the only choices are "bad" and "really bad", and I was wondering whether anybody had any clean, solid, reliable suggestions.
So far I've found:
Use a MySQL plugin that sends the mail. I'd rather not do this because I have a perfectly good mail server and the database wasn't designed to send mail.
Poll the table periodically from an external program, look for changes and send the mail if appropriate. This is almost OK, but I'd rather skip the dead time between the record being added and the next poll.
I had considered using SELECT ... INTO OUTFILE; however, this is really limited because it won't overwrite the output file, and the only way to change the filename is by building the query with dynamic SQL, which can't be used inside a trigger.
I could write a socket listener and have MySQL open the socket and tell the mail app there are records waiting; however, there doesn't seem to be a way to open a socket from MySQL.
It feels like I must be missing something here.
All I want is to run an external application when a record is added.
Has anybody run into a clean, low overhead way to do this?
Modify the code that is adding the record and have it do the notification. If you put it in a try/catch block, you will know for sure whether or not the record was added successfully.
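For instance, in PHP with PDO it might look like this sketch (the DSN, table, and recipient address are placeholders):

    <?php
    // Sketch only: DSN, credentials, table, and recipient are placeholders.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass',
                   [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
    try {
        $pdo->prepare('INSERT INTO widgets (name) VALUES (?)')->execute(['example']);
        // Only reached if the INSERT succeeded, so a failed insert
        // can never trigger a notification.
        mail('ops@example.com', 'New record', 'A widgets row was just added.');
    } catch (PDOException $e) {
        error_log('Insert failed, no mail sent: ' . $e->getMessage());
    }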
Trigger on the table(s) of interest to insert into another table (an email queue).
Create a scheduled process to work through that queue, as sketched below.
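A sketch of that pattern, with made-up table names (here orders is the table being watched):

    -- Illustrative schema: notifications land here, a separate job drains them.
    CREATE TABLE email_queue (
        id         INT AUTO_INCREMENT PRIMARY KEY,
        order_id   INT NOT NULL,
        created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
        sent_at    DATETIME NULL
    );

    DELIMITER //
    CREATE TRIGGER orders_after_insert
    AFTER INSERT ON orders
    FOR EACH ROW
    BEGIN
        -- Queue the notification; the trigger itself never touches SMTP.
        INSERT INTO email_queue (order_id) VALUES (NEW.id);
    END//
    DELIMITER ;

The scheduled job (cron, for example) then selects rows WHERE sent_at IS NULL, sends the mail from the application side, and stamps sent_at. It still polls, but only a small indexed queue table, which keeps the overhead low.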

HTML page with xlsm file as backend

I was wondering whether it is possible to have an xlsm file as the backend while having HTML as the frontend? If yes, how can I achieve this?
Thanks in advance.
Since the question suggests a gap in understanding of application structure, I will put this as an answer in the hope of clarifying a few things.
First of all, I don't think you understand what the term "back-end" means.
Please read https://en.wikipedia.org/wiki/Front_and_back_ends and http://blog.teamtreehouse.com/i-dont-speak-your-language-frontend-vs-backend; hopefully these will clarify a few things for you.
To explain the concepts briefly:
In an application, front-end and back-end refer to two parts that communicate with each other and exchange data in some form. This separation is made when the program and the user are apart, such as when you have a server and a client (as in distributed programming). That is only one of many programming patterns today: although rare in today's world, some programs do not separate functionality this way and instead delegate everything to a core program statically installed on the client's computer. For applications that do separate the two, here is what the terms front-end and back-end mean:
Why such separation is necessary:
In today's world, many applications (such as web and mobile applications) are deployed on shared servers to provide wider and faster access, better support, and a lower cost of access for the client (no local disk space, no download time, etc.). In such cases, since the client doesn't have the program locally, it has to be reached over internet protocols such as TCP (which today's HTTP runs on). The problem is that the front-end files are served anew every time the application is loaded and cannot keep track of the state of the data (they are stateless), excluding the edge cases of cookies and caches.
Front End:
The sole reason the front end exists is for the user to interact with the application and for collecting data from the user, such as login information (the user interface).
Back End:
Now, the back-end is a little more complicated. There are two major components to a good back-end design:
Logic
Data
The backend is responsible for processing the data from the user (front-end) in a correct and meaningful way. For example, in a really simple program which adds two numbers, the front end would be responsible for asking the user for the two numbers, and the back-end would carry out the actual addition and send the result back to the front end to be displayed.
If the data has state, the backend also needs to save the last state of the data somewhere on the server. This is where the second component comes in. The most common practice is to have one or more database files (".db" files, say) represent that state. However, there is no obligation to do so: if you wanted, your backend could read its data from anywhere, from a plain text file to STDIN.
Why do we use databases? The queries. The query languages that come with databases make it much easier for us to extract and isolate the relevant data.
After processing and modifying the data, the backend sends it back to the front end to be displayed to the user. Common formats for transferring the data are JSON, XML, and S-expressions.
So following this short lecture, back to your question:
Can I have an xlsm file in the backend?
Yes. You can persist the data on the backend (server) in any way you want. The only thing you need to make sure of is that the endpoint the front end communicates with reads its data from this file and writes back onto this file. (CSV files are sometimes used in a way similar to xlsm files here.)
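As a rough illustration, here is what such an endpoint could look like on a Node.js/Express server using the SheetJS xlsx package (the file name, sheet layout, and route are all made up):

    // Sketch only: assumes "npm install express xlsx"; names are illustrative.
    const express = require('express');
    const XLSX = require('xlsx');

    const app = express();

    app.get('/data', (req, res) => {
        // Re-read the workbook on each request so edits to the file show up.
        const workbook = XLSX.readFile('data.xlsm');
        const sheet = workbook.Sheets[workbook.SheetNames[0]];
        // Convert the first sheet to an array of row objects and return JSON.
        res.json(XLSX.utils.sheet_to_json(sheet));
    });

    app.listen(3000);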
Is it a good idea?
No. Databases exist for a reason. Use them.
Hope this sheds light on a few things. I highly advise that you understand the application stack before writing any code.

How much CSV parsing can/should be done in the front-end?

I am about to create a web interface where the user can type in email addresses, which are then sent up to the server in one JSON payload, which is then used for messaging those users.
I also have the requirement of being able to upload a CSV file with a long list of email addresses. The problem is that the number of email addresses can be very large; we're talking thousands or more.
Theoretically I can either parse the CSV file in the front-end and send the email addresses up in a JSON object (I already have the API for the first use case, where the addresses are typed in and sent as JSON), or I can upload the CSV files to the server and parse them there.
Should I consider processing the CSV files in the front-end at all?
What should be a "safe" number of items for processing in the front-end without breaking anything, or ending up with a heavily compromised user experience?
Can anyone comment from experience? Thanks
What should be a "safe" number of items for processing in the front-end without breaking something, or ending up with a heavily compromised user experience?
This depends on the user's machine.
No one here would be able to give you a definitive answer on your question.
Anyway, you can use the Web Workers API.
Web Workers let you run long, asynchronous work on a background thread without heavily affecting or freezing your UI. You can show a spinner indicating that the CSV is being processed; meanwhile your users can keep interacting with the UI just fine.
That's your best bet where supported.
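A minimal sketch of that setup (the file names, the one-address-per-line CSV format, and the UI helpers are all assumptions):

    // main.js -- hand the file to a worker so the UI thread stays responsive.
    const worker = new Worker('csv-worker.js');

    worker.onmessage = (e) => {
        hideSpinner();                      // hypothetical UI helper
        sendToServer({ emails: e.data });   // hypothetical: POST as JSON
    };

    fileInput.addEventListener('change', () => {
        showSpinner();                      // hypothetical UI helper
        fileInput.files[0].text().then((csv) => worker.postMessage(csv));
    });

    // csv-worker.js -- naive parsing, assuming one address per line.
    onmessage = (e) => {
        const emails = e.data
            .split(/\r?\n/)
            .map((line) => line.trim())
            .filter((line) => line.includes('@'));
        postMessage(emails);
    };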
Should I consider processing the CSV files in the front-end at all?
String parsing is generally something modern browsers optimise well. If you move the computation to the server instead, you need to scale the server to meet the demand for those calculations as more and more users use your web app.
You could get playful with it and detect the processing capabilities of the user's machine: if it is capable, use Web Workers; if not, fall back to the server.
The most comprehensive way to decide is to define a browser test matrix and test for yourself.
You can even emulate the bandwidths you want to target using network throttling in Chrome DevTools.

What is the simplest way to run a script when an email is received?

I'd like to run a script, typically in PHP or MySQL since it involves changing a MySQL table, when an email arrives at my Exchange server (2010).
The server also runs IIS 7.
So in short:
Email->Exchange server->changes a table in MySQL DB
Notes:
Not looking for a script that connects to the mailbox via POP/IMAP.
I'm looking for a kind of trigger that fires on the server.
Web services or a Transport Agent seem complicated; if you can supply an easy example, I'll accept it.
If you have a Sink example that runs on Exchange 2010, please support it with a careful explanation and examples/links (step by step if you have to).
Other scripting languages accepted.
Possibly not exactly what you are looking for, but Postmark offers an incoming mail service that allows you to do this kind of thing.
Their API is pretty good and very well documented. My understanding is you can set up webhooks that will allow you to do what you're looking for.
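As a rough sketch, an inbound-webhook endpoint in PHP might look like this (the JSON field names are my understanding of Postmark's inbound payload, and the DSN and table are placeholders, so verify everything against their docs):

    <?php
    // Sketch only: Postmark POSTs the parsed message as JSON to this URL.
    $payload = json_decode(file_get_contents('php://input'), true);

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass',
                   [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);

    // "From", "Subject" and "TextBody" follow my reading of the inbound
    // payload's field names -- double-check them before relying on this.
    $stmt = $pdo->prepare(
        'INSERT INTO inbound_mail (sender, subject, body) VALUES (?, ?, ?)');
    $stmt->execute([$payload['From'], $payload['Subject'], $payload['TextBody']]);

    http_response_code(200); // tell Postmark the hook succeeded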
I hope this is some use to you.

Tracking data access

Backstory
I work for a company that has an online site that allows users to text personal information in for collection. We collect the data and make it available online. Users can choose to share the data with other users.
Going Forward
At some point, this may become classified as an FDA-governed medical tool. In anticipation, we'd like to have in place a logging system that shows each time someone accesses our users' data, whether it be the users themselves, another authorized user, or a support person.
Current Architecture
We are currently running Ruby/Rails, and using a MySQL database. The personal information is encrypted in the database.
Data Access for Support
Today, support personnel can access data one of three ways:
Admin site: the admin site is limited to whatever screens we develop. While we don't currently, we could easily add logging to keep an audit trail of who accessed which data using the admin tool.
SQL client: I use MySQL Workbench to access production. However, when connected this way, all personal information (user name, cell number, etc.) is encrypted.
Ruby/Rails console: finally, support can log into one of the production boxes and use the Ruby/Rails console from the command line. Ruby will decrypt the data, so we can do simple things such as
u = User.find_all_by_state('active')
and it will return the recordset of all users with state='active', and decrypt their personal information in the resultset.
Holy Grail
logging
easy access for support
I'd love to have a way to allow easy support access (once authenticated) to the data, but which logs everything that is accessed (read or updated). That way, if I'm checking out my buddy's ex-wife's data, for example, it gets logged to a place where I can't get in and clean the audit trail. (See Google firing a Gmail employee for an example of employees breaching data policies.)
Anyone have ideas, thoughts, experiences, suggestions with this issue?
Hey devguy. This was an issue for me a couple of months back. We ended up centralizing our MySQL queries so that we could start to track all information coming in and out. Unfortunately the class I wrote is in PHP, but the idea behind it could make it very easy to start logging.
https://code.google.com/p/php-centralized-mysql-controller/
Try stored procedures. Make all code use the stored procedures for CRUD activities. This defines an API that your developers can use while business rules are enforced globally (don't return entire SSN values, only the last 4 digits, etc.).
This serves as the basis for an external API as well.
If you want logging/auditing, you put it in the procedure.
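A sketch of what such a procedure might look like (table and column names are made up):

    -- Sketch only: tables and columns are illustrative.
    DELIMITER //
    CREATE PROCEDURE get_user_by_id(IN p_user_id INT,
                                    IN p_support_person VARCHAR(64))
    BEGIN
        -- Audit first, so the access is recorded even if the caller
        -- abandons the result set.
        INSERT INTO access_log (accessed_user_id, accessed_by, accessed_at)
        VALUES (p_user_id, p_support_person, NOW());

        -- Business rule enforced centrally: expose only the last 4 digits.
        SELECT id,
               name,
               CONCAT('***-***-', RIGHT(cell_number, 4)) AS cell_number
        FROM users
        WHERE id = p_user_id;
    END//
    DELIMITER ;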
This protects you from everyone except the DBAs.