How to periodically extract data from a CSV file?

I'm currently working on some QA projects. I am running tests (which can vary from a couple of minutes to 2-3 days) against an application that generates some CSV files and updates them periodically, with a new row added on each update (once every couple of seconds or so).
Each CSV file is structured like this:
Header1,Header2,Header3,.................,HeaderN
numerical_value11,numerical_value12,numerical_value13,......,numerical_value1N,
numerical_value21,numerical_value22,numerical_value23,......,numerical_value2N,
etc
The number of columns may vary from CSV file to CSV file.
I am running in a Windows environment. I also have Cygwin (http://www.cygwin.com/) installed.
Is there a way I can write a script that runs periodically (once per hour or so), extracts data (a single value or multiple values from a row, or the average of the values from the specific rows added to the CSV between checks) and sends some email alerts if, for example, the data from one column is out of range?
Thx

This can be done in several ways. Basically, you need to
1) Write a script in maybe Perl or Python that does one iteration of what you want it to do (see the sketch below).
2) Use the Windows Task Scheduler to run this script at the frequency that you want. The Task Scheduler is very easy to set up from the Control Panel.
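To make step 1 concrete, here is a rough Perl sketch (not a drop-in solution): the file name, the watched column index, the 10-50 range, the look-back window and the mail host are all placeholders you would replace with your own values.

#!/usr/bin/perl
# Sketch: average one column over the most recent rows of the CSV and mail an alert
# if the average falls outside an expected range. All names below are placeholders.
use strict;
use warnings;
use Net::SMTP;

my $file      = 'results.csv';   # the CSV the application keeps appending to
my $col       = 1;               # 0-based index of the column to watch (e.g. Header2)
my ($lo, $hi) = (10, 50);        # acceptable range for the average
my $last_rows = 1800;            # roughly one hour of rows at one row every ~2 seconds

open my $fh, '<', $file or die "Cannot open $file: $!";
my @lines = <$fh>;
close $fh;

shift @lines;                                        # drop the header row
@lines = splice(@lines, -$last_rows) if @lines > $last_rows;

my ($sum, $n) = (0, 0);
for my $line (@lines) {
    chomp $line;
    my @fields = split /,/, $line;
    next unless defined $fields[$col] && $fields[$col] =~ /^-?\d+(\.\d+)?$/;
    $sum += $fields[$col];
    $n++;
}
exit 0 unless $n;                                    # nothing new to check

my $avg = $sum / $n;
if ($avg < $lo || $avg > $hi) {
    my $smtp = Net::SMTP->new('mailhost.example.com') or die "SMTP connect failed";
    $smtp->mail('monitor@example.com');
    $smtp->to('qa-team@example.com');
    $smtp->data();
    $smtp->datasend("To: qa-team\@example.com\n");
    $smtp->datasend("Subject: CSV alert for $file\n\n");
    $smtp->datasend("Average of column $col over the last $n rows is $avg (expected $lo-$hi)\n");
    $smtp->dataend();
    $smtp->quit;
}

Step 2 is then just pointing the Task Scheduler (or a Cygwin cron job) at this script once per hour.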

Using Windows' scheduling you can very easily get the interval part down; for the parsing and alerting, however, you have a few options. I myself would use C# to make the program. If you want an actual script, however, VBA is a viable choice and could very easily parse a basic CSV file and connect to the web to send an email. If you have Office already installed, this should give you some more detail. Hope that helps.

Related

How to manage "releases" with MS Access

I have an MS Access 2016 application that a few people use in one department. I know this whole thing has web dev written all over it, but this Access database has been their process for a while and there is no time right now to switch over.
Recently, a different department wants to use this application, but with their own copy. Currently, if I need to make changes, I make them in a copy of the app; they send me a current version when I'm ready to import their data, I import it and send them back a new one. However, I currently copy the data table by table and paste it into the new database. This is inefficient and tedious, and now with 2 sets of data to do this for, that's crazy. There are over 20 tables, so I don't want to have to manually copy over 40+ tables across the 2 apps for even the smallest change, like altering a message to the user.
I know I can copy the code so I can avoid importing the data, but sometimes for big changes I'll touch over 15-20 VBA modules.
So, a couple questions:
1. Is there a way to generate insert statements for the entire database that I could run in a script? So when I create the new copy I just upload 1 file and it populates all the data?
2. Are there any kind of dev tools that will help this process? Right now I'm thinking that it's just a downfall of creating an MS Access app, but there must be some way that people have made the "new release" process easier. My current system seems flawed and I'm looking to have a more stable process.
EDIT:
Currently I have all my data stored locally, attached to the same Access file as the front end. Since I will have 2 different departments using the same functionality, how do I manage the data and the front end? These 2 departments should each have their own Access file to enter data using the forms, so sharing 1 front end between the 2 departments won't work.
Also, should I create 2 separate back ends? Currently I would have nothing to distinguish what is being inserted/changed/deleted by one department from the other. If I were to attach a field specifying who entered the record, that would require a complete overhaul of all my queries, which I don't have the time for as there are deadlines I need to meet.
First thing is to split the database. There is a wizard for this.
Then you can maintain the frontend without touching the real data.
Next, consider using a script to distribute revised versions of the frontend. I once wrote an article on one proven method to handle this:
Deploy and update a Microsoft Access application in a Citrix environment
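As a rough illustration of the "distribute via a script" idea (the article above covers a full method; every path below is a placeholder), the per-user launcher can simply refresh the local copy of the frontend from a network share before opening it:

#!/usr/bin/perl
# Sketch only: copy the released frontend from a share to the user's machine when the
# share has a newer version, then open the local copy. Paths are placeholders and the
# local folder is assumed to exist already.
use strict;
use warnings;
use File::Copy;

my $master = '\\\\server\\apps\\MyAccessApp\\frontend.accdb';    # released copy you maintain
my $local  = "$ENV{LOCALAPPDATA}\\MyAccessApp\\frontend.accdb";  # per-user working copy

if (!-e $local || (stat($master))[9] > (stat($local))[9]) {      # compare modification times
    copy($master, $local) or die "Copy failed: $!";
}
system('cmd', '/c', 'start', '', $local);                        # let Windows open it in Access

Each department keeps its own local frontend this way, while you only ever update the copy on the share.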

SSIS data validation and cleaning

I need to do something like this:
The client puts data in an FTP folder (the data can be in these 3 formats: .txt, .csv or .xls). The SSIS package needs to pull the data from FTP and check the data file for correctness (such as last name not empty, phone is 10 digits, zip code is 5 digits, address is not more than 20 characters in length, etc.).
After checking the data file, if everything is okay it should load the file into the dev database; if not, I need to run some cleaning queries (like taking the first 5 digits for the zip) and then load the data; if some column is missing, it needs to send an email to the client asking for a different data file.
Until now, I have done this task by manually importing the file and running a lot of SQL queries, which is time consuming. My manager asked me to write an SSIS package to automate this process.
I am fairly new to SSIS; can someone give me an SSIS package design idea (I mean which tasks to use in which sequence, etc.) so I can try and learn?
Thanks for the help
Here are a couple of suggestions:
Configure tasks to send rows with errors caused by bad data to a separate file. This will identify problem rows while letting the good stuff continue. You can also use a Conditional Split to redirect rows with bad data, such as blank rows.
The Derived Column Transformation is handy to trim, format, slice, and dice data.
Use the Event Handler to send emails if a given condition is true.
Use the logging features. Very helpful in sorting out something that went sideways while you were sleeping.

Log Audit file when update mySQL and Perl

I want a Perl script that will write to a data file every time I update the MySQL database. I don't mind about the growth of the file, since every item audited will be stored separately.
Thank you, I will appreciate your help.
The module Log::Log4perl provides many different ways to log events to many types of output, including files. It also allows you to set debug levels so you can turn this off if you need to.
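For example, a minimal sketch; the log file name, the layout and the logged message are placeholders:

#!/usr/bin/perl
use strict;
use warnings;
use Log::Log4perl qw(:easy);

# Append every audit entry to audit.log; raise the level (e.g. to $WARN) to silence
# these messages later without touching the rest of the script.
Log::Log4perl->easy_init({
    level  => $INFO,
    file   => '>> audit.log',
    layout => '%d %p %m%n',
});

# ... after each successful UPDATE against MySQL ...
my ($table, $id) = ('items', 42);    # placeholders for whatever you just changed
INFO("Updated $table row id=$id");

Because each call appends one line, every audited item ends up stored separately and the file simply grows over time.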

Generating JMeter results into graphs from several trials through Hudson

I'm in the process of integrating test scripts into a Continuous Integration system like Hudson. My goal is to benchmark each load test over time and display it in readable charts.
While there are plugins to generate graphs for a single script run, I'd like to know how each session's data, such as those found in the summary report, could be recorded over time.
One way would be to store the summary reports into a jtl file, and graph data off of that.
I've checked out the Performance Plugin for Hudson, but I'm at a block at how to modify the plugin to display more charts with more information.
Both the reports from either JMeter or the Hudson plugin are snapshots (not charts over long periods of time) and that's part of the issue. I went through this same exercise a few months back and decided to go with a solution that was better suited for this problem.
I set up Logstash to pull the JMeter test results from the files it generates during every test. It outputs those results into an Elasticsearch index, which I can chart with Kibana.
I know this adds several new pieces of software into your setup, but it only took a day to set things up and the results are much better than what the performance plugin was able to provide.

Data sync solution?

Because of some security issues I'm in an environment where third-party apps can't access my DB. For this reason I should have some service/tool/script (dunno what yet... I'm open to the best option, still reading to see what I'm gonna do...)
which enables me to generate, on a regular basis (daily, weekly, monthly), a CSV file with all new/modified records for a certain application.
I should be able to automate this process and also export at any time a new file.
So it should keep track, for each application, of which records it still needs.
Each application will need the data in a different format (csv/xls/sql); also, some fields will be needed by some applications and not by others... It should be fairly flexible...
What is the best option for me? Creating some custom tables for each application? And, based on those, extracting the modified data?
I think your best bet here, assuming you have access to the server to let you set this up, is to make a small command-line program that can do the relatively simple task you need. Languages like Perl are good for this sort of thing, I believe.
Once you have that 'tool' made, you can schedule it through the server's OS to run every set amount of time: either a scheduled task for a Windows server or a cron job for a Linux server.
You can also (without having to set up the scheduled task, if you don't want to or can't) enable this small command-line application to be called via CGI; this is a way of letting applications on the server be executed on demand by a web user. If you do enable this, though, I suggest you add some sort of locking system so that it can only be run every so often and cannot be run five times at once.
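To make that concrete, here is a rough Perl sketch of such a command-line exporter; the DSN, credentials, table and column names and file names are all placeholders, and it assumes the source table carries an updated_at timestamp to detect new/modified records:

#!/usr/bin/perl
# Sketch: dump rows added or changed since the last run into a CSV for one application.
# Connection details, table/column names and file paths are placeholders.
use strict;
use warnings;
use DBI;

my $state_file = 'last_run_appA.txt';
my $out_file   = 'appA_export_' . time() . '.csv';

# When did we last export? Default to the epoch on the first run.
my $last_run = '1970-01-01 00:00:00';
if (open my $sf, '<', $state_file) {
    chomp($last_run = <$sf>);
    close $sf;
}

my $dbh = DBI->connect('DBI:mysql:database=mydb;host=localhost',
                       'export_user', 'secret', { RaiseError => 1 });

my $sth = $dbh->prepare(
    'SELECT id, name, amount, updated_at FROM orders WHERE updated_at > ?'
);
$sth->execute($last_run);

open my $out, '>', $out_file or die "Cannot write $out_file: $!";
print {$out} "id,name,amount,updated_at\n";
while (my @row = $sth->fetchrow_array) {
    print {$out} join(',', map { defined $_ ? $_ : '' } @row), "\n";
}
close $out;

# Remember this run so the next export only picks up newer changes.
my ($now) = $dbh->selectrow_array('SELECT NOW()');
open my $state, '>', $state_file or die "Cannot write $state_file: $!";
print {$state} "$now\n";
close $state;
$dbh->disconnect;

A cron entry such as 0 2 * * * /usr/bin/perl /path/to/export_appA.pl (or an equivalent Windows scheduled task) then runs it on schedule, and you can still run it by hand whenever an extra export is needed.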
EDIT
You might also want to just look into database replication or adding read-only users. This saves a whole lot of arsing around. Try to find a solution that does not split or duplicate the data. You can set up users to only be able to access certain parts of the database in certain ways, such as SELECT-only access to the data.