Translate CSV to ANSI ASC X-12 Standards - csv

My boss just asked me to convert many CSV files to ANSI X12 Standards text files. I have no idea how to do this. Wonder if there is an app or software to do this?
Thanks

Yes, there are a plethora of software packages that can help you translate CSV to EDI. What EDI documents do you need to create?
Liaison Delta and ECS can do it quite easily (and my recommendation). CSV is a very common file format. There are a ton of universal data translators that have this functionality. Boomi, Sterling Commerce, GXS/Inovis, Extol, 1EDISource. There are also hosted solutions like SPS Commerce and DI Central that will do the translation for you.

Related

What software is availible for data quality checking

I'm looking to identify some possible software options that will allow for custom rules to manipulate bulk data files (.csv) For example, proper capitalization (allowing for states to remain capital and unique surnames), identifying the word count of specific words in a field, and some other custom rules. Any guidance would be appreciated.
You could use Talend Open Studio for this task. It is an Opensource ETL tool for data manipulation and integration. You can for example ImportCSV >> DATABASE >> perform transformations >> ExportCSV. The possibilities are endless.
You can find it here: http://www.talend.com/products-data-integration/talend-open-studio.php
It also sounds like you might be looking to create a profile of the data. For this you can use Talend Open Profiler, they recently added support for flat files such as your .csv. It is simple to use and you should be up and running in 30 mins.
You can find the download here: http://www.talend.com/products-data-quality/talend-open-profiler.php
You can find some tutorials here:http://www.talendforge.org/tutorials/menu.php
On the tutorials choose the Data Quality tab, and scroll down until 'Talend Open Profiler'
It is my first step in assessing data quality on a new dataset.
A quick google "data scrubbing utilities" turned up this:
http://data-scrubbing.qarchive.org/
They look to be very close to what you're looking for.
It'll really depend on how complex the rules get. Much more complex than simple stuff, and you'd probably be ahead by just coding something up (or having it coded).

SSIS Term Extraction for non-English text

I started learning Data Mining with SQL Server and I was curious that SQL Server Integration Services is capable to perform Term Extraction from English text. However I'm interested to perform Text Mining from non-English text, basically from Ukrainian. So these are the very questions:
Is there a way to implement Term Extraction from non-English text in SSIS? If yes then any suitable resources would be appreciated:)
If the answer for the first question is positive I would like to know if there are already some custom solutions for non-English text.
Thanks in advance:)
The documentation states that the term extraction transformation supports only English, and there's no mention of a mechanism for adding other languages.
Therefore, I would assume that you need to find some sort of tool that can do term extraction with Ukrainian text, and work out how to integrate it into SSIS. Finding a tool like that isn't really an SSIS issue, it's a general NLP or linguistics question, so you might get a better answer in another forum.

Reversing an old file format Inbox X

I’m trying to reverse engineer an old medical imaging format called Stentor for interoperability. It was designed by a company of the same name who was subsequently bought by Phillips. But Phillips has forgotten how to read Stentor files. I have a windows program which exports JPEG from Stentor files but it’s closed source. I’d like to automate this process in order to tackle hundreds of files in this format.
The program is late-1990s Win32 or MFC executeable. It runs next to an ActiveX (.ocx) file which I’ve been able to interop with, but that file doesn’t contain the export method. I'm looking for suggestions on how to dissemble the binary in order to unearth the algorithm used to convert Stentor to JPEG. I looked through the Stentor files in hex editor and didn’t find any evidence of JPEG (although hints on finding that would be appreciated too), so I think that the program has a couple of tricks up its sleeve.
Thanks in advance.
Kyle
Few programmers implement complex routines such as image recoding themselves. Instead they tend to license libraries that do that. A very smart way to start would be searching for text strings and see if you can discover the libraries they use. This will subsequently give you a lot of insight into how the data is encoded.
Another good strategy would be to build a program that simply runs the GUI of your export program by sending mouse and keyboard events directly to it. Let this run a few days to complete your export. Reverse engineering the file format is going to be slow and expensive so for a 1 time gig it's probably not worthwhile.

How do i convert hdb file? ... believed to be from act! source

Any ideas ?
I think the original source was a goldmine database, looking around it appears that the file was likely built using an application called ACT which I gather is a huge product I don't really want to be deploying for a one off file total size less than 5 meg.
So ...
Anyone know of a simple tool that I can run this file through to convert it to a standard CSV or something?
It does appear to be (when looking at it in notepad and excel) in some sort of csv type format but it's like the data is encrypted somehow.
Ok this is weird,
I got a little confused because the data looked a complete mess, in actual fact the mess was the data, that's what it was meant to look like.
Simply put, i opened the file in notepad, seemed to have a sort of pattern so i droppped it on excel.
Apparently excel has no issues reading these files ... strange huh !!!
I am unaware of any third party tooling for opening these files specifically, although there is an SDK available for C# which could resolve your problem with a little elbow grease.
The SDK can be aquired for free Here
Also there is a developer forum which could provide some valuable resources including training material with sample code Here
Resources will be provided with the SDK
Also, out of interest since ACT is a Sage product have you any Sage software floating about which you could attempt to access the data with? Most offices have!
Failing all of the above there is a trial available for ACT! Here!
Good luck with your problem!

How to reverse engineer binary file formats for compatibility purposes

I am working of a file preparation software to enable translators work easily and efficiently on a wide range of file formats.
As far as text-based formats (xml, php, resource files,...) are concerned, my small preparation utility works fine, but a major problem for most translators is to handle all kinds of proprietary binary formats (Framemaker, Publisher, Quark...).
These files are rarely requested and need to be opened in expensive applications (few freelance can afford to buy $20,000 worth of software just to handle a few projects per year), and even then it is not convenient to work directly in those applications anyway.
I would like to be able to read these files and extract the text in such a way that it can be translated and then re-imported in the original application with minimal effort, or even better, to recreate a valid native binary file.
Does that sound doable?
Where can I find more information on handling binary file formats and are there useful tools for these kind of jobs (besides regular hex editors)?
Thanks in advance.
Of course reverse engineering is possible, but without format specs it will take a lot of work. I would look at the return on effort regarding supporting these 'rarely requested, very expensive' formats. You may be better off spending that effort improving the core functionality of your app.
Another angle is to contact the companies with these formats, explain your goal, explain that it helps their product, and if they don't see you as competition they might be willing to help.
I know that you want to reverse engineer them - but since these may be propriety file formats you are looking at a very steep curve trying to decode them...
Some (as I have written some propritety formats for interal use before) have specific methods and objects written into them that serve some alternative process than the file contents themselves. Stuff that would prove the new file is illegal.
Just my 2 cents and I am no lawyer =>
Maybe you could pick a cheaper application which has import features for QuarkXPress. For example InDesign should be able to read Quark documents. Then use the importing application to export to whatever format you need - maybe with a help of plug-in.