I'm looking for the best way to ingest AES-encrypted .csv files in Foundry. What would be the quickest way to accomplish this task?
There is no out-of-the-box support for symmetric decryption. However, ingested files can be processed downstream in Transforms using cryptography libraries in Python or Spark.
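As a rough illustration of the downstream decryption step, here is a minimal sketch using the third-party Python `cryptography` package. It assumes (this is not stated in the question) that the files were encrypted with AES in CBC mode, with PKCS7 padding and the 16-byte IV prepended to the ciphertext. The function names are hypothetical, and how the raw bytes are read inside a Transform is left out.

```python
import csv
import io
import os

from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK_BITS = 128  # AES block size in bits


def encrypt_csv_bytes(plaintext: bytes, key: bytes) -> bytes:
    """Helper for producing sample data only (assumed layout: IV + ciphertext)."""
    iv = os.urandom(16)
    padder = padding.PKCS7(BLOCK_BITS).padder()
    padded = padder.update(plaintext) + padder.finalize()
    encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
    return iv + encryptor.update(padded) + encryptor.finalize()


def decrypt_csv_bytes(blob: bytes, key: bytes) -> bytes:
    """Decrypt one file, assuming AES-CBC with the IV in the first 16 bytes."""
    iv, ciphertext = blob[:16], blob[16:]
    decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()
    unpadder = padding.PKCS7(BLOCK_BITS).unpadder()
    return unpadder.update(padded) + unpadder.finalize()


def decrypted_rows(blob: bytes, key: bytes) -> list:
    """Decrypt and parse the CSV into a list of dicts keyed by header name."""
    text = decrypt_csv_bytes(blob, key).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))
```

If the files use a different mode (for example GCM) or a different IV convention, the slicing and the `modes` object would need to change accordingly.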
I have millions of documents in different collections in my database. I need to export them to a csv onto my local storage when I specify the collection name.
I tried mlcp export, but it didn't work. We cannot use CoRB for this because of some issues.
I want the csv to be in such a format that if I try a mlcp import then I should be able to restore all docs just the way they were.
My first thought would be to use the MLCP archive feature, and not to export to CSV at all.
If you really want CSV, Corb2 would be my first thought. It provides CSV export functionality out of the box. It might be worth digging into why that didn't work for you.
DMSDK might work too, but involves writing code that handles the writing of CSV, which sounds cumbersome to me.
Last option that comes to mind would be Apache NiFi for which there are various MarkLogic Processors. It allows orchestration of data flow very generically. It could be rather overkill for your purpose though.
HTH!
ml-gradle has support for exporting documents and referencing a transform, which can convert each document to CSV - https://github.com/marklogic-community/ml-gradle/wiki/Exporting-data#exporting-data-to-csv .
Unless all of your documents are flat, you likely need some custom code to determine how to map a hierarchical document into a flat row. So a REST transform is a reasonable solution there.
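To make the mapping problem concrete, here is a small language-neutral sketch in Python (not MarkLogic transform code). It flattens nested objects into dotted column names and joins arrays with a `|` separator; both conventions are assumptions chosen for illustration, and the function names are hypothetical.

```python
import csv
import io


def flatten(doc, prefix=""):
    """Map a nested dict to a flat dict with dotted keys, e.g. name.first."""
    row = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, path))
        elif isinstance(value, list):
            # Assumed convention: join list items into one cell.
            row[path] = "|".join(str(v) for v in value)
        else:
            row[path] = value
    return row


def docs_to_csv(docs):
    """Flatten every document and write one CSV with the union of all columns."""
    rows = [flatten(d) for d in docs]
    fieldnames = sorted({k for r in rows for k in r})
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # missing columns are written as empty cells
    return buf.getvalue()
```

Note that a mapping like this is lossy (types, empty values, and array structure are not preserved), which is one reason a CSV round trip may not restore documents exactly as they were.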
You can also use a TDE template to project your documents into rows, and the /v1/rows endpoint can return results as CSV. That of course requires creating and loading a TDE template, and then waiting for the matching documents to be re-indexed.
I have searched high and low, but it seems like mysqldump and "select ... into outfile" are both intentionally blocked by not allowing file permissions to the db admin. Wouldn't it save a lot more server resources to allow file permissions than to disallow them? Every other import/export method I can find runs much slower, especially with tables that have millions of rows. Does anyone know a better way? I find it hard to believe Azure left no good way to do this common task.
You did not list the other options you found to be slow, but have you thought about using Azure Data Factory:
Use Data Factory, a cloud data integration service, to compose data storage, movement, and processing services into automated data pipelines.
It supports exporting data from Azure MySQL and MySQL:
You can copy data from MySQL database to any supported sink data store. For a list of data stores that are supported as sources/sinks by the copy activity, see Supported data stores and formats
Azure Data Factory allows you to define mappings (optional!), and / or transform the data as needed. It has a pay per use pricing model.
You can start an export manually or on a schedule using the .NET or Python SDK, the REST API, or PowerShell.
It seems you are looking to export the data to a file, so Azure Blob Storage or Azure Files are likely to be a good destination. FTP or the local file system are also possible.
The effect of "SELECT ... INTO OUTFILE" can be achieved using MySQL Workbench:
1. Select the table.
2. Open the Table Data Export Wizard.
3. Export the data as CSV or JSON.
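A client-side export like the one the wizard performs can also be scripted. The sketch below is written against the generic Python DB-API and streams rows to CSV in batches; it is demonstrated with the standard-library `sqlite3` module only so that it runs anywhere. The assumption is that a connection from a MySQL driver would be passed in instead. The function name and batch size are hypothetical.

```python
import csv
import sqlite3  # stand-in driver; any DB-API 2.0 connection should work


def export_query_to_csv(conn, query, out_path, batch_size=10_000):
    """Stream the result of `query` to a CSV file, batch_size rows at a time."""
    cur = conn.cursor()
    cur.execute(query)
    columns = [d[0] for d in cur.description]  # column names from the cursor
    total = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(columns)
        while True:
            batch = cur.fetchmany(batch_size)
            if not batch:
                break
            writer.writerows(batch)
            total += len(batch)
    cur.close()
    return total
```

Some MySQL drivers buffer the full result set on the client by default, so for tables with millions of rows an unbuffered (server-side) cursor may be needed for the batching to actually limit memory use.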
I'm trying to access files in my amazon S3 and do some operations on it. Currently evaluating the options.
Since I will be doing some operations on the S3 files, I would prefer using some language to access the files in S3 (I have already tried copy command).
My S3 bucket contains JSON files that range between 2 MB and 4 MB, and I need to parse these JSON files and load them into a database (I'm thinking about using jQuery here, but any other suggestions are welcome).
Given these requirements, which is the most efficient language/platform to use here?
Your options are pretty broad here. AWS has a list of SDKs for you to choose from: https://aws.amazon.com/tools/#sdk
So your comfort level with a particular language should be your largest influencer. Given that you mentioned JSON and jQuery, perhaps you should look at the Node.js SDK and AWS Lambda.
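For comparison, here is what the fetch-parse-load flow might look like with the Python SDK (boto3) rather than Node.js. This is only a sketch: it assumes each file holds a JSON array of objects with an `id` field, uses a hypothetical `events` table, and uses the standard-library `sqlite3` as a stand-in for the target database.

```python
import json
import sqlite3  # stand-in for the real target database


def fetch_json_from_s3(bucket, key):
    """Download one object and parse it; files of a few MB fit in memory."""
    import boto3  # third-party AWS SDK for Python, imported lazily

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)


def load_records(conn, records):
    """Insert records into the (hypothetical) events table, keyed by id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id TEXT PRIMARY KEY, payload TEXT)"
    )
    rows = [(str(r["id"]), json.dumps(r, sort_keys=True)) for r in records]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO events (id, payload) VALUES (?, ?)", rows
        )
    return len(rows)
```

A call such as `load_records(conn, fetch_json_from_s3("my-bucket", "path/file.json"))` would tie the two together; the bucket and key here are placeholders.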
I am new to GeoMesa; so far I have only run the geomesa command. After following the command line tools tutorial on the GeoMesa website, I found some information on ingesting data into GeoMesa through a .csv file.
So, for my research:
I have a MySQL database storing all the information sent from an Android Application.
And I want to perform some geo spatial analytics on it.
Right now I am converting my MySQL table to a .csv file and then ingesting it into GeoMesa, as advised on the GeoMesa website.
But my questions are:
Is there a better option? The data runs to gigabytes and is streaming, so I have to regenerate the .csv file regularly.
Is there any API through which I can connect my MySQL database to GeoMesa?
Is there any way to ingest a .sql dump file directly? That would be easier than a .csv file.
Since you are dealing with streaming data, I'd point to two GeoMesa integrations:
First, you might want to check out NiFi for managing data flows. If that fits into your architecture, then you can use GeoMesa with NiFi.
Second, Storm is quite popular for working with streaming data. GeoMesa has a brief tutorial for Storm here.
As for ingesting SQL dumps directly, one option would be to extend the GeoMesa converter library to support them. So far, we haven't had that as a feature request from a customer or a contribution to the project. It'd definitely be a sensible and welcome extension!
I'd also point out the GeoMesa gitter channel. It can be useful for quicker responses.
Has anyone had much experience with data migration into and out of NetSuite? I have to export DB2 tables into MySQL, manipulate the data, and then export it in a CSV file. Then I have to take a CSV file of accounts and manipulate the data again so that accounts match up from our old system to the new one. Has anyone tried to do this in MySQL?
A couple of options:
Invest in a data transformation tool that connects to NetSuite and DB2 or MySQL. Look at Dell Boomi, IBM Cast Iron, etc. These tools allow you to connect to both systems, define the data to be extracted, perform data transformation functions and mappings and do all the inserts/updates or whatever you need to do.
For MySQL to NetSuite, PHP scripts can be written to access MySQL and NetSuite. On the NetSuite side, you can either use SOAP web services, or you can write custom REST APIs within NetSuite. SOAP is probably a bit slower than REST, but with REST, you have to write the API yourself (server-side JavaScript; it's not hard, but there's a learning curve).
Hope this helps.
I'm an IBM i programmer; try CPYTOIMPF to create a pretty generic CSV file. I'll go to a stream file - if you have NetServer running you can map a network drive to the IFS directory or you can use FTP to get the CSV file from the IFS to another machine in your network.
Try Adeptia's NetSuite integration tool to perform ETL. You can also try Pentaho ETL for this (as far as I know, Celigo's NetSuite connector is built upon Pentaho). Jitterbit also has an extension for NetSuite.
We primarily have 2 options to pump data into NS:
i) SuiteTalk: lets us do SOAP-based transformations. There are two versions of SuiteTalk, synchronous and asynchronous.
Typical tools like Boomi/Mule/Jitterbit use synchronous SuiteTalk to pump data into NS. They also have decent editors to help you do the mapping.
ii) RESTlets: NetSuite's typical REST-based architecture, which can also be used, but you may have to write external brokers to communicate with them.
Depending on your needs, you can use either one. In most cases you will be using SuiteTalk to bring data into NetSuite.
Hope this helps ...
We just got done doing this. We used an iPaaS platform called Jitterbit (similar to Dell Boomi). It can connect to MySQL and to NetSuite, and you can do transformations in the tool. I have been really impressed with the platform overall so far.
There are different approaches, I like the following to process a batch job:
To import data to Netsuite:
Export a CSV from the old system and place it in a folder in NetSuite's File Cabinet (use a RESTlet or web services for this).
Run a scheduled script to load the files in the folder and update the records.
Don't forget to handle errors. Ways to handle errors: send an email, create a custom record, log to a file, or write to the record.
Once the file has been processed move the file to another folder or delete it.
To export data out of Netsuite:
Gather data and export to a CSV (You can use a saved search or similar)
Place CSV in File Cabinet folder.
From external server call webservices or RESTlet to grab new CSV files in the folder.
Process file.
Handle errors.
Call webservices or RESTlet to move CSV File or Delete.
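The process, handle errors, and move steps on the external server can be sketched as follows. This assumes the CSV files have already been downloaded into a local inbox folder (the web services or RESTlet calls are not shown); the folder layout, function name, and `handle_row` callback are all hypothetical.

```python
import csv
import shutil
from pathlib import Path


def process_folder(inbox, processed, failed, handle_row):
    """Process each CSV in inbox, then move it to processed or failed."""
    inbox, processed, failed = Path(inbox), Path(processed), Path(failed)
    processed.mkdir(parents=True, exist_ok=True)
    failed.mkdir(parents=True, exist_ok=True)
    summary = {"ok": [], "failed": []}
    for path in sorted(inbox.glob("*.csv")):
        try:
            with path.open(newline="", encoding="utf-8") as fh:
                for row in csv.DictReader(fh):
                    handle_row(row)  # caller-supplied per-row logic
        except Exception as exc:
            # Keep the file for inspection and record why it failed.
            shutil.move(str(path), str(failed / path.name))
            summary["failed"].append((path.name, str(exc)))
        else:
            shutil.move(str(path), str(processed / path.name))
            summary["ok"].append(path.name)
    return summary
```

Moving files out of the inbox after each run means a file is never picked up twice, and the returned summary can feed whatever error reporting is chosen (email, log file, and so on).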
You can also use Pentaho Data Integration; it's free and the learning curve is not that steep. I took this course and I was able to play around with the tool within a couple of hours.