Fastest way to download/access files in Amazon S3 - json

I'm trying to access files in my Amazon S3 bucket and do some operations on them. I'm currently evaluating the options.
Since I will be doing some operations on the S3 files, I would prefer using a programming language to access them (I have already tried the copy command).
My S3 bucket contains JSON files ranging between 2 MB and 4 MB, and I need to parse these JSON files and load them into a database (thinking about using jQuery here, but any other suggestions are welcome).
Given these requirements, which is the most efficient language/platform to use here?

Your options are pretty broad here. AWS has a list of SDKs for you to choose from: https://aws.amazon.com/tools/#sdk
So your comfort level with a particular language should be your largest influencer. Given that you mentioned JSON and jQuery, perhaps you should look at the Node.js SDK and AWS Lambda.
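For example, here is a minimal sketch of a Lambda handler (Node.js, aws-sdk v2) that reads a newly uploaded JSON file from the triggering S3 event and parses it; loadIntoDatabase() is a hypothetical placeholder for whatever database insert logic you end up choosing.

// Minimal sketch: Lambda handler triggered by an S3 "object created" event (Node.js, aws-sdk v2).
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
  const bucket = event.Records[0].s3.bucket.name;
  const key = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));

  // Files of 2-4 MB parse comfortably in memory.
  const obj = await s3.getObject({ Bucket: bucket, Key: key }).promise();
  const records = JSON.parse(obj.Body.toString('utf-8'));

  await loadIntoDatabase(records); // hypothetical placeholder for your DB insert logic
  return { processed: key };
};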

Related

Configure Apache Drill to read xml files in Mapr distribution

I have a project where I need to read XML files with Apache Drill and process them. Can someone tell me how to configure it?
NB: I use the MapR distribution.
I tried to add the configuration in the configuration UI, but I get an error (see image).
Thanks in advance
You'll need to use a Drill distribution based on Apache Drill >= 1.19 for the XML format plugin.
So this is more of a Drill question than a MapR question.
There are two key steps here:
make sure that Drill can access whatever you use to store your data (it sounds like your data is XML files in MapR, which is now called HPE Ezmeral Data Fabric)
make sure that Drill can understand the data you have. I am not current on Drill, but reading many kinds of XML should be doable.
For getting access, there are two major paths to accessing files on Ezmeral Data Fabric. One path is to mount the data fabric as a conventional file system on all the nodes running Drillbits. This is often done using NFS mounts, but it can also be done with the FUSE driver provided with the data fabric.
The other major approach to getting data access is to use the HDFS API framework to access data via maprfs://... path names. This requires installing the data fabric client on all of the nodes running Drillbits.
It sounds like you are running the version of Drill that is packaged with the old MapR or current HPE Ezmeral system. This is the easiest approach since the packaged version is integrated with the client libraries needed to use the HDFS API with maprfs:// resources (it also provides access to the tables and streams in the data fabric).
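For reference, once you are on Drill 1.19 or later, enabling XML is roughly a matter of adding an "xml" entry to the "formats" section of whichever storage plugin points at your files, in the same configuration UI (this mirrors the Drill docs; the dataLevel value, which controls how deep in the nesting your repeating records start, is an assumption you would adjust):

"xml": {
  "type": "xml",
  "extensions": ["xml"],
  "dataLevel": 1
}

Once that is saved, a query such as SELECT * FROM dfs.`/path/to/file.xml` should work, with dfs replaced by the plugin that fronts your data fabric paths.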

Reading metrics in .csv format into prometheus from S3

I have several hundred IoT devices uploading performance metrics in .csv format into an S3 bucket. I need to get those metrics into my already existing Prometheus/Alertmanager monitoring solution. I'm attempting to use an existing exporter called mtail (https://github.com/google/mtail), but for the life of me I can't figure out how to use it to parse a CSV file. The documentation says it uses an awk-like language, but the way to set a custom delimiter in awk is with -f, and that doesn't seem relevant to mtail programs (or is it?).
If there's a better way to get .csv-formatted metrics from S3 into prometheus I'm open to suggestions.
I think VictoriaMetrics might be a good option for you: https://docs.victoriametrics.com/?highlight=Import#how-to-import-csv-data
You might also consider its agent for scraping on the IoT devices.
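As a rough sketch of that approach (not the only way to wire it up): a small Node.js 18+ script using the aws-sdk v2 package that pulls one CSV object from S3 and POSTs it to VictoriaMetrics' CSV import endpoint. The bucket, key, host, and column layout below are assumptions; the format= mapping has to match your real columns, and a header row should be stripped first.

// Rough sketch: copy one CSV object from S3 into VictoriaMetrics via /api/v1/import/csv.
// Assumes Node.js 18+ (global fetch) and the aws-sdk v2 package; all names are placeholders.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

// Map CSV columns to VictoriaMetrics fields: column 1 -> label, column 2 -> metric, column 3 -> unix timestamp.
const FORMAT = '1:label:device_id,2:metric:cpu_usage,3:time:unix_s';
const VM_URL = 'http://victoria-metrics:8428/api/v1/import/csv?format=' + encodeURIComponent(FORMAT);

async function importCsv(bucket, key) {
  const obj = await s3.getObject({ Bucket: bucket, Key: key }).promise();
  const body = obj.Body.toString('utf-8'); // strip a header row here if your files have one
  const res = await fetch(VM_URL, { method: 'POST', body: body });
  if (!res.ok) throw new Error('import failed: ' + res.status);
}

importCsv('iot-metrics-bucket', 'device-42/2024-01-01.csv').catch(console.error);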

How to go about storing and accessing of images inside the blog using LAMP stack?

I want to create a technical blog using the LAMP stack (Laravel framework). I would like to know: what is the best way of storing and accessing images inside blog content?
There is one way of doing this that I could think of:
(1) Storing the images as files and then accessing them via a path specified in the src attribute of the <img> tag, which could be part of the content fetched from the database.
The most correct approach would be to store them through Laravel's storage layer. Laravel provides a powerful filesystem abstraction thanks to the wonderful Flysystem PHP package by Frank de Jonge. The Laravel Flysystem integration provides simple-to-use drivers for working with local filesystems, Amazon S3, and Rackspace Cloud Storage. Even better, it's amazingly simple to switch between these storage options, as the API remains the same for each system.
That is, you can store them locally on your LAMP server or you can use an external server for that. Both ways are good; it depends on your needs.
You have to store the relative path in the database, e.g. /path/to/image.jpg.
Then you can serve those files easily using the Storage facade.
If you are using the local driver, this will typically just prepend /storage to the given path and return a relative URL to the file. If you are using the s3 or rackspace driver, the fully qualified remote URL will be returned:
use Illuminate\Support\Facades\Storage;
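// With the local driver, this typically returns a relative URL like /storage/image.jpg;
// with the s3 or rackspace driver, it returns the fully qualified remote URL.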
$url = Storage::url('image.jpg');

How to transfer JSON files (containing test execution results) from my local system to an Azure VM (Linux) or to a storage disk attached to a web server.

It would be a great help if someone could recommend an easier approach to this using Java.
Current framework implementation (screenshot)
Expected framework implementation (screenshot)
You haven't really shared any code or deep technical details, so I can only answer on high level.
In your first screenshot, you have an arrow where you copy the test execution results back to your localhost. You have to modify this logic: instead of copying back to localhost, upload the test execution files to an Azure storage account. You'll need Blob storage to store your files. If you want to do it with Java, here is some sample code with Java v7 showing how to upload files to Blob storage: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-java
After you have uploaded the files to Azure storage, you can access the JSON files from your web server. Or, if you make your container public, you won't even need the web server to access these files.
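If it helps, here is a minimal sketch using the current Azure Blob Storage SDK for Java (v12) rather than the legacy client from the linked quickstart; the connection string, container name, and paths are placeholders, not values from your setup.

// Minimal sketch (azure-storage-blob v12): upload a local JSON result file to a blob container.
// The connection string, container name, and paths below are placeholders.
import com.azure.storage.blob.BlobClient;
import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobServiceClientBuilder;

public class UploadTestResults {
    public static void main(String[] args) {
        BlobContainerClient container = new BlobServiceClientBuilder()
                .connectionString(System.getenv("AZURE_STORAGE_CONNECTION_STRING"))
                .buildClient()
                .getBlobContainerClient("test-results");

        BlobClient blob = container.getBlobClient("run-001/results.json");
        blob.uploadFromFile("/path/to/local/results.json", true); // true = overwrite if the blob exists
    }
}

Once the files are in the container, your web server (or anything else with access) can read them over HTTPS or through the same SDK.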

NetSuite Migrations

Has anyone had much experience with data migration into and out of NetSuite? I have to export DB2 tables into MySQL, manipulate the data, and then export it in a CSV file. Then take a CSV file of accounts and manipulate the data again so that accounts from our old system match up with the new one. Has anyone tried to do this in MySQL?
A couple of options:
Invest in a data transformation tool that connects to NetSuite and DB2 or MySQL. Look at Dell Boomi, IBM Cast Iron, etc. These tools allow you to connect to both systems, define the data to be extracted, perform data transformation functions and mappings and do all the inserts/updates or whatever you need to do.
For MySQL to NetSuite, PHP scripts can be written to access MySQL and NetSuite. On the NetSuite side, you can either use SOAP web services, or you can write custom REST APIs (RESTlets) within NetSuite. SOAP is probably a bit slower than REST, but with REST you have to write the API yourself (server-side JavaScript; it's not hard, but there's a learning curve).
Hope this helps.
I'm an IBM i programmer; try CPYTOIMPF to create a pretty generic CSV file. It'll go to a stream file; if you have NetServer running you can map a network drive to the IFS directory, or you can use FTP to get the CSV file from the IFS to another machine in your network.
Try Adeptia's NetSuite integration tool to perform ETL. You can also try Pentaho ETL for this (as far as I know, Celigo's NetSuite connector is built upon Pentaho). Jitterbit also has an extension for NetSuite.
We primarily have two options to pump data into NetSuite:
i) SuiteTalk: NetSuite's SOAP-based web services. There are two versions of SuiteTalk, synchronous and asynchronous.
Typical tools like Boomi/Mule/Jitterbit use synchronous SuiteTalk to pump data into NetSuite. They also have decent editors to help you do the mapping.
ii) RESTlets: REST-based endpoints you write within NetSuite can also be used, but you may have to write external brokers to communicate with them (a bare-bones RESTlet sketch follows below).
Depending on your needs, you can use either. In most cases you will be using SuiteTalk to bring data into NetSuite.
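For illustration only, here is a bare-bones RESTlet sketch in SuiteScript 2.x; the record type and field ID are assumptions, and a real RESTlet would validate its input and handle errors:

/**
 * @NApiVersion 2.1
 * @NScriptType Restlet
 */
define(['N/record'], function (record) {
    // POST handler: expects a JSON body such as { "companyName": "Acme" } and creates a record from it.
    function post(requestBody) {
        var rec = record.create({ type: record.Type.CUSTOMER, isDynamic: true });
        rec.setValue({ fieldId: 'companyname', value: requestBody.companyName });
        return { success: true, id: rec.save() };
    }
    return { post: post };
});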
Hope this helps ...
We just got done doing this. We used an iPaaS platform called Jitterbit (similar to Dell Boomi). It can connect to MySQL and to NetSuite, and you can do the transformations in the tool. I have been really impressed with the platform overall so far.
There are different approaches; I like the following for processing a batch job:
To import data into NetSuite:
Export a CSV from the old system and place it in a NetSuite File Cabinet folder (use a RESTlet or web services for this).
Run a scheduled script to load the files in the folder and update the records (a rough sketch follows this list).
Don't forget to handle errors. Ways to handle errors: send an email, create a custom record, log to a file, or write to a record.
Once a file has been processed, move it to another folder or delete it.
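A rough sketch of such a scheduled script in SuiteScript 2.x, assuming the CSVs sit in a known File Cabinet folder; the folder IDs, the file search filters, and the record-update logic are assumptions you would adapt:

/**
 * @NApiVersion 2.1
 * @NScriptType ScheduledScript
 */
define(['N/file', 'N/search', 'N/log'], function (file, search, log) {
    var INCOMING_FOLDER_ID = 123;   // assumed File Cabinet folder holding new CSVs
    var PROCESSED_FOLDER_ID = 456;  // assumed folder for files that have been handled

    function execute(context) {
        search.create({
            type: 'file',
            filters: [['folder', 'anyof', INCOMING_FOLDER_ID], 'and', ['name', 'contains', '.csv']],
            columns: ['name']
        }).run().each(function (result) {
            try {
                var csvFile = file.load({ id: result.id });
                var lines = csvFile.getContents().split(/\r?\n/);
                // ... parse each line and create/update the target records here ...

                csvFile.folder = PROCESSED_FOLDER_ID; // move the file once it has been processed
                csvFile.save();
            } catch (e) {
                log.error({ title: 'CSV import failed: ' + result.getValue('name'), details: e });
            }
            return true; // keep iterating over the search results
        });
    }

    return { execute: execute };
});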
To export data out of NetSuite:
Gather the data and export it to a CSV (you can use a saved search or similar).
Place the CSV in a File Cabinet folder.
From an external server, call web services or a RESTlet to grab the new CSV files in the folder.
Process the files.
Handle errors.
Call web services or a RESTlet to move or delete the CSV files.
You can also use Pentaho Data Integration; it's free and the learning curve is not that difficult. I took this course and was able to play around with the tool within a couple of hours.