I have several hundred IoT devices uploading performance metrics in .csv format to an S3 bucket. I need to get those metrics into my existing Prometheus/Alertmanager monitoring solution. I'm attempting to use an existing exporter called mtail (https://github.com/google/mtail), but for the life of me I can't figure out how to use it to parse a CSV file. The documentation says to use an awk-like language, but the way to set a custom delimiter in awk is with -F, and that's not relevant to mtail programs (or is it?).
If there's a better way to get .csv-formatted metrics from S3 into Prometheus, I'm open to suggestions.
I think VictoriaMetrics might be a good option for you: https://docs.victoriametrics.com/?highlight=Import#how-to-import-csv-data
You might also consider its agent for scraping the IoT devices.
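Whichever tool ends up doing the import, the core step is mapping each CSV row to a labelled sample. Here is a minimal Python sketch that turns CSV text into Prometheus text exposition lines. It is not an mtail program, and the column names device_id, metric and value, as well as the iot_ prefix, are assumptions made for illustration only.

```python
import csv
import io


def csv_to_exposition(csv_text, prefix="iot_"):
    """Convert CSV rows into Prometheus text exposition lines.

    Assumes (hypothetically) a header row with the columns
    device_id, metric and value; adjust to the real file layout.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Metric names may not contain '-' or '.', so map them to '_'.
        name = prefix + row["metric"].strip().replace("-", "_").replace(".", "_")
        # Escape backslashes and double quotes inside the label value.
        device = row["device_id"].replace("\\", "\\\\").replace('"', '\\"')
        value = float(row["value"])
        lines.append(f'{name}{{device_id="{device}"}} {value}')
    return "\n".join(lines) + "\n"


sample = "device_id,metric,value\nsensor-1,cpu_temp,41.5\nsensor-2,cpu_temp,39\n"
print(csv_to_exposition(sample), end="")
```

The resulting text could be served over HTTP for Prometheus to scrape, or pushed through a gateway, depending on how the rest of the pipeline is set up.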
Related
I am currently trying to search a group of ebooks to learn more about C#. The aim is to ask a question and get back a page in one or more of the ebooks to read. I went to the G Suite chat team, and they kindly directed me to the Vision commands, which were easy enough to follow to produce multiple JSON files.
https://cloud.google.com/vision/docs/pdf
I want to feed these files into AutoML Natural Language. To do so, a CSV file is required.
I do not know how to create a CSV file that would get me past this point, and I am currently stuck.
How do I create a CSV file using a gcloud command, and shouldn't the JSON file be a JSONL file to be accepted?
Thanks in advance for your answer.
The output from the Vision API (service) is a JSON file written to Cloud Storage.
AutoML expects its input dataset to be in CSV format and stored in Cloud Storage.
This isn't a gcloud issue but a general data-transformation problem: transforming JSON to CSV.
Google Cloud includes services that could help you with this, but I suggest you start by writing a script that converts the data (i.e. loads and parses the JSON file, then creates a CSV file in the format AutoML requires).
You may want to search to see whether others have done something similar and use their code as a starting point.
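As a starting point, a conversion script along those lines could look like the Python sketch below. It assumes each Vision output file has a top-level "responses" list whose items carry the page text under "fullTextAnnotation" and "text"; check that against your own files. The two-column text,label layout and the label value are placeholders for illustration, so consult the AutoML documentation for the exact CSV format it requires.

```python
import csv
import io
import json


def vision_json_to_rows(json_text, label):
    """Return one (text, label) row per page found in a Vision output file.

    Assumes the file has a top-level "responses" list whose items carry
    "fullTextAnnotation" -> "text"; verify against your own output files.
    """
    doc = json.loads(json_text)
    rows = []
    for response in doc.get("responses", []):
        text = response.get("fullTextAnnotation", {}).get("text", "")
        text = " ".join(text.split())  # collapse newlines and runs of spaces
        if text:
            rows.append((text, label))
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text, quoting fields that contain commas or quotes."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical sample standing in for one Vision output file.
sample = json.dumps(
    {"responses": [{"fullTextAnnotation": {"text": "Classes and\nobjects, part 1"}}]}
)
print(rows_to_csv(vision_json_to_rows(sample, "csharp_basics")), end="")
```

Using the csv module rather than joining strings by hand matters here, because page text routinely contains commas and quotes.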
NOTE: IIUC, your solution, while an interesting use of these technologies, may be overkill. If you're looking to learn the Vision API and AutoML, great. If not, most of this content is available more directly as searchable HTML and text on the web, and indeed Stack Overflow exists to answer developer questions on a myriad of topics, including C#.
I'm trying to access files in my Amazon S3 bucket and do some operations on them. I'm currently evaluating the options.
Since I will be doing some operations on the S3 files, I would prefer to use a programming language to access them (I have already tried the copy command).
My S3 bucket contains JSON files of between 2 MB and 4 MB, and I need to parse these JSON files and load them into a database (I'm thinking about using jQuery here, but any other suggestions are welcome).
Given these requirements, which is the most efficient language/platform to use here?
Your options are pretty broad here. AWS has a list of SDKs for you to choose from: https://aws.amazon.com/tools/#sdk
So your comfort level with a particular language should be your largest influencer. Given that you mentioned JSON and jQuery, perhaps you should look at the Node.js SDK and AWS Lambda.
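To give a feel for how little code the parse-and-load step needs in any of the SDK languages, here is a minimal sketch in Python (not the Node.js route suggested above). It assumes each file holds a JSON array of objects with hypothetical "id" and "name" keys, and it uses SQLite as a stand-in for the target database. The S3 download relies on the boto3 package and is kept in a separate function.

```python
import json
import sqlite3


def load_records(raw_json, conn):
    """Parse a JSON array of objects and insert them into a table.

    Assumes (hypothetically) that each object has "id" and "name" keys.
    """
    records = json.loads(raw_json)
    conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT OR REPLACE INTO items (id, name) VALUES (?, ?)",
        [(r["id"], r["name"]) for r in records],
    )
    conn.commit()
    return len(records)


def fetch_from_s3(bucket, key):
    """Download one object body from S3 (requires the boto3 package)."""
    import boto3  # imported lazily so the parsing code works without it

    return boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()


# Demo with an in-memory database and inline JSON instead of a real bucket.
conn = sqlite3.connect(":memory:")
count = load_records('[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]', conn)
print(count, "records loaded")
```

Files of 2 MB to 4 MB fit comfortably in memory, so parsing each one whole is a reasonable default.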
I am new to GeoMesa; I have only just run the geomesa command. After following the command line tools tutorial on the GeoMesa website, I found some information on ingesting data into GeoMesa from a .csv file.
So, for my research:
I have a MySQL database storing all the information sent from an Android Application.
And I want to perform some geo spatial analytics on it.
Right now I am converting my MySQL table to a .csv file and then ingesting it into GeoMesa, as advised on the GeoMesa website.
But my questions are:
Is there a better option, given that the data runs to gigabytes and is streaming, so I have to regenerate the .csv file regularly?
Is there an API through which I can connect my MySQL database to GeoMesa?
Is there a way to ingest a .sql dump file directly, since that would be easier than using a .csv file?
Since you are dealing with streaming data, I'd point to two GeoMesa integrations:
First, you might want to check out NiFi for managing data flows. If that fits into your architecture, then you can use GeoMesa with NiFi.
Second, Storm is quite popular for working with streaming data. GeoMesa has a brief tutorial for Storm here.
Third, to ingest sql dumps directly, one option would be to extend the GeoMesa converter library to support them. So far, we haven't had that as a feature request from a customer or a contribution to the project. It'd definitely be a sensible and welcome extension!
I'd also point out the GeoMesa gitter channel. It can be useful for quicker responses.
I can't find any input plugin for relational databases in the Logstash documentation.
What is the best approach to import data from a relational database table with Logstash? Is it to connect Elasticsearch directly to the database using JDBC?
You'll need to use JDBC River (https://github.com/jprante/elasticsearch-river-jdbc) to load JDBC data into Elasticsearch (or write your own code to do it).
It looks like there are several JIRA issues open requesting JDBC loading in Logstash, but they haven't been worked on: https://logstash.jira.com/browse/LOGSTASH-1764
There's this
WIP: Under Development, NOT FOR PRODUCTION
This is a plugin for Logstash.
It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
Logstash provides infrastructure to automatically generate documentation for this plugin. We use the asciidoc format to write documentation so any comments in the source code will be first converted into asciidoc and then into html. All plugin documentation are placed under one central location.
So far, there is no Logstash API for reading SQL.
My recommendation is to write a program or script (in Java or Python, for example) that reads the logs from the SQL database and writes them to a file. Then use the Logstash file input to read from that file. The Logstash website has a getting-started tutorial, and it is easy to learn.
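A minimal Python sketch of such an export script is below. It uses SQLite as a stand-in for the real database, and the table name logs with columns id and message is hypothetical. It writes one JSON object per line and remembers the highest id exported, so repeated runs only append new rows.

```python
import io
import json
import sqlite3


def export_new_rows(conn, out_file, last_id=0):
    """Append rows with id > last_id as JSON lines; return the new high-water mark."""
    cursor = conn.execute(
        "SELECT id, message FROM logs WHERE id > ? ORDER BY id", (last_id,)
    )
    for row_id, message in cursor:
        out_file.write(json.dumps({"id": row_id, "message": message}) + "\n")
        last_id = row_id
    return last_id


# Demo with an in-memory database and an in-memory "file".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")
conn.executemany("INSERT INTO logs (message) VALUES (?)", [("started",), ("stopped",)])

out = io.StringIO()
last_seen = export_new_rows(conn, out)
print(out.getvalue(), end="")
```

In a real run, out would be a file opened in append mode at the path Logstash watches, and last_seen would be persisted between runs.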
Good Luck
I have downloaded "High Resolution Initial Conditions" climate forecast data for one day. It came as a .tar.gz archive, so I extracted it into a local directory and got the files shown in the attached image. I think that the files without an extension are GRIB data (because the first word in them is "GRIB"). I want to get the data from the big files (GRIB and NetCDF formats containing climate data such as temperature and pressure on a grid) into my database, but they are binary. Can you recommend an easy way of getting data out of these files? I can't find any information about handling these datasets on their website.
Converting these files to .csv would be nice, but I can't find a program to convert the GRIB files.
Using python and some available modules it is simple...
The Enthought Python Distribution includes several packages, including netCDF4, to deal with NetCDF files!
I've never worked with GRIB files, but Google tells me that another Python package exists for them, pygrib.
Or you can use PyNIO, a Python package that lets you read and write netCDF3 and netCDF4 classic format files, and read GRIB1 and GRIB2 files.
I don't know the amount of data you have, but it is usually crazy to convert it all to *.csv! Python is easy to learn and well suited to working with this kind of data (with the matplotlib package you can even plot it). Or, if you really need it as *.csv, you can use Python to select a smaller domain, for example, or only the variables you need...
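To illustrate the "select a smaller domain" idea, here is a minimal sketch that writes only the grid cells inside a latitude/longitude box as CSV. The reader function uses netCDF4's Dataset class; the coordinate variable names "lat" and "lon" and the (time, lat, lon) dimension order are assumptions, so inspect the file's variables to find the real ones.

```python
import csv
import io


def subset_to_csv(lats, lons, grid, lat_range, lon_range):
    """Write grid cells inside the given lat/lon ranges as CSV text.

    grid[i][j] is the value at lats[i], lons[j].
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["lat", "lon", "value"])
    for i, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        for j, lon in enumerate(lons):
            if lon_range[0] <= lon <= lon_range[1]:
                writer.writerow([float(lat), float(lon), float(grid[i][j])])
    return buffer.getvalue()


def read_netcdf_slice(path, var_name, time_index=0):
    """Read one time step of a (time, lat, lon) variable with netCDF4.

    The variable names "lat" and "lon" and the dimension order are
    assumptions; inspect the file's variables to find the real ones.
    """
    from netCDF4 import Dataset  # third-party; imported lazily

    with Dataset(path) as ds:
        lats = ds.variables["lat"][:]
        lons = ds.variables["lon"][:]
        grid = ds.variables[var_name][time_index, :, :]
    return lats, lons, grid


# Demo on a tiny hand-made grid instead of a real file.
print(subset_to_csv([10, 20], [0, 5], [[1, 2], [3, 4]], (15, 25), (0, 10)), end="")
```

Restricting the box and the time step before writing keeps the CSV small, which is the point of the advice above.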
For conversion into text, look into http://www.cpc.ncep.noaa.gov/products/wesley/wgrib.html or http://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/
Both are C programs from one of the big names in GRIB.
I'm currently dealing with a similar issue.
In my case I'm trying to rely on the GrADS software, which can "easily" transform GRIB data into other formats.
If your dataset is not huge, then you can export it to CSV using this tutorial.
My dataset is 80 GB of GRIB binary files, so I'm very restricted in what software I can use to handle it (no R unless I find a computer with more than 80 GB of RAM).