I've got a test SSIS package that reads this API: https://api.coindesk.com/v1/bpi/currentprice.json
and exports the result to a table in SQL Server.
What is the best way of parsing this data so it is split into multiple columns correctly?
Disclaimer: I work for ZappySys (a company which makes API connectors / drivers for SSIS and ODBC).
Loading data from a JSON file or REST API into SQL Server can be done in a few ways. For example, I literally took the URL you supplied, put it in a JSON Source, and got it working in 2 mins.
Method-1: Use 3rd party JSON Source Component (e.g. ZappySys)
Here is how to do it using the SSIS JSON Source by ZappySys (3rd party).
Method-2: Use C# code in Script Component
If you would like a FREE approach, you can write your own C# parsing code in a Script Component (e.g., deserialize the response with JavaScriptSerializer or Json.NET and map each value to its own output column).
I have a PowerShell script that uses SQL Server Management Objects (SMO) to create a .SQL file containing all the metadata from a SQL Server database. However, SMO cannot natively generate XML or JSON output. Is there a means to turn the .SQL output into either of these formats?
In SMO you can script individual objects to strings with the Scripter, and then either use the .NET serialization libraries to add those scripts to a custom class which you serialize to XML/JSON, or use the XML and JSON libraries to construct the document directly.
I'm looking for ideas for an open-source ETL or data-processing tool that can monitor a folder for CSV files, then open and parse each CSV.
For each CSV row the software will transform the row into JSON and make an API call to start a Camunda BPM process, passing the cell data as variables into the process.
Looking for ideas,
Thanks
You can use a Java WatchService or Spring FileSystemWatcher as discussed here with examples:
How to monitor folder/directory in spring?
referencing also:
https://www.baeldung.com/java-nio2-watchservice
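For illustration (not from the linked posts), a minimal WatchService sketch could look like this; the watched folder is a placeholder:

```java
import java.nio.file.*;

public class CsvFolderWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/data/csv-inbox"); // placeholder folder
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = watcher.take(); // blocks until an event arrives
            for (WatchEvent<?> event : key.pollEvents()) {
                Path file = dir.resolve((Path) event.context());
                if (file.toString().endsWith(".csv")) {
                    // hand the file to the CSV parser / process starter
                    System.out.println("New CSV: " + file);
                }
            }
            key.reset(); // required, or no further events are delivered
        }
    }
}
```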
Once you have picked up the CSV you can use my example here as inspiration or extend it: https://github.com/rob2universe/csv-process-starter specifically
https://github.com/rob2universe/csv-process-starter/blob/main/src/main/java/com/camunda/example/service/CsvConverter.java#L48
The example starts a configurable process for every row in the CSV and includes the content of the row as JSON process data.
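Just to illustrate the idea (this is not the code from my repo; the engine URL, process key, and payload are placeholders, and the Json variable type assumes the Camunda Spin plugin is available, otherwise use String), starting one instance per row over Camunda's REST API looks roughly like this:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ProcessStarter {
    public static void main(String[] args) throws Exception {
        // Placeholder row already converted to JSON by the CSV parser.
        String rowJson = "{\"orderId\":\"42\",\"amount\":\"19.99\"}";

        // Pass the row as a single process variable; "Json" needs Camunda Spin.
        String body = "{\"variables\":{\"row\":{\"value\":"
                + jsonQuote(rowJson) + ",\"type\":\"Json\"}}}";

        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://localhost:8080/engine-rest/process-definition/key/csvProcess/start"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }

    // Minimal JSON string escaping for the embedded row payload.
    private static String jsonQuote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```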
I wanted to limit the dependencies of this example, so the CSV parsing logic applied is very simple: commas inside field values may break the example, and special characters may not be handled correctly. A more robust implementation could replace the simple Java String.split(",") with an existing CSV parser library such as OpenCSV, as sketched below.
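For example, a minimal OpenCSV sketch (the file name is a placeholder) that handles quoting and embedded commas:

```java
import com.opencsv.CSVReader;

import java.io.FileReader;

public class OpenCsvExample {
    public static void main(String[] args) throws Exception {
        // OpenCSV handles quoted fields, embedded commas, and escapes.
        try (CSVReader reader = new CSVReader(new FileReader("input.csv"))) {
            String[] header = reader.readNext(); // first line = column names
            String[] row;
            while ((row = reader.readNext()) != null) {
                // Pair up header[i] and row[i] to build the JSON payload,
                // then start the process as shown above.
                System.out.println(String.join(" | ", row));
            }
        }
    }
}
```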
The file watcher would actually be a nice extension to the example. I may add it when I get around to it, but would also accept a pull request in case you fork my project.
I am new to Azure Data Factory. My question is: I have a requirement to move data from an on-premises Oracle and an on-premises SQL Server to Blob storage. The data needs to be transformed into JSON format, each row as one JSON file, which will then be moved to an Event Hub. How can I achieve this? Any suggestions?
You could use a Lookup activity + ForEach activity, with a Copy activity inside the ForEach. Please reference this post: How to copy CosmosDb docs to Blob storage (each doc in single json file) with Azure Data Factory
The Copy Data tool, part of Azure Data Factory, is an option for copying on-premises data to Azure.
It comes with a configuration wizard that walks you through the required steps, such as configuring the source, the sink, the integration runtime, and the pipeline.
In the source you need to write a custom query to fetch the data from the required tables in JSON format.
In the case of SQL Server, you can use FOR JSON AUTO to convert the rows to JSON (and OPENJSON for reading JSON back into rows); these are supported from SQL Server 2016 onwards. For older versions you need to explore the options available. Worst case, you can write a simple console app in C#/Java to fetch the rows, convert them to a JSON file, and then upload the file to Azure Blob storage. If this is a one-time activity, this option should work and you may not require a data factory at all.
In the case of Oracle you can use the JSON_OBJECT function.
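To illustrate that console-app fallback (all connection details, table and column names here are placeholders, and the SQL Server JDBC driver must be on the classpath), a minimal JDBC sketch could look like this:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.*;

public class RowsToJson {
    public static void main(String[] args) throws SQLException, IOException {
        // Hypothetical connection string; adjust for your environment.
        String url = "jdbc:sqlserver://localhost;databaseName=MyDb;user=etl;password=secret";
        // SQL Server 2016+: FOR JSON AUTO returns the result set as one JSON array.
        String sqlServerQuery = "SELECT Id, Name, Amount FROM dbo.Orders FOR JSON AUTO";
        // Oracle equivalent: build one JSON document per row with JSON_OBJECT, e.g.
        // "SELECT JSON_OBJECT('id' VALUE id, 'name' VALUE name) FROM orders"

        try (Connection con = DriverManager.getConnection(url);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sqlServerQuery)) {
            StringBuilder json = new StringBuilder();
            while (rs.next()) {          // FOR JSON may split long output across rows
                json.append(rs.getString(1));
            }
            Files.writeString(Path.of("orders.json"), json.toString());
            // Next step: upload orders.json to Blob storage, e.g. with azure-storage-blob.
        }
    }
}
```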
I'm trying out the MarkLogic Java API and want to bulk upload some files with the extension .csv.
I'm not sure what to use, since the Java API only supports JSON, XML, and TXT files.
How do I batch upload files using the MarkLogic Java API? Do I convert everything to JSON?
Do I convert everything to JSON?
Yes, that is a common way to do it.
If you would like additional examples of how you can wrangle CSV with the Java Client API, check out OpenCSVBatcherExample and JacksonDatabindTest.testDatabindingThirdPartyPojoWithMixinAnnotations. The first demonstrates converting the CSV to XML and using a custom REST extension. The second example (well, unit test...) demonstrates converting the CSV to JSON and using the batch upload (Bulk Writes) capabilities Justin linked to.
If you have CSV files on your filesystem, I’d start with mlcp, as suggested above. It will handle all of the parsing and splitting into multiple transactions/batches for you. Take a look at the mlcp documentation for more details and some example configurations.
If you’d like more control over the parsing and splitting logic than mlcp gives you out-of-the-box or you’re getting CSV from some other source (i.e. not files on the filesystem), you can use the Java Client API. The Java Client API allows you to efficiently write batches using a WriteSet. Take a look at the “Bulk Writes” example.
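For a rough idea of what that looks like, here is a minimal sketch of a bulk write with a DocumentWriteSet (not the official example; host, port, credentials, URIs, and payloads are all placeholders):

```java
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.document.DocumentWriteSet;
import com.marklogic.client.document.JSONDocumentManager;
import com.marklogic.client.io.StringHandle;

public class BulkWriteSketch {
    public static void main(String[] args) {
        DatabaseClient client = DatabaseClientFactory.newClient(
                "localhost", 8000,
                new DatabaseClientFactory.DigestAuthContext("admin", "admin"));
        JSONDocumentManager docMgr = client.newJSONDocumentManager();

        DocumentWriteSet writeSet = docMgr.newWriteSet();
        // In practice you would loop over the parsed CSV rows here.
        writeSet.add("/orders/1.json", new StringHandle("{\"id\":1,\"name\":\"first\"}"));
        writeSet.add("/orders/2.json", new StringHandle("{\"id\":2,\"name\":\"second\"}"));

        docMgr.write(writeSet);   // one transaction for the whole batch
        client.release();
    }
}
```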
According to your reply to Justin, you cannot use MLCP because it is command line and you need to integrate it into a web portal.
Well, MLCP is released as open source software under the Apache 2 licence. So if you are happy with this licence, then you have the source to integrate.
But what I see as your main problem statement is more specific:
How can I create multiple XML or JSON documents from a CSV file [allowing the use of the Java API to then upload them as documents in MarkLogic]
With that specific problem statement:
1) have a look at SplitDelimitedTextReader.java from the mlcp source
2) try some Java libraries built for this purpose, such as JSefa: http://jsefa.sourceforge.net/quick-tutorial.html
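As a minimal illustration of the per-row split idea (this sketch uses Jackson's CSV module, jackson-dataformat-csv, rather than the libraries above; the file name and document URIs are placeholders):

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import java.io.File;
import java.util.Map;

public class CsvToJsonDocs {
    public static void main(String[] args) throws Exception {
        CsvMapper csvMapper = new CsvMapper();
        CsvSchema schema = CsvSchema.emptySchema().withHeader(); // first line = column names
        ObjectMapper jsonMapper = new ObjectMapper();

        MappingIterator<Map<String, String>> rows = csvMapper
                .readerFor(Map.class).with(schema)
                .readValues(new File("input.csv"));

        int i = 0;
        while (rows.hasNext()) {
            String json = jsonMapper.writeValueAsString(rows.next());
            // Each row becomes one JSON document, ready to add to a WriteSet
            // with a URI such as "/rows/" + i + ".json".
            System.out.println(json);
            i++;
        }
    }
}
```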
We are using the BIRT report viewer to create reporting pages.
We use a JDBC data source to connect to an Oracle database.
But is it possible to use a REST API (JSON format) as a data source for the reports?
Does someone have experience with this?
BIRT has no built-in JSON data source. There are some community JSON data source plugins, but all of them I have seen are very low level and not comfortable to use, so I do not recommend any of them here.
You could create a "scripted data source" where you connect to your URL and parse the result yourself, but this is also not very comfortable. Someone tried it here, so you have a starting point.
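Purely to illustrate the kind of fetch-and-parse logic such a scripted data set needs (shown as standalone Java with Jackson; in the report itself the equivalent code goes into the data set's open/fetch events, and the URL and field names here are made up):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Iterator;

public class RestDataSetSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; replace with your REST API.
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("https://example.com/api/report-data")).GET().build();
        String body = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString()).body();

        // "open" event: fetch and parse once, keep an iterator over the rows.
        JsonNode rows = new ObjectMapper().readTree(body).get("rows"); // assumed array field
        Iterator<JsonNode> it = rows.elements();

        // "fetch" event: one call per row, mapping JSON fields to data set columns.
        while (it.hasNext()) {
            JsonNode row = it.next();
            System.out.println(row.get("id") + " | " + row.get("name"));
        }
    }
}
```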
If you are in charge of the infrastructure providing the JSON output, it would be easier to add an export to XML and use the BIRT built-in XML data source.