export nested JSON from GCS into Spreadsheet - json

I have a nested NDJSON file that I exported from BigQuery (BQ) into Google Cloud Storage. From there I would like to open it in Google Spreadsheet again as a nested table.
I see a lot of Apps Script snippets for importing JSON files, but none of them handle files stored in GCS.
What would be the best way to open the data table in a spreadsheet?
[Screenshot: the CSV file I see when I use the tool suggested by Alex]
This is the NDJSON example:
{"page":"/xxxx","country":"DE","pageviews":"72136","daily_peak_pageviews":"5465","daily_peak_users":"3118","users_unique":"37763","SEO":true,"campaign_info":[{"channel_group":"Referral","users_c":"16","pageviews_c":"17","title":"404"},{"channel_group":"Social","users_c":"2255","pageviews_c":"3839","title":"OK"},{"channel_group":"other","users_c":"33185","pageviews_c":"63320","title":"OK"},{"channel_group":"Referral","users_c":"316","pageviews_c":"556","title":"OK"},{"channel_group":"Paid","users_c":"47","pageviews_c":"49","title":"404"},{"channel_group":"Paid","users_c":"1088","pageviews_c":"1706","title":"OK"},{"channel_group":"other","users_c":"1888","pageviews_c":"2517","title":"404"},{"channel_group":"Social","users_c":"100","pageviews_c":"132","title":"404"}]}
{"page":"/yyy","country":"DE","pageviews":"67576","daily_peak_pageviews":"5390","daily_peak_users":"2843","users_unique":"32772","SEO":true,"campaign_info":[{"channel_group":"other","users_c":"7","pageviews_c":"10","title":"404"},{"channel_group":"other","users_c":"30951","pageviews_c":"64345","title":"OK"},{"channel_group":"Paid","users_c":"782","pageviews_c":"1303","title":"OK"},{"channel_group":"Referral","users_c":"265","pageviews_c":"467","title":"OK"},{"channel_group":"Social","users_c":"889","pageviews_c":"1450","title":"OK"},{"channel_group":"Paid","users_c":"1","pageviews_c":"1","title":"404"}]}
{"page":"/zzz","country":"DE","pageviews":"7558","daily_peak_pageviews":"619","daily_peak_users":"331","users_unique":"4117","SEO":true,"campaign_info":[{"channel_group":"other","users_c":"7","pageviews_c":"14","title":"404"},{"channel_group":"Paid","users_c":"38","pageviews_c":"70","title":"OK"},{"channel_group":"other","users_c":"3987","pageviews_c":"7309","title":"OK"},{"channel_group":"Paid","users_c":"1","pageviews_c":"1","title":"404"},{"channel_group":"Referral","users_c":"18","pageviews_c":"26","title":"OK"},{"channel_group":"Social","users_c":"70","pageviews_c":"138","title":"OK"}]}
{"page":"hdhh","country":"DE","pageviews":"3616","daily_peak_pageviews":"336","daily_peak_users":"206","users_unique":"2131","campaign_info":[{"channel_group":"Social","users_c":"267","pageviews_c":"379","title":"OK"},{"channel_group":"Paid","users_c":"776","pageviews_c":"1394","title":"OK"},{"channel_group":"other","users_c":"1089","pageviews_c":"1814","title":"OK"},{"channel_group":"Referral","users_c":"17","pageviews_c":"24","title":"OK"},{"channel_group":"other","users_c":"2","pageviews_c":"5","title":"404"}]}
{"page":"/ethehh","country":"DE","pageviews":"1394","daily_peak_pageviews":"322","daily_peak_users":"294","users_unique":"1232","campaign_info":[{"channel_group":"Paid","users_c":"61","pageviews_c":"67","title":"OK"},{"channel_group":"Social","users_c":"271","pageviews_c":"301","title":"OK"},{"channel_group":"other","users_c":"3","pageviews_c":"5","title":"404"},{"channel_group":"Referral","users_c":"10","pageviews_c":"10","title":"OK"},{"channel_group":"other","users_c":"888","pageviews_c":"1011","title":"OK"}]}
And this is the CSV example:
page,country,pageviews,daily_peak_pageviews,daily_peak_users,users_unique,SEO,campaign_info/0/channel_group,campaign_info/0/users_c,campaign_info/0/pageviews_c,campaign_info/0/title,campaign_info/1/channel_group,campaign_info/1/users_c,campaign_info/1/pageviews_c,campaign_info/1/title,campaign_info/2/channel_group,campaign_info/2/users_c,campaign_info/2/pageviews_c,campaign_info/2/title,campaign_info/3/channel_group,campaign_info/3/users_c,campaign_info/3/pageviews_c,campaign_info/3/title,campaign_info/4/channel_group,campaign_info/4/users_c,campaign_info/4/pageviews_c,campaign_info/4/title,campaign_info/5/channel_group,campaign_info/5/users_c,campaign_info/5/pageviews_c,campaign_info/5/title,campaign_info/6/channel_group,campaign_info/6/users_c,campaign_info/6/pageviews_c,campaign_info/6/title,campaign_info/7/channel_group,campaign_info/7/users_c,campaign_info/7/pageviews_c,campaign_info/7/title
/xxxx,DE,72136,5465,3118,37763,true,Referral,16,17,404,Social,2255,3839,OK,other,33185,63320,OK,Referral,316,556,OK,Paid,47,49,404,Paid,1088,1706,OK,other,1888,2517,404,Social,100,132,404
/yyy,DE,67576,5390,2843,32772,true,other,7,10,404,other,30951,64345,OK,Paid,782,1303,OK,Referral,265,467,OK,Social,889,1450,OK,Paid,1,1,404,,,,,,,,
/zzz,DE,7558,619,331,4117,true,other,7,14,404,Paid,38,70,OK,other,3987,7309,OK,Paid,1,1,404,Referral,18,26,OK,Social,70,138,OK,,,,,,,,
hdhh,DE,3616,336,206,2131,,Social,267,379,OK,Paid,776,1394,OK,other,1089,1814,OK,Referral,17,24,OK,other,2,5,404,,,,,,,,,,,,
/ethehh,DE,1394,322,294,1232,,Paid,61,67,OK,Social,271,301,OK,other,3,5,404,Referral,10,10,OK,other,888,1011,OK,,,,,,,,,,,,

I found some scripts that load JSON files into a Google Spreadsheet, but all of them need the file to be reachable through a URL. The steps to get a public link to your JSON file in GCS are:
Go to your Google Cloud Storage bucket and click the three dots to the right of your JSON file.
Click "Edit permissions".
Click "Add item".
In "ENTITY" choose "User", in "NAME" type "allUsers", and in "ACCESS" choose "Reader".
Now you have an external link that scripts such as this one or this other one can load, although you will need to edit the JSON file or the code a bit.
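Once the object is readable by allUsers, it can be fetched over plain HTTPS. A minimal Python sketch of that step follows, assuming the object is served at https://storage.googleapis.com/BUCKET/OBJECT; the bucket and object names are placeholders, and the function names are made up for this illustration:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def public_gcs_url(bucket, object_name):
    """Build the public HTTPS URL for an object that allUsers can read."""
    # quote() keeps "/" (object names may contain it) and escapes spaces etc.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON document per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_public_ndjson(bucket, object_name):
    """Download a publicly readable NDJSON object and parse it into dicts."""
    with urlopen(public_gcs_url(bucket, object_name)) as response:
        return parse_ndjson(response.read().decode("utf-8"))
```

Calling load_public_ndjson("my-bucket", "exports/pages.json") (hypothetical names) would return one dict per line of the file. Parsing line by line matters here because an NDJSON file as a whole is not a single valid JSON document.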
Another solution (and the easiest one) is to convert the JSON file into CSV using this tool and then import the CSV into Google Spreadsheet by clicking "File" -> "Import" -> "Upload" and selecting your CSV file.
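If you would rather not depend on an online converter, the same flattening can be sketched in a few lines of Python. This is only an illustration, not the tool mentioned above: it assumes the slash-separated column naming seen in the CSV example (campaign_info/0/channel_group and so on), and the function names are made up here.

```python
import csv
import io
import json


def flatten(value, path=()):
    """Flatten nested dicts/lists into one dict keyed by slash-separated paths."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Keep the JSON spelling of booleans ("true"/"false"), as in the CSV example.
        if isinstance(value, bool):
            value = json.dumps(value)
        return {"/".join(path): value}
    flat = {}
    for key, child in items:
        flat.update(flatten(child, path + (str(key),)))
    return flat


def ndjson_to_csv(ndjson_text):
    """Convert NDJSON text to CSV text with one column per flattened path."""
    rows = [flatten(json.loads(line))
            for line in ndjson_text.splitlines() if line.strip()]
    # Columns are ordered by first appearance across all rows.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # paths missing from a row become empty cells
    return out.getvalue()
```

Rows with fewer campaign_info entries than the longest row simply get empty cells in the trailing columns, which matches the shape of the CSV example above. The resulting text can be saved as a .csv file and imported through "File" -> "Import".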

Related

How to move data from response of Http (JSON data) to a JSON file

I am trying to build an Azure Data Factory pipeline that copies the JSON response returned by an Azure Function and writes it to a JSON file.
I am not using a REST API.
Initially, I built a pipeline that used a variable to copy the data into a text file.
Now I am trying to move the data into a JSON file so that I can map the JSON file's columns to a SQL table.
What is the best architecture for this data pipeline?
Do I need a text file as an intermediate step to move the data from "output.Response" (from the first step) into a JSON file?
I tried using the text file as 'Source' and a JSON file as 'Sink' (inside the "Copy data" step), but I am not able to map 'Source' to 'Sink'.
I also tried a third step with JSON files as both 'Source' and 'Sink', but I am not sure how @variables('myVariable') should be carried into this step, as it is not a text file.
Thank you.
I found the solution: in the next "Copy data" step, I set the format to JSON while pointing it at the same text file (csv) I had used. Even though the file was originally created as a DelimitedText (csv) file, pointing at that same file with the JSON format worked.

How to import excel/csv with "File Import" widget in Foundry's Slate?

Context:
For a data pipeline we need to ingest Excel spreadsheets (arriving via email) directly into Foundry. To avoid manual handling errors, we'd like to build a small Slate app that just uploads an Excel sheet and automatically appends it to an existing dataset (given schema, headers, etc.).
Unfortunately, there is very little documentation on the "File Import" widget or the API that gets called when dragging and dropping a file into a folder.
Idea: Is there a way of uploading a file with Slate? Could this file then be added to a dataset, similar to the prompt that opens when dropping it into a folder?
You actually don't have to build a Slate app to do this! Datasets that are made up of underlying .csv files support new additions of files directly.
Note: All of the following screenshots are from the dataset preview page.
For example, the following dataset I created from 4 .csv files:
I can click the Import button in the top right to add more files, with the same schema or not, depending on whether you want to adhere strictly to your applied schema.
If you have already applied a schema, you can also simply import new files on top of the dataset, but the schemas of the files must exactly match those already present; otherwise the dataset will fail when it is read.
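Because a mismatched file breaks reads of the whole dataset, it can help to check a new file's header before importing it. The following Python sketch is not a Foundry API; it is a local pre-check under the assumption that "schema" here means the same column names in the same order, and the function name and expected column list are hypothetical:

```python
import csv
import io


def header_matches(csv_text, expected_columns):
    """Return True when the CSV header row equals the expected columns exactly."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return header == list(expected_columns)
```

For example, header_matches(new_file_text, ["id", "name"]) would reject a file whose columns are reordered or renamed. Column types are not checked by this sketch.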

how to use google Data Prep for several files located in Google Cloud Storage?

I imported a text file from GCS, did some preparation using Dataprep, and wrote the result back to GCS as CSV files. I want to do the same for all the text files in that bucket. Is there a way to do this for all the files in that bucket (in GCS) at once?
Below is my procedure. I selected a text file from GCS (I can't select more than one) and did some preparation (renaming columns, creating new columns, etc.). Then I wrote it back to GCS as CSV.
You can use the Dataset with parameters feature to load several files at once.
You can then use a wildcard to select all the files that you want to load.
Note that all the files need to have the same schema (same columns) for this to work.
See https://cloud.google.com/dataprep/docs/html/Create-Dataset-with-Parameters_118228628 for more information on how to use this feature.
Another solution is to put all the files into a folder* and use the large + button to load all the files in that folder.
[*] technically under the same prefix on GCS
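To preview which objects a wildcard would pick up before building the parameterized dataset, the selection can be approximated locally. This Python sketch uses the standard library's shell-style matching as a stand-in; it is not Dataprep's own matcher, whose wildcard rules may differ, and the object names are invented for the example:

```python
from fnmatch import fnmatchcase


def select_objects(object_names, pattern):
    """Return the object names that match a shell-style wildcard pattern."""
    # Note: in fnmatch, "*" also matches "/" characters inside object names.
    return [name for name in object_names if fnmatchcase(name, pattern)]


names = ["logs/a.txt", "logs/b.txt", "logs/c.csv", "other/d.txt"]
print(select_objects(names, "logs/*.txt"))
```

Since GCS folders are really just shared name prefixes, a pattern such as "logs/*" is another way of expressing "everything in that folder".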

Published https://docs.google.com/spreadsheets redirects to other URL (CSV data)

We auto-publish a Google Docs Spreadsheet (one tab as CSV). Google Docs provides a fixed URL that refers to the CSV. We import this CSV into another tool for product data import.
Suddenly this URL is being redirected by Google Spreadsheet. If we go to "File/Publish To The Internet" again, we get the same URL for that CSV.
Question: How can we get a URL that works without redirection again?
Error: Source file
https://docs.google.com/spreadsheets/d/e/2PACX-1vTQsBEmvOwFwxORMqYg2N6LzzYqdqsdDCjxqsdqsdH72gdMCP4xrs1lsN37RO4h1-rjJsQ/pub?gid=501162839&single=true&output=csv doesn't exist (HTTPS : File not found ! (HTTP/1.0 307 Temporary Redirect)). Please check the source file path.
In short, the collection process needs to follow the Location header. Depending on how you're getting the CSV, this might be simple or a pain. I collect CSVs using curl, so just adding the -L switch is enough to make sure the incoming files are the CSV we're looking for instead of the HTML that we were getting without -L. Without knowing what utility or process you're using to download the CSV, I can't be more specific, unfortunately.
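If the download is scripted rather than done with curl, the same idea applies: use a client that follows redirects. As a rough Python sketch (the function name is made up, and your import tool may work differently), the standard library's urlopen follows redirects such as 307 for GET requests by default:

```python
from urllib.request import urlopen


def fetch_csv(url, timeout=30):
    """Download a CSV, following HTTP redirects; return (final_url, text)."""
    # urlopen follows the Location header of redirect responses for GET
    # requests by default, similar in effect to `curl -L`.
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.geturl(), response.read().decode(charset)
```

The returned final_url shows where the request ended up after any redirects, which can be useful for logging when the published URL starts redirecting.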

JMeter read the second sheet of CSV

How can I make JMeter read the second sheet of my CSV?
I want to use CSV Data Set Config.
Normally, it reads the first line of the first sheet, but is there any way to be a bit more flexible?
The CSV file format doesn't have "sheets"; it is a plain text file that uses delimiters to represent structured data.
If you are trying to get data from, for example, a Microsoft Excel file, unfortunately you won't be able to do it using CSV Data Set Config. The easiest option would be to export the data as separate plain-text CSV files.
If you can't do the export, you can still access the data from Excel files, but it will be a bit trickier, as you will have to use JSR223 Test Elements, the Groovy language, and the Apache POI libraries.
More information:
Busy Developers' Guide to HSSF and XSSF Features
How to Extract Data From Files With JMeter
Currently you can't use CSV Data Set Config for that; you have to add external code, for example using Apache Commons CSV.
Download the jar file, place it in the lib folder under JMETER_HOME, and then write the code in a JSR223 element.
Examples exist; this code gets the second record:
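import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

// "in" is a reserved word in Groovy, so the reader gets a different name
Reader reader = new FileReader("path/to/file.csv");
// the parser is an Iterable, so take its iterator to step through records
Iterator<CSVRecord> records = CSVFormat.RFC4180.parse(reader).iterator();
// skip the first record
records.next();
CSVRecord secondRecord = records.next();
// String columnOne = secondRecord.get(0);
reader.close();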