Problem loading JSON with commas into Snowflake

I am having an issue loading some JSON data.
The data looks like this:
{"geometry":{"coordinates":[12.5263,55.7664],"type":"Point"},"properties":{"created":"2021-01-19T17:08:14.114216Z","observed":"2020-01-01T23:50:00Z","parameterId":"pressure_at_sea","stationId":"06181","value":1025.1},"type":"Feature","id":"00e3bc2b-9a55-03dc-3740-005fd752f840"}
{"geometry":{"coordinates":[10.6217,55.8315],"type":"Point"},"properties":{"created":"2021-01-19T23:26:37.906088Z","observed":"2020-01-01T23:50:00Z","parameterId":"radia_glob","stationId":"06132","value":1},"type":"Feature","id":"00f7c039-6096-e2c2-5063-594c1c3bc16e"}
{"geometry":{"coordinates":[-37.6367,65.6111],"type":"Point"},"properties":{"created":"2021-01-19T23:26:37.913180Z","observed":"2020-01-01T23:50:00Z","parameterId":"radia_glob","stationId":"04360","value":0},"type":"Feature","id":"0142e2d1-9c28-e884-8d88-4b8fac5cae5d"}
The challenge is that when I try to look at the JSON data like this:
select $1, metadata$filename from @my_bucket/2020/2020-01.json.gz limit 3;
It only returns a part of the JSON:
$1 METADATA$FILENAME
{"geometry":{"coordinates":[12.5263 2020/2020-01.json.gz
{"geometry":{"coordinates":[10.6217 2020/2020-01.json.gz
{"geometry":{"coordinates":[-37.6367 2020/2020-01.json.gz
Seems like everything after the comma in the coordinates gets truncated, but I cannot figure out how to avoid that.
Br.
Thomas

$1 is column one, which shows that your data is being treated as CSV, which is the default file format.
Try adding the file format option FILE_FORMAT => 'my_json_file', which would be created via:
CREATE OR REPLACE FILE FORMAT my_json_file TYPE = JSON;
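With the format in place, the original query can reference it directly when selecting from the stage. A minimal sketch, assuming the stage and path from the question:
-- query the staged file with the JSON file format applied
SELECT $1, metadata$filename
FROM @my_bucket/2020/2020-01.json.gz
(FILE_FORMAT => 'my_json_file')
LIMIT 3;
Each row should then come back as a single VARIANT value in $1 holding the whole JSON object, instead of being split on the commas.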

Related

HIVE JSON INPUT - should the JSON be in one line

I am trying to load JSON data into a Hive table.
I was wondering whether the JSON should be on one single line only. I have tested it with the data formatted like this:
"associatedDrug": {"name":"asprin", "dose":"","strength":"500 mg"}
"associatedDrug": {"name":"asprin2", "dose":"","strength2":"500 mg"}
or whether it can be provided pretty-printed like this:
"associatedDrug": {
"name":"asprin",
"dose":"",
"strength":"500 mg"
}
And if it is formatted like that, is there a SERDEPROPERTIES setting I can include so that it knows what the record end-of-line is?
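For what it's worth, the standard Hive JSON SerDes expect exactly one complete JSON object per line, because the text input format splits records on newlines, so pretty-printed JSON spanning several lines will not parse. A minimal sketch assuming the hive-hcatalog-core JsonSerDe is available (the table and column names here are made up for illustration):
-- each line of the source file must be one complete JSON object
CREATE TABLE medications (
  associatedDrug STRUCT<name:STRING, dose:STRING, strength:STRING>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE;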

Solr how to write an array of data in a CSV file

What's the syntax to write an array for Solr in a CSV file? I need to update a multivalued field, but when I upload the file, all the data ends up in the field as just one element, like this:
multiField:["data1,data2,data3"]
instead of this:
multiField:["data1", "data2" , "data3"]
How can I write this in the CSV file by default?
You can use the split parameter to split a single field into multiple values:
&f.multiField.split=,
.. should do what you want.

Pentaho Kettle conversion from String to Integer/Number error

I am new to Pentaho Kettle and I am trying to build a simple data transformation (filter, data conversion, etc). But I keep getting errors when reading my CSV data file (whether using CSV File Input or Text File Input).
The error is:
... couldn't convert String to number : non-numeric character found at
position 1 for value [ ]
What does this mean exactly and how do I handle it?
Thank you in advance
I have solved it. The idea is similar to what #nsousa suggested, but I didn't use the Trim option because I tried it and it didn't work in my case.
What I did is specify that if the value is a single space, it is set to null. In the Fields tab of the Text File Input step, set the 'Null if' column to a space.
That value looks like an empty space. Set the Format of the Integer field to # and set trim to both.

How to write a JSON extracted value to a csv file in jmeter for a specific variable

I have a csv file that looks like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
When I run the HTTP Post Request to create a new customer, I get a JSON response. I am extracting the cust_id and cust_name using the JSON Extractor. How can I enter these new values into the csv for the correct variables? For example, after creating the customer, the csv would look like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
1234,My Customer Name
Or once I create a user, the file might look like this:
varCust_id,varCust_name,varCity,varStateProv,varCountry,varUserId,varUsername
1234,My Customer Name,,,,9876,myusername
In my searching through the net, I have found ways to append these extracted variables as a new line, but in my case I need to replace the value in the correct location so it is associated with the correct variable I have set up in the csv file.
I believe what you're looking to do can be done via a BeanShell PostProcessor and is answered here.
Thank you for the reply. I ended up using User Defined Variables for some things and BeanShell PreProcessors for other bits vs. using the CSV.
Well, I've never tried this, but what you can do is create all these variables and set them to null / 0.
Once done, update them during your execution. At the end, you can concatenate them with any delimiter (say ; or Tab) and push them into the CSV as a single string.
Once you have the data in the CSV, you can easily split it in MS Excel.

Python 3: write string list to csv file

I have found several answers (encoding, decoding...) online, but I still don't get what to do.
I have a list called abc.
abc = ['sentence1','-1','sentence2','1','sentence3','0'...]
Now I would like to store this list in a CSV file in the following way:
sentence1, -1
sentence2, 1
sentence3, 0
I know that the format of my abc list probably isn't how it should be to achieve this. I guess it should be a list of lists? But the major problem is actually that I have no clue how to write this to a CSV file using Python 3. The only time it kind of worked was when every character ended up separated by a comma.
Does anybody know how to solve this? Thank you!
You can use zip to pair each sentence with its number and then write the pairs to csv:
abc = ['sentence1', '-1', 'sentence2', '1', 'sentence3', '0']
new = list(zip(abc[0::2], abc[1::2]))  # [('sentence1', '-1'), ('sentence2', '1'), ...]
import csv
with open('test.csv', 'w', newline='') as fp:
    a = csv.writer(fp, delimiter=',')
    a.writerows(new)
result:
sentence1,-1
sentence2,1
sentence3,0
Here is the documentation to work with files. A CSV is basically the same thing as a txt file; the difference is that you should use commas to separate the columns and new lines for the rows.
In your example you could do this (or iterate over a loop):
formated_to_csv = abc[0]+','+abc[1]+','+abc[2]+','+abc[3]...
The value of formated_to_csv would be 'sentence1,-1,sentence2,1,sentence3,0'. Note that this is a single string, so it will generate a single row. Then write formated_to_csv as text to the csv file:
f.write(formated_to_csv)
To put all the sentences in the first column and all the numbers in the second column, it would be better to have a list of lists:
abc = [['sentence1','-1'],['sentence2','1'],['sentence3','0']...]
for row in abc:
    f.write(row[0]+','+row[1]+'\n')
The "conversion" to a table will be done by Excel, Calc, or whatever program you use to read spreadsheets.