Set annotated CSV for Grafana's Influx uploader

I am trying to change a CSV into an "annotated CSV" in order to manually upload a file through Grafana's Influx uploader. It says:
Failed to upload the selected CSV: The CSV could not be parsed. Please make sure that the CSV was in Annotated Format
The manual is here: https://docs.influxdata.com/influxdb/v2.0/write-data/developer-tools/csv/#csv-annotations
My (mis-annotated) CSV is:
#constant measurement,datatest
#datatype dateTime:number,long,long,string
time,temp,raw,SN
1539250260,21,409,ABC3
1539250985,27,718,ABC1
1539251114,25,496,ABC2
1539251168,22,751,ABC3
1539251893,29,725,ABC1
1539252019,28,489,ABC2
1539252076,26,753,ABC3
1539252800,29,731,ABC1
1539252930,29,485,ABC2
Thanks

Try with this:
#constant measurement,datatest
#datatype time,long,long,string
time,temp,raw,SN
1539250260,21,409,ABC3
1539250985,27,718,ABC1
1539251114,25,496,ABC2
1539251168,22,751,ABC3
1539251893,29,725,ABC1
1539252019,28,489,ABC2
1539252076,26,753,ABC3
1539252800,29,731,ABC1
1539252930,29,485,ABC2
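If you need to do this for many exports, the annotation rows can also be prepended programmatically. A minimal Python sketch, with hypothetical file names and the annotation values taken from the answer above:

# Prepend InfluxDB annotation rows to a plain CSV export.
# File names are hypothetical; annotation values are from the answer above.
annotations = [
    "#constant measurement,datatest",
    "#datatype time,long,long,string",
]

with open("raw.csv") as src:            # header + data rows, no annotations
    body = src.read()

with open("annotated.csv", "w") as dst:
    dst.write("\n".join(annotations) + "\n" + body)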

Issue with connecting data in Databricks from data lake and reading JSON into Folium

I'm working on something based on this blogpost:
https://python-visualization.github.io/folium/quickstart.html#Getting-Started
specifically part 13, using Choropleth maps. The piece of code they use is the following:
import folium
import pandas as pd

url = (
    "https://raw.githubusercontent.com/python-visualization/folium/master/examples/data"
)
state_geo = f"{url}/us-states.json"
state_unemployment = f"{url}/US_Unemployment_Oct2012.csv"
state_data = pd.read_csv(state_unemployment)

m = folium.Map(location=[48, -102], zoom_start=3)
folium.Choropleth(
    geo_data=state_geo,
    name="choropleth",
    data=state_data,
    columns=["State", "Unemployment"],
    key_on="feature.id",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="Unemployment Rate (%)",
).add_to(m)
folium.LayerControl().add_to(m)
m
If I use this, I get the requested map.
Now I try to do this with my own data. I work in Databricks, so I have a JSON file with the GeoJSON data (source_file1) and a CSV file (source_file2) with the data that needs to be plotted on the map.
source_file1 = "dbfs:/mnt/sandbox/MAARTEN/TOPO/Belgie_GEOJSON.JSON"
state_geo = spark.read.json(source_file1, multiLine=True)

source_file2 = "dbfs:/mnt/sandbox/MAARTEN/TOPO/DATASVZ.csv"
df_2 = (
    spark.read.format("CSV")
    .option("inferSchema", "true")
    .option("header", "true")
    .option("delimiter", ";")
    .load(source_file2)
)
state_data = df_2.toPandas()
Then I adjust the code accordingly:
m = folium.Map(location=[48, -102], zoom_start=3)
folium.Choropleth(
    geo_data=state_geo,
    name="choropleth",
    data=state_data,
    columns=["State", "Unemployment"],
    key_on="feature.properties.name_nl",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="% Marktaandeel CC",
).add_to(m)
folium.LayerControl().add_to(m)
m
So when I pass the geo_data parameter as a Spark DataFrame, I get the following error:
ValueError: Cannot render objects with any missing geometries: DataFrame[features: array<struct<geometry:struct<coordinates:array<array<array<string>>>,type:string>,properties:struct<arr_fr:string,arr_nis:bigint,arr_nl:string,fill:string,fill-opacity:double,name_fr:string,name_nl:string,nis:bigint,population:bigint,prov_fr:string,prov_nis:bigint,prov_nl:string,reg_fr:string,reg_nis:string,reg_nl:string,stroke:string,stroke-opacity:bigint,stroke-width:bigint>,type:string>>, type: string]
I think something goes wrong with the format when transforming the data from the "blob format" in the Azure data lake into the Spark DataFrame. I tested this in a Jupyter notebook on my desktop, reading the data straight from file into folium, and it all works.
If I load it directly from the source, as the example does with their webpage, and adjust the geo_data parameter of the folium function:
m = folium.Map(location=[48, -102], zoom_start=3)
folium.Choropleth(
    geo_data=source_file1,  # this points directly at the data lake
    name="choropleth",
    data=state_data,
    columns=["State", "Unemployment"],
    key_on="feature.properties.name_nl",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="% Marktaandeel CC",
).add_to(m)
folium.LayerControl().add_to(m)
m
then I get an error: the function expects a local file path, and it fails on the path prefixed with "dbfs:".
So I started wondering what the difference is between my JSON file and the one from the blogpost. The only thing I can imagine is that the Azure data lake doesn't store my JSON as a JSON but as a block blob file, and for some reason I am not converting it properly so that folium can read it.
Azure blob storage (data lake)
So can someone with folium knowledge let me know:
A. Is it not possible to load the geo_data directly from a data lake?
B. In what format do I need to upload the data?
Any thoughts on this would be helpful!
Thanks in advance!
Solved this issue: I just had to replace "dbfs:" with "/dbfs". I tried it a lot of times, but I had used "/dbfs:" and got another error.
Can't believe I'm this stupid :-)
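For anyone hitting the same wall, here is a minimal sketch of the working approach, assuming the paths from the question and that the DBFS FUSE mount is available: read the GeoJSON through "/dbfs" with plain Python instead of Spark, so folium gets an ordinary dict.

import json
import folium
import pandas as pd

# Spark APIs take "dbfs:/..." URIs, but local-file libraries such as
# json, pandas, and folium need the FUSE mount path "/dbfs/..." instead.
geo_path = "/dbfs/mnt/sandbox/MAARTEN/TOPO/Belgie_GEOJSON.JSON"
csv_path = "/dbfs/mnt/sandbox/MAARTEN/TOPO/DATASVZ.csv"

with open(geo_path) as f:
    state_geo = json.load(f)  # a plain dict that folium can render

state_data = pd.read_csv(csv_path, sep=";")

# Centered on Belgium (an assumption; the question reused the US coordinates).
m = folium.Map(location=[50.5, 4.5], zoom_start=7)
folium.Choropleth(
    geo_data=state_geo,
    name="choropleth",
    data=state_data,
    columns=["State", "Unemployment"],  # column names as used in the question
    key_on="feature.properties.name_nl",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="% Marktaandeel CC",
).add_to(m)
folium.LayerControl().add_to(m)
m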

How to search for a specific value in a .csv file using Groovy

I have a .csv file of UNLOCO codes, and I want to check whether a specific port code exists in the file.
File:
ADALV,Andorra la Vella,,,4230N 00131E
ADCAN,Canillo,,,4234N 00135E
You can iterate over all lines in your file with this kind of code:
def f = new File(myfile)
f.withReader("UTF-8") { r ->
    // split each line on commas and check the first column
    r.splitEachLine(',') { line ->
        if (line[0] == 'ADCAN') {
            println "found: $line"
        }
    }
}
If you just want to know that the code exists in the file, and don't care where, you could do the following. Note that the text method reads the entire file into a string, so it's not great if you're dealing with large files.
new File( 'myfile.csv' ).text.contains( 'ADCAN' )

Python 3: Replacing special characters in a .csv file after converting it from JSON

I am trying to develop a program using Python 3.6.4 that converts a JSON file into a CSV file, and we also need to clean the data in the CSV file. For example:
My JSON file:
{emp:[{"Name":"Bo#b","email":"bob#gmail.com","Des":"Unknown"},
{"Name":"Martin","email":"mar#tin#gmail.com","Des":"D#eveloper"}]}
Problem 1:
After converting it to CSV, a blank row appears between every two rows:
**Name email Des**
[<BLANK ROW>]
Bo#b bob#gmail.com Unknown
[<BLANK ROW>]
Martin mar#tin#gmail.com D#eveloper
Problem 2:
In my code I am using emp, but I need to handle the key dynamically:
fobj = open("D:/Users/shamiks/PycharmProjects/jsonSamle.txt")
jsonCont = fobj.read()
print(jsonCont)
fobj.close()
employee_parsed = json.loads(jsonCont)
emp_data = employee_parsed['employee']
We will not know the structure or content of the upcoming JSON files.
Problem 3:
I also need to remove all # characters from the CSV file.
For Problem 3, you can use .replace (https://www.tutorialspoint.com/python/string_replace.htm).
For Problem 2, you can use the dictionary keys and then take the first item:
import json

fobj = open("D:/Users/shamiks/PycharmProjects/jsonSamle.txt")
jsonCont = fobj.read().replace("#", "")
print(jsonCont)
fobj.close()
employee_parsed = json.loads(jsonCont)
# dict views aren't indexable in Python 3, so materialize the keys first
first_key = list(employee_parsed.keys())[0]
emp_data = employee_parsed[first_key]
I can't solve Problem 1 without seeing more of the code that exports the result. It may be that your data has newlines in it, in which case you could chain .replace("\n", "") and/or .replace("\r", "") after the previous replace, so the line would read fobj.read().replace("#", "").replace("\n", "").replace("\r", "").
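As a side note, a very common cause of those blank rows on Windows is opening the output file without newline="" when writing with the csv module. A minimal sketch, assuming csv.writer is used for the export:

import csv

rows = [["Name", "email", "Des"],
        ["Bob", "bobgmail.com", "Unknown"]]

# newline="" stops the csv module from emitting "\r\r\n" line endings
# on Windows, which show up as blank rows between records.
with open("out.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)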

Setting properties of a node from a csv - Neo4j

This is an example of my csv file:
_id,official_name,common_name,country,started_by,
ABO.00,Association Football Club Bournemouth,Bournemouth,England,"{""day"":NumberInt(1),""month"":NumberInt(1),""year"":NumberInt(1899)}"
AOK.00,PAE Kerkyra,Kerkyra,Greece,"{""day"":NumberInt(30),""month"":NumberInt(11),""year"":NumberInt(1968)}"
I have to import this csv into Neo4j:
LOAD CSV WITH HEADERS FROM 'file:///Z:/path/to/file/team.csv' AS line
CREATE (p:Team {_id: line._id, official_name: line.official_name, common_name: line.common_name, country: line.country, started_by_day: line.started_by.day, started_by_month: line.started_by.month, started_by_year: line.started_by.year})
I get an error (Neo.ClientError.Statement.InvalidType) when setting started_by.day, started_by.month, and started_by.year.
How can I set the started_by properties correctly?
The format of your csv should be the following:
_id,official_name,common_name,country,started_by_day,started_by_month,started_by_year
ABO.00,Association Football Club Bournemouth,Bournemouth,England,1,1,1899
Cypher:
LOAD CSV WITH HEADERS FROM 'file:///Z:/path/to/file/team.csv' AS line
CREATE (p:Team {_id: line._id, official_name: line.official_name, common_name: line.common_name, country: line.country, started_by_day: line.started_by_day, started_by_month: line.started_by_month, started_by_year: line.started_by_year})
It looks like the date part in your csv file is in JSON format; don't you need to parse that first?
line.started_by is just this string:
"{""day"":NumberInt(30),""month"":NumberInt(11),""year"":NumberInt(1968)}"
There is no line.started_by.day.
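If you cannot change how the CSV is produced, one option is to flatten the started_by column in a small preprocessing step before LOAD CSV. A sketch in Python, assuming the file looks exactly like the sample above (the NumberInt(...) wrappers are Mongo-style extended JSON, not valid JSON, so they are stripped first; file names are hypothetical):

import csv
import json
import re

fields = ["_id", "official_name", "common_name", "country",
          "started_by_day", "started_by_month", "started_by_year"]

with open("team.csv", newline="") as src, \
     open("team_flat.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=fields)
    writer.writeheader()
    for row in reader:
        # Strip the NumberInt(...) wrappers so the cell becomes valid JSON.
        raw = re.sub(r"NumberInt\((\d+)\)", r"\1", row["started_by"])
        started = json.loads(raw)
        writer.writerow({
            "_id": row["_id"],
            "official_name": row["official_name"],
            "common_name": row["common_name"],
            "country": row["country"],
            "started_by_day": started["day"],
            "started_by_month": started["month"],
            "started_by_year": started["year"],
        })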

Datastore is putting a different ID value when importing from csv file

I am importing data from a csv file into Datastore using appcfg.py.
My problem is that the ID value inserted in Datastore is different from the value in the csv file.
I think the error is in the bulkloader.yaml; I don't know how to import the value.
This is my csv file:
http://pastebin.com/embed_js.php?i=xC9UVVty
This is my bulkloader.yaml:
http://pastebin.com/embed_js.php?i=W3t9c6qd
And this is the result in Datastore:
ID/Name = id=1121
casillaEspecial = N
claveEntidad = aghzfmhheS1mb3IYCxIRRW50aWRhZEZlZGVyYXRpdmEiATEM
EntidadFederativa: name=1
...
ID/Name = id=1122
casillaEspecial = N
claveEntidad = aghzfmhheS1mb3IYCxIRRW50aWRhZEZlZGVyYXRpdmEiATEM
EntidadFederativa: name=1
...
ID/Name = id=1123
casillaEspecial = N
claveEntidad = aghzfmhheS1mb3IYCxIRRW50aWRhZEZlZGVyYXRpdmEiATEM
EntidadFederativa: name=1
...
As you can see, the IDs in the csv file are 1, 2, 3, whereas in Datastore they are 1121, 1122, 1123.
I hope you can help me, please.
Thanks.
It turns out there wasn't actually any problem with the files. I only changed the billing status of my application to Enabled and everything started to go very well. After changing the status I uploaded the csv file again and, magically, the IDs had no more problems; the import was made correctly.