I have a JSON string array (the result of a REST API call), shown below. I need to organize that string array into a data table in Spotfire using an IronPython script: each attribute should appear as a column name (only once), and the values should be iterated over as records.
Any help is much appreciated.
Please let me know if you need more clarity.
JSON Data:
[{"actionType":"Adjust Tubing or Casing PSI","assignedTeam":"FS","exceptionStatus":"","eventEndDate":"1970-01-01T00:00:00.000Z","description":"Motor Temperature","lastEditedDate":"2019-07-19T02:49:00.945Z","well":{"apiNumber12":"423294206100","area":"MIDLAND BASIN","wellName":"C SCHARB 31WB","asset":"ODESSA EAST FMT","route":"4000003 - NOD","apiNumber14":"42329420610001"},"actionItemPastDueDate":"2019-07-27T02:48:59.000Z","lastEditedBy":"Agent(System-Queue-ServiceLevel.ProcessEvent)","priorityRangeMax":500.000000000,"currentStage":"ALT5","id":"E-4056","workOrder":"","assignedDate":"2019-07-15T05:00:00.000Z","impactedLpo":true,"alcrApprovalComments":"","createdDate":"2019-07-16T02:48:34.786Z","alcrApprovalDeadline":"2019-07-19T02:48:59.000Z","dueDate":"2019-07-22T02:48:59.000Z","chemicalAdvisorComments":"","recommendationsStripped":"Training Test","workStatus":"Pending-ALCRApproval","actionTakenStripped":"","recommendations":"Training Test","tier":"6","cancelationComments":"","priority":"5","woSubmitted":false,"daysTillActionItemDue":"","eventStartDate":"2019-07-16T02:48:37.276Z","assignedTo":"","currentCommentsStripped":"","fsActionItemNearDueDate":"1970-01-01T00:00:00.000Z","cancelDate":"1970-01-01T00:00:00.000Z","averageBoed":374,"fsActionItemPastDueDate":"1970-01-01T00:00:00.000Z","lpoUom":"barrels per day","lpoRate":374,"surfaceWorkRequired":false,"liftMethod":"ESP","originatorCai":"alwo.coeesp","actionItemNearDueDate":"2019-07-21T02:48:59.000Z","createdBy":"Centaur COE ESP","priorityRangeMin":250.000000000,"functionalGroup":"ESP COE"}]
(Above is the edited JSON of a single record.)
from System.Net import HttpWebRequest
from System.IO import StreamReader
from System.Web.Script.Serialization import JavaScriptSerializer

uri = "https://quantprice.herokuapp.com/api/v1.1/scoop/period?tickers=MSFT&begin=2012-02-19"
webRequest = HttpWebRequest.Create(uri)
response = webRequest.GetResponse()
streamReader = StreamReader(response.GetResponseStream())
jsonData = streamReader.ReadToEnd()

js = JavaScriptSerializer()
dataDict = js.Deserialize(jsonData, object)

# build a string representing the data in tab-delimited text format
# (note: "stations" is left over from the sample this was adapted from;
# the key has to match the structure of the JSON actually returned)
myColName = []
for val in dataDict["stations"]:
    myColName.append(val["id"])
textData = "\t".join(myColName) + "\r\n"
print textData

for quote in dataDict["stations"]:
    print "\t".join(str(val) for val in quote)
    textData += "\t".join(str(val) for val in quote) + "\r\n"
print textData
If I do this, my result set shows [column name, column value] pairs as rows. But I want the column names as my header list and the column values as my rows.
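For reference, here is a minimal sketch of the header-once / values-per-row idea, assuming the posted JSON deserializes to a list of dictionaries (the nested well object would need to be flattened separately); the variable names are illustrative, not from the original script:

from System.Web.Script.Serialization import JavaScriptSerializer

js = JavaScriptSerializer()
# the top level of the posted JSON is a list, so this yields a list of dictionaries
records = js.Deserialize(jsonData, object)

# take the column names once, from the first record
colNames = [key for key in records[0].Keys]
textData = "\t".join(colNames) + "\r\n"

# then emit one row per record, with values in the same column order
for rec in records:
    textData += "\t".join(str(rec[c]) for c in colNames) + "\r\n"

This keeps the header row separate from the data rows, which is the shape a tab-delimited text import into Spotfire expects.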
I have a DB collection consisting of nested strings. I am trying to convert the contents under the "status" column into separate columns against each order ID, in order to track the time taken from "order confirmed" to "pick up confirmed". Each "status" value is a nested array of status entries (the sample screenshot is not reproduced here).
I have tried the following:
xyz_db = db.logisticsOrders             # DB collection
df = pd.DataFrame(list(xyz_db.find()))  # JSON to dataframe
Using normalize:
parse1 = pd.json_normalize(df['status'])
It works fine in the case of non-nested arrays. But since status is a nested array, the output still contains the nested lists (screenshot not reproduced here).
Using a for loop:
data = df[['orderid','status']]
data = list(data['status'])
dfy = pd.DataFrame(columns=['statuscode', 'statusname', 'laststatusupdatedon'])
for i in range(0, len(data)):
    result = data[i]
    # three values for the three columns (the original had four copies of data[i][0])
    dfy.loc[i] = [result[0], result[1], result[2]]
It gives the result in the form of appended rows, which is not the format I am trying to achieve.
The output I am trying to get is one row per order ID, with each status as its own column (screenshot not reproduced here).
Please help out!
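A minimal sketch of one way to get there, assuming each entry in a 'status' list is a dict with 'statusname' and 'laststatusupdatedon' keys (the field names are guessed from the column list above, so adjust them to the real schema):

import pandas as pd

# df has columns 'orderid' and 'status', where each 'status' value is a
# list of dicts like {'statuscode': ..., 'statusname': ..., 'laststatusupdatedon': ...}
exploded = df[['orderid', 'status']].explode('status').reset_index(drop=True)

# flatten the per-status dicts into their own columns
flat = pd.concat([exploded['orderid'],
                  pd.json_normalize(exploded['status'].tolist())], axis=1)

# pivot: one row per order, one timestamp column per status name
wide = flat.pivot_table(index='orderid',
                        columns='statusname',
                        values='laststatusupdatedon',
                        aggfunc='first')

# the elapsed time would then be a simple column difference, e.g.
# pd.to_datetime(wide['pick up confirmed']) - pd.to_datetime(wide['order confirmed'])

The pivot is what turns the appended rows from the for-loop attempt into the wide, one-row-per-order layout.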
I'll share the JSON-reading helper I used; maybe it helps you. It handles a parsed value that is a dict, list, or tuple.
def jsonify(z):
    genr = []
    if z == z and z is not None:  # z == z is False for NaN, so this skips missing values
        z = eval(z)  # parse the string into a Python object
        if type(z) in (dict, list, tuple):
            for dic in z:
                for key, val in dic.items():
                    if key == "name":
                        genr.append(val)
        else:
            return None
    else:
        return None
    return genr

top_genr['genres_N'] = top_genr['genres'].apply(jsonify)
I want to parse a JSON column through Power BI. I have imported the data directly from the server, and it has a JSON column along with other columns. Is there a way to parse this JSON column?
Example:
Key IDNumber Module JsonResult
012 200 Dine {"CategoryType":"dining","City":"mumbai","Location":"all"}
97 303 Fly {"JourneyType":"Return","Origin":"Mumbai (BOM)","Destination":"Chennai (MAA)","DepartureDate":"20-Oct-2016","ReturnDate":"21-Oct-2016","FlyAdult":"1","FlyChildren":"0","FlyInfant":"0","PromoCode":""}
276 6303 Stay {"Destination":"Clarion Chennai","CheckInDate":"14-Oct-2016","CheckOutDate":"15-Oct-2016","Rooms":"1","NoOfPax":"2","NoOfAdult":"2","NoOfChildren":"0"}
I wish to retain the other columns and also get the simplified parsed columns.
There is an easier way to do it in the Query Editor, on the column you want to read as JSON:
Right click on the column
Select Transform>JSON
then the column becomes a Record that you can split into every property of the JSON using the button in the top right corner of the column header.
Use the Json.Document function like this:
let
    ...
    your_table = imported_the_data_directly_from_the_server,
    json = Table.AddColumn(your_table, "NewColName", each Json.Document([JsonResult]))
in
    json
And then expand the record into table columns using Table.ExpandRecordColumn, or by clicking the expand button in the top right corner of the new column's header.
Use the Json.Document() function to convert the string to JSON data.
let
    Source = Json.Document(Json.Document(Web.Contents("http://localhost:18091/pools/default/buckets/Aggregation/docs/AvgSumAssuredByProduct"))[json]),
    #"Converted to Table" = Record.ToTable(Source),
    #"Filtered Rows" = Table.SelectRows(#"Converted to Table", each not Text.Contains([Name], "type_")),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Name", "AvgSumAssuredByProduct"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Value", type number}})
in
    #"Changed Type"
import json
from urllib import urlopen

l = []
j = []
d_base = urlopen('https://api.thingspeak.com/channels/193888/fields/1.json?results=1')
data = json.load(d_base)

for k in data['feeds']:
    name = k['entry_id']
    value = k['field1']
    l.append(name)
    j.append(value)

print l[0]
print j[0]
This Python code may be useful for you. It prints, for example:
270
1035
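The snippet above is Python 2; a minimal Python 3 equivalent, assuming the same ThingSpeak endpoint and field names, would be:

import json
from urllib.request import urlopen

with urlopen('https://api.thingspeak.com/channels/193888/fields/1.json?results=1') as resp:
    data = json.load(resp)

entry_ids = [feed['entry_id'] for feed in data['feeds']]
values = [feed['field1'] for feed in data['feeds']]

print(entry_ids[0])
print(values[0])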
I'm trying to convert a string of key-value pairs to a JSON string. The only thing I know about the string of KV pairs is the format of the string, i.e. space separated, comma separated, etc. I don't have control over the number or type of the keys coming in as input.
Here is what I came up with, and I wanted to see if this approach looks OK / awesome / awkward. I would appreciate it if there is a better alternative.
INPUT: clientIp="1.1.1.1" identifier="a.b.c" key1=10 key2="v3"
final val KV_PATTERN = "(\"[^\"]*\"|[^,\\\"\\s]*)=(\"[^\"]*\"|[^,\\\"\\s]*)".r
val cMap = KV_PATTERN.findAllMatchIn(inputString).map(m => (m.group(1).trim(), m.group(2).trim())).toMap
val json = cMap.map { case (key, value) => if (!key.startsWith("\"")) s""""$key"""" + ":" + value else s"$key:$value" }.mkString("{", ",", "}")
OUTPUT: {"clientIp":"1.1.1.1","identifier":"a.b.c","key1":10,"key2":"v3"}
"{"+ inputString.split(" ").map{case i => val t = i.split("="); s""""${t(0).replaceAll("^\"|\"$", "")}": ${t(1)}"""}.mkString(",") + "}"
Maybe this is cleaner.
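For comparison, the same regex-then-serialize idea can be sketched in Python (a hypothetical illustration, not from either answer; json.dumps takes care of the quoting, and the numeric check below is one possible heuristic):

import json
import re

# capture a quoted or bare token on each side of '=', mirroring the Scala pattern
KV_PATTERN = re.compile(r'("[^"]*"|[^,"\s]*)=("[^"]*"|[^,"\s]*)')

def kv_to_json(input_string):
    pairs = {}
    for key, value in KV_PATTERN.findall(input_string):
        key = key.strip('"')
        if value.startswith('"'):
            pairs[key] = value.strip('"')  # quoted values stay strings
        elif value.isdigit():
            pairs[key] = int(value)        # bare integers become numbers
        else:
            pairs[key] = value
    return json.dumps(pairs)

print(kv_to_json('clientIp="1.1.1.1" identifier="a.b.c" key1=10 key2="v3"'))
# {"clientIp": "1.1.1.1", "identifier": "a.b.c", "key1": 10, "key2": "v3"}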
A relative newbie to Spark, HBase, and Scala here.
I have JSON (stored as byte arrays) in HBase cells in the same column family but across several thousand column qualifiers. Example (simplified):
Table name: 'Events'
rowkey: rk1
column family: cf1
column qualifier: cq1, cell data (in bytes): {"id":1, "event":"standing"}
column qualifier: cq2, cell data (in bytes): {"id":2, "event":"sitting"}
etc.
Using Scala, I can read rows by specifying a time range:
val scan = new Scan()
val start = 1460542400
val end = 1462801600
scan.setTimeRange(start, end)  // apply the time range to the scan (missing in the original snippet)
val hbaseContext = new HBaseContext(sc, conf)
val getRdd = hbaseContext.hbaseRDD(TableName.valueOf("Events"), scan)
If I try to load my HBase RDD (getRdd) into a DataFrame (after converting the byte arrays into strings, etc.), it only reads the first cell in every row (in the example above, I would only get "standing").
This code only loads up a single cell for every row returned:
val resultsString = getRdd.map(s => Bytes.toString(s._2.value()))
val resultsDf = sqlContext.read.json(resultsString)
In order to get every cell I have to iterate as below.
val jsonRDD = getRdd.map(
  row => {
    val str = new StringBuilder
    str.append("[")
    val it = row._2.listCells().iterator()
    while (it.hasNext) {
      val cell = it.next()
      val cellstring = Bytes.toString(CellUtil.cloneValue(cell))
      str.append(cellstring)
      if (it.hasNext) {
        str.append(",")
      }
    }
    str.append("]")
    str.toString()
  }
)
val hbaseDataSet = sqlContext.read.json(jsonRDD)
I need to add the square brackets and the commas so it's properly formatted JSON for the DataFrame to read.
Questions:
Is there a more elegant way to construct the JSON, i.e. some parser that takes the individual JSON strings and concatenates them together so the result is properly formed JSON? (See the sketch after this list.)
Is there a better capability to flatten HBase cells so I don't need to iterate?
For the jsonRDD, the closure that is computed should include the str local variable, so a task executing this code on a node should not be missing the "[", "]", or ",", i.e. I won't get parser errors once I run this on the cluster instead of local[*]. Is that correct?
Finally, is it better to just create a pair RDD from the JSON or use DataFrames to perform simple things like counts? Is there some way to measure the efficiency and performance of one vs. the other?
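On the first question, a language-agnostic illustration of the idea (sketched here in Python rather than Scala, purely to show the shape of it): parse each cell and re-serialize the whole list, so the brackets and commas come from the JSON library instead of manual string building.

import json

# the raw JSON strings taken from one row's cells
cells = ['{"id":1, "event":"standing"}', '{"id":2, "event":"sitting"}']

# parse each cell, then serialize the list; no hand-built "[", "]" or ","
combined = json.dumps([json.loads(c) for c in cells])
# '[{"id": 1, "event": "standing"}, {"id": 2, "event": "sitting"}]'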
thank you
I am using Dell Boomi to map data from one system to another. I can use Groovy in the maps but have no experience with it. I tried to do this with the other Boomi tools, but I have been told that I'll need to use Groovy in a script. My inbound data is:
132265,Brown
132265,Gold
132265,Gray
132265,Green
I would like to output:
132265,"Brown,Gold,Gray,Green"
Hopefully this makes sense! Any ideas on the groovy code to make this work?
It can be elegantly solved with groupBy and the spread operator:
@Grapes(
    @Grab(group='org.apache.commons', module='commons-csv', version='1.2')
)
import org.apache.commons.csv.*
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def parsed = CSVParser.parse(csv, CSVFormat.DEFAULT.withHeader('code', 'color'))
parsed.records.groupBy({ it.code }).each { k, v -> println "$k,\"${v*.color.join(',')}\"" }
The above prints:
132265,"Brown,Gold,Gray,Green"
Well, I don't know how you are getting your data, but here is a general way to achieve your goal. You can use a library, such as the one below, to parse the CSV:
https://github.com/xlson/groovycsv
The example for your data would be:
@Grab('com.xlson.groovycsv:groovycsv:1.1')
import static com.xlson.groovycsv.CsvParser.parseCsv
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def data = parseCsv(csv)
I believe you want to associate the number with the various colour values. So for each line you can build a map from the number to the colours associated with that number, splitting the line by ",":
map = [:]
for (line in data) {
    number = line.split(',')[0]
    colour = line.split(',')[1]
    if (!map[number])
        map[number] = []
    map[number].add(colour)
}
println map
So map should contain:
[132265:["Brown","Gold","Gray","Green"]]
Well, if this is not exactly what you want, you can still extract the general idea.
Assuming your data is coming in as a comma separated string of data like this:
"132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
The following Groovy script code should do the trick.
def csvString = "132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
LinkedHashMap.metaClass.multiPut << { key, value ->
    delegate[key] = delegate[key] ?: []
    delegate[key] += value
}
def map = [:]
def csv = csvString.split().collect{ entry -> entry.split(",") }
csv.each{ entry -> map.multiPut(entry[0], entry[1]) }
def result = map.collect{ k, v -> k + ',"' + v.join(",") + '"'}.join("\n")
println result
Would print:
132265,"Brown,Gold,Gray,Green"
122222,"Red,White"
Do you HAVE to use scripting for some reason? This can be easily accomplished with out-of-the-box Boomi functionality.
Create a map function that prepends the ID field to a string of your choice (e.g. 222_concat_fields). Then use that value to set a dynamic process property. The value of the process property will contain the result of concatenating the name fields. Simply adding this function to your map should take care of it; then use the final value to populate your result.
Well, it depends on how the data is coming in.
If the data which you have posted in the question is coming in a single document, then you can easily handle this in a map with Groovy scripting.
If the data is coming in as multiple documents, i.e.
doc1: 132265,Brown
doc2: 132265,Gold
doc3: 132265,Gray
doc4: 132265,Green
In that case it cannot be handled in a map. You will need to use a Data Process step with custom scripting.
The Groovy code you are asking for depends on the input profile in which you are getting the data. Please provide more information, i.e. the input profile, fields, etc.