Trying to create a Sentinel query (KQL) that uses the externaldata() operator to ingest the information from the JSON file 'https://www.gov.uk/bank-holidays.json'. The problem I am running into is that because this JSON file contains the column / field 'date', Sentinel does not allow it as a variable name. Has anyone been able to get multilayer JSON fields from an external file?
externaldata (title:string, date:string, notes:string, bunting:bool)[
#"https://www.gov.uk/bank-holidays.json"
]
with(format="multijson")
The externaldata operator was created to enable users of Azure Data Explorer (AKA Kusto) based SaaS systems, such as Log Analytics and Application Insights, to work with external data located in Azure storage.
Retrieving data from web sites is an unsupported scenario.
Sometimes it works, and sometimes not (depends on what lies on the other side).
For your specific URL, it does not work.
Special names in KQL can be expressed with brackets and single/double quotes, e.g., ['date'] or ["date"].
The entire document is written on a single row, so json is enough; there is no need for multijson.
The assumed schema is wrong (title:string, date:string, notes:string, bunting:bool).
The JSON has 3 keys in the 1st layer, one for each kingdom: "england-and-wales", "scotland" & "northern-ireland".
While we can use the above keys to read the JSON, I would prefer reading it as txt or raw, parsing it to JSON and then exploding it, as demonstrated in the query below.
externaldata(doc:string)
[h'https://<storage-account-name>.blob.core.windows.net/mycontainer/bank-holidays.json;<secret>']
with(format='txt')
| project doc = parse_json(doc)
| mv-expand kind=array doc
| project kingdom = tostring(doc[0])
,division = doc[1].division
,events = doc[1].events
| mv-expand events
| evaluate bag_unpack(events)
//| sample 10
kingdom           | division          | bunting | date                 | notes          | title
northern-ireland  | northern-ireland  | false   | 2017-04-14T00:00:00Z |                | Good Friday
england-and-wales | england-and-wales | true    | 2017-05-29T00:00:00Z |                | Spring bank holiday
scotland          | scotland          | false   | 2018-03-30T00:00:00Z |                | Good Friday
england-and-wales | england-and-wales | true    | 2018-12-25T00:00:00Z |                | Christmas Day
northern-ireland  | northern-ireland  | false   | 2019-04-19T00:00:00Z |                | Good Friday
england-and-wales | england-and-wales | true    | 2019-12-25T00:00:00Z |                | Christmas Day
northern-ireland  | northern-ireland  | true    | 2020-01-01T00:00:00Z |                | New Year’s Day
scotland          | scotland          | true    | 2022-01-04T00:00:00Z | Substitute day | 2nd January
scotland          | scotland          | false   | 2022-09-19T00:00:00Z |                | Bank Holiday for the State Funeral of Queen Elizabeth II
scotland          | scotland          | true    | 2023-01-02T00:00:00Z | Substitute day | New Year’s Day
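For reference, the same flattening can be sketched outside of Kusto in Python; this sketch assumes only the structure described above (three top-level kingdom keys, each holding a division and a list of events):

import json
import urllib.request

# Fetch the published bank-holidays document (three top-level kingdom keys).
with urllib.request.urlopen("https://www.gov.uk/bank-holidays.json") as resp:
    doc = json.load(resp)

rows = []
for kingdom, payload in doc.items():
    for event in payload.get("events", []):
        rows.append({
            "kingdom": kingdom,
            "division": payload.get("division"),
            "bunting": event.get("bunting"),
            "date": event.get("date"),
            "notes": event.get("notes"),
            "title": event.get("title"),
        })

for row in rows[:5]:
    print(row)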
I'm not sure if the title is correct, but I'll explain what I mean. I'm doing a project that involves an API, and I created the data classes needed to store that information. Where it gets weird is actually getting the information I need. Here's an instance of the kind of information I need for the project:
"Text":"Alma Gutierrez - Alma M. Gutierrez is a fictional character on the HBO drama The Wire, played by actress Michelle Paress. Gutierrez is a dedicated and idealistic young reporter on the city desk of The Baltimore Sun."
You see, the name of the character and the description are all in a single string value. I'm used to the name and the description being separated, for example like this:
Text:{
name: "Alma Gutierrez"
description:"Alma is a..."
}
So my question is: how can I manipulate the response so that I can get the name and the description separately? I am thinking of some sort of function that takes the string value from the JSON call and splits it into name and description values, but I'm not sure how to do that.
I'll leave my project's GitHub URL below for reference. Thanks for the help.
https://github.com/OEThe11/AnywhereCE
You can use split() to split a string into parts based on a delimiter.
For instance, if you have a string containing the description as mentioned in the question, you can do the following:
val text = "Alma Gutierrez - Alma M. Gutierrez is a fictional character on the HBO drama The Wire, played by actress Michelle Paress. Gutierrez is a dedicated and idealistic young reporter on the city desk of The Baltimore Sun."
val (name, description) = text.split(" - ", limit = 2)
The limit = 2 parameter ensures that you won't lose part of the description if it contains " - ". It splits into at most 2 parts, so it will take everything before the first occurrence of " - " as the name, and everything after it as the description, even if that includes more occurrences of " - ".
Note that using the destructuring declaration val (name, description) = ... like this will fail if split() returns fewer than 2 parts (in other words, if the initial text doesn't contain " - " at all). That may be OK for you depending on the input you expect here.
To add on to what Joffery said, I actually created a variable to hold the split string.
val parts = Text.split(" - ", limit = 2)
Since there are only two values in the split, I can use the variable and access the index I need for the corresponding text field.
Is it possible to define a template Daily, which can only be created once per day in the sense that if Alice creates one, Bob no longer can, and if Bob creates one, Alice no longer can?
When asking about constraints like "one per day" in DAML, one has to think about the scope of that constraint and who guarantees it.
The simplest possible template in DAML is
template Daily
  with
    holder : Party
  where
    signatory holder
An instance of this template is only known to holder. There is no party, or set of parties that could ensure that there is only one such contract instance between Alice and Bob. In certain ledger topologies, Alice and Bob may not even know about each other, nor is there any party that knows about both.
A set of parties that guarantees the uniqueness is needed:
template Daily
  with
    holder : Party
    uniquenessGuarantors : [Party]
  ...
The uniqueness guarantors need to be able to enable or block the creation of a Daily. In other words, they need to be signatories.
template Daily
  with
    holder : Party
    uniquenessGuarantors : [Party]
  where
    signatory holder, uniquenessGuarantors
Now the easiest way to guarantee any sort of uniqueness in DAML is by using contract keys. Since we want one per day, we need a Date field.
template Daily
  with
    holder : Party
    uniquenessGuarantors : [Party]
    date : Date
  where
    signatory holder, uniquenessGuarantors
    key (uniquenessGuarantors, date) : ([Party], Date)
    maintainer key._1
What this says is that there is a unique copy of Daily for each key, and the guarantors in key._1 are responsible for making it so.
Finally you need a mechanism for actually creating these things, a sort of DailyFactory provided by the guarantors. That factory can also take care of making sure that date is always set to the current date on the ledger.
template DailyFactory
  with
    uniquenessGuarantors : [Party]
    holder : Party
  where
    signatory uniquenessGuarantors
    controller holder can
      nonconsuming FabricateDaily : ContractId Daily
        do
          now <- getTime
          let date = toDateUTC now
          create Daily with ..
A simple test shows how it works, with uniqueness being guaranteed by a single party Charlie:
test_daily = scenario do
  [alice, bob, charlie] <- mapA getParty ["Alice", "Bob", "Charlie"]
  fAlice <- submit charlie do
    create DailyFactory with
      holder = alice
      uniquenessGuarantors = [charlie]
  fBob <- submit charlie do
    create DailyFactory with
      holder = bob
      uniquenessGuarantors = [charlie]
  -- Alice can get hold of a `Daily`
  submit alice do
    exercise fAlice FabricateDaily
  -- Neither can create a second
  submitMustFail alice do
    exercise fAlice FabricateDaily
  submitMustFail bob do
    exercise fBob FabricateDaily
  -- The next day bob can create one
  pass (days 1)
  submit bob do
    exercise fBob FabricateDaily
  -- But neither can create a second
  submitMustFail alice do
    exercise fAlice FabricateDaily
  submitMustFail bob do
    exercise fBob FabricateDaily
Note that in terms of privacy, Alice and Bob don't know about each other or the other's Daily or DailyFactory, but the uniquenessGuarantors know all parties for which uniqueness is maintained, and know of all Daily instances for which they guarantee uniqueness. They have to!
To run the above snippets, you need to import DA.Time and DA.Date.
Beware that getTime returns UTC, so the code guarantees uniqueness of one per day according to UTC, but not, for example, according to a local calendar (which could be, say, Auckland, NZ).
Can you look at https://data.cityofnewyork.us/City-Government/ERROR-in-record-type/dq2e-3a6q
This shows a record type that appears to be incorrect.
It shows
P:10,item":"Bloomfield"},{"count":9,item":"New Britain"},{"count":8,item":"West Htfd"},{"count":7,item":"Torrington"},{"count":6,item":"Meriden"},{"count":5,item":"Whfd"},{"count":4,item":"Manchester
If you select count(*) and group by record_type you see:
curl 'https://data.cityofnewyork.us/resource/636b-3b5g.json?$select=count(*),record_type&$group=record_type'
[
  {
    "count": "1",
    "record_type": "P:10,item\":\"Bloomfield\"},{\"count\":9,item\":\"New Britain\"},{\"count\":8,item\":\"West Htfd\"},{\"count\":7,item\":\"Torrington\"},{\"count\":6,item\":\"Meriden\"},{\"count\":5,item\":\"Whfd\"},{\"count\":4,item\":\"Manchester"
  },
  {
    "count": "36631085",
    "record_type": "P"
  }
]
This means there are 36M records whose record_type has the value "P", plus one very odd one.
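For completeness, the same SoQL aggregation can be run from Python; this sketch only reuses the endpoint and query parameters from the curl call above:

import json
import urllib.request
from urllib.parse import urlencode

# Same query as the curl call: count rows grouped by record_type.
base = "https://data.cityofnewyork.us/resource/636b-3b5g.json"
query = urlencode({"$select": "count(*),record_type", "$group": "record_type"})

with urllib.request.urlopen(f"{base}?{query}") as resp:
    groups = json.load(resp)

for group in groups:
    # Truncate the record_type in case it contains the long garbled value.
    print(group["count"], repr(group["record_type"])[:80])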
One suggestion for New York City Open Data Law:
We must modify the Open Data Law (http://www1.nyc.gov/site/doitt/initiatives/open-data-law.page) to require New York City government agencies not only to open up data, but also to actually use the open data portal for their own public-facing agency sites.
If we allow agencies to simply dump data into a portal, then we have no quality testing, and agencies can trumpet how many datasets are open even though no one actually uses the data.
This simple change, "an agency must use its own data (aka dogfooding)", will encourage quality. If you read http://www1.nyc.gov/site/doitt/initiatives/open-data-law.page, it mentions quality only once and says nothing about usage of the data. A portal is not a thing to brag about; it is an important way to join technology and government.
Thanks!
My JSON code is as below:
[
{"group":{"key":"Chinese","title":"Chinese","shortTitle":"Chinese","recipesCount":0,"description":"Chinese cuisine is any of several styles originating from regions of China, some of which have become increasingly popular in other parts of the world – from Asia to the Americas, Australia, Western Europe and Southern Africa. The history of Chinese cuisine stretches back for many centuries and produced both change from period to period and variety in what could be called traditional Chinese food, leading Chinese to pride themselves on eating a wide range of foods. Major traditions include Anhui, Cantonese, Fujian, Hunan, Jiangsu, Shandong, Szechuan, and Zhejiang cuisines. ","rank":"","backgroundImage":"images/Chinese/chinese_group_detail.png", "headerImage":"images/Chinese/chinese_group_header.png"},
"key":1000,
"title":"Abalone Egg Custard",
"shortTitle" : "Abalone Egg Custard",
"serves":4,
"perServing":"65kcal / 2.2g fat",
"favorite":false,
"rating": 3 ,
"directions":["Step 1.","Step 2.","Step 3.","Step 4.","Step 5."],
"backgroundImage":"images/Chinese/AbaloneEggCustard.jpg",
"healthytips":["Tip 1","Tip 2","Tip 3"],
"nutritions":["Calories 65kcal","Total Fat 2.2g","Carbs 4g","Cholesterol 58mg","Sodium 311mg","Dietary Fibre 0.3g"],
"ingredients":["1 head Napa or bok choy cabbage","1/4 cup sugar","1/4 teaspoon salt","3 tablespoons white vinegar","3 green onions","1 (3-ounce) package ramen noodles with seasoning pack","1 (6-ounce) package slivered almonds","1 tablespoon sesame seeds","1/2 cup vegetable oil"]}
]
How am I going to persist this in a database? Because at the end of the day I have to read it from the database and be able to parse it using WebAPI.
Persist it as a CLOB data type in your database in the likely event that the length is going to exceed the limits of a varchar.
There are so many potential answers here -- you'll need to provide many more details to get a specific answer.
Database
What database are you using -- is it relational, object, or NoSQL? If you come from a NoSQL perspective, saving it as a lump is likely fine. From an RDBMS perspective (like SQL Server), you map all the fields down to a series of rows in a set of related tables. If you're using a relational database, just jamming an unparsed, unvalidated lump of JSON text into the database is the wrong way to go -- why bother using a database that provides DRI (declarative referential integrity) at all?
Data Manipulation Layer
Not included in your question is what type of data manipulation layer you'll use -- it could be LINQ to SQL, straight ADO.NET, a micro-ORM like Dapper, Massive, or PetaPoco, or a full-blown ORM like Entity Framework or NHibernate.
Have you picked one of these or are you looking for guidance on selecting one?
Parsing in WebAPI
Converting from JSON to an object or an object to JSON is easy in WebAPI. For JSON specifically, the JSON.NET formatter is available out of the box.
Conceptually, however, it sounds like you're missing part of the magic of WebAPI. With WebAPI you return your object in its native state (or IQueryable if you want OData support). After your function call finishes, the formatters take over and serialize it into the proper shape based on the client request. This process is called content negotiation. The idea is that your methods are format agnostic and the framework serializes the data into the transport format your client wants (XML, JSON, whatever).
The reverse is true too, where the framework deserializes the format provided by the client into a native object.
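To see content negotiation from the client side, here is a small sketch; the endpoint URL is a made-up placeholder rather than a route from the question, and the only thing that changes between the two requests is the Accept header:

import urllib.request

# Placeholder WebAPI route; substitute your own controller's URL.
url = "http://localhost:5000/api/recipes/1000"

for accept in ("application/json", "application/xml"):
    req = urllib.request.Request(url, headers={"Accept": accept})
    with urllib.request.urlopen(req) as resp:
        # The same action method serves both; the formatter picks the shape.
        print(accept, "->", resp.headers.get("Content-Type"))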
I'm working with a set of lobbying disclosure records. The Secretary of the Senate publishes these records as XML files, which look like this:
<Filing ID="1ED696B6-B096-4591-9181-DA083921CD19" Year="2010" Received="2011-01-01T11:33:29.330" Amount="" Type="LD-203 YEAR-END REPORT" Period="Year-End (July 1 - Dec 31)">
<Registrant xmlns="" RegistrantID="8772" RegistrantName="CERIDIAN CORPORATION" Address="4524 Cheltenham Drive
Bethesda, MD 20814" RegistrantCountry="USA"/>
<Lobbyist xmlns="" LobbyistName="O'CONNELL, JAMES"/>
</Filing>
<Filing ID="179345CF-8D41-4C71-9C19-F41EB88254B5" Year="2010" Received="2011-01-01T13:48:31.543" Amount="" Type="LD-203 YEAR-END AMENDMENT" Period="Year-End (July 1 - Dec 31)">
<Registrant xmlns="" RegistrantID="400447142" RegistrantName="Edward Merlis" Address="8202 Hunting Hill Lane
McLean, VA 22102" RegistrantCountry="USA"/>
<Lobbyist xmlns="" LobbyistName="Merlis, Edward A"/>
<Contributions>
<Contribution xmlns="" Contributor="Merlis, Edward A" ContributionType="FECA" Payee="DeFazio for Congress" Honoree="Cong. Peter DeFazio" Amount="250.0000" ContributionDate="2010-09-05T00:00:00"/>
<Contribution xmlns="" Contributor="Merlis, Edward A" ContributionType="FECA" Payee="Friends of Jim Oberstar" Honoree="Cong. Jim Oberstar" Amount="1000.0000" ContributionDate="2010-09-01T00:00:00"/>
<Contribution xmlns="" Contributor="Merlis, Edward A" ContributionType="FECA" Payee="McCaskill for Missouri 2012" Honoree="Senator Claire McCaskill" Amount="1000.0000" ContributionDate="2010-09-18T00:00:00"/>
<Contribution xmlns="" Contributor="Merlis, Edward A" ContributionType="FECA" Payee="Mesabi Fund" Honoree="Cong. Jim Oberstar" Amount="500.0000" ContributionDate="2010-07-13T00:00:00"/>
</Contributions>
</Filing>
As you can see, some <Filing> tags also contain <Contribution> tags, but others do not.
I see two objects here: contributors (i.e., lobbyists) and contributions (i.e., a transaction between a lobbyist and a member of Congress).
I'd like to load these records into a MySQL database. To me, the logical structure would include two tables: one for contributors (with fields for name, ID, address, etc.) and one for contributions (with amount, recipient, etc., and a relational link to the list of contributors).
My question: am I approaching this problem correctly? If so, does this data schema make sense? Finally, how am I to parse the XML to load it into the MySQL tables as I've structured them?
Solved: I'm using a Python SAX parser to process the XML file.
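A minimal sketch of what such a SAX-based loader could look like is below; the element and attribute names come from the sample XML above, while the file name and the two-table split into contributors and contributions are illustrative assumptions:

import xml.sax

class FilingHandler(xml.sax.ContentHandler):
    """Collect one row per lobbyist and one row per contribution."""

    def __init__(self):
        super().__init__()
        self.contributors = []   # rows for a contributors table
        self.contributions = []  # rows for a contributions table
        self._filing_id = None

    def startElement(self, name, attrs):
        if name == "Filing":
            self._filing_id = attrs.get("ID")
        elif name == "Lobbyist":
            self.contributors.append((self._filing_id, attrs.get("LobbyistName")))
        elif name == "Contribution":
            self.contributions.append((
                self._filing_id,
                attrs.get("Contributor"),
                attrs.get("Payee"),
                attrs.get("Honoree"),
                attrs.get("Amount"),
                attrs.get("ContributionDate"),
            ))

handler = FilingHandler()
xml.sax.parse("filings.xml", handler)  # file name is a placeholder

# The collected tuples can then be inserted with any MySQL driver, e.g.
# cursor.executemany("INSERT INTO contributions VALUES (%s, %s, %s, %s, %s, %s)",
#                    handler.contributions)
print(len(handler.contributors), "contributors,", len(handler.contributions), "contributions")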
If you are using MySQL version 5.5 you may find the LOAD XML command useful.
That being said, LOAD XML appears to be geared towards loading data into a single table for a given XML file, so it may not work for your specific files.
The traditional approach for these kinds of problems is to use an ETL tool.
Do you already have such a tool (e.g. Informatica or Talend) in your organization?
Another approach is to write a small utility to parse these XML files and load the data by creating master-detail relationships in MySQL.