I have an ADF data flow that uses a pre-defined dataset of type REST as the source. From the moment the GET request is made, ADF shows all datetime values as null. Questions such as "Azure data factory data flow silently NULLing date column" suggest that it is an issue with how ADF casts the value, and that I should intercept the value before that point and apply my own casting logic. Since the values are presented as null from the GET onwards, I can't see how I could intercept them before they are cast.
I know these values are non-null when I make this REST call in other environments (curl, Postman, etc.). Hoping someone has some insight. I would like to avoid processing the data outside of ADF as an intermediate step if possible.
Here is an example JSON body showing the format the date objects are returned in from any other environment:
"stats": {
"ticket_id": 100000549322,
"resolved_at": null,
"created_at": "2023-01-04T19:09:19Z",
"updated_at": "2023-01-04T19:09:22Z"
}
Under the Projection tab of the source transformation, you can set the default format for each data type.
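If the default format still doesn't match, a possible workaround (a sketch, not verified against your API) is to bring the column in as a string and convert it yourself in a Derived Column, so the cast happens under your control:

toTimestamp(created_at, 'yyyy-MM-dd\'T\'HH:mm:ss\'Z\'')

Here toTimestamp() is a data flow expression function; the format string assumes the ISO-8601 values shown above, and created_at is the sample column name.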
I have an ADF CopyRestToADLS activity which correctly saves a complex JSON object to Data Lake storage. But I additionally need to pass one of the JSON values (myextravalue) to a stored procedure. I tried referencing it in the stored procedure parameter as @{activity('CopyRestToADLS').output.myextravalue} but I am getting the error:
The actions 'CopyRestToADLS' referenced by 'inputs' in the action 'Execute Stored procedure1' are not defined in the template
{
"items": [1000 items],
"count": 1000,
"myextravalue": 15983444
}
I would like to reference this value dynamically, because the CopyRestToADLS source REST dataset dynamically calls different REST endpoints, so the structure of the JSON object is different each time. But myextravalue is always present in each JSON response.
How is it possible to reference myextravalue and use it as a parameter?
You could create another Lookup activity on the REST data source to get the JSON value, then pass it to the Stored Procedure activity.
Yes, it will create a new REST request, but it seems to be an easy way to achieve your purpose. The Lookup activity gets the content of the source and won't save it; see the sketch below.
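For example (a sketch; the activity name LookupMyValue is hypothetical), with the Lookup's "First row only" option enabled, the stored procedure parameter could be set to:

@{activity('LookupMyValue').output.firstRow.myextravalue}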
Another solution may be to get the value from the Copy activity's output file, after the Copy activity completes.
I'm glad you solved it this way:
"I created a Data Flow to read from the folder where Copy Activity saves dynamically named output json filenames. After importing schema from sample file, I selected the myextravalue as the only mapping in the Sink Mapping section."
I have an Azure Data Factory pipeline that fetches data from a third-party API and stores it in the data lake in .json format. When I click Import schema, it shows the correct data types.
When I set the above-mentioned data lake as the source of a data flow activity, the Int64 data type is converted to boolean. I checked the Microsoft documentation and learned that if a column contains only the values 0 and 1, it is automatically converted to boolean. How can I avoid this data type conversion?
First, verify whether you have set 'Infer drifted column types' to true under Source settings.
Data Factory detects the data type as boolean if the values in the source column are only 1 or 0. This may be a bug.
One way around it, since you are using a Data Flow, is to add derivations for the columns using a case statement and derive 1 and 0 in the output based on the boolean value, as sketched below.
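For example, a derived column expression along these lines (a sketch; the column name setNum is borrowed from the example further below):

case(setNum, toShort(1), toShort(0))

case() returns the second argument when the boolean setNum is true and the third when it is false.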
The easiest way is to just reset the whole schema to String, meaning don't convert the data types in the Source dataset.
For example, this is my source dataset schema and data; all the values in setNum are 1 or 0:
In the Data Flow source projection, the data type of setNum is at first considered boolean.
After resetting the schema, all the data types will be string.
Data Factory will then convert the data types at the Sink level. It is similar to copying data from a CSV file.
Update:
You can first reset the schema to String.
Then use a Derived Column to change/convert the data types as you want,
using expressions such as:
toShort()
toString()
This will solve the problem.
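For example (a sketch reusing the setNum column from above), after resetting the schema to String, a Derived Column can convert the value back:

setNum = toShort(setNum)

toShort() parses the string '0'/'1' into a short instead of a boolean; toString() works the same way in the other direction.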
I have an Azure Data Factory v2 pipeline that's pulling data from a Cosmos DB collection. This collection has a property that's an array.
I want to, at the least, be able to dump that entire property's value into a column in SQL Azure. I don't need it parsed (although that would be great too), but ADF lists this column as "Unsupported Type" in the dataset definition and puts it in the Excluded Columns section.
Here is an example of the JSON I'm working with. The property I want is "MyArrayProperty":
{
"id": "c4e2012e-af82-4c48-8960-11e0436e6d3f",
"Created": "2019-06-14T16:04:13.9572567Z",
"Updated": "2019-06-14T16:04:14.1920988Z",
"IsActive": true,
"MyArrayProperty": [
{
"SomeId": "a4427015-ca69-4958-90d3-0918fd5dcac1",
"SomeName": "BlahBlah"
}
]
}
}
I've tried manually specifying a column in the ADF data source like "MyArrayProperty" and using a string data type, but the value always comes across as null.
Please check this document about schema mapping between MongoDB and Azure SQL. Basically, you should define a collectionReference that will iterate through your nested array of objects and do a cross apply; a sketch follows.
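For illustration, a rough sketch of what the copy activity's translator with a collectionReference might look like for the JSON above (property paths follow the documented bracket syntax; adjust to your connector):

"translator": {
    "type": "TabularTranslator",
    "collectionReference": "$['MyArrayProperty']",
    "mappings": [
        { "source": { "path": "$['id']" }, "sink": { "name": "id" } },
        { "source": { "path": "['SomeId']" }, "sink": { "name": "SomeId" } },
        { "source": { "path": "['SomeName']" }, "sink": { "name": "SomeName" } }
    ]
}

Each element of MyArrayProperty becomes a row, cross-applied with the root-level id.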
There may be a better way to solve this problem, but I ended up creating a second copy activity which uses a query against Cosmos rather than a collection based capture. The query flattened the array like so:
SELECT m.id, c.SomeId, c.SomeName
FROM myCollection m JOIN c IN m.MyArrayProperty
I then took this data set and dumped it into a table in SQL then did my other work inside SQL Azure itself. You could also use the new Join pipeline task to do this in memory before it gets to the destination.
When I make an HTTPS GET call, the following JSON is returned:
{"-KXprfmbpX6dEqXLU1z_":{"Pizzatype":"Margarita"},"Toppings":{"-KXprfm_PBdOYUYiWzkK":["Onions","Mushrooms"]}}
I want to retrieve the values of Pizzatype and Toppings. However, I don't know the runtime-generated keys -KXprfmbpX6dEqXLU1z_ and -KXprfm_PBdOYUYiWzkK ahead of time. I am coding in an online bot-building platform in Node. How can I retrieve these values?
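One way (a minimal sketch in plain Node; body stands in for your parsed response object) is to enumerate the keys at runtime with Object.keys/Object.values:

const body = JSON.parse(response); // the JSON shown above

// The top-level pizza key is generated, so find it dynamically.
const pizzaKey = Object.keys(body).find(k => k !== 'Toppings');
const pizzaType = body[pizzaKey].Pizzatype; // "Margarita"

// Same idea for the generated key under Toppings.
const toppings = Object.values(body.Toppings)[0]; // ["Onions", "Mushrooms"]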
I'm using Spring Roo on top of MySQL. I pull dates out via the Roo-generated JSON methods, make changes to that data in a browser form, and save it back via the Roo-generated JSON methods.
The date format I'm getting out is yyyy-MM-dd, standard MySQL date format. I'm using a calendaring widget on the browser to ensure the date I'm submitting is the same format.
Unfortunately my data doesn't go right through the ...FromJson() method, failing with the error:
Parsing date 2007-12-12 was not recognized as a date format
I presume that the problem is that it's coming out as a string, but JPA feels like it needs to generate a Date object to update.
I'll happily show my code about this, but it's nothing Roo didn't build for me.
It occurs to me that there's something it's referring to when it says "recognized as a date format". Is there somewhere I can change what date formats it knows?
EDIT: With @nowaq's help, here's the ultimate answer:
public static Lease fromJsonToLease(String json) {
    return new JSONDeserializer<Lease>()
        // bind the JSON root to the Lease class
        .use(null, Lease.class)
        // parse every Date field in the class with this format
        .use(Date.class, new DateFormatter("yyyy-MM-dd"))
        .deserialize(json);
}
That way JSONDeserializer knows what class it's dealing with AND builds a formatter for all the dates in that class. Wicked!
Your question is closely related to this one: Spring-roo REST JSON controllers mangle date fields. Take a look and make sure you're using the correct DateTransformer(s) with your JSON deserializers. E.g.:
new JSONDeserializer()
    .use(Date.class, new DateTransformer("yyyy-MM-dd"))
    .deserialize(people);