Saving JSON data to DB in Zotonic

I'm trying to write a small app that retrieves a JSON file (it contains a list of items, each with some properties), saves its contents to the DB, and then displays some of it later on. I have Zotonic up and running, and generating some HTML is no problem.
At the moment I'm stuck trying to figure out how to define a custom resource and how to get the data from the JSON into the DB. Once the data is there I should be fine; that part seems well covered by the documentation.
I wrote some standalone Erlang scripts that fetch the data, and I noticed that Zotonic has a library for decoding JSON, so that part should be fine. Any tips on where to put which code, or where to look further?

The z_db module allows for creating custom tables by using:
z_db:create_table(Table, Cols, Context).
The Table variable is your table name, which can be either an atom or a list containing a single atom.
Cols is a list of column definitions, each defined by a record. Currently the record definition (you can find it in include/zotonic.hrl) is:
-record(column_def, {name, type, length, is_nullable=true, default, primary_key}).
See the Erlang docs on records for more information.
Example code which I put in users/sites/[sitename]/models/m_[sitename].erl:
init(Context) ->
    %% ?table is a macro holding the table name, e.g. -define(table, tablename).
    case z_db:table_exists(?table, Context) of
        false ->
            z_db:create_table(?table,
                [
                    #column_def{name=id, type="serial"},
                    #column_def{name=gid, type="integer", is_nullable=false},
                    #column_def{name=magnitude, type="real"},
                    #column_def{name=depth, type="real"},
                    #column_def{name=location, type="character varying"},
                    #column_def{name=time, type="integer"},
                    #column_def{name=date, type="integer"}
                ], Context);
        true ->
            ok
    end,
    ok.
Pay attention to which options of the record you specify. Most of the errors I got came from, for example, specifying a length on the integer fields.
models/m_sitename:init/1 does not get called on site start, but sitename:init/1 does, so I call the model's init from there to ensure the table exists. Example:
init(Context) ->
    m_sitename:init(Context).
It is called by Zotonic with the site's Context variable automatically. You can also get this variable manually with z:c(sitename). So if you call m_sitename:init/1 from somewhere else you would do:
m_sitename:init(z:c(sitename)).
Next, insertion in the DB can be done with:
z_db:insert(Table, PropList, Context).
Where Table is again an atom or a list containing a single atom representing the table name, and Context is the same as above.
PropList is a property list: a list of two-element tuples, where the first element is an atom (here the column name) and the second is its associated value. Example:
PropList = [
    {row, Value},
    {anotherrow, AnotherValue}
].
Table = tablename.
Context = z:c(sitename).
z_db:insert(Table, PropList, Context).
See the Erlang docs on property lists for more information.
=== The dependencies have been updated so if you build from source the step directly below is no longer needed ===
The JSON part is a bit more tricky. Included with Zotonic are mochijson2 and, as a secondary dependency, jiffy. The latest version of jiffy contains jiffy:decode/2, which lets you specify maps as the return type. That is much more readable than the usual nested {struct, ...} monster. To update to the latest version, edit the line in deps/twerl/rebar.config that says
{jiffy, ".*", {git, "https://github.com/davisp/jiffy.git", {tag, "0.8.3"}}},
to
{jiffy, ".*", {git, "https://github.com/davisp/jiffy.git", {tag, "0.14.3"}}},
Now run z:m(). in the Zotonic shell (you must do this after every change in your code).
Now check in the Zotonic shell whether jiffy:decode/2 is available by typing jiffy: followed by <tab>; it will show a list of the available functions and their arities.
To retrieve a JSON file from the internet run:
{ok, {{_, 200, _}, _, Body}} = httpc:request(get, {"url-to-JSON-here", []}, [], [])
This will bind the variable Body to the contents. See the Erlang docs on the httpc module for more info on this call.
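Note that httpc needs the inets application running (and ssl for https URLs). Zotonic's shell normally has these started already, but in a bare erl shell you would, as a minimal sketch, first run:
%% Only needed outside Zotonic.
application:ensure_started(inets),
application:ensure_started(ssl).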
Next convert the contents of Body to Erlang terms with:
JsonData = jiffy:decode(Body, [return_maps]).
What you have to do next depends a lot on the structure of your JSON resource. Keep in mind that everything is now in binary UTF-8 encoded strings! If you print JsonData to the screen (just enter JsonData. in your Zotonic/Erlang shell) you will see a lot of entries like #{<<"key">> => <<"value">>}.
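As a quick illustration (with a made-up payload, not the actual feed), the decoder turns JSON objects into maps and JSON arrays into lists:
Json = <<"{\"features\":[{\"properties\":{\"mag\":4.5}}]}">>,
jiffy:decode(Json, [return_maps]).
%% => #{<<"features">> => [#{<<"properties">> => #{<<"mag">> => 4.5}}]}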
My data was nested, so I had to extract the needed part like this (ListData being the list pulled out of the decoded JsonData; the exact shape depends on your structure):
[{_, ItemList}|_] = ListData.
This gave me a list of maps, and in order to deal with them as individual items I used the following function:
%% Walk the list of items; each item is expected to be a map holding a
%% <<"properties">> key whose value is itself a map.
get_maps([]) ->
    done;
get_maps([First|Rest]) ->
    Map = maps:get(<<"properties">>, First),
    case is_map(Map) of
        true ->
            map_to_proplist(Map),
            get_maps(Rest);
        false ->
            done
    end;
get_maps(_) ->
    done.
As you might remember, the z_db:insert/3 function needs a property list to populate rows, and that is what the call to map_to_proplist/1 is for. How this function looks depends completely on your data, but as an example here is what worked for me:
map_to_proplist(Map) ->
    case is_map(Map) of
        true ->
            %% The decoded values are binaries, so convert them to lists
            %% first and then to the proper numeric types.
            {Value1, _} = string:to_integer(binary_to_list(maps:get(<<"key1">>, Map))),
            {Value2, _} = string:to_float(binary_to_list(maps:get(<<"key2">>, Map))),
            {Value3, _} = string:to_float(binary_to_list(maps:get(<<"key3">>, Map))),
            Value4 = binary_to_list(maps:get(<<"key4">>, Map)),
            {Value5, _} = string:to_integer(binary_to_list(maps:get(<<"key5">>, Map))),
            {Value6, _} = string:to_integer(binary_to_list(maps:get(<<"key6">>, Map))),
            PropList = [{rowname1, Value1}, {rowname2, Value2}, {rowname3, Value3},
                        {rowname4, Value4}, {rowname5, Value5}, {rowname6, Value6}],
            m_sitename:insert_items(PropList, z:c(sitename)),
            ok;
        false ->
            ok
    end.
See the documentation on string:to_integer/1 and string:to_float/1 as to why the tuples are needed when converting. The call to m_sitename:insert_items(PropList, z:c(sitename)) calls z_db:insert/3 in models/m_sitename.erl, but wrapped in a catch:
insert_items(PropList, Context) ->
    (catch z_db:insert(?table, PropList, Context)).
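Because of the catch, a failing insert comes back as an {'EXIT', Reason} tuple instead of crashing the caller, so callers can pattern-match on the result. A small sketch (assuming z_db:insert/3 returns {ok, Id} on success):
case m_sitename:insert_items(PropList, Context) of
    {ok, _Id} -> ok;
    {'EXIT', Reason} -> io:format("insert failed: ~p~n", [Reason])
end.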
Ok, quite a long post but this should get you up and running if you were looking for this answer.
The above was done with Zotonic 0.13.2 on Erlang/OTP 18.
A repost (except the JSON part) of my post in the Zotonic Developers group.

Related

Deserialize JSON without knowing full structure

I'm redoing the backend of a very basic framework that connects to a completely customizable frontend. It was originally in PHP, but for the refactor I have been plodding away in F#, although it seems like PHP might be the more suited language. But people keep telling me you can do everything in F#, and I like the syntax and need to learn, and this seemingly simple project has me stumped when it comes to JSON. This is a further fleshed-out version of my question yesterday, but it got a lot more complex than I thought.
Here goes.
The frontend is basically a collection of HTML files, which are simply loaded in PHP, and preg_replace() is used to replace things like [var: varName] or [var: array|key] or the troublesome one: [lang: hello]. That needs to be replaced by a variable defined in a translation dictionary, which is stored as JSON and is also editable by a non-programmer.
I can't change the frontend or the JSON files, and both are designed to be edited by non-programmers so it is very likely that there will be errors, calls to language variables that don't exist etc.
So we might have 2 json files, english.json and french.json
english.json contains:
{
    "hello": "Hello",
    "bye": "Goodbye"
}
french.json:
{
    "hello": "Bonjour",
    "duck": "Canard"
    //Plus users can add whatever else they want here and expect to be able to use it in a template
}
There is a template that contains
<b>[lang: hello]</b>
<span>Favourite Animal: [lang:duck]</span>
In this case, if the language is set to "english" and english.json is being loaded, that should read:
<b>Hello</b>
<span>Favourite Animal: </span>
Or in French:
<b>Bonjour</b>
<span>Favourite Animal: Canard</span>
We can assume that the JSON format key: value is always string:string, but ideally I'd like to handle string: 'T as well, though that might be beyond the scope of this question.
So I need to convert a JSON file (loaded by a dynamic name, which gave FSharp.Data an issue I couldn't solve last night, since a type provider only accepts a static filename as its sample, and these two files can differ from the sample provided, so the type provider doesn't work) to a dictionary or some other collection.
Now inside the template parsing function I need to replace [lang: hello] with something like
let key = "duck"
(*Magic function to convert JSON to usable collection*)
let languageString = convertedJSONCollection.[key] (*And obviously check if containsKey first*)
Which means I need to call the key dynamically, and I couldn't figure out how to do that with the type that FSharp.Data provided.
I have played around with Thoth as well, with some promising results that ended up going nowhere. I avoided JSON.NET because I thought it was paid, but I just realised I was mistaken there, so that might be an avenue to explore.
For comparison, the PHP function looks something like this:
function loadLanguage($lang = 'english') {
    $json = file_get_contents("$lang.json");
    return json_decode($json, true);
}
$key = 'duck';
$langVars = loadLanguage();
$duck = $langVars[$key] || "";
Is there a clean way to do this in F#/.NET? JSON seems really painful to work with compared to PHP/JavaScript, and I'm starting to lose my mind. Am I going to have to write my own parser (which means probably going back to PHP)?
Cheers to all you F# geniuses who know the answer :p
open Thoth.Json.Net

let deserialiseDictionary (s: string) =
    s
    |> Decode.unsafeFromString (Decode.keyValuePairs Decode.string)
    |> Map.ofList

let printDictionary json =
    json
    |> deserialiseDictionary
    |> fun m -> printfn "%s" m.["hello"] // Hello
For the question about 'T, the question becomes: what can 'T be? For JSON it is very limited; it can be one of a number of things: string, JSON object, number, bool, or JSON array. What should happen if it is a bool or a number?
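Since users may reference keys that don't exist, a safe lookup that falls back to an empty string (mirroring the intent of the PHP $langVars[$key] || "") could look like the sketch below; loadLanguage and lookup are illustrative names, not part of Thoth:
open Thoth.Json.Net

// Decode a JSON object of string keys/values into a Map; here a decode
// failure deliberately yields an empty map instead of throwing.
let loadLanguage (json: string) : Map<string, string> =
    match Decode.fromString (Decode.keyValuePairs Decode.string) json with
    | Ok pairs -> Map.ofList pairs
    | Error _ -> Map.empty

// Look up a key, falling back to "" so a broken template renders an
// empty string rather than crashing.
let lookup (key: string) (langVars: Map<string, string>) =
    langVars |> Map.tryFind key |> Option.defaultValue ""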

Symfony4 Extension->processConfiguration merges Parameters strangely

I have created a Symfony bundle using the DependencyInjection mechanism that Symfony4 provides.
When my custom Extension is initialized, the provided parameters from my_extension.yaml get piped into my load method as expected, as described in the Symfony docs (https://symfony.com/doc/current/bundles/configuration.html).
However, if I add an additional yaml file in the designated package scope (e.g. config/packages/dev/my_extension.yaml), these parameters end up being merged by the built-in processConfiguration() method in a strange way:
Whereas the first config is parsed correctly and all parameter keys are kept intact, the second file is not merged as expected; all the contained values get transformed into numeric array keys, i.e. the original parameter keys get lost on the way.
Example:
Contents of config/packages/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "original_value1"
        some_other_attribute: "original_value2"
Contents of config/packages/dev/my_extension.yaml
my_extension:
    parameters:
        some_attribute: "new_value1"
        specific_attribute: "new_value3"
This results in a merged configuration array that looks like this:
parameters:
    some_attribute: "new_value1"
    some_other_attribute: "new_value2"
    0: "new_value1"
    1: "new_value2"
    2: "new_value3"
while I would expect the resulting configuration to be
parameters:
    some_attribute: "new_value1"
    some_other_attribute: "original_value2"
    specific_attribute: "new_value3"
The last (correct) result is what I get if I manually merge the configs in the load method of my extension like this:
$mergedConfig = [];
foreach ($configs as $config) {
    $mergedConfig = array_replace_recursive($mergedConfig, $config);
}
$config = $this->processConfiguration($configuration, [$mergedConfig]);
However, why can't I rely on the built-in merging strategy that Symfony4 provides for this scenario? Is this a bug, or did I get something wrong about how Symfony is supposed to merge parameters from different config sources?
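For what it's worth, this behaviour usually traces back to the bundle's Configuration tree rather than to a Symfony bug: a prototyped array node without useAttributeAsKey() is merged by appending values, which produces exactly those numeric keys. A sketch of a tree definition that should merge by key (Symfony 4.2+ TreeBuilder style, node names taken from the example above):
// In Configuration::getConfigTreeBuilder()
$treeBuilder = new TreeBuilder('my_extension');
$treeBuilder->getRootNode()
    ->children()
        ->arrayNode('parameters')
            // Preserve the yaml keys when configs from several files are merged.
            ->useAttributeAsKey('name')
            ->scalarPrototype()->end()
        ->end()
    ->end();
return $treeBuilder;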

Can not query nodes but can see all the nodes and properties in the data browser

I have imported a few nodes with properties via the Cypher CSV import (command below), and the nodes seem to have loaded correctly, as I can view them in the REST API (the data browser). When I execute a MATCH (n) RETURN n query, all of the nodes are displayed in the Results pane, and when I click on one of the nodes its properties are displayed in the left pane of the browser (I would attach a screenshot showing what I am referring to here, which would make this issue a lot clearer, but apparently us neophytes are prohibited from providing such useful information).
However, when I try to query any of the nodes directly, I get no rows returned. By "query the nodes directly" I mean querying with a WHERE condition that asks for a specific property:
MATCH (n)
WHERE n:Type="Idea"
RETURN n
Type is one of the properties of the node. No rows are returned from the query. I can click on the node in the Stream pane to open the properties dialog, and I can see the Type property is clearly "Idea".
Am I missing something? The nodes and properties seem to have loaded into the DB correctly, but I can't seem to query anything. Is "ID" a restricted term? Do I even need an "ID" property (I thought I read somewhere you shouldn't trust the auto-generated IDs, as they aren't guaranteed to be unique over time)?
Import statement used to load the nodes is below:
$ auto-index name, ID
$ import-cypher -i ProjectNodesCSV.csv -o ProjectOut.csv CREATE (n:Project {ID:{ID},Name: {Name}, Type: {Type}, ProjectGroupName: {ProjectGroupName}, ProjectCategoryName: {ProjectCategoryName}, UnifierID: {UnifierID}, StartDate: {StartDate}, EndDate: {EndDate}, CapitalCosts: {CapitalCosts}, OandMCosts: {OandMCosts}}) RETURN ID(n) as ID, n.Name as Name
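One thing worth checking in the query above: in Cypher, n:Type tests whether the node has the label Type, while reading a property uses dot notation. So a query against the Type property (as set in the CREATE above) would be written:
MATCH (n)
WHERE n.Type = "Idea"
RETURN n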

Exceptions in Yesod

I had made a daemon that used a very primitive form of IPC (telnet and send a string that had certain words in a certain order). I snapped out of it and am now using JSON to pass messages to a Yesod server. However, there were some things I really liked about my design, and I'm not sure what my choices are now.
Here's what I was doing:
buildManager :: Phase -> IO ()
buildManager phase = do
    let buildSeq = findSeq phase
        jid      = JobID $ pack "8"
        config   = MkConfig $ Just jid
    flip C.catch exceptionHandler $
        runReaderT (sequence_ $ buildSeq <*> stages) config
    -- ^^ I would really like to keep the above line of code, or something like it.
    return ()
Each function in buildSeq looked like this:
foo :: Stage -> ReaderT Config IO ()
data Config = MkConfig (Either JobID Product) BaseDir JobMap
JobMap is a TMVar Map that tracks information about current jobs.
So now what I have are Handlers, which all look like this:
foo :: Handler RepJson
foo represents a command for my daemon; each handler may have to process a different JSON object.
What I would like to do is send one JSON object that represents success, and another JSON object that expresses information about some exception.
I would like foo's helper function to be able to return an Either, but I'm not sure how to get that, plus the ability to terminate evaluation of my list of actions, buildSeq.
Here's the only choice I see:
1) Make sure exceptionHandler is in Handler. Put JobMap in the App record. Using getYesod, alter the appropriate value in JobMap to indicate details about the exception, which can then be accessed by foo.
Is there a better way?
What are my other choices?
Edit: For clarity, I will explain the role of Handler RepJson. The server needs some way to accept commands such as build, stop, report. The client needs some way of knowing the results of these commands. I have chosen JSON as the medium with which the server and client communicate with each other. I'm using the Handler type just to manage the JSON in/out and nothing more.
Philosophically speaking, in the Haskell/Yesod world you want to pass the values forward, rather than return them backwards. So instead of having the handlers return a value, have them call forwards to the next step in the process, which may be to generate an exception.
Remember that you can bundle any amount of future actions into a single object, so you can pass a continuation object to your handlers and foos that basically tells them, "After you are done, run this blob of code." That way they can be void and return nothing.
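A minimal sketch of that continuation-passing idea (the names and types are illustrative, not taken from the question's code):
import Control.Exception (SomeException, try)

-- Each step receives "what to do next" instead of returning a result
-- backwards; failure runs a different continuation.
runStage :: IO () -> IO () -> IO () -> IO ()
runStage stage onSuccess onFailure = do
    result <- try stage :: IO (Either SomeException ())
    case result of
        Right () -> onSuccess   -- e.g. run the remaining buildSeq actions
        Left  _  -> onFailure   -- e.g. respond with the error JSON
A handler can then pass both a success continuation and an error responder forward, so the handler itself never needs to hand an Either back.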

What is the simple way to find the column name from Lineageid in SSIS

What is the simple way to find the column name from the LineageID in SSIS? Is there any system variable available?
I remember saying "this can't be that hard; I can write some script in the error redirect to look up the column name from the input collection."
string badColumn = this.ComponentMetaData.InputCollection[Row.ErrorColumn].Name;
What I learned was that the failing column isn't in that collection. Well, it is, but the ErrorColumn reported is not quite what I needed. I couldn't find that package, but here's an example of why I couldn't get what I needed. Hopefully you will have better luck.
This is a simple data flow that will generate an error once it hits the derived column due to division by zero. The Derived column generates a new output column (LookAtMe) as the result of the division. The data viewer on the Error Output tells me the failing column is 73. Using the above script logic, if I attempted to access column 73 in the input collection, it's going to fail because that is not in the collection. LineageID 73 is LookAtMe and LookAtMe is not in my error branch, it's only in the non-error branch.
This is a copy of my XML, and you can see that, yes, outputColumn id 73 is LookAtMe.
<outputColumn id="73" name="LookAtMe" description="" lineageId="73" precision="0" scale="0" length="0" dataType="i4" codePage="0" sortKeyPosition="0" comparisonFlags="0" specialFlags="0" errorOrTruncationOperation="Computation" errorRowDisposition="RedirectRow" truncationRowDisposition="RedirectRow" externalMetadataColumnId="0" mappedColumnId="0"><properties>
I really wanted that data though, and I'm clever, so I figured I could union all my results back together and then conditionally split them back out to get it. The problem is, Union All is an asynchronous transformation. Async transformations result in the data being copied from one set of buffers to another, resulting in new lineage IDs being assigned. So even with a union all bringing the two streams back together, you wouldn't be able to call up the data flow chain to find that original lineage ID, because it's in a different buffer.
Around this point, I conceded defeat and decided I could live without intelligent/helpful error reporting in my packages.
I know this is a long-dead thread, but I tripped across a manual solution to this problem and thought I would share it for anyone who happens upon the same problem. Granted, this doesn't provide a programmatic solution, but for simple debugging it should do the trick. The solution uses a Derived Column as an example, but this seems to work for any Data Flow component.
Answer provided by Todd McDermid and taken from AskSQLServerCentral:
"[...] Unfortunately, the lineage ID of your columns is pretty well hidden inside SSIS. It's the "key" that SSIS uses to identify columns. So, in order to figure out which column it was, you need to open the Advanced Editor of the Derived Column component or Data Conversion. Do that by right clicking and selecting "Advanced Editor". Go to the "Input and Output Properties" tab. Open the first node - "Derived Column Input" or "Data Conversion Input". Open the "Input Columns" tab. Click through the columns, noting the "LineageID" property of each. You may have to do the same with the "Derived Column Output" node, and "Output Columns" inside there. The column that matches your recorded lineage ID is the offending column."
For anyone using SQL Server versions before SS2016, here are a couple of reference links for a way to get the Column name:
http://www.andrewleesmith.co.uk/2017/02/24/finding-the-column-name-of-an-ssis-error-output-error-column-id/
which is based on:
http://toddmcdermid.blogspot.com/2016/04/finding-column-name-for-errorcolumn.html
I appreciate we aren't supposed to just post links, but this solution is quite convoluted, so I've tried to summarise it by pulling info from both Todd's and Andrew's blog posts and recreating them here. (Thank you to both if you ever read this!)
From Todd's page:
Go to the "Inputs and Outputs" page, and select the "Output 0" node.
Change the "SynchronousInputID" property to "None". (This changes
the script from synchronous to asynchronous.)
On the same page, open the "Output 0" node and select the "Output
Columns" folder. Press the "Add Column" button. Change the "Name"
property of this new column to "LineageID".
Press the "Add Column" button again, and change the "DataType"
property to "Unicode string [DT_WSTR]", and change the "Name"
property to "ColumnName".
Go to the "Script" page, and press the "Edit Script" button. Copy
and paste this code into the ScriptMain class (you can delete all
other method stubs):
public override void CreateNewOutputRows()
{
    IDTSInput100 input = this.ComponentMetaData.InputCollection[0];
    if (input != null)
    {
        IDTSVirtualInput100 vInput = input.GetVirtualInput();
        if (vInput != null)
        {
            // Emit one row per column visible on the virtual input,
            // whether or not the column was selected for the script.
            foreach (IDTSVirtualInputColumn100 vInputColumn in vInput.VirtualInputColumnCollection)
            {
                Output0Buffer.AddRow();
                Output0Buffer.LineageID = vInputColumn.LineageID;
                Output0Buffer.ColumnName = vInputColumn.Name;
            }
        }
    }
}
Feel free to attach a dummy output to that script, with a data viewer, and see what you get. From here, it's "standard engineering" for you ETL gurus. Simply merge join the error output of the failing component with this metadata, and you'll be able to transform the ErrorColumn number into a meaningful column name.
But for those of you who do want to understand what the above script is doing:
It's getting the "first" (and only) input attached to the script component.
It's getting the virtual input related to the input. The "input" is what the script can actually "see" on the input - and since we didn't mark any columns as being "ReadOnly" or "ReadWrite"... that means the input has NO columns. However, the "virtual input" has the complete list of every column that exists, whether or not we've said we're "using" it.
We then loop over all of the "virtual columns" on this virtual input, and for each one...
Get the LineageID and column name, and push them out as a new row on our asynchronous script.
The image and text from Andrew's page help explain it in a bit more detail:
This map is then merge-joined with the ErrorColumn lineage ID(s) coming down the error path, so that the error information can be appended with the column name(s) from the map. I included a second script component that looks up the error description from the error code, so the error table rows that we see above contain both column names and error descriptions.
The remaining component that needs explaining is the conditional split – this exists just to provide metadata to the script component that creates the map. I created an expression (1 == 0) that always evaluates to false for the “No Rows – Metadata Only” path, so no rows ever travel down it.
Whilst this solution does require the insertion of some additional plumbing within the data flow, we get extremely valuable information logged when errors do occur. So especially when the data flow is running unattended in Production – when we don’t have the tools & techniques available at design time to figure out what’s going wrong – the logging that results gives us much more precise information about what went wrong and why, compared to simply giving us the failed data and leaving us to figure out why it was rejected.
There is no simple way to find out a column name by lineage ID.
If you want to know this using BIDS, you have to inspect all components inside the data flow using the Advanced Editor's Input and Output Properties tab and check the LineageID of each column on every input/output path.
But you can:
inspect the XML - this is very difficult
write a .NET application and use FindColumnByLineageId
However, the second solution involves a lot of coding and an understanding of the pipeline, because you have to programmatically open the package, iterate over tasks, iterate inside containers, and iterate over transformations inside data flows to find the particular component before you can use the proposed method.
Here is a solution that:
Works at package runtime (not pre-populating)
Is automated through a Script Task and Component
Doesn't involve installing new assemblies or custom components
Is nicely BIML compatible
Check out the full solution here.
EDIT
Here is the short version.
Create 2 Object variables, execsObj and lineageIds
Create Script Task in Control flow, give it ReadWrite access to both variables
Add an assembly reference in your Script Task to Microsoft.SqlServer.DTSPipelineWrap.dll (this may require the SQL Client SDK to be installed; it is needed for the MainPipe object below)
Insert the following code into your Script Task
Dictionary<int, string> lineageIds = null;

public void Main()
{
    // Grab the executables so we have something to iterate over, and initialize our lineageIds dictionary
    // Why the executables? Well, SSIS won't let us store a reference to the Package itself...
    Dts.Variables["User::execsObj"].Value = ((Package)Dts.Variables["User::execsObj"].Parent).Executables;
    Dts.Variables["User::lineageIds"].Value = new Dictionary<int, string>();
    lineageIds = (Dictionary<int, string>)Dts.Variables["User::lineageIds"].Value;
    Executables execs = (Executables)Dts.Variables["User::execsObj"].Value;

    ReadExecutables(execs);

    Dts.TaskResult = (int)ScriptResults.Success;
}
private void ReadExecutables(Executables executables)
{
    foreach (Executable pkgExecutable in executables)
    {
        if (object.ReferenceEquals(pkgExecutable.GetType(), typeof(Microsoft.SqlServer.Dts.Runtime.TaskHost)))
        {
            TaskHost pkgExecTaskHost = (TaskHost)pkgExecutable;
            if (pkgExecTaskHost.CreationName.StartsWith("SSIS.Pipeline"))
            {
                ProcessDataFlowTask(pkgExecTaskHost);
            }
        }
        else if (object.ReferenceEquals(pkgExecutable.GetType(), typeof(Microsoft.SqlServer.Dts.Runtime.ForEachLoop)))
        {
            // Recurse into FELCs
            ReadExecutables(((ForEachLoop)pkgExecutable).Executables);
        }
    }
}
private void ProcessDataFlowTask(TaskHost currentDataFlowTask)
{
    MainPipe currentDataFlow = (MainPipe)currentDataFlowTask.InnerObject;
    foreach (IDTSComponentMetaData100 currentComponent in currentDataFlow.ComponentMetaDataCollection)
    {
        // Get the inputs in the component.
        foreach (IDTSInput100 currentInput in currentComponent.InputCollection)
            foreach (IDTSInputColumn100 currentInputColumn in currentInput.InputColumnCollection)
                lineageIds.Add(currentInputColumn.ID, currentInputColumn.Name);

        // Get the outputs in the component.
        foreach (IDTSOutput100 currentOutput in currentComponent.OutputCollection)
            foreach (IDTSOutputColumn100 currentoutputColumn in currentOutput.OutputColumnCollection)
                lineageIds.Add(currentoutputColumn.ID, currentoutputColumn.Name);
    }
}
Create a Script Component in the Data Flow with ReadOnly access to lineageIds, containing the following code:
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    Dictionary<int, string> lineageIds = (Dictionary<int, string>)Variables.lineageIds;

    int? colNum = Row.ErrorColumn;
    if (colNum.HasValue && (lineageIds != null))
    {
        if (lineageIds.ContainsKey(colNum.Value))
            Row.ErrorColumnName = lineageIds[colNum.Value];
        else
            Row.ErrorColumnName = "Row error";
    }
    Row.ErrorDescription = this.ComponentMetaData.GetErrorDescription(Row.ErrorCode);
}
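One wiring note: for the above to compile, the error-path Script Component's output needs the ErrorColumnName and ErrorDescription string columns added on its "Inputs and Outputs" page, since the Row.ErrorColumnName and Row.ErrorDescription properties are generated from those output column definitions.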