How do I set a variable in Foundry's SQL Transforms? - palantir-foundry

Is there a way to set variables in Foundry's transforms-sql? I have a list of values that I reference multiple times in a query and would ideally specify just once.
Currently doing: SELECT * FROM foo WHERE param IN ('a', 'b', 'c')
Want to do something like: SET param_list = ('a', 'b', 'c') SELECT * FROM foo WHERE param IN #param_list

Unfortunately, SparkSQL as a language doesn't support variables yet, so your best alternative would be to rewrite this transform in Python. It will support dynamic queries, parameters, and all manner of more expressive queries.
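To give a rough idea of what "specify once" looks like on the Python side, here is a minimal sketch. It uses the standard-library sqlite3 module purely as a stand-in so that it runs anywhere; it is not the Foundry transforms API. The table foo and column param come from the question, while PARAM_LIST, in_clause and the amount column are hypothetical names made up for the illustration.

```python
import sqlite3

# Defined once, reused by every query below (hypothetical name).
PARAM_LIST = ("a", "b", "c")


def in_clause(column, values):
    """Return a 'column IN (?, ?, ...)' fragment with one placeholder per value."""
    placeholders = ", ".join("?" for _ in values)
    return f"{column} IN ({placeholders})"


# In-memory stand-in for the real dataset (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (param TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?)",
    [("a", 1), ("b", 2), ("x", 3), ("c", 4)],
)

# The same list drives two different queries without being retyped.
matching = conn.execute(
    f"SELECT param FROM foo WHERE {in_clause('param', PARAM_LIST)} ORDER BY param",
    PARAM_LIST,
).fetchall()
total = conn.execute(
    f"SELECT SUM(amount) FROM foo WHERE {in_clause('param', PARAM_LIST)}",
    PARAM_LIST,
).fetchone()[0]
```

The point is only that the list lives in one Python name; how the query is actually executed inside a transform will differ.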

Related

Equivalent Functions for JSON/B Operators in Postgresql

I've tried to dig through the documentation and can't seem to find anything for what I'm looking for. Are there equivalent functions for the various JSON/B operators (->, ->>, #>, ?, etc) in PostgreSQL?
Edit: To clarify, I would like to know if it is possible to have the following grouped queries return the same result:
SELECT '{"foo": "bar"}'::json->>'foo'; -- 'bar'
SELECT json_get_value('{"foo": "bar"}'::json, 'foo'); -- 'bar'
SELECT '{"foo": "bar"}'::jsonb ? 'foo'; -- t
SELECT jsonb_key_exists('{"foo": "bar"}'::jsonb, 'foo'); -- t
You can use the system catalogs to discover the function equivalent to each operator.
select * from pg_catalog.pg_operator where oprname ='?';
This shows that the function is named "jsonb_exists". Some operators are overloaded and will give more than one function; you have to look at the argument types to distinguish them.
Every operator has a function 'behind' it. That function may or may not be documented in its own right.
AFAIK there are two functions which are equivalent to the #> and #>> operators.
These are:
#> json_extract_path
#>> json_extract_path_text
->> json_extract_path_text -- credits: @a_horse_with_no_name
Discover more in the docs
Other than that, you could expand the JSON into a table using json_each or json_each_text and pick out the values you need with a regular SQL query.
Similarly, to check whether a key exists in a JSON value, you could use json_object_keys and query the set that comes out of it.
If you need to wrap things up in a different language or use an ORM, you could move the data retrieval logic into a PL/pgSQL function, execute it, and obtain the prepared data from it.
Obviously, you could also build your own functions that implement the behaviour of the aforementioned operators, but the question is: is it really worth it? It will definitely be slower.

Ecto query and custom MySQL function with variable arity

I want to perform a query like the following one:
SELECT id, name
FROM mytable
ORDER BY FIELD(name, 'B', 'A', 'D', 'E', 'C')
FIELD is a MySQL specific function, and 'B', 'A', 'D', 'E', 'C' are values coming from a List.
I tried using fragment, but it doesn't seem to allow dynamic arity known only in the runtime.
Except going full-raw using Ecto.Adapters.SQL.query, is there a way to handle this using Ecto's query DSL?
Edit: Here's the first, naive approach, which of course does not work:
ids = [2, 1, 3] # this list is of course created dynamically and does not always have three items
query = MyModel
|> where([a], a.id in ^ids)
|> order_by(fragment("FIELD(id, ?)", ^ids))
ORMs are wonderful, until they leak. All do, eventually. Ecto is young (for example, it only gained the ability to OR where clauses together 30 days ago), so it's simply not mature enough to have developed an API that considers advanced SQL gyrations.
Surveying possible options, you're not alone in the request. The inability to comprehend lists in fragments (whether as part of order_by or where or anywhere else) has been mentioned in Ecto issue #1485, on StackOverflow, on the Elixir Forum and this blog post. The latter is particularly instructive. More on that in a bit. First, let's try some experiments.
Experiment #1: One might first try using Kernel.apply/3 to pass the list to fragment, but that won't work:
|> order_by(Kernel.apply(Ecto.Query.Builder, :fragment, ^ids))
Experiment #2: Then perhaps we can build it with string manipulation. How about giving fragment a string built-at-runtime with enough placeholders for it to pull from the list:
|> order_by(fragment(Enum.join(["FIELD(id,", Enum.join(Enum.map(ids, fn _ -> "?" end), ","), ")"], ""), ^ids))
Which would produce FIELD(id,?,?,?) given ids = [1, 2, 3]. Nope, this doesn't work either.
Experiment #3: Creating the entire, final SQL built from the ids, placing the raw ID values directly in the composed string. Besides being horrible, it doesn't work, either:
|> order_by(fragment(Enum.join(["FIELD(id,", Enum.join(^ids, ","), ")"], "")))
Experiment #4: This brings me around to that blog post I mentioned. In it, the author hacks around the lack of or_where using a set of pre-defined macros based on the number of conditions to pull together:
defp orderby_fragment(query, [v1]) do
  from u in query, order_by: fragment("FIELD(id,?)", ^v1)
end
defp orderby_fragment(query, [v1,v2]) do
  from u in query, order_by: fragment("FIELD(id,?,?)", ^v1, ^v2)
end
defp orderby_fragment(query, [v1,v2,v3]) do
  from u in query, order_by: fragment("FIELD(id,?,?,?)", ^v1, ^v2, ^v3)
end
defp orderby_fragment(query, [v1,v2,v3,v4]) do
  # four values need four placeholders after id
  from u in query, order_by: fragment("FIELD(id,?,?,?,?)", ^v1, ^v2, ^v3, ^v4)
end
While this works and uses the ORM "with the grain" so to speak, it requires that you have a finite, manageable number of available fields. This may or may not be a game changer.
My recommendation: don't try to juggle around an ORM's leaks. You know the best query. If the ORM won't accept it, write it directly with raw SQL, and document why the ORM does not work. Shield it behind a function or module so you can reserve the future right to change its implementation. One day, when the ORM catches up, you can then just rewrite it nicely with no effects on the rest of the system.
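To make the raw-SQL route concrete, here is a language-neutral sketch of the idea, written in Python rather than Elixir: generate one placeholder per value at runtime and hand the values to the driver separately. The helper name field_order_query is hypothetical, and the %s placeholder style is an assumption matching what common Python MySQL drivers accept. The table and column names are interpolated directly, so they must come from trusted code, never from user input.

```python
def field_order_query(table, column, values):
    """Build a raw 'ORDER BY FIELD(...)' query with one placeholder per value.

    Returns (sql, params); the values themselves are never interpolated
    into the SQL text, only the number of placeholders depends on them.
    """
    if not values:
        raise ValueError("FIELD() needs at least one value to order by")
    placeholders = ", ".join(["%s"] * len(values))
    sql = (
        f"SELECT id, name FROM {table} "
        f"ORDER BY FIELD({column}, {placeholders})"
    )
    return sql, list(values)


sql, params = field_order_query("mytable", "name", ["B", "A", "D", "E", "C"])
```

Keeping this behind one small function is what lets you swap it for a proper DSL call later without touching the callers.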
Create a table with 2 columns:
B 1
A 2
D 3
E 4
C 5
Then JOIN LEFT(name, 1) to it and get the ordinal. Then sort by that.
(Sorry, I can't help with Elixir/Ecto/Arity.)
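For what it's worth, the lookup-table idea can be sketched end to end like this. It runs against an in-memory SQLite database instead of MySQL (an assumption made so the example is self-contained), so LEFT(name, 1) is spelled substr(name, 1, 1). The sort_order table name and the sample names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [("Alpha",), ("Bravo",), ("Charlie",), ("Delta",), ("Echo",)],
)

# Two-column lookup table: the key and its desired position.
conn.execute("CREATE TABLE sort_order (letter TEXT PRIMARY KEY, ordinal INTEGER)")
conn.executemany(
    "INSERT INTO sort_order VALUES (?, ?)",
    [("B", 1), ("A", 2), ("D", 3), ("E", 4), ("C", 5)],
)

# substr(name, 1, 1) is SQLite's spelling of MySQL's LEFT(name, 1).
rows = conn.execute(
    """
    SELECT m.name
    FROM mytable AS m
    JOIN sort_order AS s ON s.letter = substr(m.name, 1, 1)
    ORDER BY s.ordinal
    """
).fetchall()
names = [name for (name,) in rows]
```

Because the ordering lives in data rather than in the query text, the list can grow or shrink without the SQL changing.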
I would try to resolve this using the following SQL SELECT statement:
[Note: Don't have access right now to a system to check the correctness of the syntax, but I think it is OK]
SELECT A.MyID , A.MyName
FROM (
SELECT id AS MyID ,
name AS MyName ,
FIELD(name, 'B', 'A', 'D', 'E', 'C') AS Order_By_Field
FROM mytable
) A
ORDER BY A.Order_By_Field
;
Please note that the list 'B','A',... can be passed as either an array or any other method and replace what is written in the above code sample.
This was actually driving me crazy until I found that (at least in MySQL), there is a FIND_IN_SET function. The syntax is a bit weird, but it doesn't take variable arguments, so you should be able to do this:
ids = [2, 1, 3] # this list is of course created dynamically and does not always have three items
ids_string = Enum.join(ids, ",")
query = MyModel
|> where([a], a.id in ^ids)
|> order_by(fragment("FIND_IN_SET(id, ?)", ^ids_string))
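If it helps to see why a single string parameter is enough, here is a simplified Python emulation of what FIND_IN_SET computes: the 1-based position of a value in a comma-separated string, or 0 when it is absent. This is only a sketch of the semantics, not MySQL's implementation, and fetched_ids is a made-up stand-in for rows coming back in arbitrary order.

```python
def find_in_set(needle, csv):
    """Simplified emulation of MySQL's FIND_IN_SET.

    Returns the 1-based position of needle in the comma-separated
    string csv, or 0 if it does not occur.
    """
    items = csv.split(",")
    try:
        return items.index(str(needle)) + 1
    except ValueError:
        return 0


ids = [2, 1, 3]
ids_string = ",".join(str(i) for i in ids)

# Rows arrive in arbitrary order; sorting by position reproduces the list order.
fetched_ids = [1, 2, 3]
ordered = sorted(fetched_ids, key=lambda row_id: find_in_set(row_id, ids_string))
```

One caveat that follows from the string encoding: values that themselves contain commas cannot be represented this way.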

Searching for multiple values in 1 query

I have a table with two fields, Roll no and name, and a list (of n values) of roll numbers for which I need to look up the corresponding names.
Can this be done using just one query in SQL or HQL?
SELECT name FROM [table] WHERE id IN ([list of ids])
where [list of ids] is for example 2,3,5,7.
Use the IN operator and separate your Roll no's by a comma.
SELECT name
FROM yourtable
WHERE [Roll no] IN (1, 2, 3, 4, etc)
You can use the IN statement as shown above.
There are a couple of minor issues with this. It can perform poorly if the number of values in the clause gets too large.
The second issue is that in many development environments you end up needing to dynamically create the query with a variable number of items (or a variable number of placeholders if using parameterised queries). While not difficult, it does make your code look messy and means you haven't got a nice neat piece of SQL that you can copy out and use to test.
Some examples (using PHP) follow.
Here the IN is just dynamically created with the SQL. Assuming the roll numbers can only be integers it is applying intval() to each member of the array to avoid any non integer values being used in the SQL.
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
$sql = "SELECT * FROM some_table WHERE `Roll no` IN (".implode(", ", array_map ('intval', $list_of_roll_no)).")";
?>
Using mysqli bound parameters is a bit messy. This is because the bind parameter statement expects a variable number of parameters. The 2nd parameter onwards are the values to be bound, and it expects them to be passed by reference. So the foreach here is used to generate an array of references:-
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
// One "?" placeholder is generated per roll number.
if ($statement = $mysqli->prepare("SELECT * FROM some_table WHERE `Roll no` IN (".implode(",", array_fill(0, count($list_of_roll_no), '?')).")"))
{
    $bind_arguments = [];
    // Type string: one "i" (integer) per bound value.
    $bind_arguments[] = str_repeat("i", count($list_of_roll_no));
    foreach ($list_of_roll_no as $list_of_roll_no_key => $list_of_roll_no_value)
    {
        $bind_arguments[] = & $list_of_roll_no[$list_of_roll_no_key]; # bind to array ref, not to the temporary $list_of_roll_no_value
    }
    call_user_func_array(array($statement, 'bind_param'), $bind_arguments);
    $statement->execute();
}
?>
Another solution is to push all the values into another table. Can be a temp table. Then you use an INNER JOIN between your table and your temp table to find the matching values. Depending on what you already have in place then this is quite easy to do (eg, I have a php class to insert multiple records easily - I just keep passing them across and the class batches them up and inserts them occasionally to avoid repeatedly hitting the database).
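A minimal sketch of the temp-table approach, using Python's standard sqlite3 module as a stand-in database. The students and wanted_rolls table names and the sample rows are invented for the illustration; the batching class mentioned above is not reproduced here, executemany simply plays that role.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(1, "Asha"), (2, "Ben"), (3, "Chen"), (4, "Dara"), (5, "Eli")],
)

wanted = [2, 3, 5, 7]  # 7 has no matching row

# Load the search values into a temporary table in one batch.
conn.execute("CREATE TEMP TABLE wanted_rolls (roll_no INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_rolls VALUES (?)", [(r,) for r in wanted])

# The query text no longer depends on how many values are searched for.
rows = conn.execute(
    """
    SELECT s.roll_no, s.name
    FROM students AS s
    INNER JOIN wanted_rolls AS w ON w.roll_no = s.roll_no
    ORDER BY s.roll_no
    """
).fetchall()
```

This also gives you a fixed piece of SQL that can be copied out and tested on its own, which addresses the messiness complaint about dynamically built IN lists.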

Sort a rails query based on array order

I want to sort an ActiveRecord query based on the values in an array. Something like:
@fruits = Fruit.where(seeds: true)._________________________
Say I wanted to sort the results by color using the array ['Red','Blue','Yellow']
I see that SQL supports the use of a CASE statement for custom ordering; does Rails have something that utilizes this?
If you're using MySQL, you can use FIELD. It would look like:
@fruits = Fruit.where(seeds: true).order("FIELD(color, 'Red', 'Blue', 'Yellow')")
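If you are not on MySQL, the CASE approach mentioned in the question is portable. Here is a rough sketch of it in Python against an in-memory SQLite database (not Rails, and not the ActiveRecord API); the helper case_order_sql and the sample rows are made up. Values not in the list fall into the ELSE branch and sort last.

```python
import sqlite3


def case_order_sql(column, values):
    """Build 'CASE column WHEN ? THEN 0 ... ELSE n END' for a custom sort order."""
    whens = " ".join(f"WHEN ? THEN {pos}" for pos in range(len(values)))
    return f"CASE {column} {whens} ELSE {len(values)} END"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (name TEXT, color TEXT, seeds INTEGER)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [
        ("banana", "Yellow", 1),
        ("blueberry", "Blue", 1),
        ("apple", "Red", 1),
        ("lime", "Green", 1),
    ],
)

colors = ["Red", "Blue", "Yellow"]
# The column name is interpolated, so it must come from trusted code;
# the color values are bound as parameters.
rows = conn.execute(
    f"SELECT name FROM fruits WHERE seeds = 1 ORDER BY {case_order_sql('color', colors)}",
    colors,
).fetchall()
names = [name for (name,) in rows]
```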

Seeding SQLite RANDOM()

Does SQLite support seeding the RANDOM() function the same way MySQL does with RAND()?
$query = "SELECT * FROM table ORDER BY RAND(" . date('Ymd') . ") LIMIT 1;";
From the MySQL Manual about RAND(N):
If a constant integer argument N is specified, it is used as the seed value, which produces a repeatable sequence of column values. In the following example, note that the sequences of values produced by RAND(3) is the same both places where it occurs.
If not, is there any way to achieve the same effect using only one query?
Have a look at the sqlite3_randomness() function:
SQLite contains a high-quality pseudo-random number generator (PRNG) used to select random ROWIDs when inserting new records into a table that already uses the largest possible ROWID. The PRNG is also used for the built-in random() and randomblob() SQL functions.
...
The first time this routine is invoked (either internally or by the application) the PRNG is seeded using randomness obtained from the xRandomness method of the default sqlite3_vfs object. On all subsequent invocations, the pseudo-randomness is generated internally and without recourse to the sqlite3_vfs xRandomness method.
Looking at the source of this xRandomness method, you can see that it reads from /dev/urandom on Unix. On Windows, it just returns the return values of some time functions. So it seems that your only option is to start hacking on the SQLite source code.
If you need a pseudo-random order, you can do something like this (PHP):
$seed = md5(mt_rand());
$prng = ('0.' . str_replace(['0', 'a', 'b', 'c', 'd', 'e', 'f'], ['7', '3', '1', '5', '9', '8', '4'], $seed )) * 1;
$query = 'SELECT id, name FROM table ORDER BY (substr(id * ' . $prng . ', length(id) + 2))';
Plus, you can set $seed to a predefined value and always get the same results.
I've learned this trick from my colleague http://steamcooker.blogspot.com/
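A different way to get a repeatable order, if your host language can register SQL functions, is to sort by a hash of the seed and the row id. This is a sketch in Python using sqlite3's create_function, not the PHP trick above and not a built-in SQLite feature; the items table, the seeded_rank function name and the date-style seeds are assumptions for the example.

```python
import hashlib
import sqlite3


def seeded_order(conn, seed):
    """Return ids from table 'items' in a pseudo-random order fixed by seed."""

    def rank(row_id):
        # Hash seed and id together; equal seeds always give equal ranks.
        return hashlib.sha256(f"{seed}:{row_id}".encode()).hexdigest()

    # Register the Python function so SQL can call it as seeded_rank(id).
    conn.create_function("seeded_rank", 1, rank)
    rows = conn.execute("SELECT id FROM items ORDER BY seeded_rank(id)").fetchall()
    return [row_id for (row_id,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 21)])

first = seeded_order(conn, "20240101")
again = seeded_order(conn, "20240101")  # same seed, same order
other = seeded_order(conn, "20240102")  # different seed, different shuffle
```

Adding LIMIT 1 to the query would give the "one stable random row per day" behaviour from the question, at the cost of hashing every row.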