I am working on a project that involves code in both Prolog and SQL to solve the same problem. A problem I've run across is that I can't use a single database to form a hierarchy. In this list of Prolog facts you can see that the "assembly" parts are related to each other.
basicpart(spoke).
basicpart(rearframe).
basicpart(handles).
basicpart(gears).
basicpart(bolt).
basicpart(nut).
basicpart(fork).
assembly(bike,[wheel,wheel,frame]).
assembly(wheel,[spoke,rim,hub]).
assembly(frame,[rearframe,frontframe]).
assembly(frontframe,[fork,handles]).
assembly(hub,[gears,axle]).
assembly(axle,[bolt,nut]).
If I put all of these "assembly" definitions into one SQL database, can I use "knight moves" (joining a table to itself on two different columns) to build this hierarchy in SQL with only two tables?
If I understand the question correctly (I'm not familiar with the term "knight moves"): you cannot construct your bike with an ordinary one-shot query. You can do it with a single query, but it must be a recursive SQL query, because you will be computing the transitive closure of the part-subpart relationship.
Unfortunately I don't immediately know how to write these. SQL syntax is frankly abysmal and recursive SQL looks even abysmaller, so the main example below uses a loop instead (though a rough sketch of the recursive form follows for reference).
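A minimal sketch of that recursive query, assuming MariaDB 10.2+ or MySQL 8.0+ and the assembly table created further down (untested; the CASTs fix the column types of the recursive result so the path does not get truncated):

WITH RECURSIVE pieces (part, subpart, path, depth) AS (
  SELECT CAST('ROOT' AS CHAR(10)),
         CAST('bike' AS CHAR(10)),
         CAST('/bike' AS CHAR(500)),
         0
  UNION ALL
  SELECT p.subpart,
         a.subpart,
         CONCAT(p.path, '/', a.subpart),
         p.depth + 1
  FROM pieces p
  JOIN assembly a ON a.part = p.subpart
)
SELECT * FROM pieces ORDER BY depth, path;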
You actually need only one table to represent the data, as the basicpart/1 relation does not bring anything to the table except labeling certain "things" as basic. But those are exactly the things that never appear in the first position of assembly/2, so they can be derived.
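For illustration, the basic parts could then be derived with a query along these lines (a sketch against the assembly table created below):

-- Basic parts are exactly the subparts that never occur as a part themselves.
SELECT DISTINCT subpart AS basicpart
FROM assembly
WHERE subpart NOT IN (SELECT part FROM assembly);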
Notes:
Not using ENUMs, which are not really "types" in MySQL/MariaDB but just a constraint on a field of a specific table. (Like, WTF!)
The multiset representation of the Prolog code ("a bike has two wheels") is flattened into multiple rows separately identified by a numeric surrogate id. This is due to the "First Normal Form" dogma of RDBMS practice. There is a priori nothing wrong with having multisets as values, if the query language and the RDBMS engine can support it. For example, you can have XML values in PostgreSQL, complete with queries over their content, as I remember [1].
DELIMITER //

DROP PROCEDURE IF EXISTS prepare //

CREATE PROCEDURE prepare()
BEGIN
   DROP TABLE IF EXISTS assembly;
   CREATE TABLE assembly
      (id      INT AUTO_INCREMENT KEY,   -- surrogate key because a bike may have several wheels
       part    VARCHAR(10) NOT NULL,
       subpart VARCHAR(10) NOT NULL);
   INSERT INTO assembly(part,subpart) VALUES
      ("bike","wheel"),
      ("bike","wheel"),
      ("bike","frame"),
      ("wheel","spoke"),
      ("wheel","rim"),
      ("wheel","hub"),
      ("frame","rearframe"),
      ("frame","frontframe"),
      ("frontframe","fork"),
      ("frontframe","handles"),
      ("hub","gears"),
      ("hub","axle"),
      ("axle","bolt"),
      ("axle","nut");
END //

DROP PROCEDURE IF EXISTS compute_transitive_closure //

CREATE PROCEDURE compute_transitive_closure()
BEGIN
   DROP TABLE IF EXISTS pieces;
   CREATE TABLE pieces
      (id      INT AUTO_INCREMENT KEY,
       part    VARCHAR(10) NOT NULL,
       subpart VARCHAR(10) NOT NULL,
       path    VARCHAR(500) NOT NULL DEFAULT "",
       depth   INT NOT NULL DEFAULT 0);
   -- seed the walk with the root assembly
   INSERT INTO pieces(part,subpart,path,depth) VALUES
      ("ROOT","bike","/bike",0);
   SET @depth = 0;
   l: LOOP
      -- expand every node found at the current depth by one more level
      INSERT INTO pieces(part,subpart,path,depth)
      SELECT
         p.subpart,
         a.subpart,
         CONCAT(p.path,'/',a.subpart),
         @depth+1
      FROM
         pieces p,
         assembly a
      WHERE
         p.depth = @depth AND p.subpart = a.part;
      IF ROW_COUNT() <= 0 THEN
         LEAVE l;
      ELSE
         SELECT * FROM pieces;
      END IF;
      SET @depth = @depth+1;
   END LOOP;
END //
DELIMITER ;
Put the above into a file SQL.txt, and then, in a database testme:
MariaDB [testme]> source SQL.txt;
MariaDB [testme]> CALL prepare;
MariaDB [testme]> CALL compute_transitive_closure;
Then, after 4 passes through the loop, you get:
+----+------------+------------+--------------------------------+-------+
| id | part       | subpart    | path                           | depth |
+----+------------+------------+--------------------------------+-------+
|  1 | ROOT       | bike       | /bike                          |     0 |
|  2 | bike       | wheel      | /bike/wheel                    |     1 |
|  3 | bike       | wheel      | /bike/wheel                    |     1 |
|  4 | bike       | frame      | /bike/frame                    |     1 |
|  5 | wheel      | spoke      | /bike/wheel/spoke              |     2 |
|  6 | wheel      | spoke      | /bike/wheel/spoke              |     2 |
|  7 | wheel      | rim        | /bike/wheel/rim                |     2 |
|  8 | wheel      | rim        | /bike/wheel/rim                |     2 |
|  9 | wheel      | hub        | /bike/wheel/hub                |     2 |
| 10 | wheel      | hub        | /bike/wheel/hub                |     2 |
| 11 | frame      | rearframe  | /bike/frame/rearframe          |     2 |
| 12 | frame      | frontframe | /bike/frame/frontframe         |     2 |
| 20 | frontframe | fork       | /bike/frame/frontframe/fork    |     3 |
| 21 | frontframe | handles    | /bike/frame/frontframe/handles |     3 |
| 22 | hub        | gears      | /bike/wheel/hub/gears          |     3 |
| 23 | hub        | gears      | /bike/wheel/hub/gears          |     3 |
| 24 | hub        | axle       | /bike/wheel/hub/axle           |     3 |
| 25 | hub        | axle       | /bike/wheel/hub/axle           |     3 |
| 27 | axle       | bolt       | /bike/wheel/hub/axle/bolt      |     4 |
| 28 | axle       | nut        | /bike/wheel/hub/axle/nut       |     4 |
| 29 | axle       | bolt       | /bike/wheel/hub/axle/bolt      |     4 |
| 30 | axle       | nut        | /bike/wheel/hub/axle/nut       |     4 |
+----+------------+------------+--------------------------------+-------+
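Once the closure is materialized you can query it like any ordinary table. For instance, a small follow-up query (it only reuses the tables built above) that counts how many of each basic part end up in the bike:

-- Basic parts are rows whose subpart never appears as a part in assembly.
SELECT p.subpart AS basicpart, COUNT(*) AS quantity
FROM pieces p
WHERE p.subpart NOT IN (SELECT part FROM assembly)
GROUP BY p.subpart;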
[1] This made me dig out "Database in Depth: Relational Theory for Practitioners" (O'Reilly, 2005) by Chris Date, an excellent introduction to the relational model. On page 30, Date considers "sets as values" (but does not consider "multisets"):
Second (and regardless of what you might think of my first argument),
the fact is that a set like {P2,P4,P5} is no more and no less
decomposable by the DBMS than a character string is. Like character
strings, sets do have some inner structure; as with character
strings, however, it's convenient to ignore that structure for certain
purposes. In other words, if a character string is compatible with the
requirements of 1NF - that is, if character strings are atomic - then
sets must be, too. The real point I'm getting at here is that the
notion of atomicity has no absolute meaning; it just depends on what
we want to do with the data. Sometimes we want to deal with an entire
set of part numbers as a single thing, and sometimes we want to deal
with individual part numbers within that set - but then we are
descending to a lower level of detail (a lower level of abstraction).
The meta function in kdb/q returns the following info about the table:
c – (symbol) column names
t – (char) data type
f – (symbol) domain of foreign keys
a – (symbol) attributes.
I would like to extend this to include more information about the table. The specific case that I am trying to solve is to include timezone information for the time data columns in the table.
For example:
select from Price
+-------------------------+-------------------------+--------+-------+
| Time                    | SysTime                 | Ticker | Price |
+-------------------------+-------------------------+--------+-------+
| 2016.09.15D09:18:02.391 | 2016.09.15D08:18:02.391 | IBM    | 63.46 |
| 2016.09.15D09:18:02.491 | 2016.09.15D08:16:22.391 | MSFT   | 96.72 |
| 2016.09.15D09:18:02.591 | 2016.09.15D08:14:42.391 | AAPL   | 23.06 |
+-------------------------+-------------------------+--------+-------+
meta Price
+---------+---+---+---+
| c       | t | f | a |
+---------+---+---+---+
| Time    | p |   |   |
| SysTime | p |   |   |
| Ticker  | s |   |   |
| Price   | f |   |   |
+---------+---+---+---+
I would like to have additional info about the time data columns (Time and SysTime) in the meta result.
For example, something like this:
metaExtended Price
+---------+---+---+---+------------------+
| c       | t | f | a | z                |
+---------+---+---+---+------------------+
| Time    | p |   |   | America/New_York |
| SysTime | p |   |   | America/Chicago  |
| Ticker  | s |   |   |                  |
| Price   | f |   |   |                  |
+---------+---+---+---+------------------+
Please note that I have a function that takes in the table and column to return the time zone.
TimeZone[Price;Time] returns America/New_York
My question is only about how to include this information in the meta function. The second question that I have is: if the user does something like newPriceTable:Price (creating a new table which is the same as the previous table), then the metaExtended function should return the same value for both tables (akin to calling a function on two different variables holding the same object reference).
Does something similar exist in SQL?
meta is a reserved word and therefore cannot be redefined. But you can create your own implementation and use it in place of meta:
TimeZone:{[Table;Col] ... } / your TimeZone function
metaExtended:{meta[x],'([]z:TimeZone[x]each cols x)}
metaExtended Price
Regarding your second question, I don't think it's possible to do what you want in k/q. Immediately after assigning Price to newPriceTable the latter is indeed a reference, but as soon as you modify it, kdb will create a copy and modify that instead of the original. The problem is there is no way to tell whether newPriceTable is still a reference to Price or a fresh new object.
You can use lj to join the time-zone information onto the meta result in one metaExtended function.
The function finds all the time-typed columns, runs the TimeZone function on them, and joins the result with the meta result:
metaExtended:{[tbl] meta[tbl] lj 1!select c,z:TimeZone[tbl] each c from meta[tbl] where t in "tp"}
metaExtended `t
When you assign this table to a new variable, it will be assigned as a reference:
nt:t          / nt and t point to the same object
You can check the reference count of a variable using -16!:
-16!t
At this point the metaExtended function will give the same output for both. But once an update is done on either of these variables pointing to the same table, kdb will create a new copy for the updated table/variable. From that point on they are two different objects, and the output of the metaExtended function depends on each object's schema.
I have two tables:
+------------------------------------------------------+
|                 HIERARCHICAL_RECORDS                  |
+----------------+--------------------+----------------+
| ORG_ID         | NAME               | VALUE          |
+----------------+--------------------+----------------+
| 333            | CC                 | ...            |
| 22             | MT                 | ...            |
| 22             | TMP                | ...            |
| 333            | TMP                | ...            |
+----------------+--------------------+----------------+
and a second one with the ORG hierarchy:
+---------------------------------+
|          ORGANIZATION           |
+----------------+----------------+
| ORG_ID         | PARENT_ID      |
+----------------+----------------+
| 1              | null           |
| 22             | 1              |
| 333            | 22             |
+----------------+----------------+
which represent a hierarchy of parameter values in the UI:
Org ID# 1
|
|----- Org ID# 22
[MT -> value]
[TMP -> value]
|
|----- Org ID# 333
[CC -> value]
[MT -> shows value defined in parent #22]
[TMP -> redefined value]
Here's the thing: if both parent and child have defined some attribute value (e.g. TMP in the example), we should always return to the client the value redefined by the child.
So, I would like a query that, for a given child ID (or maybe the set of all IDs from child to root), returns the records defined at the deepest level.
E.g., for the example above, if I pass 333 (or the set 1, 22, 333), I would like the result to be:
+----------------+--------------------+----------------+
| ORG_ID         | NAME               | VALUE          |
+----------------+--------------------+----------------+
| 333            | CC                 | ...            |
| 22             | MT                 | ...            |
| 333            | TMP                | ...            |
+----------------+--------------------+----------------+
where TMP's value from child ORG# 333 hides the one from parent ORG# 22.
So, I need to somehow filter out records from parent ORGs and keep only the ones that are effective for the child ORG. At the same time, if a value isn't redefined at a child level, we take the value from its parent (as with MT, which is defined at ORG# 22 but not redefined at ORG# 333).
How can I do that?
Many thanks!
This should be a comment, but it's a bit long. Since MySQL has no "connect by" operator, you will struggle to do this in MySQL. The only practical solution would be to write a recursive procedure that populates a temporary table, but it won't be very efficient. Indeed, one solution, depending on how frequently the data changes, would be to apply the mapping to all the nodes in the database. Whether other solutions (a graph engine, geospatial indexing, or alternate schemas) might be more appropriate would depend on the frequency of changes, the number of records, and the distribution and cardinality of the data (which you haven't mentioned).
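If it helps, here is a minimal sketch of the final filtering step, assuming the recursive procedure (or the caller) has already materialized the child-to-root chain in a helper table chain(ORG_ID, depth), e.g. (1,0), (22,1), (333,2); the helper table name is illustrative:

-- Keep a record only if no deeper org in the chain redefines the same NAME.
SELECT r.ORG_ID, r.NAME, r.VALUE
FROM HIERARCHICAL_RECORDS r
JOIN chain c ON c.ORG_ID = r.ORG_ID
WHERE NOT EXISTS (
    SELECT 1
    FROM HIERARCHICAL_RECORDS r2
    JOIN chain c2 ON c2.ORG_ID = r2.ORG_ID
    WHERE r2.NAME = r.NAME
      AND c2.depth > c.depth
);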
I was wondering if there is an easy way of moving some (not all) data from one column to another.
My MySQL table has 200 entries, but this is a simplified version of what I am trying to do:
| ID | A  | B  |
| 1  |    |    |
| 2  |    |    |
| 3  |    | aa |
| 4  |    | bb |
| 5  |    | cc |
So I need to get data from column B to column A, but only for the rows that have ID greater than (>) 2, so that aa from 3B will go to 3A, bb from 4B will go to 4A...
UPDATE <tablename> SET
    A = B
    -- , B = ''    -- enable this line to clear B, turning the copy into a move
WHERE ID > 2;
This might help. The commented-out line needs to be enabled or disabled, depending on whether you want to move or copy the values between columns.
I have two MySQL tables:
Component
+----------------+----------------+
| OldComponentId | NewComponentId |
+----------------+----------------+
| 15             | 85             |
| 16             | 86             |
| 17             | 87             |
+----------------+----------------+
Formulae
+----+----------------+
| id | formula_string |
+----+----------------+
|  1 | A+15-16+17     |
|  2 | 16+15-17       |
+----+----------------+
I want to replace the values in formula_string on the basis of NewComponentId, like this:
Formulae
+----+----------------+
| id | formula_string |
+----+----------------+
|  1 | A+85-86+87     |
|  2 | 86+85-87       |
+----+----------------+
I have tried the following MySQL query but it's not working:
update Formulae fr, Component comp
set formula_string = REPLACE(fr.formula_string, comp.OldComponentId, comp.NewComponentId);
Please suggest solutions, thanks.
There is no easy way to do this. As you observed in your update statement, the replacements don't nest. They just replace one at a time.
One thing that you can do is:
update Formulae fr cross join
       Component comp
    set formula_string = REPLACE(fr.formula_string, comp.OldComponentId, comp.NewComponentId)
    where formula_string like concat('%', comp.OldComponentId, '%');
Then continue running this until row_count() returns 0.
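If it helps, here is a hedged sketch of wrapping that repetition in a stored procedure (the procedure name is made up; each pass replaces at most one component id per formula row):

DELIMITER //
DROP PROCEDURE IF EXISTS replace_components //
CREATE PROCEDURE replace_components()
BEGIN
   REPEAT
      UPDATE Formulae fr CROSS JOIN Component comp
         SET fr.formula_string =
             REPLACE(fr.formula_string, comp.OldComponentId, comp.NewComponentId)
         WHERE fr.formula_string LIKE CONCAT('%', comp.OldComponentId, '%');
   UNTIL ROW_COUNT() = 0 END REPEAT;
END //
DELIMITER ;

CALL replace_components();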
Do note that your structure could result in infinite loops (if A --> B and B --> A). You also have a problem of "confusion": 10 would be replaced inside 100. This suggests that your overall data structure may not be correct. Perhaps you should break the formula up into separate pieces. If they are just numbers with + and -, you can have a junction table with the component and the sign for each term; then your query would be much easier (a sketch follows).
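For illustration, a hedged sketch of that junction-table idea (table and column names are made up for the example):

-- One row per term of a formula, with the sign kept separately.
CREATE TABLE FormulaTerm (
    formula_id   INT NOT NULL,      -- references Formulae.id
    position     INT NOT NULL,      -- order of the term inside the formula
    sign         CHAR(1) NOT NULL,  -- '+' or '-'
    component_id INT NOT NULL,      -- references Component.OldComponentId
    PRIMARY KEY (formula_id, position)
);

-- Renumbering components then becomes a plain join, with no string surgery:
UPDATE FormulaTerm ft
JOIN Component comp ON comp.OldComponentId = ft.component_id
SET ft.component_id = comp.NewComponentId;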