Recursive CTE in MySQL for tree structure (adjacency list)

I'm just starting out with MySQL (I come from using SQL Server previously). I haven't yet started implementing anything in MySQL, just researching how to do things and what problems I might encounter.
In SQL Server I've used CTEs to successfully recurse through an adjacency list table structure to produce the desired result set. From what I can tell so far with MySQL, it does not support CTEs. I've got a fairly simple table structure to hold my hierarchy (written in SQL Server syntax b/c of my familiarity with it):
CREATE TABLE TreeNodes (
NodeId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
ParentNodeId int NULL,
Name varchar(50) NOT NULL,
FullPathName varchar(MAX) NOT NULL, -- '/' delimited names from root to current node
IsLeaf bit NOT NULL -- is this node a leaf?
)
Side Note: I realize that FullPathName and IsLeaf are not required and could be determined at query time, but the insert of a tree node will be a very uncommon occurrence as opposed to the queries against this table - which is why I plan to compute those two values as part of the insert SP (will make the queries that need those two values less costly).
With CTE (in SQL Server), I would have a function like the following to find leaf nodes of current node:
CREATE FUNCTION fn_GetLeafNodesBelowNode (
@TreeNodeId int
)
RETURNS TABLE
AS
RETURN
WITH Tree (NodeId, Name, FullPathName, IsLeaf)
AS (
SELECT NodeId, Name, FullPathName, IsLeaf FROM TreeNodes WHERE NodeId = @TreeNodeId
UNION ALL
SELECT c.NodeId, c.Name, c.FullPathName, c.IsLeaf FROM Tree t
INNER JOIN TreeNodes c ON t.NodeId = c.ParentNodeId
)
SELECT * FROM Tree WHERE IsLeaf = 1
How would I do the same with MySQL?
Thanks in advance.

You can get it done by some sort of stored functions and bit logic.
Here is one example.
Have a try.
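For anyone reading this later: MySQL 8.0 and newer do support recursive CTEs via WITH RECURSIVE, so the SQL Server function above translates almost directly. Below is a minimal sketch, assuming MySQL 8.0+. The column types (TEXT, TINYINT(1)) are my own MySQL translation of the question's table, and the root id 1 is just a placeholder. MySQL has no table-valued functions, so this is shown as a plain query driven by a user variable.

```sql
-- Assumed MySQL 8.0+ translation of the table from the question.
CREATE TABLE TreeNodes (
  NodeId INT AUTO_INCREMENT PRIMARY KEY,
  ParentNodeId INT NULL,
  Name VARCHAR(50) NOT NULL,
  FullPathName TEXT NOT NULL,   -- '/' delimited names from root to current node
  IsLeaf TINYINT(1) NOT NULL    -- 1 if this node is a leaf
);

-- Placeholder: id of the node whose leaf descendants we want.
SET @TreeNodeId = 1;

WITH RECURSIVE Tree (NodeId, Name, FullPathName, IsLeaf) AS (
  -- Anchor member: the starting node.
  SELECT NodeId, Name, FullPathName, IsLeaf
  FROM TreeNodes
  WHERE NodeId = @TreeNodeId
  UNION ALL
  -- Recursive member: children of the rows found so far.
  SELECT c.NodeId, c.Name, c.FullPathName, c.IsLeaf
  FROM Tree t
  INNER JOIN TreeNodes c ON c.ParentNodeId = t.NodeId
)
SELECT * FROM Tree WHERE IsLeaf = 1;
```

On versions before 8.0 this will not parse, and you would still need a stored routine or a different tree model.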

Related

Using SQL Server Indexed View in combination with OPENJSON

I have a table with just one row and one column which stores a JSON array with about 30MB/16k objects in it:
CREATE TABLE [dbo].[CitiesTable]
(
[CitiesJson] [NVARCHAR](MAX) NOT NULL
) ON [PRIMARY]
GO
INSERT INTO [dbo].[CitiesTable] ([CitiesJson])
VALUES ('{"cities":[{"cityName": "London","residentCount": 8961989},{"cityName": "Paris","residentCount": 2165423},{"cityName": "Berlin","residentCount": 3664088}]}')
I use this query to parse the JSON and bring it into a relational structure:
SELECT x.[CityName], x.[ResidentCount]
FROM
OPENJSON((SELECT [CitiesJson] FROM dbo.CitiesTable), '$.cities')
WITH
(
[CityName] [NVARCHAR] (50) '$.cityName',
[ResidentCount] [INT] '$.residentCount'
) AS x
Which yields:
CityName ResidentCount
---------- -------------
London 8961989
Paris 2165423
Berlin 3664088
I'd like to create a view for this so that I don't have to include the bulky query in several places.
But using this query inside a view has the downside that the JSON has to be parsed each time the view is executed... So I'm considering creating an Indexed View, to gain the advantage that the view only has to be re-executed when the underlying table data changes.
Unfortunately, an indexed view has quite a few prerequisites, one of which is that no subqueries are allowed.
Hence the view can be created...
CREATE VIEW dbo.Cities_IndexedView
WITH SCHEMABINDING
AS
SELECT x.[CityName], x.[ResidentCount]
FROM
OPENJSON((SELECT [CitiesJson] FROM dbo.CitiesTable), '$.cities')
WITH
(
[CityName] [NVARCHAR] (10) '$.cityName',
[ResidentCount] [INT] '$.residentCount'
) AS x
But the following index creation fails:
CREATE UNIQUE CLUSTERED INDEX Cities_IndexedView_ucidx
ON dbo.Cities_IndexedView([CityName]);
Cannot create index on view "MyTestDb.dbo.Cities_IndexedView" because it contains one or more subqueries. Consider changing the view to use only joins instead of subqueries. Alternatively, consider not indexing this view.
Is there any way to work around this? I don't know how to access the CitiesJson column within the OPENJSON without using a sub-select...
EDIT:
Zhorov had a nice idea to eliminate the subquery:
SELECT x.[CityName], x.[ResidentCount]
FROM [dbo].[CitiesTable] c
CROSS APPLY OPENJSON(c.[CitiesJson], '$.cities') WITH ([CityName] [NVARCHAR] (10) '$.cityName', [ResidentCount] [INT] '$.residentCount') AS x
But unfortunately APPLY can't be used in indexed views (see here):
Cannot create index on view "MyTestDb.dbo.Cities_IndexedView" because it contains an APPLY. Consider not indexing the view, or removing APPLY.
The additional requirements also state that OPENXML and table-valued functions aren't allowed either. So I guess OPENJSON is just not yet mentioned in the docs but isn't allowed either :-(
Locally I use SQL Server 2016. I created db fiddle over here which uses SQL Server 2019. And yep, OPENJSON just seems to be impossible to use:
Cannot create index on the view 'fiddle_cf57a9b555f74ea1ada4c5d0d277cf95.dbo.Cities_IndexedView' because it uses OPENJSON.
An indexed view can only be a surjection from the original data onto the target view, which is never the case when the query contains subqueries, outer joins, UNION, INTERSECT or EXCEPT.
You must use a different approach, such as a target table maintained by a trigger.
For the table structure, just use SELECT ... INTO to create the "snapshot" table, with or without the initial data:
SELECT IDENTITY(INT, 1, 1) AS JSON_ID, x.[CityName], x.[ResidentCount]
INTO MySQL_Schema.MyJSON_Data
FROM
OPENJSON((SELECT [CitiesJson] FROM dbo.CitiesTable), '$.cities')
WITH
(
[CityName] [NVARCHAR] (10) '$.cityName',
[ResidentCount] [INT] '$.residentCount'
) AS x
Then make JSON_ID a primary key :
ALTER TABLE MySQL_Schema.MyJSON_Data ADD PRIMARY KEY (JSON_ID);
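The trigger part could look roughly like the sketch below. The trigger name is hypothetical, and it assumes a full rebuild of the snapshot on every change is acceptable, which seems reasonable for a one-row source table. APPLY is fine here because this is an ordinary trigger, not an indexed view.

```sql
-- Hypothetical trigger: rebuilds the snapshot whenever the JSON row changes.
CREATE TRIGGER dbo.trg_CitiesTable_RefreshSnapshot
ON dbo.CitiesTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Throw away the old snapshot rows.
    DELETE FROM MySQL_Schema.MyJSON_Data;

    -- Re-parse the JSON; JSON_ID is an identity column and fills itself.
    INSERT INTO MySQL_Schema.MyJSON_Data (CityName, ResidentCount)
    SELECT x.CityName, x.ResidentCount
    FROM dbo.CitiesTable AS c
    CROSS APPLY OPENJSON(c.CitiesJson, '$.cities')
    WITH
    (
        CityName NVARCHAR(10) '$.cityName',
        ResidentCount INT '$.residentCount'
    ) AS x;
END;
```

Queries then read from MySQL_Schema.MyJSON_Data directly, so the JSON is parsed only when dbo.CitiesTable is modified.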

Mysql - Returning number of rows where JSON document is contained within a target JSON document

I have a table with a JSON column named data in MySQL.
In the data column I expect to have an array of products under the key "products".
I am looking for the best and fastest approach to get the number of rows that contain a specified value in the array of products.
This is what I have tried so far, on 1 million rows, with the given results:
SELECT COUNT(*) as "cnt" FROM `components` `c`
WHERE (JSON_CONTAINS(`c`.`data`, '"some product from array"', '$."products"') = true)
The first one takes ~4 sec.
SELECT COUNT(*) as "cnt" FROM `components` `c`
WHERE (JSON_SEARCH(`c`.`data`, 'one', 'some product from array', null, '$."products"') is not null)
The second one takes ~2.5 sec.
Is there any faster way I can get this number of rows?
I noticed that from MySQL 8.0.17 it is possible to add multi-valued indexes on a JSON column. Is it possible to create a multi-valued index on an array of strings? I tried something like:
CREATE INDEX products ON components ( (CAST(data->'$.products' AS VARCHAR(255) ARRAY)) )
but it gives me an error. Is there any way to accomplish this?
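One likely cause of the error: MySQL's CAST does not accept VARCHAR as a target type, while CHAR(n) ARRAY is accepted for multi-valued indexes. A sketch, assuming MySQL 8.0.17 or later and that every element of "products" is a string of at most 255 characters:

```sql
-- Multi-valued index over the string array stored under "products".
CREATE INDEX products
  ON components ( (CAST(data->'$.products' AS CHAR(255) ARRAY)) );

-- MEMBER OF is one of the operators that can use a multi-valued index;
-- the expression must match the indexed one (data->'$.products').
SELECT COUNT(*) AS cnt
FROM components c
WHERE 'some product from array' MEMBER OF (c.data->'$.products');
```

Running EXPLAIN on the SELECT should show whether the optimizer actually picks the index for your data.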
Best regards.

extract data from sql, modify it and save the result to a table

This may seem like a dumb question. I want to set up an SQL db with records containing numbers. I would like to run a query to select a group of records, then take the values in that group, do some basic arithmetic on the numbers and then save the results to a different table, but still have them linked with a foreign key to the original record. Is that possible to do in SQL without taking the data to another application and then importing it back? If so, what is the basic function/procedure to complete this action?
I'm coming from an Excel/macro/basic Python background and want to investigate if it's worth the switch to SQL.
PS. I'm wanting to stay open source.
A tiny example using PostgreSQL (9.6):
-- Create tables
CREATE TABLE initialValues(
id serial PRIMARY KEY,
value int
);
CREATE TABLE addOne(
id serial,
id_init_val int REFERENCES initialValues(id),
value int
);
-- Init values
INSERT INTO initialValues(value)
SELECT a.n
FROM generate_series(1, 100) as a(n);
-- Insert values in the second table by selecting the ones from the
-- first one.
WITH init_val as (SELECT i.id,i.value FROM initialValues i)
INSERT INTO addOne(id_init_val,value)
(SELECT id,value+1 FROM init_val);
In MySQL you can use CREATE TABLE ... SELECT (https://dev.mysql.com/doc/refman/8.0/en/create-table-select.html)
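For comparison, here is a rough MySQL version of the same idea. It uses INSERT ... SELECT into a table created up front, because as far as I know CREATE TABLE ... SELECT does not set up a foreign key for you. Table and column names mirror the PostgreSQL example, and the sample values are made up.

```sql
-- Source table holding the original numbers.
CREATE TABLE initialValues (
  id INT AUTO_INCREMENT PRIMARY KEY,
  `value` INT
);

-- Result table, linked back to the source row by a foreign key.
CREATE TABLE addOne (
  id INT AUTO_INCREMENT PRIMARY KEY,
  id_init_val INT,
  `value` INT,
  FOREIGN KEY (id_init_val) REFERENCES initialValues (id)
);

-- Made-up sample data.
INSERT INTO initialValues (`value`) VALUES (1), (2), (3);

-- Select, do the arithmetic, and store the result in one statement.
INSERT INTO addOne (id_init_val, `value`)
SELECT id, `value` + 1
FROM initialValues;
```

Each row in addOne keeps the id of the row it was computed from, so the two tables can be joined back together at any time.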

multiple temporary tables?

This might be a basic question: I am using a temporary table in some of my php code like so:
CREATE TEMPORARY TABLE ttable( `d` DATE NOT NULL , `p` DECIMAL( 11, 2 ) NOT NULL , UNIQUE KEY `date` ( `d` ) );
INSERT INTO ttable( d, p ) VALUES ( '$d' , '$p' );
SELECT * FROM ttable;
As we scale up our site, will this ever be a problem? I.e., will user1's ttable and user2's ttable ever get mixed up, so that user1 sees user2's ttable and vice versa? Is it better to create a unique name for each temporary table?
thx
Temporary tables are session-specific. Every time you connect to a host (in PHP, this is done with mysql_connect), temporary tables that you create exist only within that session/connection.
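To make the isolation concrete, here is a sketch of what two separate connections would see. The "session" labels are only comments; each half has to be run on its own connection, and the sample values are invented.

```sql
-- Session A (first connection)
CREATE TEMPORARY TABLE ttable (
  d DATE NOT NULL,
  p DECIMAL(11, 2) NOT NULL,
  UNIQUE KEY d_unique (d)
);
INSERT INTO ttable (d, p) VALUES ('2024-01-01', 10.00);
SELECT * FROM ttable;   -- only session A's row is visible here

-- Session B (second connection, running at the same time)
-- The same name can be reused; it refers to a different, private table.
CREATE TEMPORARY TABLE ttable (
  d DATE NOT NULL,
  p DECIMAL(11, 2) NOT NULL,
  UNIQUE KEY d_unique (d)
);
SELECT * FROM ttable;   -- session B sees none of session A's rows
```

Each temporary table is dropped automatically when its connection closes, so unique names per user are not needed.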
It is almost always better to find a different way than using temporary tables.
The only time I would consider them is under the following conditions:
The activity is rare. Meaning, a given user MIGHT do this once a week.
It is used as a holding container prior to doing a regular full import of data.
It deals with data whose structure is unknown prior to being filled.
All three of those really go with building some type of generic bulk import routines where the data mapping is defined at run time.
If you find yourself creating temp tables frequently in the application, there's probably a better way.
Scalability is going to depend on the amount of data being loaded and frequency of temp table usage. For a low trafficked site it might be okay.
We're in the process of ripping out a ton of temp table usage from a client's app. 90% of the queries in their system result in a temp table being created. Analysis of all the queries has shown that the original dev used this mechanism simply because they didn't understand SQL. We're doing this because performance has radically dropped off as new users are added to the system.
Can you post a use case? Maybe we can help provide an alternate mechanism.
UPDATE:
Now that we have a use case, here is a simple table structure to accomplish what you need.
Table ZipCodes
ZipCode char(5) [or char(10) depending on need]
CityName varchar(50)
*other columns as necessary such as latitude or whatever.
Table TempReadings
ZipCode char(5) [foreign key to the ZipCode table]
ReadingDate datetime
Temperature float (or some equivalent)
To get all the temp readings for a given zip code you would do something like:
select ZipCode, ReadingDate, Temperature
from TempReadings
If you need info from the main ZipCode table:
select Z.ZipCode, Z.CityName, TR.ReadingDate, TR.Temperature
from ZipCodes Z
inner join TempReadings TR on (TR.ZipCode = Z.ZipCode)
Add where clauses as necessary. Note that none of the above requires having a separate table per zip code.
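Spelled out as DDL, the structure above might look like this in MySQL. The composite primary key on TempReadings and the sample zip code are my own assumptions, so adjust them to your data.

```sql
CREATE TABLE ZipCodes (
  ZipCode CHAR(5) PRIMARY KEY,
  CityName VARCHAR(50) NOT NULL
  -- other columns as necessary, such as latitude
);

CREATE TABLE TempReadings (
  ZipCode CHAR(5) NOT NULL,
  ReadingDate DATETIME NOT NULL,
  Temperature FLOAT NOT NULL,
  -- Assumption: at most one reading per zip code per timestamp.
  PRIMARY KEY (ZipCode, ReadingDate),
  FOREIGN KEY (ZipCode) REFERENCES ZipCodes (ZipCode)
);

-- All readings for one (made-up) zip code, with the city name.
SELECT Z.ZipCode, Z.CityName, TR.ReadingDate, TR.Temperature
FROM ZipCodes Z
INNER JOIN TempReadings TR ON TR.ZipCode = Z.ZipCode
WHERE Z.ZipCode = '90210';
```

These are permanent tables shared by all users, with the WHERE clause doing the per-zip-code filtering that the temporary table was doing before.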

SQL Server 2008 - how to automatically drop and create an output table?

I would like to set up a table within a SQL Server DB that stores the results from a long and complex query that takes almost an hour to run. After running the query the rest of the analysis is done by colleagues using Excel pivot tables.
I would prefer not to output the results to text, and want to keep it within SQL Server and then just set up Excel to pivot directly from the server.
My problem is that the output will not always have exactly the same columns, and manually setting up an output table to INSERT INTO every time would be tedious.
Is there a way to create a table on the fly based on the type of data you are selecting?
E.g. if I want to run:
SELECT
someInt,
someVarchar,
someDate
FROM someTable
And insert this into a table called OutputTable, which has to look like this
CREATE TABLE OutputTable
(
someInt int null,
someVarchar varchar(255) null,
someDate date null
) ON [primary]
Is there some way to make SQL Server interrogate the fields in the select statement and then automatically generate the CREATE TABLE script?
Thanks
Karl
SELECT
someInt,
someVarchar,
someDate
INTO dbo.OutputTable
FROM someTable
...doesn't explicitly generate a CREATE script (at least not one you can see) but does the job!
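Since SELECT ... INTO fails if the target table already exists, the "drop" half of the question needs a guard in front. A minimal sketch for SQL Server 2008, using the table names from the question:

```sql
-- Drop the previous output, if any, so SELECT ... INTO can recreate it.
IF OBJECT_ID('dbo.OutputTable', 'U') IS NOT NULL
    DROP TABLE dbo.OutputTable;

-- Recreate the table with whatever columns and types the query returns.
SELECT
    someInt,
    someVarchar,
    someDate
INTO dbo.OutputTable
FROM someTable;
```

The new table takes its column names and types from the SELECT list, so the script keeps working when the set of output columns changes.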