How to resolve SQL id chain? - mysql

I have a MySQL database with a table like this:
+-------+----------+
| id    | redirect |
+-------+----------+
| 1     | NULL     |
| 2     | 3        |
| 3     | NULL     |
| 4     | 5        |
| 5     | 6        |
| 6     | 8        |
| 7     | NULL     |
| 8     | NULL     |
+-------+----------+
I need to write a query that resolves the redirects recursively,
so that I get these results:
1 1
2 3
3 3
4 8
5 8
6 8
7 7
8 8
Thanks

One approach is to get each "level" with a separate query. (The queries below call the table mytable and the redirect column redirect_id; substitute your own names.)
To get the first level, we can test for a NULL in the redirect_id column to identify a "terminating" node.
To get the second level, we can use a JOIN operation to match rows whose redirect_id matches the id of a "terminating" row (identified previously).
The third level follows the same pattern, adding another JOIN operation to return rows that redirect to a level two row.
And so on.
For example:
SELECT t1.id AS start_id
, t1.id AS terminate_id
FROM mytable t1
WHERE t1.redirect_id IS NULL
UNION ALL
SELECT t2.id
, t1.id
FROM mytable t1
JOIN mytable t2 ON t2.redirect_id = t1.id
WHERE t1.redirect_id IS NULL
UNION ALL
SELECT t3.id
, t1.id
FROM mytable t1
JOIN mytable t2 ON t2.redirect_id = t1.id
JOIN mytable t3 ON t3.redirect_id = t2.id
WHERE t1.redirect_id IS NULL
UNION ALL
SELECT t4.id
, t1.id
FROM mytable t1
JOIN mytable t2 ON t2.redirect_id = t1.id
JOIN mytable t3 ON t3.redirect_id = t2.id
JOIN mytable t4 ON t4.redirect_id = t3.id
WHERE t1.redirect_id IS NULL
The limitation of this single-query UNION ALL approach is that it would need to be extended to a finite maximum number of levels. (This approach is not truly "recursive".)
If we needed a truly recursive approach, we could run each query separately, just adding an extra "level" for each run, following the same pattern. We'd know that we'd exhausted all possible paths when the result of a query returns no rows.
I've demonstrated the use of the UNION ALL operator to combine the results into a single set, using a single query. (Add an ORDER BY clause at the end of the statement if the order of the rows is important. It would also be easy to add a literal "level" column to the resultset, e.g. 1 AS level for the first SELECT, 2 AS level for the second, etc., to identify how far a node is from its termination.)
MySQL doesn't support Oracle-style CONNECT BY syntax. (In Oracle, we COULD write a single query that traverses this set and returns the specified rows for an arbitrary number of levels.)
Before MySQL 8.0, a truly "recursive" approach would require multiple queries; MySQL 8.0 and later also support recursive common table expressions (WITH RECURSIVE). (Note that MySQL can support "recursion" in stored procedure calls, if the server is configured to allow it.)
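For readers on MySQL 8.0 or later, the level-by-level pattern above can be written once as a recursive common table expression. The following is only a sketch: it runs the SQL through Python's built-in sqlite3 module so that it is self-contained, and it assumes the mytable / redirect_id names used in the queries above. The WITH RECURSIVE statement itself uses syntax that MySQL 8.0+ also accepts, but it has not been tested there.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, redirect_id INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, redirect_id) VALUES (?, ?)",
    [(1, None), (2, 3), (3, None), (4, 5), (5, 6), (6, 8), (7, None), (8, None)],
)

RESOLVE_SQL = """
WITH RECURSIVE chain (start_id, terminate_id) AS (
    -- Level 1: terminating rows resolve to themselves.
    SELECT id, id
    FROM mytable
    WHERE redirect_id IS NULL
    UNION ALL
    -- Next level: rows that redirect to an already resolved row.
    SELECT t.id, c.terminate_id
    FROM mytable t
    JOIN chain c ON t.redirect_id = c.start_id
)
SELECT start_id, terminate_id
FROM chain
ORDER BY start_id
"""

resolved = conn.execute(RESOLVE_SQL).fetchall()
for start_id, terminate_id in resolved:
    print(start_id, terminate_id)
```

Because the recursion starts from the terminating rows and walks backwards, ids caught in a redirect cycle never reach a terminating row and are simply absent from the result.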

Related

SQL Query with two tables and need to search one of the tables for specific and multiple column values

I need to write a complex query that involves the two sample tables below and I’m struggling to understand how to construct the query properly.
The table structure is as follows, along with sample data:
Table 1:
ID | Type | Size
A123 | Block | Medium
C368 | Square | Large
X634 | Triangle | Small
K623 | Square | Small
Table 2:
ID | Code | Description | Price
A123 | C06 | Sensitive Material | 99.99
A123 | H66 | Heavy Grade | 12.76
A123 | U74 | Pink Hue | 299.99
C368 | H66 | Heavy Grade | 12.76
C368 | G66 | Green Hue | 499.99
C368 | C06 | Sensitive Material | 99.99
C368 | K79 | Clear Glass | 59.99
X634 | G66 | Green Hue | 499.99
X634 | K79 | Clear Glass | 59.99
X634 | Z63 | Enterprise Class | 999.99
K623 | K79 | Clear Glass | 59.99
K623 | G66 | Green Hue | 499.99
K623 | X57 | Extra Piping | 199.99
The query should be based on the Type column from Table 1 primarily and then join on the ID column of Table 2. The goal of the query is to search for all IDs in Table 1 that have specific Code column combinations in Table 2.
The final output should be a table that looks like this for Type = Square AND both (Code = G66 AND Code = K79) as well:
ID
C368
K623
Those two IDs should be returned because they both have BOTH option codes in the pseudo query above.
How can I assemble this result using these two tables? Below are two initial queries I've written; neither produces the correct result. I've tried the IN operator along with the =/AND/OR operators, as you can see.
Attempt 1 (seems to work with ONE code but not > 1 code):
select distinct ID
from
(
SELECT shapes.ID, details.code, details.description
FROM db.table2 details
JOIN db.table1 shapes
ON details.ID = shapes.ID
WHERE shapes.type='Square'
) src
where code IN ("G66", "K79")
-- where code = "G66" AND code = "K79" (Produces zero results)
-- where code = "G66" OR code = "K79" (Produces incorrect results)
Attempt 2 (seems to work with ONE code but not > 1 code):
SELECT distinct ID
FROM db.table2 details
WHERE ID IN
(
SELECT ID FROM db.table1 shapes
WHERE shapes.type='Square'
) AND code IN ("G66", "K79")
-- AND code = "G66" AND code = "K79" (produces zero results)
-- AND code = "G66" OR code = "K79" (Produces incorrect results)
Thanks
Whenever you have a problem of the form "I need the IDs from this table where one row has value A, another row has value B, and both rows share the same ID", select all the rows matching your criteria, group them by ID, count them, and keep only the groups whose count matches the number of required values:
SELECT t2.id
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id
WHERE t1.type = 'square' and t2.code IN ('G66', 'K79')
GROUP BY t2.id
HAVING COUNT(*) = 2
If the data might contain bogus combinations, such as two G66 rows and no K79 row for the same ID, this simple count is defeated. With exactly two required codes, we can instead check the values themselves using MIN and MAX:
SELECT t2.id
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id
WHERE t1.type = 'square' and t2.code IN ('G66', 'K79')
GROUP BY t2.id
HAVING MIN(t2.code) = 'G66' AND MAX(t2.code) = 'K79'
It works because alphabetically G66 is less than K79, so G66 will be the minimum and K79 the maximum.
If we have 3 values that we must mandate, we can use a trick: map each code to a number and demand that the sum be a particular value. I'll use powers of 2 for this:
SELECT t2.id
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id
WHERE t1.type = 'square' and t2.code IN ('G66', 'K79', 'X99')
GROUP BY t2.id
HAVING SUM(CASE t2.Code WHEN 'G66' THEN 1 WHEN 'K79' THEN 2 WHEN 'X99' THEN 4 END) = 7
If we map them to 1, 2 and 4 then the only way to make 7 (if the values are unique) is to have one of each. If there could be 7 G66 and none of the others, giving a bogus result, then we might have to count them individually:
SELECT t2.id
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id
WHERE t1.type = 'square' and t2.code IN ('G66', 'K79', 'X99')
GROUP BY t2.id
HAVING
SUM(CASE t2.code WHEN 'G66' THEN 1 ELSE 0 END) = 1 AND
SUM(CASE t2.code WHEN 'K79' THEN 1 ELSE 0 END) = 1 AND
SUM(CASE t2.code WHEN 'X99' THEN 1 ELSE 0 END) = 1
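A variation on the counting idea, not taken from the answer above, is to count distinct codes, which stays correct when a code is repeated for an ID. The sketch below runs against SQLite through Python's sqlite3 module purely so it is runnable; the table and column names follow the question, only the id and code columns of Table 2 are loaded, and the extra K623/G66 row is a hypothetical duplicate added to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id TEXT PRIMARY KEY, type TEXT, size TEXT)")
conn.execute("CREATE TABLE table2 (id TEXT, code TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("A123", "Block", "Medium"), ("C368", "Square", "Large"),
     ("X634", "Triangle", "Small"), ("K623", "Square", "Small")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [("A123", "C06"), ("A123", "H66"), ("A123", "U74"),
     ("C368", "H66"), ("C368", "G66"), ("C368", "C06"), ("C368", "K79"),
     ("X634", "G66"), ("X634", "K79"), ("X634", "Z63"),
     ("K623", "K79"), ("K623", "G66"), ("K623", "X57"),
     ("K623", "G66")],  # hypothetical duplicate row, not in the question's data
)

QUERY = """
SELECT t2.id
FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id
WHERE t1.type = 'Square' AND t2.code IN ('G66', 'K79')
GROUP BY t2.id
HAVING {having}
ORDER BY t2.id
"""

# COUNT(*) counts rows, so the duplicated G66 row pushes K623 to 3 and drops it.
naive = [r[0] for r in conn.execute(QUERY.format(having="COUNT(*) = 2"))]
# COUNT(DISTINCT code) counts each required code once, however often it repeats.
distinct = [r[0] for r in conn.execute(QUERY.format(having="COUNT(DISTINCT t2.code) = 2"))]
print(naive, distinct)
```

The number compared against in HAVING must equal the number of codes listed in the IN clause.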
Here is what I think.
Filter the rows from t1 with the WHERE clause, then join t2 twice under different aliases, with one join filtered to G66 and the other to K79:
Select t1.ID
from t1
inner join t2 as t2_G66 on t1.ID = t2_G66.ID
inner join t2 as t2_K79 on t1.ID = t2_K79.ID
where t1.Type = 'Square' and
t2_G66.Code = 'G66' and
t2_K79.Code = 'K79'
Couldn't you just use an INNER JOIN on the IDs from Table 1, after getting the rows with the type we're looking for, since Table 1 and Table 2 are related by ID?
I imagine we could first query for all the rows that have the relevant type and then run an INNER JOIN to get the shared rows that have the ID we care about.
Finally, we could just group our results by their code column?
Maybe something like this could work?:
SELECT db.table2.code, COUNT(*) AS matches
FROM db.table1
INNER JOIN db.table2 ON db.table1.id = db.table2.id
WHERE db.table1.type = "whatever"
AND db.table2.code IN ("code_1", "code_2", "code_3")
GROUP BY db.table2.code
HAVING COUNT(*) >= 1
I just started SQL so I hope this helps!
P.S. I realized I didn't cover your condition for having the code be a member of the subset of codes that you care about. So I think this may work.

Selecting the Ids of a table where 3 or more column are duplicates

I am trying to select the ids of rows whose three columns (name, JobTitle, description) all contain the same data as another row, excluding the first occurrence of each duplicate. For example, my table is as follows:
Select * From Workers
+----+--------+--------+--------------+
| id | name |JobTitle| description |
+----+--------+--------+--------------+
| 1 | john |Plumber |Installs Pipes|
| 2 | mike | Doctor |Provides Meds |
| 3 | john |Plumber |Installs Pipes|
| 4 | john |Plumber |Installs Pipes|
| 5 | mike | Doctor |Provides Meds |
| 6 | mike | Doctor |Provides Meds |
+----+--------+--------+--------------+
What I'm basically trying to get is the ids of all the duplicate records except for the lowest (first) id in each set of duplicates.
SELECT t1.id
From workers t1, workers t2
Where t1.id > t2.Id and t1.name = t2.name and t1.jobTitle = t2.jobTitle and t1.description = t2.description;
The table I am working with has hundreds of thousands of records. I have tried the statement above to get the ids I want, but due to the size of the table I get the error:
Error Code: 1054. Unknown column 't1.userId' in 'where clause'
I have tried increasing the timeout in workbench but to no avail.
In this example I am basically trying to get all the ids except for 1 and 2. I thought the above query would have got me what i was looking for but this has not been the case and now I am not sure what else to try.
Any help is greatly appreciated. Thanks in advance.
You can do it with an INNER JOIN:
SELECT DISTINCT t1.*
From workers t1
INNER JOIN workers t2 ON t1.name = t2.name and t1.jobTitle = t2.jobTitle and t1.description = t2.description
Where t1.id > t2.Id ;
But I can't figure out how you got your error message; there is no userId in sight.
The error message does not match your query (there is no userId column in the query) - and it is not related to the size of the table.
Regardless, I would filter with exists:
select w.*
from workers w
where exists (
select 1
from workers w1
where
w1.name = w.name
and w1.jobTitle = w.jobTitle
and w1.description = w.description
and w1.id < w.id
)
For performance, consider an index on (name, jobTitle, description, id).
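As a quick check of the EXISTS approach, here is a small sketch against the sample Workers rows. It uses SQLite through Python's sqlite3 module only to stay self-contained, selects just the id column, and the index name idx_workers_dup is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE workers (id INTEGER PRIMARY KEY, name TEXT, jobTitle TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO workers VALUES (?, ?, ?, ?)",
    [(1, "john", "Plumber", "Installs Pipes"),
     (2, "mike", "Doctor", "Provides Meds"),
     (3, "john", "Plumber", "Installs Pipes"),
     (4, "john", "Plumber", "Installs Pipes"),
     (5, "mike", "Doctor", "Provides Meds"),
     (6, "mike", "Doctor", "Provides Meds")],
)
# Composite index suggested above, so each EXISTS probe can be an index lookup.
conn.execute("CREATE INDEX idx_workers_dup ON workers (name, jobTitle, description, id)")

DUPLICATES_SQL = """
SELECT w.id
FROM workers w
WHERE EXISTS (
    SELECT 1
    FROM workers w1
    WHERE w1.name = w.name
      AND w1.jobTitle = w.jobTitle
      AND w1.description = w.description
      AND w1.id < w.id          -- an earlier copy of the same row exists
)
ORDER BY w.id
"""

duplicate_ids = [row[0] for row in conn.execute(DUPLICATES_SQL)]
print(duplicate_ids)
```

Unlike the self-join, EXISTS returns each duplicate id once, so no DISTINCT is needed even when a row has several earlier copies.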

Select TWO rows from table based on ONE row from another table

This is the query I have:
$query = "SELECT t1.one_field,t2.another_field FROM table_one t1,
table_two t2 WHERE t2.flag = 2 AND (t1.id = t2.id1 OR t1.id = t2.id2)
AND (t2.id1 = '$comparison' OR t2.id2 = '$comparison') LIMIT 1";
I tried with GROUP BY, DISTINCT, UNIQUE (...) but could never extract what I wanted... I always extract another_field and ONE one_field instead of TWO one_field... what is the best way to accomplish this? Tyvm...
Table 1
one_field | id
example1 | id_example1
example2 | id_example2
Table 2
another_field | flag | id1 | id2
need_to_get | 2 | id_example1 | id_example2
What I want to get
example1, example2, need_to_get (don't care if result is array or associative)
What I get
example2, need_to_get (expected, since one of the results gets overridden since it has the same field_name...)
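One possible way to get both one_field values in a single row is to join table_one twice, once per id column, and give each copy its own column alias so that neither value overwrites the other. This is only a sketch under assumptions: it runs on SQLite via Python's sqlite3 module so it is runnable, the aliases first_field and second_field are invented for the example, and the comparison value is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_one (one_field TEXT, id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE table_two (another_field TEXT, flag INTEGER, id1 TEXT, id2 TEXT)")
conn.executemany(
    "INSERT INTO table_one VALUES (?, ?)",
    [("example1", "id_example1"), ("example2", "id_example2")],
)
conn.execute(
    "INSERT INTO table_two VALUES (?, ?, ?, ?)",
    ("need_to_get", 2, "id_example1", "id_example2"),
)

PAIR_SQL = """
SELECT a.one_field AS first_field,    -- row matched through id1
       b.one_field AS second_field,   -- row matched through id2
       t2.another_field
FROM table_two t2
JOIN table_one a ON a.id = t2.id1
JOIN table_one b ON b.id = t2.id2
WHERE t2.flag = 2
  AND (t2.id1 = ? OR t2.id2 = ?)
LIMIT 1
"""

comparison = "id_example1"
row = conn.execute(PAIR_SQL, (comparison, comparison)).fetchone()
print(row)
```

With distinct aliases, an associative fetch in the calling language no longer collapses the two one_field values into one key.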

Duplicates in Database, Help Edit My Query to Filter Them Out?

I have just finished my latest task of creating an RSS Feed using PHP to fetch data from a database.
I've only just noticed that a lot (if not all) of these items have duplicates and I was trying to work out how to only fetch one of each.
I had a thought that in my PHP loop I could only print out every second row to only have one of each set of duplicates but in some cases there are 3 or 4 of each article so somehow it must be achieved by the query.
Query:
SELECT *
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
ORDER BY t1.publishDate DESC;
Table Structures:
uk_newsreach_article
--------------------
id | headline | extract | text | publishDate | ...
uk_newsreach_article_photo
--------------------------
id | newsArticleID | newsPhotoID
uk_newsreach_photo
------------------
id | htmlAlt | URL | height | width | ...
For some reason or another there are lots of duplicates, and the only thing truly unique within each set is uk_newsreach_article_photo.id: uk_newsreach_article_photo.newsArticleID and uk_newsreach_article_photo.newsPhotoID are identical within a set of duplicates. All I need is one row from each set, e.g.
Sample Data
id | newsArticleID | newsPhotoID
--------------------------------
2 | 800482746 | 7044521
10 | 800482746 | 7044521
19 | 800482746 | 7044521
29 | 800482746 | 7044521
39 | 800482746 | 7044521
53 | 800482746 | 7044521
67 | 800482746 | 7044521
I tried sticking a DISTINCT into the query along with specifying the actual columns I wanted but this didn't work.
As you have noticed, DISTINCT still returns every row, because the id differs within each set of duplicates. You could use a GROUP BY instead.
You will have to make a decision about which id you want to retain. In the example, I have used MIN, but any aggregate function would do.
SQL Statement
SELECT MIN(t2.id), t2.newsArticleID, t2.newsPhotoID
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
GROUP BY t2.newsArticleID, t2.newsPhotoID
ORDER BY t1.publishDate DESC;
Disclaimer
Now while this would be an easy solution to your immediate problem, if you decide that duplicates should not happen, you really should consider redesigning your tables to prevent duplicates getting into your tables in the first place.
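One way to keep such duplicates out in the first place is a unique constraint on the pair of foreign keys. The sketch below is illustrative only: it uses SQLite through Python's sqlite3 module, and a simplified article_photo table stands in for uk_newsreach_article_photo. In MySQL the equivalent pieces would be a UNIQUE key on (newsArticleID, newsPhotoID) and INSERT IGNORE in place of SQLite's INSERT OR IGNORE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE article_photo (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    newsArticleID INTEGER NOT NULL,
    newsPhotoID   INTEGER NOT NULL,
    UNIQUE (newsArticleID, newsPhotoID)   -- one row per article/photo pair
)
""")

# Simulate the importer inserting the same pair several times.
for _ in range(3):
    conn.execute(
        "INSERT OR IGNORE INTO article_photo (newsArticleID, newsPhotoID) VALUES (?, ?)",
        (800482746, 7044521),
    )

pairs = conn.execute(
    "SELECT newsArticleID, newsPhotoID FROM article_photo"
).fetchall()
print(pairs)
```

Existing duplicates would have to be cleaned up before such a constraint could be added to a populated table.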
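Grouping by every selected column except the unique t2.id collapses each set of duplicates into one row. (Adding HAVING COUNT(*) > 1 would instead keep only the rows that have duplicates, and including t2.id in the GROUP BY would leave every row in its own group.) For example:
SELECT t1.id, t1.headline, t1.extract, t1.text, t1.publishDate,
MIN(t2.id) AS articlePhotoID, t2.newsArticleID, t2.newsPhotoID,
t3.id AS photoID, t3.htmlAlt, t3.URL, t3.height, t3.width
FROM uk_newsreach_article t1
INNER JOIN uk_newsreach_article_photo t2
ON t1.id = t2.newsArticleID
INNER JOIN uk_newsreach_photo t3
ON t2.newsPhotoID = t3.id
GROUP BY t1.id, t1.headline, t1.extract, t1.text, t1.publishDate,
t2.newsArticleID, t2.newsPhotoID,
t3.id, t3.htmlAlt, t3.URL, t3.height, t3.width
ORDER BY t1.publishDate DESC;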

Possible to create a mysql query that only displays things that are in descending order

To start things off, I want to make it clear that I'm not trying to order by descending order.
I am looking to order by something else, and then filter further: show a value in a second column only while the value one row below it is less than it. Once it finds that the next value is higher, it stops.
Example:
Ordered by column-------------------Descending Column
353215 20
535325 15
523532 10
666464 30
473460 20
If given that data, I would like it to only return 20, 15 and 10. Because now that 30 is higher than 10, we don't care about what's below it.
I've looked everywhere and can't find a solution.
EDIT: removed the big number init, and added the counter in the ifnull test, so it works in pure MySQL: ifnull(@prec,counter) and not ifnull(@prec,999999).
If your starting table is t1 and the base request was:
select id,counter from t1 order by id;
Then with a mysql variable you can do the job:
SET @prec=NULL;
select * from (
select id,counter,@prec:= if(
ifnull(@prec,counter)>=counter,
counter,
-1) as prec
from t1 order by id
) t2 where prec<>-1;
except here I need the 99999 as a max value for your column, and there's maybe a way to put the initialisation of @prec to NULL somewhere in the 1st request.
Here the prec column contains the counter value of the 1st row, then the counter value of each row as long as it is no greater than the previous row's, and -1 once this becomes false.
Update
The outer select can be removed completely if the variable assignment is done in the WHERE clause:
SELECT @prec := NULL;
SELECT
id,
counter
FROM t1
WHERE
(@prec := IF(
IFNULL(@prec, counter) >= counter,
counter,
-1
)) IS NOT NULL
AND @prec <> -1
ORDER BY id;
regilero EDIT:
I can remove the 1st initialization query using a temporary table (left join) of 1 row this way: but this may slow down the query, maybe.
(...)
FROM t1
LEFT JOIN (select @prec:=NULL as nullinit limit 1) as tmp1 ON tmp1.nullinit is null
(..)
As said by @Mike using a simple UNION query or even :
(...)
FROM t1 , (select @prec:=NULL) tmp1
(...)
is better if you want to avoid the first query.
So at the end the nicest solution is:
SELECT NULL AS id, NULL AS counter FROM dual WHERE (@prec := NULL)
UNION
SELECT id, counter
FROM t1
WHERE (
@prec := IF(
IFNULL(@prec, counter) >= counter,
counter,
-1 )) IS NOT NULL
AND @prec <> -1
ORDER BY id;
+--------+---------+
| id | counter |
+--------+---------+
| 353215 | 20 |
| 523532 | 10 |
| 535325 | 15 |
+--------+---------+
EXPLAIN SELECT output:
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE |
| 2 | UNION | t1 | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
| NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | Using filesort |
+----+--------------+------------+------+---------------+------+---------+------+------+------------------+
You didn't find a solution because it is impossible.
SQL works only within a row, it can not look at rows above or below it.
You could write a stored procedure to do this, essentially looping one row at a time and calculating the logic.
It would probably be easier to write it in the frontend language, whatever it is you are using.
I'm afraid you can't do it in SQL. Relational databases were designed for different purpose so there is no abstraction like next or previous row. Do it outside the SQL in the 'wrapping' language.
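On servers with window functions (MySQL 8.0 and later), a query can in fact compare a row with the one before it. The sketch below is one possible formulation, not something proposed in the answers here: LAG flags every row whose value rose relative to the previous row, and a running SUM of those flags keeps only the rows before the first rise. It runs on SQLite (3.25 or newer) through Python's sqlite3 module so it is self-contained. The pos column is an assumption standing in for whatever expression defines the row order, and rows with equal consecutive values are treated as still descending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pos is a hypothetical ordering column; id and counter follow the names used above.
conn.execute("CREATE TABLE t1 (pos INTEGER PRIMARY KEY, id INTEGER, counter INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, 353215, 20), (2, 535325, 15), (3, 523532, 10), (4, 666464, 30), (5, 473460, 20)],
)

DESCENDING_PREFIX_SQL = """
WITH flagged AS (
    -- 1 when this row's counter is higher than the previous row's, else 0.
    -- The first row has no previous row: the comparison is NULL, so it gets 0.
    SELECT pos, id, counter,
           CASE WHEN counter > LAG(counter) OVER (ORDER BY pos) THEN 1 ELSE 0 END AS rose
    FROM t1
),
running AS (
    -- Number of rises seen up to and including this row.
    SELECT pos, id, counter,
           SUM(rose) OVER (ORDER BY pos) AS rises_so_far
    FROM flagged
)
SELECT id, counter
FROM running
WHERE rises_so_far = 0      -- everything before the first rise
ORDER BY pos
"""

kept = conn.execute(DESCENDING_PREFIX_SQL).fetchall()
print(kept)
```

This avoids user variables assigned inside expressions, which MySQL 8.0 deprecates.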
I'm not sure whether these do what you want, and they're probably too slow anyway:
SELECT t1.col1, t1.col2
FROM tbl t1
WHERE t1.col2 = (SELECT MIN(t2.col2) FROM tbl t2 WHERE t2.col1 <= t1.col1)
Or
SELECT t1.col1, t1.col2
FROM tbl t1
INNER JOIN tbl t2 ON t2.col1 <= t1.col1
GROUP BY t1.col1, t1.col2
HAVING t1.col2 = MIN(t2.col2)
I guess you could maybe select them (in order) into a temporary table, that also has an auto-incrementing column, and then select from the temporary table, joining on to itself based on the auto-incrementing column (id), but where t1.id = t2.id + 1, and then use the where criteria (and appropriate order by and limit 1) to find the t1.id of the row where the descending column is greater in t2 than in t1. After which, you can select from the temporary table where the id is less than or equal to the id that you just found. It's not exactly pretty though! :)
It is actually possible, but the performance isn't easy to optimize. If Col1 is ordered and Col2 is the descending column:
First you create a self join of each row with the next row (note that this only works if the column value is unique, if not you need to join on unique values).
(Select Col1, (Select Min(Col2) as A2 from MyTable as B Where B.A2>A.Col1) As Col1FromNextRow From MyTable As A) As D
INNER JOIN
(Select Col1 As C1,Col2 From MyTable As C On C.C1=D.Col1FromNextRow)
Then you implement the "keep going until the first ascending value" bit:
Select Col2 FROM
(
(Select Col1, (Select Min(Col2) as A2 from MyTable as B Where B.A2>A.Col1) As Col1FromNextRow From MyTable As A) As D
INNER JOIN
(Select Col1 As C1,Col2 From MyTable As C On C.C1=D.Col1FromNextRow)
) As E
WHERE NOT EXISTS
(SELECT Col1 FROM MyTable As Z Where z.COL1<E.Col1 and Z.Col2 < E.Col2)
I don't have an environment to test this, so it probably has bugs. My apologies, but hopefully the idea is semi clear.
I would still try to do it outside of SQL.