I'm trying to learn SQL better, views specifically, but I can't get the following to work.
I've put a slimmed-down version of it here. There are more joins I have to do based on foreign keys from the tbl2 matches.
Since it's a view, I can't create temp tables.
I can't rely on stored procedures in this case.
I could use OUTER APPLY, but only to get specific references (row 1, 2...), and that would mean doing a SELECT * FROM Table2 WHERE ... each time, i.e. one index scan per use.
I could create the view using "WITH tbl2 (FK_TABLE1...) AS (SELECT FK_TABLE1 FROM dbo.TABLE2)", but that doesn't seem to help: each reference to it does a sort or a scan, so there's no gain.
Is there some way to create a list I can reuse, so that a single index scan gets the matching rows from Table2?
Or is there another way to think about this?
Table1 (PK, XX, YY)
Table2 (PK, FK_TABLE1, Type, Progress, ZZ, FK_Status)
Create View MyView
as
Select
Table1.PK
,Table1.XX
,Table1.YY
---- I want to present data from the first 3 matches
,(SELECT ZZ from tbl2 where tbl2.FK_TABLE1 = Table1.PK ORDER BY Type ASC OFFSET(0) ROWS FETCH NEXT (1) ROWS ONLY) ZZ1
,(SELECT ZZ from tbl2 where tbl2.FK_TABLE1 = Table1.PK ORDER BY Type ASC OFFSET(1) ROWS FETCH NEXT (1) ROWS ONLY) ZZ2
,(SELECT ZZ from tbl2 where tbl2.FK_TABLE1 = Table1.PK ORDER BY Type ASC OFFSET(2) ROWS FETCH NEXT (1) ROWS ONLY) ZZ3
,sts.StatusName CurrentStatus
From Table1
LEFT OUTER JOIN Table2 AS tbl2 ON (tbl2.FK_TABLE1= Table1.PK) ---- Here I want to make some sort of join so I get all matching rows from the other table
LEFT OUTER JOIN STATUS AS sts ON (sts.PK = [tbl2 ordered by type, if last elements status = X take that, else status of first).FK_STATUS) ---- Here I'm a bit puzzled, since I have to order by, but also have a fallback value if last element isn't matching.
I have a SELECT query in MySQL that takes 25-30s, which is extremely long, and I was wondering if you could help me speed it up.
CREATE TEMPORARY TABLE results(
id VARCHAR(30),
secondid VARCHAR(5),
allele VARCHAR(30),
translation VARCHAR(10),
level VARCHAR(20),
subgroup VARCHAR(20),
subgroup2 VARCHAR(20)
);
INSERT INTO results(id, secondid, allele, level) SELECT DISTINCT t1.id, t1.secondid, t1.texte, t3.texte
FROM database t1
JOIN database t2 ON t1.id=t2.id
JOIN database t3 ON t1.id=t3.id AND t1.secondid=t3.secondid
WHERE (t1.qualifier,t2.qualifier) = ("allele","organism") AND t3.qualifier = "level_length" AND t3.texte NOT REGEXP "X" AND t3.texte IS NOT NULL
AND t2.texte = ? AND t1.texte REGEXP ?
GROUP BY t1.texte;
UPDATE results SET translation = (SELECT t1.qualifier
FROM database t1
JOIN database t2 ON t1.id=t2.id AND t1.secondid=t2.secondid
JOIN database t3 ON t1.id=t3.id AND t1.secondid=t3.secondid
WHERE t1.qualifier IN ("protein","ncRNA","rRNA") AND t2.texte=results.allele AND t3.texte=results.level LIMIT 1);
UPDATE results SET subgroup = (SELECT t2.subgrp
FROM alleledb.alleleSubgroups t1
JOIN alleledb.subgroups t2 ON t1.subgroup=t2.subgroup
WHERE t1.gene=SUBSTRING_INDEX(results.allele, "*", 1) AND t1.species=? LIMIT 1);
ALTER TABLE results DROP id, DROP secondid;
SELECT * FROM results ORDER BY subgroup ASC, level ASC;
DROP TABLE results;
I need to go through many databases and join on the same id. The databases are huge, but the results to extract are small (less than 1% of the whole database). Most of the results are stored in the same database, in different rows (with the same id and secondid). However, neither id nor secondid alone is unique to the rows I need to select; only the combination of the two is.
Thank you.
I would start by creating a proper composite index on your database table.
First on
(qualifier, id, secondid, texte)
This will help your joins and the WHERE tests, and the engine does NOT have to go back to the actual table rows for the records, since the index already contains the columns you are interested in.
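To illustrate the "does not have to go back to the table" part, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, so the planner output will look different from a MySQL EXPLAIN. The table name annotations, the extra column and the index name are made up for the example; only the indexed column list comes from the answer above.

```python
import sqlite3

# SQLite is only a stand-in here; the question is about MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table standing in for the poster's `database` table.
CREATE TABLE annotations (
    id TEXT, secondid TEXT, qualifier TEXT, texte TEXT, extra TEXT
);
-- Composite index with the column order suggested above.
CREATE INDEX idx_qual_id_sec_texte
    ON annotations (qualifier, id, secondid, texte);
""")

# Every column the query touches is in the index, so the planner
# can answer it from the index alone.
plan = conn.execute("""
EXPLAIN QUERY PLAN
SELECT id, secondid, texte
FROM annotations
WHERE qualifier = 'allele'
""").fetchall()

# The human-readable description is the fourth column of each plan row.
plan_details = [row[3] for row in plan]
print(plan_details)
```

On MySQL, the rough equivalent is to run EXPLAIN on the real query and check whether the Extra column reports "Using index".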
Next, I would adjust the query/joins. Since you are specifically looking for the "allele" and "organism" qualifiers from t1 and t2 respectively, put those conditions directly on t1 and t2.
I have no idea what you are doing with your REGEXP "X" or "?" values for texte, but you'll figure that out after.
Here is how I would revise the queries
insert into ...
SELECT DISTINCT
t1.id,
t1.secondid,
t1.texte,
t3.texte
FROM
database t1
JOIN database t2
ON t1.id = t2.id
AND t2.qualifier = 'organism'
JOIN database t3 ON
t1.id = t3.id
AND t1.secondid = t3.secondid
AND t3.qualifier = 'level_length'
WHERE
t1.qualifier = 'allele'
AND t1.texte REGEXP ?
-- I would move these t2 and t3 into the respective JOINs above directly.
AND t3.texte NOT REGEXP "X"
AND t3.texte IS NOT NULL
AND t2.texte = ?
GROUP BY
t1.texte;
As for your UPDATE commands, having a second index on (id, secondid) will help on the join to t2 and t3 since there is no qualifier context to the join.
As for your UPDATE commands, as Rick mentioned, without some context of an ORDER BY clause, you have no guarantee WHICH record is returned back by the LIMIT 1.
First of all, thank you for all your help.
My first table (the one used by the insert and the first update, named database) looks like this:
I want everything in red. In other words, I need some parameters that have the same id and secondid as the "level", which is unique within an id, whereas other parameters may be repeated within the same id (but with a different secondid).
I am filtering by allele name (ECK in the EC locus) with the REGEXP, and by species. For example, all alleles from the EC locus of human.
Then (last update), I take one parameter (allele), take a substring of it and look it up in a database that gives me one id (one row -> one id). I then use this id on another database that gives me one or two rows (one subgroup, or rarely two). Since my example only has one group, the missing ORDER BY went unnoticed. But yes, I want to order (get the subgroup that contains the allele first). I don't know how to do that.
Finally, I can try to create an index, but given the size of the db, I'm wondering about the build time and the size of such an index. Would it significantly improve the time, and can I remove it afterwards?
The REGEXP "X" removes every match that is not relevant for this parameter (I don't want them).
The ? is user input (for the species, which occurs twice, and for the locus).
The operations on the first database take 30s; the last operation on the two databases takes 1-2s. The others (drop, select) take <20ms (not the problem).
E.g. in Pandas, we can apply a mask, create a new dataframe and assign it a name. Similarly in SQL, once I do a LEFT JOIN of 2 tables, is there a way to refer to the new combined table?
You can join two tables, get the result as a new combined table, and give that table a name. Try this query, and feel free to ask if anything is unclear.
MYSQL QUERY
EMP(C1, C2, CD1)
DEPT(D1, D2)
SELECT NEWTABLE.First, NEWTABLE.Third
FROM
(SELECT E.C1 AS First, E.C2 AS Second, D.D2 AS Third FROM EMP E, DEPT D WHERE
E.CD1 = D.D1) NEWTABLE
WHERE NEWTABLE.Second > 20;
We have created a virtual (derived) table, "NEWTABLE"; you can give it any name you like.
(SELECT E.C1 AS First, E.C2 AS Second, D.D2 AS Third FROM EMP E, DEPT D WHERE
E.CD1 = D.D1)
This is the subquery where the join is applied: it selects three columns from the two tables and renames them First, Second and Third.
In case the outer query is unclear: the condition NEWTABLE.Second > 20 is applied to the new table obtained from the join.
If anything about the query is still unclear, just ask.
The rows in the new table are temporary; you can use them in that query only.
If you want to store the result permanently, you have to create a new table and fill it from the join.
No, that won't work in SQL, at least not directly.
But you can use a subquery,
like
SELECT aa.*
FROM
(SELECT t1.*,t2.* FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid) aa
or a view:
CREATE VIEW v AS SELECT t1.*,t2.* FROM table1 t1 LEFT JOIN table2 t2 ON t1.id = t2.refid;
A problem arises when both tables have columns with the same name, so you must check for that and, where column names are equal, give the second column an alias.
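As a rough sketch of the aliasing advice, here is the view again with explicit column aliases instead of t1.*, t2.*. It runs against SQLite through Python's sqlite3 module only so that it is self-contained; the name columns and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Both tables have `id` and `name`, which is the clash described above.
CREATE TABLE table1 (id INTEGER, name TEXT);
CREATE TABLE table2 (id INTEGER, refid INTEGER, name TEXT);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
INSERT INTO table2 VALUES (10, 1, 'x');

-- Alias every clashing column so the view has unique column names.
CREATE VIEW v AS
SELECT t1.id   AS t1_id,
       t1.name AS t1_name,
       t2.id   AS t2_id,
       t2.name AS t2_name
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.refid;
""")

# The view can now be referred to by name, like a named dataframe.
view_rows = conn.execute(
    "SELECT t1_id, t1_name, t2_name FROM v ORDER BY t1_id"
).fetchall()
print(view_rows)  # [(1, 'a', 'x'), (2, 'b', None)]
```

The row for id 2 has no match in table2, so the LEFT JOIN fills its table2 columns with NULL (None in Python).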
I just learned how to use JOINS (so please be gentle ;) ), and wrote this query:
SELECT 1
FROM `T1`
INNER JOIN `T2` ON `T1`.`T1_ID` = `T2`.`REQUIREMENT`
WHERE `T2`.`T1_ID` = XXX
AND `T1`.`STATEMENT` = YYY
AND `T2`.`REQUIREMENT` != 0 -- last row does not work as intended!
It works perfectly without the last condition: if T1_ID from T1 does match REQUIREMENT from T2 for given T1_ID (XXX) - it does work. I wanted additional STATEMENT from T1 to be match (YYY) - and it still does work.
But then I realized that I need to handle one more case: when T2.REQUIREMENT is equal to 0, I want this query to return 1 regardless of the JOIN condition. The problem is that if T2.REQUIREMENT = 0, I know for sure that there will not be any T1.T1_ID entry that matches the JOIN condition. So I understand that this last condition cannot work the way I'd like.
What I need is some kind of IF statement. Something that would work like:
SELECT 1
IF (`T2`.`REQUIREMENT`!=0) //if true, don't even go to join, and return 1
OR (my previous join query)
The thing is, that I have no idea how to implement such IF statement into mysql.
Any ideas? Thanks.
Sample data:
T1:
id STATEMENT T1_ID
1 irrelevant 1
2 irrelevant 5
T2:
id T1_ID REQUIREMENT
1 1 0
2 2 0
3 3 1
4 4 3
5 5 4
6 6 5
7 7 6
Such setup should return 1 for T1_ID equal 1, 2, 3, 6.
In addition, if it's even possible in single query, I'd like it to return 1 as well even if T1 was empty, for all T2.REQUIREMENT=0 - in this case T1_ID equal 1, 2.
Just FYI, good start on your post and example... tableName.columnName (or alias.columnName) references should always be provided to prevent ambiguity, since others don't know your table structure. Also, you only really need the tick marks for things like reserved words or column names that contain spaces (which I never like anyhow).
From my reading your question and sample tables T1 and T2... T1 appears to be some Lookup table and has IDs and descriptions associated to said IDs. Your T2 table appears to be your detail/transaction based table and it may or not have an actual requirement hence your desire to always include those records without a specific requirement.
If this is the case, it sounds like you want "all detail records that have some condition REGARDLESS of a matched requirement ID as found in the lookup table." If this is accurate, you would be looking for
Select
T2.T1_ID,
coalesce( T1.Statement, '' ) StatementFromT1Table
from
SomeMainTable T2
LEFT JOIN SomeLookupTable T1
on T2.T1_ID = T1.T1_ID
where
T2.T1_ID = SomeIdParameterValue
AND ( T1.T1_ID is NULL
OR T1.Statement = SomeOtherParameterValue )
For joins, I always try to list the left-side table first, then indent the right-side table, and write the ON condition as left.column = right.column so you always see the relationship and how table A gets to table B (nesting further as other joins come into play).
The difference between (INNER) JOIN and LEFT JOIN is that (INNER) JOIN REQUIRES a record to match in both tables. LEFT JOIN means I want everything from the table on the left side REGARDLESS of an actual match in the right table.
So at this point, I get all rows from the T2 alias table first, regardless of the answer in the T1 alias. Now, how to deal with the zero value. If zero indicates you KNOW there won't be a match in T1, then you can just say I want all records on the T2 side if the T1 side IS NULL... But you also care about a specific statement, hence the OR within the parenthesized test.
The first part of the WHERE covers the case where you are specifically looking for all rows with T1_ID = some value; that is the primary parameter and applies only to the T2 side... the T1 side is qualified via the AND ( null or other equality test ) condition.
If you have some confidential data, it is ok to randomize / provide sample data, but having real table names / context will help us better understand what you are trying to accomplish. Ambiguous table and column names make the problem harder to follow, and with a better understanding we might find better query solutions.
If I am close and you need additional clarification, please advise and/or edit your original post with additional sample data and final output... such as parameters being filtered for too.
POST CLARIFICATION.
Per your comments, here are the clarifications...
The "SomeMainTable T2" is actually a breakdown of the actual table name within your database and "T2" is the ALIAS reference. Imagine your table name is "SomethingReallyLongDetail". Would you prefer to write your query something like
select
SomethingReallyLongDetail.Column1,
SomethingReallyLongDetail.Column7,
SomethingReallyLongDetail.Column20
from
SomethingReallyLongDetail
OR...
select
SRL.Column1,
SRL.Column7,
SRL.Column20
from
SomethingReallyLongDetail SRL
In this case, I used an alias "SRL" to more easily correlate to the table name as an acronym / abbreviation vs having to type the long value over and over, then have more chance of typing mistakes. Simply for readability providing the "alias" reference within the query. So, I did not know your ACTUAL table name, so I made it up but using the "T1" and "T2" references to stay in-line with your original post.
Next, COALESCE(). Since this query does a LEFT-JOIN, The right-side table may (or not) actually have a record match on the ID as you know might not always exist. Since I was trying to pull the "Statement" column from that second table (alias T1), that description could be NULL which you probably would not want to show in any sort of output. To prevent that, COALESCE() says, give me the value from the first parameter in the list... If that value is null, give me the second value. In this case the second value is just an empty string.
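A tiny sketch of that COALESCE() behaviour, run on SQLite via Python's sqlite3 module only to keep it self-contained. The two-row sample data is made up; it is not the data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (T1_ID INTEGER, STATEMENT TEXT);
CREATE TABLE T2 (T1_ID INTEGER);
INSERT INTO T1 VALUES (1, 'has a statement');
INSERT INTO T2 VALUES (1), (2);   -- 2 has no match in T1
""")

# Second column: raw value from the LEFT JOIN (NULL when unmatched).
# Third column: the same value passed through COALESCE(..., '').
coalesce_rows = conn.execute("""
SELECT T2.T1_ID, T1.STATEMENT, COALESCE(T1.STATEMENT, '')
FROM T2
LEFT JOIN T1 ON T2.T1_ID = T1.T1_ID
ORDER BY T2.T1_ID
""").fetchall()
print(coalesce_rows)
# [(1, 'has a statement', 'has a statement'), (2, None, '')]
```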
Parameters in the query. Your original query had reference to XXX and YYY such as you knew of a specific T1.ID value you wanted to narrow down to pulling out, but a different value YYY as being part of the statement. So the place where you had an "XXX", I just put a place-holder here for you to apply/put any value you were specifically looking for. Similarly for your "YYY" value, another place-holder for that. Just substitute whatever criteria you were looking for.
Finally that AND part of the where clause. This is for the condition of the LEFT-JOIN. Since you KNOW that not all records will have a match in the "T1" secondary table, with the LEFT JOIN, the ID will be found and HAVE a value, or there will not be a value and thus NULL.
If there is no matching record, you would never be able to compare some string, int, date, whatever to a column as it would be null. So I am doing
(T1.T1_ID IS NULL -- as in there was no match
OR T1.Statement = SomeOtherParameterValue ) -- there WAS a match, and I only want where the statement equals a given value.
Per your comments and example results, your query SHOULD be simplified to...
Select
T2.ID,
T2.T1_ID,
T2.Requirement,
coalesce( T1.Statement, '' ) StatementFromT1Table
from
T2
LEFT JOIN T1
on T2.Requirement = T1.T1_ID
where
T2.Requirement = 0
OR T1.T1_ID IS NOT NULL
In your case, the final answer is... I want all records where there is no requirement (thus = 0) OR the record DOES have a match in the T1 table (thus T1.T1_ID IS NOT NULL)
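To check this against the sample data from the question, here is a small sketch that loads the two tables into SQLite (through Python's sqlite3 module, as a stand-in for MySQL) and runs the simplified query. The ORDER BY and the reduced select list are additions for the demo only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (id INTEGER, STATEMENT TEXT, T1_ID INTEGER);
CREATE TABLE T2 (id INTEGER, T1_ID INTEGER, REQUIREMENT INTEGER);
-- Sample data copied from the question.
INSERT INTO T1 VALUES (1, 'irrelevant', 1), (2, 'irrelevant', 5);
INSERT INTO T2 VALUES (1, 1, 0), (2, 2, 0), (3, 3, 1), (4, 4, 3),
                      (5, 5, 4), (6, 6, 5), (7, 7, 6);
""")

QUERY = """
SELECT T2.T1_ID
FROM T2
LEFT JOIN T1 ON T2.REQUIREMENT = T1.T1_ID
WHERE T2.REQUIREMENT = 0        -- no requirement at all
   OR T1.T1_ID IS NOT NULL      -- requirement exists in T1
ORDER BY T2.T1_ID
"""

matched = [row[0] for row in conn.execute(QUERY)]
print(matched)  # [1, 2, 3, 6]

# With T1 emptied, only the REQUIREMENT = 0 rows remain.
conn.execute("DELETE FROM T1")
matched_when_t1_empty = [row[0] for row in conn.execute(QUERY)]
print(matched_when_t1_empty)  # [1, 2]
```

Both results match what the question asked for: 1, 2, 3, 6 with the sample T1, and 1, 2 when T1 is empty.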
I am thinking that you want a LEFT JOIN:
SELECT 1
FROM `T2` LEFT JOIN
`T1`
ON `T1`.`T1_ID` = `T2`.`REQUIREMENT`
WHERE (`T2`.`REQUIREMENT` <> 0) OR
(`T1`.`STATEMENT` = YYY AND `T2`.`T1_ID` = XXX);
This returns rows for all T2 values where REQUIREMENT != 0. It also returns the rows generated by the JOIN. Of course 1 is not very descriptive, so you won't be able to tell which rows are which.
Your question would be much easier to follow with sample data and desired results.
if T2.REQUIREMENT=0 I know for sure, that there will not be any
T1.T1_ID entry that will match the JOIN requirements
So in order to get 1 returned when T2.REQUIREMENT=0, the join must match this condition too:
SELECT 1
FROM `T1` INNER JOIN `T2`
ON `T1`.`T1_ID` = `T2`.`REQUIREMENT` OR `T2`.`REQUIREMENT`=0
WHERE `T2`.`T1_ID`=XXX
AND `T1`.`STATEMENT`=YYY
Edit:
or just append 1s with UNION for all rows that have T2.REQUIREMENT=0:
SELECT 1
FROM `T1` INNER JOIN `T2`
ON `T1`.`T1_ID` = `T2`.`REQUIREMENT`
WHERE `T2`.`T1_ID`=XXX
AND `T1`.`STATEMENT`=YYY
UNION ALL
SELECT 1
FROM `T2`
WHERE `T2`.`REQUIREMENT`=0
this will work even if T1 is empty.
My table has columns labeled primary_key and summary_id. The value in the second field, summary_id, in each record maps to the primary_key field of another record. There is a third field, template_id. I need to select those records for which:
template_id is a certain value. Let's say 4.
primary_key matches at least one of the records' summary_id field.
Please don't tell me to redesign the tables. My next project will have a better design, but I don't have time for that now. I need to do this with one or more queries; the fewer the better. Ideally, there's some way to do this with one query, but I'm okay if it requires more.
This is how far I've gotten with my own query. (I know it's seriously lacking, which is why I need help.)
SELECT DISTINCT esjp_content.template_id
FROM esjp_content
INNER JOIN esjp_hw_config ON esjp_content.template_id = esjp_hw_config.proc_id
INNER JOIN esjp_assets ON esjp_hw_config.primary_key = esjp_assets.hw_config_id
WHERE
esjp_content.summary_id > 0
AND
(esjp_assets.asset_label='C001498500' OR esjp_assets.asset_label='H0065' OR esjp_assets.asset_label='L0009');
SELECT
esjp_content.primary_key, esjp_content.template_id, esjp_content.content, esjp_content.summary_id
FROM
esjp_content
WHERE
esjp_content.template_id = 4;
I need the records that summary_id points to. For example, if summary_id is 90, then I need the record where primary_key is 90.
You're looking for the existence of at least one row where summary_id = your primary key. like this.
SELECT *
FROM esjp_content c
WHERE template_id = 4
AND EXISTS (SELECT 1 FROM esjp_content c2 WHERE c2.summary_id = c.primary_key)
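Here is a quick sketch of that EXISTS query on a handful of made-up rows, run on SQLite through Python's sqlite3 module. Only the table and column names come from the question; the data is invented to cover the three interesting cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE esjp_content (
    primary_key INTEGER, template_id INTEGER, content TEXT, summary_id INTEGER
);
INSERT INTO esjp_content VALUES
    (90, 4, 'template 4, referenced by row 92', 0),
    (91, 4, 'template 4, never referenced',     0),
    (92, 7, 'detail pointing at 90',            90),
    (93, 5, 'template 5, referenced by row 94', 0),
    (94, 7, 'detail pointing at 93',            93);
""")

# Keep template 4 rows that at least one other row points to.
referenced = [row[0] for row in conn.execute("""
SELECT c.primary_key
FROM esjp_content c
WHERE c.template_id = 4
  AND EXISTS (SELECT 1 FROM esjp_content c2
              WHERE c2.summary_id = c.primary_key)
ORDER BY c.primary_key
""")]
print(referenced)  # [90]
```

Row 91 is dropped because nothing references it, and row 93 is dropped because it has the wrong template_id.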
You can JOIN same table by using both IDs:
SELECT
t1.*
FROM
esjp_content t1
INNER JOIN esjp_content t2 ON t1.summary_id = t2.primary_key
WHERE
t1.template_id = 4
This is probably something simple, but I can't wrap my head around it. I've tried IN, NOT EXISTS, EXCEPT, etc. and still can't seem to get this right.
I have two tables.
Table A
-----------
BK
NUM
Table B
------------
BK
NUM
How do I write a query to remove all records from table A, that are not in table B based on the two fields. So if Table A has a record where BK = 1 and NUM = 2, then it should look in table B. If table B also has a record where BK = 1 and NUM = 2 then do nothing, but if not, delete that record from table A. Does that make sense?
Any help is much appreciated.
You can do so
delete from tablea
where (BK,NUM) not in
(select BK,NUM from tableb)
using exists
delete from tablea a
where not exists
(select 1 from tableb where BK=a.BK and NUM = a.NUM)
Another alternative is to use an anti-join pattern, a LEFT [OUTER] JOIN and then a predicate in the WHERE clause that filters out all matches.
It's easiest to write this as a SELECT first, test it, and then convert to a DELETE.
SELECT t.*
FROM tablea t
LEFT
JOIN tableb s
ON s.BK = t.BK
AND s.NUM = t.NUM
WHERE s.BK IS NULL
The LEFT JOIN returns all rows from t along with matching rows from s. The "trick" is the predicate in the WHERE clause... we know that s.BK will be non-NULL on all matching rows (because the value had to satisfy an equality comparison, in a predicate in the ON clause). So s.BK will be NULL only for rows in t that didn't have a matching row in s.
For MySQL, changing that into a DELETE statement is easy: just replace the SELECT keyword with DELETE. (We could write either DELETE t or DELETE t.*; either of those will work.)
(This is an illustration of only one (of several) possible approaches.)
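The "write it as a SELECT first, test it, then delete" workflow can be sketched as below. This runs on SQLite via Python's sqlite3 module, and SQLite does not accept the MySQL multi-table DELETE t ... LEFT JOIN form, so the delete step uses the NOT EXISTS variant from the earlier answer instead. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablea (BK INTEGER, NUM INTEGER);
CREATE TABLE tableb (BK INTEGER, NUM INTEGER);
INSERT INTO tablea VALUES (1, 2), (1, 3), (2, 2);
INSERT INTO tableb VALUES (1, 2), (2, 9);
""")

# Step 1: anti-join SELECT to preview the rows that would be deleted.
to_delete = conn.execute("""
SELECT t.BK, t.NUM
FROM tablea t
LEFT JOIN tableb s
       ON s.BK = t.BK
      AND s.NUM = t.NUM
WHERE s.BK IS NULL
ORDER BY t.BK, t.NUM
""").fetchall()
print(to_delete)  # [(1, 3), (2, 2)]

# Step 2: delete the same rows (NOT EXISTS form, which SQLite accepts).
conn.execute("""
DELETE FROM tablea
WHERE NOT EXISTS (SELECT 1 FROM tableb b
                  WHERE b.BK = tablea.BK AND b.NUM = tablea.NUM)
""")
remaining = conn.execute("SELECT BK, NUM FROM tablea").fetchall()
print(remaining)  # [(1, 2)]
```

Only (1, 2) survives, because it is the only row of tablea with the same BK and NUM in tableb.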