I have table:
id | parent | regno | person
1 | 0 | 12 | 5
2 | 1 | 12 | 15
3 | 0 | 13 | 5
4 | 0 | 14 | 6
I have MySQL query...
SELECT *
FROM table
WHERE person='5';
...that returns rows 1 and 3.
In this table, rows 1 and 2 are related (same regno).
How can I build this query to include related rows?
Basically, when searching for person 5, I need the MySQL query to return the following:
id | parent | regno | person
1 | 0 | 12 | 5
2 | 1 | 12 | 15
3 | 0 | 13 | 5
The parent column holds the id of the row it is related to, but it can be a positive or a negative integer. All related rows always have the same regno.
Thank you.
You want all people who have a regno that is the same as the regno of anyone who is person 5:
--this main query finds all people with the regno from the subquery
SELECT *
FROM table
WHERE regno IN
( --this subquery finds the list of regno
SELECT regno
FROM table
WHERE person = '5'
)
There are other ways to write this; I'm not a fan of IN, and personally would write it like this:
SELECT t.*
FROM table t
INNER JOIN
(
SELECT DISTINCT regno
FROM table
WHERE person = '5'
) u
ON t.regno = u.regno
But it's harder to understand, and it's quite likely that these queries would end up being executed identically internally anyway. In this form the DISTINCT is required to make the regno values from the subquery unique; without it, joined rows would end up duplicated.
Why do I prefer it over IN? In some database systems, IN's implementation can be very naive and perform poorly. "Never use IN to create a list longer than you would write by hand" is an old mantra I tend to stick to. This join pattern is also more flexible because it can work with multiple values: not every database supports the Oracle-esque multi-value form where (x, y) in ((1,3),(3,4)).
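To make the point about DISTINCT concrete, here is a minimal sketch that runs both forms against the question's sample data. It uses Python's sqlite3 module as a stand-in for MySQL, and the table name regs and the extra row with id 5 are assumptions made only for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; "regs" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regs (id INTEGER, parent INTEGER, regno INTEGER, person INTEGER)")
conn.executemany(
    "INSERT INTO regs VALUES (?, ?, ?, ?)",
    [(1, 0, 12, 5), (2, 1, 12, 15), (3, 0, 13, 5), (4, 0, 14, 6)],
)

# IN form: every row whose regno belongs to person 5.
in_rows = conn.execute("""
    SELECT id FROM regs
    WHERE regno IN (SELECT regno FROM regs WHERE person = 5)
    ORDER BY id
""").fetchall()

# JOIN form against a de-duplicated list of regno values.
join_rows = conn.execute("""
    SELECT t.id
    FROM regs t
    INNER JOIN (SELECT DISTINCT regno FROM regs WHERE person = 5) u
        ON t.regno = u.regno
    ORDER BY t.id
""").fetchall()

# Hypothetical extra row: person 5 appears a second time under regno 12.
conn.execute("INSERT INTO regs VALUES (5, 1, 12, 5)")

# Without DISTINCT the derived table lists regno 12 twice, so matches are doubled.
rows_without_distinct = conn.execute("""
    SELECT COUNT(*)
    FROM regs t
    INNER JOIN (SELECT regno FROM regs WHERE person = 5) u ON t.regno = u.regno
""").fetchone()[0]

rows_with_distinct = conn.execute("""
    SELECT COUNT(*)
    FROM regs t
    INNER JOIN (SELECT DISTINCT regno FROM regs WHERE person = 5) u ON t.regno = u.regno
""").fetchone()[0]

print(in_rows, join_rows, rows_without_distinct, rows_with_distinct)
```

On the original four rows both forms return ids 1, 2 and 3. Once person 5 shows up twice under the same regno, only the DISTINCT version keeps one output row per matching row.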
As an aside (and partly in response to the first comment on this answer), it would be more typical and more useful to have the database prepare a set of rows that has parent and child data on the same line.
It would look more like this:
SELECT *
FROM
table c
LEFT OUTER JOIN
table p
ON c.regno = p.regno AND p.parent = 1
WHERE c.person = '5' AND c.parent=0
This assumes your "parent" column is 0 or 1, indicating false or true. You seem to have commented that parent is the id of the relative (I'm not sure whether that means parent-of or parent-is).
For a table with an id column and a parentid column, where parentid is set to a value when the row is a child of that other id:
id, parentid, name
1, null, Daddy
2, 1, Little Jonny
3, 1, Little Sarah
That looks like:
SELECT *
FROM
table c
INNER JOIN
table p
ON c.parentid = p.id
WHERE p.parentid IS NULL
Rows can have only one parent. A NULL in the parent id defines the row as being a parent; otherwise it's a child. You could turn this logic on its head if you wanted: call the column isparentof, leave it NULL in all child rows, and for anyone who is a parent of a child, put the child id in isparentof. This then limits you to one child per set of parents (single-child families). The query to pull them out is broadly the same.
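As a rough check of the id/parentid pattern, the sketch below loads the three example rows and pairs each child with its parent on one line. SQLite (through Python's sqlite3) stands in for MySQL here, and the table name family is an assumption.

```python
import sqlite3

# "family" is a hypothetical name for the id / parentid / name table above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE family (id INTEGER PRIMARY KEY, parentid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO family VALUES (?, ?, ?)",
    [(1, None, "Daddy"), (2, 1, "Little Jonny"), (3, 1, "Little Sarah")],
)

# c is the child row, p is the row its parentid points at.
pairs = conn.execute("""
    SELECT c.name, p.name
    FROM family c
    INNER JOIN family p ON c.parentid = p.id
    WHERE p.parentid IS NULL
    ORDER BY c.id
""").fetchall()

print(pairs)
```

Because it is an INNER JOIN, the parent row itself (which has no parent) is not returned as a child; switch to a LEFT JOIN from the parent side if parents without children should appear too.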
You can get all the id values for the person = '5' in a Derived Table.
Now, join back to the main table, matching either the absolute of parent (to get the child row(s)) or the id (to get the parent id row itself).
Based on the discussion in the comments, try:
SELECT t.*
FROM your_table AS t
JOIN
(
SELECT id AS parent_id
FROM your_table
WHERE person = '5'
) AS dt
ON dt.parent_id = ABS(t.parent) OR
dt.parent_id = t.id
It is hard to comprehend, though, why you would put negative values in parent!
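Here is a small sketch of how the ABS() join behaves, again using SQLite via Python's sqlite3 in place of MySQL. The table name regs and the extra row with a negative parent (id 5) are assumptions added only to exercise the ABS() branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regs (id INTEGER, parent INTEGER, regno INTEGER, person INTEGER)")
conn.executemany(
    "INSERT INTO regs VALUES (?, ?, ?, ?)",
    [
        (1, 0, 12, 5),
        (2, 1, 12, 15),
        (3, 0, 13, 5),
        (4, 0, 14, 6),
        (5, -3, 13, 20),  # hypothetical child of row 3, stored with a negative parent
    ],
)

related = conn.execute("""
    SELECT t.id, t.parent, t.regno, t.person
    FROM regs AS t
    JOIN (SELECT id AS parent_id FROM regs WHERE person = 5) AS dt
        ON dt.parent_id = ABS(t.parent)   -- child rows, whatever the sign
        OR dt.parent_id = t.id            -- the person's own rows
    ORDER BY t.id
""").fetchall()

print(related)
```

If a row for person 5 could itself be a child of another row for person 5, it would match twice, so adding DISTINCT to the outer SELECT would be a reasonable precaution.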
I'd need a query to SELECT all of these entries that are not repeated with another value. I explain the case in the next lines.
The situation
I've got a table of items and values. Each item can be repeated with different values. Let's say I have the following set of records in Table B:
item_id | value | type_value
ID Item A | 0 | 0
ID Item B | 0 | 0
ID Item A | 1 | 0
ID Item C | 1 | 1
These items, as you have probably guessed, are IDs, so the "original" items with their information are in another table. What I'm trying to do is select from the "original" table those items that appear in this second table.
What I need
As I said before, I need to select from Table A all those items whose IDs are in Table B, but only those which have the value set to 0 and for which no other record with the same "type_value" is set to 1.
Because of the "original" table, I need to do so in a WHERE clause with an inner SELECT. The expected output, in this case, would be:
item_id | value | type_value
ID Item B | 0 | 0
ID Item C | 1 | 1
If we decided to only SELECT those with a specified type_value, I know how to do that, so do not worry about it.
The problem
I am able to do so, at least almost. My problem comes when I have the same item_id with different value fields, so when I try to say "WHERE value != 1", for example, this still gets selected as there is another record with value = 0.
The question
How could I SELECT the rows I'd like in an inner select in a WHERE clause of a main query, without having to repeat the whole SELECT with a NOT IN and adding a "WHERE value = 1" to exclude those that have that value?
As the main query can be long and complex, I'd like to keep it as simple as possible. Of course, as I said before, I can copy the whole query, select those with the value I do not want, and put an "AND NOT IN" before that SELECT. But that would be repeated code, and I think performance could be affected.
Thanks to all of you for your time!
If you need further explanation, please, let me know!
EDIT
Table_A
+----+--------+
| id | name |
+----+--------+
| 1 | Item A |
+----+--------+
| 2 | Item B |
+----+--------+
| 3 | Item C |
+----+--------+
Table_B
+---------+-------+
| item_id | value |
+---------+-------+
| 1 | 0 |
+---------+-------+
| 2 | 0 |
+---------+-------+
| 1 | 1 |
+---------+-------+
| 3 | 1 |
+---------+-------+
Sample query
SELECT name
FROM Table_A
WHERE id IN (SELECT item_id
FROM Table_B
WHERE "item_value is equal to 0 and no other row has this item_id with a item_value different from 0")
Result query
+---------+
| name |
+---------+
| Item B |
+---------+
You could do it by using GROUP BY and HAVING, for example:
SELECT a.*, SUM(b.value) AS vc FROM Table_B b
INNER JOIN Table_A a ON a.id = b.item_id
GROUP BY a.id, a.name
HAVING vc = 0
I used a really simple query structure to help you understand the solution.
There might be better queries to do the job.
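One possible variant that keeps the asker's WHERE id IN (...) shape is to aggregate inside the subquery, so the "no other value" rule is checked in a single pass over Table_B. This is only a sketch, run against the EDIT's sample tables with SQLite (via Python's sqlite3) standing in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_A (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Table_B (item_id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO Table_A VALUES (?, ?)",
                 [(1, "Item A"), (2, "Item B"), (3, "Item C")])
conn.executemany("INSERT INTO Table_B VALUES (?, ?)",
                 [(1, 0), (2, 0), (1, 1), (3, 1)])

# An item qualifies only if every one of its Table_B rows has value 0,
# i.e. both the smallest and the largest value for that item are 0.
names = [row[0] for row in conn.execute("""
    SELECT name
    FROM Table_A
    WHERE id IN (SELECT item_id
                 FROM Table_B
                 GROUP BY item_id
                 HAVING MIN(value) = 0 AND MAX(value) = 0)
    ORDER BY name
""")]

print(names)
```

Checking both MIN and MAX avoids relying on SUM(value) = 0, which would also be satisfied by values that cancel out (for example -1 and 1).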
Here goes:
First you need to get the rows that you do not want to appear.
This is how it is done:
SELECT item_id, COUNT(item_id) how_many
FROM my_table
GROUP BY item_id
HAVING (how_many > 1)
Now you need to select from the table the rows whose item_id does not appear in the above query. It is done this way:
SELECT T1.item_id, T1.value, T1.type_value
FROM my_table T1
WHERE T1.item_id NOT IN (
SELECT T.item_id
FROM (SELECT item_id, COUNT(item_id) how_many FROM my_table GROUP BY item_id HAVING (how_many > 1)) T
)
You can see that I used the "not in" operator and that I named the previous result table T.
Make sure you understand the above
Ask if you need more info
Example dataset:
id | tag
---|------
1 | car
1 | bike
2 | boat
2 | bike
3 | plane
3 | car
id and tag are both indexed.
I am trying to get the id who matches the tags [car, bike] (the number of tags can vary).
A naive query to do so would be:
SELECT id
FROM test
WHERE tag = 'car'
OR tag = 'bike'
GROUP BY id
HAVING COUNT(*) = 2
However, doing so is quite inefficient because of the GROUP BY and the fact that any row that matches one tag is taken into account for the GROUP BY (and I have a large volume of data).
Is there a more efficient query for this situation?
The only solution I see would be to have another table containing something like:
id | hash
---|------
1 | car,bike
2 | boat,bike
3 | plane,car
But this is not an easy solution to implement and maintain up to date.
Additional info:
the name matching must be exact (no fulltext index)
the number of tags is not always 2
try this:
SELECT id
FROM test
WHERE tag in('car','bike')
GROUP BY id
HAVING COUNT(*) = 2
And create a nonclustered index on tag column
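Since the number of tags can vary, the IN list and the expected count can both be built from the same input. The helper below is a hypothetical sketch (the function name ids_with_all_tags is made up), run on the question's sample data with SQLite via Python's sqlite3; it uses COUNT(DISTINCT tag) so that a repeated (id, tag) row would not inflate the count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, tag TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [
    (1, "car"), (1, "bike"),
    (2, "boat"), (2, "bike"),
    (3, "plane"), (3, "car"),
])

def ids_with_all_tags(conn, tags):
    """Return the ids that carry every tag in `tags` (hypothetical helper)."""
    wanted = sorted(set(tags))
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        f"SELECT id FROM test WHERE tag IN ({placeholders}) "
        "GROUP BY id HAVING COUNT(DISTINCT tag) = ? ORDER BY id"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

car_and_bike = ids_with_all_tags(conn, ["car", "bike"])
bike_only = ids_with_all_tags(conn, ["bike"])
print(car_and_bike, bike_only)
```

The tags are passed as bound parameters rather than concatenated into the SQL string, so only the number of placeholders depends on the input.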
Here you go:
select id from TEST where tag = 'car' and ID in (select id from TEST where tag='bike')
Not sure if I understand you, but try this:
select tag, count(*) as amount
into #temp
from MYTABLE
group by tag
select t1.tag
from #temp t1 join #temp t2 on t1.amount=t2.amount and t1.tag=t2.tag and t1.amount=2
This should return bike and car, since they both have 2 rows.
I have my table: call it tblA. This table has three columns: id, sub-id, and visibility.
sub-id is the primary key (it defines taxonomies for id). I'm trying to build a query that selects every id that appears fewer than three times.
Here is an example query and its result:
select * from tbla where id = 188002;
+--------+--------+-------------+
| sub-id | id | visibility |
+--------+--------+-------------+
| 284922 | 188002 | 2 |
| 284923 | 188002 | 2 |
| 284924 | 188002 | 0 |
+--------+--------+-------------+
From what I've seen here and here, it looks like I need to join the table on... itself. I don't really understand what that accomplishes.
If anyone has insight into this, it is appreciated. I will continue to research it and update this topic with any additional information I come across.
Thanks
SELECT id
FROM tbla
GROUP BY id
HAVING COUNT(*) < 3
If you want to select all columns from the table, you will have to use @Joe's query in a sub-select:
SELECT * FROM tbla a
WHERE a.id IN (SELECT DISTINCT b.id
FROM tbla b
GROUP BY b.id
HAVING COUNT(*) < 3)
This query first selects all ids that appear fewer than 3 times.
The DISTINCT is redundant because GROUP BY already returns each id once; the query works the same without it.
Next it selects all rows whose id meets the criteria in the sub-select, i.e. ids that appear fewer than 3 times.
The reason that you cannot do this in one go is that the GROUP BY heaps all rows with the same id together into one super-row (for want of a better metaphor).
You cannot separate out the columns that are not in the GROUP BY clause.
The outer select solves this.
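A quick way to see the two-step behaviour is to run it on a handful of rows. In this sketch (SQLite via Python's sqlite3 as a stand-in for MySQL) the column is spelled sub_id instead of sub-id, and the two rows for id 188003 are invented so that one id falls under the limit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbla (sub_id INTEGER PRIMARY KEY, id INTEGER, visibility INTEGER)")
conn.executemany("INSERT INTO tbla VALUES (?, ?, ?)", [
    (284922, 188002, 2),
    (284923, 188002, 2),
    (284924, 188002, 0),
    (284925, 188003, 1),  # hypothetical rows: this id appears only twice
    (284926, 188003, 0),
])

# Step 1: the grouped query yields one row per qualifying id.
few_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tbla GROUP BY id HAVING COUNT(*) < 3"
)]

# Step 2: the outer query returns the full rows for those ids.
full_rows = conn.execute("""
    SELECT sub_id, id, visibility
    FROM tbla
    WHERE id IN (SELECT id FROM tbla GROUP BY id HAVING COUNT(*) < 3)
    ORDER BY sub_id
""").fetchall()

print(few_ids, full_rows)
```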
For simplicity, I will give a quick example of what I am trying to achieve:
Table 1 - Members
ID | Name
--------------------
1 | John
2 | Mike
3 | Sam
Table 2 - Member_Selections
ID | planID
--------------------
1 | 1
1 | 2
1 | 1
2 | 2
2 | 3
3 | 2
3 | 1
Table 3 - Selection_Details
planID | Cost
--------------------
1 | 5
2 | 10
3 | 12
When I run my query, I want to return the sum of all member selections grouped by member. The issue I face, however (see the Table 2 data), is that some members may have duplicate information within the system by mistake. While we do our best to filter this data up front, sometimes it slips through the cracks, so when I make the necessary calls to the system to pull information, I also want to filter this data.
the results SHOULD show:
Results Table
ID | Name | Total_Cost
-----------------------------
1 | John | 15
2 | Mike | 22
3 | Sam | 15
but instead John shows as $20 because he has plan ID #1 inserted twice by mistake.
My query is currently:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
Adding DISTINCT s.planID filters the results incorrectly as it will only show a single PlanID 1 sold (even though members 1 and 3 bought it).
Any help is appreciated.
EDIT
There is also another table I forgot to mention which is the agent table (the agent who sold the plans to members).
the final group by statement groups ALL items sold by the agent ID (which turns the final results into a single row).
Perhaps the simplest solution is to put a unique composite key on the member_selections table:
alter table member_selections add unique key ms_key (ID, planID);
which would prevent any records from being added where the unique combo of ID/planID already exists elsewhere in the table. That'd allow only a single (1,1).
Comment follow-up:
Just saw your comment about the 'alter ignore...'. That'd work fine, but you'd still be left with the bad duplicates in the table. I'd suggest adding the unique key, then manually cleaning up the table. The query I put in the comments should find all the duplicates for you, which you can then weed out by hand. Once the table's clean, there'll be no need for the duplicate-handling version of the query.
Use UNIQUE keys to prevent accidental duplicate entries. This will eliminate the problem at the source, instead of when it starts to show symptoms. It also makes later queries easier, because you can count on having a consistent database.
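To illustrate what the unique key buys you, the sketch below declares the same (ID, planID) constraint and tries to insert John's plan 1 twice. It runs on SQLite via Python's sqlite3, so the error type shown is SQLite's; MySQL reports a duplicate-key error in its own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same idea as the ALTER TABLE above, declared at creation time.
conn.execute("""
    CREATE TABLE member_selections (
        ID INTEGER,
        planID INTEGER,
        UNIQUE (ID, planID)
    )
""")
conn.execute("INSERT INTO member_selections VALUES (1, 1)")
conn.execute("INSERT INTO member_selections VALUES (1, 2)")

try:
    # Second (1, 1): the accidental duplicate from the question.
    conn.execute("INSERT INTO member_selections VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM member_selections").fetchone()[0]
print(duplicate_rejected, row_count)
```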
What about:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN
(select distinct ID, PlanID from member_selections) s
USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
By the way, is there a reason you don't have a primary key on member_selections that will prevent these duplicates from happening in the first place?
You can add a GROUP BY clause to the inner query that groups by the member, the plan and the cost, so that it returns only unique rows. (I also changed 'premium' to 'cost' to match your example tables, and dropped the agent part.) The plan is part of the grouping so that two different plans with the same cost are not collapsed into one.
SELECT
sq.ID,
sq.name,
SUM(sq.Cost) AS total_cost
FROM
(
SELECT
m.id,
m.name,
g.Cost
FROM
members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
GROUP BY
m.ID,
m.NAME,
s.planID,
g.Cost
) sq
group by
sq.ID,
sq.NAME
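As a sanity check on the numbers in the question, this sketch loads the three sample tables and compares the plain sum with a sum over de-duplicated (ID, planID) pairs. It uses SQLite via Python's sqlite3 rather than MySQL and leaves the agent table out, as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE member_selections (ID INTEGER, planID INTEGER)")
conn.execute("CREATE TABLE selection_details (planID INTEGER PRIMARY KEY, Cost INTEGER)")
conn.executemany("INSERT INTO members VALUES (?, ?)", [(1, "John"), (2, "Mike"), (3, "Sam")])
conn.executemany("INSERT INTO member_selections VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 1), (2, 2), (2, 3), (3, 2), (3, 1)])
conn.executemany("INSERT INTO selection_details VALUES (?, ?)", [(1, 5), (2, 10), (3, 12)])

# Plain join: John's duplicated plan 1 is counted twice.
naive_totals = conn.execute("""
    SELECT m.ID, m.Name, SUM(g.Cost)
    FROM members m
    JOIN member_selections s ON s.ID = m.ID
    JOIN selection_details g ON g.planID = s.planID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

# De-duplicate the (ID, planID) pairs before joining to the costs.
clean_totals = conn.execute("""
    SELECT m.ID, m.Name, SUM(g.Cost)
    FROM members m
    JOIN (SELECT DISTINCT ID, planID FROM member_selections) s ON s.ID = m.ID
    JOIN selection_details g ON g.planID = s.planID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

print(naive_totals)
print(clean_totals)
```

De-duplicating on (ID, planID) rather than on the cost keeps two different plans that happen to cost the same as separate purchases.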
Edited heavily!
The original question was based on a misunderstanding of how IN() treats a column from the result set of a join. I thought IN( some_join.some_column ) would treat a result column as a list and loop through every row in place. It turns out it only looks at the value in the current row.
So, the adapted question: Is there anything in MySQL that can loop through a column of results from a join from a WHERE clause?
Here's the super-simplified code I'm working with, stripped down from a complex crm search function. The left join and general idea are relics from that query. So for this query, it has to be an exclusive search - finding people with ALL specified tags, not just any.
First the DB
Table 1: Person
+----+------+
| id | name |
+----+------+
| 1 | Bob |
| 2 | Jill |
+----+------+
Table 2: Tag
+-----------+--------+
| person_id | tag_id |
+-----------+--------+
| 1 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 3 |
+-----------+--------+
Nice and simple. So, naturally:
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person LEFT JOIN tag ON person.id = tag.person_id GROUP BY name;
+------+--------------------------+
| name | GROUP_CONCAT(tag.tag_id) |
+------+--------------------------+
| Bob | 1,2 |
| Jill | 2,3 |
+------+--------------------------+
So far so good. So what I'm looking for is something that would find only Bob in the first case and only Jill in the second, without using HAVING COUNT(DISTINCT ...) because that doesn't work in the broader query (there's a separate tags inheritance cache and a ton of other stuff).
Here are my original sample queries, based on the false idea that IN() would loop through all rows at once.
SELECT DISTINCT name FROM person LEFT JOIN tag ON person.id = tag.person_id
WHERE ( ( 1 IN (tag.tag_id) ) AND ( 2 IN (tag.tag_id) ) );
Empty set (0.00 sec)
SELECT DISTINCT name FROM person LEFT JOIN tag ON person.id = tag.person_id
WHERE ( ( 2 IN (tag.tag_id) ) AND ( 3 IN (tag.tag_id) ) );
Empty set (0.00 sec)
Here's my latest failed attempt, to give an idea of what I'm aiming for...
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person LEFT JOIN tag ON person.id = tag.person_id
GROUP BY person.id HAVING ( ( 1 IN (GROUP_CONCAT(tag.tag_id) ) ) ) AND ( 2 IN (GROUP_CONCAT(tag.tag_id)) );
Empty set (0.00 sec)
So it seems it's taking a GROUP_CONCAT string, of either 1,2 or 2,3, and is treating it as a single entity rather than an expression list. Is there any way to turn a grouped column into an expression list that IN () or =ANY() will treat as a list?
Essentially, I'm trying to make IN() loop iteratively over something that resembles an array or a dynamic expression list, which contains all the rows of data that come from a join.
Think about what your code is doing logically:
( 1 IN (tag.tag_id) ) AND ( 2 IN (tag.tag_id) )
is equivalent to
( 1 = (tag.tag_id) ) AND (2 = (tag.tag_id) )
There's no way tag.tag_id can satisfy both conditions at the same time, so the AND is never true.
It looks like the OR version you cited in your question is the one you really want:
SELECT DISTINCT name FROM person LEFT JOIN tag ON person.id = tag.person_id
WHERE ( ( 1 IN (tag.tag_id) ) OR ( 2 IN (tag.tag_id) ) );
Using the IN clause more appropriately, you could write that as:
SELECT DISTINCT name FROM person LEFT JOIN tag ON person.id = tag.person_id
WHERE tag.tag_id in (1,2);
One final note: because you're referencing a column from the LEFT JOINed table in your WHERE clause (tag.tag_id), you're really forcing that to behave like an INNER JOIN. To truly get a LEFT JOIN, you'd need to move the criteria out of the WHERE and make it part of the JOIN conditions instead:
SELECT DISTINCT name FROM person LEFT JOIN tag ON person.id = tag.person_id
AND tag.tag_id in (1,2);
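The difference between filtering in WHERE and filtering in the JOIN condition is easier to see with a tag that only one person has. This sketch uses the question's two tables on SQLite (via Python's sqlite3) and filters on tag 1 alone, which is a choice made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tag (person_id INTEGER, tag_id INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Bob"), (2, "Jill")])
conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, 1), (1, 2), (2, 2), (2, 3)])

# Filter in WHERE: people with no matching tag row are dropped,
# so the LEFT JOIN behaves like an INNER JOIN.
where_names = [row[0] for row in conn.execute("""
    SELECT DISTINCT name
    FROM person LEFT JOIN tag ON person.id = tag.person_id
    WHERE tag.tag_id IN (1)
    ORDER BY name
""")]

# Filter in ON: every person is kept; those without tag 1 simply
# get NULLs in the tag columns.
on_names = [row[0] for row in conn.execute("""
    SELECT DISTINCT name
    FROM person LEFT JOIN tag ON person.id = tag.person_id
        AND tag.tag_id IN (1)
    ORDER BY name
""")]

print(where_names, on_names)
```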
WHERE ( ( 1 IN (tag.tag_id) ) AND ( 2 IN (tag.tag_id) ) );
This will never return any results since tag.tag_id cannot be 1 and 2 at the same time.
Additionally, is there a reason you're using 1 IN (blah) rather than blah = 1?
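If the goal is "people who have ALL of the given tags" without HAVING COUNT(DISTINCT ...), one option is a separate EXISTS test per tag, each of which looks at its own tag row. The sketch below is one possible shape, not the only one: the helper name people_with_all_tags is made up, and SQLite (via Python's sqlite3) stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tag (person_id INTEGER, tag_id INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Bob"), (2, "Jill")])
conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, 1), (1, 2), (2, 2), (2, 3)])

# The original attempt: both tests run against the same joined row,
# so no single row can satisfy them together.
and_attempt = conn.execute("""
    SELECT DISTINCT name
    FROM person LEFT JOIN tag ON person.id = tag.person_id
    WHERE (1 IN (tag.tag_id)) AND (2 IN (tag.tag_id))
""").fetchall()

def people_with_all_tags(conn, tag_ids):
    """Names of people holding every tag in tag_ids (hypothetical helper)."""
    clause = ("EXISTS (SELECT 1 FROM tag "
              "WHERE tag.person_id = person.id AND tag.tag_id = ?)")
    # One EXISTS per requested tag; with no tags, every person qualifies.
    where = " AND ".join(clause for _ in tag_ids) or "1 = 1"
    sql = f"SELECT name FROM person WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql, list(tag_ids))]

first_case = people_with_all_tags(conn, [1, 2])
second_case = people_with_all_tags(conn, [2, 3])
print(and_attempt, first_case, second_case)
```

Each EXISTS is an independent condition on the person row, so the pattern can be combined with other WHERE criteria in a larger search query.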