How to see if all values within group are unique/identify those that aren't

Say that I have data like this:
group value
1 fox
1 fox
1 fox
2 dog
2 cat
3 frog
3 frog
4 dog
4 dog
I want to be able to tell if all values of value are the same within group. Another way to see this is if I could create a new variable that contains all unique values of value within group like the following:
group value all_values
1 fox fox
1 fox fox
1 fox fox
2 dog dog cat
2 cat dog cat
3 frog frog
3 frog frog
4 dog dog
4 dog dog
As we see, all groups except group 2 have only one distinct entry for value.
One way I thought that a similar thing (but not as good) could be done is to do the following:
bys group: egen tag = tag(value)
bys group: egen sum = sum(tag)
And then based on the value of sum I could determine if there were more than one entry.
However, egen tag does not work with bysort. Is there any other efficient way to get the information I need?

There are several ways to do this. One is:
clear
set more off
input ///
group str5 value
1 fox
1 fox
1 fox
2 dog
2 cat
3 frog
3 frog
4 dog
4 dog
end
*-----
bysort group (value) : gen onevalue = value[1] == value[_N]
list, sepby(group)
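For readers without Stata, the same "sort within group, then compare first and last" idea can be sketched in Python. This is only an illustration under assumptions: the function name `one_value_per_group` and the list-of-tuples layout are made up here and are not part of the answer above.

```python
from collections import defaultdict

# Same toy data as the Stata example: (group, value) pairs.
ROWS = [(1, "fox"), (1, "fox"), (1, "fox"), (2, "dog"), (2, "cat"),
        (3, "frog"), (3, "frog"), (4, "dog"), (4, "dog")]


def one_value_per_group(rows):
    """Return {group: True/False}, True when all values in the group match."""
    values = defaultdict(list)
    for group, value in rows:
        values[group].append(value)
    flags = {}
    for group, vals in values.items():
        vals.sort()
        # After sorting, first == last exactly when every value is identical.
        flags[group] = vals[0] == vals[-1]
    return flags


flags = one_value_per_group(ROWS)
print(flags)
```

Only group 2 should come out as False, mirroring onevalue == 0 in the Stata listing.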
Suppose you have missings, but want to ignore them (not drop them); then the following works:
clear
set more off
input ///
group str5 value
1 fox
1 fox
1 fox
2 dog
2 cat
3 frog
3 frog
4 dog
4 dog
5 ox
5 ox
5
6 cow
6 goat
6
end
*-----
encode value, gen(value2)
bysort group (value2) : replace value2 = value2[_n-1] if missing(value2)
by group: gen onevalue = value2[1] == value2[_N]
list, sepby(group)
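A rough Python counterpart of the "ignore missings" variant, assuming that an empty string stands in for a missing value. It does not copy the encode/replace trick; it simply collects the non-missing distinct values per group. The helper name is hypothetical.

```python
# Groups 5 and 6 from the example; "" plays the role of a missing value.
ROWS_WITH_MISSING = [(5, "ox"), (5, "ox"), (5, ""),
                     (6, "cow"), (6, "goat"), (6, "")]


def one_value_ignoring_missing(rows, missing=""):
    """Return {group: True/False}, ignoring entries equal to `missing`."""
    distinct = {}
    for group, value in rows:
        seen = distinct.setdefault(group, set())
        if value != missing:
            seen.add(value)
    # A group with zero or one distinct non-missing value counts as "one value".
    return {group: len(seen) <= 1 for group, seen in distinct.items()}


flags_missing = one_value_ignoring_missing(ROWS_WITH_MISSING)
print(flags_missing)
```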
See also this FAQ, which has a technique that resembles your original strategy.

Related

MySQL, perform a query that selects and filters the results

I have a table with this structure:
id seller class ref color pricekg
1 manta apple apple-red red 0.147
2 manta pear pear-green green 0.122
3 poma apple apple-red red 0.111
4 arnie melon melon-green green 0.889
5 pinas pineapple pinneaple-brown brown 0.890
6 gordon apple apple-red red 0.135
I would need to get some fruits from some sellers, with some preferences.
My first objective is to know who sells what I'm looking for, and once I know that, to pick the best one.
When I do the first query I get this:
Query ->
SELECT *
FROM `fruits`
WHERE `seller`
IN ("manta", "poma", "pinas", "gordon")
AND `class` IN ("apple", "pineapple")
ORDER BY id
Result 1 ->
1 manta apple apple-red red 0.147
3 poma apple apple-red red 0.111
5 pinas pineapple pinneaple-brown brown 0.890
6 gordon apple apple-red red 0.135
So far so good; however, I get 3 sellers who have red apples with the apple-red ref.
Now this is the part that I can't resolve...
With this result, I would like to filter out the duplicated apple refs (since I want to buy from only one seller).
If there are duplicates, select the one with the seller manta.
If there are duplicates and none of them is from seller manta, then select the one with the lowest cost per kilogram.
So after result 1, the expected result of the second query (or subquery; I really don't know whether it can all be done in one query, or what the best way would be) would be:
1 manta apple apple-red red 0.147
5 pinas pineapple pinneaple-brown brown 0.890
Or in case manta didn't sell these it would be:
3 poma apple apple-red red 0.111
5 pinas pineapple pinneaple-brown brown 0.890
Is it possible to do this with only one query?
Or could I somehow create a view or a temporary table from the result and then execute one more query to filter the duplicates?
How could I do this?
This is a prioritization query.
I think you want:
select f.*
from fruits f
where f.seller in ('manta', 'poma', 'pinas', 'gordon') and
      f.class in ('apple', 'pineapple') and
      f.id = (select f2.id
              from fruits f2
              where f2.seller in ('manta', 'poma', 'pinas', 'gordon') and
                    f2.class in ('apple', 'pineapple') and
                    f2.ref = f.ref
              order by (f2.seller = 'manta') desc, -- put manta sellers first
                       f2.pricekg asc -- then order by lowest price per kg
              limit 1
             );
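To try the prioritization idea without a MySQL server, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. SQLite is an assumption made for testability (the question is about MySQL), and the helper name `pick_sellers` is hypothetical.

```python
import sqlite3

# Rows copied from the question's table (including its "pinneaple" spelling).
FRUITS = [
    (1, "manta", "apple", "apple-red", "red", 0.147),
    (2, "manta", "pear", "pear-green", "green", 0.122),
    (3, "poma", "apple", "apple-red", "red", 0.111),
    (4, "arnie", "melon", "melon-green", "green", 0.889),
    (5, "pinas", "pineapple", "pinneaple-brown", "brown", 0.890),
    (6, "gordon", "apple", "apple-red", "red", 0.135),
]


def pick_sellers(rows, sellers, classes, preferred="manta"):
    """One row per ref: the preferred seller if present, else the cheapest."""
    con = sqlite3.connect(":memory:")
    con.execute("create table fruits (id integer, seller text, class text, "
                "ref text, color text, pricekg real)")
    con.executemany("insert into fruits values (?, ?, ?, ?, ?, ?)", rows)
    s_marks = ",".join("?" * len(sellers))
    c_marks = ",".join("?" * len(classes))
    sql = f"""
        select f.id, f.seller, f.ref
        from fruits f
        where f.seller in ({s_marks}) and f.class in ({c_marks})
          and f.id = (select f2.id
                      from fruits f2
                      where f2.seller in ({s_marks}) and f2.class in ({c_marks})
                        and f2.ref = f.ref
                      order by (f2.seller = ?) desc, f2.pricekg asc
                      limit 1)
        order by f.id
    """
    # Placeholders are filled in the order they appear in the SQL text.
    params = [*sellers, *classes, *sellers, *classes, preferred]
    return con.execute(sql, params).fetchall()


with_manta = pick_sellers(FRUITS, ["manta", "poma", "pinas", "gordon"],
                          ["apple", "pineapple"])
without_manta = pick_sellers(FRUITS, ["poma", "pinas", "gordon"],
                             ["apple", "pineapple"])
print(with_manta)
print(without_manta)
```

The second call drops manta from the seller list to imitate the "manta didn't sell these" case, so the cheapest apple-red seller should win instead.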

Find results from another table based on a conditional row

I have two tables, one is a list of families who have pets for adoption, the other is the list of potential adopters.
Some families will only give away all of their pets at once (all), while others will give them away separately (one). This is indicated in the "how_many" column.
The adopters have preferences considering the animals, some would adopt only a specific one, other would adopt more than one.
What I'd like to achieve is to write a single query which connects the families with the potential adopters.
So if there's a family who would give away a cat and a bird (not separately), e.g. the Williams family (ID 3, all), they would be connected with User 4, who would like to adopt a cat and a bird (besides a dog, but that doesn't matter; not all 3 have to match, just a subset).
Or, if there's a family who would give away their pets separately, e.g. the Smith family (ID 1, one), they would be connected with all those adopters who would like to adopt a bird or a cat, like User 1, User 2, etc.
Is there a way to achieve this in one query?
Families
id animal family how_many
1 bird Smith one
1 cat Smith one
2 bird Johnson one
2 dog Johnson one
3 cat Williams all
3 bird Williams all
4 bird Brown one
5 cat Jones all
6 bird Miller all
7 bird Davis one
7 cat Garcia one
7 bird Garcia one
7 dog Garcia one
Adopters
id animal adopter
1 cat User 1
1 bird User 1
2 bird User 2
2 dog User 2
3 bird User 3
3 dog User 4
4 cat User 4
4 dog User 4
4 bird User 4
5 bird User 5
6 bird User 6
Hmmm . . . I think the most reasonable approach is to combine two queries, one for the "one"s and one for the "all"s:
select f.*, a.id
from families f join
     adopters a
     on f.animal = a.animal
where f.how_many = 'one'
union all
select f.id, group_concat(f.animal), f.family, f.how_many, a.id
from families f join
     adopters a
     on f.animal = a.animal
where f.how_many = 'all'
group by f.id, f.family, f.how_many, a.id
having count(*) = (select count(*) from families f2 where f2.id = f.id); -- the adopter must want every animal the family gives away
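The "all" half is the trickier one, so here is a small sketch of it run against SQLite from Python. It assumes SQLite rather than MySQL, uses only a subset of the sample rows, and assumes each (id, animal) pair appears at most once per table. An adopter is kept for a family only when the number of matched animals equals the number of animals that family is giving away.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table families (id integer, animal text, family text, how_many text);
    create table adopters (id integer, animal text, adopter text);
    insert into families values
        (3, 'cat', 'Williams', 'all'), (3, 'bird', 'Williams', 'all'),
        (5, 'cat', 'Jones', 'all');
    insert into adopters values
        (1, 'cat', 'User 1'), (1, 'bird', 'User 1'),
        (2, 'bird', 'User 2'),
        (4, 'cat', 'User 4'), (4, 'dog', 'User 4'), (4, 'bird', 'User 4');
""")

# (family id, adopter id) pairs where the adopter wants every animal
# the family is giving away; extra wishes on the adopter side are fine.
all_matches = con.execute("""
    select f.id, a.id
    from families f join adopters a on f.animal = a.animal
    where f.how_many = 'all'
    group by f.id, a.id
    having count(*) = (select count(*) from families f2 where f2.id = f.id)
    order by f.id, a.id
""").fetchall()
print(all_matches)
```

User 2 only wants a bird, so that adopter should not be paired with the Williams family, who give away a cat and a bird together.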

Making a win-lose record query in Access

Sorry in advance if I make some grammatical mistakes. The thing is that I'm making an MMA database with tables such as "Fighters" and "Fights".
In the table "Fights" I have two fields: WINNER and LOSER. In order to see how many fights a fighter has won or lost, I made two queries: one counting the wins and another one counting the losses. But I feel that is kinda useless.
Queries in SQL view:
SELECT FIGHTS.WINNER, Count(FIGHTS.WINNER) AS WIN
FROM FIGHTS
GROUP BY FIGHTS.WINNER;
___
SELECT FIGHTS.LOSER, Count(FIGHTS.LOSER) AS LOSE
FROM FIGHTS
GROUP BY FIGHTS.LOSER;
Resulting:
WINNER WINS
Raquel Pennington 1
Sara McMann 1
Sarah D'Alelio 2
Sarah Maloy 1
____
LOSER LOSE
Kaitlin Young 2
Lacey Schuckman 1
Lisa Ellis 1
Meghan Wright 1
I'd like the wins query to also show the losers (with 0 wins), and vice versa, so it could be something like this:
WINNER WINS
Raquel Pennington 1
Sara McMann 1
Sarah D'Alelio 2
Sarah Maloy 1
Kaitlin Young 0
Lacey Schuckman 0
Lisa Ellis 0
Meghan Wright 0
___
LOSER LOSE
Kaitlin Young 2
Lacey Schuckman 1
Lisa Ellis 1
Meghan Wright 1
Raquel Pennington 0
Sara McMann 0
Sarah D'Alelio 0
Sarah Maloy 0
I tried a lot of different combinations, queries, but it always ended messed up with names duplicated, incorrect records...
If I can get these queries to work, then the rest is a piece of cake. I feel that I'm halfway there, but I'm totally blocked.
If you need screenshots or more info just ask; English is not my first language and it's hard to explain myself. Thanks in advance.
Using two tables - Fights and Fighters:
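This query will return the total Wins/Losses. One caveat: if a fighter has both wins and losses, the two joins multiply each other's rows, so both counts come out as wins times losses; with the sample data no fighter appears on both sides, so the totals are correct: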
SELECT Fighter
, COUNT(F1.WINNER) AS Wins
, COUNT(F2.LOSER) AS Losses
FROM (Fighters LEFT JOIN Fights F1 ON Fighters.Fighter = F1.Winner)
LEFT JOIN Fights F2 ON Fighters.Fighter = F2.Loser
GROUP BY Fighter
Giving this result:
In design view the query would look like this:
(Note: I haven't aliased the table names or result fields so the resulting query will have CountOfWINNER and CountOfLoser as field names).
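To see how the counts behave once a fighter has both wins and losses, here is a small sketch run against SQLite from Python. It is not Access SQL, and the fighters A, B and C are made up for illustration. It compares the double LEFT JOIN with a variant that counts wins and losses in two separate correlated subqueries.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table fighters (fighter text);
    create table fights (winner text, loser text);
    insert into fighters values ('A'), ('B'), ('C');
    -- A wins twice and loses twice; B and C each win once and lose once.
    insert into fights values ('A', 'B'), ('A', 'C'), ('B', 'A'), ('C', 'A');
""")

# Double join: each win row is paired with each loss row of the same fighter.
joined = con.execute("""
    select fighters.fighter, count(f1.winner), count(f2.loser)
    from fighters
    left join fights f1 on fighters.fighter = f1.winner
    left join fights f2 on fighters.fighter = f2.loser
    group by fighters.fighter
    order by fighters.fighter
""").fetchall()

# Separate subqueries: wins and losses are counted independently.
record = con.execute("""
    select fighters.fighter,
           (select count(*) from fights where fights.winner = fighters.fighter),
           (select count(*) from fights where fights.loser = fighters.fighter)
    from fighters
    order by fighters.fighter
""").fetchall()

print(joined)
print(record)
```

Fighter A has 2 wins and 2 losses, which the double join reports as 4 and 4, while the subquery variant reports 2 and 2.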

MySQL - check if selected rows contain the same value

I need to check in MySQL whether certain columns contain the same value in every row, but I don't know the value in advance. All the solutions I have found so far use count in combination with a where clause. That doesn't work for me, because I don't know the values of the columns. For example:
Index ColB ColC ColD ColE
1 1 cat 1.3 black
2 1 cat 1.3 black
3 1 cat 1.3 white
4 1 cat 1.3 tiger
5 1 cat 1.3 white
I would like to check whether the 3 columns ColB, ColC and ColD each have the same value in every row. For the table above it should return true. However, for the following table it should return false:
Index ColB ColC ColD ColE
1 1 dog 1.3 black
2 1 cat 1.3 black
3 2 cat 1.3 white
4 1 cat 1.3 tiger
5 1 cat 2.7 white
The rule should be something like this: if (ColB_hasDifferentValues || ColC_hasDifferentValues || ColD_hasDifferentValues) { return false } else { return true };
Is that possible? As I said before, I don't know which animals are included in ColC, as users can insert new animals.
Thanks a lot in advance!
Just use max() and min():
select (case when max(b) = min(b) and max(c) = min(c) and max(d) = min(d)
then 'same'
else 'different'
end)
from t;
This logic ignores NULL values (the OP does not mention NULL values at all). The idea can be extended, but the logic is a wee bit more complex.
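A quick way to check the max()/min() idea against both sample tables is to run it in SQLite from Python. This is a sketch under assumptions: SQLite instead of MySQL, a table named t with short column names b, c, d as in the query above, and a hypothetical helper `columns_constant`.

```python
import sqlite3

TABLE_SAME = [
    (1, 1, "cat", 1.3, "black"), (2, 1, "cat", 1.3, "black"),
    (3, 1, "cat", 1.3, "white"), (4, 1, "cat", 1.3, "tiger"),
    (5, 1, "cat", 1.3, "white"),
]
TABLE_DIFFERENT = [
    (1, 1, "dog", 1.3, "black"), (2, 1, "cat", 1.3, "black"),
    (3, 2, "cat", 1.3, "white"), (4, 1, "cat", 1.3, "tiger"),
    (5, 1, "cat", 2.7, "white"),
]


def columns_constant(rows):
    """Return 'same' if b, c and d each hold a single value, else 'different'."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (idx integer, b integer, c text, d real, e text)")
    con.executemany("insert into t values (?, ?, ?, ?, ?)", rows)
    (verdict,) = con.execute("""
        select case when max(b) = min(b) and max(c) = min(c) and max(d) = min(d)
                    then 'same' else 'different' end
        from t
    """).fetchone()
    return verdict


print(columns_constant(TABLE_SAME))
print(columns_constant(TABLE_DIFFERENT))
```

Column e is deliberately left out of the comparison, so its varying colors do not affect the verdict.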

MS-ACCESS How to create query to select all records with fields matching from another query?

I know this question may sound confusing, but let me try to simplify it.
I have a query, let's call it query1, and a much larger table of all products. Here is query1:
Item_Code Description Order_Qty Option
1000 Prod1 5 Blue
1005 Prod5 3 Brown
1602 Prod6 1 Red
5620 Prod8 6 Yellow
9865 Prod2 1 Brown
1624 Prod3 3 Brown
9876 Prod12 4 Blue
Now in my table, I have a much bigger list of products with the same format. I want to make a new query that contains ALL that are blue, brown, red, and yellow. It works, but there is always a duplicate.
I'm not sure how to post my attempt but I'll explain what I tried. I made a new query and included the table and query1. I made a relationship between the two, to include only rows where the "option" is equal. But for some reason the resulting query will come out with repeats. Such as:
Item_Code Description Order_Qty Option
1000 Prod1 5 Blue
1009 <-- Prod2 6 Blue
1009 <-- Prod2 6 Blue
1010 <-- Prod9 7 Blue
1010 <-- Prod9 7 Blue
1011 <-- Prod11 9 Blue
1011 <-- Prod11 9 Blue
9876 <-- Prod12 4 Blue
9876 <-- Prod12 4 Blue
1005 <-- Prod5 3 Brown
1005 <-- Prod5 3 Brown
9865 <-- Prod2 1 Brown
9865 <-- Prod2 1 Brown
1624 <-- Prod3 3 Brown
1624 <-- Prod3 3 Brown
9877 Prod99 7 Brown
1111 <-- Prod67 8 Brown
1111 <-- Prod67 8 Brown
1602 Prod6 1 Red
1752 Prod56 2 Red
5620 Prod8 6 Yellow
And the worst part is, it won't always repeat. Maybe I'm approaching it wrong.
I know this may be a case of tldr, but if anyone could help that would be great.
Thanks.
Sounds like you need to add a GROUP BY clause or a DISTINCT keyword to the query.
SELECT DISTINCT Item_Code, Description, Order_Qty, Option
(or click on whatever option in Access does the same thing.)
It's impossible to diagnose why the query is returning "duplicate" rows given the information you've provided.
The query is returning "duplicate" rows when a row in all_products is matching two or more rows from query1, so you get a copy of the row from all_products for each matching row from query1.
You might get better performance if you got a distinct list of Option from query1. (I don't really "do" MS Access. I brought up the question because it was tagged "mysql". The Jet database engine is cool and all, but it just doesn't work well in a multiuser environment.) In SQL Server, Oracle, MySQL, DB2, Teradata, et al. we'd write the query something like this:
SELECT p.Item_Code
, p.Description
, p.Order_Qty
, p.Option
FROM mytable p
JOIN ( SELECT q1.Option
FROM query1 q1
GROUP BY q1.Option
) q
ON q.Option = p.Option
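The effect of joining on a de-duplicated list of options can be tried out with a tiny SQLite example from Python. The table contents are made up, the column is called option_name here to stay clear of reserved words, and SQLite stands in for the engines named above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table mytable (item_code integer, option_name text);
    create table query1 (item_code integer, option_name text);
    insert into mytable values
        (1000, 'Blue'), (1009, 'Blue'), (9877, 'Brown'),
        (1752, 'Red'), (4000, 'Green');
    -- 'Blue' appears twice in query1, which is what causes the repeats.
    insert into query1 values
        (1000, 'Blue'), (9876, 'Blue'), (1005, 'Brown'), (1602, 'Red');
""")

# Plain join: one output row per matching row in query1.
plain = [r[0] for r in con.execute("""
    select p.item_code
    from mytable p join query1 q on q.option_name = p.option_name
    order by p.item_code
""")]

# Join against the distinct list of options: one output row per product.
deduped = [r[0] for r in con.execute("""
    select p.item_code
    from mytable p
    join (select option_name from query1 group by option_name) q
      on q.option_name = p.option_name
    order by p.item_code
""")]

print(plain)
print(deduped)
```

Only the Blue products repeat in the plain join, because Blue is the only option listed more than once in query1. That would also explain why the repeats in the question do not show up for every row.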
Your query is returning duplicates because you are probably only selecting columns from one table, but your join is not specific enough, so the Cartesian product it generates multiplies your results. That is, because your only join condition is option = option, every single blue row will join to every other blue row. You probably want more constraints in the ON clause of the join.
Using distinct will probably make it look like the right answer but you are just masking the problem.
If I understand your question correctly, you want to use Query1 as a type of lookup table to select rows from Table.
So you could do:
select * from table
where color in
(select color from query1)
If you want to make the selection criteria based on 2 fields you could do:
select * from table
where color1&color2 in
(select color1&color2 from query1)
If you want to look for omissions, you could do:
select * from table
where color not in
(select color from query1)
Note that this only selects data from table. It doesn't do any joins between table and query1.
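A minimal sketch of the lookup-style filter, again using SQLite from Python with made-up rows. The names products and option_name are assumptions, chosen because table is a reserved word and the sample data calls the column Option.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table products (item_code integer, option_name text);
    create table query1 (item_code integer, option_name text);
    insert into products values
        (1000, 'Blue'), (1009, 'Blue'), (9877, 'Brown'), (4000, 'Green');
    insert into query1 values
        (1000, 'Blue'), (9876, 'Blue'), (1005, 'Brown');
""")

# Rows whose option appears in query1; no join, so no row is repeated.
matching = [r[0] for r in con.execute("""
    select item_code from products
    where option_name in (select option_name from query1)
    order by item_code
""")]

# Rows whose option is absent from query1 (the "omissions").
omitted = [r[0] for r in con.execute("""
    select item_code from products
    where option_name not in (select option_name from query1)
    order by item_code
""")]

print(matching)
print(omitted)
```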
I think the others are correct. You are getting weird duplicates with a join because you have multiple rows for the join item in each table.