Select rows that match arbitrary criteria - mysql

I have a MySQL table like this:
ID  Category  Value
--  --------  ----------
A   LOCATION  COPENHAGEN
A   LOCATION  MADRID
B   LOCATION  MADRID
C   LOCATION  MADRID
C   LOCATION  AARHUS
C   POSITION  WEBDESIGN
D   POSITION  SYSADMIN
D   POSITION  WEBDESIGN
A   GENDER    MALE
B   GENDER    MALE
C   GENDER    FEMALE
ID, Category and Value together form the key. Note that the categories and values are dynamic (user defined) and the table will be large.
Now I want to get all IDs that match some criteria. To be returned, it is enough for an ID to match one value within each Category.
For example:
GIVE ME ALL IDS WHO HAVE
((LOCATION = COPENHAGEN OR LOCATION = MADRID) AND (GENDER = MALE))
should return A, B
Another example:
GIVE ME ALL IDS WHO HAVE
((POSITION = SYSADMIN OR POSITION = WEBDESIGN) AND (GENDER = FEMALE) AND (LOCATION = COPENHAGEN OR LOCATION = MADRID))
should return C
The returned IDs will be used in a subquery of another query, so performance matters.
UPDATE
I've created this SQL Fiddle with sample data and a proposed solution that does not work yet.

[Removed initial response as it was non-functional. In making it functional I came up with the query in the second edit which is much preferable.]
Edit
It seems like you're trying to create a schema to keep track of people working in various locations, and from your data it seems like they might hold multiple titles among various offices. The below schema would allow you to define each person, job, and location only once and then connect them using a linking table.
TABLE People
p_id INT AI PK
p_name VARCHAR
p_sex BOOL
...
TABLE Offices
o_ID INT AI PK
o_name VARCHAR
o_location VARCHAR
...
TABLE Jobs
j_ID INT AI PK
j_name VARCHAR
...
TABLE People_Jobs_Offices --linking table
p_ID INT PK
o_ID INT PK
j_ID INT PK
Now get all of the Sysadmins in Madrid or Copenhagen:
SELECT *
FROM People_Jobs_Offices pjo
INNER JOIN People p
ON pjo.p_id = p.p_id
INNER JOIN Jobs j
on pjo.j_id = j.j_id
INNER JOIN Offices o
ON pjo.o_id = o.o_id
WHERE
j.j_name = 'SYSADMIN'
AND ( o.o_location = 'MADRID'
OR o.o_location = 'COPENHAGEN' )
This approach is called database normalization. It generally lets the RDBMS make good use of indexes, and it keeps the number of rows in your tables down by avoiding duplicated data.
Edit²
I've re-fiddled your SQL Fiddle. Much of this would have to be done programmatically, i.e. determining which categories, how many joins, and which table aliases and column names to use. You'd want to start from whatever table the ID column references so you have a solid starting point for these joins; I've created the users table just for illustration.
CREATE TABLE users (`ID` varchar(16),`name` varchar(16));
INSERT INTO users (`ID`, `name`)
VALUES ('A', 'andrew'), ('B', 'bob'), ('C', 'charla');
And the query:
SELECT u.*,
gen.value 'gender',
pos.value 'position',
loc.value 'location'
FROM users u
LEFT JOIN yourtable gen
ON u.ID = gen.ID
AND gen.category = 'GENDER'
LEFT JOIN yourtable pos
ON u.id = pos.id
AND pos.category = 'POSITION'
LEFT JOIN yourtable loc
ON u.id = loc.id
AND loc.category = 'LOCATION'
And the output:
ID NAME GENDER POSITION LOCATION
A andrew MALE (null) COPENHAGEN
A andrew MALE (null) MADRID
B bob MALE WEBDESIGN MADRID
B bob MALE SYSADMIN MADRID
C charla FEMALE (null) MADRID
C charla FEMALE (null) AARHUS
I would also make certain that, with the potential for many joins, you have indexes on category as well as ID, if not all 3 columns in the table.

The solution - for the moment - is to make a subquery for each category. Like this:
SELECT DISTINCT ID
FROM yourtable
WHERE ID IN
(SELECT ID from yourtable WHERE (Category = 'POSITION' AND Value = 'SYSADMIN'))
AND ID IN
(SELECT ID from yourtable WHERE (Category = 'LOCATION' AND Value = 'MADRID') OR (Category = 'LOCATION' AND Value = 'COPENHAGEN'))
Working example can be seen here at SQL Fiddle.
It is working, but I am worried about performance when I have many rows or am checking in many categories.
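One alternative to a separate IN subquery per category, sketched here under assumptions: select the rows that satisfy any accepted (Category, Value) pair, group them by ID, and keep only the IDs whose matches span as many distinct categories as the request names. The sketch runs the SQL through Python's sqlite3 module as a stand-in for MySQL (the GROUP BY / HAVING statement itself is plain SQL). The helper name matching_ids and the dict-shaped criteria argument are made up for illustration. Whether this beats the subquery version on a large table depends on the indexes, so it is worth comparing both with EXPLAIN.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("A", "LOCATION", "COPENHAGEN"), ("A", "LOCATION", "MADRID"),
    ("B", "LOCATION", "MADRID"), ("C", "LOCATION", "MADRID"),
    ("C", "LOCATION", "AARHUS"), ("C", "POSITION", "WEBDESIGN"),
    ("D", "POSITION", "SYSADMIN"), ("D", "POSITION", "WEBDESIGN"),
    ("A", "GENDER", "MALE"), ("B", "GENDER", "MALE"), ("C", "GENDER", "FEMALE"),
]


def matching_ids(conn, criteria):
    """Return IDs that match at least one accepted value in every category.

    criteria is a hypothetical shape: {category: [accepted values]}.
    """
    pairs = [(cat, val) for cat, vals in criteria.items() for val in vals]
    if not pairs:
        return []
    # One "(Category = ? AND Value = ?)" term per accepted pair, OR-ed together.
    where = " OR ".join(["(Category = ? AND Value = ?)"] * len(pairs))
    sql = (
        "SELECT ID FROM yourtable WHERE " + where +
        " GROUP BY ID HAVING COUNT(DISTINCT Category) = ? ORDER BY ID"
    )
    params = [x for pair in pairs for x in pair] + [len(criteria)]
    return [row[0] for row in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (ID TEXT, Category TEXT, Value TEXT, "
    "PRIMARY KEY (ID, Category, Value))"
)
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", ROWS)

example1 = matching_ids(
    conn, {"LOCATION": ["COPENHAGEN", "MADRID"], "GENDER": ["MALE"]}
)
example2 = matching_ids(conn, {
    "POSITION": ["SYSADMIN", "WEBDESIGN"],
    "GENDER": ["FEMALE"],
    "LOCATION": ["COPENHAGEN", "MADRID"],
})
print(example1, example2)  # ['A', 'B'] ['C']
```

The table is read once however many categories are requested, and the number of categories only changes the bound parameters. An index that starts with (Category, Value) is probably what the WHERE clause wants, but that is an assumption to verify against your data.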

Related

Select the rest of the records when one condition is met SQL

I have a user base and the languages assigned to them.
My tables:
users
id | name
lang
id | name
users_lang_user
id | users_id | lang_id
I would like to retrieve a user who has at least one record in the relationship table where lang.id = 1, and get their other languages.
SELECT *
from `users` as users
JOIN `users_lang_user` as lang
ON lang.user_id = users.id AND lang.lang_id = '1'
But this returns only the rows where lang_id = 1.
How can I get the rest of the user's records if this one condition is true?
For example, how can I get users where lang_id = 12 who must also have a record where lang_id = 1?
If I'm not mistaken, you would like to get the users who have at least two languages, one of which must be lang_id 1, together with the names of their languages other than lang_id 1. I wrote and tested this in Workbench:
create table users (id int,name varchar(10));
insert users values(1,'john'),(2,'mary'),(3,'sarah');
create table lang (id int,name varchar(10));
insert lang values(1,'english'),(2,'german'),(3,'maori');
create table users_lang_user(id int,users_id int,lang_id int);
insert users_lang_user values(1,1,3),(2,3,1),(3,2,1),(4,2,2);
select u.name,l.name from users u
join users_lang_user ul
on u.id=ul.users_id
join lang l
on ul.lang_id=l.id
where ul.users_id in (select users_id from users_lang_user
group by users_id having count(users_id)>1 and min(lang_id)=1)
and lang_id!=1
;
-- result set:
mary german
Note that this works only because lang_id 1 happens to be the lowest of all values, so the aggregate min(lang_id)=1 can test the lang_id column even though it is not listed in the GROUP BY clause.
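If the required language is not guaranteed to have the lowest id, the same idea can be written with EXISTS instead of min(). The following is only a sketch: it reuses the sample tables from this answer, runs them through Python's sqlite3 module as a stand-in for MySQL, and the helper name other_languages is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INT, name VARCHAR(10));
INSERT INTO users VALUES (1,'john'),(2,'mary'),(3,'sarah');
CREATE TABLE lang (id INT, name VARCHAR(10));
INSERT INTO lang VALUES (1,'english'),(2,'german'),(3,'maori');
CREATE TABLE users_lang_user (id INT, users_id INT, lang_id INT);
INSERT INTO users_lang_user VALUES (1,1,3),(2,3,1),(3,2,1),(4,2,2);
""")


def other_languages(conn, required_lang):
    """Users who have required_lang, listed with each of their other languages."""
    sql = """
        SELECT u.name, l.name
        FROM users u
        JOIN users_lang_user ul ON ul.users_id = u.id
        JOIN lang l ON l.id = ul.lang_id
        WHERE ul.lang_id <> ?
          AND EXISTS (SELECT 1
                      FROM users_lang_user req
                      WHERE req.users_id = u.id
                        AND req.lang_id = ?)
        ORDER BY u.name, l.name
    """
    return conn.execute(sql, (required_lang, required_lang)).fetchall()


print(other_languages(conn, 1))  # [('mary', 'german')]
print(other_languages(conn, 2))  # [('mary', 'english')]
```

The second call is the case the min() trick cannot handle: mary's lowest lang_id is 1, so a test such as min(lang_id)=2 would reject her even though she has language 2.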
One way is
SELECT u.*, l.id langId
FROM `users` u
JOIN (
SELECT user_id
FROM `users_lang_user`
WHERE lang_id = '1') ul1 ON u.id = ul1.user_id
JOIN `users_lang_user` l ON l.user_id = u.id

How to find out results for not matching particular condition in SQL from multiple tables?

I have 3 tables :
The Person table stores basic details for each person, with ID as the primary key.
A person can have relationships (father, mother, etc.), which are saved in the Relationship table. The related people are themselves created in the Person table (e.g. ID = 2, 3 in the Person table), so we know that 2 and 3 are related to user 1 (Carry).
We also have a third table, Address, which stores addresses by user ID (both for a user and for their related persons, who are also users).
I want to find out in SQL whether an address exists for either a user or for their related users. How can I achieve this?
You can combine the two rules and search on the combined table as below:
SELECT * FROM
(
SELECT username,id,Address.Address
FROM Person
INNER JOIN Address ON Person.id = Address.Userid
UNION ALL
SELECT username,id,Address.Address
FROM Person
INNER JOIN Relationship ON Relationship.Relatedid = Person.id
INNER JOIN Address ON Relationship.Userid = Address.Userid
) as RES
WHERE Address = 'xyz road'
A DBFiddle link is also available to try it out.
Query:
select p.id, p.username,
(case when a.userid is null then 'No' else 'Yes' end) IsAddressAvailable
from Person p
left join Address a on p.id = a.Userid
Output:
id | username     | IsAddressAvailable
1  | Carry        | Yes
2  | Carry-Father | No
3  | Carry-Mother | Yes
db<fiddle here
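To answer "does an address exist for the user or for any of their related users" in a single pass, an EXISTS test can cover both cases. This is a sketch under assumptions: the column names (Person.id, Relationship.Userid / Relatedid, Address.Userid) are taken from the queries above, the data is hypothetical (only Carry-Mother has an address, and 'Dana' is an invented unrelated user), and SQLite via Python's sqlite3 module stands in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE Relationship (Userid INTEGER, Relatedid INTEGER);
CREATE TABLE Address (Userid INTEGER, Address TEXT);
-- Hypothetical data: only Carry-Mother (id 3) has an address.
INSERT INTO Person VALUES (1,'Carry'),(2,'Carry-Father'),(3,'Carry-Mother'),(4,'Dana');
INSERT INTO Relationship VALUES (1,2),(1,3);
INSERT INTO Address VALUES (3,'xyz road');
""")

rows = conn.execute("""
    SELECT p.id, p.username,
           CASE WHEN EXISTS (
               SELECT 1
               FROM Address a
               WHERE a.Userid = p.id                      -- the user's own address
                  OR a.Userid IN (SELECT r.Relatedid      -- or a related person's
                                  FROM Relationship r
                                  WHERE r.Userid = p.id)
           ) THEN 'Yes' ELSE 'No' END AS HasAddress
    FROM Person p
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

Here Carry is reported as 'Yes' through Carry-Mother's address, while Carry-Father and Dana are 'No'. The relationship is followed in one direction only (from Userid to Relatedid); if a related person should also inherit the main user's address, the subquery would need the reverse lookup as well.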

SQL genius needed: complex MySQL query

I am trying to optimise my PHP by doing as much work on the MySQL server as possible. I have this SQL query which pulls data out of a leads table and at the same time joins two tags tables to combine the result. I am looking to add a company which is linked through a relations table.
The table that holds the relationship between the two is relations_values, which simply states (example data in parentheses):
parenttable (companies) | parentrecordid (10) | childtable (leads) | childrecordid (1)
The companies table has quite a few columns, but the only two relevant ones are:
id (10) | companyname (my company name)
So this query currently grabs everything I need but I want to bring the companyname into the query:
SELECT leads.id,
GROUP_CONCAT(c.tag ORDER BY c.tag) AS tags,
leads.status,
leads.probability
FROM `gs_db_1002`.leads
LEFT JOIN ( SELECT *
FROM tags_module
WHERE tagid IN ( SELECT id
FROM tags
WHERE moduleid = 'leads' ) ) as b
ON leads.id = b.recordid
LEFT JOIN `gs_db_1002`.tags as c
ON b.tagid = c.id
GROUP BY leads.id,
leads.status,
leads.probability
I need to be able to go into the relations_values table and pull parenttable and parentrecordid by selecting childtable = leads and childrecordid = 1 and somehow join these so that I am able to get companyname as a column in the above query...
Is this possible?
I have created a sqlfiddle: sqlfiddle.com/#!2/023fa/2 So I am looking to add companies.companyname as column to the query.
I don't know what your primary keys and foreign keys are that link each table together. If you could give a better understanding of which IDs are linked to each other, it would make this a lot easier. However, I did something that does return the correct result, but since all of the IDs are = 1, it could be incorrect.
SELECT
leads.id, GROUP_CONCAT(c.tag ORDER BY c.tag) AS tags,
leads.status, leads.probability, companyname
FROM leads
LEFT JOIN (
SELECT * FROM tags_module WHERE tagid IN (
SELECT id FROM tags WHERE moduleid = 'leads' )
) as b ON leads.id = b.recordid
LEFT JOIN tags as c ON b.tagid = c.id
LEFT JOIN relations_values rv on rv.id = b.recordid
LEFT JOIN companies c1 on c1.createdby = rv.parentrecordid
GROUP BY leads.id,leads.status, leads.probability
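Going only by the question's description of relations_values (parenttable, parentrecordid, childtable, childrecordid) and companies (id, companyname), the company lookup could be sketched as a pair of LEFT JOINs keyed on the child record. This is an assumption about the schema, not a verified fix: the status and probability values and the second lead are invented, the tags joins are left out to keep the example short, and SQLite via Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE leads (id INTEGER PRIMARY KEY, status TEXT, probability INTEGER);
CREATE TABLE companies (id INTEGER PRIMARY KEY, companyname TEXT);
CREATE TABLE relations_values (
    parenttable TEXT, parentrecordid INTEGER,
    childtable TEXT, childrecordid INTEGER
);
-- Lead 1 is linked to company 10, as in the question; lead 2 has no company.
INSERT INTO leads VALUES (1, 'open', 50), (2, 'won', 100);
INSERT INTO companies VALUES (10, 'my company name');
INSERT INTO relations_values VALUES ('companies', 10, 'leads', 1);
""")

rows = conn.execute("""
    SELECT l.id, l.status, l.probability, co.companyname
    FROM leads l
    LEFT JOIN relations_values rv
           ON rv.childtable = 'leads'
          AND rv.childrecordid = l.id
          AND rv.parenttable = 'companies'
    LEFT JOIN companies co
           ON co.id = rv.parentrecordid
    ORDER BY l.id
""").fetchall()

print(rows)
```

Leads without a linked company keep their row and get NULL (None in Python) for companyname. If a lead can be linked to several companies, each link produces its own row, which would also multiply the tags in a GROUP_CONCAT unless they are aggregated in a subquery first.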

Free search on multiple tables using JOIN

I have a DB of Students who have a fullname, a location, and a list of schools they frequented.
"student" table
id | fullname | location
------------------------
"location" table
id | zipcode | city
-------------------
"school" table
id | name
---------
"student_school" table (which holds two foreign keys on school and user to create for each user a list of schools)
id | id_student | id_school
---------------------------
I want to perform a search through the students comparing the search term with student.fullname, location.zipcode, location.city, school.name and return all the students matching one (or more) of these conditions.
Note that a student can have a null location, so we need an outer join.
Here is an example of matching based on regular expressions:
SELECT distinct s.id, s.fullname
FROM student_school ss
JOIN student s ON s.id = ss.id_student
LEFT JOIN LOCATION l ON s.LOCATION = l.id
JOIN school sc ON ss.id_school = sc.id
WHERE (s.fullname RLIKE 'Sasha.*') or
(ifnull(l.zipcode RLIKE '100.*', 0)) or
(ifnull(l.city RLIKE 'New.*', 0)) or
(sc.name RLIKE '.*2')
And here is the complete SQL Fiddle.
So the general idea is to join all student data, filter the rows matching at least one criterion, and use DISTINCT to collapse the duplicates that the joins produce.
Update:
To include students with no school records, you may use the following FROM clause:
student_school ss
RIGHT JOIN student s ON s.id = ss.id_student
LEFT JOIN LOCATION l ON s.LOCATION = l.id
LEFT JOIN school sc ON ss.id_school = sc.id
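The same join layout can be driven by a single search term. The sketch below is an illustration under assumptions: it uses LIKE with one bound parameter instead of RLIKE, all sample rows are invented, the helper name search is hypothetical, and SQLite via Python's sqlite3 module stands in for MySQL. It starts from student and LEFT JOINs everything else, so students with no location or no school are still searchable by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (id INTEGER PRIMARY KEY, zipcode TEXT, city TEXT);
CREATE TABLE student (id INTEGER PRIMARY KEY, fullname TEXT, location INTEGER);
CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student_school (id INTEGER PRIMARY KEY, id_student INTEGER, id_school INTEGER);
-- Invented sample data.
INSERT INTO location VALUES (1,'10001','New York'),(2,'75001','Paris');
INSERT INTO student VALUES (1,'Sasha Ivanov',1),(2,'Maria Lopez',2),
                           (3,'Tom Baker',NULL),(4,'Nina York',NULL);
INSERT INTO school VALUES (1,'School 1'),(2,'York Academy');
INSERT INTO student_school VALUES (1,1,1),(2,2,2),(3,2,1);
""")


def search(conn, term):
    """Students whose name, zipcode, city or any school name contains term."""
    sql = """
        SELECT DISTINCT s.id, s.fullname
        FROM student s
        LEFT JOIN location l ON l.id = s.location
        LEFT JOIN student_school ss ON ss.id_student = s.id
        LEFT JOIN school sc ON sc.id = ss.id_school
        WHERE s.fullname LIKE :pat
           OR l.zipcode LIKE :pat
           OR l.city LIKE :pat
           OR sc.name LIKE :pat
        ORDER BY s.id
    """
    return conn.execute(sql, {"pat": "%" + term + "%"}).fetchall()


print(search(conn, "york"))
print(search(conn, "paris"))
```

A NULL location or school makes the corresponding LIKE evaluate to NULL rather than true, so it never matches by accident and no IFNULL wrapper is needed here. The term is not escaped, so a user-supplied % or _ still acts as a wildcard.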

MySQL selecting rows with a max id and matching other conditions

Using the tables below as an example and the listed query as a base query, I want to add a way to select only rows with a max id! Without having to do a second query!
TABLE VEHICLES
id vehicleName
----- --------
1 cool car
2 cool car
3 cool bus
4 cool bus
5 cool bus
6 car
7 truck
8 motorcycle
9 scooter
10 scooter
11 bus
TABLE VEHICLE NAMES
nameId vehicleName
------ -------
1 cool car
2 cool bus
3 car
4 truck
5 motorcycle
6 scooter
7 bus
TABLE VEHICLE ATTRIBUTES
nameId attribute
------ ---------
1 FAST
1 SMALL
1 SHINY
2 BIG
2 SLOW
3 EXPENSIVE
4 SHINY
5 FAST
5 SMALL
6 SHINY
6 SMALL
7 SMALL
And the base query:
select a.*
from vehicle a
join vehicle_names b using(vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
group
by a.id
having count(distinct c.attribute) = 2;
So what I want to achieve is to select rows with certain attributes, that match a name but only one entry for each name that matches where the id is the highest!
So a working solution in this example would return the below rows:
id vehicleName
----- --------
2 cool car
10 scooter
if it used some sort of max on the id.
At the moment I get all the entries for cool car and scooter.
My real-world database follows a similar structure and has tens of thousands of entries in it, so a query like the one above could easily return 3000+ results. I limit the results to 100 rows to keep execution time low, as the results are used in a search on my site. The reason I have repeats of "vehicles" with the same name but a different ID is that new models are constantly added, but I keep the older ones around for those who want to dig them up! On a search by car name, though, I don't want to return the older cars, just the newest one, which is the one with the highest ID!
The correct answer would adapt the query I provided above that I'm currently using and have it only return rows where the name matches but has the highest id!
If this isn't possible, suggestions on how I can achieve what I want without massively increasing the execution time of a search would be appreciated!
If you want to keep your logic, here is what I would do:
select a.*
from vehicle a
left join vehicle a2 on (a.vehicleName = a2.vehicleName and a.id < a2.id)
join vehicle_names b on (a.vehicleName = b.vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct c.attribute) = 2;
Which yields:
+----+-------------+
| id | vehicleName |
+----+-------------+
| 2 | cool car |
| 10 | scooter |
+----+-------------+
2 rows in set (0.00 sec)
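For anyone who wants to try the self-join on the question's sample data, here is a small harness. It is a sketch that uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the joins are written with explicit ON clauses and the GROUP BY lists both selected columns, which is a portability choice of this sketch rather than part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
CREATE TABLE vehicle_names (nameId INTEGER PRIMARY KEY, vehicleName TEXT);
CREATE TABLE vehicle_attribs (nameId INTEGER, attribute TEXT);
INSERT INTO vehicle VALUES
  (1,'cool car'),(2,'cool car'),(3,'cool bus'),(4,'cool bus'),(5,'cool bus'),
  (6,'car'),(7,'truck'),(8,'motorcycle'),(9,'scooter'),(10,'scooter'),(11,'bus');
INSERT INTO vehicle_names VALUES
  (1,'cool car'),(2,'cool bus'),(3,'car'),(4,'truck'),
  (5,'motorcycle'),(6,'scooter'),(7,'bus');
INSERT INTO vehicle_attribs VALUES
  (1,'FAST'),(1,'SMALL'),(1,'SHINY'),(2,'BIG'),(2,'SLOW'),(3,'EXPENSIVE'),
  (4,'SHINY'),(5,'FAST'),(5,'SMALL'),(6,'SHINY'),(6,'SMALL'),(7,'SMALL');
""")

rows = conn.execute("""
    SELECT a.id, a.vehicleName
    FROM vehicle a
    -- a2 finds a newer row with the same name; none exists for the latest row.
    LEFT JOIN vehicle a2
           ON a.vehicleName = a2.vehicleName AND a.id < a2.id
    JOIN vehicle_names b ON a.vehicleName = b.vehicleName
    JOIN vehicle_attribs c ON c.nameId = b.nameId
    WHERE c.attribute IN ('SMALL', 'SHINY')
      AND a.vehicleName LIKE '%coo%'
      AND a2.id IS NULL
    GROUP BY a.id, a.vehicleName
    HAVING COUNT(DISTINCT c.attribute) = 2
    ORDER BY a.id
""").fetchall()

print(rows)  # [(2, 'cool car'), (10, 'scooter')]
```

The row for 'cool bus' with id 5 survives the a2.id IS NULL test but is dropped because that name has neither SMALL nor SHINY.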
As others said, normalization could be done on a few levels.
Keeping your current vehicle_names table as the primary lookup table, I would change:
update vehicle a
inner join vehicle_names b using (vehicleName)
set a.vehicleName = b.nameId;
alter table vehicle change column vehicleName nameId int;
create table attribs (
attribId int auto_increment primary key,
attribute varchar(20),
unique key attribute (attribute)
);
insert into attribs (attribute)
select distinct attribute from vehicle_attribs;
update vehicle_attribs a
inner join attribs b using (attribute)
set a.attribute=b.attribId;
alter table vehicle_attribs change column attribute attribId int;
Which leads to the following query:
select a.id, b.vehicleName
from vehicle a
left join vehicle a2 on (a.nameId = a2.nameId and a.id < a2.id)
join vehicle_names b on (a.nameId = b.nameId)
join vehicle_attribs c on (a.nameId=c.nameId)
inner join attribs d using (attribId)
where d.attribute in ('SMALL', 'SHINY')
and b.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct d.attribute) = 2;
The table does not seem normalized; however, that lets you do this:
select max(id), vehicleName
from VEHICLES
group by vehicleName
having count(*)>=2;
I'm not sure I completely understand your model, but the following query satisfies your requirements as they stand. The first sub query finds the latest version of the vehicle. The second query satisfies your "and" condition. Then I just join the queries on vehiclename (which is the key?).
select a.id
,a.vehiclename
from (select a.vehicleName, max(id) as id
from vehicle a
where vehicleName like '%coo%'
group by vehicleName
) as a
join (select b.vehiclename
from vehicle_names b
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
group by b.vehiclename
having count(distinct c.attribute) = 2
) as b on (a.vehicleName = b.vehicleName);
If this "latest vehicle" logic is something you will need to do a lot, a small suggestion would be to create a view (see below) which returns the latest version of each vehicle. Then you could use the view instead of the find-max-query. Note that this is purely for ease-of-use, it offers no performance benefits.
select *
from vehicle a
where id = (select max(b.id)
from vehicle b
where a.vehiclename = b.vehiclename);
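Wrapped in CREATE VIEW, the latest-version query might look like the sketch below. The view name latest_vehicle is invented for illustration, and SQLite via Python's sqlite3 module stands in for MySQL, where the CREATE VIEW syntax is the same for a simple case like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
INSERT INTO vehicle VALUES
  (1,'cool car'),(2,'cool car'),(3,'cool bus'),(4,'cool bus'),(5,'cool bus'),
  (6,'car'),(7,'truck'),(8,'motorcycle'),(9,'scooter'),(10,'scooter'),(11,'bus');

-- Hypothetical view name: one row per vehicleName, the one with the highest id.
CREATE VIEW latest_vehicle AS
SELECT a.id, a.vehicleName
FROM vehicle a
WHERE a.id = (SELECT MAX(b.id)
              FROM vehicle b
              WHERE b.vehicleName = a.vehicleName);
""")

rows = conn.execute("""
    SELECT id, vehicleName
    FROM latest_vehicle
    WHERE vehicleName LIKE '%coo%'
    ORDER BY id
""").fetchall()

print(rows)  # [(2, 'cool car'), (5, 'cool bus'), (10, 'scooter')]
```

The view can then replace the vehicle table in the attribute query, so the "latest only" rule lives in one place.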
Without going into a proper redesign of your model, you could:
1) Add a column IsLatest that your application could manage.
This is not perfect but will satisfy your question (until the next problem; see the note at the end).
All you need is, when you add a new entry, to issue queries such as
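-- clear the flag only for the name being added
UPDATE vehicle
SET IsLatest = 0
WHERE IsLatest = 1 AND vehicleName = #name;
INSERT INTO vehicle (vehicleName) VALUES (#name);
UPDATE vehicle
SET IsLatest = 1
WHERE id = #last_inserted_id;
in a transaction or a trigger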
2) Alternatively, you can find out the max id before you issue your query
SELECT MAX(id)
FROM vehicle
WHERE vehicleName = #name
3) You can do it in a single SQL statement, and provided there is an index on (vehicleName, id), it should actually have decent speed with
select a.*
from vehicle a
join vehicle_names b ON a.vehicleName = b.vehicleName
join vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
join vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
left join vehicle notmax ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
where a.vehicleName like '%coo%'
AND notmax.id IS NULL
I have removed your GROUP BY and HAVING and replaced them with another join (assuming each attribute can appear only once per nameId).
I have also used one of the ways to find the max per group: join the table to itself and keep only the rows for which there is no record with a bigger id for the same name.
There are other ways; search SO for 'max per group sql'. Also see here, though not complete.
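Option 1 (an application-managed IsLatest flag) could be sketched as follows. This is an illustration under assumptions: the helper name add_vehicle is made up, the flag is cleared only for the name being inserted, and SQLite via Python's sqlite3 module stands in for MySQL. Here the new row is inserted with the flag already set, which folds the final UPDATE into the INSERT.

```python
import sqlite3


def add_vehicle(conn, name):
    """Insert a new model and make it the only IsLatest row for its name."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE vehicle SET IsLatest = 0 "
            "WHERE vehicleName = ? AND IsLatest = 1",
            (name,),
        )
        cur = conn.execute(
            "INSERT INTO vehicle (vehicleName, IsLatest) VALUES (?, 1)",
            (name,),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle ("
    "id INTEGER PRIMARY KEY, vehicleName TEXT, "
    "IsLatest INTEGER NOT NULL DEFAULT 0)"
)
for name in ["cool car", "cool car", "scooter", "cool car"]:
    add_vehicle(conn, name)

latest = conn.execute(
    "SELECT id, vehicleName FROM vehicle WHERE IsLatest = 1 ORDER BY id"
).fetchall()
print(latest)  # [(3, 'scooter'), (4, 'cool car')]
```

The search query then only needs an extra IsLatest = 1 condition instead of a self-join, at the cost of keeping the flag consistent on every insert and delete.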