MySQL Query - Object Queuing Based on Object Property to Skill Mapping - mysql

Okay, so I know the title is a bit cryptic so I'll do what I can to explain the "problem" I have and the solution I am currently using.
Problem:
An 'object' of work needs to be distributed to the appropriate user based on said object's properties.
The idea is that an object of work has properties. Those properties are mapped to skills. A user has skills and is able to work on an object which is within the user's skillset.
There are several [three] property definitions and I currently have the following table structures.
|-- Object to Property Set 1 -- Property Set 1 to Skill --|
Object Table -|-- Object to Property Set 2 -- Property Set 2 to Skill --|-- User Skill -- User Table
|-- Object to Property Set 3 -- Property Set 3 to Skill --|
The query may be a bit easier to understand:
SELECT counts.object_id,
COUNT(DISTINCT counts.object_skill) object_skill_count,
COUNT(DISTINCT counts.user_skill) user_skill_count
FROM
(SELECT object.object_id,
sp.skill_id object_skill,
us.skill_id user_skill
FROM object_table object
LEFT JOIN object_property op ON op.object_id = object.object_id
LEFT JOIN skill_property sp ON sp.property_id = op.property_id
LEFT JOIN user_skill us ON us.skill_id = sp.skill_id
AND us.active = 1
AND us.user_id = {$userid} -- <=- inserted from a PHP script
AND object.state = 1
UNION SELECT object.object_id,
sf.skill_id object_skill,
us.skill_id user_skill
FROM object_table object
LEFT JOIN object_flag obf ON obf.object_id = object.object_id
LEFT JOIN skill_flag sf ON sf.flag_id = obf.flag_id
LEFT JOIN user_skill us ON us.skill_id = sf.skill_id
AND us.active = 1
AND us.user_id = {$userid} -- <=- inserted from a PHP script
AND object.state = 1
UNION SELECT object.object_id,
sc.skill_id object_skill,
us.skill_id user_skill
FROM object_table object
LEFT JOIN object_creator oc ON oc.creator_id = object.creator_id
LEFT JOIN skill_creator sc ON sc.flag_id = oc.flag_id
LEFT JOIN user_skill us ON us.skill_id = sc.skill_id
AND us.active = 1
AND us.user_id = {$userid} -- <=- inserted from a PHP script
AND object.state = 1) counts
GROUP BY counts.object_id
Here we get a count of all the skills an object requires, as well as a count of the skills the user has for that same object. If the two counts match, we know the user can work on the object. If the object's skill count exceeds the user's count, the object is beyond the user's capabilities and will not be assigned to that user.
While the above query works, it slows significantly when thrown at a large[r] table. I would like to know if there is a better way of doing things. And, since the internet is filled with amazing people, here we are.
Retroactive Update:
The Left joins in this case are there because objects can have no properties. This equates to the count 0-0 and thus makes the object workable by anyone.
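To make the count-matching idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a cut-down schema with only one of the three property sets, made-up sample rows, and a bound parameter in place of the `{$userid}` interpolation; the `state = 1` filter is moved to a WHERE clause here, which is an assumption about the intent rather than a copy of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object_table (object_id INTEGER PRIMARY KEY, state INTEGER);
CREATE TABLE object_property (object_id INTEGER, property_id INTEGER);
CREATE TABLE skill_property (property_id INTEGER, skill_id INTEGER);
CREATE TABLE user_skill (user_id INTEGER, skill_id INTEGER, active INTEGER);

INSERT INTO object_table VALUES (1, 1), (2, 1), (3, 1);
-- object 1 needs skills 10 and 20, object 2 needs skill 30, object 3 has no properties
INSERT INTO object_property VALUES (1, 100), (1, 200), (2, 300);
INSERT INTO skill_property VALUES (100, 10), (200, 20), (300, 30);
-- user 7 holds skills 10 and 20 only
INSERT INTO user_skill VALUES (7, 10, 1), (7, 20, 1);
""")

def workable_objects(conn, user_id):
    """Return ids of objects whose required skills are all held by user_id."""
    rows = conn.execute("""
        SELECT o.object_id
        FROM object_table o
        LEFT JOIN object_property op ON op.object_id = o.object_id
        LEFT JOIN skill_property sp ON sp.property_id = op.property_id
        LEFT JOIN user_skill us ON us.skill_id = sp.skill_id
                               AND us.active = 1
                               AND us.user_id = ?
        WHERE o.state = 1
        GROUP BY o.object_id
        -- equal counts: every required skill was matched (0 = 0 when no properties)
        HAVING COUNT(DISTINCT sp.skill_id) = COUNT(DISTINCT us.skill_id)
        ORDER BY o.object_id
    """, (user_id,)).fetchall()
    return [r[0] for r in rows]

result = workable_objects(conn, 7)
print(result)
```

Object 2 is filtered out because its one required skill has no matching user_skill row, while object 3 passes with a 0-0 count.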

It looks OK: the conditions are placed on the joins instead of in the WHERE clause, and there are no ORDER BYs.
Possible options:
1) Check for missing indexes
http://basitaalishan.com/2013/03/13/find-missing-indexes-using-sql-servers-index-related-dmvs/
2) Change 'left joins' to 'inner joins'
INNER JOIN vs LEFT JOIN performance in SQL Server
3) Use 'UNION ALL' instead of 'Union'
performance of union versus union all

Related

Returning value from query based on results

I have this query I'm trying to build to display specific information from a stored table. I also need the query to display the enemy guild name, but I'm having trouble getting the query to take the enemy guild ID and link it to the name.
SELECT g.wPos as wPos, g.szGuildName as szGuildName, g.dwGuildExpWeek as dwGuildExpWeek, g.dwEnemyGuildID as dwEnemyGuildID, gm.wPower as wPower, gd.szName as szName
FROM guild as g
LEFT JOIN guild_member AS gm ON gm.dwGuildID = g.dwGuildID AND gm.wPower = '1'
LEFT JOIN gamedata AS gd ON gd.dwID = gm.dwRoleID
WHERE g.wPos = '1'
The output of the query right now results in the following:
(Screenshot: current query results)
What I need it to do now is take the dwEnemyGuildID it finds and then use that ID to search for the szGuildName while also displaying the other data it finds.
Use a SELF JOIN: join the same table again when a field is a reference to that same table. Here dwEnemyGuildID is a reference to the same table (guild).
A trivial example of the same idea is finding the manager of an employee from an employees table.
Reference: Find the employee id, name along with their manager_id and name
SELECT
g.wPos as wPos,
g.szGuildName as szGuildName,
g.dwGuildExpWeek as dwGuildExpWeek,
g.dwEnemyGuildID as dwEnemyGuildID,
enemy_g.szGuildName as szEnemyGuildName, -- pick name from self joined table
gm.wPower as wPower,
gd.szName as szName
FROM guild as g
LEFT JOIN guild_member AS gm ON gm.dwGuildID = g.dwGuildID AND gm.wPower = '1'
LEFT JOIN guild AS enemy_g ON g.dwEnemyGuildID = enemy_g.dwGuildID -- Use self join
LEFT JOIN gamedata AS gd ON gd.dwID = gm.dwRoleID
WHERE g.wPos = '1';
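The employee/manager case mentioned above can be sketched as follows with Python's built-in sqlite3 module. The `employees` table, its columns, and the sample rows are hypothetical and only serve to illustrate the self join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
INSERT INTO employees VALUES (1, 'Ada', NULL), (2, 'Bob', 1), (3, 'Cy', 1);
""")

# The table is joined to itself under a second alias (m) to look up the manager row.
rows = conn.execute("""
    SELECT e.id, e.name, m.name AS manager_name
    FROM employees AS e
    LEFT JOIN employees AS m ON m.id = e.manager_id
    ORDER BY e.id
""").fetchall()
print(rows)
```

The LEFT JOIN keeps employees without a manager, just as the guild query keeps guilds that have no enemy guild set.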

Return a different datatype from postgresql

I have the below query in PG
SELECT
project.project_id,
project.project_name,
category.category_name,
array_agg(row(skill.skill_name,projects_skills.projects_skills_id)) AS skills
FROM project
JOIN projects_skills ON project.project_id = projects_skills.project_id
JOIN skill ON projects_skills.skill_id = skill.skill_id
JOIN category ON project.category_id = category.category_id
GROUP BY project.project_name,project.project_id, category.category_name;
Of particular interest is the line below, which seems to return a pseudo-type tuple:
array_agg(row(skill.skill_name,projects_skills.projects_skills_id)) AS skills
I'm unable to create a view of this because of the pseudo type - in addition to this, the row function seems to return a tuple set like the below:
skills: '{"(Python,3)","(Node,4)","(Javascript,5)"}' }
I could painfully parse it in JavaScript by replacing '(' to '[' etc. but could I do something in postgres to return it preferably as an object?
One possible solution is to register a row type (once):
CREATE TYPE my_type AS (skill_name text, projects_skills_id int);
I am guessing text and int as data types. Use the actual data types of the underlying tables.
SELECT p.project_id, p.project_name, c.category_name
, array_agg((s.skill_name, ps.projects_skills_id)::my_type) AS skills
FROM project p
JOIN projects_skills ps ON p.project_id = ps.project_id
JOIN skill s ON ps.skill_id = s.skill_id
JOIN category c ON p.category_id = c.category_id
GROUP BY p.project_id, p.project_name, c.category_name;
There are many other options, depending on your version of Postgres and what you need exactly.
As well as the excellent suggestions to use JSON in the comments, and @Erwin's to use a registered composite type, you can use a two-dimensional array, or a multivalues approach:
Just replace your line
array_agg(row(skill.skill_name::text,projects_skills.projects_skills_id::text)) AS skills
with the following:
Two dimension array option 1
array_agg(array[skill.skill_name::text,projects_skills.projects_skills_id::text]) AS skills
-- skills will be '{{Python,3},{Node,4},{Javascript,5}}', thus
-- skills[1][1] = 'Python' and skills[1][2] = '3' -- id is text
Two dimension array option 2
array[array_agg(skill.skill_name::text),array_agg(projects_skills.projects_skills_id::text)] AS skills
-- skills will be '{{Python,Node,Javascript},{3,4,5}}', thus
-- skills[1][1] = 'Python' and skills[2][1] = '3' -- id is text
Multivalues
array_agg(skill.skill_name) AS skill_names,
array_agg(projects_skills.projects_skills_id) AS skills_ids
-- skill_names = '{Python,Node,Javascript}' and skills_ids = '{3,4,5}', thus
-- skill_names[1] = 'Python' and skills_ids[1] = 3 -- id is integer

Why can't I exclude this row based on a condition?

http://sqlfiddle.com/#!3/3ec1f/119
Here's my fiddle...I want the result to look like this but the query I'm using doesn't do that:
My problem with the query is that I can't seem to exclude "The Kingdom of the Crystal Skull" using the exclusion_flag condition. I also don't know why it seems that Contract 3 (Raiders of the Lost Arc) is not showing up either. I have been toiling with this for hours and have no idea what the problem is. I tried looking into subqueries, but I'm not sure that's the solution...
There's a couple of questions/issues there so I'll try to address them individually.
1) You can't exclude "The Kingdom of the Crystal Skull" using the exclusion_flag because contract_sid 7 and 8 both refer to product_list_sid 3 which includes "The Kingdom of the Crystal Skull" - you would need to create a separate product_list_sid if you wanted a contract which excluded it.
2) "Raiders of the Lost Arc" (contract_sid 3) isn't showing up because it's a "single product" contract, and your query only joins from scope to product_list_join using product_list_sid - contract_sid 3 is in the product_sid column, so you need a separate join to cater for contracts that use product_sid instead of product_list_sid (I assume that a contract can't use both). This is a pretty dodgy schema design but here's a query that solves that issue. Notice the use of LEFT OUTER JOIN to indicate that the table being joined to might not contain any rows (for example when scope.product_list_sid is NULL but scope.product_sid is not).
SELECT s.contract_sid,
c.contract_description,
ISNULL(p.product_description, p2.product_description) AS product_description
FROM scope s
JOIN contracts c ON (c.contract_sid = s.contract_sid)
LEFT OUTER JOIN
product_list_join plj ON (plj.product_list_sid = s.product_list_sid)
LEFT OUTER JOIN
products p ON (p.product_sid = plj.product_sid)
LEFT OUTER JOIN
products p2 ON (p2.product_sid = s.product_sid)
WHERE s.exclusion_flag = 'N'
ORDER BY s.contract_sid;
Here's the SQLFiddle for my solution: http://sqlfiddle.com/#!3/fc62e/10
Edit: After posting this I realised what you're actually trying to do - the scope table not only provides the details of contracts but also provides specific products to exclude from contracts. Again, this is bad schema design and there should be a separate scope_exclusions table or something, but here's a query that does that and excludes "The Kingdom of the Crystal Skull" as requested:
SELECT inner_query.contract_description,
inner_query.product_description
FROM (
SELECT s.contract_sid,
c.contract_description,
ISNULL(p.product_sid, p2.product_sid) AS product_sid,
ISNULL(p.product_description, p2.product_description) AS product_description
FROM scope s
JOIN contracts c ON (c.contract_sid = s.contract_sid)
LEFT OUTER JOIN
product_list_join plj ON (plj.product_list_sid = s.product_list_sid)
LEFT OUTER JOIN
products p ON (p.product_sid = plj.product_sid)
LEFT OUTER JOIN
products p2 ON (p2.product_sid = s.product_sid)
WHERE s.exclusion_flag = 'N'
) inner_query
WHERE NOT EXISTS ( SELECT 1
FROM scope
WHERE exclusion_flag = 'Y'
AND contract_sid = inner_query.contract_sid
AND product_sid = inner_query.product_sid )
ORDER BY inner_query.contract_description;
SQL Fiddle: http://sqlfiddle.com/#!3/fc62e/14

Extracting Member who did not contribute

How do you find out who did not contribute to a particular fundraiser that we all just did? There are many titles for the different charities, but I just want to extract the non-contributors for one particular charity title. Is there any way to do this? When I run the query below, it comes back as an empty set. The search is done by matching table IDs with left joins. Please see below.
SELECT
moiid,
trim(concat(name.fname,' ' ,name.mname,' ',name.lname)) as Brother,
name.moiid as Members_ID,
sum(otherpay.othpayamt) as NO_Contribution,
quadlt.ltfname as quad
FROM name
LEFT JOIN OTHERPAY ON name.moiid = otherpay.othpaymoiid
LEFT JOIN quadlt ON name.quadlt = quadlt.ltid
WHERE Otherpay.othpaytitle like '%food drive%'
AND otherpay.othpaymoiid IS NULL
AND name.type = 'BOI'
AND name.type <> 'jrboi'
AND name.city = 'SUFFOLK'
GROUP BY brother
ORDER BY name.quadlt, brother
When you add conditions to the where clause for tables that are left joined, you effectively turn them into an inner join, requiring them to return records.
You can move the conditions to the join itself:
SELECT moiid, trim(concat(name.fname,' ' ,name.mname,' ',name.lname)) as Brother, name.moiid as Members_ID, sum(otherpay.othpayamt) as NO_Contribution, quadlt.ltfname as quad
FROM name
LEFT JOIN OTHERPAY
ON name.moiid = otherpay.othpaymoiid
AND Otherpay.othpaytitle like '%food drive%'
LEFT JOIN quadlt ON name.quadlt = quadlt.ltid
WHERE
otherpay.othpaymoiid IS NULL
AND name.type = 'BOI'
AND name.type <> 'jrboi'
AND name.city = 'SUFFOLK'
GROUP BY brother
ORDER BY name.quadlt, brother
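The difference between filtering in the WHERE clause and filtering in the join can be reproduced with a small sketch using Python's built-in sqlite3 module. The `member` and `payment` tables and their rows are hypothetical stand-ins for `name` and `otherpay`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE payment (member_id INTEGER, title TEXT, amount REAL);
INSERT INTO member VALUES (1, 'Al'), (2, 'Ben'), (3, 'Cal');
-- Al gave to the food drive, Ben gave to something else, Cal gave nothing
INSERT INTO payment VALUES (1, 'food drive', 20.0), (2, 'toy drive', 5.0);
""")

# Title filter in WHERE: a row cannot both match the title and be NULL, so nothing survives.
in_where = conn.execute("""
    SELECT m.name
    FROM member m
    LEFT JOIN payment p ON p.member_id = m.id
    WHERE p.title LIKE '%food drive%'
      AND p.member_id IS NULL
""").fetchall()

# Title filter in the join: members with no matching payment keep a NULL row and are returned.
in_join = conn.execute("""
    SELECT m.name
    FROM member m
    LEFT JOIN payment p ON p.member_id = m.id
                       AND p.title LIKE '%food drive%'
    WHERE p.member_id IS NULL
    ORDER BY m.name
""").fetchall()

print(in_where, in_join)
```

Ben shows up in the second result even though he has a payment row, because that payment is for a different title and never matches the join condition.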

MySQL Invalid query: Too high level of nesting for select

I switched the code to Andrew's solution:
SELECT s1.biz_name, s1.biz_info, s1.e_address, s1.e_city, s1.e_state,
s1.e_postal, s1.e_zip_full, s1.loc_LAT_centroid, s1.loc_LONG_centroid,
s1.biz_phone, s1.biz_phone_ext, s1.biz_fax, s1.biz_email, s1.web_url,
s2.upc as upc2, s2.retailprice as retailprice2, s2.dollar_sales as
dollar_sales2, s2.dollar_sales_ly as dollar_sales_ly2, s2.todaydate as
todaydate2, s2.datetimesql as datetimesql2, s2.shelfposition as
shelfposition2, s2.reg_sale as reg_sale2, s2.representative as
representative2, s2.notes as notes2, s3.upc as upc3, s3.retailprice as
retailprice3, s3.dollar_sales as dollar_sales3, s3.dollar_sales_ly as
dollar_sales_ly3, s3.todaydate as todaydate3, s3.datetimesql as
datetimesql3, s3.shelfposition as shelfposition3, s3.reg_sale as reg_sale3,
s3.representative as representative3, s3.notes as notes3, s4.upc as upc4,
s4.retailprice as retailprice4, s4.dollar_sales as dollar_sales4,
s4.dollar_sales_ly as dollar_sales_ly4, s4.todaydate as todaydate4,
s4.datetimesql as datetimesql4, s4.shelfposition as shelfposition4,
s4.reg_sale as reg_sale4, s4.representative as representative4, s4.notes as
notes4, s5.upc as upc5, s5.retailprice as retailprice5, s5.dollar_sales as
dollar_sales5, s5.dollar_sales_ly as dollar_sales_ly5, s5.todaydate as
todaydate5, s5.datetimesql as datetimesql5, s5.shelfposition as
shelfposition5, s5.reg_sale as reg_sale5, s5.representative as
representative5, s5.notes as notes5
FROM allStores AS s1
LEFT OUTER JOIN storeCheckRecords AS s2
ON s1.e_address = s2.e_address AND s2.upc = '650637119004'
LEFT OUTER JOIN storeCheckRecords AS s3
ON s1.e_address = s3.e_address AND s3.upc = '650637119011'
LEFT OUTER JOIN storeCheckRecords AS s4
ON s1.e_address = s4.e_address AND s4.upc = '650637374007'
LEFT OUTER JOIN storeCheckRecords AS s5
ON s1.e_address = s5.e_address AND s5.upc = '650637374014'
WHERE s2.e_address IS NOT NULL
OR s3.e_address IS NOT NULL
OR s4.e_address IS NOT NULL
OR s5.e_address IS NOT NULL
Here is the new error: Invalid query: Too many tables; MySQL can only use 61 tables in a join
Any other ideas? Thanks for the help.
Could be related to
MySQL bug #41156, List of derived tables acts like a chain of mutually-nested subqueries.
The bug log indicates it was verified against MySQL 5.0.72, 5.1.30, and 6.0.7.
Fixed in MySQL 5.1.37, MySQL 5.4.2 (which became 5.5.something), and NDB 7.1.0.
Regarding your redesigned query in the question above:
Pivot queries can be tricky. You can use the method suggested by Andrew in his answer. If you search for many UPC values, you need to write application code to build the SQL query, appending as many JOIN clauses as the number of UPC values you're searching for.
MySQL does have a limit on the number of joins that can be done in a single query, but the example you show doesn't reach the limit. That is, the query you show does work.
I assume that you're showing an example query searching for four UPC codes, whereas your app may construct the query dynamically for a greater number of UPC codes, and that may be more than 61 sometimes.
It looks like the goal of your query is to return stores that have at least one of the listed UPC codes. You can do that more simply in the following query:
SELECT DISTINCT s.*
FROM allStores AS s
JOIN storeCheckRecords AS cr
ON s.e_address = cr.e_address
AND cr.upc IN ('650637119004','650637119011','650637374007','650637374014');
You can use this method in other ways, for example to find stores that have all four of the UPCs:
SELECT s.*
FROM allStores AS s
JOIN storeCheckRecords AS cr
ON s.e_address = cr.e_address
AND cr.upc IN ('650637119004','650637119011','650637374007','650637374014')
GROUP BY s.e_address
HAVING COUNT(DISTINCT cr.upc) = 4;
Or to find stores that have some but not all four of the UPCs:
SELECT s.*
FROM allStores AS s
JOIN storeCheckRecords AS cr
ON s.e_address = cr.e_address
AND cr.upc IN ('650637119004','650637119011','650637374007','650637374014')
GROUP BY s.e_address
HAVING COUNT(DISTINCT cr.upc) < 4;
Or to find stores that lack all four of the UPCs (this needs an outer join, so that stores without a matching record are kept):
SELECT s.*
FROM allStores AS s
LEFT OUTER JOIN storeCheckRecords AS cr
ON s.e_address = cr.e_address
AND cr.upc IN ('650637119004','650637119011','650637374007','650637374014')
WHERE cr.e_address IS NULL;
You still have to write some code to build this query, but it's a bit easier to do, and it doesn't exceed any limits on the number of joins or subqueries you can run.
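As a rough idea of what that query-building code might look like, here is a sketch using Python's built-in sqlite3 module rather than MySQL. The helper names `stores_with_all` and `stores_with_none`, the two-column tables, and the short UPC values are all assumptions made for illustration; the point is that the IN list is built from placeholders, so the number of joins stays fixed however many UPCs are passed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE allStores (e_address TEXT PRIMARY KEY, biz_name TEXT);
CREATE TABLE storeCheckRecords (e_address TEXT, upc TEXT);
INSERT INTO allStores VALUES ('1 Elm St', 'A'), ('2 Oak St', 'B'), ('3 Ash St', 'C');
INSERT INTO storeCheckRecords VALUES
  ('1 Elm St', '111'), ('1 Elm St', '222'),
  ('2 Oak St', '111');
""")

def stores_with_all(conn, upcs):
    """Addresses of stores that carry every UPC in upcs (hypothetical helper)."""
    placeholders = ", ".join("?" for _ in upcs)  # one bound parameter per UPC
    sql = f"""
        SELECT s.e_address
        FROM allStores AS s
        JOIN storeCheckRecords AS cr
          ON s.e_address = cr.e_address
         AND cr.upc IN ({placeholders})
        GROUP BY s.e_address
        HAVING COUNT(DISTINCT cr.upc) = ?
        ORDER BY s.e_address
    """
    return [r[0] for r in conn.execute(sql, (*upcs, len(upcs)))]

def stores_with_none(conn, upcs):
    """Addresses of stores that carry none of the UPCs in upcs (hypothetical helper)."""
    placeholders = ", ".join("?" for _ in upcs)
    sql = f"""
        SELECT s.e_address
        FROM allStores AS s
        LEFT OUTER JOIN storeCheckRecords AS cr
          ON s.e_address = cr.e_address
         AND cr.upc IN ({placeholders})
        WHERE cr.e_address IS NULL
        ORDER BY s.e_address
    """
    return [r[0] for r in conn.execute(sql, tuple(upcs))]

all_two = stores_with_all(conn, ["111", "222"])
none_of = stores_with_none(conn, ["111", "222"])
print(all_two, none_of)
```

Only the placeholder list is interpolated into the SQL text; the UPC values themselves are passed as bound parameters.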
This should give you the same results without using subqueries:
SELECT s1.biz_name,
...
s2.upc AS upc2,
...
s3.upc AS upc3,
...
s4.upc AS upc4,
...
s5.upc AS upc5,
...
FROM allStores AS s1
LEFT OUTER JOIN storeCheckRecords AS s2 ON s1.e_address = s2.e_address
LEFT OUTER JOIN storeCheckRecords AS s3 ON s1.e_address = s3.e_address
LEFT OUTER JOIN storeCheckRecords AS s4 ON s1.e_address = s4.e_address
LEFT OUTER JOIN storeCheckRecords AS s5 ON s1.e_address = s5.e_address
WHERE (s2.e_address IS NOT NULL
OR s3.e_address IS NOT NULL
OR s4.e_address IS NOT NULL
OR s5.e_address IS NOT NULL)
AND (s2.upc = '650637119004' OR s2.upc IS NULL)
AND (s3.upc = '650637119011' OR s3.upc IS NULL)
AND (s4.upc = '650637374007' OR s4.upc IS NULL)
AND (s5.upc = '650637374014' OR s5.upc IS NULL)
I would simplify to just get all the elements first with a simple WHERE IN clause... You appear to be doing a pivot table to show T1 to T2 to T3 to T4 to T5. If you get all the data in individual rows, then you can have STATIC columns across the top showing the details per row under each other.
SELECT
t1.brand,
t1.biz_name,
t1.biz_info,
t1.e_address,
t1.e_city,
t1.e_state,
t1.e_postal,
t1.e_zip_full,
t1.loc_LAT_centroid,
t1.loc_LONG_centroid,
t1.biz_phone,
t1.biz_phone_ext,
t1.biz_fax,
t1.biz_email,
t1.web_url,
t1.upc,
t1.retailprice,
t1.dollar_sales,
t1.dollar_sales_ly,
t1.todaydate,
t1.datetimesql,
t1.shelfposition,
t1.reg_sale,
t1.representative,
t1.notes
FROM
storeCheckRecords as t1
WHERE
t1.upc IN ( '650637119004', '650637119011', '650637374007', '650637374014')
such as..
Brand Bus Addr UPC Retail$ Sales Notes
xyz Bus Name UPC ... etc... Cur Yr
Bus Info Shelf Info Last Yr
Address, (Cit/State/Zip)
Lat / Long
Phone / Fax
Email / Web
----
Next Entry
Does it really matter that the addresses match exactly, as opposed to who carries an item? If one entry is "123 Main St", another is "123B Main St", and another is "123 Main St - Suite B", you wouldn't find a match.
Additionally, you mention some having up to 75 UPC codes... Put those in a separate table and use that as the first table joined to the "StoreCheckRecords" and get them all... instead of manually keying all the columns suffixed from 2 to 75... or however many on the next run being only 17, and yet another 4... I think you may be too fixed into what you are trying to get out from the data.
You could even GROUP by the common "e_address" you originally wanted the matches based on and provide that group as a break between sections reported to the user.
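A minimal sketch of the lookup-table idea, using Python's built-in sqlite3 module: the `wanted_upc` table name, the reduced column set, and the sample rows are assumptions for illustration only. The wanted UPCs are stored as rows, joined to storeCheckRecords once, and the result is then split into sections by e_address.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE storeCheckRecords (e_address TEXT, upc TEXT, retailprice REAL);
CREATE TABLE wanted_upc (upc TEXT PRIMARY KEY);  -- hypothetical lookup table
INSERT INTO storeCheckRecords VALUES
  ('1 Elm St', '111', 2.5), ('1 Elm St', '333', 4.0), ('2 Oak St', '222', 3.0);
INSERT INTO wanted_upc VALUES ('111'), ('222');
""")

# One join, however many UPCs are stored in wanted_upc.
rows = conn.execute("""
    SELECT r.e_address, r.upc, r.retailprice
    FROM wanted_upc AS w
    JOIN storeCheckRecords AS r ON r.upc = w.upc
    ORDER BY r.e_address, r.upc
""").fetchall()

# Break the sorted rows into one section per address for reporting.
sections = {addr: [r[1] for r in grp]
            for addr, grp in groupby(rows, key=lambda r: r[0])}
print(sections)
```

Changing the search from 4 UPCs to 75 then means changing the rows in the lookup table, not the query.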