SQL self-join to return specific rows - mysql

Skip to bottom to avoid long-winded explanation
Ok, so.
I'm working on a company intranet for managing client jobs. Jobs are comprised of Elements: an example element might be "Build a six-page website", or "Design a logo".
Each element consists of a collection of role-hours, so "Build a six-page website" might include four hours of "Developer" rate and two hours of "Designer" rate (ok, maybe a little longer :)
Obviously, different clients get different hourly rates. And, although that's already accounted for in the system, it's not giving us enough flexibility. Traditionally, our account managers have been rather... ad hoc... with their pricing: the "Build a six-page website" element might include the standard four hours of developer for client "Bob", but eight hours for client "Harry".
Bear with me. I will get to actual code soon.
Elements are, of course, stored in the "Elements" database table - which is composed of little more than an ID and a text label.
My work-in-progress solution to the "we need client-specific elements" problem is to add a "client" field to this table. We can then go through and add any client-specific versions of the available elements, tweaking them to taste.
When the account managers go to add elements to their jobs, they should only see elements that are either (a) available to anyone - that is, they have a NULL client field, or (b) specific to the job client.
So far, so SELECT WHERE.
But that isn't going to cut it. If I add a second "Build a six-page website" element specifically for Harry, then an account manager adding elements to a job for Harry will see both the standard version, and Harry's version of the element. This is no good. They should only see the standard version if there's not an applicable client-specific version.
Ok... soooo: as well as adding a "client" field to the elements table, add a "parent element" field. We can then do something magically self-referential involving joining the table to itself, and fetch only the relevant roles.
My long-awaited question is thus:
Oh look, an actual question
id  label            client  parent_element
1   Standard Thing   NULL    NULL
2   Harrys Thing     1       1
3   Bobs Thing       2       1
4   Different Thing  NULL    NULL
Given this table structure, how can I write a single SQL query that will accept a "client ID" parameter and return:
For client ID 1, rows 2 and 4
For client ID 2, rows 3 and 4
For client ID 42, rows 1 and 4
For extra bonus points, the results should include the parent element label. So for client ID 1, for example:
id  label            standardised_label  client  parent_element
2   Harrys Thing     Standard Thing      1       1
4   Different Thing  Different Thing     NULL    NULL

SELECT mm.*, md.label AS standardized_label
FROM mytable md
LEFT JOIN
    mytable mc
ON  mc.parent_element = md.id
    AND mc.client = @client
JOIN mytable mm
ON  mm.id = COALESCE(mc.id, md.id)
WHERE md.client IS NULL
Create an index on (client, parent_element) for this to work fast.
See SQLFiddle.
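To check the query against the sample rows from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in so the example runs without a MySQL server, the :client placeholder replaces the MySQL variable, and the helper name elements_for is made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (
        id INTEGER PRIMARY KEY,
        label TEXT,
        client INTEGER,
        parent_element INTEGER
    );
    INSERT INTO mytable VALUES
        (1, 'Standard Thing',  NULL, NULL),
        (2, 'Harrys Thing',    1,    1),
        (3, 'Bobs Thing',      2,    1),
        (4, 'Different Thing', NULL, NULL);
""")

# md = standard element, mc = client-specific override (if any), mm = row returned.
QUERY = """
    SELECT mm.id, mm.label, md.label AS standardized_label
    FROM mytable md
    LEFT JOIN mytable mc
           ON mc.parent_element = md.id
          AND mc.client = :client
    JOIN mytable mm
      ON mm.id = COALESCE(mc.id, md.id)
    WHERE md.client IS NULL
    ORDER BY mm.id
"""

def elements_for(client_id):
    """Return (id, label, standardized_label) rows visible to one client."""
    return conn.execute(QUERY, {"client": client_id}).fetchall()

for client_id in (1, 2, 42):
    print(client_id, [row[0] for row in elements_for(client_id)])
```

With the sample data this returns ids 2 and 4 for client 1, 3 and 4 for client 2, and 1 and 4 for client 42, which is what the question asks for.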

Related

How do I get all strings that do not contain another string in MySQL?

I have a table called "Domains" with field "Name" (unique, always lowercase) which contains a list of domains and subdomains on my server like:
1. blah.example.com
2. www.example.com
3. www.blah.example.com
4. example.com
5. example.nl
6. example.org
Looking at this list, names 1, 2 and 3 are subdomains of item 4. And I'm looking to just find all domains in this table without these subdomains. Or, to be more precise, any name that does not have part of it in the name from another record. Thus only item 4, 5 and 6.
If record 4 was missing then this query would also have item 1 and 2 as result, but not item 3. After all, item 3 has item 1 as part of it.
Just trying to find the query that can provide me this result... Something with select d.name from domains where d.name not in... Well, there my mind goes blank.
Why?
This list of domains is generated by my web server which registers every new domain that gets requested on it. I'm working on a reporting page where I would display the top domain names to see if there are any weird domains in it. For some reason, I sometimes see unknown domain names in these requests and this might give some additional insight in it all.
I am going to change my code so it will include references to parent domains in the same table in the future but for now I'll have to deal with this and a simple SQL solution.
Use a self-join that matches on suffixes using LIKE, and keep only the rows that found no match:
SELECT d1.name
FROM domains AS d1
LEFT JOIN domains AS d2 ON d1.name LIKE CONCAT('%.', d2.name)
WHERE d2.name IS NULL
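As a rough check of the suffix self-join, the sketch below runs it on the six names from the question using Python's sqlite3 module. SQLite stands in for MySQL here, so CONCAT is written with the || operator, and the helper name top_level is hypothetical.

```python
import sqlite3

def top_level(names):
    """Return the names that are not a subdomain of any other name in the list."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE domains (name TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO domains VALUES (?)", [(n,) for n in names])
    rows = conn.execute("""
        SELECT d1.name
        FROM domains AS d1
        LEFT JOIN domains AS d2
               ON d1.name LIKE '%.' || d2.name   -- d1 ends with '.' + d2
        WHERE d2.name IS NULL                    -- no parent found
        ORDER BY d1.name
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]

names = [
    "blah.example.com", "www.example.com", "www.blah.example.com",
    "example.com", "example.nl", "example.org",
]
print(top_level(names))
# Same list without example.com, the case described in the question.
print(top_level([n for n in names if n != "example.com"]))
```

The first call keeps only example.com, example.nl and example.org. With example.com removed, blah.example.com and www.example.com are returned as well, while www.blah.example.com is still filtered out because blah.example.com is its suffix.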

MySQL finding data if any 4 of 5 columns are found in a row

I have an imported table of several thousand customers. The development I am working on allows anonymous purchase checkouts (customers do not need to log in to check out), but if enough of their details match a database record, it should do a soft match, email the (probably new) email address, and eventually associate the anonymous checkout with the account record on file.
It is being rolled out this way because of the age of the records: many people have the same postal address or name but not the same email address, some people will have moved house, and some will have changed name (marriage etc).
What I think I am looking for is a MySQL CASE system, however the CASE questions on Stack Overflow I've found don't appear to cover what I'm trying to get from this query.
The query should work something like this:
$input[0] = postcode (zip code)
$input[1] = postal address
$input[2] = phone number
$input[3] = surname
$input[4] = forename
SELECT account_id FROM account WHERE <4 or more of the variables listed match the same row>
The only way I KNOW I can do this is with a massive bunch of OR statements but that's excessive and I'm sure there's a cleaner more concise method.
I also apologise in advance if this is relatively easy but I don't [think I] know the keyword to research constructing this. As I say, CASE is my best guess.
I'm having trouble working out how to manipulate CASE to fit what I'm trying to do. I do not need to return the values only the account_id from the valid row (only) that matches 4 or 5 of the given inputs.
I imagine that I could construct a layout that does this:
SELECT account_id CASE <if postcode_column=postcode_var> X=X+1
CASE <if surname_column=surname_var> X=X+1
...
...
WHERE X > 3
Is CASE the right idea?
If not, What is the process I need to use to achieve the desired results?
What is [another] MySQL keyword / syntax I need to research, if not CASE.
Here is your pseudo query:
SELECT account_id
FROM account
WHERE (postcode = 'pc')
    + (postal_address = 'pa')
    + (phone_number = '12345678901')
    + (surname = 'sn')
    + (forename = 'fn') > 3
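Each comparison evaluates to 1 or 0, so the sum counts how many columns matched. The sketch below tries the same idea with Python's sqlite3 module, where comparisons also yield 1 or 0. The sample rows and the helper name soft_match are invented for illustration, and values are passed as bound parameters rather than inlined into the SQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        postcode TEXT, postal_address TEXT, phone_number TEXT,
        surname TEXT, forename TEXT
    );
    INSERT INTO account VALUES
        (1, 'AB1 2CD', '1 High Street', '01234567890', 'Smith', 'Alice'),
        (2, 'AB1 2CD', '1 High Street', '09876543210', 'Smith', 'Bob'),
        (3, 'ZZ9 9ZZ', '5 Low Road',    '01234567890', 'Jones', 'Alice');
""")

def soft_match(postcode, postal_address, phone_number, surname, forename):
    """Return account ids where at least 4 of the 5 inputs equal the stored values."""
    sql = """
        SELECT account_id
        FROM account
        WHERE (postcode = ?)
            + (postal_address = ?)
            + (phone_number = ?)
            + (surname = ?)
            + (forename = ?) > 3
        ORDER BY account_id
    """
    params = (postcode, postal_address, phone_number, surname, forename)
    return [row[0] for row in conn.execute(sql, params)]

# Forename differs, the other four fields match account 1.
print(soft_match('AB1 2CD', '1 High Street', '01234567890', 'Smith', 'Alicia'))
```

Only account 1 is returned: account 2 matches three fields and account 3 matches one. Note that a NULL in any compared column makes the whole sum NULL, so such a row is never returned; wrapping each comparison in COALESCE(..., 0) would be one way around that if the imported data has gaps.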

How to do a MYSQL conditional select statement

Background
I'm faced with the following problem, relating to three tables
class_sectors table contains three categories of classes
classes table contains a list of classes students can attend
class_choices contains the first, second and third class choice of the student, for each sector. So for sector 1 Student_A has class_1 as first choice, class_3 as second choice and class_10 as third choice for example, then for sector 2 he has another three choices, etc...
The class_choices table has these columns:
kp_choice_id | kf_personID | kf_sectorID | kf_classID | preference | assigned
I think the column names are self explanatory. preference is either 1, 2 or 3. And assigned is a boolean set to 1 once we have reviewed a student's choices and assigned them to a class.
Problem:
Writing an sql query that tells the students what class they are assigned to for each sector. If their class hasn't been assigned, it should default to show their first preference.
I have actually got this to work, but using two (very bloated??) sql queries as follows:
$choices = $db -> Q("SELECT
*, concat_ws(':', `kf_personID`, `kf_sectorID`) AS `concatids`
FROM
`class_choices`
WHERE
(`assigned` = '1')
GROUP BY
`concatids`
ORDER BY
`kf_personID` ASC,
`kf_sectorID` ASC;");
$choices2 = $db -> Q("SELECT
*, concat_ws(':', `kf_personID`, `kf_sectorID`) AS `concatids`
FROM
`class_choices`
WHERE
`preference` = '1'
GROUP BY
`concatids`
HAVING
`concatids` NOT IN (".iimplode($choices).")
ORDER BY
`kf_personID` ASC,
`kf_sectorID` ASC;");
if(is_array($choices2)){
$choices = array_merge($choices,$choices2);
}
Now $choices does have what I want.
But I'm sure there is a way to simplify this, merge the two SQL queries, and so it's a bit more lightweight.
Is there some kind of conditional SQL query that can do this???
Your solution uses two steps to enable you to filter the data as needed. Since you are generating a report, this is a pretty good approach even if it looks a bit more verbose than you might like.
The advantage of this approach is that it is much easier to debug and maintain, a big plus.
To improve the situation, you need to consider the data structure itself. When I look at the class_choices table, I see the following fields: kf_classID, preference, assigned which contain the key information.
For each class, the assigned field is either 0 (default) or 1 (when the class preference is assigned for the student). By default, the class with preference = 1 is the assigned one since you display it in the report when assigned=0 for all the student's class choices in a particular sector.
The data model could be improved by imposing a business rule as follows:
For preference=1 set the default value assigned=1. When the class selection process takes place, and if the student gets assigned the 2nd or 3rd choice, then preference 1 is unassigned and the alternate choice assigned.
This means a bit more code in the application but it makes the reporting a bit easier.
The source of the difficulty is that the assignment process does not explicitly assign the 1st preference. It only updates assigned if the student cannot get the 1st choice.
In summary, your SQL is good and the improvements come from taking another look at the data model.
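If a single query is still wanted without changing the data model, one possible sketch is a self-join: start from each student's first preference and let an assigned row, when one exists, override it via COALESCE. It is run here through Python's sqlite3 module as a stand-in for MySQL, with made-up sample rows, and it assumes every student has a preference-1 row per sector and at most one assigned row per sector. The helper name class_report is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class_choices (
        kp_choice_id INTEGER PRIMARY KEY,
        kf_personID INTEGER, kf_sectorID INTEGER, kf_classID INTEGER,
        preference INTEGER, assigned INTEGER DEFAULT 0
    );
    INSERT INTO class_choices VALUES
        (1, 1, 1, 1,  1, 0),   -- person 1, sector 1: 2nd choice was assigned
        (2, 1, 1, 3,  2, 1),
        (3, 1, 1, 10, 3, 0),
        (4, 1, 2, 4,  1, 0),   -- person 1, sector 2: nothing assigned yet
        (5, 1, 2, 5,  2, 0),
        (6, 1, 2, 6,  3, 0),
        (7, 2, 1, 7,  1, 1),   -- person 2, sector 1: 1st choice was assigned
        (8, 2, 1, 8,  2, 0);
""")

def class_report():
    """One row per person and sector: the assigned class, else the first preference."""
    return conn.execute("""
        SELECT p.kf_personID,
               p.kf_sectorID,
               COALESCE(a.kf_classID, p.kf_classID) AS kf_classID
        FROM class_choices AS p              -- p = first-preference row
        LEFT JOIN class_choices AS a         -- a = assigned row, if any
               ON a.kf_personID = p.kf_personID
              AND a.kf_sectorID = p.kf_sectorID
              AND a.assigned = 1
        WHERE p.preference = 1
        ORDER BY p.kf_personID, p.kf_sectorID
    """).fetchall()

print(class_report())
```

For the sample rows this gives class 3 for person 1 in sector 1 (assigned second choice), class 4 in sector 2 (first preference, nothing assigned) and class 7 for person 2.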
Hope this helps, and good luck with the work!

MySQL - return one row from 2 rows in the same table, overwrite the contents of the first 'default' with the populated fields of the second 'override'

I am trying to make use of the mobile device lookup data in the WURFL database at http://wurfl.sourceforge.net/smart.php but I'm having problems getting my head around the MySQL code needed (I use ColdFusion for the server backend). To be honest it's really doing my head in, but I'm sure there is a straightforward approach to this.
The WURFL is supplied as XML (approx 15200 records to date), and I have already written the method that saves the data to a MySQL database. Now I need to get the data back out in a useful way!
Basically it works like this: firstly run a select using the userAgent data from a CGI pull to match against a known mobile device (row 1) using LIKE; if found then use the resultant fallback field to look up the default data for the mobile device's 'family root' (row 2). The two rows need to be combined by overwriting the contents of (row 2) with the specific mobile device's features of (row 1). Both rows contain NULL entries and not all the features are present in (row 1).
I just need the fully populated row of data returned if a match is found. I hope that makes sense, I would provide what I think the SQL should look like but I will probably confuse things even more.
Really appreciate any assistance!
This would be my shot at it in SQL Server. You would need to use IFNULL instead of ISNULL:
SELECT
ISNULL(row1.Feature1, row2.Feature1) AS Feature1
, ISNULL(row1.Feature2, row2.Feature2) AS Feature2
, ISNULL(row1.Feature3, row2.Feature3) AS Feature3
FROM
featureTable row1
LEFT OUTER JOIN featureTable row2 ON row1.fallback = row2.familyroot
WHERE row1.userAgent LIKE '%Some User Agent String%'
This should accomplish the same thing in MySQL:
SELECT
IFNULL(row1.Feature1, row2.Feature1) AS Feature1
, IFNULL(row1.Feature2, row2.Feature2) AS Feature2
, IFNULL(row1.Feature3, row2.Feature3) AS Feature3
FROM
featureTable AS row1
LEFT OUTER JOIN featureTable AS row2 ON row1.fallback = row2.familyroot
WHERE row1.userAgent LIKE '%Some User Agent String%'
What this does is take your feature table and alias it as row1 to get your specific model features. We then join it back to itself as row2 to get the family features. The ISNULL function (IFNULL in MySQL) then says "if there is no Feature1 value in row1 (it's null) then get the Feature1 value from row2".
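Here is a small sketch of that override behaviour, run with Python's sqlite3 module (SQLite also provides IFNULL, so it stands in for MySQL). The two sample rows, their values and the helper name lookup are invented for illustration; the column names follow the query above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE featureTable (
        userAgent TEXT, fallback TEXT, familyroot TEXT,
        Feature1 TEXT, Feature2 TEXT, Feature3 TEXT
    );
    INSERT INTO featureTable VALUES
        -- family root row holding the defaults
        ('Generic Nokia', NULL, 'nokia_generic', 'a', 'b', 'c'),
        -- specific device row: only Feature2 is populated
        ('Nokia6600/1.0', 'nokia_generic', 'nokia_6600', NULL, 'B', NULL);
""")

def lookup(user_agent_fragment):
    """Return the device's features, falling back to its family root for NULLs."""
    return conn.execute("""
        SELECT IFNULL(row1.Feature1, row2.Feature1) AS Feature1,
               IFNULL(row1.Feature2, row2.Feature2) AS Feature2,
               IFNULL(row1.Feature3, row2.Feature3) AS Feature3
        FROM featureTable AS row1
        LEFT OUTER JOIN featureTable AS row2
                     ON row1.fallback = row2.familyroot
        WHERE row1.userAgent LIKE ?
    """, ("%" + user_agent_fragment + "%",)).fetchone()

print(lookup("Nokia6600"))
```

The device's own Feature2 value wins, while Feature1 and Feature3 come from the family root row, so the result is ('a', 'B', 'c').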
Hope that helps.

DynamicQuery: How to select a column with linq query that takes parameters

We want to set up a directory of all the organizations working with us. They are incredibly diverse (government, embassy, private companies, and organizations depending on them ). So, I've resolved to create 2 tables. Table 1 will treat all the organizations equally, i.e. it'll collect all the basic information (name, address, phone number, etc.). Table 2 will establish the hierarchy among all the organizations. For instance, Program for illiterate adults depends on the National Institute for Social Security which depends on the Labor Ministry.
In the Hierarchy table, each column represents a level. So, for the example above, (i)Labor Ministry - Level1(column1), (ii)National Institute for Social Security - Level2(column2), (iii)Program for illiterate adults - Level3(column3).
To attach an organization to a hierarchy, the user needs to go level by level (i.e. column by column). So, there will be at least 3 situations:
If an adequate hierarchy exists for an organization (for instance, level1: US Embassy), that organization can be added (for instance, level2: USAID) --> US Embassy/USAID, and so on.
How about if one or more levels are missing? - then they need to be added
How about if the hierarchy needs to be modified? -- not everything needs to be modified.
I do not have any choice but to work by level (i.e. column by column). It does not make sense to have all the levels in one form, as the user needs to navigate hierarchies to find the right one to attach an organization.
Let's say, I have those queries in my repository (just that you get the idea).
Query1
var orgHierarchy = (from orgH in db.Hierarchy
                    select orgH.Level1).FirstOrDefault();
Query2
var orgHierarchy = (from orgH in db.Hierarchy
                    select orgH.Level2).FirstOrDefault();
Query3, Query4, etc.
The above queries are the same except for the property queried (level1, level2, level3, etc.)
Question: Is there a general way of writing the above queries as one, so that the user can track a hierarchy level by level to attach an organization?
In other words, not knowing in advance which column to query, I still need to be able to do so depending on some conditions. For instance, an organization X depends on Y. Knowing that Y is somewhere on the 3rd level, I'll go to the 4th level, linking X to Y.
I need to select (not manually) a column with only one query that takes parameters.
=======================
EDIT
As I just said to @Mark Byers, all I want is just to be able to query a column not knowing in advance which one. Check this out:
How about this
public Hierarchy GetHierarchy(string name)
{
    var myHierarchy = from hierarc in db.Hierarchy
                      where (hierarc.Level1 == name)
                      select hierarc;
    return myHierarchy.FirstOrDefault();
}
Above, the query depends on name, which is a variable. It might be Planning Ministry, Embassy, Local Phone, etc.
Can I write the same query, but this time, instead of looking to match a value in the DB, make my query select a particular column?
var myVar = from orgH in db.Hierarchy
where (orgH.Level1 == "Government")
select orgH.where(level == myVariable);
return myVar;
I don't pretend that select orgH.where(level == myVariable) is even close to being valid. But that is what I want: to be able to select a column depending on a variable (i.e. the value is not known in advance, like with name).
Thanks for helping
How about using DynamicQueryable?
http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
Your database is not normalized, so you should start by changing the hierarchy table to, for example:
OrganizationId Parent
1 NULL
2 1
3 1
4 3
To query this you might need to use recursive queries. This is difficult (but not impossible) using LINQ, so you might instead prefer to create a parameterized stored procedure using a recursive CTE and put the query there.
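To give an idea of what such a recursive CTE could look like, here is a minimal sketch that walks from one organization up to its root using the four sample rows above. It runs through Python's sqlite3 module rather than a stored procedure, the table name Hierarchy and helper name chain_to_root are assumptions, and the exact syntax varies by database (SQL Server, for example, writes WITH without the RECURSIVE keyword).

```python
import sqlite3

# In-memory SQLite database used only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Hierarchy (
        OrganizationId INTEGER PRIMARY KEY,
        Parent INTEGER REFERENCES Hierarchy(OrganizationId)
    );
    INSERT INTO Hierarchy VALUES (1, NULL), (2, 1), (3, 1), (4, 3);
""")

def chain_to_root(organization_id):
    """Return the ids from the given organization up to its top-level ancestor."""
    rows = conn.execute("""
        WITH RECURSIVE chain(OrganizationId, Parent, depth) AS (
            -- anchor: the organization we start from
            SELECT OrganizationId, Parent, 0
            FROM Hierarchy
            WHERE OrganizationId = :start
            UNION ALL
            -- recursive step: move to the parent of the previous row
            SELECT h.OrganizationId, h.Parent, chain.depth + 1
            FROM Hierarchy AS h
            JOIN chain ON h.OrganizationId = chain.Parent
        )
        SELECT OrganizationId FROM chain ORDER BY depth
    """, {"start": organization_id}).fetchall()
    return [r[0] for r in rows]

print(chain_to_root(4))
```

For organization 4 this returns [4, 3, 1]; the position in the list plays the role of the level that the original design stored as separate columns.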