How do I get all strings that do not contain another string in MySQL? - mysql

I have a table called "Domains" with field "Name" (unique, always lowercase) which contains a list of domains and subdomains on my server like:
blah.example.com
www.example.com
www.blah.example.com
example.com
example.nl
example.org
Looking at this list, names 1, 2 and 3 are subdomains of item 4. I want to find all domains in this table that are not subdomains of another entry or, to be more precise, every name that does not contain the name of another record as part of it. For this list, that means only items 4, 5 and 6.
If record 4 was missing then this query would also have item 1 and 2 as result, but not item 3. After all, item 3 has item 1 as part of it.
Just trying to find the query that can provide me this result... Something with select d.name from domains where d.name not in... Well, there my mind goes blank.
Why?
This list of domains is generated by my web server which registers every new domain that gets requested on it. I'm working on a reporting page where I would display the top domain names to see if there are any weird domains in it. For some reason, I sometimes see unknown domain names in these requests and this might give some additional insight in it all.
I am going to change my code so it will include references to parent domains in the same table in the future but for now I'll have to deal with this and a simple SQL solution.

Use a self-join that matches on suffixes using LIKE, and keep only the rows that find no match:
SELECT d1.name
FROM domains AS d1
LEFT JOIN domains AS d2 ON d1.name LIKE CONCAT('%.', d2.name)
WHERE d2.name IS NULL
DEMO
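To see the suffix anti-join at work on the sample list, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example runs anywhere), which is why CONCAT('%.', d2.name) is written as '%.' || d2.name. The helper name top_level_domains is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE domains (name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO domains (name) VALUES (?)",
    [(n,) for n in [
        "blah.example.com", "www.example.com", "www.blah.example.com",
        "example.com", "example.nl", "example.org",
    ]],
)

# Anti-join: keep d1 only when no other row d2 is a dot-separated suffix of it.
# SQLite spells MySQL's CONCAT('%.', d2.name) as '%.' || d2.name.
QUERY = """
    SELECT d1.name
    FROM domains AS d1
    LEFT JOIN domains AS d2 ON d1.name LIKE '%.' || d2.name
    WHERE d2.name IS NULL
    ORDER BY d1.name
"""

def top_level_domains(connection):
    """Return the names that are not a subdomain of any other stored name."""
    return [row[0] for row in connection.execute(QUERY)]

with_parent = top_level_domains(conn)
print(with_parent)

# Second scenario from the question: record 4 (example.com) is missing.
conn.execute("DELETE FROM domains WHERE name = 'example.com'")
without_parent = top_level_domains(conn)
print(without_parent)
```

With example.com removed, blah.example.com and www.example.com come back because nothing above them is stored any more, while www.blah.example.com is still filtered out by blah.example.com.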

Related

MySQL: Find all multiple second level domains

I have a table with thousands offers from different countries. Some offers run on different domains for different countries. For example supershop runs three different domains for three different countries:
supershop.com, supershop.fr & supershop.nl
In my database, the URL entries may look different:
http://supershop.com
https://www.supershop.fr/home/index.php
https://supershop.nl
Now, how can I SELECT all rows of the same SLD (Second Level Domain) names?
It should be something like
SELECT
landingpage,
COUNT(landingpage)
FROM
angebote
GROUP BY REGEXP "^(https?://|www\\.)[\.A-Za-z0-9\-]+\\.[a-zA-Z]{2,4}"
HAVING COUNT(landingpage) > 1
Grouped by this part: [\.A-Za-z0-9\-]
Any solutions/hints?
You can use REGEXP_REPLACE to extract the second level domain from each URL, and then GROUP BY that value:
SELECT REGEXP_REPLACE(landingpage, "^(?:https?://(?:www\\.)?)([A-Za-z0-9-]+)\\.[a-zA-Z]{2,4}(/.*)?$", "\\1") AS sld,
COUNT(*) AS count
FROM angebote
GROUP BY sld
Output (for your sample data)
sld count
supershop 3
Demo on dbfiddle
Note I've made some minor modifications to your regex to make it work with REGEXP_REPLACE to extract the second level domain.
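The grouping idea can be tried without a MySQL server. The sketch below is an approximation, not the answer's exact setup: it uses Python's sqlite3 module, and because SQLite has no REGEXP_REPLACE it registers a hypothetical helper, extract_sld, that applies the same pattern with Python's re module. The extra row for othershop.de is invented test data, and the HAVING clause is taken from the question rather than the answer.

```python
import re
import sqlite3

# Same pattern as in the answer, written as a Python raw string.
SLD_PATTERN = re.compile(
    r"^(?:https?://(?:www\.)?)([A-Za-z0-9-]+)\.[a-zA-Z]{2,4}(/.*)?$"
)

def extract_sld(url):
    """Return the second level domain, or the input unchanged if it does not match."""
    match = SLD_PATTERN.match(url)
    return match.group(1) if match else url

conn = sqlite3.connect(":memory:")
# Hypothetical SQL function standing in for MySQL's REGEXP_REPLACE call.
conn.create_function("extract_sld", 1, extract_sld)
conn.execute("CREATE TABLE angebote (landingpage TEXT)")
conn.executemany(
    "INSERT INTO angebote (landingpage) VALUES (?)",
    [
        ("http://supershop.com",),
        ("https://www.supershop.fr/home/index.php",),
        ("https://supershop.nl",),
        ("http://othershop.de",),  # invented row that appears only once
    ],
)

# Group by the extracted value and keep only names seen more than once.
sld_counts = conn.execute("""
    SELECT extract_sld(landingpage) AS sld, COUNT(*) AS occurrences
    FROM angebote
    GROUP BY extract_sld(landingpage)
    HAVING COUNT(*) > 1
    ORDER BY sld
""").fetchall()
print(sld_counts)
```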

Joining 2 tables together and using the where function based on a separate mysql query

I am building a training platform for work. I have created the requirements for a user to be trained based on a role given to them. If that role is aligned to a document, the document will sit against the user. I have managed to get most of the way but am struggling to find the best way to finish the WHERE clause within mysqli.
tbldocfiles is a list of my files. I am looking at docid (could be multiple files associated to the document)
tbltrainingaccess sets the roles (driver, warehouseman, customer services) and shows which role (by id) is associated to the document in docfiles.
tblusertraining is the list of users and what role they have associated to them. (driver, warehouseman, customer services).
I am listing the documents associated to the user, so I thought the following would be the best way:
1. Look at the user and how many roles he/she is allocated.
2. Look at the roles returned in point 1 (WHERE clause).
3. Identify and match the documents that have the same roles as the user (JOIN).
4. Create the list, then look at the unique values for docid (DISTINCT).
Example: user Bri has the driver and warehouseman roles.
There are 5 documents in the db. 3 of them are associated to the driver role (docid 1,2,3), 2 of them are associated to the warehouseman role (docid 2,4), and the 5th document is associated to customerservice.
My query should do this:
List all documents associated to the roles, that are associated to the user Bri
1
2
3
2
4
Now select unique values (using docid) from the above list:
1,2,3,4.
So my answer will be used as a count at the end, using mysql_fetch_rows.
SELECT DISTINCT tbldocfiles.docid FROM tbldocfiles LEFT JOIN tbltrainingaccess ON (tbldocfiles.docid = tbltrainingaccess.docid) where groupid='1' or groupid='9'
The above code works, but I've got myself confused.
The WHERE clause needs to use the result of a query similar to:
select * from tblusertrainingrole where userid='1' (1 will be a variable based on page selection)
the result in this would be 1, 9 which are the groupid results.
Basically any help would be appreciated! I am sure it will be simple but have burnt myself out on this for a while and most answers in here helped with joining but not the where statement (that I could find)
Thank you in advance everyone!
You can put a SELECT statement in the WHERE clause. Since the conditions are combined with OR, you can use IN on the subquery's results. Replace * with the name of the column that holds the value you need. It should look like:
where groupid in (select * from tblusertrainingrole where userid = '1')
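Putting the subquery into the full statement might look like the sketch below. It runs on Python's sqlite3 module purely for illustration, and several things are assumptions: that tblusertrainingrole has a groupid column (the question only says the query returns 1 and 9), that customerservice is group 5, the fileid column, and the helper name documents_for_user. The sample rows mirror the Bri example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbldocfiles (fileid INTEGER PRIMARY KEY, docid INTEGER);
    CREATE TABLE tbltrainingaccess (docid INTEGER, groupid INTEGER);
    CREATE TABLE tblusertrainingrole (userid INTEGER, groupid INTEGER);

    -- Document 2 has two files, so docid 2 appears twice in tbldocfiles.
    INSERT INTO tbldocfiles (docid) VALUES (1), (2), (2), (3), (4), (5);
    -- Group 1 = driver, 9 = warehouseman, 5 = customerservice (5 is assumed).
    INSERT INTO tbltrainingaccess VALUES (1, 1), (2, 1), (3, 1), (2, 9), (4, 9), (5, 5);
    -- User 1 (Bri) holds the driver and warehouseman roles.
    INSERT INTO tblusertrainingrole VALUES (1, 1), (1, 9), (2, 5);
""")

def documents_for_user(connection, userid):
    """Distinct docids whose role matches any role held by the given user."""
    rows = connection.execute(
        """
        SELECT DISTINCT f.docid
        FROM tbldocfiles AS f
        JOIN tbltrainingaccess AS a ON f.docid = a.docid
        WHERE a.groupid IN (
            SELECT groupid FROM tblusertrainingrole WHERE userid = ?
        )
        ORDER BY f.docid
        """,
        (userid,),  # bound parameter instead of a hard-coded '1'
    )
    return [row[0] for row in rows]

bri_docs = documents_for_user(conn, 1)
print(bri_docs, len(bri_docs))
```

The length of the returned list is the count the question is after. Binding the user id as a parameter also avoids building the WHERE clause by string concatenation.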

SQL self-join to return specific rows

Skip to bottom to avoid long-winded explanation
Ok, so.
I'm working on a company intranet for managing client jobs. Jobs are comprised of Elements: an example element might be "Build a six-page website", or "Design a logo".
Each element consists of a collection of role-hours, so "Build a six-page website" might include four hours of "Developer" rate and two hours of "Designer" rate (ok, maybe a little longer :)
Obviously, different clients get different hourly rates. And, although that's already accounted for in the system, it's not giving us enough flexibility. Traditionally, our account managers have been rather... ad hoc... with their pricing: the "Build a six-page website" element might include the standard four hours of developer for client "Bob", but eight hours for client "Harry".
Bear with me. I will get to actual code soon.
Elements are, of course, stored in the "Elements" database table - which is composed of little more than an ID and a text label.
My work-in-progress solution to the "we need client-specific elements" problem is to add a "client" field to this table. We can then go through and add any client-specific versions of the available elements, tweaking them to taste.
When the account managers go to add elements to their jobs, they should only see elements that are either (a) available to anyone - that is, they have a NULL client field, or (b) specific to the job client.
So far, so SELECT WHERE.
But that isn't going to cut it. If I add a second "Build a six-page website" element specifically for Harry, then an account manager adding elements to a job for Harry will see both the standard version, and Harry's version of the element. This is no good. They should only see the standard version if there's not an applicable client-specific version.
Ok... soooo: as well as adding a "client" field to the elements table, add a "parent element" field. We can then do something magically self-referential involving joining the table to itself, and fetch only the relevant roles.
My long-awaited question is thus:
Oh look, an actual question
id  label            client  parent_element
1   Standard Thing   NULL    NULL
2   Harrys Thing     1       1
3   Bobs Thing       2       1
4   Different Thing  NULL    NULL
Given this table structure, how can I write a single SQL query that will accept a "client ID" parameter and return:
For client ID 1, rows 2 and 4
For client ID 2, rows 3 and 4
For client ID 42, rows 1 and 4
For extra bonus points, the results should include the parent element label. So for client ID 1, for example:
id  label            standardised_label  client  parent_element
2   Harrys Thing     Standard Thing      1       1
4   Different Thing  Different Thing     NULL    NULL
SELECT mm.*, md.label AS standardized_label
FROM mytable md
LEFT JOIN
mytable mc
ON mc.parent_element = md.id
AND mc.client = #client
JOIN mytable mm
ON mm.id = COALESCE(mc.id, md.id)
WHERE md.client IS NULL
Create an index on (client, parent_element) for this to work fast.
See SQLFiddle.
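The query above can be checked against the three cases from the question with a small sketch. It uses Python's sqlite3 module in place of MySQL (an assumption for portability), replaces the #client placeholder with the named parameter :client, and selects three columns instead of mm.*; the helper name elements_for_client and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, label TEXT, client INTEGER, parent_element INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, "Standard Thing", None, None),
        (2, "Harrys Thing", 1, 1),
        (3, "Bobs Thing", 2, 1),
        (4, "Different Thing", None, None),
    ],
)
# Index suggested in the answer; irrelevant for four rows, shown for completeness.
conn.execute("CREATE INDEX idx_client_parent ON mytable (client, parent_element)")

# md = standard element, mc = this client's override (if any), mm = row to show.
QUERY = """
    SELECT mm.id, mm.label, md.label AS standardized_label
    FROM mytable AS md
    LEFT JOIN mytable AS mc
        ON mc.parent_element = md.id AND mc.client = :client
    JOIN mytable AS mm
        ON mm.id = COALESCE(mc.id, md.id)
    WHERE md.client IS NULL
    ORDER BY mm.id
"""

def elements_for_client(connection, client_id):
    """Rows visible to one client: its override where one exists, else the standard row."""
    return connection.execute(QUERY, {"client": client_id}).fetchall()

for client_id in (1, 2, 42):
    print(client_id, elements_for_client(conn, client_id))
```

COALESCE(mc.id, md.id) is what performs the substitution: when the LEFT JOIN finds no client-specific row, mc.id is NULL and the standard row's own id is used.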

MySQL - return one row from 2 rows in the same table, overwrite the contents of the first 'default' with the populated fields of the second 'override'

I am trying to make use of the mobile device lookup data in the WURFL database at http://wurfl.sourceforge.net/smart.php but I'm having problems getting my head around the MySQL code needed (I use Coldfusion for the server backend). To be honest it's really doing my head in, but I'm sure there is a straightforward approach to this.
The WURFL data is supplied as XML (approx 15200 records to date), and I have already written the method that saves the data to a MySQL database. Now I need to get the data back out in a useful way!
Basically it works like this: firstly run a select using the userAgent data from a CGI pull to match against a known mobile device (row 1) using LIKE; if found then use the resultant fallback field to look up the default data for the mobile device's 'family root' (row 2). The two rows need to be combined by overwriting the contents of (row 2) with the specific mobile device's features of (row 1). Both rows contain NULL entries and not all the features are present in (row 1).
I just need the fully populated row of data returned if a match is found. I hope that makes sense, I would provide what I think the SQL should look like but I will probably confuse things even more.
Really appreciate any assistance!
This would be my shot at it in SQL Server. In MySQL you would need to use IFNULL instead of ISNULL:
SELECT
ISNULL(row1.Feature1, row2.Feature1) AS Feature1
, ISNULL(row1.Feature2, row2.Feature2) AS Feature2
, ISNULL(row1.Feature3, row2.Feature3) AS Feature3
FROM
featureTable row1
LEFT OUTER JOIN featureTable row2 ON row1.fallback = row2.familyroot
WHERE row1.userAgent LIKE '%Some User Agent String%'
This should accomplish the same thing in MySQL:
SELECT
IFNULL(row1.Feature1, row2.Feature1) AS Feature1
, IFNULL(row1.Feature2, row2.Feature2) AS Feature2
, IFNULL(row1.Feature3, row2.Feature3) AS Feature3
FROM
featureTable AS row1
LEFT OUTER JOIN featureTable AS row2 ON row1.fallback = row2.familyroot
WHERE row1.userAgent LIKE '%Some User Agent String%'
What this does is take your feature table and alias it as row1 to get your specific model's features. We then join it back to itself as row2 to get the family features. The ISNULL function (IFNULL in MySQL) then says "if there is no Feature1 value in row1 (it's NULL), get the Feature1 value from row2".
Hope that helps.
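A small runnable sketch of the override logic follows. It uses Python's sqlite3 module, whose two-argument IFNULL behaves like MySQL's for this purpose. The column names come from the answer, which are themselves placeholders for the real WURFL fields; the two sample rows and the user agent string are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE featureTable ("
    "userAgent TEXT, familyroot TEXT, fallback TEXT, "
    "Feature1 TEXT, Feature2 TEXT, Feature3 TEXT)"
)
conn.executemany(
    "INSERT INTO featureTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        # Family root row: default values for the whole device family.
        (None, "generic_phone", None, "240x320", "no_touch", None),
        # Specific device row: overrides only Feature2, the rest is NULL.
        ("ExamplePhone/1.0", None, "generic_phone", None, "touch", None),
    ],
)

# row1 = specific device, row2 = its family root; IFNULL prefers row1's value.
merged = conn.execute(
    """
    SELECT
        IFNULL(row1.Feature1, row2.Feature1) AS Feature1,
        IFNULL(row1.Feature2, row2.Feature2) AS Feature2,
        IFNULL(row1.Feature3, row2.Feature3) AS Feature3
    FROM featureTable AS row1
    LEFT OUTER JOIN featureTable AS row2 ON row1.fallback = row2.familyroot
    WHERE row1.userAgent LIKE ?
    """,
    ("%ExamplePhone%",),
).fetchone()
print(merged)
```

Feature1 falls back to the family value, Feature2 keeps the device-specific value, and Feature3 stays NULL because neither row defines it.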

Count of related items in a 2nd table with zero results needed (query check please)

This MySQL statement is a bit over my head. I pieced it together through a lot of Google searches. It seems to work right, but I just wanted to see if I could get a thumbs up. I'm paranoid that I did something a bit off and that some issue I don't understand could come up.
I have a 'directories' table, 'folders' table and 'documents' table. (directories have many folders, folders have many documents).
On a web page, I have a select where a user can choose a directory (which has many folders). This query is for an AJAX call that loads a second select with the list of all folders belonging to the directory (getting the id's and names to load the 'folders' select).
So, this query will be made against one directory to get a list of folder id's and folder names for that directory. I also needed the folder name to contain a count of how many documents are contained in each folder. Also, I originally had just "join" which did not return zero results but changing it to "left join" listed folders with 0 documents (don't have an understanding of the different types of joins yet).
MY FRANKEN-QUERY:
SELECT f.id, CONCAT(f.folder_name, ' (', COUNT(DISTINCT d.id), ' documents)') AS folder_name
FROM folders f
LEFT JOIN documents d ON d.folder_id = f.id
WHERE f.directory_id = '2'
GROUP BY f.id
ORDER BY f.folder_name
RESULTS (seems to work fine):
id folder_name
1 MAIN (2 documents)
8 test1 (2 documents)
9 test2 (3 documents)
50 test3 (0 documents)
Thanks - much appreciated!
It looks fine offhand, but run a couple of tests on your data and make sure you get consistent (correct) results.
Assuming documents.id is a primary key, you can remove the DISTINCT keyword from the count.
For more on the various join types, see:
http://en.wikipedia.org/wiki/Join_%28SQL%29
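To make the difference between the two join types concrete, here is a minimal sketch that runs the same count once with LEFT JOIN and once with a plain JOIN. It uses Python's sqlite3 module rather than MySQL, so CONCAT is written with the || operator, and DISTINCT is dropped as suggested above. The sample rows reproduce the counts from the question; the extra folder in directory 3 and the helper name folder_labels are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folders (id INTEGER PRIMARY KEY, directory_id INTEGER, folder_name TEXT);
    CREATE TABLE documents (id INTEGER PRIMARY KEY, folder_id INTEGER);

    INSERT INTO folders VALUES
        (1, 2, 'MAIN'), (8, 2, 'test1'), (9, 2, 'test2'), (50, 2, 'test3'),
        (60, 3, 'other');  -- belongs to a different directory
    INSERT INTO documents (folder_id) VALUES (1), (1), (8), (8), (9), (9), (9), (60);
""")

def folder_labels(connection, directory_id, join_type):
    """List (id, label) per folder; join_type must be 'LEFT JOIN' or 'JOIN'."""
    if join_type not in ("LEFT JOIN", "JOIN"):
        raise ValueError("unsupported join type")
    # COUNT(d.id) ignores the NULL produced by LEFT JOIN for empty folders,
    # so a folder without documents is reported as 0 rather than 1.
    query = f"""
        SELECT f.id, f.folder_name || ' (' || COUNT(d.id) || ' documents)' AS label
        FROM folders AS f
        {join_type} documents AS d ON d.folder_id = f.id
        WHERE f.directory_id = ?
        GROUP BY f.id
        ORDER BY f.folder_name
    """
    return connection.execute(query, (directory_id,)).fetchall()

left_rows = folder_labels(conn, 2, "LEFT JOIN")
inner_rows = folder_labels(conn, 2, "JOIN")
print(left_rows)
print(inner_rows)
```

The inner join returns only folders that have at least one matching document, which is why test3 disappears from the second result.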