regular expression to extract json values in mysql field - mysql

I have a "users" table with an "assignments" field that holds a list of course IDs, when they are assigned, and whether they are required or optional, all in one JSON-like string (missing the top-level braces):
"BUS1077":{"startDate":"2013-09-16","hasPrerequisite":"","list":"required"},
"CMP1042":{"startDate":"2013-09-16","hasPrerequisite":"","list":"optional"},
"CMP1108":{"startDate":"2013-09-16","hasPrerequisite":"","list":"required"}
I have another table, called "progress", that lists the course IDs, like BUS1078, and whether they are completed or not.
I need a query to select the users who have completed all their required courses.
Something like:
SELECT userid FROM users
where (count([ids from users.assignments where list:"required"] as courseid)
=count([extracted ids] joined using( courseid) where "complete"=1))
so there are just two tables
users (userid,assignments)
progress (id,userid,courseid,complete)
in the end I want to have selected the userids where each REQUIRED course is complete
(note, the database itself is much more complex, but this represents the gist of the problem)

As of MySQL 5.1 you can use the JSON-parsing functions of common_schema for this purpose. I haven't used it myself, but I've found a nice blog post about how you can parse JSON stored in a column and do something useful with it.
The blog: http://mechanics.flite.com/blog/2013/04/08/json-parsing-in-mysql-using-common-schema/
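If parsing in the application is an option, the stored string becomes valid JSON once the missing top-level braces are added back. The following is a minimal Python sketch of that idea, not the common_schema approach from the blog. The function names and the in-memory `users` and `progress` sample data are made up for illustration.

```python
import json

def required_courses(assignments: str) -> list[str]:
    """Return the course IDs whose "list" value is "required"."""
    # The stored string lacks the top-level braces, so add them back
    # before handing it to a real JSON parser.
    data = json.loads("{" + assignments + "}")
    return [cid for cid, info in data.items() if info.get("list") == "required"]

def users_with_all_required_complete(users, progress):
    """users: {userid: assignments string}; progress: (userid, courseid, complete) rows."""
    done = {(u, c) for u, c, complete in progress if complete == 1}
    return [
        userid
        for userid, assignments in users.items()
        if all((userid, cid) in done for cid in required_courses(assignments))
    ]

# Hypothetical sample data mirroring the question.
sample = (
    '"BUS1077":{"startDate":"2013-09-16","hasPrerequisite":"","list":"required"},'
    '"CMP1042":{"startDate":"2013-09-16","hasPrerequisite":"","list":"optional"},'
    '"CMP1108":{"startDate":"2013-09-16","hasPrerequisite":"","list":"required"}'
)
users = {1: sample, 2: sample}
progress = [
    (1, "BUS1077", 1), (1, "CMP1108", 1),   # user 1 finished both required courses
    (2, "BUS1077", 1), (2, "CMP1042", 1),   # user 2 is missing CMP1108
]

print(required_courses(sample))
print(users_with_all_required_complete(users, progress))
```

This moves the work out of the database, so it only makes sense when the number of users is small enough to fetch.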

I'm not familiar with the RegEx implementation in MySQL, but this basic approach should work:
SELECT userid FROM users WHERE NOT EXISTS(
SELECT NULL FROM assignments WHERE NOT EXISTS(
SELECT NULL FROM progress WHERE
progress.userid = users.userid
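-- MySQL has no REGEXMATCH(); use REGEXP, and CONCAT() rather than + for strings.
-- [^}]* keeps the match inside one course object without needing a lazy quantifier.
AND assignments.assignment REGEXP CONCAT(
'(^|,)"', progress.courseid, '":[^}]*"list":"required"[}]')
)
)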
This should find all users where there is not a required assignment that the user hasn't completed.
Given the course IDs and the word "required" are unlikely to appear out of context, the regular expression itself could likely be much more naive, such as:
CONCAT('"', progress.courseid, '"[^}]+"required"')
I don't know about MySQL's current limitations when it comes to correlated subqueries, but the same thing could be accomplished with joins. Using EXISTS should be preferred over COUNT, since counting requires aggregation across the entire dataset rather than allowing a short-cut on the first non-match found.
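The double NOT EXISTS idea is easier to see when the assignments are normalized into rows. Here is a minimal sketch, using Python's sqlite3 module as a stand-in for MySQL. The normalized `assignments(userid, courseid, list_type)` table is an assumption for illustration and does not exist in the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY);
    -- Hypothetical normalized form of users.assignments
    CREATE TABLE assignments (userid INTEGER, courseid TEXT, list_type TEXT);
    CREATE TABLE progress (id INTEGER PRIMARY KEY, userid INTEGER,
                           courseid TEXT, complete INTEGER);

    INSERT INTO users VALUES (1), (2);
    INSERT INTO assignments VALUES
        (1, 'BUS1077', 'required'), (1, 'CMP1042', 'optional'),
        (2, 'BUS1077', 'required'), (2, 'CMP1108', 'required');
    INSERT INTO progress (userid, courseid, complete) VALUES
        (1, 'BUS1077', 1),
        (2, 'BUS1077', 1), (2, 'CMP1108', 0);
""")

# "Users for whom no required assignment lacks a completed progress row."
rows = conn.execute("""
    SELECT u.userid
    FROM users u
    WHERE NOT EXISTS (
        SELECT 1 FROM assignments a
        WHERE a.userid = u.userid
          AND a.list_type = 'required'
          AND NOT EXISTS (
              SELECT 1 FROM progress p
              WHERE p.userid = a.userid
                AND p.courseid = a.courseid
                AND p.complete = 1)
    )
    ORDER BY u.userid
""").fetchall()

finished = [r[0] for r in rows]
print(finished)
```

User 2 is excluded because CMP1108 is required but not complete. User 1's unfinished optional course does not matter.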

If your courseid is always 7 characters long and the assignments field holds at most 10 courses, you can use this query (originally demonstrated in a sqlFiddle):
SELECT U.userId
FROM users U
WHERE NOT EXISTS
(SELECT 1 FROM
(SELECT users.userid,courseName,
(Assignments REGEXP CONCAT('"',courseName,'"[^}]+(:"required"})'))as Required,
Assignments,
courseid,complete
FROM
(SELECT userid,courseName FROM
(SELECT userid,SUBSTRING_INDEX(SUBSTRING_INDEX(assignments,'":{"startDate',course.num),'"',-1) as courseName
FROM users,(SELECT 1 as num
UNION SELECT 2
UNION SELECT 3
UNION SELECT 4
UNION SELECT 5
UNION SELECT 6
UNION SELECT 7
UNION SELECT 8
UNION SELECT 9
UNION SELECT 10)course
)T WHERE LENGTH(courseName)=7
)Courses
INNER JOIN users ON users.userid = Courses.userid
LEFT JOIN progress ON users.userid = progress.userid
AND Courses.courseName = progress.courseId
AND progress.complete = 1
)AllCourses
WHERE AllCourses.userId = U.userId
AND AllCourses.Required = 1
AND Complete IS NULL
)
The query extracts each courseName from the assignments field, checks whether that course is required and sets the Required flag accordingly, then LEFT JOINs to progress. Complete is NULL when the course has no row in progress or when complete is not 1.
We then select the user IDs for which there does NOT EXIST a row in AllCourses with Required = 1 AND Complete IS NULL.
In the fiddle, I have user 2 having only completed an optional course. So userId 2 is not returned.
You can just run the inner select for AllCourses subquery and see the data of all the courses for all users and whether they completed a course that is required or not.

Related

SQL: Return immediately if one matched record found

I have one table with user and their posts. It looks like "user_id | post_id | post_status".
Now I have a list of user IDs (e.g. 100 users) and I want to know how many of them have at least one post that was deleted (e.g. post_status 3).
Here is my sample search:
select count(distinct user_id)
from post_table
where user_id in ( {my set} )
and post_status=3
It runs super slow since it iterates the entire table. Is there a way to speed up the query?
Use something like
SELECT COUNT(*)
FROM
-- the list of userid as a rowset
( SELECT 123 AS user_id UNION ALL
SELECT 456 UNION ALL
-- ...
SELECT 789
) user_id_list
WHERE EXISTS ( SELECT NULL
FROM post_table
WHERE post_table.user_id = user_id_list.user_id
AND post_table.post_status = 3 )
If your MySQL version is 8.0.4 or above then you may provide the users list as CSV/JSON and parse it using JSON_TABLE (the query text will be more compact).
INDEX(post_status, user_id)
may help speed up your query, especially if very few rows have status=3.
This could also speed up Akina's solution.
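As a rough illustration of the two suggestions combined, here is a sketch using Python's sqlite3 module as a stand-in for MySQL. The index name `idx_status_user`, the helper function, and the sample rows are invented for the example. The id list is bound as parameters rather than pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_table (user_id INTEGER, post_id INTEGER, post_status INTEGER);
    -- Composite index suggested above: status first, then user.
    CREATE INDEX idx_status_user ON post_table (post_status, user_id);
    INSERT INTO post_table VALUES
        (123, 1, 3), (123, 2, 3), (456, 3, 1), (789, 4, 3), (999, 5, 3);
""")

def count_users_with_deleted_post(conn, user_ids):
    """Count the listed users that have at least one post with status 3."""
    if not user_ids:
        return 0
    # Build the id list as a derived table: SELECT ? UNION ALL SELECT ? ...
    id_rows = " UNION ALL ".join("SELECT ? AS user_id" for _ in user_ids)
    sql = f"""
        SELECT COUNT(*)
        FROM ({id_rows}) AS user_id_list
        WHERE EXISTS (SELECT 1 FROM post_table p
                      WHERE p.user_id = user_id_list.user_id
                        AND p.post_status = 3)
    """
    return conn.execute(sql, list(user_ids)).fetchone()[0]

print(count_users_with_deleted_post(conn, [123, 456, 789]))
```

EXISTS can stop at the first matching post per user, so user 123 is counted once even though two of their posts are deleted. The list is assumed to contain no duplicate ids.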

What is the proper MySQL way to take data from 4 rows, 1 column, and separate into 9 columns?

I've studied and tried days worth of SQL queries to find "something" that will work. I have a table, apj32_facileforms_subrecords, that uses 7 columns. All the data I want to display is in 1 column - "value". The "record" displays the number of the entry. The "title" is what I would like to appear in the header row, but that's not as important as "value" to display in 1 row based upon "record" number.
I've tried a lot of CONCAT and various Pivot queries, but nothing seems to do more than "get close" to what I'd like as the end result.
Here's a screen shot of the table:
The output "should" be linear, so that 1 row contains 9 columns:
Project; Zipcode; First Name; Last Name; Address; City; Phone; E-mail; Trade (in that order). And the values in the 9 columns come from "value" as they relate to the "record" number.
I know there are a LOT of examples that are similar, but nothing I've found covers taking all the values from "value" and CONCAT to 1 row.
This works to get all the data I want - SELECT record,value FROM apj32_facileforms_subrecords WHERE (record IN (record,value)) ORDER BY record
But the values are still in multiple rows. I can play with that query to get just the values, but I'm still at a loss to get them into 1 row. I'll keep playing with that query to see if I can figure it out before one of the experts here shows me how simple it is to do that.
Any help would be appreciated.
Using SQL to flatten an EAV model representation into a relational representation can be somewhat convoluted, and not very efficient.
Two commonly used approaches are conditional aggregation and correlated subqueries in the SELECT list. Both approaches call out for careful indexing for suitable performance with large sets.
correlated subqueries example
Here's an example of the correlated subquery approach, to get one value of the "zipcode" attribute for some records
SELECT r.id
, ( SELECT v1.value
FROM `apj32_facileforms_subrecords` v1
WHERE v1.record = r.id
AND v1.name = 'zipcode'
ORDER BY v1.value LIMIT 0,1
) AS `Zipcode`
FROM ( SELECT 1 AS id ) r
Extending that, we repeat the correlated subquery, changing the attribute identifier ('firstname' in place of 'zipcode'). It looks like we could also reference it by element, e.g. v2.element = 2.
SELECT r.id
, ( SELECT v1.value
FROM `apj32_facileforms_subrecords` v1
WHERE v1.record = r.id
AND v1.name = 'zipcode'
ORDER BY v1.value LIMIT 0,1
) AS `Zipcode`
, ( SELECT v2.value
FROM `apj32_facileforms_subrecords` v2
WHERE v2.record = r.id
AND v2.name = 'firstname'
ORDER BY v2.value LIMIT 0,1
) AS `First Name`
, ( SELECT v3.value
FROM `apj32_facileforms_subrecords` v3
WHERE v3.record = r.id
AND v3.name = 'lastname'
ORDER BY v3.value LIMIT 0,1
) AS `Last Name`
FROM ( SELECT 1 AS id UNION ALL SELECT 2 ) r
returns something like
id Zipcode First Name Last Name
-- ------- ---------- ---------
1 98228 David Bacon
2 98228 David Bacon
conditional aggregation approach example
We can use GROUP BY to collapse multiple rows into one row per entity, and use conditional tests in expressions to "pick out" attribute values with aggregate functions.
SELECT r.id
, MIN(IF(v.name = 'zipcode' ,v.value,NULL)) AS `Zip Code`
, MIN(IF(v.name = 'firstname' ,v.value,NULL)) AS `First Name`
, MIN(IF(v.name = 'lastname' ,v.value,NULL)) AS `Last Name`
FROM ( SELECT 1 AS id UNION ALL SELECT 2 ) r
LEFT
JOIN `apj32_facileforms_subrecords` v
ON v.record = r.id
GROUP
BY r.id
For more portable syntax, we can replace MySQL IF() function with more ANSI standard CASE expression, e.g.
, MIN(CASE v.name WHEN 'zipcode' THEN v.value END) AS `Zip Code`
Note that MySQL does not support SQL Server PIVOT syntax, or Oracle MODEL syntax, or Postgres CROSSTAB or FILTER syntax.
To extend either of these approaches to be dynamic, to return a resultset with a variable number of columns and a variety of column names ... that is not possible in the context of a single SQL statement. We could separately execute SQL statements to retrieve information that would allow us to dynamically construct a SQL statement of the form shown above, with an explicit set of columns to be returned.
The approaches outlined above return a more traditional relational model (individual columns, each with a value).
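The two-step idea (first query the attribute names, then build the pivot statement) might look like the following sketch. It uses Python's sqlite3 module as a stand-in for MySQL. The table name `subrecords` is a shortened stand-in for the real table, and the sample rows and `build_pivot_sql` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subrecords (record INTEGER, name TEXT, title TEXT, value TEXT);
    INSERT INTO subrecords VALUES
        (1, 'zipcode',   'Zip Code',   '98228'),
        (1, 'firstname', 'First Name', 'David'),
        (1, 'lastname',  'Last Name',  'Bacon'),
        (2, 'zipcode',   'Zip Code',   '10001'),
        (2, 'firstname', 'First Name', 'Ann');
""")

def build_pivot_sql(conn):
    # Step 1: a first statement discovers which attributes exist.
    names = [r[0] for r in conn.execute(
        "SELECT DISTINCT name FROM subrecords ORDER BY name")]
    # Step 2: build one MIN(CASE ...) column per attribute. The names are
    # bound as parameters and the aliases are positional (c0, c1, ...),
    # so no data value is pasted into the SQL text.
    cols = ", ".join(
        f"MIN(CASE WHEN name = ? THEN value END) AS c{i}"
        for i in range(len(names)))
    sql = f"SELECT record, {cols} FROM subrecords GROUP BY record ORDER BY record"
    return names, sql

names, sql = build_pivot_sql(conn)
rows = conn.execute(sql, names).fetchall()
print(names)
for row in rows:
    print(row)
```

The caller maps the positional columns back to `names` for display. A missing attribute, such as record 2's last name, comes back as NULL (None in Python).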
non-relational munge of attributes and values into a single string
If we have some special delimiters, we could munge together a representation of the data using GROUP_CONCAT function
As a rudimentary example:
SELECT r.id
, GROUP_CONCAT(v.title,'=',v.value ORDER BY v.name) AS vals
FROM ( SELECT 1 AS id ) r
LEFT
JOIN `apj32_facileforms_subrecords` v
ON v.record = r.id
AND v.name in ('zipcode','firstname','lastname')
GROUP
BY r.id
To return two columns, something like
id vals
-- ---------------------------------------------------
1 First Name=David,Last Name=Bacon,Zip Code=98228
We need to be aware that the return from GROUP_CONCAT is limited to group_concat_max_len bytes. And here we have just squeezed the balloon, moving the problem to some later processing, to parse the resulting string. If we have any equal signs or commas that appear in the values, it's going to make a mess of parsing the result string. So we will have to properly escape any delimiters that appear in the data, so that GROUP_CONCAT expression is going to get more involved.
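To make the delimiter problem concrete, here is a small Python sketch of one possible escaping scheme, with a backslash placed before each delimiter. This is an illustration of the parsing side only. In SQL the same escaping would presumably be done with nested REPLACE() calls inside the GROUP_CONCAT expression. The function names and sample values are invented.

```python
def escape(text):
    # Escape the escape character first, then the two delimiters.
    return text.replace("\\", "\\\\").replace(",", "\\,").replace("=", "\\=")

def pack(pairs):
    """Join (title, value) pairs the way GROUP_CONCAT would, but escaped."""
    return ",".join(f"{escape(k)}={escape(v)}" for k, v in pairs)

def unpack(packed):
    """Reverse pack(): split on unescaped ',' and '=' only."""
    if not packed:
        return []
    pairs, key, buf, escaped = [], None, [], False
    for ch in packed:
        if escaped:
            buf.append(ch)          # take the escaped character literally
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == "=" and key is None:
            key, buf = "".join(buf), []
        elif ch == ",":
            pairs.append((key, "".join(buf)))
            key, buf = None, []
        else:
            buf.append(ch)
    pairs.append((key, "".join(buf)))
    return pairs

pairs = [("First Name", "David"), ("Last Name", "Bacon, Jr."), ("Note", "a=b")]

# Without escaping, a naive split on ',' yields more pieces than pairs.
naive_pieces = ",".join(f"{k}={v}" for k, v in pairs).split(",")
print(len(naive_pieces), "pieces for", len(pairs), "pairs")

packed = pack(pairs)
print(unpack(packed) == pairs)
```

The round trip only works because both sides agree on the escape character, which is the extra complexity the paragraph above warns about.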

return values of table 1 based on single column in table 2

I have 3 tables that I am using and need to make a query to return data from one table based on the value of a single column in the second table.
tbl_user
ID
login
pass
active
mscID
tbl_master
ID
name
training_date
MSCUnit
Active
tbl_msc
mscID
mscName
my current SQL statement:
SELECT
tbl_master.ID,
tbl_master.name,
tbl_master.training_date,
tbl_master.MSCUnit,
tbl_master.active,
tbl_user.mscID
FROM
tbl_master,
tbl_user
WHERE
tbl_master.active = 1 AND tbl_master.MSCUnit = tbl_user.mscID
The value stored in tbl_msc.mscID is a varchar(11) and contains a string similar to A00 or A19. This is also the primary key of the table.
The values stored in tbl_user.mscID match those of tbl_msc.mscID. The values stored in tbl_master.MSCUnit also match those of tbl_msc.mscID.
My goal is to return all records from tbl_master where the currently logged in user has the same mscID. The problem I am having is the statement returns all records in tbl_master.
I have tried several different join statements and for some reason, I cannot get this to filter correctly.
I am missing something. Any assistance in the SQL statement would be appreciated.
Thanks,
Will
You should be writing this using joins. I don't know how you know who the current user is, but the idea is to join the three tables together:
SELECT m.ID, m.name, m.training_date, m.MSCUnit, m.active,
u.mscID
FROM tbl_master m JOIN
tbl_user u
ON m.MSCUnit = u.mscID JOIN
tbl_msc msc
ON msc.mscID = u.mscID
WHERE m.active = 1 AND msc.mscName = ?;
Notice the use of proper, explicit, standard JOIN syntax and table aliases.
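Since the goal is to filter by the logged-in user, a variant that binds the user's ID may be closer to what is needed. Here is a minimal sketch with Python's sqlite3 module standing in for MySQL. The sample rows and the `master_rows_for_user` helper are hypothetical, and how the application knows the current user's ID is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_msc (mscID TEXT PRIMARY KEY, mscName TEXT);
    CREATE TABLE tbl_user (ID INTEGER PRIMARY KEY, login TEXT,
                           active INTEGER, mscID TEXT);
    CREATE TABLE tbl_master (ID INTEGER PRIMARY KEY, name TEXT,
                             training_date TEXT, MSCUnit TEXT, active INTEGER);
    INSERT INTO tbl_msc VALUES ('A00', 'Unit A00'), ('A19', 'Unit A19');
    INSERT INTO tbl_user VALUES (1, 'will', 1, 'A00'), (2, 'sam', 1, 'A19');
    INSERT INTO tbl_master VALUES
        (10, 'Alice', '2014-01-05', 'A00', 1),
        (11, 'Bob',   '2014-02-10', 'A19', 1),
        (12, 'Carol', '2014-03-15', 'A00', 0);
""")

def master_rows_for_user(conn, user_id):
    """Active tbl_master rows whose MSCUnit matches the given user's mscID."""
    sql = """
        SELECT m.ID, m.name, m.MSCUnit
        FROM tbl_master m
        JOIN tbl_user u ON m.MSCUnit = u.mscID
        WHERE m.active = 1
          AND u.ID = ?          -- without this filter every user's rows come back
        ORDER BY m.ID
    """
    return conn.execute(sql, (user_id,)).fetchall()

print(master_rows_for_user(conn, 1))
print(master_rows_for_user(conn, 2))
```

The original statement had the join condition but no condition identifying a single user, which is why it returned rows for every user's unit.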
Select a.*, b.ID as userid from
tbl_master a, tbl_user b where
a.MSCUnit = b.mscID and
a.MSCUnit in (select mscID from
tbl_user where active=1)
This should point you in the right direction.

Get a list of ids not present in a table

I have a list of ids, and I want to query a mysql table for ids not present in the table.
e.g.
list_of_ids = [1,2,4]
mysql table
id
1
3
5
6
..
Query should return [2,4] because those are the ids not in the table
Since we can't see your code, I can only work on assumptions.
Try this anyway:
SELECT id FROM list_of_ids
WHERE id NOT IN (SELECT id
FROM table)
I hope this helps
There is a horrible text-based hack:
SELECT
substr(result,2,length(result)-2) AS notmatched
FROM (
SELECT
@set:=replace(@set,concat(',',id,','),',') AS result
FROM (
select @set:=concat(',',
'1,2,4' -- your list here
,',')
) AS setinit,
tablename -- Your tablename here
) AS innerview
ORDER BY LENGTH(result)
LIMIT 1;
If you represent your ids as a derived table, then you can do this directly in SQL:
select list.val
from (select 1 as val union all
select 2 union all
select 4
) list left outer join
t
on t.id = list.val
where t.id is null;
SQL doesn't really have a "list" type, so your question is ambiguous. If you mean a comma separated string, then a text hack might work. If you mean a table, then something like this might work. If you are constructing the SQL statement, I would advise you to go down this route, because it should be more efficient.
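When the statement is constructed by application code, the derived table can be generated from the list with bound parameters. Here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The table `t`, the alias `wanted`, and the `missing_ids` helper are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY);
    INSERT INTO t VALUES (1), (3), (5), (6);
""")

def missing_ids(conn, ids):
    """Return the ids from the list that have no row in table t."""
    if not ids:
        return []
    # One "SELECT ?" per id, glued with UNION ALL, forms the derived table.
    derived = " UNION ALL ".join("SELECT ? AS val" for _ in ids)
    sql = f"""
        SELECT wanted.val
        FROM ({derived}) AS wanted
        LEFT JOIN t ON t.id = wanted.val
        WHERE t.id IS NULL
        ORDER BY wanted.val
    """
    return [row[0] for row in conn.execute(sql, list(ids))]

print(missing_ids(conn, [1, 2, 4]))
```

For very long lists, the number of bound parameters per statement can become a limit, and loading the ids into a temporary table may be the better route.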

Right way to do this query

I have a table
form (
int id )
webformsystemflags ( int id, int formid, int sysflagid )
sysflag ( int id, name char(10) )
form table is a table which has all the forms
webformsystemflags is a table which has the forms which have flags applied to them. It has a foreign key formid, which references id in the form table, and sysflagid, which is a foreign key to the sysflag table.
sysflag is the table which contains the flags. Let's say I have flags defined as 1,2,3.
I can have forms which don't have all the flags applied to it, some may have 1, some may have 2 or some may have 3 applied to it or some may have none.
How can I find all the forms which have either flag 1 or flag 2 or flag 3 applied to it ?
This is a common trick to find EXCLUSION. The value I have below of "FlagYouAreExpectingTo__NOT__Exist" is explicitly the one you expect NOT to be there. Here's how it works.
Get every form and LEFT JOIN to the Web System Flags table WITH FINDING the matching form, and flag setting you DO NOT want. If it finds a valid entry for the form and flag, the "formid" in the (wsf) table will exist. So, we want all that DON'T exist, hence the closing WHERE wsf.formid is null.
It will be NULL for those where it is NOT already flagged.
select
f.ID
from
form f
left join webformsystemflags wsf
on f.id = wsf.formid
AND wsf.sysflagid = FlagYouAreExpectingTo__NOT__Exist
where
wsf.formid is null
You could use a subquery:
SELECT * FROM `form` WHERE `id` IN (SELECT `formid` FROM `webformsystemflags`)
Careful with subqueries on huge databases though. You could do the same thing with joins but this is an easy solution that will get you going.
Or for all results that DO NOT have a certain flag:
SELECT * FROM `form` WHERE `id` NOT IN (SELECT `formid` FROM `webformsystemflags` WHERE `sysflagid` = 1 OR `sysflagid` = 2)
or a join method:
SELECT f.*, r.`sysflagid` FROM `form` f LEFT JOIN `webformsystemflags` r ON r.`formid` = f.`id` WHERE r.`sysflagid` IS NOT NULL
will get you the forms and the related flags. However, it will not get ALL flags in one row if the form has multiple flags on it. That one you may need to do a concat on the flags, but this answer is already growing unnecessarily complex.
*LAST EDIT *
Ok nutsandbolts - You need to update your question cause the two of us have overshot ourselves in a number of different queries and it isn't really helping to come back saying it doesnt give the right results. The right results can easily be reached by simply examining the queries we have provided and using the general logic behind them to compose the query that is right for you.
So my last suggestion - you say you want a query that will return a form IF it has a certain flag applied to it AND it does NOT have other flags applied to it.
Here it is supposing you wanted all forms with a flag of 1 AND NOT 2 or 3 or none:
SELECT f.*, r.`sysflagid` FROM `form` f LEFT JOIN `webformsystemflags` r ON r.`formid` = f.`id` WHERE r.`sysflagid` =1 AND r.`formid` NOT IN (SELECT `formid` FROM `webformsystemflags` WHERE `sysflagid` = 2 OR `sysflagid` = 3)
Because your webformsystemflags is relational this query will NOT return any forms that do not exist in the webformsystemflags table - so you don't need to consider null.
If this is not what you're looking for I strongly suggest you rewrite your question with absolute and perfect clarity on your needs cause after this one I'm out of this conversation. Much luck to you though. Have fun.
You can use an exists clause to pull records like this:
select a.*
from form a
where exists (select 1
from webformsystemflags
where formid = a.id
and sysflagid IN (1,2,3))
This won't give you the associated flag. If you want that:
select a.*, b.sysflagid
from form a
join (select formid, sysflagid
from webformsystemflags
where sysflagid in (1,2,3)) b
on a.id = b.formid
There are many different ways to solve this.
EDIT: By reading a comment on the other answer it seems the question was unclear. You want the result forms that only have ONE flag? i.e. the form has flag 1 but not 2 or 3?
edit2: if you really just want a true/false query pulling only the true (has a flag):
select a.*, b.sysflagid
from form a
join webformsystemflags b on a.id = b.formid
If you want forms without flags:
select a.*
from form a
left join webformsystemflags b on a.id = b.formid
where b.formid is null
edit3: Based on comment, forms with one flag and not one of the others:
select a.*
from form a
where exists (select 1 from webformsystemflags where formid = a.id and sysflagid = 1)
and (
not exists (select 1 from webformsystemflags where formid = a.id and sysflagid = 2)
and
not exists (select 1 from webformsystemflags where formid = a.id and sysflagid = 3)
)
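To compare the "any of the flags" and "flag 1 but neither 2 nor 3" readings side by side, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The sample forms and flag rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE form (id INTEGER PRIMARY KEY);
    CREATE TABLE webformsystemflags (id INTEGER PRIMARY KEY,
                                     formid INTEGER, sysflagid INTEGER);
    INSERT INTO form VALUES (1), (2), (3), (4);
    INSERT INTO webformsystemflags (formid, sysflagid) VALUES
        (1, 1),          -- form 1: flag 1 only
        (2, 1), (2, 2),  -- form 2: flags 1 and 2
        (3, 3);          -- form 3: flag 3 only; form 4 has no flags
""")

# Reading 1: forms that have flag 1 or 2 or 3.
any_flag = [r[0] for r in conn.execute("""
    SELECT f.id FROM form f
    WHERE EXISTS (SELECT 1 FROM webformsystemflags w
                  WHERE w.formid = f.id AND w.sysflagid IN (1, 2, 3))
    ORDER BY f.id
""")]

# Reading 2: forms that have flag 1 and neither flag 2 nor flag 3.
only_flag_1 = [r[0] for r in conn.execute("""
    SELECT f.id FROM form f
    WHERE EXISTS (SELECT 1 FROM webformsystemflags w
                  WHERE w.formid = f.id AND w.sysflagid = 1)
      AND NOT EXISTS (SELECT 1 FROM webformsystemflags w
                      WHERE w.formid = f.id AND w.sysflagid IN (2, 3))
    ORDER BY f.id
""")]

print(any_flag)
print(only_flag_1)
```

Form 2 appears under the first reading but not the second, because it carries flag 2 in addition to flag 1.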