Display ID sharing 2 values from same attribute - mysql

I am trying to get the eNum (employee number) of the employee who has mastered two values (MySQL and Python) from the same attribute column. The closest I have got is shown below, but the eNum is duplicated, and I want each eNum to appear only once. I think I am getting the WHERE clause wrong, but I am not sure.
mysql> select * from employee_expert;
+------+---------+
| eNum | package |
+------+---------+
| E246 | Excel |
| E246 | MySQL |
| E246 | Python |
| E246 | Word |
| E403 | Jave |
| E403 | MySQL |
| E892 | Excel |
| E892 | PHP |
| E892 | Python |
+------+---------+
mysql> SELECT eNum, package
FROM employee_expert
WHERE (package = 'MySQL' OR package = 'Python') AND (package = 'MySQL' OR package = 'Python')
GROUP BY package;
+------+---------+
| eNum | package |
+------+---------+
| E246 | MySQL |
| E246 | Python |
+------+---------+

The WHERE clause contains an unnecessary duplication of the condition package = 'MySQL' OR package = 'Python'. Using WHERE (package = 'MySQL' OR package = 'Python') is enough. Or, to make it more readable, you can write WHERE package IN ('MySQL', 'Python').
Your query selects the employees that know 'MySQL' or 'Python' or both.
It looks like you want to select the employees that know both 'MySQL' and 'Python'. You need to use a JOIN for this purpose:
SELECT f.eNum
FROM employee_expert f # 'f' from 'first'
INNER JOIN employee_expert s USING(eNum) # 's' from 'second'
WHERE f.package = 'MySQL'
AND s.package = 'Python'
Unfortunately, this approach does not scale well if you need to match a larger set of packages, because each extra package needs another join. A better approach is to use the original query and group the results by eNum, like this:
SELECT eNum, COUNT(DISTINCT package) AS nbLangs
FROM employee_expert
WHERE package IN ('MySQL', 'Python') # <------------------------------------+
GROUP BY eNum # Make one entry for each employee |
HAVING nbLangs = 2 # Replace '2' with the number of items in this list --+
This query counts how many of the listed packages each employee knows, considering only employees that know at least one package in the list, then keeps only those that know all of them.
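To try the GROUP BY / HAVING approach without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption; the SQL itself is the same idea as above). The helper name employees_knowing_all is made up for illustration. It derives the HAVING count from the list, so the two cannot drift apart:

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; table and data come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_expert (eNum TEXT, package TEXT)")
conn.executemany(
    "INSERT INTO employee_expert VALUES (?, ?)",
    [
        ("E246", "Excel"), ("E246", "MySQL"), ("E246", "Python"), ("E246", "Word"),
        ("E403", "Jave"), ("E403", "MySQL"),
        ("E892", "Excel"), ("E892", "PHP"), ("E892", "Python"),
    ],
)


def employees_knowing_all(conn, packages):
    """Return the eNums that have a row for every package in `packages`.

    Hypothetical helper: the HAVING count is taken from the de-duplicated list.
    """
    wanted = sorted(set(packages))
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT eNum FROM employee_expert "
        f"WHERE package IN ({placeholders}) "
        "GROUP BY eNum "
        "HAVING COUNT(DISTINCT package) = ? "
        "ORDER BY eNum"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


both = employees_knowing_all(conn, ["MySQL", "Python"])
print(both)
```

Only the employee who has a row for both packages comes back, and only once.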

I think the problem is in the design itself. An employee can master many packages and a package can be mastered by many employees, so this is a many-to-many relationship. In database terms, that calls for a junction table, employee_package for example, whose primary key is composed of the primary keys of the two related tables:
+------+------------+
| eNum | package_id |
+------+------------+
| E246 | 1 |
| E246 | 2 |
| E246 | 3 |
| E892 | 1 |
+------+------------+
Then, to get the employees who know both packages, your query will be something like:
SELECT e.eNum
FROM employees e
JOIN employee_package ep ON ep.eNum = e.eNum
WHERE ep.package_id IN (1, 2) -- let's say that id 1 is for MySQL and id 2 is for Python
GROUP BY e.eNum
HAVING COUNT(DISTINCT ep.package_id) = 2

Related

MS Access - Data Type Mismatch in Criteria Expression

Using the query grid, comparing a String field with the result of a Replace function applied to another String field (same table) results in a Data Type Mismatch error when trying to filter for ‘Not Like’ (or <>).
‘TypeName’ confirms that all records are of type “String”.
The problem is caused by “MyStrCalc: Replace([StrA],".","_")”, which is compared with StrB. StrA contains Null for some records. These are filtered out (criterion = “Is Not Null”), but even when creating a new query that uses the result of the first, the same error occurs. I have also tried Nz.
If I use Make Table to create a new table where StrA “Is Not Null” and run effectively the same query, there’s no issue.
The data in the table changes frequently, so having to create a separate table every time (tens of thousands of records) is a real nuisance.
Any suggestions how to make the query work would be greatly appreciated.
(By the way – the version used is MS Access 2019 under Windows 10, both with latest updates.)
Thank you for your much appreciated quick reply.
I tried a few things as detailed below with the fourth attempt providing the desired result.
Source table t1:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 1 | Str.1 | Str_1 |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
Desired Result = StrA<>StrB after replacing dots in StrA with underscores:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
q1_Bad:
SELECT t1.UID, t1.StrA, t1.StrB, Replace([StrA],".","_",1,-1,1) AS StrACalc
FROM t1
WHERE (((Replace([StrA],".","_",1,-1,1)) Not Like [StrB]));
Result: “Data type mismatch in criteria expression”.
q2_Runs_CannotFilter:
SELECT t1.UID, t1.StrA, t1.StrB, Replace([StrA],".","_",1,-1,1) AS StrACalc, [StrACalc] Not Like [StrB] AS StrACalc_NtEq_StrB
FROM t1
WHERE (((t1.StrA) Is Not Null));
Result: Runs, but filtering field ‘StrACalc_NtEq_StrB’ (SQL or after running query) results in “Data type mismatch in criteria expression”.
q3_OK_SQL_FilterFail:
SELECT t1.UID, t1.StrA, t1.StrB, Replace(Nz([StrA]),".","_",1,-1,1) AS StrACalc, Nz([StrACalc] Not Like [StrB]) AS StrACalc_NtEq_StrB
FROM t1;
Result: Runs, but filtering field ‘StrACalc_NtEq_StrB’ is only possible after running query. Adding “Nz([StrACalc] Not Like [StrB]) AS StrACalc_NtEq_StrB” results in “Enter Parameter Value | StrACalc”.
Note: If the result of the above is called in another query, the SQL filtering will work.
q4_OK:
SELECT t1.UID, t1.StrA, t1.StrB
FROM t1
WHERE (t1.StrB) Not Like Replace(Nz([StrA]),".","_",1,-1,1);
Finally – Desired result:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
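For anyone who wants to see why the Null guard in q4 changes which rows come back, here is a rough sketch in Python with SQLite (not Access, so this is an assumption-laden analogy). SQLite does not raise the data type mismatch; its replace() simply returns NULL for a NULL input. coalesce(StrA, '') plays the role of Nz([StrA]), and <> is used instead of Not Like because SQLite's LIKE treats "_" as a wildcard:

```python
import sqlite3

# SQLite stand-in for the Access table t1 from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (UID INTEGER, StrA TEXT, StrB TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, "Str.1", "Str_1"), (2, None, "Str_2"), (3, "Str.3", "Str_4")],
)

# Without a guard, the NULL in StrA propagates through replace() and the
# comparison, so the row with UID 2 silently drops out of the result.
without_guard = [r[0] for r in conn.execute(
    "SELECT UID FROM t1 WHERE StrB <> replace(StrA, '.', '_') ORDER BY UID"
)]

# With the guard (the analogue of Nz), NULL becomes '' before replace() runs.
with_guard = [r[0] for r in conn.execute(
    "SELECT UID FROM t1 "
    "WHERE StrB <> replace(coalesce(StrA, ''), '.', '_') ORDER BY UID"
)]

print(without_guard, with_guard)
```

The guarded version returns the same two rows as the desired result table above.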

Separating a comma separated string to a new table

I inherited a project that has comma separated strings stored in a field called 'subsector' in a table named 'com_barchan_project'. I need to change this horrible design, since it's proving to be an issue trying to parse through this field. See HERE for the full story:
| id | name | sector | subsector |
+----+------+--------+-----------+
| 1 | test | 2 | 3,4,7 |
+----+------+--------+-----------+
| 2 | door | 5 | 2 |
I have created a new table called 'com_barchan_project_subsector_join' with the required fields and would like to move the values stored in 'com_barchan_project' to this new empty table.
Can anyone help me with the SQL statement that would accomplish this?
Here's what the new 'com_barchan_project_subsector_join' table should look like:
| id | project_id | subsector_id |
+----+------------+--------------+
| 1 | 1 | 3 |
+----+------------+--------------+
| 2 | 1 | 4 |
+----+------------+--------------+
| 3 | 1 | 7 |
+----+------------+--------------+
| 4 | 2 | 2 |
Once I move over the data, I will remove the 'subsector' field from the 'com_barchan_project' table and be done with it.
Thanks for your help!!!
John
Using shorter table names for brevity/clarity; and assuming you have (or can easily make) a comprehensive subsectors table...and assuming your csv are stored in a consistent format (no spaces at least).
INSERT INTO `project_subsectors` (project_id, subsector_id)
SELECT p.id, s.id
FROM projects AS p
INNER JOIN subsectors AS s ON p.subsector = s.id
OR p.subsector LIKE CONCAT(s.id, ',%')
OR p.subsector LIKE CONCAT('%,', s.id, ',%')
OR p.subsector LIKE CONCAT('%,', s.id)
;
I can't guarantee it will be fast; I'd be surprised if it was.
ON FIND_IN_SET(s.id, p.subsector) > 0 may work as well, but I am not as familiar with the behavior of that function.
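If a subsectors lookup table is not available, another option is to do the split on the client side as a one-off migration. The following is only a sketch under assumptions: it uses Python's sqlite3 in place of MySQL, the short table names from this answer, and a made-up helper name split_subsectors. It reads each project, splits the CSV, and inserts one row per value:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names are the shortened ones above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (
        id INTEGER PRIMARY KEY, name TEXT, sector INTEGER, subsector TEXT
    );
    CREATE TABLE project_subsectors (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        project_id INTEGER NOT NULL,
        subsector_id INTEGER NOT NULL
    );
    INSERT INTO projects VALUES (1, 'test', 2, '3,4,7'), (2, 'door', 5, '2');
""")


def split_subsectors(conn):
    """Copy every comma-separated subsector value into the join table.

    Hypothetical helper; empty or NULL subsector fields produce no rows.
    """
    rows = conn.execute("SELECT id, subsector FROM projects ORDER BY id").fetchall()
    pairs = [
        (project_id, int(part))
        for project_id, csv_value in rows
        for part in (csv_value or "").split(",")
        if part.strip()
    ]
    conn.executemany(
        "INSERT INTO project_subsectors (project_id, subsector_id) VALUES (?, ?)",
        pairs,
    )
    conn.commit()
    return len(pairs)


inserted = split_subsectors(conn)
join_rows = conn.execute(
    "SELECT id, project_id, subsector_id FROM project_subsectors ORDER BY id"
).fetchall()
print(inserted, join_rows)
```

Run it once, check the join table, and only then drop the old subsector column.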

Display only one row for values that appear multiple times

I have multiple rows with the same name in this table, and I want to show only one row for each name. For example, with the following data:
| name | number |
+------+--------+
| exe | 1 |
| exe | 10 |
| exe | 2 |
| bat | 1 |
| exe | 3 |
| bat | 4 |
I would like to see the following results:
| name | number |
+------+--------+
| exe | 16 |
| bat | 5 |
How can I achieve this result?
Duplicate response: My question has only one table, and the JOIN ... ON command in the linked question makes it harder to understand. I think this simple question can help many people!
Try something like this:
SELECT t.`name`, SUM(t.`number`) AS `number`
FROM mytable t
GROUP BY t.`name`
ORDER BY `number` DESC
Let the database return the result you want, rather than mucking with returning a bloatload of rows and collapsing them on the client side. There's plenty of work for the client to do without doing what the database can do way more efficiently.
You can use an aggregation function for this:
SELECT name, SUM(number) AS total
FROM myTable
GROUP BY name;
Here is a reference on aggregate functions, and here is an SQL Fiddle example using your sample data.
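In case the fiddle is not reachable, the aggregation can be checked locally. This is a minimal sketch that uses Python's sqlite3 module instead of MySQL (an assumption) with the sample data from the question:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; data is the sample from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("exe", 1), ("exe", 10), ("exe", 2), ("bat", 1), ("exe", 3), ("bat", 4)],
)

# One output row per name, with the numbers of that name added up.
totals = conn.execute(
    "SELECT name, SUM(number) AS total "
    "FROM mytable GROUP BY name ORDER BY total DESC"
).fetchall()
print(totals)
```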

MySQL to Redis - Import and Model

I'm thinking of using Redis to cache some user data snapshots in order to speed up access to that data (one of the reasons is that my MySQL tables suffer from lock contention), and I'm looking for the best way to import, in one step, a table like this (which may contain anywhere from a few records to millions of records):
mysql> select * from mytable where snapshot = 1133;
+------+--------------------------+----------------+-------------------+-----------+-----------+
| id | email | name | surname | operation | snapshot |
+------+--------------------------+----------------+-------------------+-----------+-----------+
| 2989 | example-2989#example.com | fake-name-2989 | fake-surname-2989 | 2 | 1133 |
| 2990 | example-2990#example.com | fake-name-2990 | fake-surname-2990 | 10 | 1133 |
| 2992 | example-2992#example.com | fake-name-2992 | fake-surname-2992 | 5 | 1133 |
| 2993 | example-2993#example.com | fake-name-2993 | fake-surname-2993 | 5 | 1133 |
| 2994 | example-2994#example.com | fake-name-2994 | fake-surname-2994 | 9 | 1133 |
| 2995 | example-2995#example.com | fake-name-2995 | fake-surname-2995 | 7 | 1133 |
| 2996 | example-2996#example.com | fake-name-2996 | fake-surname-2996 | 1 | 1133 |
+------+--------------------------+----------------+-------------------+-----------+-----------+
into the Redis key-value store.
I can have many "snapshots" to load into Redis, and the basic access pattern is (SQL like syntax)
select * from mytable where snapshot = ? and id = ?
These snapshots can also come from other tables, so the "globally unique ID per snapshot" is the snapshot column, e.g.:
mysql> select * from my_other_table where snapshot = 1134;
+------+--------------------------+----------------+-------------------+-----------+-----------+
| id | email | name | surname | operation | snapshot |
+------+--------------------------+----------------+-------------------+-----------+-----------+
| 2989 | example-2989#example.com | fake-name-2989 | fake-surname-2989 | 1 | 1134 |
| 2990 | example-2990#example.com | fake-name-2990 | fake-surname-2990 | 8 | 1134 |
| 2552 | example-2552#example.com | fake-name-2552 | fake-surname-2552 | 5 | 1134 |
+------+--------------------------+----------------+-------------------+-----------+-----------+
The snapshots loaded into Redis never change; they are available for only a week via TTL.
Is there a way to load this kind of data (rows and columns) into Redis in one step, combining redis-cli --pipe and HMSET?
What is the best model to use in Redis to store and retrieve this data, given the access pattern?
I have found redis-cli --pipe in Redis Mass Insertion (and also MySQL to Redis in One Step), but I can't figure out the best way to meet my requirements (load all rows/columns from MySQL in one step, best Redis model for this) using HMSET.
Thanks in advance
Cristian.
Model
To be able to query your data from Redis the same way as:
select * from mytable where snapshot = ?
select * from mytable where id = ?
You'll need the model below.
Note: select * from mytable where snapshot = ? and id = ? does not make a lot of sense here, since it's the same as select * from mytable where id = ?.
Key type and naming
[Key Type] [Key name pattern]
HASH d:{id}
ZSET d:ByInsertionDate
SET d:BySnapshot:{snapshot_id}
Note: I used d: as a namespace but you may want to rename it with the name of your domain model.
Data insertion
Insert a new row from MySQL into Redis:
hmset d:2989 id 2989 email example-2989#example.com name fake-name-2989 ... snapshot 1134
zadd d:ByInsertionDate {current_timestamp} d:2989
sadd d:BySnapshot:1134 d:2989
Another example:
hmset d:2990 id 2990 email example-2990#example.com name fake-name-2990 ... snapshot 1134
zadd d:ByInsertionDate {current_timestamp} d:2990
sadd d:BySnapshot:1134 d:2990
Cron
Here is the algorithm that must be run each day or week depending on your requirements:
for key_name in redis(ZRANGEBYSCORE d:ByInsertionDate -inf {timestamp_one_week_ago})
    // retrieve the snapshot id from d:{id}
    val snapshot_id = redis(hget {key_name} snapshot)
    // remove the hash (d:{id})
    redis(del {key_name})
    // remove the hash entry from the set
    redis(srem d:BySnapshot:{snapshot_id} {key_name})
// clean the zset from expired keys
redis(zremrangebyscore d:ByInsertionDate -inf {timestamp_one_week_ago})
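To make the order of operations in the cleanup loop concrete, here is a small simulation in plain Python. It does not talk to Redis at all: dicts and sets stand in for the HASH, ZSET and SET keys, and the function name purge_expired is made up for this sketch.

```python
def purge_expired(hashes, by_insertion, by_snapshot, cutoff):
    """Simulate the cron job on plain Python structures (no Redis involved).

    hashes:       key name -> dict of fields       (plays the d:{id} hashes)
    by_insertion: key name -> insertion timestamp  (plays d:ByInsertionDate)
    by_snapshot:  snapshot id -> set of key names  (plays d:BySnapshot:{...})
    """
    # Inclusive upper bound, like a range query from -inf to cutoff.
    expired = [key for key, ts in by_insertion.items() if ts <= cutoff]
    for key in expired:
        snapshot_id = hashes[key]["snapshot"]   # hget {key_name} snapshot
        del hashes[key]                         # del {key_name}
        by_snapshot[snapshot_id].discard(key)   # srem d:BySnapshot:{...}
        if not by_snapshot[snapshot_id]:
            del by_snapshot[snapshot_id]        # Redis drops empty sets itself
        del by_insertion[key]                   # zremrangebyscore equivalent
    return expired


hashes = {
    "d:2989": {"id": "2989", "snapshot": "1133"},
    "d:2552": {"id": "2552", "snapshot": "1134"},
}
by_insertion = {"d:2989": 100, "d:2552": 900}
by_snapshot = {"1133": {"d:2989"}, "1134": {"d:2552"}}

removed = purge_expired(hashes, by_insertion, by_snapshot, cutoff=500)
print(removed)
```

The snapshot id has to be read from the hash before the hash is deleted, which is why the lookup comes first in the loop.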
Usage
select * from my_other_table where snapshot = 1134; will be either:
{snapshot_id} = 1134
for key_name in redis(smembers d:BySnapshot:{snapshot_id})
    print(redis(hgetall {key_name}))
or write a lua script to do this directly on redis side. Finally:
select * from my_other_table where id = 2989; will be:
{id} = 2989
print(redis(hgetall d:{id}))
Import
This part is quite easy, just read the table and follow the above model. Depending on your requirements you may want to import all (or a part of) your data with an hourly/daily/weekly cron.

MySQL query - only exact result or every choice

I've a query that I need some help with -
As part of a form I've got a serial number field that is populated if there is a serial number, blank if it's not, or no result if it's an invalid serial number.
select *
from cust_site_contract as cs
where cs.serial_no = 'C20050' or (cs.serial_no <> 'C20050' and if(cs.serial_no = 'C20050',1,0)=0)
limit 10;
Here's a sample of the regular data:
+----------------------+-----------+-----------+-----------
| idcust_site_contract | system_id | serial_no | end_date
+----------------------+-----------+-----------+-----------
| 561315 | SH001626 | C19244 | 2009-12-21
| 561316 | SH001626 | C19244 | 2010-06-30
| 561317 | SH002125 | C19671 | 2010-05-31
| 561318 | SH001766 | C14781 | 2010-09-25
| 561319 | SH001766 | C14781 | 2011-02-15
| 561320 | SH002059 | C19020 | 2008-07-09
| 561321 | SH002639 | C18889 | 2008-03-31
| 561322 | SH002639 | C18889 | 2008-06-30
| 561323 | SH002715 | C20051 | 2010-04-30
| 561324 | SH002719 | C20057 | 2010-04-30
And an exact result would look something like this:
| 561487 | SH002837 | C20050 | 2012-07-04
I was writing this as a subquery so I could match the system_ids to customer and contract names, but realised I was getting garbage pretty early on.
I'm tempted to try and simplify it by saying the third case might not hold true (i.e. if it's an invalid serial number, allow the choice of any customer name and simply flag it in the data)
Has anyone got any ideas of where I'm going wrong? The combination of conditions is clearly wrong, and I can't work out how to make each side of the or statement mutually exclusive
Even if I try to evaluate only the if(sn = 'blah') I get the wrong result for obvious reasons, but can't think of a sane way to express it.
Many thanks
Scott
If there is no contract with a serial number of C20050, this query will return all rows; otherwise, it will return only one row where serial_no is C20050:
SELECT a.*
FROM cust_site_contract a
INNER JOIN
(
SELECT COUNT(*) AS rowexists
FROM cust_site_contract
WHERE serial_no = 'C20050'
) b ON b.rowexists = 0
UNION ALL
(
SELECT *
FROM cust_site_contract
WHERE serial_no = 'C20050'
LIMIT 1
)
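The same "exact match, otherwise everything" behaviour can also be written with NOT EXISTS. Note this is a different formulation from the UNION ALL query above, not a copy of it: it returns every matching row rather than just one. Here is a sketch that uses Python's sqlite3 as a stand-in for MySQL, with three rows taken from the question and a made-up helper name contracts_for:

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows are a subset of the question's data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust_site_contract ("
    "idcust_site_contract INTEGER, system_id TEXT, serial_no TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO cust_site_contract VALUES (?, ?, ?, ?)",
    [
        (561323, "SH002715", "C20051", "2010-04-30"),
        (561324, "SH002719", "C20057", "2010-04-30"),
        (561487, "SH002837", "C20050", "2012-07-04"),
    ],
)


def contracts_for(conn, serial_no):
    """Return matching contract ids, or all ids when nothing matches."""
    sql = (
        "SELECT idcust_site_contract FROM cust_site_contract "
        "WHERE serial_no = :sn "
        "   OR NOT EXISTS (SELECT 1 FROM cust_site_contract WHERE serial_no = :sn) "
        "ORDER BY idcust_site_contract"
    )
    return [row[0] for row in conn.execute(sql, {"sn": serial_no})]


exact = contracts_for(conn, "C20050")
fallback = contracts_for(conn, "NOSUCH")
print(exact, fallback)
```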
If you just write the query as below, you will get an empty result if the serial number doesn't exist or is invalid.
select cs.serial_no from cust_site_contract as cs where cs.serial_no = 'C20050'