MySQL GROUP BY value based on data

I have a query which returns the following dataset:
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
| col_0_0_ | col_1_0_ | col_2_0_ | col_3_0_ | col_4_0_ | col_5_0_ | col_6_0_ | col_7_0_ | col_8_0_ |
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
| LAI-100003662 | dsa | 4546576766 | dfdfdfd2#lendingkart.com | 2015-11-30 02:30:11 | Sultanpur | Incomplete Applications | Application Incomplete | Documents Pending |
| LAI-100003662 | dsa | 4546576766 | dfdfdfd2#lendingkart.com | 2015-11-30 02:30:11 | Sultanpur | Incomplete Applications | Null | Null |
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
Now when I apply GROUP BY col_0_0_ to the query that produces this dataset, I get only one row:
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
| col_0_0_ | col_1_0_ | col_2_0_ | col_3_0_ | col_4_0_ | col_5_0_ | col_6_0_ | col_7_0_ | col_8_0_ |
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
| LAI-100003662 | dsa | 4546576766 | dfdfdfd2#lendingkart.com | 2015-11-30 02:30:11 | Sultanpur | Incomplete Applications | Application Incomplete | Documents Pending |
+ ------------- + -------- + ---------- + ------------------------ + ------------------- + --------- + ----------------------- + ---------------------- + ----------------- +
1) Why does GROUP BY only give me the first row and not the second row from the original dataset?
2) How does GROUP BY actually work in this scenario?
SQL query with GROUP BY:
select loan0_.col_0_0_,
loan0_.col_1_0_,
loan0_.col_2_0_,
loan0_.col_3_0_,
loan0_.col_4_0_,
loan0_.col_5_0_,
dsastatus2_.col_6_0_,
dsastatus2_.col_7_0_,
dsastatus2_.col_8_0_
FROM loan0_
cross join user1_
cross join dsastatus2_
where loan0_.L_USER_ID=user1_.U_GUID
and loan0_.L_LEADSOURCE='DSA'
and (loan0_.L_SUB_STATUS_ID=dsastatus2_.ADMIN_STATUS_ID
or loan0_.L_STATUS_ID=dsastatus2_.ADMIN_STATUS_ID)
and user1_.U_REFID='dsa001'
and (loan0_.L_APPLICATION_ID like 'LAI-100003662')
GROUP BY col_0_0_ ;

To answer the questions directly:
1) Why does GROUP BY only give me the first row and not the second row from the original dataset?
Because that is how the MySQL engine works when ONLY_FULL_GROUP_BY is disabled. For the columns not named in the GROUP BY, the docs say: "the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate".
2) How does GROUP BY actually work in this scenario?
See above.
MySQL's extension to GROUP BY, quoted directly from the docs:
https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
SQL99 and later permits such nonaggregates per optional feature T301 if they are functionally dependent on GROUP BY columns: If such a relationship exists between name and custid, the query is legal. This would be the case, for example, were custid a primary key of customers.
MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them. (Before 5.7.5, MySQL does not detect functional dependency and ONLY_FULL_GROUP_BY is not enabled by default. For a description of pre-5.7.5 behavior, see the MySQL 5.6 Reference Manual.)
If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns. This causes MySQL to accept the preceding query. In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Result set sorting occurs after values have been chosen, and ORDER BY does not affect which value within each group the server chooses. Disabling ONLY_FULL_GROUP_BY is useful primarily when you know that, due to some property of the data, all values in each nonaggregated column not named in the GROUP BY are the same for each group.

The reason you see only one row is that this is what GROUP BY does: it combines records with the same value in the grouping column into one. In this case LAI-100003662 is the only value, so there is only one group.
On most SQL systems, selecting columns that are neither in the GROUP BY nor inside an aggregate function raises an error, but MySQL (with ONLY_FULL_GROUP_BY disabled) returns an arbitrary value from the group for each such column.
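One way to make the result deterministic is to wrap every non-grouped column in an aggregate. The following is only a minimal sketch, using Python's built-in sqlite3 module as a stand-in for MySQL; the table name result and the column names app_id, status and sub_status are made up for illustration and are not from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down version of the two-row dataset from the question.
conn.execute("CREATE TABLE result (app_id TEXT, status TEXT, sub_status TEXT)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [
        ("LAI-100003662", "Application Incomplete", "Documents Pending"),
        ("LAI-100003662", None, None),
    ],
)

# Every selected column is either in GROUP BY or inside an aggregate,
# so the output no longer depends on which row the engine happens to pick.
# MAX() ignores NULLs, so the non-NULL status values win.
rows = conn.execute(
    """
    SELECT app_id, MAX(status), MAX(sub_status), COUNT(*)
    FROM result
    GROUP BY app_id
    """
).fetchall()
print(rows)
```

The same SELECT shape (GROUP BY column plus aggregates) should also be accepted by MySQL with ONLY_FULL_GROUP_BY enabled, since no bare non-grouped column remains.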

Related

MS Access - Data Type Mismatch in Criteria Expression

Using the query grid, comparing a String field with a Replace function result of another String field (same table) results in a Data Type Mismatch error when trying to filter for ‘Not Like’ (or <>).
‘TypeName’ confirms that all records are of type “String”.
The problem is caused by “MyStrCalc: Replace([StrA],".","_")”, which is compared with StrB. StrA contains Null for some records. These are filtered out (Criterion = “Is Not Null”). But even when creating a new query that uses the result of the first, the same error occurs. I have also tried Nz.
If I use Make Table to create a new table where StrA “Is Not Null” and run effectively the same query, there’s no issue.
The data in the table changes frequently, so having to create a separate table every time (tens of thousands of records) is a real nuisance.
Any suggestions how to make the query work would be greatly appreciated.
(By the way – the version used is MS Access 2019 under Windows 10, both with latest updates.)
Thank you for your much appreciated quick reply.
I tried a few things as detailed below with the fourth attempt providing the desired result.
Source table t1:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 1 | Str.1 | Str_1 |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
Desired Result = StrA<>StrB after replacing dots in StrA with underscores:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
q1_Bad:
SELECT t1.UID, t1.StrA, t1.StrB, Replace([StrA],".","_",1,-1,1) AS StrACalc
FROM t1
WHERE (((Replace([StrA],".","_",1,-1,1)) Not Like [StrB]));
Result: “Data type mismatch in criteria expression”.
q2_Runs_CannotFilter:
SELECT t1.UID, t1.StrA, t1.StrB, Replace([StrA],".","_",1,-1,1) AS StrACalc, [StrACalc] Not Like [StrB] AS StrACalc_NtEq_StrB
FROM t1
WHERE (((t1.StrA) Is Not Null));
Result: Runs, but filtering field ‘StrACalc_NtEq_StrB’ (SQL or after running query) results in “Data type mismatch in criteria expression”.
q3_OK_SQL_FilterFail:
SELECT t1.UID, t1.StrA, t1.StrB, Replace(Nz([StrA]),".","_",1,-1,1) AS StrACalc, Nz([StrACalc] Not Like [StrB]) AS StrACalc_NtEq_StrB
FROM t1;
Result: Runs, but filtering field ‘StrACalc_NtEq_StrB’ is only possible after running query. Adding “Nz([StrACalc] Not Like [StrB]) AS StrACalc_NtEq_StrB” results in “Enter Parameter Value | StrACalc”.
Note: If the result of the above is called in another query, the SQL filtering will work.
q4_OK:
SELECT t1.UID, t1.StrA, t1.StrB
FROM t1
WHERE (t1.StrB) Not Like Replace(Nz([StrA]),".","_",1,-1,1);
Finally – Desired result:
| UID | StrA | StrB |
| ---:| ----- | ----- |
| 2 | | Str_2 |
| 3 | Str.3 | Str_4 |
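The effect of guarding the Null before calling Replace can be reproduced outside Access. This is only a rough analogue, assuming SQLite (via Python's sqlite3) in place of Access: COALESCE stands in for Nz, <> is used instead of Not Like to keep wildcards out of the picture, and SQLite silently returns NULL where Access raises the type mismatch error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (UID INTEGER, StrA TEXT, StrB TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, "Str.1", "Str_1"), (2, None, "Str_2"), (3, "Str.3", "Str_4")],
)

# Without a guard, replace(NULL, ...) is NULL, the comparison is NULL,
# and the row with the empty StrA is dropped from the result.
without_guard = [r[0] for r in conn.execute(
    "SELECT UID FROM t1 WHERE StrB <> replace(StrA, '.', '_') ORDER BY UID"
)]

# Converting NULL to '' first (the role Nz plays in q4_OK) keeps that row.
with_guard = [r[0] for r in conn.execute(
    "SELECT UID FROM t1 "
    "WHERE StrB <> replace(coalesce(StrA, ''), '.', '_') ORDER BY UID"
)]

print(without_guard, with_guard)
```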

Group rows by the same value in the field, while matching on partial value only

I have a table that has many rows (from a few thousand to a few million).
I need my query to do the following:
group results by the same part of the value in the field;
order by the biggest group first.
Most values in the table are only partially similar (for example, they share a prefix but have different suffixes). Since the number of similar values is huge, I cannot predict all of them.
Here is an example of my table:
+--------+-----------+------+
| Id | Uri | Run |
+--------+-----------+------+
| 15145 | select_123| Y |
| 15146 | select_345| Y |
| 15148 | delete_123| N |
| 15150 | select_234| Y |
| 15314 | delete_334| N |
| 15315 | copy_all | N |
| 15316 | merge_all | Y |
| 15317 | select_565| Y |
| 15318 | copy_all | Y |
| 15319 | delete_345| Y |
+--------+-----------+------+
What I would like to see is something like this (the Count column is desirable but not required):
+-----------+------+
| Uri | Count|
+-----------+------+
| select | 4 |
| delete | 3 |
| copy_all | 2 |
| merge_all| 1 |
+-----------+------+
If you're using MySQL 5.x, you can strip the trailing _ and digits from the Uri value using this expression:
LEFT(Uri, LENGTH(Uri) - LOCATE('_', REVERSE(Uri)))
Using a REGEXP test to see if the Uri ends in _ and some digits, we can then process the Uri according to that and then GROUP BY that value to get the counts:
SELECT CASE WHEN Uri REGEXP '_[0-9]+$' THEN LEFT(Uri, LENGTH(Uri) - LOCATE('_', REVERSE(Uri)))
ELSE Uri
END AS Uri2,
COUNT(*) AS Count
FROM data
GROUP BY Uri2
Output:
Uri2 Count
copy_all 2
delete 3
merge_all 1
select 4
Demo on SQLFiddle
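To sanity-check the grouping rule (strip a trailing underscore plus digits, leave everything else alone) without a database, the same logic can be sketched in Python. This is just an illustration on the sample rows from the question, not a replacement for the SQL above.

```python
import re
from collections import Counter

# Uri values copied from the sample table in the question.
uris = [
    "select_123", "select_345", "delete_123", "select_234", "delete_334",
    "copy_all", "merge_all", "select_565", "copy_all", "delete_345",
]

# Remove a trailing "_<digits>" suffix only; "copy_all" stays untouched
# because its suffix is not numeric.
counts = Counter(re.sub(r"_[0-9]+$", "", uri) for uri in uris)

# most_common() sorts by count, largest group first.
ranked = counts.most_common()
for uri, count in ranked:
    print(uri, count)
```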
The format of the string makes it awkward to parse with string functions.
If you are running MySQL 8.0, you can truncate the string with regexp_replace(), then group by and order by:
select regexp_replace(uri, '_\\d+$', '') new_uri, count(*) cnt
from mytable
group by new_uri
order by cnt desc
If you're using MySQL 8.x, you can use REGEXP_REPLACE() to remove the numeric suffixes from select_XXX and delete_XXX, then group by the result.
SELECT REGEXP_REPLACE(uri, '_[0-9]+$', '') AS new_uri, COUNT(*) as count
FROM yourTable
GROUP BY new_uri
You can do as below: define a common table expression (named view_1 here) that uses a CASE expression with substr to find the values that start with 'select' or 'delete'.
You can then query it with COUNT and GROUP BY.
WITH view_1 AS (
SELECT
CASE
WHEN substr(uri, 1, 6) = 'select' THEN
substr(uri, 1, 6)
WHEN substr(uri, 1, 6) = 'delete' THEN
substr(uri, 1, 6)
ELSE uri
END AS uri
FROM
your_table
)
SELECT
uri,
COUNT(uri) as "Count"
FROM
view_1
GROUP BY
uri
ORDER BY count(uri) DESC;
Output will be
delete 5
merge_all 4
select 3
copy_all 3

MySQL query to match dates and times placed in separate columns

I have two tables
------------------------
| Vehicles |
------------------------
+ id +
| name |
+ available_from_date +
| available_from_time |
+ available_to_date +
| available_to_time |
-----------------------
------------------------
| Reserved_Vehicles |
------------------------
+ id +
| vehicle_id |
+ reserved_from_date +
| reserved_from_time |
+ reserved_to_date +
| reserved_to_time |
-----------------------
I want to query the vehicles table so that I get only those vehicles which are available for the requested date and time range and are not already reserved for that time.
For example, I want to search vehicles which are available FROM date 2012-07-27 & time 10:00 TO date 2012-08-15 & time 14:00.
How can I solve the above problem with one query?
Thanks in advance. :)
It sounds like you could just use AND in your WHERE clause. Is that not working?
Do you need to query both tables? Or can you safely assume that if a car is reserved at a given time then it's not available, and if it's available then it's not reserved?
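If both tables do need to be checked, one possible shape for the query is an availability test plus a NOT EXISTS over overlapping reservations. The sketch below runs against SQLite through Python's sqlite3 rather than MySQL, and makes several assumptions: dates are stored as 'YYYY-MM-DD' text and times as 'HH:MM' text (so the concatenated strings compare correctly), and the three sample vehicles and the single reservation are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicles (
    id INTEGER PRIMARY KEY, name TEXT,
    available_from_date TEXT, available_from_time TEXT,
    available_to_date TEXT, available_to_time TEXT);
CREATE TABLE reserved_vehicles (
    id INTEGER PRIMARY KEY, vehicle_id INTEGER,
    reserved_from_date TEXT, reserved_from_time TEXT,
    reserved_to_date TEXT, reserved_to_time TEXT);
-- Hypothetical sample rows.
INSERT INTO vehicles VALUES
    (1, 'free',     '2012-07-01', '00:00', '2012-12-31', '23:59'),
    (2, 'reserved', '2012-07-01', '00:00', '2012-12-31', '23:59'),
    (3, 'too late', '2012-08-01', '00:00', '2012-09-01', '00:00');
INSERT INTO reserved_vehicles VALUES
    (1, 2, '2012-08-01', '09:00', '2012-08-02', '09:00');
""")

params = {"start": "2012-07-27 10:00", "end": "2012-08-15 14:00"}

# A vehicle qualifies if its availability window covers the whole request
# and no reservation overlaps it (overlap: starts before the request ends
# and ends after the request starts).
available = [r[0] for r in conn.execute("""
    SELECT v.id
    FROM vehicles v
    WHERE v.available_from_date || ' ' || v.available_from_time <= :start
      AND v.available_to_date || ' ' || v.available_to_time >= :end
      AND NOT EXISTS (
          SELECT 1 FROM reserved_vehicles r
          WHERE r.vehicle_id = v.id
            AND r.reserved_from_date || ' ' || r.reserved_from_time < :end
            AND r.reserved_to_date || ' ' || r.reserved_to_time > :start)
    ORDER BY v.id
""", params)]
print(available)
```

In MySQL the string concatenation would need a different spelling (|| is a logical OR there by default), but the overlap condition itself should carry over unchanged.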

SQL GROUP BY: intervals in continuity?

The idea is that say you have the following table.
-------------
| oID | Area|
-------------
| 1 | 5 |
| 2 | 2 |
| 3 | 3 |
| 5 | 3 |
| 6 | 4 |
| 7 | 5 |
-------------
If grouping by continuity were possible, this pseudo query
SELECT SUM(Area) FROM sample_table GROUP BY CONTINUITY(oID)
would return
-------------
| SUM(Area) |
-------------
| 10 |
| 12 |
-------------
The continuity break arises from the missing entry for oID 4.
Does such functionality exist within the standard functions of SQL?
There is no such functionality in "standard functions of SQL", but it is possible to get the desired result set by using some tricks.
With the subquery illustrated below we create a virtual field which you can use to GROUP BY in the outer query. The value of this virtual field is incremented each time when there is a gap in the sequence of oID. This way we create an identifier for each of those "data islands":
SELECT SUM(Area), COUNT(*) AS Count_Rows
FROM (
/* @group_enumerator is incremented each time there is a gap in oIDs continuity */
SELECT @group_enumerator := @group_enumerator + (@prev_oID != oID - 1) AS group_enumerator,
@prev_oID := oID AS prev_oID,
sample_table.*
FROM (
SELECT @group_enumerator := 0,
@prev_oID := -1
) vars,
sample_table
/* correct order is very important */
ORDER BY
oID
) q
GROUP BY
group_enumerator
Test table and data generation:
CREATE TABLE sample_table (oID INT auto_increment, Area INT, PRIMARY KEY(oID));
INSERT INTO sample_table (oID, Area) VALUES (1,5), (2,2), (3,3), (5,3), (6,4), (7,5);
I need to thank Quassnoi for pointing out this trick in my related question ;-)
UPDATE: added test table and data and fixed duplicate column name in example query.
Here's a blog post that provides a very thorough explanation and example related to grouping by contiguous data. If you have any issues comprehending it or implementing it, I can attempt to provide an implementation for your problem.
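A variant that avoids user variables is to subtract each row's rank from its oID: inside a run of consecutive oIDs the difference stays constant, so it can serve as the group key. The following is a minimal sketch of that idea, run against SQLite through Python's sqlite3 as a stand-in for MySQL; the alias island is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample_table (oID INTEGER PRIMARY KEY, Area INTEGER)")
conn.executemany(
    "INSERT INTO sample_table (oID, Area) VALUES (?, ?)",
    [(1, 5), (2, 2), (3, 3), (5, 3), (6, 4), (7, 5)],
)

# The correlated subquery counts rows with a smaller-or-equal oID, i.e. the
# row's rank. oID - rank is 0 for oIDs 1..3 and 1 for oIDs 5..7, so each
# run of consecutive oIDs gets its own "island" value.
sums = conn.execute("""
    SELECT SUM(Area), COUNT(*)
    FROM (
        SELECT s1.oID, s1.Area,
               s1.oID - (SELECT COUNT(*)
                         FROM sample_table s2
                         WHERE s2.oID <= s1.oID) AS island
        FROM sample_table s1
    ) q
    GROUP BY island
    ORDER BY MIN(oID)
""").fetchall()
print(sums)
```

The correlated count is quadratic in the number of rows, so on a large table a window function such as ROW_NUMBER(), where available, would probably be a better way to compute the rank.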

SQL Query to fill column with combination of other columns

I have a table with this structure
ID | Parameter1 | Parameter 2 | Multiplication
1 | 1024 | 100 |
2 | 1200 | 200 |
3 | 1600 | 300 |
4 | 1900 | 400 |
I want to fill the Multiplication column with a string that combines Parameter 1 and Parameter 2
ID | Parameter1 | Parameter 2 | Multiplication
1 | 1024 | 100 | 1024_100
2 | 1200 | 200 | 1200_200
3 | 1600 | 300 | 1600_300
4 | 1900 | 400 | 1900_400
Please help me write this SQL query.
Using SQL, the following query should work.
Assuming the Param fields are ints, use CAST to make them strings:
UPDATE Table1 SET Multiplication = CAST(Parameter1 AS VARCHAR(10)) + '_' + CAST(Parameter2 AS VARCHAR(10))
Otherwise, if they are already strings (e.g., varchar, text), skip the cast:
UPDATE Table1 SET Multiplication = Parameter1 + '_' + Parameter2
Just change Table1 to the name of your table.
An alternative for SQL Server is to add a computed column to handle this for you. It will automatically update the value if either Parameter1 or Parameter2 changes:
ALTER TABLE myTable
ADD myJoinedColumn AS CAST(Parameter1 AS VARCHAR(10)) + '_' + CAST(Parameter2 AS VARCHAR(10))
Or, as @Scozzard mentions in his answer, if they are already strings:
ALTER TABLE myTable
ADD myJoinedColumn AS (Parameter1 + '_' + Parameter2)
For MySQL:
update Table1 set
Multiplication = concat(cast(Parameter1 as char), '_', cast(Parameter2 as char))
More about cast and concat in MySQL 5.0 Reference Manual.
update Table1 set Multiplication = CONCAT_WS('_',Parameter1,Parameter2)
update tablename
set Multiplication = convert(varchar, Parameter1) + '_' + convert(varchar, Parameter2)
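For comparison, here is a small runnable sketch of the same update against SQLite via Python's sqlite3. It assumes SQLite's || concatenation operator, which differs from both the + used in the SQL Server answers and the CONCAT used in the MySQL answers, and it names the second column Parameter2 without a space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 ("
    "ID INTEGER PRIMARY KEY, Parameter1 INTEGER, "
    "Parameter2 INTEGER, Multiplication TEXT)"
)
conn.executemany(
    "INSERT INTO Table1 (ID, Parameter1, Parameter2) VALUES (?, ?, ?)",
    [(1, 1024, 100), (2, 1200, 200), (3, 1600, 300), (4, 1900, 400)],
)

# Cast both integers to text and join them with an underscore.
conn.execute(
    "UPDATE Table1 SET Multiplication = "
    "CAST(Parameter1 AS TEXT) || '_' || CAST(Parameter2 AS TEXT)"
)

filled = conn.execute(
    "SELECT ID, Multiplication FROM Table1 ORDER BY ID"
).fetchall()
print(filled)
```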