SQL query in MySQL containing mathematical comparison - mysql

I need a SQL query that looks up values in table B by comparing them against the (randomized) values in table A. The values in table A were generated randomly. The values in table B are ordered as a cumulative distribution function. For each row of table A, the query should return the first row from table B that satisfies the criterion.
Table A:
+----+-------+
| ID | value |
+----+-------+
| 1 | 0.1234|
| 2 | 0.8923|
| 3 | 0.5221|
+----+-------+
Table B:
+----+-------+------+
| ID | value | name |
+----+-------+------+
| 1 | 0.2000| Alpha|
| 2 | 0.5000| Beta |
| 3 | 0.7500| Gamma|
| 4 | 1.0000| Delta|
+----+-------+------+
Result should be:
+----+-------+------+
| ID | value | name |
+----+-------+------+
| 1 | 0.1234| Alpha|
| 2 | 0.8923| Delta|
| 3 | 0.5221| Gamma|
+----+-------+------+
Value 0.1234 is smaller than all the values of B, and Alpha has the smallest of them --> Alpha.
Value 0.8923 is smaller than 1.0000 only --> Delta.
Value 0.5221 is smaller than both 0.7500 and 1.0000, and 0.7500 is the smaller of the two --> Gamma.
This query works only if table A has one value:
select value, name
from B
where (select value from A) < value;
Any ideas how to get this to work with the full table A?

You can use subquery to get the data you need:
SELECT a.ID, a.value,
(SELECT b.name FROM TableB b WHERE a.value < b.value ORDER BY b.ID ASC LIMIT 1) as name
FROM TableA a
In this case, for each row in table A you find the first record in table B that has a larger number in the value column. Depending on your requirements, the operator < might need to be changed to <=.
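A minimal way to try the correlated subquery on the sample data, sketched with Python's built-in sqlite3 module. SQLite is only a stand-in here so the sketch runs without a MySQL server (that substitution is an assumption, not part of the original answer), and the table names A and B follow the question. The sketch orders by b.value instead of b.ID, which picks the same row as long as the IDs follow the cumulative order:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, value REAL);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, value REAL, name TEXT);
    INSERT INTO A VALUES (1, 0.1234), (2, 0.8923), (3, 0.5221);
    INSERT INTO B VALUES (1, 0.2, 'Alpha'), (2, 0.5, 'Beta'),
                         (3, 0.75, 'Gamma'), (4, 1.0, 'Delta');
""")

# For every row of A, pick the B row with the smallest value above a.value.
query = """
    SELECT a.ID, a.value,
           (SELECT b.name
              FROM B b
             WHERE a.value < b.value
             ORDER BY b.value ASC
             LIMIT 1) AS name
      FROM A a
     ORDER BY a.ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a value in A is larger than every value in B, the subquery finds no row and name comes back as NULL.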

Related

How to loop through an array in SQL where clause?

I have a MySQL table which has the following columns and records:
| Name | Total | GivenBy |
| ---- | -------- | ------------ |
| Z | 200 |['A','B','C'] |
| X | 240 |['A','D','C'] |
I would like to extract Record No. 1 on the basis of 3rd column where the SQL query would be like:
SELECT * FROM mytable WHERE GivenBy='B';
Is there a way I can loop through the list in third column and take out the respective string as required in the SQL WHERE clause in a single query?
Please note that I cannot add more columns in the table.
If you can please provide the query as MySQL compatible, I would really appreciate it.
The "array" you show isn't quite valid JSON, but if you use double-quotes instead of single-quotes, you can use JSON_TABLE() to do this:
CREATE TABLE MyTable
(
Name CHAR(1) PRIMARY KEY,
Total INT NOT NULL,
GivenBy JSON NOT NULL
);
INSERT INTO MyTable VALUES
('Z', 200, '["A","B","C"]'),
('X', 240, '["A","D","C"]');
SELECT Name, Total, g.Value
FROM MyTable
CROSS JOIN JSON_TABLE(GivenBy, '$[*]' COLUMNS(Value CHAR(1) PATH '$')) AS g;
+------+-------+-------+
| name | total | value |
+------+-------+-------+
| X | 240 | A |
| X | 240 | D |
| X | 240 | C |
| Z | 200 | A |
| Z | 200 | B |
| Z | 200 | C |
+------+-------+-------+
But the best choice is not to store "arrays" in MySQL. Store the values one per row in a second table.
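As a rough sketch of the one-value-per-row layout, here is what the lookup could look like. The child table given_by and its Giver column are hypothetical names chosen for illustration, and SQLite (via Python's sqlite3 module) stands in for MySQL so the example is runnable on its own:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (Name TEXT PRIMARY KEY, Total INTEGER NOT NULL);
    -- Hypothetical child table: one row per (record, giver) pair.
    CREATE TABLE given_by (
        Name  TEXT NOT NULL REFERENCES mytable(Name),
        Giver TEXT NOT NULL,
        PRIMARY KEY (Name, Giver)
    );
    INSERT INTO mytable VALUES ('Z', 200), ('X', 240);
    INSERT INTO given_by VALUES ('Z', 'A'), ('Z', 'B'), ('Z', 'C'),
                                ('X', 'A'), ('X', 'D'), ('X', 'C');
""")

# A plain equality test replaces any "looping" through an array.
matches = conn.execute("""
    SELECT m.Name, m.Total
      FROM mytable m
      JOIN given_by g ON g.Name = m.Name
     WHERE g.Giver = ?
""", ("B",)).fetchall()
print(matches)
```

With this layout the filter is an ordinary indexed comparison, so no JSON functions or pattern matching are needed.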
You can use the LIKE operator with wildcards (not a regex) to match the value in the third column.
select * from mytable where GivenBy like "%B%";
Something similar would work, though note that it also matches any element that merely contains the letter B.
You need to run a script:
Retrieve the list of unique values in the GivenBy column using the following query:
SELECT DISTINCT JSON_EXTRACT(GivenBy, '$[*]') AS GivenByValues
FROM mytable;
Loop through the list of unique values, and for each value, run a query that uses that value in the WHERE clause:
SELECT *
FROM mytable
WHERE JSON_SEARCH(GivenBy, 'one', [current_value_from_loop]) IS NOT NULL;

Exotic GROUP BY In MySQL

Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In an imperative programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If the new row's Name has the current prefix as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with the new row's Name as the prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know if your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix:=IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
FROM (SELECT * FROM table1 ORDER BY Name) t1
JOIN (SELECT @prefix := '~') p
) t2
GROUP BY prefix
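The scan that this query performs (walk the rows in Name order, carry the current prefix along, and start a new group whenever a Name no longer extends it) can be sketched outside the database. The function name sum_by_minimal_prefix is made up for illustration, and Python's default string ordering is assumed to agree with the table's collation:

```python
def sum_by_minimal_prefix(rows):
    """Sum values per minimal prefix in one pass over the sorted names."""
    totals = {}
    prefix = None
    for name, value in sorted(rows):
        # A name that does not extend the current prefix opens a new group.
        if prefix is None or not name.startswith(prefix):
            prefix = name
        totals[prefix] = totals.get(prefix, 0) + value
    return totals


rows = [("A", 1), ("BC", 2), ("AY", 3), ("AZ", 4), ("B", 5), ("BCR", 6)]
totals = sum_by_minimal_prefix(rows)
print(totals)
```

Because the names are visited in sorted order, every minimal prefix is seen before any name it is a prefix of, which is the same property the ordered subquery relies on.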
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
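+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                     |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| 1  | PRIMARY            | t1    | index | Name          | Name | 11      | NULL | 6    | Using where; Using index; Using temporary; Using filesort |
| 1  | PRIMARY            | t2    | ALL   | NULL          | NULL | NULL    | NULL | 6    | Using where; Using join buffer (Block Nested Loop)        |
| 3  | DEPENDENT SUBQUERY | t3    | index | NULL          | Name | 11      | NULL | 6    | Using where; Using index                                  |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+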
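To check the NOT EXISTS query against the sample data without a MySQL server, here is a small sketch using Python's sqlite3 module. SQLite is a stand-in (an assumption for the sake of a runnable example), so CONCAT(x, y) is written with SQLite's || operator; the logic is otherwise the same as in the query above:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Name TEXT UNIQUE, Value INTEGER);
    INSERT INTO table1 VALUES ('A', 1), ('BC', 2), ('AY', 3),
                              ('AZ', 4), ('B', 5), ('BCR', 6);
""")

# t1 supplies the minimal prefixes (names that do not extend another name);
# t2 supplies every row whose name starts with that prefix.
totals = conn.execute("""
    SELECT t1.Name, SUM(t2.Value) AS Value
      FROM table1 t1
      JOIN table1 t2 ON t2.Name LIKE t1.Name || '%'
     WHERE NOT EXISTS (SELECT *
                         FROM table1 t3
                        WHERE t1.Name LIKE t3.Name || '_%')
     GROUP BY t1.Name
     ORDER BY t1.Name
""").fetchall()
print(totals)
```

Names that themselves contain % or _ would be treated as wildcards by LIKE, in SQLite and MySQL alike, so this sketch assumes plain alphanumeric names.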
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
An EXPLAIN says that both of these queries use the index on Name, so should be reasonably efficient. Here is the result of the explain on my MySQL 5.6 server:
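+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| id | select_type        | table | type   | possible_keys | key     | key_len | ref          | rows | Extra                                              |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| 1  | PRIMARY            | t1    | index  | PRIMARY       | PRIMARY | 11      | NULL         | 6    | Using index; Using temporary; Using filesort       |
| 1  | PRIMARY            | t3    | eq_ref | PRIMARY       | PRIMARY | 11      | test.t1.Name | 1    | Using where; Using index                           |
| 1  | PRIMARY            | t2    | ALL    | NULL          | NULL    | NULL    | NULL         | 6    | Using where; Using join buffer (Block Nested Loop) |
| 3  | DEPENDENT SUBQUERY | t4    | index  | NULL          | PRIMARY | 11      | NULL         | 6    | Using where; Using index                           |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+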
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)

MySQL table order by one column when other column has a particular value

I have two MySQL tables, record_items and property_values, with the following structure.
table : property_values (column REC is foreign key to record_items)
id(PK)|REC(FK)| property | value|
1 | 1 | name | A |
2 | 1 | age | 10 |
3 | 2 | name | B |
4 | 3 | name | C |
5 | 3 | age | 9 |
table: record_items
id(PK) |col1|col2 |col3|
1 | v11| v12 | v13|
2 | v21| v22 | v23|
3 | v31| v32 | v33|
4 | v41| v42 | v43|
5 | v51| v52 | v53|
The record_items table contains only basic information about the record, whereas the property_values table references record_items through a foreign key and stores each property and its value in a separate row.
Now I want to get the record_items sorted based on a particular property, say by age.
My HQL query will be like
Select distinct rec from PropertyValues where property="age" order by value;
But this query skips record 2, since it doesn't have an entry for the age property.
I expect the result to contain the records that have an age property, in sorted order, followed by those that don't have an age property at all. How can I query that?
Here is a raw MySQL query which should do the trick:
SELECT t1.*
FROM record_items t1
LEFT JOIN property_values t2
ON t1.id = t2.REC AND
t2.property = 'age'
ORDER BY CASE WHEN t2.value IS NULL THEN 1 ELSE 0 END, t2.Value
I notice that your value column in property_values mixes numeric and text data. This won't work well for sorting purposes: the ages are compared as strings, so '10' sorts before '9' unless you cast the value to a number.
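Here is a small runnable sketch of the same LEFT JOIN ordering on the sample data, using Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example needs no server). It adds a numeric cast and an id tie-breaker, neither of which is in the query above; in MySQL the cast would be written CAST(t2.value AS UNSIGNED):

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record_items (id INTEGER PRIMARY KEY, col1 TEXT);
    CREATE TABLE property_values (
        id       INTEGER PRIMARY KEY,
        REC      INTEGER REFERENCES record_items(id),
        property TEXT,
        value    TEXT
    );
    INSERT INTO record_items VALUES
        (1, 'v11'), (2, 'v21'), (3, 'v31'), (4, 'v41'), (5, 'v51');
    INSERT INTO property_values VALUES
        (1, 1, 'name', 'A'), (2, 1, 'age', '10'), (3, 2, 'name', 'B'),
        (4, 3, 'name', 'C'), (5, 3, 'age', '9');
""")

ids = [row[0] for row in conn.execute("""
    SELECT t1.id
      FROM record_items t1
      LEFT JOIN property_values t2
        ON t1.id = t2.REC AND t2.property = 'age'
     ORDER BY CASE WHEN t2.value IS NULL THEN 1 ELSE 0 END,
              CAST(t2.value AS INTEGER),  -- compare ages as numbers
              t1.id                       -- stable order for rows without an age
""")]
print(ids)
```

Records 3 and 1 come first (ages 9 and 10), followed by the records that have no age row at all.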

Joining Two Tables in MySQL with diff column name

I'm not very good at joining tables in MySQL and I'm still learning,
so I wanted to ask about joining two tables.
I have 2 tables.
From the first table I want to join two of its columns (id & path) to the second table.
But the second table has no columns named id and path; it has columns named pathid & value. The values in the pathid column are the same as the ids in the first table.
It looks like this.
first table
| id | path |
---------------------
| 1 | country/usa |
| 2 | country/jpn |
| 3 | country/kor |
second table
| pathid | value |
-------------------
| 3 | 500 |
| 1 | 10000 |
| 2 | 2000 |
So the first table indicates that the id for usa is 1, japan is 2, korea is 3.
And the second table says that for pathid 3 (which is the id for korea) the value is 500, and so on for the others.
I want it to look like this, with each path joined to its corresponding value from the second table. How can I do this in MySQL? Thank you.
Desired Result
| id | path | value |
------------------------------
| 1 | country/usa | 10000 |
| 2 | country/jpn | 2000 |
| 3 | country/kor | 500 |
You can join on the columns irrespective of their names, as long as the data types match.
SELECT id, path, value
FROM firstTable, secondTable
WHERE id = pathid
If both tables have a column with the same name, you need to qualify that name with the table name or an alias; otherwise MySQL will complain about the ambiguity. Say the id column were named id in both tables; then every use of id must state which table it refers to.
SELECT f.id, path, value
FROM firstTable f, secondTable s
WHERE f.id = s.id
Note that I omitted the alias on the other columns in the SELECT; that works as long as each of those names appears in only one of the tables.
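The first query above can also be written with an explicit JOIN ... ON clause, which keeps the join condition apart from any filtering. Below is a minimal sketch on the question's sample data, run through Python's sqlite3 module as a stand-in for MySQL (an assumption so that it is self-contained); the ORDER BY is added only to make the output order predictable:

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE firstTable (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE secondTable (pathid INTEGER, value INTEGER);
    INSERT INTO firstTable VALUES
        (1, 'country/usa'), (2, 'country/jpn'), (3, 'country/kor');
    INSERT INTO secondTable VALUES (3, 500), (1, 10000), (2, 2000);
""")

# The join columns have different names, so only the ON clause links them.
joined = conn.execute("""
    SELECT f.id, f.path, s.value
      FROM firstTable f
      JOIN secondTable s ON f.id = s.pathid
     ORDER BY f.id
""").fetchall()
for row in joined:
    print(row)
```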

Dynamically display contents of a table without duplicates

How do I display all the contents of a table without duplicates?
Example:
Table 1
______________________
| ID | VALUE |
| 0 | A |
| 1 | A |
| 2 | B |
| 3 | C |
Expected output:
A
B
C
What about
SELECT DISTINCT VALUE FROM table
or use GROUP BY statement with VALUE column
The table has no duplicate rows, only duplicate values. You can apply the DISTINCT keyword to just the VALUE column, like
SELECT DISTINCT VALUE FROM XYZ;
You can do the same using GROUP BY
SELECT VALUE FROM XYZ GROUP BY VALUE;
but that is poor style here; DISTINCT states the intent directly.