JSON merge arrays and UNIQUE or DISTINCT - mysql

In my MySQL database I have a table features with this structure:
+-------+---------+------+-----+---------+----------------+
| Field | Type    | Null | Key | Default | Extra          |
+-------+---------+------+-----+---------+----------------+
| id    | int(11) | NO   | PRI | NULL    | auto_increment |
| val   | json    | NO   |     | NULL    |                |
+-------+---------+------+-----+---------+----------------+
When I select everything from it, like so, I get the following:
mysql> select * from features;
+----+----------------------------------+
| id | val                              |
+----+----------------------------------+
|  1 | ["apple", "banana", "orange"]    |
|  2 | ["apple", "orange", "pineapple"] |
|  3 | ["orange", "banana"]             |
|  4 | []                               |
+----+----------------------------------+
The value in the val column should always be an array of strings. This array can have any length (>= 0).
The question is:
How can I select all of those array values in a single result set, without repeats, so that I get this result and can use it in PHP:
+------------+
| arr_values |
+------------+
| apple      |
| banana     |
| orange     |
| pineapple  |
+------------+
The only constraint is that the solution must be compatible with MySQL v5.7.

If the number of elements per JSON value has a known upper bound, you can cross-join against a small numbers table (the example below handles up to 10 elements):
SELECT DISTINCT
       -- JSON_UNQUOTE strips the JSON quoting so plain strings come back
       JSON_UNQUOTE(JSON_EXTRACT(features.val, CONCAT('$[', numbers.num, ']'))) AS arr_values
FROM features,
     ( SELECT 0 AS num UNION ALL
       SELECT 1 UNION ALL
       SELECT 2 UNION ALL
       SELECT 3 UNION ALL
       SELECT 4 UNION ALL
       SELECT 5 UNION ALL
       SELECT 6 UNION ALL
       SELECT 7 UNION ALL
       SELECT 8 UNION ALL
       SELECT 9 ) AS numbers
HAVING arr_values IS NOT NULL;
If the maximum array size is large but still bounded (for example, 1,000,000), it is possible to generate a numbers table of the required size dynamically, as sketched below. A stored procedure that parses the arrays iteratively into a temporary table is the safer general solution, though.
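For instance, a derived table of 1,000 numbers can be built by cross-joining a ten-row digits table with itself (a sketch assuming no array exceeds 1,000 elements; JSON_LENGTH replaces the HAVING filter):
SELECT DISTINCT
       JSON_UNQUOTE(JSON_EXTRACT(f.val, CONCAT('$[', n.num, ']'))) AS arr_values
FROM features AS f
JOIN (
  -- digits cross-joined three times: num = 0 .. 999
  SELECT h.d * 100 + t.d * 10 + u.d AS num
  FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS u
  CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS t
  CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
        UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS h
) AS n ON n.num < JSON_LENGTH(f.val);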
UPDATE: a solution without a limit on array length (stored procedure).
CREATE PROCEDURE get_unique ()
BEGIN
  -- Working copy of all the JSON arrays.
  CREATE TEMPORARY TABLE temp (val JSON);
  INSERT INTO temp
  SELECT val
  FROM features;

  -- Accumulates one element per array per iteration.
  CREATE TEMPORARY TABLE tmp (val JSON);

  cycle: LOOP
    -- Collect the first element of every remaining array.
    INSERT IGNORE INTO tmp
    SELECT JSON_EXTRACT(val, '$[0]')
    FROM temp;

    -- Drop arrays with no second element; they are exhausted after this pass.
    DELETE
    FROM temp
    WHERE JSON_EXTRACT(val, '$[1]') IS NULL;

    -- Strip the element just collected from the arrays that remain.
    UPDATE temp
    SET val = JSON_REMOVE(val, '$[0]');

    IF 0 = (SELECT COUNT(*) FROM temp) THEN
      LEAVE cycle;
    END IF;
  END LOOP;

  DROP TEMPORARY TABLE temp;

  -- Deduplicate; the NULLs come from the empty arrays.
  SELECT DISTINCT *
  FROM tmp
  WHERE val IS NOT NULL;

  DROP TEMPORARY TABLE tmp;
END
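The procedure's final SELECT is returned to the client as an ordinary result set, so PHP can fetch it like any other query result:
CALL get_unique();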

Related

How do I convert a number enum column to tinyint?

Developing in Laravel 5.7, using a MySQL database. A couple of my database columns have the enum type; I didn't do my research and made the enums full of numbers (0-2 or 0-3). After reading the pros and cons, I want to move away from enums and convert them to tinyints.
What's the best way to change the type of the column in my table to tinyint and convert the strings '0','1','2','3' to tinyint?
I don't really want to lose my data in the process.
https://laravel.com/docs/5.7/migrations#modifying-columns has information about modifying columns, however it does not support enums:
Only the following column types can be "changed": bigInteger, binary, boolean, date, dateTime, dateTimeTz, decimal, integer, json, longText, mediumText, smallInteger, string, text, time, unsignedBigInteger, unsignedInteger and unsignedSmallInteger.
To be on the safe side, I'd do this using a temporary column:
ALTER TABLE tbl ADD COLUMN _temp_col CHAR(1) COLLATE 'latin1_general_ci'; -- CHAR(1) is OK if you only have numeric ENUMs
UPDATE tbl SET _temp_col = col; -- ENUM values would be copied as is
ALTER TABLE tbl MODIFY COLUMN col TINYINT(1) UNSIGNED;
UPDATE tbl SET col = _temp_col; -- Values would be auto-converted to ints
ALTER TABLE tbl DROP COLUMN _temp_col;
Experimenting with MySQL 8.0 produced the following conversion. The ALTER TABLE converts 'x' to x+1, because casting an ENUM to a number yields its 1-based internal index rather than its label. That offset can then be corrected with the subsequent UPDATE, as shown below.
select version();
| version() |
| :-------- |
| 8.0.13 |
create table x (y enum ('0','1','2','3') );
✓
insert into x values ('1'),('0'),('2'),('3')
✓
select * from x
| y |
| :- |
| 1 |
| 0 |
| 2 |
| 3 |
alter table x modify y tinyint unsigned;
✓
select * from x;
| y |
| -: |
| 2 |
| 1 |
| 3 |
| 4 |
update x set y=y-1
✓
select * from x
| y |
| -: |
| 1 |
| 0 |
| 2 |
| 3 |
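Putting the two observations together, the whole conversion for a zero-based numeric enum reduces to the ALTER plus one corrective UPDATE (a sketch using the demo table above):
alter table x modify y tinyint unsigned; -- labels become their 1-based enum index
update x set y = y - 1;                  -- shift back to the original 0-based values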

How can I change all ids in my sql table sequentially?

I have about 1000 entries in my database:
id name
0 elephant
0 snake
0 monkey
....
Now I want to change all the ids retroactively, so that it looks like this:
id name
1 elephant
2 snake
3 monkey
....
How can I achieve this with SQL?
One option would be to leverage an auto-increment column in a new table, then insert your previous content into that table. First, create a new table with an auto-increment id:
CREATE TABLE newTable (
  id INT NOT NULL AUTO_INCREMENT,  -- AUTO_INCREMENT columns must be indexed
  name varchar(255),
  PRIMARY KEY (id)
);
Now insert your old table into the new one:
INSERT INTO newTable (name)
SELECT name FROM oldTable; -- you may select multiple columns here
Two potential drawbacks here are that you now have an old table that might need to be deleted, and that the order assigned to the names is arbitrary. But in the absence of logic for how to assign the IDs, this approach seems reasonable, and you can finish by swapping the tables as sketched below.
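Once the copy is verified, the swap itself can be done with a rename (a sketch; it keeps the old table around as a backup until you're sure):
RENAME TABLE oldTable TO oldTable_backup, newTable TO oldTable;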
Here is a step-by-step solution for your problem; hopefully it resolves your issue.
mysql> create table dt1(id int,name varchar(20));
mysql> insert into dt1 values(0,'elephant');
mysql> insert into dt1 values(0,'snake');
mysql> insert into dt1 values(0,'monkey');
mysql> select * from dt1;
+------+----------+
| id   | name     |
+------+----------+
|    0 | elephant |
|    0 | snake    |
|    0 | monkey   |
+------+----------+
mysql> update dt1 x join (select id, name, @r := @r + 1 as new_id from dt1, (select @r := 0) r) y on (x.name = y.name) set x.id = y.new_id;
Rows matched: 3 Changed: 3 Warnings: 0
mysql> select * from dt1;
+------+----------+
| id   | name     |
+------+----------+
|    1 | elephant |
|    2 | snake    |
|    3 | monkey   |
+------+----------+
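If the column should also number itself for future inserts, you could additionally promote it to an auto-increment key (a follow-up assumption beyond what the question asks):
mysql> alter table dt1 modify id int not null auto_increment, add primary key (id);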

My mysql statement to query by primary key sometimes returns more than one row, so what happened?

My schema is this:
CREATE TABLE `user` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_name` varchar(10) NOT NULL,
  `account_type` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1
INSERT INTO user VALUES (1, "zhangsan", "premiumv"), (2, "lisi", "premiumv"), (3, "wangwu", "p"), (4, "maliu", "p"), (5, "hengqi", "p"), (6, "shuba", "p");
I have the following 6 rows in the table:
+----+-----------+--------------+
| id | user_name | account_type |
+----+-----------+--------------+
|  1 | zhangsan  | premiumv     |
|  2 | lisi      | premiumv     |
|  3 | wangwu    | p            |
|  4 | maliu     | p            |
|  5 | hengqi    | p            |
|  6 | shuba     | p            |
+----+-----------+--------------+
Here is the MySQL used to query the table by id:
SELECT * FROM user WHERE id = floor(rand()*6) + 1;
I expect it to return one row, but the actual result is unpredictable: it may return 0 rows, 1 row, or sometimes more than one. Can somebody help clarify this? Thanks!
You're testing each row against a different random number, so sometimes multiple rows will match. To fix this, calculate the random number once in a subquery.
SELECT u.*
FROM user AS u
JOIN (SELECT floor(rand()*6) + 1 AS r) AS r
ON u.id = r.r
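Another way to see the fix: draw the random number once into a user variable, then compare every row against that single value (a minimal equivalent sketch):
SET @r := FLOOR(RAND() * 6) + 1;   -- evaluated exactly once
SELECT * FROM user WHERE id = @r;  -- each row is compared to the same value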
This method of selecting a random row from a table seems like a poor design. If there are any gaps in the id sequence (which can happen easily -- MySQL doesn't guarantee that they'll always be sequential, and deleting rows will leave gaps) then it could return an empty result. The usual way to select a random row from a table is with:
SELECT *
FROM user
ORDER BY RAND()
LIMIT 1
The WHERE clause must be evaluated for each row to see whether it matches, so the rand() function is evaluated once per row. Getting an inconsistent number of rows is therefore expected.
If you add LIMIT 1 to the query you will get at most one row, but rows near the end of the table become less likely to be returned, because the scan stops at the first match.
It's because the WHERE clause floor(rand()*6) + 1 is evaluated against every row in the table to see whether it matches the criteria, and the value can be different each time it is matched against a row.
You can test this with a table that has repeated values in the column used in the WHERE clause:
select * from test;
+------+------+
| id   | name |
+------+------+
|    1 | a    |
|    2 | b    |
|    1 | c    |
|    2 | d    |
|    1 | e    |
|    2 | f    |
+------+------+
select * from test where id = floor(rand()*2) + 1;
+------+------+
| id   | name |
+------+------+
|    1 | a    |
|    2 | d    |
|    1 | e    |
+------+------+
In the above example, the expression floor(rand()*2) + 1 returned 1 when matched against the first row (name = 'a'), so that row is included in the result set. It then returned 2 when matched against the fourth row (name = 'd'), so that row is also included, even though its id differs from the id of the first row in the result set.

MySQL INSERT .. UPDATE breaks AUTO_INCREMENT?

There are the following two tables:
create table lol(id int auto_increment, data int, primary key id(id));
create table lol2(id int auto_increment, data int, primary key id(id));
Insert some values:
insert into lol2 (data) values (1),(2),(3),(4);
Now insert using select:
insert into lol (data) select data from lol2;
Do it again:
insert into lol (data) select data from lol2;
Now look at the table:
select * from lol;
I receive:
+----+------+
| id | data |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  3 |    3 |
|  4 |    4 |
|  8 |    1 |
|  9 |    2 |
| 10 |    3 |
| 11 |    4 |
+----+------+
I'm puzzled by the gap between 4 and 8... What caused this and how can I do it so that there isn't a gap? Thanks a lot!
auto_increment does not guarantee consecutive values in the ID column, and it cannot: as soon as you work with parallel transactions, any such guarantee would break anyway:
-- session 1                        -- session 2
BEGIN;                              BEGIN;
INSERT INTO lol VALUES(...);        INSERT INTO lol VALUES(..);
...                                 ...
COMMIT;                             ROLLBACK;
What ids should the database assign? It cannot know in advance which transaction will commit and which will be rolled back. In your specific case the gap is deterministic, too: for a bulk insert such as INSERT ... SELECT, where the row count is not known in advance, InnoDB (with the default innodb_autoinc_lock_mode = 1) reserves auto-increment values in doubling chunks, here 1, 2-3, then 4-7. The first statement used only 1-4, so the unused ids 5-7 were lost and the next statement started at 8.
If you need a sequential numbering of your records, use a query that returns it instead, e.g. a self-join that counts how many ids are less than or equal to each row's id:
SELECT COUNT(*) AS position, t1.id, t1.data
FROM lol AS t1
INNER JOIN lol AS t2 ON t2.id <= t1.id
GROUP BY t1.id, t1.data;
With the data above, this numbers the eight rows 1 through 8 in id order, regardless of the gaps.

MySQL Alter table, add column with unique random value

I have a table to which I added a column called phone; the table also has an id set as a primary key that auto-increments. How can I insert a random value into the phone column that won't be duplicated? The following UPDATE statement did insert random values, but not all of them were unique. I'm also not sold that I declared the phone field correctly: I ran into issues when trying to set it as int(11) with ALTER TABLE (the statement ran correctly, but when adding a row with a new phone number, the inserted value was translated into a different number).
UPDATE Ballot SET phone = FLOOR(50000000 * RAND()) + 1;
Table spec:
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| phone      | varchar(11)  | NO   |     | NULL    |                |
| age        | tinyint(3)   | NO   |     | NULL    |                |
| test       | tinyint(4)   | NO   |     | 0       |                |
| note       | varchar(100) | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
-- tbl_name: Table
-- column_name: Column
-- chars_str: String containing acceptable characters
-- n: Length of the random string
-- dummy_tbl: Not a parameter, leave as is!
UPDATE tbl_name SET column_name = (
    SELECT GROUP_CONCAT(SUBSTRING(chars_str, 1 + FLOOR(RAND() * LENGTH(chars_str)), 1) SEPARATOR '')
    FROM (SELECT 1 /* UNION SELECT 2 ... UNION SELECT n */) AS dummy_tbl
);
-- Example
UPDATE tickets SET code = (
    SELECT GROUP_CONCAT(SUBSTRING('123abcABC-_$#', 1 + FLOOR(RAND() * LENGTH('123abcABC-_$#')), 1) SEPARATOR '')
    FROM (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) AS dummy_tbl
);
Try this:
UPDATE Ballot SET phone = FLOOR(50000000 * RAND()) * id;
Note that multiplying by id makes collisions less likely but still does not guarantee uniqueness: two different id/random pairs can produce the same product.
I'd tackle this by generating a (temporary) table containing every number in the range you need, then pairing each record in the target table with a random element from that pool, removing each number from the pool as it is used. Not beautiful, nor fast, but easy to develop and easy to test. A sketch of the idea follows.
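Here is one way to express that in MySQL 5.7 (a hypothetical sketch: only Ballot, id, and phone come from the question; the pool of 1..100 is an assumption and must be at least as large as the row count):
-- Pool of candidate numbers (1..100 via a digits cross-join).
CREATE TEMPORARY TABLE pool (n INT PRIMARY KEY);
INSERT INTO pool (n)
SELECT t.d * 10 + u.d + 1
FROM (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
      UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS u
CROSS JOIN (SELECT 0 AS d UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
      UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS t;

-- Shuffle the pool once; the AUTO_INCREMENT key records the shuffled position.
CREATE TEMPORARY TABLE shuffled (
  rn INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  n  INT NOT NULL
);
INSERT INTO shuffled (n)
SELECT n FROM pool ORDER BY RAND();

-- Number the target rows the same way, then pair row k with pool item k,
-- so every row receives a distinct number.
CREATE TEMPORARY TABLE numbered (
  rn INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  id INT NOT NULL
);
INSERT INTO numbered (id)
SELECT id FROM Ballot ORDER BY id;

UPDATE Ballot AS b
JOIN numbered AS nb ON nb.id = b.id
JOIN shuffled AS s ON s.rn = nb.rn
SET b.phone = s.n;

DROP TEMPORARY TABLE pool, shuffled, numbered;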