MySQL convert multiple columns into JSON

I'm trying to convert multiple columns from one table into a single JSON column in another table in a MySQL database (version 5.7.16). I want to use a SQL query.
The first table looks like this:
CREATE TABLE `log_old` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`temperature` DECIMAL(5,2) NULL DEFAULT NULL,
`heating_requested` BIT(1) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
) COLLATE='utf8_general_ci'
ENGINE=InnoDB;
The second table looks like this:
CREATE TABLE `log_new` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`data` JSON,
PRIMARY KEY (`id`)
) COLLATE='utf8_general_ci'
ENGINE=InnoDB;
The data JSON has the same format in all rows of the log_new table; it should look like this:
{
temperature: value,
heatingRequested: false
}
For example, log_old looks like this:
+--+-----------+-----------------+
|id|temperature|heating_requested|
+--+-----------+-----------------+
|1 |12         |true             |
|2 |14         |true             |
|3 |20         |false            |
+--+-----------+-----------------+
and I want log_new to look like this:
+--+------------------------------------------+
|id|data                                      |
+--+------------------------------------------+
|1 |{temperature:12, heatingRequested: true} |
|2 |{temperature:14, heatingRequested: true} |
|3 |{temperature:20, heatingRequested: false}|
+--+------------------------------------------+
I tried to use JSON_INSERT()
SELECT JSON_INSERT((SELECT data FROM log_new ), '$.temperature',
(SELECT temperature FROM log_old));
but this throws the error "subquery returns more than 1 row".
The only working solution I came up with is to use a WHILE loop and process it row by row, but this can take a long time:
DELIMITER //
CREATE PROCEDURE doLog()
BEGIN
  SELECT COUNT(*) INTO @length FROM log_old;
  SET @selectedid = 1;
  WHILE @selectedid <= @length DO
    SELECT temperature, heating_requested INTO @temperature, @heating_requested FROM log_old WHERE id = @selectedid;
    SELECT JSON_OBJECT('temperature', @temperature, 'heatingRequested', @heating_requested) INTO @data_json;
    INSERT INTO log_new (data) VALUES (@data_json);
    SET @selectedid = @selectedid + 1;
  END WHILE;
END //
DELIMITER ;
CALL doLog();

As all your data is available on a single row, you don't need subqueries or loops to build the JSON object.
You can try something like:
INSERT INTO log_new (data)
SELECT json_object('temperature',log_old.temperature,'heatingRequested',log_old.heating_requested)
FROM log_old
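One caveat: heating_requested is a BIT(1), which may not come out as a JSON boolean. If you want a true/false value in the JSON, a possible variant (a sketch, which treats NULL as false) is:
INSERT INTO log_new (data)
SELECT JSON_OBJECT(
         'temperature',      temperature,
         -- BIT(1) is a binary type; cast it explicitly to a JSON boolean.
         -- NULL values fall through to false here; adjust if needed.
         'heatingRequested', IF(heating_requested = b'1',
                                CAST('true' AS JSON),
                                CAST('false' AS JSON)))
FROM log_old;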

Use a programming language or BI tool. Your question is well thought out, but what I am missing is: why does this have to be in MySQL?
An RDBMS, although many have reporting add-ons, is not intended to provide this kind of low-level manipulation. You are entering the reporting realm and may need to focus on a view of your data outside the database. You would be best served using Node, PHP, Python, or just about any programming language with strong MySQL support (which is just about every modern language out there). BI tools include several free options like Pentaho/Kettle and Google's Data Studio, and countless commercial options like Tableau and the like.
It is my strong belief that stored procedures, although they have a place, should not be responsible for application logic.

Related

Best practice in MySQL for selecting two interchangeable columns and counting them, returning the most recent result

I have a MySQL table that looks like:
CREATE TABLE `messages` (
`id` int NOT NULL AUTO_INCREMENT,
`from` varchar(12) NOT NULL,
`to` varchar(12) NOT NULL,
`message` text,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=66 DEFAULT CHARSET=latin1;
So each time a message is sent or received, it is stored as:
# id from, to, message, timestamp
'65', '+1231303****', '+1833935****', 'Showtimes', '2022-01-26 09:26:10'
'64', '+1833935****', '+1231303****', 'Showtimes are: 12:30 someresponse', '2022-01-26 09:26:10'
I want to create an index of these conversation threads, and need to be able to execute a query that selects a conversation based on it being addressed either from or to a specific number, returns the number of rows that match either, and at the same time returns the last message that was sent. So basically I want it to return:
recipient (the other phone number, not the one I'm using to look up), count(messages), lastmessage
Individually, I can query all of this separately, since most of my experience here revolves around using PHP to untangle the data I'm after. What I'm curious about is a single query that lets MySQL handle this, rather than submitting multiple queries to the database server. I figure this may be a good time to learn, since several projects I've coded have run out of memory before, with so many queries inside so many loops.
Apologies in advance if this has already been answered somewhere else. I searched extensively for an answer, but the few results I found used a completely different table structure than I am using, and the MySQL query I was able to fumble together didn't work. I stand by my work as a PHP programmer, but my MySQL needs some work. Hence I'm here!
If a conversation thread can be defined by a unique combination of from and to, then building a compound key whose first part is the lower of the two numbers lets all the conversations in a thread be grouped together. However, selecting on from OR to means many conversation threads may be selected. For example:
DROP TABLE IF EXISTS T;
CREATE TABLE T(ID INT AUTO_INCREMENT PRIMARY KEY, FROMNO INT, TONO INT);
INSERT INTO T(FROMNO,TONO) VALUES
(1,2),(2,1),
(1,3),(4,1),(1,2);
WITH CTE AS
(SELECT * ,
CASE WHEN FROMNO < TONO THEN CONCAT(FROMNO,TONO)
ELSE CONCAT(TONO,FROMNO)
END AS CVAL
FROM T
WHERE FROMNO = 1 OR TONO = 1
),
CTE1 AS
(SELECT *,
DENSE_RANK() OVER (ORDER BY CVAL) DR
FROM CTE
),
CTE2 AS
(SELECT CVAL,COUNT(*) conversations,MAX(ID) MAXID
FROM CTE1
GROUP BY CVAL
)
SELECT CTE2.CVAL,CTE2.conversations,CTE2.MAXID,T.ID
FROM CTE2
JOIN T ON T.ID = CTE2.MAXID;
Yields
+------+---------------+-------+----+
| CVAL | conversations | MAXID | ID |
+------+---------------+-------+----+
| 13 | 1 | 3 | 3 |
| 14 | 1 | 4 | 4 |
| 12 | 3 | 5 | 5 |
+------+---------------+-------+----+
3 rows in set (0.002 sec)
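Applied to the messages table from the question, the same MAX(id)-join-back idea might look like the sketch below (the masked number is a placeholder for the number you are looking up, and it assumes a higher id means a newer message):
WITH cte AS
(SELECT *,
        IF(`from` = '+1231303****', `to`, `from`) AS recipient
 FROM messages
 WHERE '+1231303****' IN (`from`, `to`)
),
agg AS
(SELECT recipient, COUNT(*) AS messages, MAX(id) AS maxid
 FROM cte
 GROUP BY recipient
)
-- join back on the highest id per recipient to fetch the last message
SELECT agg.recipient, agg.messages, m.message AS lastmessage
FROM agg
JOIN messages m ON m.id = agg.maxid;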

insert multiple values in the same row of mysql table

I need to insert multiple values in the same row. For example, I need to store the different referrers that came to a site. The table looks like:
|Date |Ref |Uri
--------------------------
28/9 |ref1 ref2 ref3 |url1
In the above table, the same date and link got 3 different referrers.
How can I store the referrers in the same row for a particular date, and retrieve each referrer individually?
I hope you understand my requirement.
You can do this, but you shouldn't. It contradicts the Database normalization rules, which you can see here: https://en.wikipedia.org/wiki/Database_normalization.
Use a further table which contains the primary key from your table above and connect it with each ref key. Example:
Existing Table:
T-Id |Date |Uri
--------------------------
1 | 28/9 |url1
2 | 28/9 |url2
New Table:
Id | Ref-Id | T-Id
--------------------------
1 | 1 | 1
2 | 2 | 1
3 | 3 | 1
4 | 1 | 2
5 | 3 | 2
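A sketch of that structure in SQL (all table and column names here are illustrative, not from the question):
-- one row per date/url visit
CREATE TABLE visits (
  t_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  visit_date DATE NOT NULL,
  uri VARCHAR(255) NOT NULL
);

-- one row per referrer of a visit
CREATE TABLE visit_refs (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ref_id INT UNSIGNED NOT NULL,
  t_id INT UNSIGNED NOT NULL,
  FOREIGN KEY (t_id) REFERENCES visits (t_id)
);

-- retrieve each referrer individually for a given url
SELECT v.visit_date, v.uri, r.ref_id
FROM visits v
JOIN visit_refs r ON r.t_id = v.t_id
WHERE v.uri = 'url1';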
First of all, you should not do that.
You should not save data in MySQL like that. No row should have a column in which more than one value is stored, separated by commas, spaces, or anything else. Instead, you should separate such data into multiple rows. That way you can easily retrieve, update, and delete any row.
But if you do want to save data like that, then you should go for the JSON datatype.
As of MySQL 5.7.8, MySQL supports a native JSON data type that enables efficient access to data in JSON (JavaScript Object Notation) documents.
It can be saved using a JSON array.
A JSON array contains a list of values separated by commas and enclosed within [ and ] characters:
["abc", 10, null, true, false]
Create table ex:
CREATE TABLE `book` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(200) NOT NULL,
`tags` json DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
Insert data ex:
INSERT INTO `book` (`title`, `tags`)
VALUES (
'ECMAScript 2015: A SitePoint Anthology',
'["JavaScript", "ES2015", "JSON"]'
);
There are many native functions in MySQL to handle the JSON data type:
How to Use JSON Data Fields in MySQL Databases
Mysql JSON Data Type
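For example, assuming the book table above, individual elements of the array can be read and searched with these functions:
-- first element of the tags array
SELECT title, JSON_EXTRACT(tags, '$[0]') AS first_tag
FROM book;

-- rows whose tags array contains a given value
-- (the second argument must itself be valid JSON, hence the double quotes)
SELECT *
FROM book
WHERE JSON_CONTAINS(tags, '"ES2015"');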
In the case when a referrer is an entity with many attributes, you can do as suggested by @rbr94. When a referrer has no more than one attribute, splitting the data into multiple rows or using the JSON datatype will do the job.
In the end it depends on your choice of solution.

Can MySQL FIND_IN_SET or equivalent be made to use indices?

If I compare
explain select * from Foo where find_in_set(id,'2,3');
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | Foo | ALL | NULL | NULL | NULL | NULL | 4 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
with this one
explain select * from Foo where id in (2,3);
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | Foo | range | PRIMARY | PRIMARY | 8 | NULL | 2 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
It is apparent that FIND_IN_SET does not exploit the primary key.
I want to put a query such as the above into a stored procedure, with the comma-separated string as an argument.
Is there any way to make the query behave like the second version, in which the index is used, but without knowing the content of the id set at the time the query is written?
In reference to your comment:
@MarcB the database is normalized, the CSV string comes from the UI.
"Get me data for the following people: 101,202,303"
This answer has a narrow focus on just those numbers separated by a comma. Because, as it turns out, you were not even talking about FIND_IN_SET after all.
Yes, you can achieve what you want. You create a prepared statement that accepts a string as a parameter, like in this recent answer of mine. In that answer, look at the second block, which shows the CREATE PROCEDURE and its 2nd parameter that accepts a string like (1,2,3). I will get back to this point in a moment.
Not that you need to see it @spraff, but others might. The mission is to get the Explain output to show type != ALL, and possible_keys and key not null, as in your second block. For general reading on the topic, see the article Understanding EXPLAIN's Output and the MySQL Manual page entitled EXPLAIN Extra Information.
Now, back to the (1,2,3) reference above. We know from your comment, and from the second Explain output in your question, that it hits the following desired conditions:
type = range (and in particular, not ALL). See the docs above on this.
key is not null
These are precisely the conditions you have in your second Explain output, and the output that can be seen with the following query:
explain
select * from ratings where id in (2331425, 430364, 4557546, 2696638, 4510549, 362832, 2382514, 1424071, 4672814, 291859, 1540849, 2128670, 1320803, 218006, 1827619, 3784075, 4037520, 4135373, ... use your imagination ..., ..., 4369522, 3312835);
where I have 999 values in that in clause list. That is a sample from this answer of mine, in Appendix D, which generates such a random string of CSV surrounded by open and close parentheses.
And note the following Explain output for that 999 element in clause below:
Objective achieved. You achieve this with a stored proc similar to the one I mentioned before in this link, using a PREPARED STATEMENT (and those things use concat() followed by an EXECUTE).
The index is used, and a tablescan (meaning: bad) is avoided. Further readings are The range Join Type, any reference you can find on MySQL's Cost-Based Optimizer (CBO), and this answer from vladr, though dated, with an eye on the ANALYZE TABLE part, in particular after significant data changes. Note that ANALYZE can take a significant amount of time to run on ultra-huge datasets, sometimes many, many hours.
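A minimal sketch of such a procedure (the name and the VARCHAR size are illustrative; the parameter is the parenthesized CSV string and must be trusted, see the injection note below):
DELIMITER //
CREATE PROCEDURE getRatingsByIdList(IN idList VARCHAR(10000))
BEGIN
    -- idList is expected to look like '(1,2,3)'
    SET @sql = CONCAT('SELECT * FROM ratings WHERE id IN ', idList);
    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END //
DELIMITER ;

CALL getRatingsByIdList('(2331425,430364,4557546)');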
SQL Injection Attacks:
Strings passed to stored procedures are an attack vector for SQL injection attacks. Precautions must be in place to prevent them when using user-supplied data. If your routine is applied against your own ids generated by your system, then you are safe. Note, however, that 2nd-level SQL injection attacks occur when data was put in place by routines that did not sanitize it in a prior insert or update. Attacks put in place earlier via data and used later are a sort of time bomb.
So this answer is finished for the most part.
The below is a view of the same table with a minor modification to it to show what a dreaded Tablescan would look like in the prior query (but against a non-indexed column called thing).
Take a look at our current table definition:
CREATE TABLE `ratings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`thing` int(11) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5046214 DEFAULT CHARSET=utf8;
select min(id), max(id),count(*) as theCount from ratings;
+---------+---------+----------+
| min(id) | max(id) | theCount |
+---------+---------+----------+
| 1 | 5046213 | 4718592 |
+---------+---------+----------+
Note that the column thing was a nullable int column before.
update ratings set thing=id where id<1000000;
update ratings set thing=id where id>=1000000 and id<2000000;
update ratings set thing=id where id>=2000000 and id<3000000;
update ratings set thing=id where id>=3000000 and id<4000000;
update ratings set thing=id where id>=4000000 and id<5100000;
select count(*) from ratings where thing!=id;
-- 0 rows
ALTER TABLE ratings MODIFY COLUMN thing int not null;
-- current table definition (after above ALTER):
CREATE TABLE `ratings` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`thing` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5046214 DEFAULT CHARSET=utf8;
And then the Explain that is a Tablescan (against column thing):
You can use the following technique to make such a query use the primary key index.
Prerequisites:
You know the maximum number of items in the comma-separated string, and it is not large.
Description:
we convert the comma-separated string into a temporary (derived) table of positions
and inner join to that table
SELECT @ids := '1,2,3,5,11,4', @maxCnt := 15;
SELECT *
FROM foo
INNER JOIN (
SELECT * FROM (SELECT @n := @n + 1 AS n FROM foo INNER JOIN (SELECT @n := 0) AS _x) AS _a WHERE _a.n <= @maxCnt
) AS k ON k.n <= LENGTH(@ids) - LENGTH(REPLACE(@ids, ',', '')) + 1
AND id = SUBSTRING_INDEX(SUBSTRING_INDEX(@ids, ',', k.n), ',', -1)
This is a trick to extract the nth value in a comma-separated list:
SUBSTRING_INDEX(SUBSTRING_INDEX(@ids, ',', k.n), ',', -1)
Notes: @ids can be anything, including another column from the same or another table.
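A standalone check of that extraction trick:
-- the inner call keeps elements 1..3 ('1,2,3'); the outer call keeps the last of them
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('1,2,3,5,11,4', ',', 3), ',', -1) AS third;
-- returns '3'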

Seeking coding example for TSQLTimeStamp

Delphi XE2 and MySql.
My previous question led to the recommendation that I should be using MySql's native TIMESTAMP datatype to store date/time.
Unfortunately, I can't seem to find any coding examples, and I am getting weird results.
Given this table:
mysql> describe test_runs;
+------------------+-------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------+-------------+------+-----+---------------------+-------+
| start_time_stamp | timestamp | NO | PRI | 0000-00-00 00:00:00 | |
| end_time_stamp | timestamp | NO | | 0000-00-00 00:00:00 | |
| description | varchar(64) | NO | | NULL | |
+------------------+-------------+------+-----+---------------------+-------+
3 rows in set (0.02 sec)
I would like to:
declare a variable into which I can store the result of SELECT CURRENT_TIMESTAMP (what type should it be? TSQLTimeStamp?)
insert a row at test start which has start_time_stamp = the variable above
and end_time_stamp = some "NULL" value ... "0000-00-00 00:00:00"? Can I use that directly, or do I need to declare a TSQLTimeStamp and set each field to zero? (there doesn't seem to be a TSQLTimeStamp.Clear; it's a structure, not a class)
update the end_time_stamp when the test completes
calculate the test duration
Can someone please point me at a URL with some Delphi code which I can study to see how to do this sort of thing? GINMF.
I don't know why you want to hassle with that TIMESTAMP, and why you want to retrieve the CURRENT_TIMESTAMP just to put it back.
And as already stated, it is not good advice to use a TIMESTAMP field as a PRIMARY KEY.
So my suggestion is to use this TABLE SCHEMA
CREATE TABLE `test_runs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start_time_stamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`end_time_stamp` timestamp NULL DEFAULT NULL,
`description` varchar(64) NOT NULL,
PRIMARY KEY (`id`)
);
Starting a test run is handled by
INSERT INTO test_runs ( description ) VALUES ( :description );
SELECT LAST_INSERT_ID() AS id;
and to finalize the record you simply call
UPDATE test_runs SET end_time_stamp = CURRENT_TIMESTAMP WHERE id = :id
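The test duration can then be computed directly in SQL when reading the rows back, for instance with TIMESTAMPDIFF (a sketch against the schema above):
SELECT id,
       description,
       -- NULL end_time_stamp means the run has not finished yet
       TIMESTAMPDIFF(SECOND, start_time_stamp, end_time_stamp) AS duration_seconds
FROM test_runs
WHERE end_time_stamp IS NOT NULL;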
Just declare a TSQLQuery (or the correct component for the data access layer of your choice), attach it to a valid connection, and populate its SQL property with:
select * from test_runs;
Double-click on the query to launch its fields editor, and select Add all fields from the contextual menu of that editor.
It will create the correct field type, according to the data access layer and driver you're using to access your data.
Once that's done, if you need to use the value in code, usually you do it by using the AsDateTime property of the field, so you just use a plain TDateTime Delphi type and let the database access layer deal with the specific database details to store that field.
For example, if your query object is named qTest and the table field is named start_time_stamp, your Delphi variable associated with that persistent field will be named qTeststart_time_stamp, so you can do something like this:
var
StartTS: TDateTime;
begin
qTest.Open;
StartTS := qTeststart_time_stamp.AsDateTime;
ShowMessage('start date is ' + DateTimeToStr(StartTS));
end;
If you use dbExpress and are new to it, read A Guide to Using dbExpress in Delphi database applications
I don't know about MySQL, but if the TField subclass generated is a TSQLTimeStampField, you will need to use the type and functions in the SqlTimSt unit (Data.SqlTimSt for XE2+).
You want to declare the local variables as TSQLTimeStamp
uses Data.SqlTimSt, ...;
....
var
StartTS: TSQLTimeStamp;
EndTS: TSQLTimeStamp;
begin
StartTS := qTeststart_time_stamp.AsSQLTimeStamp;
Data.SqlTimSt also includes functions to convert to and from TSQLTimeStamp, e.g. SQLTimeStampToDateTime and DateTimeToSQLTimeStamp.
P.S. I tend to agree that using a timestamp as a primary key is likely to cause problems. I would tend to use an auto-incrementing surrogate key, as Sir Rufo suggests.

MySQL subqueries?

I have a report table that looks similar to this
reports
inspection_type | inspection_number
berries | 111
citrus | 222
grapes | 333
inspection_type, in my case, is the name of another table that I would like to
SELECT * FROM where inspection_number equals report_key in
that associated table.
{fruit}
row | report_key | etc....
value | 111 | value
value | 222 | value
The issue is that I do not know how to query inspection_type to get the table name
to query for the value. Does that make any sense?
I tried this here, but even I know that it's glaringly wrong:
SELECT inpection_type, inspection_number
FROM reports rpt
ON rpt.inspection_number = report_key
(SELECT * FROM inspection_type WHERE status < '2')
WHERE rpt.status < '2'
ORDER BY rpt.inspection_number DESC
Could a SQL guru tell me the best way to do this?
Since it is not possible to use a variable for a table name directly in SQL, you will have to construct the SQL dynamically.
Variable table names in Stored Procedures
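For example, a sketch with a prepared statement (table and column names are taken from the question; 111 is just a sample inspection_number):
-- look up which table holds the details, then build the query dynamically
SET @tbl = (SELECT inspection_type FROM reports WHERE inspection_number = 111);
SET @sql = CONCAT('SELECT * FROM `', @tbl, '` WHERE report_key = 111');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;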
You can't really do what you are aiming for in SQL alone; you'll need to either mess around in another language, or (and this is the preferred solution) restructure the database, i.e. (sorry for the meta-code):
// Comes in where your existing `reports` table is
inspections (
inspection_id INT UNSIGNED NOT NULL AI,
inspection_type_id INT UNSIGNED NOT NULL (links to inspection_types.inspection_type_id)
.... other columns ....
)
// New table to normalise the inspection types
inspection_types (
inspection_type_id INT UNSIGNED NOT NULL AI,
type_name VARCHAR NOT NULL
.... other columns ....
)
// Normalised table to replace each {fruit} table
inspection_data (
inspection_data_id INT UNSIGNED NOT NULL AI,
inspection_id INT UNSIGNED NOT NULL (links to inspections.inspection_id)
.... other columns ....
)
Then your query would be simply
SELECT *
FROM inspections
INNER JOIN inspection_types
ON inspection_types.inspection_type_id = inspections.inspection_type_id
INNER JOIN inspection_data
ON inspection_data.inspection_id = inspections.inspection_id
The brief overview above is quite vague because your existing table data hasn't really been specified, but the general principle is sound. It wouldn't even take much to migrate the data out of your existing structure, and when it's done it'll give you far cleaner queries and allow you to actually get at the data you're after far more easily.