I’d like to be able to create MySQL Document Store Collections via simple SQL DDL statements rather than using the X-Protocol clients.
Is there any way to do so?
Edit: I’ll try and clarify the question.
Collections are tables using JSON datatypes and functions. That much is clear.
I would like to know how I can create a Collection without using the X-Protocol calls and make sure that the aforementioned collection is picked up as an actual Collection.
Judging from MySQL Workbench, collection tables have an _id blob PK with an expression, a doc JSON column and a few other elements I do not recall at the moment (might be indexes, etc.).
I have no means to tell via Workbench what additional schema/metadata information is required for a table to be considered a Document Store Collection, or whether the mere presence of the _id and doc columns is enough.
I hope this clears things up.
All "x-api" instructions are directly mapped to sql syntax. When you e.g. run db.createCollection('my_collection'), MySQL will literally just execute
CREATE TABLE `my_collection` (
`doc` json DEFAULT NULL,
`_id` varbinary(32) GENERATED ALWAYS AS
(json_unquote(json_extract(`doc`,_utf8mb4'$._id'))) STORED NOT NULL,
`_json_schema` json GENERATED ALWAYS AS (_utf8mb4'{"type":"object"}') VIRTUAL,
PRIMARY KEY (`_id`),
CONSTRAINT `$val_strict` CHECK (json_schema_valid(`_json_schema`,`doc`))
NOT ENFORCED
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
You can run the corresponding SQL statements yourself if you follow that format.
The doc and _id columns (with their types and the given expression) are required; the _json_schema is optional, and so is the check (which was only added in MySQL 8.0.17). Since MySQL 8, no additional columns are allowed, except generated columns that use JSON_EXTRACT on doc and which are supposed to be used in an index, see below (although they don't actually have to be used in an index).
Any table that looks like that - doc and _id with their correct type/expression and no other columns except an optional _json_schema and generated JSON_EXTRACT(doc, ...) columns - will be found with getCollections().
To add an index, the corresponding syntax for
my_collection.createIndex("age", {fields: [{field: "$.age", type: "int"}]})
would be
ALTER TABLE `test`.`my_collection` ADD COLUMN `$ix_i_somename` int
GENERATED ALWAYS AS (JSON_EXTRACT(doc, '$.age')) VIRTUAL,
ADD INDEX `age` (`$ix_i_somename`)
Obviously,
db.dropCollection('my_collection')
simply translates to
DROP TABLE `my_collection`
Similarly, all CRUD operations on documents have a corresponding SQL DML syntax (that will actually be executed when you use them via x-api).
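For instance, add() and find() end up as a plain INSERT and SELECT on the doc column. A rough sketch (not the exact statements the plugin emits; the `test` schema and the way the _id is generated here are just illustrations):
-- Roughly what my_collection.add({"name": "Fred", "age": 42}) boils down to;
-- the client normally generates the _id, here UUID() stands in for it
INSERT INTO `test`.`my_collection` (doc)
VALUES (JSON_OBJECT('_id', UUID(), 'name', 'Fred', 'age', 42));

-- Roughly what my_collection.find("age > 30") boils down to
SELECT doc
FROM `test`.`my_collection`
WHERE JSON_EXTRACT(doc, '$.age') > 30;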
This is my MSSQL UDT:
create type ConditionUDT as Table
(
Name varchar(150),
PackageId int
);
This is my MSSQL stored procedure:
create Procedure [dbo].[Condition_insert]
@terms_conditions ConditionUDT readonly
as
begin
insert into dbo.condition (name, p_id)
select [Name],[PackageId]
from @terms_conditions;
end
There is a workaround if you have no other choice but to migrate from SQL Server to MySQL.
The closest predefined structural object in MySQL that can hold many rows is an actual table. So you need one table per SQL Server UDTT. Make sure you use a specific schema or naming convention so you know those tables are UDTT emulations.
The idea is to fill in the info, use it inside the SP, and then delete it. You do, however, need to guarantee who reads what and that the info is deleted after usage (consumed). So:
Each of those tables needs 2 extra columns, which I suggest you always put first: the key and the variable name. The key can be char(38), using UUID() to get a unique identifier; it can also be int, using CONNECTION_ID() instead. A unique identifier is better, however, as it ensures that nobody will ever use information not intended for them, no matter what. The variable name will be the one used for the SQL Server parameter, just a string. This way:
You know which UDTT you are using from the table name.
You know the identity of your process through the key.
You know the 'variable' from the name.
So, in your application code you:
Begin transaction.
Insert the data into the proper (UDTT emulation) tables, using a key and the variable name(s).
Supply the key and the variable name(s) to the stored procedure. You can use the same key for many table-type parameters within the same SP call.
The stored procedure can now use that information as it did before from the UDTT variable, using the key and variable name as filters to query the proper UDTT emulation table.
Delete the data you inserted.
Commit
On catch, rollback.
For simplicity, your SP can read the data into a temp table first, so you do not need to change a line of code from the original SQL Server SP for this aspect.
The transaction in your app code ensures that your temporary variable data is either deleted or never committed, no matter what goes wrong.
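As an illustration of the above for the ConditionUDT example, a minimal sketch in MySQL could look like this (all names are my own, and the error handling described earlier is omitted):
-- One emulation table per SQL Server table type; key and variable name come first
CREATE TABLE udt_condition (
    batch_key     CHAR(38)     NOT NULL,  -- filled with UUID() by the application
    variable_name VARCHAR(64)  NOT NULL,  -- the 'variable' this batch of rows belongs to
    name          VARCHAR(150),
    package_id    INT,
    KEY ix_udt_condition (batch_key, variable_name)
);

DELIMITER //
CREATE PROCEDURE condition_insert(IN p_key CHAR(38), IN p_var VARCHAR(64))
BEGIN
    -- Read 'the variable' the same way the SQL Server SP read the table-type parameter
    INSERT INTO `condition` (name, p_id)
    SELECT name, package_id
    FROM udt_condition
    WHERE batch_key = p_key AND variable_name = p_var;
END//
DELIMITER ;
The application then wraps the whole sequence in one transaction: insert the rows with a fresh UUID() key, call the procedure with that key and the variable name, delete the rows, and commit (rolling back on any error).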
As Larnu thought might be the case, MySQL doesn't support user defined types at all, let alone user defined table types.
You will have to make them all separate scalar parameters.
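For reference, a sketch of the question's procedure with the table-valued parameter flattened into scalar parameters (the procedure name is my own; the caller invokes it once per row):
DELIMITER //
CREATE PROCEDURE condition_insert_one(IN p_name VARCHAR(150), IN p_package_id INT)
BEGIN
    -- `condition` needs backticks because CONDITION is a reserved word in MySQL
    INSERT INTO `condition` (name, p_id) VALUES (p_name, p_package_id);
END//
DELIMITER ;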
In the latest Django (2.2), when I add a new field to a model like this:
new_field = models.BooleanField(default=False)
Django runs the following commands for MySQL:
ALTER TABLE `app_mymodel` ADD COLUMN `new_field` bool DEFAULT b'0' NOT NULL;
ALTER TABLE `app_mymodel` ALTER COLUMN `new_field` DROP DEFAULT;
COMMIT;
While this works when everything is updated, this is very problematic because old versions of the application can no longer create models after this migration is run (they do not know about new_field). Why not just keep the DEFAULT constraint?
Why not just keep the DEFAULT constraint?
Because Django handles the default model field option at the application level, not the database level. So the real question is why it sets the DEFAULT constraint at all.
But first: Django does not use database-level defaults. (From the documentation: "Django never sets database defaults and always applies them in the Django ORM code."). This has been true from the beginning of the project. There has always been some interest in changing it (the first issue on the subject is 14 years old), and certainly other frameworks (Rails, I think, and SQLAlchemy) have shown that it is possible.
But there are good reasons beyond backwards compatibility for handling defaults at the application level. Such as: the ability to express arbitrarily complex computations; not having to worry about subtle incompatibilities across database engines; the ability to instantiate a new instance in code and have immediate access to the default value; the ability to present the default value to users in forms; and more.
Based on the two most recent discussions on the subject, I'd say there's little appetite to incorporate database defaults into the semantics of default, but there is support for adding a new db_default option.
Now, adding a new non-nullable field to an existing database is a very different use case. In that situation, you have to provide a default to the database for it to perform the operation. makemigrations will try to infer the right value from your default option if it can, and if not it will force you to specify a value from the command line. So the DEFAULT modifier is used for this limited purpose and then removed.
As you've noticed, the lack of database-level defaults in Django can make continuous deployment harder. But the solution is fairly straightforward: just re-add the default yourself in a migration. One of the great benefits of the migrations system is that it makes it easy to make arbitrary, repeatable, testable changes to your database outside of Django's ORM. So just add a new RunSQL migration operation:
operations = [
    # Add SQL for both forward and reverse operations
    migrations.RunSQL(
        "ALTER TABLE app_mymodel ALTER COLUMN new_field SET DEFAULT 0;",
        "ALTER TABLE app_mymodel ALTER COLUMN new_field DROP DEFAULT;",
    )
]
You can put that in a new migration file or simply edit the automatically generated one. Depending on your database and its support for transactional DDL, the sequence of operations may or may not be atomic.
I found this ticket from 2 years ago: https://code.djangoproject.com/ticket/28000
It is stated in there that:
Django uses database defaults to set values on existing rows in a table. It doesn't leave the default values in the database so dropping the default is correct behavior. There might be an optimization not to set/drop the default in this case -- I'm not sure it's needed since the column isn't null. A separate ticket could be opened for this.
I also saw the same reference in another question here: Django Postgresql dropping column defaults at migrate
And searching a bit more I came upon this SO question: Django implementation of default value in database that led to the code of the _alter_field method from django.db.backends.base.schema where this comment exists:
# When changing a column NULL constraint to NOT NULL with a given
# default value, we need to perform 4 steps:
# 1. Add a default for new incoming writes
# 2. Update existing NULL rows with new default
# 3. Replace NULL constraint with NOT NULL
# 4. Drop the default again.
Although that last one is about altering an existing nullable field to a non-nullable one, this seems to be the way that Django handles the default case :/
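Spelled out for the field from the question, those four steps would look roughly like this in MySQL (a sketch; the exact statements Django emits may differ):
-- 1. Add a default for new incoming writes
ALTER TABLE app_mymodel ALTER COLUMN new_field SET DEFAULT b'0';
-- 2. Update existing NULL rows with the new default
UPDATE app_mymodel SET new_field = 0 WHERE new_field IS NULL;
-- 3. Replace the NULL constraint with NOT NULL
ALTER TABLE app_mymodel MODIFY new_field bool NOT NULL DEFAULT b'0';
-- 4. Drop the default again
ALTER TABLE app_mymodel ALTER COLUMN new_field DROP DEFAULT;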
Have you heard of SQL Closures or any library that implements them?
They allow you to execute this script in a SQL command window (or put it into an SP):
exec closure,"
rec{select db=name from sys.databases where name like 'corp_'},{
use |db|
rec{select tbl=name from sys.tables where name like 'user_'},{
for{col},{Created,Modified},{
def_col {
|tbl|.|col| datetime not null default(getdate()) ix
}
}
def_col {|tbl|.deleted datetime ix}
}
}
"
This script will make sure that Created (not null), Modified (not null) and Deleted indexed columns exist in all tables with the prefix user_ in all databases with the prefix corp_.
def_col will create new column or alter existing column to match desired definition. It will also create/recreate non-unique ascending index for each of these columns.
def_col will drop and recreate dependencies as needed (constraints, indexes, foreign keys, schema bound views and functions).
rec, for, and def_col will catch errors and log them into an error table or raise immediately, depending on context options, for easy debugging and tracking of errors during script execution should they happen.
As you can see, the script can be executed many times without failures; it is just that the second time it will not change anything.
Is there a more readable, supportable and compact way to achieve the same functionality in MS-SQL?
If yes - please post example in your answer.
Is a more readable, supportable and compact way available in MySQL, Oracle or other major flavors of the SQL language?
I do not see any reason you could not create simple SQL Server groups in SSMS, register your servers against those groups, and run your DDL from there. You could also do it with SQLCMD or PowerShell.
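For the idempotency part alone, plain T-SQL also gets you there without a closure library, just more verbosely; a sketch for a single column and index (table and index names are illustrative):
IF NOT EXISTS (SELECT 1 FROM sys.columns
               WHERE object_id = OBJECT_ID(N'dbo.user_account') AND name = N'Created')
    ALTER TABLE dbo.user_account ADD Created datetime NOT NULL DEFAULT (GETDATE());

IF NOT EXISTS (SELECT 1 FROM sys.indexes
               WHERE object_id = OBJECT_ID(N'dbo.user_account') AND name = N'IX_user_account_Created')
    CREATE INDEX IX_user_account_Created ON dbo.user_account (Created);
Run it twice and the second pass changes nothing, which is the property the closure script relies on.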
What is the best practice for storing data in a database which only ever requires a single entry? An example would be configuration data which relates to the entire application/website. Is it common to create a table for this which has only a single entry?
I'm asking under the context of a MongoDB database though I think the question is also valid for SQL databases.
An example of an auxiliary table commonly found in databases would be one called Constants, which may hold values such as pi, the idea being that all applications using the database are required to use the same scale and precision. In standard SQL, to ensure there is at most one row, e.g. (from Joe Celko):
CREATE TABLE Constants
(
lock CHAR(1) DEFAULT 'X' NOT NULL PRIMARY KEY,
CHECK (lock = 'X'),
pi FLOAT DEFAULT 3.141592653 NOT NULL,
e FLOAT DEFAULT 2.71828182 NOT NULL,
phi FLOAT DEFAULT 1.6180339887 NOT NULL,
...
);
Because MySQL (prior to 8.0.16) doesn't support CHECK constraints, a trigger is required to achieve the same.
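A sketch of such a trigger, assuming the table has been created in MySQL (where lock must be backtick-quoted because it is a reserved word; the trigger name is my own):
DELIMITER //
CREATE TRIGGER constants_single_row
BEFORE INSERT ON Constants
FOR EACH ROW
BEGIN
    -- Together with the primary key on `lock`, this limits the table to a single row
    IF NEW.`lock` <> 'X' THEN
        SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Constants may only contain the row with lock = X';
    END IF;
END//
DELIMITER ;
An analogous BEFORE UPDATE trigger can keep the lock value from being changed afterwards.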
A table would be fine; there is no reason not to use one just because it will have only one row.
I just had the weirdest idea (I wouldn't implement it but for some reason I thought of that). You can create a hard-coded view like this:
create view myConfigView
as
select 'myConfigvalue1' as configValue1, 'myConfigvalue2' as configValue2
and do select * from myConfigView :)
but again, there is no reason not to use a table just because it will have only one row
If you are using a SQL DB, you will probably have columns like key name and value, and each attribute will be stored as a row.
In MongoDB, you can store all related configuration as a single JSON document
I use a config table with a name (config_name) and a value (config_value). I even add a help field so that users can see what the name/value pair is intended for, or where it is used.
CREATE TABLE config (
config_id bigint unsigned NOT NULL auto_increment,
config_name varchar(128) NOT NULL,
config_value text NOT NULL,
config_help text COMMENT 'help',
PRIMARY KEY (config_id),
UNIQUE KEY ix_config_name (config_name)
) ENGINE=MyISAM AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
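Typical usage is then just an ordinary insert, for example (the values here are made up):
INSERT INTO config (config_name, config_value, config_help)
VALUES ('site_name', 'My Example Site', 'Shown in the page header and browser title');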
Then the following PHP code retrieves the value for a key, or returns an empty string. It assumes $this->db is an open database connection. Name comparison is case-insensitive (both sides are lowered).
function getConfigValue($name) {
$retval='';
$db = $this->db;
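// NOTE: $name is interpolated directly into the SQL below; escape it or use a prepared statement in real code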
$sql = 'select config_value from config where LOWER(config_name)="'.strtolower($name).'"';
$result = $db->Query($sql);
if ($result) {
$row = $db->FetchAssoc($result);
$retval = $row['config_value'];
}
return $retval;
}
All mysql/php in this instance, but the general principle remains.
For MongoDB databases, I usually just make a new "table", but, for SQL databases, that entails a lot more (especially when others are also working on the same database; SQL isn't as malleable), so, you might want to be a bit more careful with it.
I would just create a table for configuration, as rainecc said, and then use a cache to load the whole table into memory :) and read it from there (the cache). That will work best.
Say you have a table:
`item`
With fields:
`id` VARCHAR( 36 ) NOT NULL
,`order` BIGINT UNSIGNED NOT NULL
And:
Unique(`id`)
And you call:
INSERT INTO `item` (
`item`.`id`,`item`.`order`
) SELECT uuid(), `item`.`order`+1
FROM `item`
MySQL will insert the same uuid into all of the newly created rows.
So if you start with:
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa, 0
bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb, 1
You'll end up with:
aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa, 0
bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb, 1
cccccccc-cccc-cccc-cccc-cccccccccccc, 1
cccccccc-cccc-cccc-cccc-cccccccccccc, 2
How do I command MySQL to create a different uuid for each row?
I know that the following works as expected in MSSQL:
INSERT INTO item (
id,[order]
) SELECT newid(), [order]+1
n.b. I know I could SELECT the results, loop through them and issue a separate INSERT command for each row from my PHP code but I don't want to do that. I want the work to be done on the database server where it's supposed to be done.
Turns out uuid() is generating a different uuid per row.
But instead of generating all the chunks randomly, as I would normally expect, MySQL appears to only vary part of the first chunk. Presumably to be more efficient.
So at a glance the uuids appear identical when in fact MySQL has altered part of the first chunk, e.g.
cccccccc-cccc-cccc-cccc-cccccccccccc
ccccdddd-cccc-cccc-cccc-cccccccccccc
cccceeee-cccc-cccc-cccc-cccccccccccc
ccccffff-cccc-cccc-cccc-cccccccccccc
I assume if there is a collision it would try again.
My bad.
MySQL won't allow expressions as a default value (at least prior to 8.0.13). You can work around this by allowing the field to be null. Then add insert/update triggers which, when null, set the field to uuid().
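A sketch of that workaround for the table from the question (the trigger name is my own; an update trigger would follow the same pattern):
ALTER TABLE `item` MODIFY `id` VARCHAR(36) NULL;

DELIMITER //
CREATE TRIGGER item_bi_uuid
BEFORE INSERT ON `item`
FOR EACH ROW
BEGIN
    IF NEW.`id` IS NULL THEN
        SET NEW.`id` = UUID();
    END IF;
END//
DELIMITER ;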
Please try with MID(UUID(),1,36) instead of uuid().
MySQL's UUID() function generates V1 UUIDs, which are split into time, sequence and node fields. If you call it on a single node, only a few bits in the time field will be different; this is referred to as temporal uniqueness. If you call it on different nodes at the exact same time, the node fields will be different; this is referred to as spatial uniqueness. Combining the two is very powerful and gives a guarantee of universal uniqueness, but it also leaks information about the when and where each V1 UUID was created, which can be a security issue. Oops.
V4 UUIDs are generally more popular now because they are built from (pseudo)random data and thus don't leak anything, but you'll need a different function to get them--and beware what they'll do to performance if you have a high INSERT volume; MySQL (at least for now) isn't very good at indexing (pseudo)random values, which is why V1 is what they give you.
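If you do want V4-style values in SQL, one option is to assemble them from RANDOM_BYTES() (available since MySQL 5.6.17); a rough sketch, not a vetted implementation:
SELECT LOWER(CONCAT(
    HEX(RANDOM_BYTES(4)), '-',
    HEX(RANDOM_BYTES(2)), '-',
    '4', SUBSTRING(HEX(RANDOM_BYTES(2)), 2), '-',   -- version nibble fixed to 4
    HEX(ASCII(RANDOM_BYTES(1)) DIV 64 + 8),
    SUBSTRING(HEX(RANDOM_BYTES(2)), 2), '-',        -- variant nibble in 8..b
    HEX(RANDOM_BYTES(6))
)) AS uuid_v4;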
First generate a unique string using the PHP uniqid() function and insert it into the ID field.