Does MS-SQL support in-memory tables?

Recently, I started changing some of our applications to support MS SQL Server as an alternative back end.
One of the compatibility issues I ran into is the use of MySQL's CREATE TEMPORARY TABLE to create in-memory tables that hold data for very fast access during a session with no need for permanent storage.
What is the equivalent in MS SQL?
A requirement is that I need to be able to use the temporary table just like any other, especially JOIN it with the permanent ones.

You can create table variables (in memory), and two different types of temp table:
--visible only to me, in memory (SQL 2000 and above only)
declare @test table (
Field1 int,
Field2 nvarchar(50)
);
--visible only to me, stored in tempDB
create table #test (
Field1 int,
Field2 nvarchar(50)
)
--visible to everyone, stored in tempDB
create table ##test (
Field1 int,
Field2 nvarchar(50)
)
Edit:
Following feedback I think this needs a little clarification.
#table and ##table will always be in TempDB.
Table variables will normally be in memory, but this is not guaranteed. SQL Server decides based on the query plan, and uses TempDB if it needs to.

@Keith
This is a common misconception: table variables are NOT necessarily stored in memory. In fact, SQL Server decides whether to keep the variable in memory or to spill it to TempDB. There is no reliable way (at least in SQL Server 2005) to ensure that table data is kept in memory. For more detailed info look here

You can declare a "table variable" in SQL Server 2005, like this:
declare @foo table (
Id int,
Name varchar(100)
);
You then refer to it just like a variable:
select * from @foo f
join bar b on b.Id = f.Id
No need to drop it - it goes away when the variable goes out of scope.

It is possible with MS SQL Server 2014.
See: http://msdn.microsoft.com/en-us/library/dn133079.aspx
Here is an example of the SQL code (from MSDN):
-- create a database with a memory-optimized filegroup and a container.
CREATE DATABASE imoltp
GO
ALTER DATABASE imoltp ADD FILEGROUP imoltp_mod CONTAINS MEMORY_OPTIMIZED_DATA
ALTER DATABASE imoltp ADD FILE (name='imoltp_mod1', filename='c:\data\imoltp_mod1') TO FILEGROUP imoltp_mod
ALTER DATABASE imoltp SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT=ON
GO
USE imoltp
GO
-- create a durable (data will be persisted) memory-optimized table
-- two of the columns are indexed
CREATE TABLE dbo.ShoppingCart (
ShoppingCartId INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
UserId INT NOT NULL INDEX ix_UserId NONCLUSTERED HASH WITH (BUCKET_COUNT=1000000),
CreatedDate DATETIME2 NOT NULL,
TotalPrice MONEY
) WITH (MEMORY_OPTIMIZED=ON)
GO
-- create a non-durable table. Data will not be persisted, data loss if the server turns off unexpectedly
CREATE TABLE dbo.UserSession (
SessionId INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=400000),
UserId int NOT NULL,
CreatedDate DATETIME2 NOT NULL,
ShoppingCartId INT,
INDEX ix_UserId NONCLUSTERED HASH (UserId) WITH (BUCKET_COUNT=400000)
) WITH (MEMORY_OPTIMIZED=ON, DURABILITY=SCHEMA_ONLY)
GO

There's a good blog post on this, but basically you prefix local temp tables with # and global temp tables with ## - e.g.
CREATE TABLE #localtemp
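To address the JOIN requirement from the question, a local temp table behaves like any other table in queries. A minimal sketch, assuming a hypothetical permanent table named Orders:

```sql
CREATE TABLE #localtemp (
    CustomerId INT,
    Total MONEY
);

INSERT INTO #localtemp (CustomerId, Total) VALUES (1, 19.99);

-- join the temp table with a permanent table like any other
SELECT o.OrderId, t.Total
FROM #localtemp AS t
JOIN Orders AS o ON o.CustomerId = t.CustomerId;

DROP TABLE #localtemp;  -- optional; dropped automatically when the session ends
```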

I understand what you're trying to achieve. Welcome to the world of a variety of databases!
SQL Server 2000 supports temporary tables created by prefixing # to the table name, making a locally accessible temporary table (local to the session), or ## for globally accessible temporary tables, e.g. #MyLocalTable and ##MyGlobalTable respectively.
SQL Server 2005 and above support both temporary tables (local, global) and table variables - watch out for new table variable functionality in SQL 2008 and release two! The difference between temporary tables and table variables is not so big, but lies in the way the database server handles them.
I would not wish to talk about older versions of SQL server like 7, 6, though I have worked with them and it's where I came from anyway :-)
It's common to think that table variables always reside in memory, but this is wrong. Depending on memory pressure and the database server's transaction volume, a table variable's pages may be pushed out of memory and written to tempdb, with the rest of the processing taking place there (in tempdb).
Please note that tempdb is a database on an instance holding no permanent objects; it is responsible for workloads involving side transactions such as sorting and other processing work that is temporary in nature. Table variables (usually holding smaller data) are kept in memory (RAM), making them faster to access, so using table variables with smaller data incurs less disk IO against the tempdb drive than temporary tables, which are always allocated in tempdb.
Table variables cannot be indexed after creation (only constraints declared inline, such as a PRIMARY KEY, give them an index), while temporary tables (both local and global) can be indexed for faster processing when the amount of data is large. So you know your choice for faster processing of larger data volumes. It's also worth noting that modifications to table variables are not affected by transactions and can't be rolled back, while those done on temporary tables can be!
In summary, table variables are better for smaller data, while temporary tables are better for larger data being processed temporarily. If you also want proper transaction control using transaction blocks, table variables are not an option for rolling back transactions, so you're better off with temporary tables in this case.
Lastly, temporary tables always increase disk IO since they always use tempdb, while table variables may not, depending on memory pressure.
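The indexing difference described above can be sketched in T-SQL; #big_batch and @small_batch are hypothetical names:

```sql
-- A temp table can be indexed after creation:
CREATE TABLE #big_batch (
    Id INT,
    Amount MONEY
);
CREATE NONCLUSTERED INDEX ix_big_batch_id ON #big_batch (Id);

-- A table variable only gets an index via an inline constraint:
DECLARE @small_batch TABLE (
    Id INT PRIMARY KEY,  -- implicit unique index
    Amount MONEY
);
```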
Let me know if you want tips on how to tune your tempdb for much faster performance!

The syntax you want is:
create table #tablename
The # prefix identifies the table as a temporary table.
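A temp table can also be created and populated in one step with SELECT ... INTO; a sketch assuming a hypothetical permanent table named Customers:

```sql
SELECT CustomerId, Name
INTO #tablename            -- created on the fly in tempdb
FROM Customers
WHERE Active = 1;
```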

CREATE TABLE #tmptablename
Use the hash/pound sign prefix

Related

How can I scan for new data in one database and send it to another one?

I have two databases on two servers.
The first one contains many tables which contains many codes.
An example is "Products" Table which contains the column "ProductCode".
Let's say there are 5 distinct records in that column, i.e. ProductCode1 -> ProductCode5.
The second database contains all the fields from each table defined in the first database.
I use the second database to provide definitions for each code found in the first database. I have migrated all the data from all the tables in the first db over to the new one manually via an excel file and script.
However, I would like to create an SQL function which scans the first database and when it finds new rows of data, it adds that data to the second database.
This would save me the hassle of querying all the tables individually and then adding the data manually, as I originally did.
Please note that both databases are stored on separate servers.
Is this possible to achieve?
Or is there any better options?
While there are backup, recovery and replication methods, if the two databases hold different data there is no single, convenient SQL function to migrate new data in all tables from one database to another. However, you can build an .sql script or stored procedure that runs duplicate-avoiding queries for new data.
Consider the following steps, where 1 and 2 are to be run for each table:
Create a FEDERATED table so the remote MySQL table is available for local querying while physical storage remains in the remote database. See the overview of the Federated Storage Engine. Note: this step only needs to be run once for each table.
CREATE TABLE federated_table (
id INT(20) NOT NULL AUTO_INCREMENT,
name VARCHAR(32) NOT NULL DEFAULT '',
other INT(20) NOT NULL DEFAULT '0',
PRIMARY KEY (id), INDEX name (name),
INDEX other_key (other)
)
ENGINE=FEDERATED
DEFAULT CHARSET=utf8mb4
CONNECTION='mysql://fed_user@remote_host:9306/federated/test_table';
The federated table schema must be identical to the remote table's, so align the data types in CREATE TABLE with the output of SHOW CREATE TABLE on the remote database.
Run duplicate-avoiding SQL queries, such as NOT IN, NOT EXISTS, or LEFT JOIN / IS NULL. Another method is EXCEPT (the same family as the UNION and INTERSECT operators):
INSERT INTO Products (Col1, Col2, Col3, ...)
SELECT Col1, Col2, Col3, ...
FROM my_federated_products_table
EXCEPT
SELECT Col1, Col2, Col3, ...
FROM Products
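Note that EXCEPT was only added in MySQL 8.0.31; on older versions the LEFT JOIN / IS NULL variant mentioned above does the same job. A sketch, assuming ProductCode uniquely identifies a row:

```sql
INSERT INTO Products (Col1, Col2, Col3)
SELECT f.Col1, f.Col2, f.Col3
FROM my_federated_products_table AS f
LEFT JOIN Products AS p ON p.ProductCode = f.ProductCode
WHERE p.ProductCode IS NULL;  -- keep only rows not already present
```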
Automate step 2 for each table in a single stored procedure (or .sql script) that can be re-run in the future.
DELIMITER $
CREATE PROCEDURE migrate_new_data()
BEGIN
-- ALL INSERT INTO STATEMENTS FROM FEDERATED TABLES
END $
DELIMITER ;
Run the procedure each time from Excel, Workbench, the command line, or wherever else; it can serve as your SQL function:
CALL migrate_new_data();
Looks like the FEDERATED storage engine is what you are looking for.
From the docs:
The FEDERATED storage engine lets you access data from a remote MySQL database without using replication or cluster technology. Querying a local FEDERATED table automatically pulls the data from the remote (federated) tables. No data is stored on the local tables.
Here is an article that shows how to configure it: https://medium.com/@techrandomthoughts/setting-up-federated-tables-in-mysql-8a17520b988c

How to solve a real time dwh delete process?

I am trying to create a near real time DWH. My first attempt is to load a table from my DWH into my application every 15 minutes.
I would like to avoid all the possible problems that a near real time DWH can face. One of those problems is querying an empty table that backs the values of a multiselect HTML tag.
To solve this I have thought of the following solution, but I do not know if there is a standard for this kind of problem.
I create a table like this to save the possible values of the multiselect:
CREATE TABLE providers (
provider_id INT PRIMARY KEY,
provider_name VARCHAR(20) NOT NULL,
delete_flag INT NOT NULL
)
Before the insert I update the table like this:
UPDATE providers SET delete_flag=1
I insert rows with an ETL process like this:
INSERT INTO providers (provider_name, delete_flag) VALUES ('Provider1',0)
From my app I query the table like this:
SELECT DISTINCT provider_name FROM providers
While the app keeps working and selecting all providers without duplicates (the source can delete, add or update a provider, so I always have to stay in sync with the source) and without showing an error because the table is empty, I can run this statement just after the insert statement:
DELETE FROM providers WHERE delete_flag=1
I think this is a good solution for small tables, or big tables with few changes, but what happens when a table is big? Is there a standard way to solve this kind of problem?
We cannot risk user usability while we are updating data.
There are two approaches to publishing a bulk change of a dimension without taking a maintenance window that would interrupt queries.
The first one simply uses a transactional concept, but performs badly for large data.
DELETE the replaced dimension records
INSERT the new or changed dimension records
COMMIT;
Note that you need no logical DELETE flag as the changes are visible only after the COMMIT - so the table is never empty.
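The three steps above can be sketched as a single transaction (assuming an InnoDB dimension table dim and a hypothetical staging table dim_new holding the refreshed rows):

```sql
START TRANSACTION;
DELETE FROM dim;                        -- remove the replaced dimension records
INSERT INTO dim SELECT * FROM dim_new;  -- insert the new or changed records
COMMIT;                                 -- readers see the old state until here
```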
As mentioned, this approach is not suitable if you have a large dimension with a lot of changes. In such a case you may use the EXCHANGE PARTITION feature, available as of MySQL 5.6.
You define a temporary table with the same structure as your dimension table, partitioned with only one partition containing all data.
CREATE TABLE dim_tmp (
id INT NOT NULL,
col1 VARCHAR(30),
col2 VARCHAR(30)
)
PARTITION BY RANGE (id) (
PARTITION pp VALUES LESS THAN (MAXVALUE)
);
Populate the table with the complete new dimension definition and switch this temporary table with your dimension table.
ALTER TABLE dim_tmp EXCHANGE PARTITION pp WITH TABLE dim;
After this statement the data from the temporary table will be stored (published) in your dimension table (new definition) and the old state of the dimension will be stored in the temporary table.
Please check the documentation link above for constraints of this feature.
Disclaimer: I use this feature in Oracle DB and I have no experience with it in MySQL.

Is there a way to cache a View so that queries against it are quick?

I'm extremely new to views, so please forgive me if this is a silly question. I have a view that is really helpful in optimizing a pretty unwieldy query and lets me select against a small subset of its columns. However, I was hoping the view would actually be stored somewhere, so that selecting against it wouldn't take very long.
I may be mistaken, but I get the sense (from the speed with which CREATE VIEW executes and from the duration of my queries against the view) that the view is actually run as a query prior to the external query, every time I select against it.
I'm really hoping I'm overlooking some mechanism whereby CREATE VIEW does the hard work of running the view's query up front, so that my subsequent selects against this static view would be really swift.
BTW, I totally understand that obviously this VIEW would be a snapshot of the data that existed at the time the VIEW was created and wouldn't reflect any new info that was inserted/updated subsequent to the VIEW's creation. That's actually EXACTLY what I need.
TIA
What you want to do is materialize your view. Have a look at http://www.fromdual.com/mysql-materialized-views.
What you're talking about are materialised views, a feature of (at least) DB2 but not MySQL as far as I know.
There are ways to emulate them by creating/populating a table periodically, or on demand, but a true materialised view knows when the underlying data has changed, and only recalculates if required.
If the data will never change once the view is created (as you seem to indicate in a comment), just create a brand new table to hold the subset of data and query that. People always complain about slow speed but rarely about data storage requirements :-)
You can do this with:
A MySQL Event
A separate table (for caching)
The REPLACE INTO ... SELECT statement.
Here's a working example.
-- create dummy data for testing
CREATE TABLE MyTable (
id INT NOT NULL,
groupvar INT NOT NULL,
myvar INT
);
INSERT INTO MyTable VALUES
(1,1,1),
(2,1,1),
(3,2,1);
-- create the view, making sure rows have a unique identifier (groupvar)
CREATE VIEW MyView AS
SELECT groupvar, SUM(myvar) as myvar_sum
FROM MyTable
GROUP BY groupvar;
-- create cache table, setting primary key to unique identifier (groupvar)
CREATE TABLE MyView_Cache (PRIMARY KEY (groupvar))
SELECT *
FROM MyView;
-- create a table to keep track of when the cache has been updated (optional)
CREATE TABLE MyView_Cache_updated (update_id INT NOT NULL AUTO_INCREMENT, last_updated DATETIME, PRIMARY KEY (update_id));
-- create event to update cache table (e.g., daily)
DELIMITER |
CREATE EVENT MyView_Cache_Event
ON SCHEDULE EVERY 1 DAY STARTS CURRENT_TIMESTAMP + INTERVAL 1 HOUR
DO
BEGIN
REPLACE INTO MyView_Cache
SELECT *
FROM MyView;
INSERT INTO MyView_Cache_updated
SELECT NULL, NOW() AS last_updated;
END |
DELIMITER ;
You can now query MyView_Cache for faster response times, and query MyView_Cache_updated to inform users of the last time the cache was updated (in this example, daily).
Since a view is basically a stored SELECT statement, you can use the query cache to improve performance.
But first you should check whether:
you can add indexes to the tables involved to speed up the query (use EXPLAIN)
the data isn't changing very often, in which case you can materialize the view (make snapshots)
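A minimal sketch of the snapshot approach, materializing a view into a plain table that is rebuilt on demand (MyView is a placeholder name):

```sql
DROP TABLE IF EXISTS MyView_Snapshot;
CREATE TABLE MyView_Snapshot AS
SELECT * FROM MyView;

-- queries now hit the static snapshot instead of re-running the view's SELECT
SELECT * FROM MyView_Snapshot;
```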
Use a materialised view. It can store data such as counts and sums, but after updating the base table you need to refresh the view to get correct results, as it is not updated automatically. Moreover, after the first query against the view the results are stored in cache, so subsequent queries are cheaper than querying the table itself; it gets efficient from the second time onward.

Alternative for a MySQL temporary table in Oracle

I noticed that the concept of temporary tables differs between these two systems, and I have a question. I have the following scenario in MySQL:
Drop temporary table 'a' if exists
Create temporary table 'a'
Populate it with data through a stored procedure
Use the data in another stored procedure
How can I implement the same scenario in Oracle? Can I (in one procedure preferable) create a temporary table, populate it, and insert data in another (non-temporary) table?
I think that I can use a (global) temporary table which truncates on commit, and avoid steps 1&2, but I need someone else's opinion too.
In Oracle, you very rarely need a temporary table in the first place. You commonly need temporary tables in other databases because those databases do not implement multi-version read consistency and there is the potential that someone reading data from the table would be blocked while your procedure runs or that your procedure would do a dirty read if it didn't save off the data to a separate structure. You don't need global temporary tables in Oracle for either of these reasons because readers don't block writers and dirty reads are not possible.
If you just need a temporary place to store data while you perform PL/SQL computations, PL/SQL collections are more commonly used than temporary tables in Oracle. This way, you're not pushing data back and forth from the PL/SQL engine to the SQL engine and back to the PL/SQL engine.
CREATE PROCEDURE do_some_processing
AS
TYPE emp_collection_type IS TABLE OF emp%rowtype;
l_emps emp_collection_type;
CURSOR emp_cur
IS SELECT *
FROM emp;
BEGIN
OPEN emp_cur;
LOOP
FETCH emp_cur
BULK COLLECT INTO l_emps
LIMIT 100;
EXIT WHEN l_emps.count = 0;
FOR i IN 1 .. l_emps.count
LOOP
<<do some complicated processing>>
END LOOP;
END LOOP;
END;
You can create a global temporary table (outside of the procedure) and use the global temporary table inside your procedure just as you would use any other table. So you can continue to use temporary tables if you so desire. But I can count on one hand the number of times I really needed a temporary table in Oracle.
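For completeness, a sketch of the global temporary table DDL the question asks about; it is created once, outside any procedure, and in this variant empties at every COMMIT:

```sql
CREATE GLOBAL TEMPORARY TABLE gtt_work (
    id   NUMBER,
    name VARCHAR2(100)
) ON COMMIT DELETE ROWS;  -- use ON COMMIT PRESERVE ROWS to keep rows for the session
```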
You are right, temporary tables will work for you.
If you decide to stick with regular tables, you may want to use the advice @Johan gave, along with
ALTER TABLE <table name> NOLOGGING;
to make this perform a bit faster.
I see no problem in the scheme you are using.
Note that it doesn't have to be a temp table; you can use a sort of memory table as well.
Do this by creating a table as usual, then do
ALTER TABLE <table_name> CACHE;
This will prioritize the table for storage in memory.
As long as you fill and empty the table in short order, you don't need to do steps 1 & 2.
Remember the cache modifier is just a hint. The table still ages in the cache and will be pushed out of memory eventually.
Just do:
Populate cache-table with data through a stored procedure
Use the data in another stored procedure, but don't wait too long.
2a. Clear the data in the cache table.
In your MySQL version, I didn't see a step 5 to drop table 'a'. So, if you don't mind having the data persist in the table, you could also use a materialized view and simply refresh it on demand. With a materialized view you do not need to manage any INSERT statements; just include the SQL:
CREATE MATERIALIZED VIEW my_mv
NOCACHE -- NOCACHE/CACHE: Optional, cache places the table in the most recently used part of the LRU blocks
BUILD IMMEDIATE -- BUILD DEFERRED or BUILD IMMEDIATE
REFRESH ON DEMAND
WITH PRIMARY KEY -- Optional: creates PK column
AS
SELECT *
FROM ....;
Then in your other stored procedure, call:
BEGIN
dbms_mview.refresh ('my_mv', 'c'); -- 'c' = Complete
END;
That said, a global temporary table will work as well, but you manage the insert and exceptions.

MySQL: what is a temporary table?

What is the purpose of a temporary table like in the following statement? How is it different than a regular table?
CREATE TEMPORARY TABLE tmptable
SELECT A.* FROM batchinfo_2009 AS A, calibration_2009 AS B
WHERE A.reporttime LIKE '%2010%'
AND A.rowid = B.rowid;
Temp tables are kept only for the duration of your session with the server. Once the connection is severed for any reason, the table is automatically dropped. They're also only visible to the current connection, so multiple users can use the same temporary table name without conflict.
A temporary table ceases to exist when the connection is closed. Its purpose is, for instance, to hold a temporary result set that has to be worked on before it is used.
Temporary tables are mostly used to store query results that need further processing, for instance if the result needs to be queried or refined again or is going to be used at different occasions by your application. Usually the data stored in a temporary database contains information from several regular tables (like in your example).
Temporary tables are deleted automatically when the current database session is terminated.
Support for temporary tables exists to allow procedural paradigms in a set-based 4GL, either because the coder has not switched their 3GL mindset to the new paradigm or to work around a performance or syntax issue (perceived or otherwise).