My Access Database is slow when finding non-matching records
SELECT
RT3_Data_Query.Identifier, RT3_Data_Query.store, RT3_Data_Query.SOURCE,
RT3_Data_Query.TRAN_CODE, RT3_Data_Query.AMOUNT,
RT3_Data_Query.DB_CR_TYPE, RT3_Data_Query.status,
RT3_Data_Query.TRAN_DATE, RT3_Data_Query.ACCEPTED_DATE,
RT3_Data_Query.RECONCILED_DATE
FROM
RT3_Data_Query
LEFT JOIN Debit_AO_Query ON RT3_Data_Query.[Identifier] = Debit_AO_Query.[Identifier]
WHERE
(((Debit_AO_Query.Identifier) Is Null));
I'm running a query over two queries I created. The last query, posted above, just compares these two queries and shows what is missing between them. I'm matching on an identifier that looks like this: 583005-01-20185804.33, which is a combination of store, date and amount.
Here is a link to the database:
https://wetransfer.com/downloads/15f912909fbe2ea0a5111e44b953d11a20190808195913/db9912
The query is slow because you don't use indexes on the tables and you join on concatenated fields (Identifier is Location & Date & Total)!
Each table needs a primary key or it is not a table! An autonumber field is a good start!
Indexing:
Add a field called id to each table, datatype autonumber, and make it the PK.
Add an index for each field compared in the join and the where clause (set all index properties (primary, unique, ignore nulls) to no)!
for table RT3_Data (because it is huge, create a copy first, then delete the data, or creating the index will fail on MaxLocksPerFile):
store
AMOUNT
TRAN_DATE
after that, reimport the data from the copy with this query:
INSERT INTO RT3_DATA
SELECT [Copy Of RT3_DATA].*
FROM [Copy Of RT3_DATA];
for table Debit_AO:
Location
Total
Date (should be renamed, as Date() is a VBA function)
Now change the query RT3_Data_Query Without Matching Debit_AO_Query to:
SELECT RT3_Data.store
,RT3_Data.SOURCE
,RT3_Data.TRAN_CODE
,RT3_Data.AMOUNT
,RT3_Data.DB_CR_TYPE
,RT3_Data.STATUS
,RT3_Data.TRAN_DATE
,RT3_Data.ACCEPTED_DATE
,RT3_Data.RECONCILED_DATE
FROM RT3_Data
LEFT JOIN Debit_AO
ON RT3_Data.[store] = Debit_AO.[Location]
AND RT3_Data.[AMOUNT] = Debit_AO.[Total]
AND RT3_Data.[TRAN_DATE] = Debit_AO.[DATE]
WHERE (
(
Debit_AO.Location IS NULL
AND Debit_AO.Total IS NULL
AND Debit_AO.[Date] IS NULL
)
);
Now the query executes in less than 10 seconds, and there are surely more optimizations possible (e.g. a composite index).
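To illustrate the idea rather than the exact Access steps, here is a minimal sketch using Python's built-in sqlite3 module (an assumption; the original is an Access database) with a few made-up rows. It indexes the join fields and runs the anti-join on the separate columns instead of a concatenated identifier. The Date column of Debit_AO is renamed to TranDate here, a hypothetical name chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE RT3_Data (id INTEGER PRIMARY KEY AUTOINCREMENT,
                           store INTEGER, AMOUNT REAL, TRAN_DATE TEXT);
    CREATE TABLE Debit_AO (id INTEGER PRIMARY KEY AUTOINCREMENT,
                           Location INTEGER, Total REAL, TranDate TEXT);
    -- One plain (non-unique) index per field used in the join.
    CREATE INDEX idx_rt3_store  ON RT3_Data (store);
    CREATE INDEX idx_rt3_amount ON RT3_Data (AMOUNT);
    CREATE INDEX idx_rt3_date   ON RT3_Data (TRAN_DATE);
    CREATE INDEX idx_ao_location ON Debit_AO (Location);
    CREATE INDEX idx_ao_total    ON Debit_AO (Total);
    CREATE INDEX idx_ao_date     ON Debit_AO (TranDate);
    """
)

# Made-up sample rows, not taken from the linked database.
conn.executemany(
    "INSERT INTO RT3_Data (store, AMOUNT, TRAN_DATE) VALUES (?, ?, ?)",
    [(5830, 804.33, "2018-05-01"),
     (5830, 120.00, "2018-05-02"),
     (5901, 50.25, "2018-05-01")],
)
conn.executemany(
    "INSERT INTO Debit_AO (Location, Total, TranDate) VALUES (?, ?, ?)",
    [(5830, 804.33, "2018-05-01"),
     (5901, 50.25, "2018-05-03")],
)

# Anti-join: keep RT3_Data rows for which the LEFT JOIN found no partner.
# Testing the right-hand primary key for NULL is enough, because id is
# never NULL in a real Debit_AO row.
unmatched = conn.execute(
    """
    SELECT r.store, r.AMOUNT, r.TRAN_DATE
    FROM RT3_Data AS r
    LEFT JOIN Debit_AO AS d
           ON r.store = d.Location
          AND r.AMOUNT = d.Total
          AND r.TRAN_DATE = d.TranDate
    WHERE d.id IS NULL
    ORDER BY r.id
    """
).fetchall()

for row in unmatched:
    print(row)
```

A composite index on (store, AMOUNT, TRAN_DATE) would be the natural next thing to try, since all three fields are compared together.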
Related
I want to join data from two tables, location and trip. When I query either table on its own it shows the data, but when I use the join query it returns no rows.
Here is my query:
SELECT trip.Trip_Name ,trip.Trip_ID , trip.Trip_Date , location.Location_Name , location.Location_ID
FROM location
INNER JOIN trip ON trip.Trip_ID = location.Location_ID
I think Trip_ID is not the same as Location_ID.
The trip table contains the trip's information,
and the location table contains the location's information.
So in order to join them, you should add a new column named "location_id" to trip as a foreign key,
with location.location_id as the primary key it references.
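A minimal sketch of that suggestion, using Python's sqlite3 module as a stand-in for the asker's database (an assumption) and made-up rows. The Location_ID column on trip is the new foreign key; the first query repeats the original join condition, the second joins on the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript(
    """
    CREATE TABLE location (
        Location_ID   INTEGER PRIMARY KEY,
        Location_Name TEXT NOT NULL
    );
    CREATE TABLE trip (
        Trip_ID     INTEGER PRIMARY KEY,
        Trip_Name   TEXT NOT NULL,
        Trip_Date   TEXT,
        -- New column: which location this trip belongs to.
        Location_ID INTEGER NOT NULL REFERENCES location (Location_ID)
    );
    INSERT INTO location VALUES (10, 'Harbour'), (20, 'Museum');
    INSERT INTO trip VALUES (1, 'Boat tour', '2019-08-01', 10),
                            (2, 'Art walk',  '2019-08-02', 20);
    """
)

# Original condition: compares two unrelated keys, so rows match only by
# coincidence (here: never).
wrong_rows = conn.execute(
    """
    SELECT trip.Trip_Name, location.Location_Name
    FROM location
    INNER JOIN trip ON trip.Trip_ID = location.Location_ID
    ORDER BY trip.Trip_ID
    """
).fetchall()

# Join on the foreign key instead.
right_rows = conn.execute(
    """
    SELECT trip.Trip_Name, location.Location_Name
    FROM location
    INNER JOIN trip ON trip.Location_ID = location.Location_ID
    ORDER BY trip.Trip_ID
    """
).fetchall()

print(wrong_rows)
print(right_rows)
```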
I have a table that has Act ID, and another table that has Act ID, percentage complete. This can have multiple entries for different days. I need the sum of the percentages added for each Act ID on the first table.
First table
Act ID
ZA.108381.080
ZA.108381.110 (total from second table)
ZA.108381.120
ZA.108476.020
Second table
Act ID Percent Date
ZA.108381.110 25% 5/25/19
ZA.108381.110 75 6/1/19
ZA.108381.120
ZA.108476.020
This would generally be considered bad practice. Your primary key should be uniquely identifiable for that specific table, and any other data related to that key should be stored in separate columns.
However, since an answer is not the place for a lecture: if you want to store multiple values in your Act ID column, I would suggest changing your primary key to something more generic like "RowID", then using VBA to insert multiple values into this field.
However, changing the primary key late in a database's life may cause a lot of issues or be difficult. So good luck.
Storing calculated values in a table has the disadvantage that these entries may become outdated as the other table is updated. It is preferable to query the tables on the fly to always get up-to-date results:
SELECT A.ActID, SUM(B.Percentage) AS SumPercent
FROM
table1 A
LEFT JOIN table2 B
ON A.ActID = B.ActID
GROUP BY A.ActID
ORDER BY A.ActID
This query allows you to add additional columns from the first table. If you only need the ActID from table 1, then you can simplify the query, and instead take it from table 2:
SELECT ActID, SUM(Percentage) AS SumPercent
FROM table2
GROUP BY ActID
ORDER BY ActID
If you have spaces or other special characters in a column or table name, you must escape it with []. E.g. [Act ID].
Do not change the IDs in the table. If you want to have the result displayed as the ID merged with the sum, change the query to
SELECT A.ActID & "." & Format(SUM(B.Percentage), "0.000") AS Result
FROM ...
See also: SQL GROUP BY Statement (w3schools)
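As a quick check of the LEFT JOIN plus GROUP BY approach, here is a small sketch in Python's sqlite3 module (an assumption; the question is about Access, whose SQL dialect differs in details such as Format and &). Table names follow the answer, the sample rows follow the question, and the EntryDate column name is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (ActID TEXT PRIMARY KEY);
    CREATE TABLE table2 (ActID TEXT, Percentage INTEGER, EntryDate TEXT);
    INSERT INTO table1 VALUES ('ZA.108381.110'),
                              ('ZA.108381.120'),
                              ('ZA.108476.020');
    INSERT INTO table2 VALUES ('ZA.108381.110', 25, '2019-05-25'),
                              ('ZA.108381.110', 75, '2019-06-01');
    """
)

# The LEFT JOIN keeps every ActID from table1; IDs without any entry in
# table2 get a NULL sum (None in Python) instead of disappearing.
rows = conn.execute(
    """
    SELECT A.ActID, SUM(B.Percentage) AS SumPercent
    FROM table1 AS A
    LEFT JOIN table2 AS B ON A.ActID = B.ActID
    GROUP BY A.ActID
    ORDER BY A.ActID
    """
).fetchall()

for act_id, total in rows:
    print(act_id, total)
```

If a zero is preferable to NULL for IDs without entries, wrapping the sum in COALESCE(..., 0) (Nz(..., 0) in Access) is one option.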
I have 2 tables. The first one, orderbook, has a field named
easy_order with datatype tinyint(1), and its primary key is id.
The second table is named execution and includes the fields buy_order_id and sell_order_id. Both fields reference the id key of the orderbook table.
I would like to write a SQL query to find all the rows from the execution table whose buy_order_id OR sell_order_id row in the orderbook table has an easy_order value of 1.
I use MySQL database.
I would write it this way, avoiding repetition and dual nested selects:
select execution.*
from execution
inner join orderbook on
orderbook.id in (execution.buy_order_id, execution.sell_order_id) and
orderbook.easy_order = 1;
(You may prefer conceptually to put orderbook.easy_order = 1 in a where clause. It will produce the same results, so it's a matter of preference.)
I made it work. The SQL is:
SELECT * FROM execution WHERE buy_order_id IN (SELECT id FROM orderbook WHERE easy_order=1)
OR
sell_order_id IN (SELECT id FROM orderbook WHERE easy_order=1);
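One difference between the two answers is worth checking: when both the buy order and the sell order of an execution are easy orders, the join version returns that execution once per matching order, while the IN-subquery version returns it once. A minimal sketch with made-up rows, run in Python's sqlite3 module rather than MySQL (an assumption; the join semantics shown here are standard SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orderbook (id INTEGER PRIMARY KEY, easy_order INTEGER NOT NULL);
    CREATE TABLE execution (id INTEGER PRIMARY KEY,
                            buy_order_id  INTEGER REFERENCES orderbook (id),
                            sell_order_id INTEGER REFERENCES orderbook (id));
    INSERT INTO orderbook VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO execution VALUES (100, 1, 2),  -- both orders are easy
                                 (101, 1, 3),  -- only the buy order is easy
                                 (102, 3, 3);  -- neither order is easy
    """
)

def ids(sql):
    """Run a query and return the execution ids it yields, in order."""
    return [row[0] for row in conn.execute(sql)]

join_ids = ids(
    """
    SELECT execution.id
    FROM execution
    INNER JOIN orderbook
            ON orderbook.id IN (execution.buy_order_id, execution.sell_order_id)
           AND orderbook.easy_order = 1
    ORDER BY execution.id
    """
)

subquery_ids = ids(
    """
    SELECT id FROM execution
    WHERE buy_order_id  IN (SELECT id FROM orderbook WHERE easy_order = 1)
       OR sell_order_id IN (SELECT id FROM orderbook WHERE easy_order = 1)
    ORDER BY id
    """
)

# Adding DISTINCT to the join version removes the repeated execution.
distinct_join_ids = ids(
    """
    SELECT DISTINCT execution.id
    FROM execution
    INNER JOIN orderbook
            ON orderbook.id IN (execution.buy_order_id, execution.sell_order_id)
           AND orderbook.easy_order = 1
    ORDER BY execution.id
    """
)

print(join_ids, subquery_ids, distinct_join_ids)
```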
I have a couple of tables in a MySQL database. For simplicity I'll just show some basic fields:
Table: sources:
sourceID int not null unique primary key
trigger int not null
<other stuff>
Table: sourceBS
id not null unique primary key
sourceID int not null,
name varchar(20),
SourceID in the sourceBS table is a foreign key referencing its namesake in sources, with the cascade option. I have tested this: if I delete an entry in sources, the corresponding entry in sourceBS also vanishes. Good.
I want to select some stuff from a join of sources and sourceBS, filtering based on a "sources" property. This should be easy, via a join which, I think, the foreign key should render pretty efficient, so:
SELECT sources.sourceID, sourceBS.*
FROM sources
LEFT JOIN sourceBS ON sources.sourceID = sourceBS.sourceID
WHERE trigger=1;
But when this runs, each row has "NULL" for the values returned from sourceBS, even though sourceBS contains entries matching the condition. I can verify this:
SELECT *
FROM sourceBS
WHERE sourceID IN (
SELECT sourceID
FROM sources
WHERE trigger=1
);
Here I get a proper set of results, i.e. non-null values. But, while this works as a proof of concept, it's no good in real life because I want to return a bunch of stuff from the "sources" table as well, and I don't want to have to run multiple queries in order to get what I want.
Returning to the join, if I replace the left join with an inner join, then no results are returned. It is as if, somehow, the "join" is simply not finding any matches in the sourceBS table, and yet they are there as the second query shows.
Why is this happening? I know that this join has a 1:M relationship, sourceBS could have multiple entries for a given entry in sources, but that should be OK. I can test exactly this type of join on other DBs, and it works.
OK, so I've solved this - it wasn't a transaction issue in the end: when I tried it on the original machine, it failed again. It was the order of the join. It appears that in my terminal I had the "ON" clause the other way round to above, that is, I was doing:
... LEFT JOIN sourceBS ON (sourceBS.blockSourceID=sources.sourceID)
which returns all the nulls. If I do it (as in the above code I pasted)
... LEFT JOIN sourceBS ON (sources.sourceID=sourceBS.sourceID)
it works. When I tried it the second time last night on a new machine, I'd used the second formulation.
Guess I'd better read up on joins to understand why this happened!
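One observation on the two snippets above: swapping the operands of = in an ON clause does not change the result of a join, but the first snippet also names a different column (blockSourceID instead of sourceID), which would be enough to explain the NULLs. A minimal sketch in Python's sqlite3 module (an assumption; the question uses MySQL), with made-up rows, a blockSourceID column whose contents are purely hypothetical, and the trigger column renamed to trigger_flag because TRIGGER is a keyword:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sources (sourceID INTEGER PRIMARY KEY,
                          trigger_flag INTEGER NOT NULL);
    CREATE TABLE sourceBS (id INTEGER PRIMARY KEY,
                           sourceID INTEGER NOT NULL REFERENCES sources (sourceID),
                           blockSourceID INTEGER,  -- hypothetical, unrelated values
                           name TEXT);
    INSERT INTO sources VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO sourceBS VALUES (1, 1, 901, 'a'),
                                (2, 1, 902, 'b'),
                                (3, 2, 903, 'c');
    """
)

def left_join(on_clause):
    """Run the question's LEFT JOIN with the given ON condition."""
    return conn.execute(
        f"""
        SELECT sources.sourceID, sourceBS.name
        FROM sources
        LEFT JOIN sourceBS ON {on_clause}
        WHERE sources.trigger_flag = 1
        ORDER BY sources.sourceID, sourceBS.id
        """
    ).fetchall()

rows_original = left_join("sources.sourceID = sourceBS.sourceID")
rows_swapped = left_join("sourceBS.sourceID = sources.sourceID")
rows_other_column = left_join("sourceBS.blockSourceID = sources.sourceID")

print(rows_original)      # matches found
print(rows_swapped)       # same rows: operand order is irrelevant
print(rows_other_column)  # no matches, so the sourceBS side is all NULL
```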
I have implemented a system on one of our SQL Servers (all currently 2008) that reads out the size and usage of indexes (not PKs) on our tables and stores the information historically in a dedicated database.
Each index is assigned its own SID in table a, and each time the index size or usage changes by a specified value, a new entry is created in table b and the old one is set to inactive (SCD2).
The job runs once a day.
Problem: On very rare occasions I get two rows of size for some indexes. So far it has only happened on 3 tables out of more than 1000 that are watched.
Where do I get the data:
FROM IndexInfo.IndexOverview io
JOIN IndexInfo.IndexSizeUsage isu ON io.IndexOverviewSid = isu.IndexOverviewSid
JOIN XYZ.sys.partitions p ON io.object_id = p.object_id AND io.index_id = p.index_id
JOIN ( SELECT container_id, SUM(total_pages) total_pages, SUM(used_pages) used_pages
FROM XYZ.sys.allocation_units
GROUP BY container_id) a ON p.partition_id= a.container_id
LEFT JOIN (SELECT object_id, index_id, ISNULL(100.0*(user_lookups+user_scans+user_seeks)
/ NULLIF((SELECT SUM(user_lookups+user_scans+user_seeks)
FROM XYZ.sys.dm_db_index_usage_stats indusinner
WHERE indusinner.database_id=indusout.database_id AND indusinner.object_id=indusout.object_id
GROUP by object_id),0),0) usage
FROM XYZ.sys.dm_db_index_usage_stats indusout WHERE database_id=DB_ID('XYZ')) usageselect ON io.object_id=usageselect.object_id AND io.index_id=usageselect.index_id
This is the important part.
IndexOverview (table a): one entry per created user index
IndexSizeUsage (table b): SCD2 data for usage and index size (by the time the error occurs only 1 active entry for each index)
The result:
Two rows in table b for one entry in table a, where the only difference between the rows is the information from the derived table a (the sys.allocation_units subquery). As that subquery is grouped by container_id, the only value I use for the join, SQL Server must somehow have a second partition for the index at the time the data is read.
I tried to replicate the problem by manually executing this query while reorganizing / rebuilding the index (which should never happen by the time the job is scheduled) and by doing large inserts, forcing the index to grow.
When and why does SQL Server temporarily create a second partition?