SQL join on different versions of table - mysql

I have multiple suppliers of data, which I will call A, B, & C. A has a database that is updated monthly. B & C (in reality my application has more than 100 such suppliers, not just two) reference a table in A and state which monthly release of A they are using. A may update, add, or delete records in each monthly release, but most of the records from A stay the same. I currently use multiple databases, and specify the database to use in each join.
What is a good way to store the data from A so that joins from B & C to that data work efficiently? Would a NoSQL database or an ORDBMS solve this issue?

If ...
you're using MySQL and
your user id has appropriate privileges and
your databases all live on a single server
you can use tables from multiple databases quite easily, just by qualifying the table names in your queries. For example, this sort of query should perform about as well as a join within a single database, provided the join columns are indexed.
SELECT a.id, b.vendor
FROM A.stock a
JOIN B.shipments b ON a.sku = b.sku
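A minimal runnable sketch of the same pattern, using Python's built-in sqlite3 module as a stand-in for MySQL. SQLite's ATTACH DATABASE plays the role of several databases living on one server. The schema names db_a and db_b, the column layout, and the sample rows are assumptions made for illustration only.

```python
import sqlite3


def cross_database_join():
    """Join two tables that live in different (attached) databases."""
    conn = sqlite3.connect(":memory:")
    # Two attached in-memory databases stand in for supplier databases A and B.
    conn.execute("ATTACH DATABASE ':memory:' AS db_a")
    conn.execute("ATTACH DATABASE ':memory:' AS db_b")
    conn.execute("CREATE TABLE db_a.stock (id INTEGER PRIMARY KEY, sku TEXT)")
    conn.execute("CREATE TABLE db_b.shipments (sku TEXT, vendor TEXT)")
    conn.executemany("INSERT INTO db_a.stock VALUES (?, ?)",
                     [(1, "X1"), (2, "X2")])
    conn.executemany("INSERT INTO db_b.shipments VALUES (?, ?)",
                     [("X1", "Acme"), ("X3", "Globex")])
    # Qualifying each table with its database name is all the join needs.
    rows = conn.execute("""
        SELECT a.id, b.vendor
        FROM db_a.stock a
        JOIN db_b.shipments b ON a.sku = b.sku
        ORDER BY a.id
    """).fetchall()
    conn.close()
    return rows
```

Only the sku present in both databases produces a row; the qualified names are the only difference from a single-database join.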

Related

Mysql query joining 5 tables

I am trying to join 5 tables to get the different currencies recorded in different tables against the same contract id.
The query returns results when I join any three of the tables, but when I add one more table the server becomes unresponsive until I have to kill the process.
Please help me find where I am making a mistake.
SELECT c.department_id,
c.contract_id,
c.seller_id,
c.buyer_id,
c.contract_ratecurrency AS contractcurrency,
b.currency_id AS billcurrency,
s.saleinv_currency AS saleinvcurrency,
cm.currency_id AS commissioncurrency,
sl.currency_id AS cmlogcurrency,
c.contract_iscancel
FROM tbl_contracts c
JOIN tbl_contract_bill b ON c.contract_id=b.contract_id
JOIN tbl_contract_saleinvoice s ON c.contract_id =s.contract_id
JOIN tbl_commission_payment cm ON c.department_id = cm.department_id
JOIN tbl_saleinvoice_commission_log sl ON c.department_id = sl.department_id
WHERE (c.contract_ratecurrency <> s.saleinv_currency
OR c.contract_ratecurrency <> b.currency_id
OR s.saleinv_currency <> b.currency_id
OR cm.currency_id <> sl.currency_id
OR c.contract_ratecurrency <> cm.currency_id
OR s.saleinv_currency <> cm.currency_id
OR b.currency_id <> cm.currency_id)
AND (c.contract_iscancel =0)
The required result should be:
ccontractid,csellerid,cbuyerid,ccurrency,bcurrency,scurrency,cmcurrency,slcurrency
101,25,50,1,1,2,3,1
102,28,16,2,3,1,3,2
It looks like you are having performance issues. You have several options for optimizing your database structure:
Add indexes on your join keys.
Take a look at this join in your statement:
JOIN tbl_saleinvoice_commission_log sl ON c.department_id = sl.department_id
Adding an index on department_id to the tbl_saleinvoice_commission_log table will help performance a lot. Note that MySQL does not let you pick an arbitrary clustered index (with InnoDB it is built on the primary key), so for this column it would be an ordinary secondary index.
Partitioning is another way to increase performance, but you need to check your database structure to see whether it works for you.
Also, I believe your tables are one-to-many, so check how many rows you are trying to retrieve. Joining tbl_commission_payment and tbl_saleinvoice_commission_log on department_id alone pairs every payment row with every log row of the same department, so the row count multiplies quickly. If your database server cannot process that many rows, you might need to improve your hardware or raise the CPU usage limits of your database daemon.
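To see how quickly a join on department_id can multiply rows, here is a small sketch using Python's built-in sqlite3 module in place of MySQL. The table names come from the question; the reduced column lists, the index names, and the sample rows are assumptions for illustration.

```python
import sqlite3


def count_joined_rows():
    """Count rows produced by joining two detail tables on department_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tbl_contracts (contract_id INTEGER, department_id INTEGER);
        CREATE TABLE tbl_commission_payment (department_id INTEGER, currency_id INTEGER);
        CREATE TABLE tbl_saleinvoice_commission_log (department_id INTEGER, currency_id INTEGER);
        -- Secondary indexes on the join column (index names are hypothetical).
        CREATE INDEX idx_cm_department ON tbl_commission_payment (department_id);
        CREATE INDEX idx_sl_department ON tbl_saleinvoice_commission_log (department_id);
        -- One contract, three payments and four log rows, all in department 7.
        INSERT INTO tbl_contracts VALUES (101, 7);
        INSERT INTO tbl_commission_payment VALUES (7, 1), (7, 2), (7, 3);
        INSERT INTO tbl_saleinvoice_commission_log VALUES (7, 1), (7, 1), (7, 2), (7, 3);
    """)
    (row_count,) = conn.execute("""
        SELECT COUNT(*)
        FROM tbl_contracts c
        JOIN tbl_commission_payment cm ON c.department_id = cm.department_id
        JOIN tbl_saleinvoice_commission_log sl ON c.department_id = sl.department_id
    """).fetchone()
    conn.close()
    return row_count
```

A single contract already yields 3 x 4 = 12 rows here. Indexes make each lookup cheaper, but they do not reduce the number of rows such a join has to produce.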

SQL most efficient way to check if rows from one table are also present in another

I have two DB tables, each containing email addresses.
One is in MSSQL with 1.500.000.000 entries.
One is in MySQL with 70.000.000 entries.
I now want to check how many email addresses are present in both tables.
Which approach would be the fastest:
1. Download both datasets as CSV, load them into memory, and compare in program code
2. Use the DB queries to get the overlapping result set.
If 2 is better: what would be a suggested SQL query?
I would go with a DBQuery. Set up a linked server connection between the two DBs (probably on the MSSQL side), and use a simple inner join query to produce the list of e-mails that occur in both tables:
select a.emailAddress
from MSDBServ.DB.dbo.Table1 a
join MySqlServ.DB..Table2 b
on a.EmailAddress = b.EmailAddress
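As a small-scale sketch of the inner-join approach, the following uses Python's built-in sqlite3 module with both tables in one in-memory database instead of a linked server. The table names table1 and table2, the function name, and the sample addresses are assumptions for illustration.

```python
import sqlite3


def count_shared_emails(first, second):
    """Count distinct addresses that occur in both lists, using an inner join."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (EmailAddress TEXT)")
    conn.execute("CREATE TABLE table2 (EmailAddress TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(e,) for e in first])
    conn.executemany("INSERT INTO table2 VALUES (?)", [(e,) for e in second])
    # COUNT(DISTINCT ...) keeps repeated addresses from being counted twice.
    (shared,) = conn.execute("""
        SELECT COUNT(DISTINCT a.EmailAddress)
        FROM table1 a
        JOIN table2 b ON a.EmailAddress = b.EmailAddress
    """).fetchone()
    conn.close()
    return shared
```

For example, count_shared_emails(["a@x.com", "b@x.com", "b@x.com"], ["b@x.com", "c@x.com"]) counts b@x.com once, even though it appears twice in the first list.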
Finding the set difference is going to take more processor power (and it will produce at least 1.4b results even in the best-case scenario of every MySQL row matching an MSSQL row), but the query isn't actually that much different. You still want a join, but now you want that join to return all records from both tables whether they could be joined or not, and then you specifically want the results that weren't joined (in which case one side's field will be null):
select a.EmailAddress, b.EmailAddress
from MSDBServ.DB.dbo.Table1 a
full join MySqlServ.DB..Table2 b
on a.EmailAddress = b.EmailAddress
where a.EmailAddress IS NULL OR b.EmailAddress IS NULL
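Not every engine supports FULL JOIN (MySQL, for instance, does not). A common way to get the same unmatched rows is to combine two LEFT JOINs with UNION ALL. The sketch below shows that variant with Python's built-in sqlite3 module; the table names, function name, and sample data are assumptions for illustration.

```python
import sqlite3


def unmatched_emails(first, second):
    """Return (left, right) pairs for addresses present in only one table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (EmailAddress TEXT)")
    conn.execute("CREATE TABLE table2 (EmailAddress TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(e,) for e in first])
    conn.executemany("INSERT INTO table2 VALUES (?)", [(e,) for e in second])
    rows = conn.execute("""
        -- Rows of table1 with no partner in table2 ...
        SELECT a.EmailAddress, NULL
        FROM table1 a
        LEFT JOIN table2 b ON a.EmailAddress = b.EmailAddress
        WHERE b.EmailAddress IS NULL
        UNION ALL
        -- ... plus rows of table2 with no partner in table1.
        SELECT NULL, b.EmailAddress
        FROM table2 b
        LEFT JOIN table1 a ON a.EmailAddress = b.EmailAddress
        WHERE a.EmailAddress IS NULL
    """).fetchall()
    conn.close()
    return rows
```

Each returned pair has exactly one non-null side, mirroring the IS NULL filter of the FULL JOIN query.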
You could run a SQL query to check which email addresses are present in both tables. The first column is the number of matching row pairs for an address; the second is the email address.
SELECT COUNT(*), A.emailAddr
FROM table1 A
INNER JOIN table2 B
ON A.emailAddr = B.emailAddr
GROUP BY A.emailAddr
Table1 has the 70,000,000 email addresses, table2 has the 1,500,000,000. I use Oracle; MySQL provides an UPPER function as well.
Select EmailAddress from table1 where Upper(emailaddress) in (select Upper(emailaddress) from table2)
This is quicker than comparing spreadsheets, and it assumes both tables are in the same database.
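A short sketch of the case-insensitive comparison, run with Python's built-in sqlite3 module (SQLite's UPPER only folds ASCII letters, which is enough for this illustration). The table names follow the answer; the function name and sample addresses are assumptions.

```python
import sqlite3


def emails_in_both_ignoring_case(table1_emails, table2_emails):
    """Return table1 addresses that also occur in table2, ignoring letter case."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (EmailAddress TEXT)")
    conn.execute("CREATE TABLE table2 (EmailAddress TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(e,) for e in table1_emails])
    conn.executemany("INSERT INTO table2 VALUES (?)", [(e,) for e in table2_emails])
    # Both sides are upper-cased before comparison, so case differences do not matter.
    rows = conn.execute("""
        SELECT EmailAddress
        FROM table1
        WHERE UPPER(EmailAddress) IN (SELECT UPPER(EmailAddress) FROM table2)
        ORDER BY EmailAddress
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

The address is returned as stored in table1, so "Ann@x.com" matches "ANN@X.COM" but keeps its original spelling.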

How to join two tables in different databases in Workbench

How does one join two tables that are in different databases using the SQL runner in MySQL Workbench?
I have searched for this and explored the interface but could not a find a solution.
If it is not possible with Workbench, is it possible with another client?
Note: the databases exist under different connections and ports!
You can simply join tables from different databases. You need to qualify each table with its database name in the FROM clause. To keep the query short, give each table an alias:
SELECT a.*, -- displays all columns of dba.`UserName`
b.`Message`
FROM dba.`UserName` a
INNER JOIN dbB.`PrivateMessage` b -- or LEFT JOIN to keep every UserName row, matched or not
ON a.`username` = b.`username`
So just adding the database name before the table name will solve your problem.
In that case (different connections and ports) you can use the FEDERATED storage engine to join tables across two MySQL servers. Please refer to the documentation to learn more:
http://dev.mysql.com/doc/refman/5.0/en/federated-storage-engine.html

How to do a selecting join if two tables have a specfic equal

I have two tables in two different databases:
Let's say I have a database users and a database questions. users has a table called USER_STATS with the columns
USER_ID,
EDU_INT1,
EDU_INT2,
EDU_INT3
and questions has a table called questions with a column called CLASS_SUBJECTS.
I want to run a query that displays * from QUESTIONS where CLASS_SUBJECTS equals any of EDU_INT1, EDU_INT2, or EDU_INT3, where the EDU_INT values are taken from a specific USER_ID.
Any ideas? This is somewhat hard because of the two different databases.
When querying across two databases, you just need to prefix the table name with the database name and a dot, as in database.table.column, and the database connection user must have access to both databases.
Beyond that, this is a regular JOIN, but with a more complex ON clause using 3 conditions OR'd together:
SELECT
q.*
FROM
questions.questions q
JOIN users.USER_STATS u ON (
q.CLASS_SUBJECTS = u.EDU_INT1
OR q.CLASS_SUBJECTS = u.EDU_INT2
OR q.CLASS_SUBJECTS = u.EDU_INT3
)
WHERE u.USER_ID = <some user id>
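The three OR'd conditions can also be written as a single IN list, which matches the same rows. Below is a runnable sketch using Python's built-in sqlite3 module, with attached in-memory databases standing in for the users and questions databases. The QUESTION_ID column, the function name, and the sample rows are assumptions for illustration.

```python
import sqlite3


def questions_for_user(user_id):
    """Return ids of questions whose subject matches any EDU_INT of the user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS users")
    conn.execute("ATTACH DATABASE ':memory:' AS questions")
    conn.execute("""CREATE TABLE users.USER_STATS
                    (USER_ID INTEGER, EDU_INT1 INTEGER, EDU_INT2 INTEGER, EDU_INT3 INTEGER)""")
    # QUESTION_ID is a hypothetical column added so results can be identified.
    conn.execute("""CREATE TABLE questions.questions
                    (QUESTION_ID INTEGER, CLASS_SUBJECTS INTEGER)""")
    conn.executemany("INSERT INTO users.USER_STATS VALUES (?, ?, ?, ?)",
                     [(1, 10, 20, 30), (2, 40, 50, 60)])
    conn.executemany("INSERT INTO questions.questions VALUES (?, ?)",
                     [(100, 10), (101, 20), (102, 40), (103, 99)])
    # IN (...) here is equivalent to the three equality tests joined by OR.
    rows = conn.execute("""
        SELECT q.QUESTION_ID
        FROM questions.questions q
        JOIN users.USER_STATS u
          ON q.CLASS_SUBJECTS IN (u.EDU_INT1, u.EDU_INT2, u.EDU_INT3)
        WHERE u.USER_ID = ?
        ORDER BY q.QUESTION_ID
    """, (user_id,)).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Passing the user id as a bound parameter (the ? placeholder) takes the place of the <some user id> literal in the query above.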
The database name reference might work if both are running in the same instance on a single server. If you need to scale to multiple servers, you might need to use replication and possibly set up federated tables.
http://dev.mysql.com/doc/refman/5.5/en/federated-storage-engine.html

MS-Access DISTINCTROW doesn't work anymore if linked tables are stored on SQL2008-Server

I have a query in MS-Access like this:
select DISTINCTROW companies.* from companies, contacts, companies left join contacts on contacts.com_uid = companies.com_uid (This is the ms-access form of a standard "left-join")
[companies] and [contacts] are linked views on a SQL Server 2008; the ODBC driver is "SQL Server Native Client 10.0". The views are defined as "select * from [companies] where deleted = 0" and "select * from [contacts] where deleted = 0".
The result is wrong: each company is shown as many times as it has contacts.
If the views are stored on SQL Server 2000 and linked with the ODBC driver "SQL Server", everything is fine: every company is shown exactly once.
Is there any solution to get the correct result with DISTINCTROW again?
I'm surprised it executes that query at all. You're specifying the table "contacts" twice.
Your LEFT JOIN should return every row from "companies". Since you're not retrieving any columns from contacts, I'm pretty sure your query is equivalent to
SELECT *
FROM companies
as long as "companies" means what it does in ordinary language.
If that turns out not to be the case, you can hand the burden off to SQL Server either by creating a view in SQL Server, or by creating a passthrough query in Access. A passthrough query will have to be written in your server's dialect of SQL (SQL Server 2008 dialect of SQL).
Your revision, reproduced below, does nothing to change my earlier comments.
select DISTINCTROW companies.*
from companies, contacts, companies
left join contacts on contacts.com_uid = companies.com_uid
(This is the ms-access form of a standard "left-join")
That's not Access's form of a left join. Access won't allow this:
from companies, contacts, companies
left join contacts
because you're now specifying both tables twice.
Based on your edit, I'd say the query you're trying to write is still equivalent to
SELECT *
FROM companies
What do you get if you run that?
Let's stop talking about the syntax of a left join in MS Access. The fact is that the linked tables are views on SQL Server 2000:
create view [companies] as
select * from [TabCompanies] where deleted = 0
and
create view [contacts] as
select * from [TabContcts] where deleted = 0
These views are ODBC-linked tables in an MS Access 2003/2007 mdb.
The problem shows up in MS Access with a query like
select distinctrow [companies].* from [companies] left join [contacts] on [companies].com_uid = [contacts].com_uid where [contacts].[function] like 'C*'
(let's forget the alternative syntax and look at the result, assuming that the left join works without a syntax error)
DISTINCTROW is an MS Access feature that SQL Server does not know. From my point of view the result is the same as with DISTINCT, but it also works when there are columns of an image datatype, for example.
Altogether, we would expect the same result as Catcall describes in his answer, "select * from companies", BUT IT IS NOT. Why?
This is only an excerpt of the whole query and may make no sense for production, but it shows the changed behaviour when SQL Server 2008 is connected.
The purpose of DISTINCTROW is to make editable the two sides of an N:1 join. With a Cartesian product (from companies, contacts, companies), the result cannot be editable, so DISTINCTROW has no advantage over DISTINCT.
Secondly, no matter what you say, it is not possible to have the same table twice in a FROM clause without an alias. The SQL you've posted could not have worked in any version of Access.
The only way I can possibly imagine there's any sense in what you've posted is if you've omitted a WHERE clause.
EDIT BASED ON COMMENTS:
This should work:
SELECT DISTINCT companies.*
FROM companies INNER JOIN contacts ON companies.com_uid = contacts.com_uid
WHERE contacts.function LIKE "C*"
First off, I'd assume a normal N:1 relationship between contacts and companies (i.e., many contact records are linked to any single company record), so with both tables in the FROM clause, you do need a DISTINCT to return a single row for each company.
Secondly, if you place criteria on the table on the many side of the JOIN, there's no reason to attempt to use a LEFT JOIN, as it won't change the records returned (use a LEFT JOIN when you want to return records regardless of whether or not there are records in the table on the many side of the JOIN). So, an INNER JOIN is going to do the job for you, and be more efficient (outer JOINs are just slower, even with criteria).
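The effect of DISTINCT on an N:1 join can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than Access, so the wildcard is written 'C%' instead of Access's "C*". The name column, the function name, and the sample rows are assumptions for illustration.

```python
import sqlite3


def companies_with_c_contacts(distinct):
    """Return companies having a contact whose function starts with 'C'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE companies (com_uid INTEGER, name TEXT);
        CREATE TABLE contacts (com_uid INTEGER, function TEXT);
        INSERT INTO companies VALUES (1, 'Acme'), (2, 'Globex');
        -- Acme has two matching contacts, Globex has none.
        INSERT INTO contacts VALUES (1, 'CEO'), (1, 'CFO'), (2, 'Engineer');
    """)
    keyword = "DISTINCT" if distinct else ""
    rows = conn.execute(f"""
        SELECT {keyword} companies.*
        FROM companies
        INNER JOIN contacts ON companies.com_uid = contacts.com_uid
        WHERE contacts.function LIKE 'C%'
        ORDER BY companies.com_uid
    """).fetchall()
    conn.close()
    return rows
```

Without DISTINCT the company comes back once per matching contact; with DISTINCT it comes back once. Globex is excluded in both cases because the criteria on contacts filter it out, which is why a LEFT JOIN would not change the result.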