Index data from multiple tables into Solr - MySQL

I have three tables: TableA, TableB and TableC.
TableA
idA ------------ PK
col1A
TableB
idB ----------- PK
col1B
TableC
idC ---------- PK
col1C
I am indexing all the data into a single Solr core, so data from TableC could be overwritten by data from TableB or TableA, and vice versa, because the primary keys are auto-generated and the same value can occur in different tables. How do I solve this problem?
I have two solutions:
1) Append the table name to the primary key (pk_tablename) to make the unique id in Solr.
2) Create a separate core for each table.
Which do you suggest is best?
In my business domain the tables can have millions of records.
Please advise.

Solution 1 should be fine. You can store data from different tables in a single core if you want to search them all with a single query. Your primary key is fine. Along with that you can also store the table name in another field, so your docs will look like:
{
unique_id: 1234_A,
id: 1234,
table: A,
data: <text field>
}
Storing the table name will help you perform searches restricted to some table(s) only.
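As a rough illustration of the document layout above, here is a minimal Python sketch that builds such documents before they are sent to Solr. The helper name make_solr_doc and the sample rows are made up for the example; only the field names (unique_id, id, table, data) come from the layout above.

```python
def make_solr_doc(table, row_id, data):
    """Build a document whose unique key combines the row id and the table name."""
    return {
        "unique_id": f"{row_id}_{table}",  # e.g. "1234_A"
        "id": row_id,
        "table": table,
        "data": data,
    }

# Hypothetical rows that share the same auto-generated primary key.
rows = [
    ("A", 1234, "row from TableA"),
    ("B", 1234, "row from TableB"),
    ("C", 1234, "row from TableC"),
]
docs = [make_solr_doc(table, row_id, data) for table, row_id, data in rows]
unique_ids = [doc["unique_id"] for doc in docs]
print(unique_ids)
```

With the table field stored, a search can then be restricted to one table with a filter query such as fq=table:A.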

Related

Problems generating random data in SQL workbench

I am new to MySQL, so my question is probably very basic, but I couldn't find a solution on the internet.
I have two tables, TABLE 1 and TABLE 2, each with a single primary key: tab1PK (INT) and tab2PK (VARCHAR) respectively.
Since TABLE 1 and TABLE 2 have an M:N relationship, I have a third table, TABLE 3, whose primary key is composed of two columns: tab1PK and tab2PK.
I generated random data for TABLE 1 and TABLE 2. Is there a way to quickly generate data for TABLE 3? Is there an easy way to combine tab1PK and tab2PK?
This will fill table3 with the Cartesian product of all table1 and table2 primary keys:
insert into table3 (tab1PK, tab2PK)
select table1.tab1PK, table2.tab2PK
from table1, table2;
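To see what the insert produces, here is a small sketch that runs the same idea against an in-memory SQLite database from Python (SQLite stands in for MySQL here, and the sample keys are invented). It spells the join as CROSS JOIN, which is equivalent to listing both tables in the FROM clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (tab1PK INTEGER PRIMARY KEY);
    CREATE TABLE table2 (tab2PK VARCHAR(10) PRIMARY KEY);
    CREATE TABLE table3 (
        tab1PK INTEGER REFERENCES table1(tab1PK),
        tab2PK VARCHAR(10) REFERENCES table2(tab2PK),
        PRIMARY KEY (tab1PK, tab2PK)
    );
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES ('x'), ('y');
""")

# Every table1 key is paired with every table2 key.
conn.execute("""
    INSERT INTO table3 (tab1PK, tab2PK)
    SELECT table1.tab1PK, table2.tab2PK
    FROM table1 CROSS JOIN table2
""")
pair_count = conn.execute("SELECT COUNT(*) FROM table3").fetchone()[0]
print(pair_count)  # 3 rows x 2 rows = 6 pairs
```

The full product grows quickly with table size. In MySQL, adding ORDER BY RAND() LIMIT n to the SELECT would keep only a random subset of the pairs.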

How to merge two db with same structure but different data

I'm working with phpMyAdmin and I have to merge two databases with the same structure but different data.
The databases have relations between tables (foreign keys).
Rows in the two databases may have the same ids, and therefore the same foreign key values.
I would like to know if it's possible to merge the two databases while keeping all the data: if a row already exists, insert it with a new id and update the foreign keys that reference it.
Thanks a lot.
No easy way, unfortunately. If TableB has a foreign key referencing TableA, you will need to:
1) Insert data from source tableA into target tableA
2) Create a (temp) table to store the mapping between source tableA ids and target tableA ids
3) Use this mapping table when inserting data from tableB to convert the tableA ids to the new ones in the target db
... and so on. It can get quite hairy if you have a deep hierarchy of tables, but hopefully you get the idea. Take backups before you start.
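The three steps can be sketched roughly as follows, using an in-memory SQLite database from Python in place of MySQL. The table names (src_a, src_b, tgt_a, tgt_b, a_id_map) and the sample rows are invented for the example, and both "databases" live in one connection to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Target tables, which already contain data.
    CREATE TABLE tgt_a (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE tgt_b (id INTEGER PRIMARY KEY AUTOINCREMENT,
                        a_id INTEGER REFERENCES tgt_a(id), val TEXT);
    INSERT INTO tgt_a (id, name) VALUES (1, 'target-a1');
    INSERT INTO tgt_b (id, a_id, val) VALUES (1, 1, 'target-b1');

    -- Source tables, whose ids collide with the target ids.
    CREATE TABLE src_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE src_b (id INTEGER PRIMARY KEY, a_id INTEGER, val TEXT);
    INSERT INTO src_a VALUES (1, 'source-a1');
    INSERT INTO src_b VALUES (1, 1, 'source-b1');

    -- Step 2: mapping from source ids to newly generated target ids.
    CREATE TEMP TABLE a_id_map (old_id INTEGER PRIMARY KEY, new_id INTEGER);
""")

# Steps 1 and 2: copy each parent row and record the id it received.
for old_id, name in conn.execute("SELECT id, name FROM src_a").fetchall():
    cur = conn.execute("INSERT INTO tgt_a (name) VALUES (?)", (name,))
    conn.execute("INSERT INTO a_id_map VALUES (?, ?)", (old_id, cur.lastrowid))

# Step 3: copy child rows, translating the foreign key through the mapping.
conn.execute("""
    INSERT INTO tgt_b (a_id, val)
    SELECT m.new_id, b.val
    FROM src_b AS b JOIN a_id_map AS m ON m.old_id = b.a_id
""")

merged = conn.execute("""
    SELECT a.name, b.val
    FROM tgt_b AS b JOIN tgt_a AS a ON a.id = b.a_id
    ORDER BY b.id
""").fetchall()
print(merged)
```

Each additional level of the hierarchy needs its own mapping table, built the same way from the level above it.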
Another idea you might want to consider is using a cursor:
Assume table A is the one you want to keep and table B is the one you want to remove.
Declare a cursor for table B and select all the records.
Loop over each record selected by the cursor and check its ID.
Case 1: If the ID does not exist in table A, insert the record into table A with the same details.
Case 2: If the ID already exists in table A, insert the record with a new ID and update the foreign keys that reference it.
Once all the records have been checked, drop table B.
Sorry, I can only give a rough idea at the moment.

Most efficient way to select data from one sql table and see if it matches data on another table in the same database

I have a database with 2 tables; both tables have around 200,000 records.
Let's call these tables TableA and TableB.
Currently I have a function that runs a select query which grabs all records in TableA that match a condition. Once I have that data, a foreach loop uses each TableA record to check whether it matches any record in TableB.
The problem is that this takes a while because there are so many records. I know my approach works, because it does what it's supposed to, but the script takes a good 3 minutes to finish. Is there a faster, more efficient way to do something like this?
Thank you in advance for the help.
PS: I'm using PHP.
The most efficient way to achieve what you want is to:
1. Create a primary key column for each table (if you do not already have one). Example schema where column "id" is a unique identifier for the table row:
TableA
id firstname lastname
1 Michael Douglas
2 Michael Jackson
TableB
id table_a_id pet
1 1 cat
2 2 ape
3 1 dog
Search online or here on Stack Overflow for how to create or add a primary key on a MySQL table. An example of creating TableA with a primary key:
CREATE TABLE `TableA` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`firstname` varchar(100),
`lastname` varchar(100),
PRIMARY KEY (`id`)
);
2. Create an SQL query to fetch what you need. For example:
To get all rows with at least one match in BOTH tables:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
INNER JOIN TableB
ON TableA.id = TableB.table_a_id;
To instead get all rows from TableA, and only the matching rows from TableB:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
LEFT JOIN TableB
ON TableA.id=TableB.table_a_id;
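To compare the two queries on the example data, here is a small sketch run against an in-memory SQLite database from Python (SQLite stands in for MySQL). The third TableA row and the index name idx_tableb_a are additions made up for the example; the extra row has no pets, so it only shows up in the LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, table_a_id INTEGER, pet TEXT);
    INSERT INTO TableA VALUES (1, 'Michael', 'Douglas'), (2, 'Michael', 'Jackson');
    INSERT INTO TableA VALUES (3, 'Jane', 'Doe');  -- hypothetical row with no pets
    INSERT INTO TableB VALUES (1, 1, 'cat'), (2, 2, 'ape'), (3, 1, 'dog');
    -- An index on the join column helps the database look up matches quickly.
    CREATE INDEX idx_tableb_a ON TableB (table_a_id);
""")

query = """
    SELECT TableA.id, TableA.lastname, TableB.pet
    FROM TableA {kind} JOIN TableB ON TableA.id = TableB.table_a_id
    ORDER BY TableA.id, TableB.id
"""
inner_rows = conn.execute(query.format(kind="INNER")).fetchall()
left_rows = conn.execute(query.format(kind="LEFT")).fetchall()
print(inner_rows)  # only rows with a match in both tables
print(left_rows)   # every TableA row; pet is None where nothing matched
```

Either way the matching happens in one query inside the database, instead of one lookup per TableA record from the application loop.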
The answer to your question ultimately depends on what you mean by "if it matches."
Let's assume, for a moment, that you have primary keys on each of these tables, TableA and TableB, and that you're NOT matching on those. Instead, you have one or more other columns, the actual data that you're storing in each row, which you are considering for your matching. Let's call those ColA and ColB.
In that case you could use:
SELECT TableA.id, TableB.id, TableA.ColA, TableB.ColB
FROM TableA
LEFT JOIN TableB
ON (TableA.ColA = TableB.ColA)
AND (TableA.ColB = TableB.ColB);
... notice that we're using a compound expression on which to JOIN. You'd want to add an AND (TableA.XXX = TableB.XXX) for each column that you want to consider significant in your matching.
Of course I'm assuming that these tables don't share a common surrogate key (otherwise MicKri's JOIN would be simpler ... or a "NATURAL JOIN" would be even simpler still).
What you're doing, conceptually, is defining a pair of (mathematical) sets and finding the intersection between them. The complication of doing this in SQL is that real world tables often have these extra columns (surrogate primary keys, and foreign keys) which aren't attributes of the underlying entities ... but which serve to map relationships among them.
In my example I'm just showing a way to formulate a JOIN query that finds the intersection based only on the attributes that are significant for your purposes.
(By the way, the parentheses in my example are there for human legibility. They should not be required by your SQL engine ... though they don't hurt, either).
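Here is a small sketch of matching on data columns rather than keys, run against an in-memory SQLite database from Python (SQLite stands in for MySQL, and the sample values are invented). Each condition compares the same column on both sides, TableA against TableB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT);
    INSERT INTO TableA VALUES (1, 'red', 'small'), (2, 'red', 'large'), (3, 'blue', 'small');
    INSERT INTO TableB VALUES (10, 'red', 'small'), (11, 'blue', 'large');
""")

rows = conn.execute("""
    SELECT TableA.id, TableB.id
    FROM TableA
    LEFT JOIN TableB
      ON (TableA.ColA = TableB.ColA)
     AND (TableA.ColB = TableB.ColB)
    ORDER BY TableA.id
""").fetchall()
print(rows)

# With a LEFT JOIN, unmatched TableA rows carry None for the TableB id.
matched = [a_id for a_id, b_id in rows if b_id is not None]
print(matched)
```

Only the TableA row whose ColA and ColB both agree with a TableB row counts as matched; agreeing on one column alone is not enough.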
Here's one of a number of visual explanations of SQL JOINs that's handy for learning this sort of thing. An INNER JOIN is an intersection. The ON and WHERE clauses define the subsets of the data (columns and rows, respectively) which are to be related.

normalization of database structure

I was reading the concept of normalization of database structure. I got confused with the following situation in my project.
I have two tables, "TableA" and "TableB".
Both tables are independent of each other and have no relationship at all.
They represent completely different data.
Both tables will have different parameters. However, Parameter itself, as an object, has the same properties in both cases.
So my question is: should I have a single Parameter table serving both TableA and TableB,
or
should I have a separate Parameter table for each of TableA and TableB?
The structure looks like this:
Case I:
TableA
ID
Name
Description
TableB
ID
Name
SomeFlag
Parameter
ID
TableA_ID
TableB_ID
Name
Description
Type
Case II
TableA
ID
Name
Description
Parameter_A
ID
TableA_ID
Name
Description
Type
TableB
ID
Name
SomeFlag
Parameter_B
ID
TableB_ID
Name
Description
Type
I personally prefer Case I, as it does not make sense to create another table representing the same type of data.
According to the concept of normalization, a table should represent only one thing, so I guess I should have only one Parameter table. But what if that table means something completely different when viewed from TableA than when viewed from TableB?
I would use case one but with some changes. The parameter entity does hold one thing, parameters for a table. An instance of a parameter entry should relate to only one table (based on your analysis that they are not related).
Parameter
----------
PK Param_ID
FK Main_Table_ID
Main_Table_name (A or B)
param_Name
param_Description
param_Type
If it makes logical sense for a Parameter to have both Table A and Table B in the same instance (not an either/or), then Case I is better.
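A minimal sketch of the Parameter layout above, run against an in-memory SQLite database from Python. The CHECK constraint and the sample rows are assumptions added for illustration; the column names follow the layout above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Name TEXT, Description TEXT);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY, Name TEXT, SomeFlag INTEGER);
    CREATE TABLE Parameter (
        Param_ID INTEGER PRIMARY KEY,
        Main_Table_ID INTEGER NOT NULL,
        -- Discriminator: says which table Main_Table_ID points into.
        Main_Table_name TEXT NOT NULL CHECK (Main_Table_name IN ('A', 'B')),
        param_Name TEXT,
        param_Description TEXT,
        param_Type TEXT
    );
    INSERT INTO TableA VALUES (1, 'a-one', 'first row of A');
    INSERT INTO TableB VALUES (1, 'b-one', 0);
    INSERT INTO Parameter VALUES (1, 1, 'A', 'timeout', 'belongs to TableA row 1', 'int');
    INSERT INTO Parameter VALUES (2, 1, 'B', 'colour', 'belongs to TableB row 1', 'text');
""")

# The join must filter on the discriminator as well as the id.
params_for_a = conn.execute("""
    SELECT a.Name, p.param_Name
    FROM TableA AS a
    JOIN Parameter AS p
      ON p.Main_Table_ID = a.ID AND p.Main_Table_name = 'A'
""").fetchall()
print(params_for_a)
```

One trade-off of this layout is that Main_Table_ID cannot be declared as a single foreign key into both tables, so the link relies on the discriminator column being filled in correctly.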
In Relational Theory, every table is a type. Even if they may have common data, types are based around their usage. And though it's a little more complicated, Case II is more normalized.
There is another possibility that hasn't been mentioned; I'll call it Case III.
TableA
ID
Name
Description
PropertyID
TableB
ID
Name
SomeFlag
PropertyID
Parameter
ID
Name
Description
Type
If the Properties will always be common among both tables, this is probably going to be the best solution.

Get a value going through several tables in entity framework?

I have a situation where I am trying to get a value stored in one table, but have to go through several other tables on the same database to get the correct result.
Table 1 has a PK ID I can query. This PK gives me an FK in table 2, which gives me a key to table 3, and table 3 gives me a key to table 4, which holds the value stored against the PK in table 1. I hope this makes sense. I cannot change the tables or the database, so I need to find a way to select the value in table 4 from the primary key I got in table 1.
Any ideas?
Edit 1:
I will try to explain better. I have a filepath located in table 4. To get the correct filepath I first need to find a project id in table 1. That same project id is an FK in table 2. In table 2 I need to find another id (let's call it "customer id") using project id as FK; the customer id is an FK in table 3. In table 3 I need to find purchase id using customer id as FK; the purchase id is an FK in table 4. With the correct FK (purchase id) in table 4, I am able to get the correct filepath, the one that corresponds to the project id from table 1.
I am using ASP.NET (Entity Framework) and a SQL database. I was thinking of using LINQ, but I am somewhat confused about how to do it. Should I join several tables, or try to get the filepath some other way?
Did this make it clearer?
Use a LINQ join query.
For one example, see here:
Entity Framework Join 3 Tables
Alternatively, search for "LINQ join" and you can find the solution yourself.
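The chain of joins such a query has to express can be sketched in plain SQL, here run against an in-memory SQLite database from Python rather than through Entity Framework. The table names, column names and sample values are all invented to mirror the description in the question (project id, customer id, purchase id, filepath).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (project_id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (customer_id INTEGER PRIMARY KEY, project_id INTEGER);
    CREATE TABLE table3 (purchase_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE table4 (id INTEGER PRIMARY KEY, purchase_id INTEGER, filepath TEXT);
    INSERT INTO table1 VALUES (7);
    INSERT INTO table2 VALUES (20, 7);
    INSERT INTO table3 VALUES (300, 20);
    INSERT INTO table4 VALUES (1, 300, '/files/project7.pdf');
""")

# Follow project id -> customer id -> purchase id -> filepath in one query.
row = conn.execute("""
    SELECT t4.filepath
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.project_id = t1.project_id
    JOIN table3 AS t3 ON t3.customer_id = t2.customer_id
    JOIN table4 AS t4 ON t4.purchase_id = t3.purchase_id
    WHERE t1.project_id = ?
""", (7,)).fetchone()
filepath = row[0] if row else None
print(filepath)
```

A LINQ query would chain one join clause per table in the same order, with the project id as the final filter.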