SQL Table design for the following scenario - mysql

I am using MySQL and I am having the following scenario:
Table nodes: node_id, lat, lng, name
Table links: node1, node2, name
There are two tables. The nodes table stores every point with its latitude and longitude. The links table stores node1 and node2, both of which reference nodes.
Since in MySQL and Rails we can't really have two foreign keys pointing to the same table (correct me if I am wrong), how would I construct my SQL statement if, for example, I want to find the starting node name and the ending node name? I tried
SELECT nodes.name FROM nodes, links WHERE nodes.node_id = node1
which kind of works but is very slow (I have fewer than 10k records in each table). If I want the names of both the starting node and the ending node, how do I do it? And what if I want to restrict the starting node to lat > x and the ending node to lat < y to find all matching links?
Thank you.
Regards,
Andy.

Yes, you can do this: two foreign keys can reference the same table, and your table structure looks good.
Your query is also fine. Make sure you have proper indexes on node_id to help performance.
You can run two queries, one for each name, or you can use a UNION or two subqueries if you want all the results in the same query.
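Another way to get both names in one row is to join nodes twice under different aliases. The following is only a sketch: the index names, the aliases s and e, and the literal values standing in for x and y are assumptions, and it presumes nodes.node_id is already the primary key.

```sql
-- Assumed index names; these cover the two columns used to join links to nodes.
CREATE INDEX idx_links_node1 ON links (node1);
CREATE INDEX idx_links_node2 ON links (node2);

-- Join nodes once as the start node (s) and once as the end node (e).
SELECT l.name AS link_name,
       s.name AS start_name,
       e.name AS end_name
FROM links AS l
JOIN nodes AS s ON s.node_id = l.node1
JOIN nodes AS e ON e.node_id = l.node2
WHERE s.lat > 10.0   -- placeholder for x
  AND e.lat < 20.0;  -- placeholder for y
```

Dropping the WHERE clause returns the start and end names for every link.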

Related

What would be an MS Access query to return all the results from one table and a yes/no if it's in another table

What would be a query in Access that returns everything from table 1 (hosts) plus an additional column? That column would be a yes or a no, depending on whether that host exists in table 2.
Table 1: Name: hosts Fields: Hostname, IP address, OS
Table 2: Name: Current Fields: Hostname
I think this should be easy, but I'm banging my head against the wall, due to knowledge or lack thereof. I don't want to add another field to table 1; I did see some solutions that did that. Periodically, I'm going to blow away table 2 with a new set of hosts and rerun the query. Thanks for any help.
Based on given info, this is simple. Build query joining tables and calculate a field that returns true/false.
SELECT Hosts.*, Not IsNull([Current].[Hostname]) AS InCurrent
FROM [Current] RIGHT JOIN Hosts ON [Current].Hostname = Hosts.Hostname;
Or
SELECT Hosts.*, Not IsNull([Current].[Hostname]) AS InCurrent
FROM Hosts LEFT JOIN [Current] ON Hosts.Hostname = [Current].Hostname;
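If the column should literally read Yes or No rather than True/False, one possible variation wraps the same test in IIf. This is a sketch in Access SQL; it assumes Hostname is unique in Current, otherwise a host that appears several times there would produce several rows.

```sql
SELECT Hosts.*,
       IIf(IsNull([Current].[Hostname]), "No", "Yes") AS InCurrent
FROM Hosts LEFT JOIN [Current] ON Hosts.Hostname = [Current].Hostname;
```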

MySql SELECT Query performance issues in huge database

I have a pretty huge MySQL database and am having performance issues while selecting data. Let me first explain what I am doing in my project: I have a list of files. Every file should be analyzed with a number of tools. The result of the analysis is stored in a results table.
I have one table with files (samples). The table contains about 10 million rows. The schema looks like this:
idsample|sha256|path|...
The other (really small table) is a table which identifies the tool. Schema:
idtool|name
The third table is going to be the biggest one. The table contains all results of the tools I am using to analyze the files (The number of rows will be the number of files TIMES the number of tools). Schema:
id|idsample|idtool|result information| ...
What I am looking for is a query, which returns UNPROCESSED files for a given tool id (where no result exists yet).
The (most efficient) way I found so far to query those entries is following:
SELECT
    s.idsample
FROM
    samples AS s
WHERE
    s.idsample NOT IN (
        SELECT
            idsample
        FROM
            results
        WHERE
            idtool = 1
    )
LIMIT 100
The problem is that the query is getting slower and slower as the results table is growing.
Do you have any suggestions for improvements? One further problem is that I cannot change the structure of the tables, as this is a shared database which is also used by other projects. I think the only way to improve this is to find a more efficient SELECT query.
Thank you very much,
Philipp
A left join may perform better, especially if idsample is indexed in both tables; in my experience, those kinds of "inquiries" are better served by JOINs rather than that kind of subquery.
SELECT s.idsample
FROM samples AS s
LEFT JOIN results AS r ON s.idsample = r.idsample AND r.idtool = 1
WHERE r.idsample IS NULL
LIMIT 100;
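The same anti-join can also be written with NOT EXISTS, which is worth comparing with EXPLAIN on the real data. This is a sketch rather than a measured improvement; the commented-out composite index and its name are assumptions, and it only applies if adding an index to the shared database is acceptable.

```sql
-- Optional, only if adding an index is allowed on the shared database:
-- CREATE INDEX idx_results_tool_sample ON results (idtool, idsample);

-- Samples that have no result row yet for tool 1.
SELECT s.idsample
FROM samples AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM results AS r
    WHERE r.idsample = s.idsample
      AND r.idtool = 1
)
LIMIT 100;
```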
Another, more involved possible solution would be to create a fourth table holding the full "unprocessed list", and then use triggers on the other three tables to maintain it, i.e.:
When a new tool is added, add all the current files to that fourth table (with the new tool).
When a new file is added, add all the current tools to that fourth table (with the new file).
When a new result is entered, remove the corresponding record from the fourth table.
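The three rules above could be sketched roughly as follows. The table name unprocessed, the trigger names, the column types, and the name tools for the tool table are all assumptions (the question only gives the columns idtool and name), and the fourth table would still need a one-time backfill for existing data.

```sql
-- Hypothetical fourth table: one row per (tool, sample) pair still to be processed.
CREATE TABLE unprocessed (
    idtool   INT NOT NULL,
    idsample INT NOT NULL,
    PRIMARY KEY (idtool, idsample)
);

-- New tool: queue every existing sample for it.
CREATE TRIGGER tools_after_insert AFTER INSERT ON tools
FOR EACH ROW
    INSERT INTO unprocessed (idtool, idsample)
    SELECT NEW.idtool, s.idsample FROM samples AS s;

-- New sample: queue it for every existing tool.
CREATE TRIGGER samples_after_insert AFTER INSERT ON samples
FOR EACH ROW
    INSERT INTO unprocessed (idtool, idsample)
    SELECT t.idtool, NEW.idsample FROM tools AS t;

-- New result: the pair is no longer unprocessed.
CREATE TRIGGER results_after_insert AFTER INSERT ON results
FOR EACH ROW
    DELETE FROM unprocessed
    WHERE idtool = NEW.idtool AND idsample = NEW.idsample;

-- Fetching work for tool 1 then becomes a primary-key range read.
SELECT idsample FROM unprocessed WHERE idtool = 1 LIMIT 100;
```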

Mysql DB Is this the most efficient design?

I have an existing mysql DB that manages regulations for 50 states. The current setup is relational - three tables for EACH of the 50 states:
state_table contains the chapter/sub-chapter headings
item_table contains the end records
department_table contains the ID's to relate the two.
all combined it handles around 620,000 records
I'm not a DB design expert and have always used this as-is and gotten by. However, having separate tables for each of the 50 states limits searching across all states, and I'm wondering if there is a better approach.
I'm wondering if I should consider combining this into either a single set of 3 relational tables for the entire nation or even a single table to handle everything.
I've asked this on other forums and have been told to read various volumes of DB schema and structures etc. so if there is someone who can just suggest the direction to go in and the pro's and con's of what I have vs the alternative that would be great!
thanks!
Here's the way it is, x 50:
alabama
    ID
    Name
    State
    Parent
    Description
alabama_department
    Department - IDs from "alabama"
    Item - IDs from "alabama_item"
alabama_item
    ID
    Name
    Description
    Keywords
    Doc_ID
    Effective_date
    ...
    ...
The queries: I step through the hierarchy of chapter/sub-chapter/end-record via links. This works fine, but I'm starting to focus more on search capability, and I'm also thinking that what I have is overkill. It sounds like a couple of you think so too.
If I am correct in thinking you have 150 tables (3 * 50 states), then:
You should have a 'states' table which includes a stateID and stateName. Then use ONE table for chapter/subchapters, ONE for departments, and ONE for end records and use the stateID to relate different records to a state.
You should not have 3 tables for each state, you can use one of each and just relate to a state table. This brings you to four tables instead of 150.
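A rough sketch of that four-table layout, reusing the column names from the question. The table names, data types, and foreign keys are assumptions, and the per-state State column is replaced by a state_id reference.

```sql
CREATE TABLE states (
    state_id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    state_name VARCHAR(64) NOT NULL UNIQUE
);

-- Chapter/sub-chapter headings for all states (was: alabama, alaska, ...).
CREATE TABLE chapters (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    state_id    INT UNSIGNED NOT NULL,
    parent      INT UNSIGNED NULL,        -- parent heading, NULL at the top level
    name        VARCHAR(255) NOT NULL,
    description TEXT,
    FOREIGN KEY (state_id) REFERENCES states (state_id),
    FOREIGN KEY (parent) REFERENCES chapters (id)
);

-- End records for all states (was: alabama_item, ...).
CREATE TABLE items (
    id             INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    state_id       INT UNSIGNED NOT NULL,
    name           VARCHAR(255) NOT NULL,
    description    TEXT,
    keywords       TEXT,
    doc_id         VARCHAR(64),           -- type assumed
    effective_date DATE,
    FOREIGN KEY (state_id) REFERENCES states (state_id)
);

-- Relates headings to end records (was: alabama_department, ...).
CREATE TABLE departments (
    department INT UNSIGNED NOT NULL,     -- chapters.id
    item       INT UNSIGNED NOT NULL,     -- items.id
    PRIMARY KEY (department, item),
    FOREIGN KEY (department) REFERENCES chapters (id),
    FOREIGN KEY (item) REFERENCES items (id)
);

-- A search across all states becomes a single query.
SELECT s.state_name, i.name
FROM items AS i
JOIN states AS s ON s.state_id = i.state_id
WHERE i.keywords LIKE '%licensing%';
```

The main trade-off is a one-time migration of the existing 150 tables; in exchange, searching one state is just an extra WHERE state_id = ... condition on the same query.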

remove duplicates in mysql database

I have a table with latitude and longitude columns. In most cases the value extends well past the decimal point, e.g. -81.7770051972473, but on rare occasions the value looks like -81.77.
How do I find duplicates and remove one of the duplicates for only the records that extend beyond two decimal places?
Using some creative substring, float, and charindex logic, I came up with this:
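delete l1
from
    latlong l1
    inner join (
        select
            id,
            -- keep everything up to two digits after the decimal point
            substring(cast(latitude as char), 1, instr(cast(latitude as char), '.') + 2) as truncatedLat
        from
            latlong
    ) l2 on
        l1.id <> l2.id
        -- decimal is used here because older MySQL versions cannot cast to float
        and l1.latitude = cast(l2.truncatedLat as decimal(10, 2))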
Before running, try select * in lieu of delete l1 first to make sure you're deleting the right rows.
I should note that this worked on SQL Server using functions I know exist in MySQL, but I wasn't able to test it against a MySQL instance, so there may be some little tweaking that needs to be done. For example, in SQL Server, I used charindex instead of instr, but both should work similarly.
Not sure how to do that purely in SQL.
I have used scripting languages like PHP or CFML to solve similar needs by building a query to pull the records, then looping over the record set and performing some comparison. If it is true, then VERY CAREFULLY call another function, passing in the record ID, and delete the record. I would probably even leave the record in the table, but mark it in another column such as isDeleted.
If you are more ambitious than I, it looks like this thread is close to what you want
Deleting Duplicates in MySQL
finding multi column duplicates mysql
Using an external programming language (Perl, PHP, Java, Assembly...):
Select * from database
For each row, select * from database where newLat >= round(oldLat,2) and newLat < round(oldLat,2) + .01 and //same criteria for longitude
Keep one of them based on whatever criteria you choose. If lowest primary key, sort by that and skip the first result.
Delete everything else.
Repeat for the remaining rows, skipping any records you already deleted.
If for some reason you want to identify everything with greater than 2 digit precision:
select * from database where lat != round(lat,2) or long != round(long,2)
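Before deleting anything, the candidate pairs can be previewed in plain SQL. The sketch below assumes the latlong table with id, latitude, and longitude columns from the first answer. It uses TRUNCATE rather than ROUND, because rounding -81.7770051972473 to two places gives -81.78, not -81.77. If the columns are FLOAT or DOUBLE, equality comparisons may miss some pairs, so treat the output as a starting point.

```sql
-- p: the high-precision row, c: the two-decimal row that duplicates it.
SELECT p.id AS precise_id,
       c.id AS coarse_id,
       p.latitude,
       p.longitude
FROM latlong AS p
JOIN latlong AS c
  ON c.id <> p.id
 AND c.latitude  = TRUNCATE(p.latitude, 2)
 AND c.longitude = TRUNCATE(p.longitude, 2)
WHERE p.latitude  <> TRUNCATE(p.latitude, 2)
   OR p.longitude <> TRUNCATE(p.longitude, 2);
```

Once the pairs look right, either id column can feed a DELETE, depending on which copy should be kept.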

Using Sphinx for the first time - configuring the sql_query key

I'm currently practicing with Sphinx. I haven't done much yet beyond the configuration for what I'm trying to do. The sql_query key leaves me somewhat confused about what to put there. I read the Sphinx documentation on sql_query, but it didn't make clear what to do, since I have many SELECTs in my web application, I want to use Sphinx for my search, and the SQL often changes (depending on the user's search filters).
My search currently uses MySQL, and I want to integrate Sphinx into my web application. If sql_query is not optional, am I expected to put the whole search SQL query into that field, or do I just pick out the necessary fields from the tables for indexing?
Can someone point me in the right direction so I can get things going with Sphinx and my web application?
sql_query is mandatory. Sphinx runs it to fetch the data you want indexed from MySQL. It can contain joins, conditions, etc., but it must be a valid SQL query. You should have something like "SELECT id, field1, field2, fieldx FROM table". id must be a unique integer ID. Each row returned by this query is considered a document (which is what Sphinx returns when you search).
If you have multiple tables that differ a lot in meaning (users, articles, etc.), you need to create an index for each.
Read the tutorials here: http://sphinxsearch.com/info/articles/ to understand how Sphinx works.
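As an illustration of the kind of statement that goes into sql_query, here is a sketch for an articles index. The table and column names are hypothetical, and only the columns to be searched or filtered on are selected, not the application's full search query.

```sql
-- One row per document; the first column is the integer document id.
SELECT a.id,
       a.title,
       a.body,
       u.name AS author_name
FROM articles AS a
JOIN users AS u ON u.id = a.user_id;
```

The user's search filters are then applied at search time against the index rather than being baked into this query.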
You can write a SQL query that returns the union of the records you need from the database. If you join multiple tables to select the best result set, you can do that with Sphinx too.
You may run into some trouble with your existing table structure in the database.
For example:
The base table does not have an integer primary key field
Create a new table with two fields: an integer id field, and a field holding the primary key of the base table. Do an inner join with that table and select the id field from it.
E.g. SELECT t1.id, t2.name, t2.description, t2.content FROM table_new t1 INNER JOIN table_2 t2 ON t1.document_id = t2.thread_id INNER JOIN REST_OF_YOUR_SELECT_QUERY
The t1.id is for the Sphinx search engine to do its internal indexing.
You filter data with a WHERE clause
You can do that in Sphinx by setting filters dynamically based on the conditions.
You select and join different tables to get results
This can also be done by setting up different sources and indexes based on your requirements.
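The mapping-table workaround for a base table without an integer primary key could look roughly like this. The names sphinx_doc_ids, threads, and thread_key are hypothetical and stand in for table_new, table_2, and the base table's existing key.

```sql
-- Hypothetical mapping table: gives each base-table key an integer id.
CREATE TABLE sphinx_doc_ids (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    thread_key VARCHAR(64) NOT NULL UNIQUE
);

-- Populate it once from the base table (repeat for newly added rows).
INSERT INTO sphinx_doc_ids (thread_key)
SELECT t.thread_key FROM threads AS t;

-- Query suitable for sql_query: integer id first, then the fields to index.
SELECT m.id, t.name, t.description, t.content
FROM sphinx_doc_ids AS m
INNER JOIN threads AS t ON t.thread_key = m.thread_key;
```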
Hope this helps you understand what you need to add and modify to configure the Sphinx search engine for your requirements. Just come back here if you need more help.