Function in MySQL that operates on multiple columns

Is it possible to create a custom function in MySQL, like SUM, MAX, and so on, that accepts multiple columns and performs some operation on each row?
The reason I am asking is that I tried to implement my logic in a stored procedure, but unfortunately couldn't find a way to select data from a table whose name is an input parameter.
Somebody suggested using dynamic SQL, but I cannot open a cursor over a dynamically prepared statement, so my only hope is a custom user-defined function.
To make the question more clear here is what I want to do:
I want to calculate the distance of a route where each row in the database table represents a pair of coordinates (latitude and longitude). Unfortunately, the dataset is large: if I query the data and do the calculations in Java, just transferring the data to the web server takes more than half a minute, so I want to do the calculations on the SQL server instead.
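For reference, this kind of per-row calculation can stay entirely in SQL without a custom function. A minimal sketch, assuming MySQL 8+ (for the LAG window function) and hypothetical table/column names route_points(seq, lat, lng), summing haversine distances between consecutive rows:

```sql
-- Total route length in kilometres (6371 = mean Earth radius in km).
-- Table and column names are placeholders, not from the question.
SELECT SUM(
         6371 * 2 * ASIN(SQRT(
           POW(SIN(RADIANS(lat - prev_lat) / 2), 2) +
           COS(RADIANS(prev_lat)) * COS(RADIANS(lat)) *
           POW(SIN(RADIANS(lng - prev_lng) / 2), 2)
         ))
       ) AS distance_km
FROM (
  SELECT lat, lng,
         LAG(lat) OVER (ORDER BY seq) AS prev_lat,  -- previous row's point
         LAG(lng) OVER (ORDER BY seq) AS prev_lng
  FROM route_points
) AS p
WHERE prev_lat IS NOT NULL;  -- skip the first row, which has no predecessor
```

On MySQL 5.x, which lacks window functions, the same pairing of consecutive rows can be emulated with a self-join on seq = seq - 1 or with user variables.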

Select something1, something2 from table_name, where table_name is a variable
Multiple identically-structured tables (a prerequisite for this sort of query) are contrary to the Principle of Orthogonal Design.
Don't do it. At least not without very good reason: with suitable indexes, MySQL easily handles (tens of) millions of records per table without any need for partitioning; and even if one does need to partition the data, there are better ways than this manual kludge (which can give rise to ambiguous, potentially inconsistent data and lead to redundancy and complexity in your data-manipulation code).
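That said, if the table name really must arrive as a parameter, the usual escape hatch is a prepared statement built inside a stored procedure. A sketch (the column names something1/something2 are taken from the question; note that EXECUTE returns its result set directly, so no cursor is involved):

```sql
DELIMITER //
CREATE PROCEDURE select_from(IN tbl VARCHAR(64))
BEGIN
  -- Build the statement text; backtick-quote the identifier and strip
  -- any embedded backticks as a basic injection guard.
  SET @sql = CONCAT('SELECT something1, something2 FROM ',
                    '`', REPLACE(tbl, '`', ''), '`');
  PREPARE stmt FROM @sql;
  EXECUTE stmt;               -- sends the result set to the client
  DEALLOCATE PREPARE stmt;
END //
DELIMITER ;

CALL select_from('my_table');
```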

Related

Building a Search Procedure - MySQL - LIKE with multiple values vs LIKE with Temp Tables

I am sure this question has been asked in many ways, but I could not find an answer that comprehensively laid out my specific scenario, so here goes.
I am building a search function across multiple tables and fields. There is one table for person information, another for address information, and yet another for telephone information.
The screen provides a keyword input which triggers the search. The user can enter a single keyword or multiple comma-separated keywords.
As I evaluated various options, here is what I came up with:
Do the search within a stored procedure, since the search query runs against multiple tables and fields; this SQL logic is better kept in the database layer.
Send the keyword(s) as a comma-separated or pipe-separated string to the stored procedure from PHP, so all pre-processing of the keywords is done in PHP.
Have the stored procedure process the search query and return a combined result set that can be readily displayed.
I have multiple options for designing the stored procedure, and this is where I am looking for advice/input from the community.
After much reading on LIKE vs. REGEXP, I settled on LIKE to match keyword(s) against various attributes in the database and then generate a UNION of result sets across the tables to send back.
Pros - LIKE is extremely fast, very simple to use, and gets the job done.
Cons - If the user enters multiple keywords, the logic gets complicated, as I have to generate multiple OR LIKE '%keyword%' clauses within the stored procedure. An alternative is to use a temporary table and insert the results of each keyword search into it.
Question - does the use of temporary tables significantly slow down a stored procedure? The average number of rows to match against will be around 8,000 to 10,000, with a maximum of perhaps 100,000.
So will there be significant performance degradation if I use a WHILE loop to search each keyword against the tables and insert the results into a temporary table, as opposed to dynamically generating OR LIKE clauses and running a single query?
As of now, I have decided to go ahead with inserts into a temporary table instead of multiple LIKE clauses, for the following reasons:
Over-complication of the query statement - multiple LIKE clauses made for a very complicated stored procedure design and resulting query.
Individual queries will probably be about as efficient as one large, complicated query; even if there is a trade-off, I suspect it will be minor. I haven't tested this with live data yet, but will post an update once I have sufficient volume.
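For what it's worth, the temp-table variant described above can be sketched roughly as below. The table and column names (persons, id, name) are hypothetical, and the keyword splitting assumes a clean comma-separated input:

```sql
DELIMITER //
CREATE PROCEDURE keyword_search(IN keywords TEXT)
BEGIN
  DECLARE i  INT DEFAULT 1;
  DECLARE n  INT;
  DECLARE kw VARCHAR(255);

  DROP TEMPORARY TABLE IF EXISTS search_hits;
  CREATE TEMPORARY TABLE search_hits (
    person_id INT PRIMARY KEY   -- PK dedupes rows matched by several keywords
  );

  -- Number of comma-separated keywords.
  SET n = 1 + LENGTH(keywords) - LENGTH(REPLACE(keywords, ',', ''));

  WHILE i <= n DO
    -- i-th keyword from the list.
    SET kw = TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(keywords, ',', i), ',', -1));
    INSERT IGNORE INTO search_hits
      SELECT id FROM persons
      WHERE name LIKE CONCAT('%', kw, '%');
    SET i = i + 1;
  END WHILE;

  -- Combined, deduplicated result set back to the caller.
  SELECT p.* FROM persons p JOIN search_hits h ON h.person_id = p.id;
END //
DELIMITER ;
```

In a real procedure there would be one such INSERT per searchable table (persons, addresses, telephones), all feeding the same temporary table.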

MySQL - How can I select rows based on a binary compare of part of a blob

I have a table with a VARBINARY column.
I would like to select all rows where the first byte of that data is 0x0b.
Is there a way to write a query that selects based on a comparison of the first byte?
Maybe using a LIKE?
Thanks
Not in MySQL (which I'm assuming you are using based on the tags).
If you have the ability to create tables, I would recommend creating a sharding table that separates the bits you need to query against into individual TINYINT columns. You would need to handle the sharding outside of the DBMS, since triggers won't be able to help, but having a sharding table would certainly make these types of queries much faster.
If not, then you are going to have to query for a greater-than/less-than range and then cycle through the results with bitwise operations.
Both of these solutions assume you have access to the requesting system. If you are strictly the DBA, then no.

Multiple tables for similar data in MySQL

I am writing a server using Netty and MySQL(with JDBC connector/J).
Please note that I am very new to server programming.
Say I have an application where users enter about 20 pieces of information about themselves.
And I need to write some methods that require only specific data from that information.
Instead of using "select dataOne, dataTwo from tableOne where userNum=1~1000",
create a new table called tableTwo containing only dataOne and dataTwo,
then use "select * from tableTwo where userNum=1~1000".
Is it good practice to make a table like this for every method I need?
If not, what can be a better practice?
You should not replicate data.
SQL is designed so that you specify the exact columns you want after the SELECT keyword.
There is no overhead to selecting specific columns; this is what SQL is designed for.
There is overhead to replicating your data and storing it in two different tables.
Consequences of using such a design:
In a world where we only used select *, we would need a different table for each combination of columns we wanted in the results.
Consequently, we would store the same data repeatedly. If you needed 10 different column combinations, that would be 10x your data.
Finally, data-manipulation statements (UPDATE, INSERT) would need to update the same data in multiple tables, also multiplying the time needed for basic operations.
Such a design simply does not scale.
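In other words, the normalized version of the example from the question is simply a column list (interpreting userNum=1~1000 as a range):

```sql
-- One table; each method selects only the columns it needs.
-- Only the named columns are read and sent over the wire.
SELECT dataOne, dataTwo
FROM tableOne
WHERE userNum BETWEEN 1 AND 1000;
```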

Mapping records between databases using identification numbering system

I have two databases, a MySQL database and an SQLite database, which synchronize back and forth to maintain the same data. To prevent duplicates on either side, I was thinking of having an identification numbering system for records, but I'm not sure how to go about that.
I need to somehow create a unique ID for records on both databases, for example:
mySQL ===> data = 1, 5 id=???
sqLITE===> data = 1, 5 id=???
I need the IDs to be the same, so that when I synchronize, records that already exist are not transferred to the other database.
Another idea is to create a hash across two columns of the table: if the same data exists on the other server, that record is not synchronized.
Using a column of the database table as a unique identifier is not suitable in my case.
I'm really not sure how to go about this, so any help will be great, thanks!
I understand the question in the way that you need to somehow identify if two rows in two different SQL databases are the same, either because they were independently created or because of an earlier sync.
I think your idea with a hash value is fine; it should do the trick. However, you could also just concatenate the column values into a string and get the same result, perhaps with a dash in between in case you have several data columns that would otherwise become ambiguous ("12-2" and "1-12" are then different).
But you still need to send over the generated hash values or concatenated strings of all rows in order to sync. Maybe it makes sense to track rows that are already synced? But then you may need to untrack them when row data values are updated.
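As a sketch of the fingerprint idea (table and column names are placeholders): on the MySQL side the per-row hash can be computed in the query itself. SQLite has no built-in MD5, so the same value would have to be computed in application code there, using the identical concatenation rule.

```sql
-- Deterministic per-row fingerprint built from the data columns.
-- CONCAT_WS inserts the '-' separator that keeps values like
-- (12, 2) and (1, 12) from producing the same string.
SELECT MD5(CONCAT_WS('-', col1, col2)) AS row_fingerprint
FROM my_table;
```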
I am not sure if this answer is helpful to you, because the question leaves many points open to speculation. Can I suggest to make it a bit more clear what you try to achieve?

Is there any way to make queries using functions in their WHERE sections relatively fast?

Let's take a table Companies with columns id, name, and UCId. I'm trying to find companies whose UCId's numeric portion matches some string of digits.
The UCIds usually look like XY123456, but they are user input, and the users seem to love leaving random spaces in them, and sometimes not entering the XY at all; and they want to keep it that way. In other words, I can't enforce a standard pattern: they want to enter it their way, and read it their way as well. So I'm stuck having to use functions in my WHERE clause.
Is there a way to keep such queries from taking unusably long in MySQL? I know which functions to use; I just need a way to make the search at least relatively fast. Can I somehow create a custom index with the functions already applied to the UCId?
Just for reference, an example of the query I'd like to use:
SELECT *
FROM Companies
WHERE digits_only(UCId) = 'some_digits'
I'll just add that the Companies table usually has tens of thousands of rows, and in some instances the query needs to be run repeatedly; that's why I need a fast solution.
Unfortunately, MySQL doesn't have function-based (generally speaking, expression-based) indexes like Oracle or PostgreSQL. One possible workaround is to add another column to the Companies table that holds the normalized values (i.e., digits_only(UCId)). This column can be maintained in your application code or via DB triggers on INSERT/UPDATE.
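A sketch of the trigger-maintained variant, reusing the digits_only function from the question (the shadow column and index names are made up). On MySQL 5.7+ an indexed generated column could replace the triggers entirely:

```sql
-- Shadow column holding the normalized value, plus an index on it.
ALTER TABLE Companies
  ADD COLUMN UCId_digits VARCHAR(32),
  ADD INDEX idx_ucid_digits (UCId_digits);

-- Keep the shadow column in sync on writes.
CREATE TRIGGER companies_bi BEFORE INSERT ON Companies
FOR EACH ROW SET NEW.UCId_digits = digits_only(NEW.UCId);

CREATE TRIGGER companies_bu BEFORE UPDATE ON Companies
FOR EACH ROW SET NEW.UCId_digits = digits_only(NEW.UCId);

-- The search then becomes an ordinary indexed lookup:
SELECT * FROM Companies WHERE UCId_digits = 'some_digits';
```

Existing rows would need a one-time backfill (UPDATE Companies SET UCId_digits = digits_only(UCId)) before the index is useful.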