How do I store orders? - mysql

I have an app with tasks that the user can reorder, and I am wondering how best to store the order. Should I have a column for the order number and recalculate all of them every time I change one? Please suggest an approach that doesn't require updating every order number, since that is expensive in terms of execution time.
This is especially bad if I take the task at the very top of the list and drag it down to the bottom:
Name (ordernumber)
--
1Example (1)
2Example (2)
3Example (3)
4Example (4)
5Example (5)
--
2Example (1) *
3Example (2) *
4Example (3) *
5Example (4) *
1Example (5) *
* has to be changed in the database
Also, some tasks may get deleted once they are done.

You can store the order keys as strings and sort them lexically:
1. A
2. Z
Add a task:
1. A
3. L
2. Z
Add more:
1. A
4. B
3. L
2. Z
Move 2 between 1 and 4:
1. A
2. AL
4. B
3. L
etc.
You update only one record at a time: take the middle letter at the first position where the two neighbouring keys differ. To insert between A and C, use B; to insert between ALGJ and ALILFG, use ALH.
If the letters at that position are adjacent, keep the lower key's letter and treat the upper bound as a letter just past Z. I.e. to insert between ABHDFG and ACSDF, treat it as inserting between ABH and AB(Z+) and pick a letter near the middle of that range (about letter 35/2), for example ABP.
If you run out of string length, you can always perform a full reorder.
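The rule above can be sketched in a few lines of Python. This is only one possible reading of it, assuming keys use the letters A to Z only; `key_between` and `ALPHABET` are names made up for this sketch. The exact letter it picks can differ from the hand-worked examples (any key that sorts strictly between the two neighbours is valid).

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def key_between(lo: str, hi: str) -> str:
    """Return a key that sorts strictly between lo and hi.

    An empty hi means "no upper neighbour" (insert at the end).
    A missing letter in lo is treated like 'A', so generated keys
    never end in a letter that would leave no room below them.
    """
    result = []
    hi_open = hi == ""  # True once the upper bound no longer constrains us
    i = 0
    while True:
        a = ALPHABET.index(lo[i]) if i < len(lo) else 0
        if hi_open:
            b = len(ALPHABET)  # a virtual letter just past 'Z'
        elif i < len(hi):
            b = ALPHABET.index(hi[i])
        else:
            raise ValueError("no room between these keys; do a full reorder")
        if b - a > 1:
            # There is a free letter between the two: take the middle one.
            result.append(ALPHABET[(a + b) // 2])
            return "".join(result)
        if b < a:
            raise ValueError("lo must sort before hi")
        # Letters are equal or adjacent: keep lo's letter and look further right.
        result.append(ALPHABET[a])
        if b - a == 1:
            hi_open = True  # we are already below hi, so only lo matters now
        i += 1
```

With this sketch, `key_between("A", "C")` gives `"B"` and `key_between("ALGJ", "ALILFG")` gives `"ALH"`, matching the examples above; only the moved row's key has to be written back.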
Update:
You can also keep your data as a linked list.
See the article in my blog on how to do it in MySQL:
Sorting Lists
In a nutshell:
/* This just returns all records in no particular order */
SELECT *
FROM t_list
id parent
------- --------
1 0
2 3
3 4
4 1
/* This returns all records in intended order */
SELECT  @r AS _current,
        @r := (
        SELECT  id
        FROM    t_list
        WHERE   parent = _current
        )
FROM    (
        SELECT  @r := 0
        ) vars,
        t_list
_current id
------- --------
0 1
1 4
4 3
3 2
When moving the items, you'll need to update at most 4 rows.
This seems to be the most efficient way to keep an ordered list that is updated frequently.
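To make the linked-list idea concrete outside of SQL, here is a rough Python sketch that works on an in-memory copy of the (id, parent) pairs from the example. The names `ordered_ids`, `move_after` and the `parent_of` dict are invented for illustration; in the database each assignment to `parent_of[...]` would correspond to one single-row UPDATE.

```python
def ordered_ids(parent_of):
    """Walk the list from the head (the row whose parent is 0)."""
    child_of = {parent: item for item, parent in parent_of.items()}
    order, current = [], 0
    while current in child_of:
        current = child_of[current]
        order.append(current)
    return order


def move_after(parent_of, item, new_parent):
    """Move `item` so that it follows `new_parent` (0 means the head).

    Returns the set of ids whose parent pointer changed.
    """
    changed = set()
    child_of = {parent: i for i, parent in parent_of.items()}
    # Unlink: the old successor of `item` now follows item's old parent.
    old_next = child_of.get(item)
    if old_next is not None:
        parent_of[old_next] = parent_of[item]
        changed.add(old_next)
    # Relink: whatever followed `new_parent` now follows `item`.
    child_of = {parent: i for i, parent in parent_of.items() if i != item}
    new_next = child_of.get(new_parent)
    if new_next is not None:
        parent_of[new_next] = item
        changed.add(new_next)
    parent_of[item] = new_parent
    changed.add(item)
    return changed


# The rows from the example above: id -> parent
t_list = {1: 0, 2: 3, 3: 4, 4: 1}
initial_order = ordered_ids(t_list)    # [1, 4, 3, 2]
touched = move_after(t_list, 1, 2)     # drag the first item to the end
final_order = ordered_ids(t_list)      # [4, 3, 2, 1]
```

In this run only two pointers change (items 4 and 1); a move into the middle of the list touches three in this sketch.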

Normally I'll add an int or smallint column named something like 'Ordinal' or 'PositionOrdinal' as you suggest, and with the exact caveat you mention — the need to update a potentially significant number of records every time a single record is re-ordered.
The benefit is that given a key for a specific task and a new position for that task, the code to move an item is just two statements:
UPDATE `Tasks` SET Ordinal = Ordinal + 1 WHERE Ordinal >= @NewPosition;
UPDATE `Tasks` SET Ordinal = @NewPosition WHERE TaskID = @TaskID;
There are other suggestions for a doubly linked list or lexical order. Either can be faster, but at the cost of much more complicated code, and the performance will only matter when you have a lot of items in the same group.
Whether performance or code complexity is more important will depend on your situation. If you have millions of records, the extra complexity might be worth it. However, I normally prefer the simpler code because users normally only order small lists by hand. If there aren't all that many items in the list, the extra updates won't matter. This can typically handle thousands of records without any noticeable impact on performance.
The one thing to keep in mind with your updated example is that the column is only used for sorting and not otherwise shown directly to the user. Thus, when dragging an item from the top to the bottom as shown the only thing you need to change is that one record. It doesn't matter that you'll leave the first position empty. This means there is a small potential to overflow your integer sort with enough re-ordering, but let me say again: users normally only order small lists by hand. I've never heard of this risk actually causing a problem.
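As a quick way to try the two statements, here is a small sketch that runs them against an in-memory SQLite database from Python. SQLite stands in for MySQL here, and the table layout, the helper names (`move_task`, `move_to_end`, `task_order`) and the sample rows are assumptions made for the demo.

```python
import sqlite3


def move_task(conn, task_id, new_position):
    # Shift everything at or after the target position down by one...
    conn.execute(
        "UPDATE Tasks SET Ordinal = Ordinal + 1 WHERE Ordinal >= ?",
        (new_position,),
    )
    # ...then drop the moved task into the freed position.
    conn.execute(
        "UPDATE Tasks SET Ordinal = ? WHERE TaskID = ?",
        (new_position, task_id),
    )


def move_to_end(conn, task_id):
    # Dragging to the bottom only touches the moved row; gaps are harmless
    # because Ordinal is used for sorting only.
    (max_ordinal,) = conn.execute("SELECT MAX(Ordinal) FROM Tasks").fetchone()
    conn.execute(
        "UPDATE Tasks SET Ordinal = ? WHERE TaskID = ?",
        (max_ordinal + 1, task_id),
    )


def task_order(conn):
    return [row[0] for row in conn.execute("SELECT Name FROM Tasks ORDER BY Ordinal")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tasks (TaskID INTEGER PRIMARY KEY, Name TEXT, Ordinal INTEGER)")
conn.executemany(
    "INSERT INTO Tasks VALUES (?, ?, ?)",
    [(i, f"{i}Example", i) for i in range(1, 6)],
)

move_task(conn, 5, 2)      # 5Example moves to the second position
after_move = task_order(conn)
move_to_end(conn, 1)       # 1Example is dragged from the top to the bottom
after_drag = task_order(conn)
```

After the first call the order is 1, 5, 2, 3, 4; after the second it is 5, 2, 3, 4, 1, with position 1 simply left empty.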

Out of your answers I came up with a mixture which goes as follows:
Say we have:
1Example (1)
2Example (2)
3Example (3)
4Example (4)
5Example (5)
Now if I sort something between 4 and 5 it would look like this:
2Example (2)
3Example (3)
4Example (4)
1Example (4.5)
5Example (5)
Now if I again sort something between 1 and 5:
3Example (3)
4Example (4)
1Example (4.5)
2Example (4.75)
5Example (5)
The new number is always the midpoint of its two neighbours (the lower number plus half the difference).
I hope that works; please do correct me ;)
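One thing worth checking with this scheme is how often you can halve the same gap before the numbers stop being distinguishable. The Python sketch below assumes the order number is stored as a double-precision float; `rank_between` and `inserts_until_exhausted` are illustrative names, not anything from the answers above.

```python
def rank_between(before: float, after: float) -> float:
    """Order number for an item dropped between two neighbours."""
    return (before + after) / 2


def inserts_until_exhausted(low: float, high: float) -> int:
    """Count how many times an item can be inserted just below `high`
    before the midpoint is no longer strictly between its neighbours."""
    count = 0
    while True:
        mid = rank_between(low, high)
        if mid <= low or mid >= high:
            return count
        low = mid
        count += 1


first = rank_between(4, 5)        # 4.5, as in the example
second = rank_between(4.5, 5)     # 4.75
capacity = inserts_until_exhausted(4.0, 5.0)
```

With doubles, `capacity` comes out at a few dozen inserts into the same gap, so the scheme works for hand-sorted lists but still needs an occasional renumbering as a fallback.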

We do it with a Sequence column in the database.
We use sparse numbering (e.g. 10, 20, 30, ...), so we can "insert" one between existing values. If the adjacent rows have consecutive numbers we renumber the minimum number of rows we can.
You could probably also use decimal numbers: take the average of the Sequence values of the rows adjacent to the insertion point; then you only have to update the row being "moved".

This is not an easy problem. If you have a low number of sortable elements, I would just reset all of them to their new order.
Otherwise, it seems it would take just as much work or more to "test-and-set" to modify only the records that have changed.
You could delegate this work to the client side: have the client keep the old and the new sort order, determine which rows need a new sort-order, and pass those (row, sort-order) tuples to the PHP-MySQL interface.
You could enhance this method in the following way (doesn't require floats):
If all sortable elements in a list are initialized to a sort-order according to their position in the list, multiply the sort-order of every element by a constant: row[sort-order] = row[sort-order] * K, where K is some number greater than the average number of times you expect the list to be reordered. This is O(N), where N is the number of elements, but it leaves K - 1 open slots between each existing pair of elements.
Then, if you want to insert an element between two others, it's as simple as changing its sort-order to a value that is greater than the lower element's and less than the upper element's. If there is no "room" between the elements, you can simply reapply the "spread" algorithm presented in the previous paragraph. The larger K is, the less often it will be applied.
The K algorithm would be selectively applied in the PHP script while the choosing of the new sort-order's would be done by the client (Javascript, perhaps).
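A minimal Python sketch of the "spread" idea, assuming integer sort-orders held in a dict keyed by row id. The names `slot_between` and `move_between` and the default `k=10` are made up for the example; in the real setup the spread step would be an UPDATE over all rows of the list.

```python
def slot_between(lower, upper):
    """Return an integer strictly between lower and upper, or None if no room."""
    if upper - lower < 2:
        return None
    return (lower + upper) // 2


def move_between(orders, item, lower_item, upper_item, k=10):
    """Give `item` a sort-order between two neighbours, spreading first if needed."""
    slot = slot_between(orders[lower_item], orders[upper_item])
    if slot is None:
        # No room: multiply every sort-order by k (the O(N) spread step).
        for key in orders:
            orders[key] *= k
        slot = slot_between(orders[lower_item], orders[upper_item])
    orders[item] = slot
    return slot


orders = {"a": 1, "b": 2, "c": 3}
move_between(orders, "c", "a", "b")   # no room, so spread to 10/20/30 first
order_1 = sorted(orders, key=orders.get)
move_between(orders, "b", "a", "c")   # room exists now, only "b" changes
order_2 = sorted(orders, key=orders.get)
```

The first move triggers one spread and ends with the order a, c, b; the second move finds a free slot and changes a single value.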

I'd recommend having an order column in the database. When an object is reordered, swap its order value with that of the object currently occupying the target position; that way you don't have to reorder the entire set of rows.
Hope that makes sense... Of course, this depends on your rules for re-ordering.

Related

BO Webi - Need Variable to Remove Nulls From Results

I'm looking to create a variable that will take the numeric value for a dimension. I tried removing Nulls in my query details, but that won't work because some results only have a Null value (see screenshot) and I was losing results that way.
I also need the variable for use in a cross tab table so I can do a count of each acuity level. I tried creating a Max variable on the acuity field =Max([Acuity Level]). That works for the main tab, but it doesn't work in a cross tab table. Please see attached screenshots for more details.
Acuity Crosstab
Column: Acuity Level
Row: Tracking Date
=FormatDate([Start Tracking Date & Time];"MM/dd/yyyy")
Body: # of Patients
=Count([Financial Number])
First off I created a query with your test data so I could drop into a free-hand SQL query so I have an example with which I can work. I added Row Number to maintain the row order of your data.
My approach requires three variables. A different approach requiring fewer variables may be possible, or the formulas in the variables could be consolidated. However, I like to keep them separated for better understanding of the logical progression and better maintainability.
Var Acuity Level Adjusted gets set to -1 if the Acuity Level is Null and otherwise leave it as is just to make it easier to deal with...
=If(IsNull([Acuity Level]); -1; [Acuity Level])
Var Max Acuity Level is the greatest value of Var Acuity Level Adjusted within each combination of Patient Name and Encounter Type. This is called a calculation context. I do not understand the nuances of this topic well enough to explain why what I have below works, but it does. I refer to that previous link a lot. Also, this is why it was important that I picked -1 to replace Null.
=Max([Var Acuity Level Adjusted]) In ([Patient Name]; [Encounter Type])
Var Max Filter flags the row where the first two variables are equal. This variable is necessary because you cannot filter based on one object relative to another object.
=If([Var Acuity Level Adjusted] = [Var Max Acuity Level]; 1; 0)
Now if I add those variables it looks like this...
Then we can add a filter to only show the records where Var Max Filter = 1. You can hide the extra columns or even delete them from the table.
Hope you can apply this to your situation.

How do I replace values in a column in KNIME?

I have a column of countries with 50 different values that I want to reduce to United States and Other.
Can someone help me with that?
Another example is Age which has 48 values that I'd like to reduce to only 4 like 1 to 18 = youth, 18-27 = starting, etc.
I've actually got about 5 columns that I want to reduce the values of. So would I need to repeat the process multiple times in KNIME or can I accomplish multiple column value replacements at once?
The latter one can easily be achieved with the Rule Engine:
$Col0$ >= 1 AND $Col0$ < 18 => "youth"
For the first problem I'd use a String Replace (Dictionary).
I don't think you can replace all columns at once, but you can loop over columns.
For the second case I would use Numeric Binner:
For each column a number of intervals - known as bins - can be
defined. Each of these bins is given a unique name (for this column),
a defined range, and open or closed interval borders. They
automatically ensure that the ranges are defined in ascending order
and that interval borders are consistent. In addition, each column is
either replaced with the binned, string-type column, or a new binned,
string-type column is appended.

SQL - How to select/delete rows with the max value depending on two others

I'm an intern at a property rental company, and I'm in charge of developing the CRM under Symfony.
Some of my entities are properties (houses) and their availabilities. See the table structure below.
The problem I'm facing right now is that the availabilities were defined for each day (e.g. 28/01, 29/01, 30/01) instead of for a range of days (e.g. 28/01 -> 30/01). So the table is really heavy (~710 000 rows). Furthermore, before we changed the way an availability is edited, each edit created a new row for the same date instead of updating the existing one. So there are a lot of duplicates in this table.
What I want is to lighten the DB by keeping only the rows which have the max value in date_modif_availabilities for the same date_availabilities and id_properties.
For example, if I have these rows (availabilities_duplications):
I only want to keep the row with the latest modification, like this (availabilities_keep_max_value):
The thing is, I don't know SQL well enough. I'm able to write a few basic scripts but not complex subqueries, even with the code samples I found.
Thank you in advance for your help.
You could select the elements for which no element with greater modified date exists.
Something like this:
SELECT avl1.*
FROM   availabilities avl1
WHERE  NOT EXISTS (SELECT *
                   FROM   availabilities avl2
                   WHERE  avl1.date_availabilities = avl2.date_availabilities
                   AND    avl1.id_properties = avl2.id_properties
                   AND    avl2.date_modif_availabilities > avl1.date_modif_availabilities);
This of course has the pre-condition that the combination of the three columns date_availabilities, id_properties and date_modif_availabilities is unique.
Furthermore, it seems that all columns (except the PK) may be NULL. Looks kinda odd to me.
You can use a subquery:
select t.*
from availabilities t
where t.id_availabilities = (select t1.id_availabilities
                             from availabilities t1
                             where t1.id_properties = t.id_properties and
                                   t1.date_availabilities = t.date_availabilities
                             order by t1.date_modif_availabilities desc
                             limit 1
                            );
However, if you are concerned about performance, you will want an index on (id_properties, date_availabilities, id_availabilities).
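Since the goal is to delete the outdated rows rather than just select the current ones, here is a hedged sketch of the delete step, run against an in-memory SQLite database from Python. The column names follow the question; the sample rows and dates are invented, and SQLite is only a stand-in. MySQL generally refuses a subquery on the table being deleted from, so there the same condition would typically have to be written as a self-join DELETE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE availabilities (
           id_availabilities         INTEGER PRIMARY KEY,
           id_properties             INTEGER,
           date_availabilities       TEXT,
           date_modif_availabilities TEXT
       )"""
)
# Invented sample data: rows 1/2 and rows 4/5 are duplicates of each other.
conn.executemany(
    "INSERT INTO availabilities VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2019-01-28", "2019-01-01 09:00:00"),
        (2, 10, "2019-01-28", "2019-01-05 14:30:00"),
        (3, 10, "2019-01-29", "2019-01-02 08:00:00"),
        (4, 11, "2019-01-28", "2019-01-03 10:00:00"),
        (5, 11, "2019-01-28", "2019-01-04 11:00:00"),
    ],
)

# Delete every row for which a newer row exists for the same property and date.
conn.execute(
    """DELETE FROM availabilities
       WHERE EXISTS (
           SELECT 1
           FROM   availabilities AS newer
           WHERE  newer.id_properties = availabilities.id_properties
           AND    newer.date_availabilities = availabilities.date_availabilities
           AND    newer.date_modif_availabilities > availabilities.date_modif_availabilities
       )"""
)

kept = [
    row[0]
    for row in conn.execute(
        "SELECT id_availabilities FROM availabilities ORDER BY id_availabilities"
    )
]
```

On the sample data this keeps rows 2, 3 and 5. On the real table it would be safer to run the SELECT version first and to take a backup before deleting anything.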

Sorting/Ordering sequenced pairs of data in MySQL?

I am trying to determine if there's a way to sort rows of a MySQL table that consists of start/finish columns. (Could also be thought of as parent/child relations or other linked list arrangement)
Here's an example of how the data is currently stored:
id start finish
2 stepthree stepfour
6 stepfive stepsix
9 stepone steptwo
78 stepfour stepfive
121 steptwo stepthree
(The id numbers in this are not relevant, just using them to indicate additional columns of arbitrary data)
I want to sort/display these rows in order, always starting with "stepone" and following the start -> finish chain, so that each "finish" is followed by the row that has it as its "start".
desired output
9 stepone steptwo
121 steptwo stepthree
2 stepthree stepfour
78 stepfour stepfive
6 stepfive stepsix
There shouldn't be any branching/splits normally, just a sequential series of steps or states. I can't use simple alpha sorting (in my case the start and finish values are codes created by a customer), but can't figure out any other way to order these using SQL. I could programmatically do it using most languages, but stumped about doing it just with SQL.
Any clever ideas?
I would recommend having another table that has each step mapped to its precedence order.
Then you can write a query to sort each row in the order of precedence of the start step.
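As an illustration of what such a precedence mapping could contain, here is a small Python sketch that derives it by walking the chain once and then sorts the rows by it. It uses the sample rows from the question; the function name `precedence` and the assumption that the chain starts at "stepone" and never branches come from the question's description, not from a tested schema. The resulting step-to-position pairs are what one would store in the extra table and join against.

```python
rows = [
    (2, "stepthree", "stepfour"),
    (6, "stepfive", "stepsix"),
    (9, "stepone", "steptwo"),
    (78, "stepfour", "stepfive"),
    (121, "steptwo", "stepthree"),
]


def precedence(rows, first="stepone"):
    """Map each start step to its position in the chain beginning at `first`."""
    next_of = {start: finish for _, start, finish in rows}
    order, step = {}, first
    # Stop at the end of the chain, or if a step repeats (a cycle).
    while step in next_of and step not in order:
        order[step] = len(order)
        step = next_of[step]
    return order


rank = precedence(rows)
ordered = sorted(rows, key=lambda row: rank[row[1]])
ordered_ids = [row[0] for row in ordered]   # [9, 121, 2, 78, 6]
```

The mapping only needs to be rebuilt when the customer changes the chain, not on every query.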

Accurate pagination by datetime field

I have a database table, for example 'items'. I show a timeline of these items, sorted by the field ascended_at (datetime), and I need to build a pagination API for that timeline. My first version was:
HTTP GET /items/timeline?page=[PAGE_NUM]
which fires
SELECT * FROM items ORDER BY ascended_at LIMIT 10 OFFSET [0, 10, 20, ...];
But here is the problem: when a new item arrives, all pages shift by one item. To avoid this, I added a from_asc_at parameter:
HTTP GET /items/timeline?page=[PAGE_NUM]&from_asc_at=123123123
which fires
SELECT * FROM items WHERE ascended_at <= [asc_at_parameter] ORDER BY ascended_at LIMIT 10 OFFSET [0, 10, 20, ...];
But this is not accurate either: two items can have the same ascended_at, so the same item can show up on two different pages (which it should not).
So, my question is: what are the possible solutions for this?
Use the ID (because it is unique)? But what if the timeline is not ordered by ID?
Any other ideas?
If your items IDs are auto-incremented, you could check what will be the next "autoincrement" value when retrieving items the first time (before pagination).
Store that value persistently (maybe in a session var) until the next search, and add a filter < {maximumID} to your SQL query, to improve the "result set stability" when the user paginates (all new items created between the initial search and paginations won't be retrieved).
EDIT
To handle item deletions, you will have to do "soft deletes": do not immediately delete an item from the DB, but store a deletion date in a datetime field, so that items still exist in the DB for a while.
When a new search is issued, store the current server time in the session and add a criterion (for example date_deleted IS NULL OR date_deleted > {searchDate}), so that all the items deleted after a search will still be displayed for that specific search.
You will have to create a scheduled job to "really" delete items from DB after some delay.
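The ID-snapshot part of this can be tried out with a small sketch against an in-memory SQLite database from Python. Several things here are assumptions for the demo: SQLite instead of MySQL, `MAX(id) + 1` as a stand-in for reading the next auto-increment value, the helper name `fetch_page`, a page size of 2, and `id` added as a tie-breaker to the ORDER BY so that rows with equal ascended_at keep a fixed order between requests.

```python
import sqlite3

PAGE_SIZE = 2  # small page size to keep the demo readable


def fetch_page(conn, page, next_id):
    """Return the ids on a page, ignoring items created after the snapshot."""
    return [
        row[0]
        for row in conn.execute(
            "SELECT id FROM items WHERE id < ? "
            "ORDER BY ascended_at, id LIMIT ? OFFSET ?",
            (next_id, PAGE_SIZE, page * PAGE_SIZE),
        )
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, ascended_at INTEGER)"
)
conn.executemany(
    "INSERT INTO items (ascended_at) VALUES (?)",
    [(100,), (100,), (200,), (300,)],
)

# Taken once, when the user starts browsing, and kept (e.g. in the session).
next_id = conn.execute("SELECT COALESCE(MAX(id), 0) + 1 FROM items").fetchone()[0]

first_page = fetch_page(conn, 0, next_id)

# A new item arrives that would sort before everything else...
conn.execute("INSERT INTO items (ascended_at) VALUES (50)")

# ...but the second page is unaffected because the new id is filtered out.
second_page = fetch_page(conn, 1, next_id)
```

Here the first page is ids 1 and 2 and the second page is ids 3 and 4, even though a row was inserted between the two requests.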