How to add a dynamic range to a database (store the ranges in a table) - MySQL

Table (CostTitle)
Id  costTitle
1   A
2   B
3   C
4   D
5   E
6   F
A refers to numbers between 0-99
B refers to numbers between 100-199
C refers to numbers between 200-299
D refers to numbers between 300-399
E refers to numbers between 400-499
F refers to numbers between 500-599
costCode will be based on costTitle's number range.
Table (CostCode)
Id  costTitle  costCode  costProductTitle
1   A          12        productX
2   B          111       productY
3   B          142       productZ
4   C          201       productK
5   F          511       productL
6   F          582       productM
I am trying to add a product and assign its cost code dynamically.
Thanks in advance.

I suppose you want to store the ranges in a table. In that case you need a BEFORE INSERT trigger, which sets NEW.costTitle. Triggers are explained here:
http://dev.mysql.com/doc/refman/5.6/en/create-trigger.html
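A minimal sketch of such a trigger, assuming CostTitle is extended with rangeStart and rangeEnd columns that hold each title's bounds (those two column names are assumptions):

DELIMITER //
CREATE TRIGGER costcode_before_insert
BEFORE INSERT ON CostCode
FOR EACH ROW
BEGIN
  -- Look up the title whose range contains the new cost code
  SET NEW.costTitle = (SELECT costTitle
                       FROM CostTitle
                       WHERE NEW.costCode BETWEEN rangeStart AND rangeEnd
                       LIMIT 1);
END//
DELIMITER ;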
MariaDB offers an alternative: virtual (computed) columns. However, because of the limits of this feature, you cannot read the ranges from a different table. You will need to hardcode the ranges in the virtual column definition, which doesn't seem like a great idea to me (but you decide, of course):
https://mariadb.com/kb/en/mariadb/virtual-computed-columns/
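A minimal sketch of what that hardcoding would look like (MariaDB syntax, with the table layout assumed from the example above):

CREATE TABLE CostCode (
  Id INT AUTO_INCREMENT PRIMARY KEY,
  costCode INT NOT NULL,
  costProductTitle VARCHAR(100),
  -- costTitle is computed from the hardcoded ranges
  costTitle CHAR(1) AS (
    CASE
      WHEN costCode BETWEEN   0 AND  99 THEN 'A'
      WHEN costCode BETWEEN 100 AND 199 THEN 'B'
      WHEN costCode BETWEEN 200 AND 299 THEN 'C'
      WHEN costCode BETWEEN 300 AND 399 THEN 'D'
      WHEN costCode BETWEEN 400 AND 499 THEN 'E'
      WHEN costCode BETWEEN 500 AND 599 THEN 'F'
    END
  ) VIRTUAL
);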

Related

How to perform a many-to-many or (at least) an outer join in SPSS

Usually I use R for my data analysis, but these days I have to use SPSS. I was expecting that data manipulation might get a little more difficult this way, but after my first day I kind of surrender :D and I would really appreciate some help ...
My problem is the following:
I have two data sets, each with an ID number. Neither data set has unique IDs (in the one data set that should have unique IDs, there is a kind of duplicated row).
In a perfect world I would like to keep this duplicated row and simply perform a many-to-many join. But I have accepted that I might have to delete this "bad" row (in dataset A) and perform a 1:many join (join dataset B to dataset A, which contains the unique IDs).
If I run the join (and accept that it seems only possible to run a many:1 join, not a 1:many join), I have the problem that I lose IDs. If I join dataset A to dataset B, I lose all cases that are not part of dataset B. But I would really like to have both IDs, like in a full join or something.
Do you know if there is a (kind of) simple solution to my problem?
Example:
dataset A:
ID  VAL1
1   A
1   B
2   D
3   K
4   A
dataset B:
ID  VAL2
1   g
2   k
4   a
5   c
5   d
5   a
2   x
expected result (best solution):
ID  VAL1  VAL2
1   A     g
1   B     g
2   D     k
3   K     NA
4   A     a
2   D     x
expected result (second best solution):
ID  VAL1  VAL2
1   A     g
2   D     k
3   K     NA
4   A     a
5   NA    c
5   NA    d
5   NA    a
2   D     x
what I get (worst solution):
ID  VAL1  VAL2
1   A     g
2   D     k
4   A     a
5   NA    c
5   NA    d
5   NA    a
2   D     x
From your example it looks like what you need is a full many-to-many join, based on the IDs existing in dataset A. You can get this by creating a full Cartesian product of the two datasets, using dataset A as the first (left) dataset.
The following syntax assumes you have the STATS CARTPROD extension command installed. If you don't, you can see here about installing it.
First I'll recreate your example to demonstrate on:
dataset close all.
data list list/id1 vl1 (2F3) .
begin data
1 232
1 433
2 456
3 246
4 468
end data.
dataset name aaa.
data list list/id2 vl2 (2F3) .
begin data
1 111
2 222
4 333
5 444
5 555
5 666
2 777
3 888
end data.
dataset name bbb.
Now the actual work is fairly simple:
DATASET ACTIVATE aaa.
STATS CARTPROD VAR1=id1 vl1 INPUT2=bbb VAR2=id2 vl2
/SAVE OUTFILE="C:\somepath\yourcartesianproduct.sav".
* The new dataset now contains all possible combinations of rows in the two datasets.
* We will select only the relevant combinations, where the two IDs match.
select if id1=id2.
exe.

How to get the NA rate by column in Tableau Desktop?

I am trying to get a simple thing with Tableau: the % of null values per column of my dataset.
But each time I put a dimension on Columns it displays all possible values of that dimension, so it seems impossible to reproduce the Python equivalent of dataframe.isna().sum()/len(dataframe).
By default Tableau visualizes all possible values.
Assuming you have this kind of data:
ID  Category
1   A
2   B
3   C
4   D
5   E
6   F
7
8
9
10
You can get your Rate with a simple Calculated field:
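A minimal sketch of such a field, assuming the sample columns above (ID is never null here, so COUNT([ID]) serves as the total row count):

// NA rate: rows where Category is null, divided by all rows
SUM(IF ISNULL([Category]) THEN 1 ELSE 0 END) / COUNT([ID])

Drop this field on the view on its own (without the dimension on Columns) and format it as a percentage.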

Combine multiple rows in Access - 1 field

I have data in multiple rows, and I need to combine the data in similar columns, separated by a semicolon, to end up with one row per ID.
I have:
ID  Type
1   A
1   B
1   C
2   D
3   A
3   F
I want the results to be:
1   A;B;C
2   D
3   A;F
I have limited knowledge of Access, but I know this should be basic and easy. I'd appreciate assistance.
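Access has no built-in string aggregation, so the usual approach is a small VBA helper that loops over the matching rows. A minimal sketch, assuming your table is named tblData with fields ID and Type (those names are assumptions):

' Returns the Type values for one ID, joined with semicolons.
Public Function ConcatTypes(lngID As Long) As String
    Dim rs As DAO.Recordset
    Dim strOut As String
    Set rs = CurrentDb.OpenRecordset( _
        "SELECT [Type] FROM tblData WHERE ID = " & lngID)
    Do While Not rs.EOF
        If Len(strOut) > 0 Then strOut = strOut & ";"
        strOut = strOut & rs.Fields("Type").Value
        rs.MoveNext
    Loop
    rs.Close
    ConcatTypes = strOut
End Function

You can then call it from a query:

SELECT DISTINCT ID, ConcatTypes(ID) AS Types FROM tblData;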

How to compare values stored in the same table with QlikView?

Being new to QlikView, I'm a little confused about what I should do in SQL and what Qlik provides out of the box.
Let's suppose I have a table similar to this:
id  Status   type  value  quantity  dat_s     Area
1   Activo   A     10     10        20171001  Norte
2   Activo   B     20     20        20171001  Norte
3   Activo   C     15     15        20171001  Sul
4   Fechado  A     5      5         20171101  Norte
5   Activo   B     20     20        20171101  Norte
6   Activo   D     5      5         20171101  Sul
7   Activo   D     5      5         20170901  Sul
I'd like to compare the table with itself, but only the rows from selected dates. Let's imagine date A = 20171001 and date B = 20171101 (these should be user-defined via an input field or whatever). The comparison I'd like to do is, for example:
Type  CountDateA  ValDateA  CountDateB  ValDateB  valuediff
A     1           100       1           25        -75
B     1           400       1           400       0
C     1           225       0           0         -225
D     0           0         1           25        25
or
Area   ValDateA  ValDateB  valuediff
Norte  500       425       -75
Sul    225       25        -200
I was planning to duplicate the table and use different field names for the same data, leaving half empty, but I hope there is a more elegant way.
Thanks all.
I just needed to load the table, and then the calculation for the columns would be:
Sum({< Status = {'Activo'}, dat_s = {20171001} >} quantity*value)
Still quite confused by your problem. QlikView's power relies (in a few words) on building charts or tables that are automatically updated depending on the selected filters. In your example, I guess, that filter would be the date (or dates) the user selects. Hence, you wouldn't need to define columns like ValDateA, ValDateB, etc. In your case, however, it seems that you want to compare EXACTLY two dates, so you could define those columns, each of them depending on a different date picker. This being said, I'll show you how I would approach your problem, although I'm not really sure whether I understood it well:
I assume you read your data correctly, so you have the first data table in memory (with the fields: id, Status, type, value, quantity, dat_s, Area). (Note: be careful and consistent with capital letters.)
Create a table chart with dimension "type" (which will auto-filter each row's expressions) and with these expressions:
1. Count(DISTINCT {< Status = {"Activo"}, dat_s = {"$(vDate1)"} >} id)  // how many rows in active state for date 1 (vDate1 is a variable assigned to the first date picker)
2. Sum({< Status = {"Activo"}, dat_s = {"$(vDate1)"} >} value*quantity)
3. Same as expression 1 but using $(vDate2)
4. Same as expression 2 but using $(vDate2)
5. In QlikView you can just write Column(4) - Column(2); in Qlik Sense you would need to write the whole expressions 2 and 4 again and subtract the sums.
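For reference, a sketch of that fifth column written out in full (as needed in Qlik Sense; field and variable names as above):

Sum({< Status = {"Activo"}, dat_s = {"$(vDate2)"} >} value*quantity)
-
Sum({< Status = {"Activo"}, dat_s = {"$(vDate1)"} >} value*quantity)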

Efficiently joining over interval ranges in SQL

Suppose I have two tables as follows (data taken from this SO post):
Table d1:
x  start  end
a  1      3
b  5      11
c  19     22
d  30     39
e  7      25
Table d2:
x  pos
a  2
a  3
b  3
b  12
c  20
d  52
e  10
The first row in both tables is the column header. I'd like to extract all the rows in d2 where column x matches d1 and pos falls within (boundary values included) d1's start and end columns. That is, I'd like this result:
x  pos  start  end
a  2    1      3
a  3    1      3
c  20   19     22
e  10   7      25
The way I've seen this done so far is:
SELECT * FROM d1 JOIN d2 USING (x) WHERE pos BETWEEN start AND end
But it is not clear to me whether this operation is as efficient as it can be (i.e., optimised internally). For example, computing the entire join first is not really a scalable approach IMHO (in terms of both speed and memory).
Are there any other efficient query optimisations (e.g. using interval trees) or other algorithms that can handle ranges efficiently (again, in terms of both speed and memory) in SQL that I can make use of? It doesn't matter if it's SQLite, PostgreSQL, MySQL, etc.
What is the most efficient way to perform this operation in SQL?
Thank you very much.
Not sure how it all works out internally, but depending on the situation I would advise playing around with a table that 'rolls out' all the values from d1 and then joining on that one. This way the query engine can pinpoint the right record 'exactly' instead of having to find a combination of boundaries that matches the value being looked for.
e.g.
x  value
a  1
a  2
a  3
b  5
b  6
b  7
b  8
b  9
b  10
b  11
c  19
etc...
Given an index on the value column (**), this should be quite a bit faster than joining with BETWEEN start AND end on the original d1 table, IMHO.
Of course, each time you make changes to d1 you'll need to adjust the rolled-out table too (a trigger?). If this happens frequently, you'll spend more time updating the rolled-out table than you gained in the first place! Additionally, this might take up quite a bit of (disk) space quickly if some of the intervals are really big; and it also assumes we never need to look for non-whole numbers (e.g. what if we look for the value 3.14?).
(**) You might consider experimenting with a unique index on (value, x) here...
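A minimal sketch of how such a rolled-out table could be built and queried, assuming integer bounds and a helper table nums holding consecutive integers 0..N (the nums and d1_rolled names are hypothetical):

-- Roll out every (x, start..end) interval into one row per contained integer
CREATE TABLE d1_rolled AS
SELECT d1.x, d1.start + nums.i AS value
FROM d1
JOIN nums ON nums.i <= d1.end - d1.start;

-- Per the footnote above: works when intervals for the same x don't overlap
CREATE UNIQUE INDEX ux_d1_rolled ON d1_rolled (value, x);

-- The range lookup is now a plain equality join
SELECT d2.x, d2.pos
FROM d2
JOIN d1_rolled r ON r.x = d2.x AND r.value = d2.pos;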