I have a table in Excel format like this:
SEE LARGE IMAGE HERE
This is a sten table containing factors (A, B, C, D, E, etc.).
Each factor is like a separate test.
Each factor, e.g. Factor A, has a [Raw Score] and a [Sten].
Suppose Factor A had a question like:
-----------------------------------
Why do humans have Eyes?
The answer options could be like:
a) To Watch movies = [Raw Score] -> 10,
b) To Read Novels = [Raw Score] -> 5,
c) To close them while sleeping = [Raw Score] ->0
So if they chose a), the system will go to the sten table to get the sten equivalent under Factor A; in this case, the sten equivalent will be 4. (See Factor A -> Raw Score 10 -> Sten column.)
What could be the most practical way to have this STEN TABLE with Factors and their Raw Scores and Stens created?
Something like:
        **STEN TABLE**
              |
    **FACTORS (A,B,C,D...)**
         /            \
**[Raw Score]**    **[Sten]**
-----------------------
EDIT 1:
For a larger image, please click here: http://ctrlv.in/459785
Please note that the stens are not the same for all factors, even where the raw scores are the same. E.g. in Factor A, [Raw Score] 3 -> [Sten] 2, but in Factor C, [Raw Score] 3 -> [Sten] 1; and in Factor F, [Raw Score] 1 -> [Sten] 2, whereas in Factor E, [Raw Score] 1 -> [Sten] 1.
Any Suggestion is highly appreciated.
The table structure should be:
Questions
fields: id, factor, text
Answers
fields: id, question_id, text, raw_score
Sten-table(s)
fields: id, factor, raw_score, sten
The id is a unique auto-increment primary key identifying the row. When you get the answer to a question, you take the factor from the question and the raw_score from the answer and do a simple select on the Sten-table like
SELECT sten FROM sten_table WHERE factor = 'A' AND raw_score = 10
EDIT
The Sten-table would have all rows from your Excel sheet below each other:
id, factor, raw_score, sten
1, A, 1, 1
2, A, 2, 1
3, A, .....
...
22, B, 1, 1
23, B, 2, 1
...
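The three-table layout above can be tried out quickly. The following is a minimal sketch using Python's built-in sqlite3 as a stand-in for whatever database is actually used; the table names, the helper name sten_for_answer, and the choice of SQLite are assumptions for illustration. Only the raw score/sten pairs quoted in the question are loaded, not the whole Excel sheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE questions  (id INTEGER PRIMARY KEY, factor TEXT, text TEXT);
CREATE TABLE answers    (id INTEGER PRIMARY KEY, question_id INTEGER,
                         text TEXT, raw_score INTEGER);
CREATE TABLE sten_table (id INTEGER PRIMARY KEY, factor TEXT,
                         raw_score INTEGER, sten INTEGER,
                         UNIQUE (factor, raw_score));
""")

conn.execute("INSERT INTO questions VALUES (1, 'A', 'Why do humans have eyes?')")
conn.executemany(
    "INSERT INTO answers VALUES (?, ?, ?, ?)",
    [(1, 1, "To watch movies", 10),
     (2, 1, "To read novels", 5),
     (3, 1, "To close them while sleeping", 0)],
)
# Only the (factor, raw_score, sten) pairs mentioned in the question.
conn.executemany(
    "INSERT INTO sten_table (factor, raw_score, sten) VALUES (?, ?, ?)",
    [("A", 10, 4), ("A", 3, 2), ("C", 3, 1), ("F", 1, 2), ("E", 1, 1)],
)

def sten_for_answer(conn, answer_id):
    """Hypothetical helper: resolve a chosen answer to its sten value."""
    row = conn.execute(
        """
        SELECT s.sten
        FROM answers a
        JOIN questions q  ON q.id = a.question_id
        JOIN sten_table s ON s.factor = q.factor
                         AND s.raw_score = a.raw_score
        WHERE a.id = ?
        """,
        (answer_id,),
    ).fetchone()
    return row[0] if row else None  # None when the pair is not in the table

chosen_sten = sten_for_answer(conn, 1)  # answer a) for the Factor A question
print(chosen_sten)
```

The UNIQUE (factor, raw_score) constraint is a design choice added here so that each raw score maps to exactly one sten per factor.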
I have data in PostGIS with a value and a geometry. If the same value occurs again within, let's say, 10 m, I want to detect or remove those rows from my table. Here is a small example:
create table points (id serial primary key, val integer, label2 text);
select addGeometryColumn('points', 'geom', 1, 'point', 2);
insert into points (id, val, label2, geom) values
(1, 1, 'aaa', st_geomFromText('POINT(1 1)', 1)),
(2, 1, 'bbb', st_geomFromText('POINT(1 2)', 1)),
(3, 1, 'aaa', st_geomFromText('POINT(10 100)', 1)),
(4, 2, 'ccc', st_geomFromText('POINT(10 101)', 1));
Because rows 1 and 2 have the same value and are less than 10 m apart, only these rows should remain:
 id | val | label2 | geom
----+-----+--------+------
  3 |   1 | aaa    | xxx
  4 |   2 | ccc    | xxx
Do you know how to query that in PostGIS?
First, I would consider what the real requirements are. E.g. consider points A, B, C on a line, 8 meters apart, with equal values. Do you want that reduced to A and C, or to B? Both eliminate duplicates within 10 meters, but the results differ. What about A, B, C, D: would you like the result to be A, C, or B, D, or A, D, or maybe B, C? Defining specific criteria is not trivial, and they are sometimes hard to implement in SQL.
Or maybe you don't care, and just want to reduce point density? Then it is simpler: compute snapped = ST_SnapToGrid with an appropriate grid size, group by equal values of snapped and value, and choose an arbitrary point from each group. Note that this does not guarantee there are no close points (points with similar coordinates can snap to different grid cells), but it does remove most duplicates and it is computationally very cheap.
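To make the snap-and-group idea concrete, here is a small sketch in plain Python rather than PostGIS. The snap function only imitates the idea of rounding a coordinate to the nearest grid line and is not a reimplementation of ST_SnapToGrid; the function names, the 10-unit grid size, and the rule of keeping the lowest id per group are assumptions. The points are the four rows from the question.

```python
import math
from collections import defaultdict

def snap(coord, size):
    """Round a coordinate to the nearest multiple of the grid size."""
    return math.floor(coord / size + 0.5) * size

def dedupe_by_grid(points, size):
    """Group points by (snapped x, snapped y, val); keep one id per group.

    points is an iterable of (id, val, x, y). Keeping the smallest id
    stands in for "choose an arbitrary point from each group".
    """
    groups = defaultdict(list)
    for pid, val, x, y in points:
        groups[(snap(x, size), snap(y, size), val)].append(pid)
    return sorted(min(ids) for ids in groups.values())

# The four rows from the question: (id, val, x, y)
points = [
    (1, 1, 1.0, 1.0),
    (2, 1, 1.0, 2.0),
    (3, 1, 10.0, 100.0),
    (4, 2, 10.0, 101.0),
]

kept = dedupe_by_grid(points, 10.0)
print(kept)
```

Rows 1 and 2 land in the same cell with the same value, so only one of them survives. Rows 3 and 4 share a cell but have different values, so both are kept. Unlike the output requested in the question, this keeps one representative of each duplicate group instead of dropping the whole group.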
I have a database with four columns corresponding to the geographical coordinates x,y for the start and end position. The columns are:
x0
y0
x1
y1
I have an index for these four columns with the sequence x0, y0, x1, y1.
I have a list of about a hundred combination of geographical pairs. How would I go about querying this data efficiently?
I would like to do something like this as suggested on this SO answer but it only works for Oracle database, not MySQL:
SELECT * FROM my_table WHERE (x0, y0, x1, y1) IN ((4, 3, 5, 6), ... ,(9, 3, 2, 1));
I was thinking it might be possible to do something with the index? What would be the best approach (ie: fastest query)? Thanks for your help!
Notes:
I cannot change the schema of the database
I have about 100'000'000 rows
EDIT:
The code as-is was actually working, however it was extremely slow and did not take advantage of the index (as we have an older version of MySQL v5.6.27).
To make effective use of the index, we could rewrite the IN predicate
example
(x0, y0, x1, y1) IN ((4, 3, 5, 6),(9, 3, 2, 1))
Like this:
( ( x0 = 4 AND y0 = 3 AND x1 = 5 AND y1 = 6 )
OR ( x0 = 9 AND y0 = 3 AND x1 = 2 AND y1 = 1 )
)
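With about a hundred tuples, the OR-expanded predicate is tedious to write by hand, so it can be generated. The sketch below builds it with bound parameters and runs it against an in-memory SQLite table standing in for MySQL; the helper name build_or_predicate, the index name, and the extra row (1, 1, 1, 1) are assumptions for illustration. It shows only that the generated predicate selects the right rows, and says nothing about the plan MySQL 5.6 would pick.

```python
import sqlite3

def build_or_predicate(columns, tuples):
    """Expand (c1, c2, ...) IN (t1, t2, ...) into OR-ed AND groups.

    Returns the SQL fragment and the flat parameter list to bind.
    """
    if not tuples:
        raise ValueError("need at least one tuple to match")
    group = "(" + " AND ".join(f"{col} = ?" for col in columns) + ")"
    where = " OR ".join([group] * len(tuples))
    params = [value for tup in tuples for value in tup]
    return where, params

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (x0 INTEGER, y0 INTEGER, x1 INTEGER, y1 INTEGER)"
)
conn.execute("CREATE INDEX idx_coords ON my_table (x0, y0, x1, y1)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [(4, 3, 5, 6), (9, 3, 2, 1), (1, 1, 1, 1)],  # last row is filler data
)

wanted = [(4, 3, 5, 6), (9, 3, 2, 1)]
where, params = build_or_predicate(("x0", "y0", "x1", "y1"), wanted)
rows = conn.execute(
    f"SELECT x0, y0, x1, y1 FROM my_table WHERE {where} ORDER BY x0", params
).fetchall()
print(where)
print(rows)
```

Column names are interpolated into the SQL text, so they should come from a fixed list in the code, never from user input; the values go through placeholders.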
EDIT
The (a,b) IN ((7,43),(7,44),(8,1)) syntax has been supported in MySQL for many versions, but there were performance problems with it (at least with non-trivial sets) because of the suboptimal execution plan generated by the optimizer.
Newer versions of the MySQL optimizer fix this performance problem; they generate execution plans that make more effective use of available indexes.
Note a similar related problem with OR constructs. Here's an example query intended to get the "next page" of 20 rows ordered by columns seq and sub (unique tuple). The last fetched page (seq,sub)=(7,42)
With much older versions of MySQL, this syntax would not be accepted
WHERE (seq,sub) > (7,42)
ORDER BY seq, sub
LIMIT 20
And when MySQL did support the syntax, we would get an execution plan as if we had written
WHERE ( seq > 7 )
OR ( seq = 7 AND sub > 42 )
ORDER BY seq, sub
LIMIT 20
We would get a much more efficient execution plan if we instead wrote something subtly different:
WHERE ( seq >= 7 )
AND ( seq > 7 OR sub > 42 )
ORDER BY seq, sub
LIMIT 20
With this form, we'd expect the optimizer to use an available UNIQUE INDEX on (seq,sub) and return rows in index order from a range scan operation...
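The two rewritten predicates are logically equivalent to the row comparison (seq,sub) > (7,42); only the plan the optimizer builds for them may differ. A quick way to convince yourself of the equivalence is the sketch below, which runs both forms on an in-memory SQLite table (a stand-in for MySQL, so it shows nothing about MySQL's plans) and checks them against Python's own lexicographic tuple comparison. The table name t and the generated sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (seq INTEGER, sub INTEGER, UNIQUE (seq, sub))")

# Made-up sample data: seq 6..9, sub 40..45 (24 rows).
rows = [(seq, sub) for seq in range(6, 10) for sub in range(40, 46)]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

or_form = """
    SELECT seq, sub FROM t
    WHERE (seq > 7) OR (seq = 7 AND sub > 42)
    ORDER BY seq, sub LIMIT 20
"""
and_form = """
    SELECT seq, sub FROM t
    WHERE (seq >= 7) AND (seq > 7 OR sub > 42)
    ORDER BY seq, sub LIMIT 20
"""

page_or = conn.execute(or_form).fetchall()
page_and = conn.execute(and_form).fetchall()

# Python compares tuples lexicographically, like (seq, sub) > (7, 42).
expected = sorted(r for r in rows if r > (7, 42))[:20]

print(page_or == page_and == expected)
print(page_or[0])
```

The AND form puts a plain range condition on the leading index column (seq >= 7) at the top level of the WHERE clause, which is the part the answer above expects an optimizer to turn into a range scan.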
I do not understand your point. The following query is valid MySQL syntax:
SELECT *
FROM my_table
WHERE (x0, y0, x1, y1) IN ((4, 3, 5, 6), ... ,(9, 3, 2, 1));
I would expect MySQL to use the composite index that you have described. But if it doesn't, you could do:
SELECT *
FROM my_table
WHERE x0 = 4 AND y0 = 3 AND x1 = 5 AND y1 = 6
UNION ALL
. . .
SELECT *
FROM my_table
WHERE x0 = 9 AND y0 = 3 AND x1 = 2 AND y1 = 1
The equality comparisons in the WHERE clause will take advantage of an index.
MySQL allows row constructor comparisons like you show, but the optimizer didn't know how to use an index to help performance until MySQL 5.7.
https://dev.mysql.com/doc/refman/5.7/en/row-constructor-optimization.html
You can concatenate the four values into a string and check them like that:
SELECT *
FROM my_table
WHERE CONCAT_WS(',', x0, y0, x1, y1) IN ('4,3,5,6', ..., '9,3,2,1');
The way you are doing it gives correct results in the MySQL version on my machine (v5.5.55). Maybe you are using an older one; please check that.
Read on only if you still want to solve this in your own version, or if the solution above doesn't work.
I am still not clear about the data types and ranges of your columns, so I am assuming that each column is an integer between 0 and 9. If that is the case, you can pack the four digits into one number:
select * from my_table where 1000*x0 + 100*y0 + 10*x1 + y1 in (4356, ..., 9321);
I am fairly new to SQL and BigQuery.
I have a dataset and I want to retrieve the values in column 2, together with the corresponding values in column 1, for rows that satisfy a certain condition. I want to know how to do that.
Example Dataset D :
Col 1 ; Col 2
A ; 1
B ; 2
C ; 3
D ; 4
E ; 5
Query to retrieve values of col1, col2 such that col2 >2
Expected Output :
C ; 3
D ; 4
E ; 5
As I understand it,
SELECT col1,col2
FROM [D]
WHERE col2>2
will return col1 and col2 where col2 > 2, but I am not sure the values in col2 will be the ones corresponding to col1.
Am I wrong? If so, please suggest a query that gives the expected output.
If you don't have a row A;5, it won't ever exist in your return. The only time you need to worry about the mismatch is if you're doing a join between one data set of {A, B, C, D, E} and another of {1, 2, 3, 4, 5}. Then every possible combination from A;1, A;2... to ...E;4, E;5 would be output, and filtering on col2 > 2 would produce A;3, B;3, C;3, ..., etc. But that isn't how your data is set up in your question, so don't worry. If you wonder how a select query will work, it's usually okay to just run it, unless it will take hours and consume tons of resources and you have a budget... but it seems more like you're doing homework.
Also, don't ask for homework help on Stack Overflow.
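The difference between filtering a single table and filtering a join of two separate sets can be checked directly. This is a minimal sketch using Python's built-in sqlite3 in place of BigQuery (an assumption, as are the table names letters and numbers); the data is the example dataset D from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE D (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO D VALUES (?, ?)",
    [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)],
)

# WHERE filters whole rows, so col1 and col2 stay paired.
filtered = conn.execute(
    "SELECT col1, col2 FROM D WHERE col2 > 2 ORDER BY col1"
).fetchall()
print(filtered)

# Mismatched pairs only appear when two separate sets are cross joined.
conn.execute("CREATE TABLE letters (col1 TEXT)")
conn.execute("CREATE TABLE numbers (col2 INTEGER)")
conn.executemany("INSERT INTO letters VALUES (?)", [(c,) for c in "ABCDE"])
conn.executemany("INSERT INTO numbers VALUES (?)", [(n,) for n in range(1, 6)])
crossed = conn.execute(
    "SELECT col1, col2 FROM letters CROSS JOIN numbers WHERE col2 > 2"
).fetchall()
print(len(crossed))
```

The first query returns exactly the three expected rows; the cross join returns every letter combined with 3, 4 and 5.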
I have a MySQL table like:
id (UNSIGNED INT) PrimaryKey AutoIncrement
name (VARCHAR(10))
status (UNSIGNED INT) Indexed
I use the status column to represent 32 different statuses like:
0 -> open
1 -> deleted
...
31 -> something
This is convenient since I do not know in advance how many statuses I will have. (We support 32 statuses now; we can use a long int to support 64, and needing more than 64 is highly unlikely.)
The problem with this approach is that there is no index at the bit level, so queries selecting rows where a bit is set are slow.
I can improve things a bit using range queries (where status between n1 and n2).
Still, this is not a good approach.
I want to point out that I only need to search on a few of the 32 bits (let's say bits 0, 12, 13, 21, 31).
Any ideas to improve performance?
If for some reason you cannot normalize your data as suggested by RandomSeed in the previous answer, I'm pretty sure you can just put an index on the field and search using int values (that is 2^n).
For example if you need bit 0, 12 and 13 set, search where status = 2^0 + 2^12 + 2^13.
Edit: If you need to search where those bits are set, regardless of other bits, you could try using bitwise operators, e.g. for bits 0, 12 and 13, search where status & 1 = 1 and status & 4096 = 4096 and status & 8192 = 8192
However compared to a ranged query I'm not sure what will be the performance improvement (if any). So as said before, normalization might be the only solution.
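The two kinds of search described above (exactly these bits, versus at least these bits) can be sketched as follows. The example uses Python's built-in sqlite3 as a stand-in for MySQL; the table name items, the helper mask_for, and the sample rows are assumptions. It illustrates the predicates only and makes no claim about index usage or speed.

```python
import sqlite3

def mask_for(bits):
    """Combine bit positions into one integer mask (sum of 2^n)."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask

mask = mask_for([0, 12, 13])  # 1 + 4096 + 8192

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [
        (1, mask),              # exactly bits 0, 12, 13
        (2, mask | (1 << 21)),  # bits 0, 12, 13 plus bit 21
        (3, 1 << 12),           # only bit 12
        (4, 0),                 # no bits set
    ],
)

# Exact match: a plain equality, which an ordinary index can serve.
exact = [r[0] for r in conn.execute(
    "SELECT id FROM items WHERE status = ? ORDER BY id", (mask,))]

# "At least these bits": one bitwise AND against the combined mask.
at_least = [r[0] for r in conn.execute(
    "SELECT id FROM items WHERE (status & ?) = ? ORDER BY id", (mask, mask))]

print(mask, exact, at_least)
```

Testing against the combined mask once is equivalent to the three separate status & n = n conditions joined with AND.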
Normalize your data.
MainEntity:
id (UNSIGNED INT) PrimaryKey AutoIncrement
name (VARCHAR(10))
Status:
id (UNSIGNED INT) PrimaryKey AutoIncrement
label (VARCHAR(10))
EntityHasStatus:
entity_id (UNSIGNED INT) part of the composite PrimaryKey (entity_id, status_id)
status_id (UNSIGNED INT) part of the composite PrimaryKey (entity_id, status_id)
Entities having both statuses 1 and 5:
SELECT MainEntity.*
FROM MainEntity
JOIN EntityHasStatus AS Status1
ON Status1.entity_id = MainEntity.id
AND Status1.status_id = 1
JOIN EntityHasStatus AS Status5
ON Status5.entity_id = MainEntity.id
AND Status5.status_id = 5
Entities having either status 4 or 6:
SELECT MainEntity.*
FROM MainEntity
LEFT JOIN EntityHasStatus AS Status4
ON Status4.entity_id = MainEntity.id
AND Status4.status_id = 4
LEFT JOIN EntityHasStatus AS Status6
ON Status6.entity_id = MainEntity.id
AND Status6.status_id = 6
WHERE
Status4.status_id IS NOT NULL
OR Status6.status_id IS NOT NULL
These queries should be virtually instant (prefer the first form when possible, as it is a tad bit more efficient).
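The normalized layout and both query shapes can be exercised end to end. Below is a minimal sketch on an in-memory SQLite database (a stand-in for MySQL); the sample entities and their statuses are made up, the Status label table is left out because neither query touches it, and every entity_id reference is qualified with its table alias so the joins are unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MainEntity (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE EntityHasStatus (
    entity_id INTEGER,
    status_id INTEGER,
    PRIMARY KEY (entity_id, status_id)
);
""")
conn.executemany("INSERT INTO MainEntity VALUES (?, ?)",
                 [(1, "first"), (2, "second"), (3, "third")])
# Made-up assignments: entity 1 has statuses 1 and 5, entity 2 has 4,
# entity 3 has only status 1.
conn.executemany("INSERT INTO EntityHasStatus VALUES (?, ?)",
                 [(1, 1), (1, 5), (2, 4), (3, 1)])

both_1_and_5 = [r[0] for r in conn.execute("""
    SELECT MainEntity.id
    FROM MainEntity
    JOIN EntityHasStatus AS Status1
      ON Status1.entity_id = MainEntity.id AND Status1.status_id = 1
    JOIN EntityHasStatus AS Status5
      ON Status5.entity_id = MainEntity.id AND Status5.status_id = 5
    ORDER BY MainEntity.id
""")]

either_4_or_6 = [r[0] for r in conn.execute("""
    SELECT MainEntity.id
    FROM MainEntity
    LEFT JOIN EntityHasStatus AS Status4
      ON Status4.entity_id = MainEntity.id AND Status4.status_id = 4
    LEFT JOIN EntityHasStatus AS Status6
      ON Status6.entity_id = MainEntity.id AND Status6.status_id = 6
    WHERE Status4.status_id IS NOT NULL
       OR Status6.status_id IS NOT NULL
    ORDER BY MainEntity.id
""")]

print(both_1_and_5, either_4_or_6)
```

Because (entity_id, status_id) is the primary key, each join can match at most one row per entity, so neither query returns duplicate entities.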
A simple quiz:
Probably many of you know this already.
In my app there is a query in which I'm using concat in the where condition like this;
v_book_id and v_genre_id are 2 variables in my procedure.
SELECT link_id
FROM link
WHERE concat(book_id,genre_id) = concat(v_book_id,v_genre_id);
Now, I know there is a catch/bug in this, which will occur only twice in your lifetime. Can you tell me what it is?
I found this out yesterday and thought I should make some noise about it for everyone else doing this.
Thanks.
Let's have a look
WHERE concat(book_id,genre_id) = concat(v_book_id,v_genre_id);
as opposed to
WHERE book_id = v_book_id AND genre_id = v_genre_id;
There. The second solution is
faster (optimal index usage)
easier to write (less code)
easier to read (what on earth was the author thinking to concatenate numbers???)
more correct (as Alnitak also stated in the question's comments). check out this sample data:
book_id | genre_id
1 | 12
11 | 2
Now add (or concat) v_book_id = 1 and v_genre_id = 12 and see how you'll get funny results with your concat() query
Note, some databases (including MySQL) allow operations on tuples, which may be what the clever author of the above really intended to do:
WHERE (book_id, genre_id) = (v_book_id, v_genre_id);
A working example of such a tuple predicate:
SELECT * FROM (
SELECT 1 x, 2 y FROM DUAL UNION ALL
SELECT 1 x, 3 y FROM DUAL UNION ALL
SELECT 1 x, 2 y FROM DUAL
) a
WHERE (x, y) = (1, 2)
Note, some databases will need extra parentheses around the right-hand side tuple : ((1, 2))
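The collision in the sample data above is easy to reproduce. Here is a small sketch using Python's built-in sqlite3, where the || operator plays the role of concat(); the link_id values and the use of SQLite are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE link (link_id INTEGER PRIMARY KEY, "
    "book_id INTEGER, genre_id INTEGER)"
)
# The two rows from the sample data above.
conn.executemany("INSERT INTO link VALUES (?, ?, ?)",
                 [(1, 1, 12), (2, 11, 2)])

v_book_id, v_genre_id = 1, 12

# Concatenation: 1 || 12 and 11 || 2 both give the text '112'.
concat_hits = [r[0] for r in conn.execute(
    "SELECT link_id FROM link "
    "WHERE book_id || genre_id = ? || ? ORDER BY link_id",
    (v_book_id, v_genre_id))]

# Column-by-column comparison only matches the intended row.
exact_hits = [r[0] for r in conn.execute(
    "SELECT link_id FROM link "
    "WHERE book_id = ? AND genre_id = ? ORDER BY link_id",
    (v_book_id, v_genre_id))]

print(concat_hits, exact_hits)
```

The concatenated form returns both links, while the AND form returns only the link for book 1, genre 12.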