sql - Finding cost between two graph points - mysql

I have a table that stores the time between two adjacent railway stations.
+------+------+------+
| s1id | s2id | tbtw |
+------+------+------+
| 234 | 235 | 20 |
| 235 | 133 | 8 |
| 133 | 108 | 15 |
| 234 | 236 | 10 |
| 108 | 500 | 2 |
| 235 | 108 | 21 |
+------+------+------+
I want to find the time between any two points, if they can be connected, e.g. the time from station 234 to 500 (234->235->108->500). I know this is like a graph. I have tried to find the cumulative time by joining on t2.s1id = t1.s2id as follows:
select t1.*, SUM(t2.tbtw) as sum
from t t1
join t t2 on t1.s2id = t2.s1id
group by t1.id, t1.tbtw
but this does not give me the cumulative time and doesn't properly link the nodes. I also tried:
select t1.*,
(select sum(tbtw)
from t t2
where t2.s1id = t1.s2id
) as sum
from t t1;
I can easily do it in a programming language, but in SQL it's really confusing. Do I have to use procedures? Can't I do it in simple SQL statements? A solution in simple SQL statements is preferred, but other solutions are also okay. Please help me.

See this article: http://hansolav.net/sql/graphs.html
I think it will be helpful.
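For what it's worth, here is a minimal sketch of doing this in a single statement with a recursive common table expression. It is not taken from the linked article. Assumptions: an in-memory SQLite database stands in for MySQL so the example can be run anywhere, edges are treated as one-directional exactly as stored, and the names walk, route, PATH_SQL and travel_times are made up for illustration. MySQL 8.0 and later also support WITH RECURSIVE, but there the string concatenation would normally be written with CONCAT() instead of ||.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption). Table `t` and its
# columns s1id, s2id, tbtw are taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (s1id INTEGER, s2id INTEGER, tbtw INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(234, 235, 20), (235, 133, 8), (133, 108, 15),
     (234, 236, 10), (108, 500, 2), (235, 108, 21)],
)

# Walk the edges outward from :src, accumulating the time and the route.
# The route string doubles as a cycle guard: a station that is already
# on the route is never visited again.
PATH_SQL = """
WITH RECURSIVE walk(node, total, route) AS (
    SELECT :src, 0, '/' || :src || '/'
    UNION ALL
    SELECT t.s2id, walk.total + t.tbtw, walk.route || t.s2id || '/'
    FROM walk
    JOIN t ON t.s1id = walk.node
    WHERE instr(walk.route, '/' || t.s2id || '/') = 0
)
SELECT total, route FROM walk WHERE node = :dst ORDER BY total
"""


def travel_times(conn, src, dst):
    """Return (total_time, route) for every stored route from src to dst, cheapest first."""
    return conn.execute(PATH_SQL, {"src": src, "dst": dst}).fetchall()


for total, route in travel_times(conn, 234, 500):
    print(total, route)
```

With the sample rows there are two routes from 234 to 500: the direct one through 235 and 108 (20 + 21 + 2 = 43) and a longer one through 133 (20 + 8 + 15 + 2 = 45). Enumerating every route is fine for a small network but can blow up on a dense one, which is where a procedural shortest-path approach becomes the better fit.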

Related

Best way to gain performance and do fast sql queries?

I use MySQL for my database, and I do some processing on the database side to make things easier for my application.
My queries used to be very fast, but now that the database holds a lot of data they have become very slow.
My application mainly computes statistics and has to fetch data from many related tables.
Here is an example:
tbl_game
+-------------------------------------+
| id | winner | duration| endedAt |
|--------+--------+---------+---------|
| 1 | 1 | 1200 |timestamp|
| 2 | 0 | 1200 |timestamp|
| 3 | 1 | 1200 |timestamp|
| 4 | 1 | 1200 |timestamp|
+-------------------------------------+
winner is either 0 or 1 for the team who won the game
duration is the number of seconds a game took
tbl_game_player
+-------------------------------------------------+
| gameId | playerId | playerSlot | frags | deaths |
|--------+----------+------------+-------+--------|
| 1 | 100 | 1 | 24 | 50 |
| 1 | 150 | 2 | 32 | 52 |
| 1 | 101 | 3 | 26 | 62 |
| 1 | 109 | 4 | 48 | 13 |
| 1 | 123 | 5 | 24 | 52 |
| 1 | 135 | 6 | 30 | 30 |
| 1 | 166 | 7 | 28 | 48 |
| 1 | 178 | 8 | 52 | 96 |
| 1 | 190 | 9 | 12 | 75 |
| 1 | 106 | 10 | 68 | 25 |
+-------------------------------------------------+
The details are only for the first game with id 1
1 game has 10 player slots where slot 1-5 = team 0 and 6-10 = team 1
There are more columns in my real tables; this is just to give an overview.
So I need to calculate the statistics of each player across all the games. I created a view to accomplish this, and it works fine when I have little data.
Here is an example:
+--------------------------------------------------------------------------+
| gameId | playerId | frags | deaths | actions | team | percent | isWinner |
|--------+----------+-------+--------+---------+------+---------+----------|
actions = frags + deaths
percent = (actions / sum(actions of players in the same team)) * 100
team is calculated using playerSlot in 1,2,3,4,5 or 6,7,8,9,10
isWinner is calculated by the team and winner
This is just one calculation, and I have many others to perform. My database has more than 1 million records and the queries are very slow.
Here is the query for the above:
SELECT
tgp.gameId,
tgp.playerId,
tgp.frags,
tgp.deaths,
tgp.frags + tgp.deaths AS actions,
IF(playerSlot in (1,2,3,4,5), 0, 1) AS team,
((SELECT actions) / tgpx.totalActions) * 100 AS percent,
IF((SELECT team) = tg.winner, 1, 0) AS isWinner
FROM tbl_game_player tgp
INNER JOIN tbl_game tg on tgp.gameId = tg.id
INNER JOIN (
SELECT
gameId,
SUM(frags) AS totalFrags,
SUM(deaths) AS totalDeaths,
SUM(frags) + SUM(deaths) as totalActions,
IF(playerSlot in (1,2,3,4,5), 0, 1) as team
FROM tbl_game_player
GROUP BY gameId, team
) tgpx on tgp.gameId = tgpx.gameId and team = tgpx.team
It's quite obvious that indexes don't help you here¹, because you want all data from the two tables. You even want the data from tbl_game_player twice, once aggregated, once not aggregated. So there are millions of records to read and join. Your query is fine, and I see no way to improve it really.
¹ Of course you should always have indexes on primary and foreign keys, so the DBMS can make use of them in joins (e.g. there should be an index on tbl_game_player(gameId)).
So your options lie outside the query:
Hardware (obviously).
Add a computed column for the team to tbl_game_player, so at least you save its evaluation when querying.
Partitions. One partition per team, so the aggregates can be calculated separately.
Pre-computed data: Add a table tbl_game_team holding the sums; fill it with triggers. Thus you don't have to compute the aggregates in your query.
Data warehouse table: Make a table holding the complete result. Fill it with triggers or at intervals.
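The pre-computed table idea could look roughly like the sketch below. Assumptions: in-memory SQLite stands in for MySQL, tbl_game_team gets a made-up layout (gameId, team, totalActions), and it is filled by one INSERT ... SELECT rather than by triggers to keep the example short. The function names refresh_team_totals and player_stats are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_game (id INTEGER PRIMARY KEY, winner INTEGER, duration INTEGER);
CREATE TABLE tbl_game_player (
    gameId INTEGER, playerId INTEGER, playerSlot INTEGER,
    frags INTEGER, deaths INTEGER);
-- Hypothetical summary table: one row per (game, team) with the team total.
CREATE TABLE tbl_game_team (
    gameId INTEGER, team INTEGER, totalActions INTEGER,
    PRIMARY KEY (gameId, team));
""")
conn.execute("INSERT INTO tbl_game VALUES (1, 1, 1200)")
conn.executemany(
    "INSERT INTO tbl_game_player VALUES (1, ?, ?, ?, ?)",
    [(100, 1, 24, 50), (150, 2, 32, 52), (101, 3, 26, 62), (109, 4, 48, 13),
     (123, 5, 24, 52), (135, 6, 30, 30), (166, 7, 28, 48), (178, 8, 52, 96),
     (190, 9, 12, 75), (106, 10, 68, 25)],
)


def refresh_team_totals(conn):
    """Rebuild the summary table; slots 1-5 are team 0, slots 6-10 are team 1."""
    conn.execute("DELETE FROM tbl_game_team")
    conn.execute("""
        INSERT INTO tbl_game_team (gameId, team, totalActions)
        SELECT gameId, (playerSlot > 5), SUM(frags + deaths)
        FROM tbl_game_player
        GROUP BY gameId, (playerSlot > 5)
    """)


def player_stats(conn):
    """Per-player rows that read the team total instead of re-aggregating it."""
    return conn.execute("""
        SELECT p.playerId,
               p.frags + p.deaths AS actions,
               (p.playerSlot > 5) AS team,
               100.0 * (p.frags + p.deaths) / gt.totalActions AS percent,
               ((p.playerSlot > 5) = g.winner) AS isWinner
        FROM tbl_game_player p
        JOIN tbl_game g ON g.id = p.gameId
        JOIN tbl_game_team gt
          ON gt.gameId = p.gameId AND gt.team = (p.playerSlot > 5)
        ORDER BY p.gameId, p.playerSlot
    """).fetchall()


refresh_team_totals(conn)
for row in player_stats(conn):
    print(row)
```

The statistics query then joins against one small row per game and team instead of aggregating tbl_game_player a second time. Whether that pays off depends on how often the summary has to be refreshed compared to how often it is read.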
Setting up indexes would speed up your queries. Queries can still take a while to run if there are a lot of results, but this is definitely a start.
For large databases, a MySQL index can help a lot with speed problems: an index lets the database find data more quickly and efficiently, so you should create one. You can learn more about MySQL indexes here: http://www.w3schools.com/sql/sql_create_index.asp

Convert Mysql Query to Rails ActiveRecord Query Without using find_by_sql

I have a table named questions, as follows:
+----+---------------------------------------------------------+----------+
| id | title | category |
+----+---------------------------------------------------------+----------+
| 89 | Tinker or work with your hands? | 2 |
| 54 | Sketch, draw, paint? | 3 |
| 53 | Express yourself clearly? | 4 |
| 77 | Keep accurate records? | 6 |
| 32 | Efficient? | 6 |
| 52 | Make original crafts, dinners, school or work projects? | 3 |
| 70 | Be elected to office or make your opinions heard? | 5 |
| 78 | Take photographs? | 3 |
| 84 | Start your own political campaign? | 5 |
| 9 | Free spirit or a rebel? | 3 |
| 38 | Lead a group? | 5 |
| 71 | Work in groups? | 4 |
| 2 | Helpful? | 4 |
| 4 | Mechanical? | 6 |
| 14 | Responsible? | 6 |
| 66 | Pitch a tent, an idea? | 1 |
| 62 | Write useful business letters? | 5 |
| 28 | Creative? | 3 |
| 68 | Perform experiments? | 2 |
| 10 | Like to figure things out? | 2 |
+----+---------------------------------------------------------+----------+
I have a SQL query to get one random record from each category. Can anyone convert the MySQL query to a Rails ActiveRecord query (without using Question.find_by_sql)? The MySQL query works absolutely fine, but I need an ActiveRecord query because later steps depend on it.
Here is the MySQL query:
SELECT t.id, title as question, category
FROM
(
SELECT
(
SELECT id
FROM questions
WHERE category = t.category
ORDER BY RAND()
LIMIT 1
) id
FROM questions t
GROUP BY category
) q JOIN questions t
ON q.id = t.id
Thank You for your consideration!
When things get crazy, one has to reach for Arel:
It is intended to be a framework framework; that is, you can build
your own ORM with it, focusing on innovative object and collection
modeling as opposed to database compatibility and query generation.
So what we want to do is let Arel create the query for us. The approach used here is this: the questions table is left-joined with a randomized version of itself:
q_normal = Arel::Table.new("questions")
q_random = Arel::Table.new("questions").project(Arel.sql("*")).order("RAND()").as("q2")
Time to left join
query = q_normal.join(q_random, Arel::Nodes::OuterJoin).on(q_normal[:category].eq(q_random[:category])).group(q_normal[:category]).order(q_random[:category])
Now you can select whichever columns you want using project, e.g.:
query.project(q_normal[:id])
The only way I can think of to do this requires a good bit of application code. I don't think there's a way of accessing the RAND() functionality in MySQL (or equivalent in other DB technologies) using ActiveRecord. Here's what I came up with:
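# Number of questions per category, as { category => count }
# (the column is called category in the questions table above)
counts = Question.group(:category).count(:id)
# Pick a random offset inside each category
offsets = {}
counts.each do |category, count|
offsets[category] = rand(count)
end
# Fetch the question at that offset, one query per category
random_questions = []
offsets.each do |category, offset|
random_questions.push(Question.where(:category => category).offset(offset).first)
end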

MySQL query with insanely long list of columns and tables

My database contains a table tab0 with two columns, id and mjd, and thousands of tables tab1 ... tabM with five columns: id, A, B, C, and D. Each column contains thousands of values.
What is the best way to obtain something like this?
+-----+-------------+-------------+-------------+
| mjd | A (of tab_1)| A (of tab_2)| A (of tab_m)|
+-----+-------------+-------------+-------------+
| 1 | 123 | 423 | 523 |
| 2 | 233 | 243 | 633 |
| ... | ... | ... | ... |
| n | 353 | 343 | 753 |
+-----+-------------+-------------+-------------+
Can I obtain the list of columns and tables from INFORMATION_SCHEMA and then use it to construct my query like
SELECT t0.mjd, t1.A, t2.A, ... tM.A FROM tab0 as t0, tab1 as t1, ... tabM as tM
WHERE t0.id=t1.id and ... and t0.id=tM.id;
or is it a completely insane approach?
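Generating the statement from a list of table names is mechanical. Here is a small sketch, assuming the names have already been fetched (for example from information_schema.tables filtered on the schema and a name pattern). The function build_wide_query and the A_<table> column aliases are made up for illustration. One caveat to check before relying on this: MySQL documents a limit of 61 tables per join, so with thousands of tables the query would have to be built and run in batches.

```python
def build_wide_query(table_names, column="A", base="tab0"):
    """Build one SELECT joining `base` to every table on id, picking `column` from each."""
    select_cols = ["t0.mjd"]
    joins = []
    for i, name in enumerate(table_names, start=1):
        alias = f"t{i}"
        # Alias each column so the result set shows which table it came from.
        select_cols.append(f"{alias}.`{column}` AS `{column}_{name}`")
        joins.append(f"JOIN `{name}` AS {alias} ON {alias}.id = t0.id")
    return (
        "SELECT " + ", ".join(select_cols)
        + f" FROM `{base}` AS t0 "
        + " ".join(joins)
    )


print(build_wide_query(["tab1", "tab2", "tab3"]))
```

The explicit JOIN ... ON form is equivalent to the comma-separated FROM list with the conditions in WHERE; it just keeps each table next to its join condition, which is easier to generate.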

MySQL: optimize query for scoring calculation

I have a data table that I use to do some calculations. The resulting data set after calculations looks like:
+------------+-----------+------+----------+
| id_process | id_region | type | result |
+------------+-----------+------+----------+
| 1 | 4 | 1 | 65.2174 |
| 1 | 5 | 1 | 78.7419 |
| 1 | 6 | 1 | 95.2308 |
| 1 | 4 | 1 | 25.0000 |
| 1 | 7 | 1 | 100.0000 |
+------------+-----------+------+----------+
On the other hand, I have another table that contains a set of ranges used to classify the calculation results. The ranges table looks like:
+----------+-------+-----+--------+
| id_level | start | end | status |
+----------+-------+-----+--------+
| 1        | 0     | 75  | Danger |
| 2        | 76    | 90  | Alert  |
| 3        | 91    | 100 | Good   |
+----------+-------+-----+--------+
I need a query that adds the corresponding 'status' column to each value when doing the calculations. Currently, I do that by adding the following field to the calculation query:
select
...,
...,
[math formula] as result,
(select status
from ranges r
where result between r.start and r.end) status
from ...
where ...
It works OK. But when I have a lot of rows (more than 200K), the calculation query becomes slow.
My question is: is there some way to find the 'status' value without that subquery?
Has anyone worked on something similar before?
Thanks
Yes, you are looking for a subquery and join:
select s.*, r.status
from (
<your query here>
) s left outer join
ranges r
on s.result between r.start and r.end
Explicit joins often optimize better than nested selects. In this case, though, the ranges table seems pretty small, so the subquery may not be the source of the performance issue.
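A runnable sketch of the range join, for illustration only. Assumptions: in-memory SQLite stands in for MySQL, and a plain table named results (a made-up name) holds the sample rows in place of the real calculation query. The start and end columns are quoted because END is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- `results` is a hypothetical stand-in for the output of the calculation query.
CREATE TABLE results (id_process INTEGER, id_region INTEGER, type INTEGER, result REAL);
CREATE TABLE ranges (id_level INTEGER, "start" REAL, "end" REAL, status TEXT);
INSERT INTO results VALUES
    (1, 4, 1, 65.2174), (1, 5, 1, 78.7419), (1, 6, 1, 95.2308),
    (1, 4, 1, 25.0), (1, 7, 1, 100.0);
INSERT INTO ranges VALUES
    (1, 0, 75, 'Danger'), (2, 76, 90, 'Alert'), (3, 91, 100, 'Good');
""")

# LEFT JOIN keeps every result row, even one that falls in no range.
classified = conn.execute("""
    SELECT s.id_region, s.result, r.status
    FROM results AS s
    LEFT JOIN ranges AS r ON s.result BETWEEN r."start" AND r."end"
    ORDER BY s.rowid
""").fetchall()


def status_for(conn, value):
    """Look up the status for a single value; None when no range contains it."""
    row = conn.execute(
        'SELECT status FROM ranges WHERE ? BETWEEN "start" AND "end"', (value,)
    ).fetchone()
    return row[0] if row else None


for row in classified:
    print(row)
print(status_for(conn, 75.5))
```

One thing the sketch makes visible: with the ranges as given (0 to 75, 76 to 90, 91 to 100), a fractional result such as 75.5 matches no range and comes back with no status. If results are not whole numbers, the boundaries probably need to be contiguous.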

MySQL using GROUP BY to group by multiple columns

I'd like to use GROUP BY with multiple columns. I think it's best to start with an example:
SELECT
eventsviews.eventId,
showsActive.showId,
showsActive.venueId,
COUNT(*) AS count
FROM eventsviews
INNER JOIN events ON events.eventId = eventsviews.eventId
INNER JOIN showsActive ON showsActive.eventId = eventsviews.eventId
WHERE events.status = 1
GROUP BY showsActive.venueId, showsActive.showId, showsActive.eventId
ORDER BY count DESC
LIMIT 100;
Output:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
[...snip...]
| 95 | 92099 | 9770 | 32 |
| 95 | 105472 | 10702 | 32 |
| 3804 | 41225 | 8165 | 17 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 2866 | 5451 | 14 |
| 923 | 20184 | 5930 | 14 |
[...snip...]
What I would like instead:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
| 95 | 92099 | 9770 | 32 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 20184 | 5930 | 14 |
So, I want my data grouped by eventId, but only once for each showId and venueId ...
I actually have a SQL query that does that, but it has 8 subqueries and is as slow as a T-Ford ... And since this is executed on every page load, speeding things up looks like a good idea!
There are a few questions like this, and I've tried many different things, but I've been at this query for an hour and I can't seem to get it to work as I want :-(
Thanks!
You probably want either a MIN or a MAX on showId, and then leave it out of the GROUP BY. I can't tell which, because your "preferred" result set has both.
If you want your data grouped by eventId, group just by eventId and you'll get exactly the result you're looking for.
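This relies on a MySQL feature (?) that lets you select non-aggregated columns that are not in the GROUP BY, in which case it returns the value from an arbitrary row of each group (and only when the ONLY_FULL_GROUP_BY SQL mode is disabled). In PostgreSQL the same is achieved with DISTINCT ON, which is not available in MySQL.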
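If you would rather not depend on which row MySQL happens to return, the pick can be made explicit. Below is a sketch under stated assumptions: in-memory SQLite stands in for MySQL, a made-up table show_counts holds the rows from the question's current output (with the count column renamed to views), and "the highest showId per event" is an arbitrary choice of tie-breaker, since the desired output in the question mixes lowest and highest.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table holding the rows of the question's current output.
CREATE TABLE show_counts (eventId INTEGER, showId INTEGER, venueId INTEGER, views INTEGER);
INSERT INTO show_counts VALUES
    (95, 92099, 9770, 32), (95, 105472, 10702, 32),
    (3804, 41225, 8165, 17), (3804, 41226, 8165, 17),
    (923, 2866, 5451, 14), (923, 20184, 5930, 14);
""")

# For each event, find the highest showId, then join back to get the
# venueId and views that belong to that same row.
ONE_ROW_PER_EVENT = """
    SELECT c.eventId, c.showId, c.venueId, c.views
    FROM show_counts AS c
    JOIN (
        SELECT eventId, MAX(showId) AS showId
        FROM show_counts
        GROUP BY eventId
    ) AS pick
      ON pick.eventId = c.eventId AND pick.showId = c.showId
    ORDER BY c.views DESC
"""

rows = conn.execute(ONE_ROW_PER_EVENT).fetchall()
for row in rows:
    print(row)
```

Joining back to the chosen showId guarantees that venueId comes from the same row as the showId, which a bare GROUP BY eventId with non-aggregated columns does not promise.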