Suppose I have N teams and want to generate a fixture list in which every team plays every other team. What is the best practice for this? Is there a known algorithm that does this nicely? Efficiency isn't really a necessity, as this only needs to be generated once a season.
To be more specific, I'll start with some definitions:
I have N teams: T_1, T_2, ..., T_N. If N is odd, include a 'ghost' team to make the number of teams even.
A set of fixtures for a week is a set of N/2 pairs, with no team in more than one pair.
A 'fixture list' is a set of sets of fixtures such that every team is paired with every other team.
What I am trying to do is create a 'fixture list' with some kind of random element to it.
Thanks
Here is the usual way. If you need some random element, you can shuffle the list of teams first. It doesn't matter much, since every team plays every other anyway.
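As an illustration only: one widely used way to build such a list is the "circle method" for round-robin scheduling, which may or may not be what the link above described. The sketch below assumes that method; the function name `round_robin` and its `shuffle`/`seed` parameters are made up for this example. It keeps one team fixed, rotates the others one place each week, and uses `None` as the 'ghost' team when N is odd.

```python
import random


def round_robin(teams, shuffle=True, seed=None):
    """Return a list of weeks; each week is a list of (team, team) pairs.

    Hypothetical helper sketching the circle method. A pair containing
    None means the other team has no fixture that week.
    """
    teams = list(teams)
    if shuffle:
        # Random element: shuffle the team order before scheduling.
        random.Random(seed).shuffle(teams)
    if len(teams) % 2:
        teams.append(None)  # 'ghost' team to make the count even
    n = len(teams)
    weeks = []
    for _ in range(n - 1):
        # Pair first with last, second with second-to-last, and so on.
        weeks.append([(teams[i], teams[n - 1 - i]) for i in range(n // 2)])
        # Keep teams[0] fixed and rotate everyone else one place.
        teams = [teams[0], teams[-1]] + teams[1:-1]
    return weeks


if __name__ == "__main__":
    for number, week in enumerate(round_robin(["A", "B", "C", "D", "E"]), start=1):
        print(number, week)
```

With N teams (after adding the ghost if needed) this gives N - 1 weeks of N/2 pairs each, and every pair of teams appears exactly once.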
Novice relational database design question here. This is a "I feel like I'm doing wrong. What is it?" question. What it boils down to is, how can I avoid unnecessary complexity when designing a relational database. I also realise that this is as much a question about class structure design, seeing as I'm using an ORM and I'm really thinking of these tables as objects.
Background
Let's say I want to record the results of a number of competitive games between an unspecified number of "players". These players all belong to a "leaderboard", so the leaderboard has multiple players and records multiple results. The "score" for each of the players is recorded at the end of each game and belongs to a single "result" instance (see image). A score is also parented by the player to which it belongs.
edit: An example
Each row in the leaderboard table represents a collection of players who together form a league. For example, all of the players who belong to a tennis league will have the same leaderboard_id in the player table.
A row in the results table represents a match that has taken place between players that belong to a particular league. So the leaderboard_id associated with our players is recorded in each result in this league. The results table doesn't hold the score of each player; rather, I've attempted to normalise (apologies for potentially inappropriate use of that term) these into a score table.
Bringing this all together: we have a league in the Leaderboard table, in which a game has taken place between two players. These players belong to the league in question. The two players have just played a match and their scores are recorded as rows in the score table. These rows are collected together under a single results_id, referring to a row in the results table.
Question
Q1. Does this make sense? Is there anything glaringly obvious that is wrong with this design?
As is, I can easily query the scores a particular player has accumulated over time, look up the players that played in a particular result, etc. However, there are some actions that really feel like they should be simple, but, to me, feel overly complicated.
For instance, suppose I want to return the most recent results for a particular player (i.e. not the player's scores, but rather the results that contain a score belonging to our player).
So, hand-wavey Q2. Maybe this is just lack of experience with SQL, but I shouldn't have to do a JOIN to look this up, should I? But then, what's the alternative? Should I create a one-to-many composition between player and results, so that I can simply look up a player's results?
With the current design, to find the most recent result for a player I would need to do something like this (Python, SQLAlchemy):
Score.query.join(Result, Player)\
    .filter(Player.id == player_id)\
    .order_by(Result.timestamp.desc()).first()
Is this bad?
I'm trying to figure out the best way to manage this data storage problem....
I have a table of players, teams, and competitions.
A team may be involved in let's say 3 competitions.
A player belongs to a team, but may only be eligible to play in 2 of the 3 competitions that his or her team plays in. Likewise another player of the same team may be eligible for all 3.
I don't want to add a column to the player table for each competition as I'm then moving away from the relational model. Do I need another table 'competition_eligiblity' - this seems like a lot of work though!
Any suggestions?
Thanks,
Alan.
Yes, you do need a table for competition eligibility.
It really is no more work to put it there. Actually, it will be less work:
Adding a new competition in the future will be a real pain if it involves adding a new column to a table.
If the competition eligibility is stored in columns, performing a query to get information on eligibility becomes a nightmare.
Suppose you wanted to list all the competitions players are eligible for. Here would be your query:
select player, 'competition1' from players where competition1_eligible = 1
union all
select player, 'competition2' from players where competition2_eligible = 1
union all
select player, 'competition3' from players where competition3_eligible = 1
union all
select player, 'competition4' from players where competition4_eligible = 1
Sounds like fun, eh? Whereas, if you have an eligibility table, this information will be very simple to get.
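For comparison, here is a rough sketch of the eligibility-table version, run against an in-memory SQLite database from Python so it can be tried as-is. The table and column names (`player`, `competition`, `competition_eligibility`) and the sample rows are assumptions made up for illustration, not the asker's real schema.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE competition (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (player, competition) pair the player is eligible for.
    CREATE TABLE competition_eligibility (
        player_id INTEGER NOT NULL REFERENCES player(id),
        competition_id INTEGER NOT NULL REFERENCES competition(id),
        PRIMARY KEY (player_id, competition_id)
    );
    INSERT INTO player VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO competition VALUES (1, 'League'), (2, 'Cup'), (3, 'Shield');
    INSERT INTO competition_eligibility VALUES
        (1, 1), (1, 2), (1, 3),   -- Ann: all three competitions
        (2, 1), (2, 3);           -- Bob: two of the three
""")

# One query lists every competition each player is eligible for,
# however many competitions exist.
rows = conn.execute("""
    SELECT p.name, c.name
    FROM competition_eligibility AS e
    JOIN player AS p ON p.id = e.player_id
    JOIN competition AS c ON c.id = e.competition_id
    ORDER BY p.name, c.name
""").fetchall()

for player_name, competition_name in rows:
    print(player_name, competition_name)
```

Adding a fourth competition is then an INSERT into `competition` plus some eligibility rows, with no change to the query or the table definitions.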
Update: storing all the eligibility info in a single value would be even more of a nightmare, because imagine trying to extract that information back out of the string. That is beyond the limits of a sane SQL query.
Creating a new table is really a trivial piece of work, and you only have to do that once. Everything after that will be much easier if you do it that way.
I'm new to (My)SQL and it's been difficult to find good info on best practices of table design.
I want to save sequences of moves on a chess board, say I have an array $a = ['e4 e5', 'Nf3 Nc6', ...]
Being new, my first idea is a dumb little table with 2 columns, one for the Game ID and one for the moves. The moves (array) would be serialized and stored in a string. I guess this would technically work, but reading and writing potentially huge serialized arrays from a DB - perhaps on every page load - seems suboptimal to me.
Caching the array on the user side might not be possible for various reasons and is not something I'm curious about.
I'm interested in learning how to best store data that can't be entirely predicted in its format (e.g. the number of moves can vary from 1 to 1000).
Rather than storing the entire game history in a single row, why not store the game ID, the move, and the sequence number of that move?
That way you would retrieve the entire history of a given game by doing something like
SELECT *
FROM MovesTable
WHERE gameID = id
ORDER BY sequence
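A minimal sketch of that layout, using Python's built-in `sqlite3` module so it runs without a MySQL server. `MovesTable`, `gameID` and `sequence` come from the query above; the `move` column and the helper functions `save_moves` and `load_moves` are names invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MovesTable (
        gameID   INTEGER NOT NULL,
        sequence INTEGER NOT NULL,   -- 1, 2, 3, ... within a game
        move     TEXT    NOT NULL,   -- assumed column name, e.g. 'e4 e5'
        PRIMARY KEY (gameID, sequence)
    )
""")


def save_moves(game_id, moves):
    """Hypothetical helper: store one row per move, numbered from 1."""
    conn.executemany(
        "INSERT INTO MovesTable (gameID, sequence, move) VALUES (?, ?, ?)",
        [(game_id, number, move) for number, move in enumerate(moves, start=1)],
    )


def load_moves(game_id):
    """Hypothetical helper: rebuild the array of moves for one game, in order."""
    cursor = conn.execute(
        "SELECT move FROM MovesTable WHERE gameID = ? ORDER BY sequence",
        (game_id,),
    )
    return [row[0] for row in cursor]


save_moves(1, ["e4 e5", "Nf3 Nc6"])
save_moves(2, ["d4 d5"])
print(load_moves(1))
```

The number of moves per game can vary freely, since each move is just another row.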
In general, I would use a table in one of the following forms:
GameId, MoveNumber, Move
GameId, MoveNumber, FromSquare, ToSquare
Which one you use will depend on what you will need to query against and how the data will be presented, but I would lean toward the latter suggestion.
You can then combine this with a parent table that contains the GameId and some data about the game itself, such as dates or players.
If you're only going to consume the moves as an entire block - that is you always want the entire move chain and never will query into individual moves - you could store the string as you suggest. This has the added benefit that there is only one row of data to return, which will be very fast. The downside of course is that you will have to deserialize/parse the data once you receive it.
I'd do something like
Game, MoveNumber, Move
----------------------
Game1, 1, e4 e5
Game1, 2, Nf3 Nc6
...
Game2, 1, ....
You will need a number of tables, one for the players, one for games, one for the pieces and one for moves.
A game would have players and colours and things like date and time
A piece would tell you how it moves, is it a knight or a queen etc.
A player would have the player's name etc.
A move would have the piece, game, player, start position and end position
You would link these tables together by relationships based on id fields in each of the tables.
Try two tables.
First table contains information pertaining to each game. Who are the players, where and when did they play, who won, and whatever other information is pertinent to each game and makes each game unique.
Games
Game_id PK
Black_name
White_name
Game_Location
Game_Date
Game_Time
Winner
The second table contains all the moves for each game. This contains all the pertinent information for each move: was a piece taken, was the other player put in check, was this a regular move or did they castle, etc.
Moves
Move_id PK
Game_id FK
Move_number
Who_moved
Piece_moved
Square_from
Square_to
Piece_taken
Move_type
Check_YorN
Now each row in the Games table (game) is joined to many rows (moves) in the Moves table.
Using MySQL, I have a table of users, a table of matches (updated with the actual result) and a table called users_picks. At first there will always be 10 football matches per gameweek per league, because there's only one league as of now, but more leagues will come along eventually, and some of them only have 8 matches per gameweek.
In the users_picks table, should I store each 'pick' (by pick I mean both 'hometeam score' and 'awayteam score') in a different row, or have all 10 picks in one single row? Both with a FK for user and gameweek. All picks in one row would mean I had columns with appended numbers like this:
Option 1: [pick_id, user_id, league_id, gameweek_id, match1_hometeam_score, match1_awayteam_score, match2_hometeam_score, match2_awayteam_score ... etc]
and that option doesn't quite fill me with joy, and looks a bit stupid. Especially since there's going to be lots of potential NULLs in the db. The second option would mean eventually millions of rows. But would look like this:
Option 2: [pick_id, user_id, league_id, gameweek_id, match_id, hometeam_score, awayteam_score]
What's the best practice? And would it be a PITA to do all sorts of statistics using the second option? E.g. calculating how many matches a user has hit correctly in a specific round, how many all-time correct hits, etc.
If I'm not making much sense, I'll try to elaborate. I just want my table design to be good from the start, so I won't have a huge headache in a couple of months.
Thanks in advance.
The second choice is much better than the first. This is called database normalisation and makes querying easier, not harder. I would suggest reading the linked article, and the related descriptions of the various "normal forms", and aiming for a 3rd Normal Form data structure as a minimum.
To see the flaw in your first option, imagine if there were to be included later a new league with 11 matches. Or 400.
You should read up about database normalization.
When you have a 1:n relation, like in your case one team having many matches, you would create two tables. One table "teams" and a second table "matches" where each row includes the ID of the team which played the match.
In the same manner you should also have separate tables for users, picks and leagues.
Option two is better, provided you INDEX your table properly, since (as you indicate) it will grow quite large. The pick_id is the primary key, but also create an INDEX on the user_id field, as likely the most common query will be
SELECT * FROM `users_picks` WHERE `user_id`=?;
to get all the picks for a given user.
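To give a feel for the statistics side, here is a small sketch of Option 2 with that index, again using in-memory SQLite from Python rather than MySQL. The column names follow the question where possible, but the schema is trimmed (no league table, only a few columns), and the index name, the sample data and the `correct_picks` helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed, hypothetical versions of the asker's tables.
    CREATE TABLE matches (
        match_id       INTEGER PRIMARY KEY,
        gameweek_id    INTEGER NOT NULL,
        hometeam_score INTEGER,          -- actual result, NULL until played
        awayteam_score INTEGER
    );
    CREATE TABLE users_picks (
        pick_id        INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL,
        match_id       INTEGER NOT NULL REFERENCES matches(match_id),
        hometeam_score INTEGER NOT NULL, -- the user's predicted score
        awayteam_score INTEGER NOT NULL
    );
    CREATE INDEX idx_users_picks_user_id ON users_picks (user_id);

    INSERT INTO matches VALUES (1, 1, 2, 1), (2, 1, 0, 0), (3, 2, 3, 1);
    INSERT INTO users_picks VALUES
        (1, 7, 1, 2, 1),   -- exact hit
        (2, 7, 2, 1, 0),   -- miss
        (3, 7, 3, 3, 1);   -- exact hit
""")


def correct_picks(user_id, gameweek_id=None):
    """Count exact-score hits for one gameweek, or all time if gameweek_id is None."""
    sql = """
        SELECT COUNT(*)
        FROM users_picks AS p
        JOIN matches AS m ON m.match_id = p.match_id
        WHERE p.user_id = ?
          AND p.hometeam_score = m.hometeam_score
          AND p.awayteam_score = m.awayteam_score
    """
    params = [user_id]
    if gameweek_id is not None:
        sql += " AND m.gameweek_id = ?"
        params.append(gameweek_id)
    return conn.execute(sql, params).fetchone()[0]


print(correct_picks(7), correct_picks(7, gameweek_id=1))
```

Both the per-round and the all-time count come from the same short query, which would be much harder to write against the wide one-row-per-gameweek layout of Option 1.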
I am trying to find an optimal solution for my Database (MySQL), but I'm stuck over the decision whether or not to store a Total column.
This is the simplified version of my database :
I have a Team table, a Game table and a 'Score' table. Game will have {teamId, scoreId,...} while Score table will have {scoreId, Score,...} (Here ... indicates other columns in the tables).
On the home page I need to show the list of Teams with their scores. Over time the number of Teams will grow to 100s while the list of Score(s) will grow to 100000s. Which is the preferred way:
Should I sum up the scores and show them along with the teams every time the page is requested? (I don't want to cache because the scores will keep changing.) OR
Should I have a total_score field in the Team table where I update the total_score of a team every time a new score is added to the Scores table for that group?
Which of the two is a better option or is there any other better way?
I use two guidelines when deciding to store a calculated value. In the best of all worlds, both of these statements will be true.
1) The value must be computationally expensive.
2) The value must have a low probability of changing.
If the cost of calculating the value is very high, but it changes daily, I might consider making a nightly job that updates the value.
Start without the total column and only add it if you start having performance issues.
Calculating sum at request time is better for accuracy but worse for efficiency.
Caching total in a field (dramatically) improves performance of certain queries, but increases code complexity or may show stale data (if you update cached value not at the same time, but via cron job).
It's up to you! :)
I agree that computed values should not be used except for special situations such as month end snapshots of databases.
I would simply create a view with one column in the view equal to your computed total column. Then you can query the view instead of the base tables.
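A minimal sketch of the view idea, run against in-memory SQLite from Python. It assumes a simplified schema in which each score row carries its team_id directly (the question routes this through the Game table), and the names `team_totals`, `total_score` and the sample teams are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: scores reference teams directly.
    CREATE TABLE team (team_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scores (
        score_id INTEGER PRIMARY KEY,
        team_id  INTEGER NOT NULL REFERENCES team(team_id),
        score    INTEGER NOT NULL
    );
    -- The view computes the total on every read; nothing is stored.
    CREATE VIEW team_totals AS
        SELECT t.team_id, t.name, COALESCE(SUM(s.score), 0) AS total_score
        FROM team AS t
        LEFT JOIN scores AS s ON s.team_id = t.team_id
        GROUP BY t.team_id, t.name;

    INSERT INTO team VALUES (1, 'Reds'), (2, 'Blues'), (3, 'Greens');
    INSERT INTO scores (team_id, score) VALUES (1, 3), (1, 4), (2, 5);
""")

TOTALS_SQL = "SELECT name, total_score FROM team_totals ORDER BY team_id"
before = conn.execute(TOTALS_SQL).fetchall()

# A new score shows up in the view immediately, with no update step.
conn.execute("INSERT INTO scores (team_id, score) VALUES (3, 2)")
after = conn.execute(TOTALS_SQL).fetchall()
print(before, after)
```

The LEFT JOIN with COALESCE keeps teams that have no scores yet in the list with a total of 0.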
It depends on how often your scores get updated and what exactly the "score" means.
Case1: Score is a LIVE score
If the "score" is a live score, like runs scored in cricket or baseball, or the score of a volleyball or table tennis match, then I really don't understand the need to show the sum of the running scores. However, this may be a requirement in some cases, such as showing the total runs scored by a team until now plus the runs scored so far in the ongoing (live) game.
In this case I suggest another option, which is a combination of your 1st and 2nd options.
A total_score column in the team table would work well with a slight change to your data model:
Add a new column called LIVE to the scores table, which will be 0 for a finished match and 1 for a live match (and optionally -1 to indicate that a match is about to start but the scores won't get updated yet).
Now union the two tables, something like:
select team_id, sum(total_score) as total_score from (
select team_id, total_score from team
union all
select team_id, sum(score) as total_score from scores where live = 1 group by team_id) subquery
group by team_id
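One detail worth checking in a query of this shape is the choice between UNION and UNION ALL. Plain UNION removes duplicate rows, so a team whose stored total happens to equal its live score would be counted only once. The sketch below runs both variants against in-memory SQLite from Python; the trimmed schema and the sample numbers are assumptions chosen to trigger that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, trimmed tables: a stored total per team plus live scores.
    CREATE TABLE team (team_id INTEGER PRIMARY KEY, total_score INTEGER NOT NULL);
    CREATE TABLE scores (
        team_id INTEGER NOT NULL,
        score   INTEGER NOT NULL,
        live    INTEGER NOT NULL   -- 1 = live match, 0 = finished
    );
    INSERT INTO team VALUES (1, 10), (2, 10);
    -- Team 1's live score (10) equals its stored total (10).
    INSERT INTO scores VALUES (1, 10, 1), (2, 4, 1), (2, 6, 0);
""")

QUERY = """
    SELECT team_id, SUM(total_score) AS total_score FROM (
        SELECT team_id, total_score FROM team
        {op}
        SELECT team_id, SUM(score) AS total_score
        FROM scores WHERE live = 1 GROUP BY team_id
    ) AS subquery
    GROUP BY team_id
    ORDER BY team_id
"""

with_union_all = conn.execute(QUERY.format(op="UNION ALL")).fetchall()
with_union = conn.execute(QUERY.format(op="UNION")).fetchall()
print(with_union_all)  # team 1 gets stored total + live score
print(with_union)      # team 1's two identical rows collapse into one
```

With this data, UNION ALL reports 20 for team 1 while plain UNION reports 10, so UNION ALL is the safer operator here.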
Case2: Score is just a RESULT
Well, just query the db directly (your 1st option), because the result will only be updated after the game ends, and the update will in fact be a new entry in the score table.
If my assumption is correct, the scores get updated only after the game is finished. Moreover, updates for a given team are even less frequent, since they only happen when that team plays.