Is It Efficient and Scalable for a Neural Network to Rely on Weights that Require Database Interaction?

I'm a high school senior interested in computer science and I have been programming for almost nine years now. I've recently become interested in machine learning and I have decided to implement a neural network. I haven't begun to code it yet and have been in the designing stage for a while now. The objective of the program is to analyze a student's paper, along with some other information, and then predict what grade the student will receive, much like PaperRater. However, I plan to make it far more personal than PaperRater.
The program has four inputs: the student's paper, the student's id (i.e., the primary key), the teacher's id, and the course id. I am implementing this on a website where only registered, verified users can submit their papers for grading. The contents of the paper are going to be weighed in relation to the relationship between the teacher and student and in relation to the course difficulty. The network adapts to the teacher's grading habits for certain classes, the relationship between the teacher and student (e.g., if a teacher dislikes a student you might expect to see a drop in the student's grades), and the course level (e.g., a teacher shouldn't grade a freshman's paper as harshly as a senior's paper).
However, this approach poses some considerable problems. There is an inherent limit imposed, where the numbers of students, teachers and courses prove to be too much and everything blows up! That's because there is no magic number which can account for every combination of student, teacher and course.
So, I've concluded that each teacher, student, and course must have an individual (albeit arbitrary) weight associated with them, not present in the Neural Network itself. The teacher's weight would describe her grading difficulty, and the student's weight would describe her ability as a writer. The weight of the course would describe the difficulty of the course. Of course, as more and more data is aggregated, the weights should adapt to become more accurate representations.
I realize that there is a relation between teachers and students, teachers and courses, and students and courses; therefore, I plan to make three respective hidden layers which sum the weights of its inputs and apply an activation function. How could I store the weights associated with each teacher, student and course, though?
I have considered storing it in their respective tables, but I don't know how well that would scale (or for that matter, if it would work). I also considered storing it in a file and calling it like that, but I'm sure that would be even worse than storing it in a database.
So the main question I have is: is it (objectively) efficient, in terms of space and computational complexity, and scalable, to store and manage separate, individual weights for each possible element of certain inputs in a SQL database outside of the neural network, if there is a finite (not necessarily small) number of possible choices for such inputs, and still receive a reasonable output?
Either way, I would like an explanation of why. I believe it would be just fine, but I can't justify it myself and so I'm asking for help. Thanks in advance!
(P.S.: If you notice any problems with my approach not covered in the scope of this question, or have general advice, please include it as an addendum to your answer or please message me).
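To make the storage idea concrete, here is a minimal sketch of per-entity weights kept in a SQL table and fetched by key at prediction time. Everything in it is an assumption made for illustration: the `entity_weight` table, its columns, the helper names, and especially the way `predict` combines the weights (a placeholder sigmoid, not the planned network). What it does show is that each prediction needs only three primary-key lookups, and that storage grows linearly with the number of students, teachers and courses rather than with their combinations.

```python
import math
import sqlite3

# Hypothetical schema: one learned scalar per (kind, entity_id) pair.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entity_weight (
        kind      TEXT    NOT NULL,   -- 'student', 'teacher' or 'course'
        entity_id INTEGER NOT NULL,
        weight    REAL    NOT NULL,
        PRIMARY KEY (kind, entity_id)
    )
""")


def set_weight(conn, kind, entity_id, weight):
    """Insert or overwrite the stored weight for one entity."""
    conn.execute(
        "INSERT OR REPLACE INTO entity_weight (kind, entity_id, weight) "
        "VALUES (?, ?, ?)",
        (kind, entity_id, weight),
    )


def get_weight(conn, kind, entity_id, default=0.0):
    """Primary-key lookup; unseen entities fall back to a neutral default."""
    row = conn.execute(
        "SELECT weight FROM entity_weight WHERE kind = ? AND entity_id = ?",
        (kind, entity_id),
    ).fetchone()
    return row[0] if row else default


def predict(conn, paper_score, student_id, teacher_id, course_id):
    """Placeholder combination (an assumption): ability raises the score,
    grading difficulty and course difficulty lower it."""
    s = get_weight(conn, "student", student_id)
    t = get_weight(conn, "teacher", teacher_id)
    c = get_weight(conn, "course", course_id)
    z = paper_score + s - t - c
    return 1.0 / (1.0 + math.exp(-z))


set_weight(conn, "student", 1, 1.0)
set_weight(conn, "teacher", 2, 0.5)
set_weight(conn, "course", 3, 0.5)
grade = predict(conn, 0.0, 1, 2, 3)
```

In a real training loop the weights would more likely be loaded into memory in batches and written back periodically, since one query per gradient step would be dominated by database round trips.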

Related

Making smaller models from pre-existing redundant models

Sorry for the vague title.
I will just start with an example. Say that I have a pre-existing model that can classify dogs, cats and humans. However, all I need is a model that can classify between dogs and cats (Humans are not needed). The pre-existing model is heavy and redundant, so I want to make a smaller, faster model that can just do the job needed.
What approaches exist?
I thought of utilizing knowledge distillation (using the previous model as a teacher and the new model as the student) and training a whole new model.

First, prune the teacher model to have a smaller version to be used as a student in distillation. A simple regime such as magnitude-based pruning will suffice.
For distillation, your output vectors will no longer match (the student is 2-dimensional and the teacher is 3-dimensional), so you will have to take this into account and calculate the distillation loss only on the overlapping dimensions. An alternative is layer-wise distillation, in which the output vectors are irrelevant and the distillation loss is calculated from the difference between intermediate layers of the teacher and student. In both cases, the total loss may include a difference between student output and label, in addition to the difference between student output and teacher output.
It is possible for a simple task like this that just basic transfer learning would suffice after pruning - that is just to replace the 3d output vector with a 2d output vector and continue training.
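As a rough illustration of computing the distillation loss only on the overlapping dimensions, here is a small dependency-free sketch. The class order (dog, cat, human), the `keep` index list, the function names and the temperature value are assumptions for the example; a real setup would use a framework's tensors and loss functions instead.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, keep, temperature=2.0):
    """KL(teacher || student) restricted to the classes both models share.

    `keep` lists the teacher output indices that the student also predicts.
    Taking the softmax over just those logits renormalises the teacher's
    distribution after dropping the unwanted class.
    """
    teacher_probs = softmax([teacher_logits[i] for i in keep], temperature)
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )


# Teacher outputs (dog, cat, human); student outputs (dog, cat).
loss_same = distillation_loss([2.0, 1.0, -3.0], [2.0, 1.0], keep=[0, 1])
loss_diff = distillation_loss([2.0, 1.0, -3.0], [0.0, 3.0], keep=[0, 1])
```

The loss is zero when the student reproduces the teacher's dog/cat logits and positive otherwise; in practice it would be added to the ordinary label loss with some weighting.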

How to design a relational model for double-entry accounting with job costing

I would like to commend to readers the answers here and here for the depth and thought that went into them. I stumbled across them while searching for something tangential for a project I'm working on, and I got caught up reading them from top to bottom.
I am trying to build a niche-market app using these principles (namely, double-entry accounting), with job-costing thrown in. The above answers have been extremely helpful in reshaping my concept of what both the accounting and the database-ing should look and work like. However, I'm having a hard time integrating the job-costing portion of the equation into the excellent graphical examples that were provided.
There were several transaction examples using the House, account holders, fees, etc. I have a few other specific use-cases I would love to get some input on:
I have no customers. I buy a property (usually cash goes out, a liability (loan) is created, an asset (the property) is created), spend a bunch of money to fix it up (either cash out at a store, credit card charges at a store, or a check written to a vendor, which debits the property asset and debits or credits the funding source), and then sell it (cash comes in, the loan is paid off, and hopefully there's more cash left than what I spent on the project). This likely creates more ledger entries than I've listed above, but I'm not an accountant. I think I understand that all my costs go toward my basis in the property, and if my net proceeds are greater than my basis, then I've made money, and if not, then not.
So what I need to record are expenses that a) come from a specific account (i.e. company checking account or owner's Best Buy card etc.), b) are generally associated with a specific job (but not always - I do have the occasional overhead expense like office supplies), and c) are always associated with a cost code (i.e. '100.12 - Window Materials', '100.13 - Window Labor', etc.).
Frequently I receive bills from vendors that are due sometime in the future. I would like to track the bills received but not-yet-paid for a given job (committed costs). I think this transaction looks like this, but I'm not really sure:
As you may have surmised from my quip above about the "owner's Best Buy card," I sometimes (more often than I should) use my personal funds for company- and job-related expenses. I think (again with the caveat that I'm a layman) that all of those expenditures credit "Owner's Equity," and debit/credit other accounts as needed.
I've been keeping track of all of this in a big, ugly spreadsheet, which is why I'm trying to build an app to replace it - the spreadsheet method doesn't work very well and it certainly won't scale.

Preliminary
For those reading this Answer, please note that the context is as follows, in increments:
Derived Account Balance vs Stored Account Balance
Relational Data Model for Double-Entry Accounting
If you have not availed yourself of that, this Answer may not make sense.
I will respond in a sequence that is Normalised, which is of course different to the way you have laid out the problem.
Principle & Correction
There are a few, more than one, errors in your stated problem which you are not aware of, so the first step is awareness; understanding. Once a problem is correctly and precisely declared, it is easy to solve. These are errors that developers commonly make, so they need to be understood as such ... long before an app is contemplated.
1 First Principle
I've been keeping track of all of this in a big, ugly spreadsheet [the spreadsheet method doesn't work very well and it certainly won't scale], which is why I'm trying to build an app to replace it
If the manual (or the previous computerised) system is broken, and you implement a new or replacement app that is based on it, you are guaranteed to carry that broken-ness into the app.
Worse, if this is not understood, a third app can be written, promising to fix the problems in the second app, but it too, is guaranteed to migrate the problems that were not fixed in the first and second app.
Therefore, you must identify and correct every single problem in the system that you are replacing, including testing, before you can design an app and database that has any chance of success.
Scaling is the least of our worries. How any particular thing works with any other thing is the problem.
The fact that you have one great big ugly spreadsheet means that you have an overall perspective: humans can do that, we can fly by the seat of our pants, but computers cannot, they require explicit instructions.
2 Second Principle
I've been keeping track of all of this in a big, ugly spreadsheet [...] - the spreadsheet method doesn't work very well
Why does it not work [as it stands] ?
Reason 1 of 2.
You make a mistake that developers commonly make: you inspect and study the bits and pieces of a thing, which is in the physical realm, and try to figure out how the thing works. Guaranteed failure, because how a thing works; its purpose; etc, is in the intellectual realm, not the physical.
I won't detail it here, but the larger problem must be noted. This error is a specific instance of a larger error, and very common, that:
developers focus on the functions of the GUI,
instead of the demand, which is to
correctly define the data and its relations, upon which the functions of the GUI are existentially dependent.
A person who has not learned about internal combustion, cannot figure out how to build an engine from looking at the parts of an engine that has been taken apart, even if the parts are laid out carefully. Let alone one with injectors or turbo-chargers. The principle of internal combustion is logical, the parts are physical.
Here you have looked at the spreadsheets that others have used to do their Accounting, and perhaps copied that, without understanding what they are doing with the spreadsheets.
Case in point.
You have examined the first and second linked Answers, and you think you can figure out how to apply that to a new app that fixes the dirty big spreadsheet problem.
Many developers think that if they work out the nuts and bolts, copy-paste-and-substitute, somehow the app will work. Note the carefully thought-out, but still incomplete, graphics that detail perceived transactions.
They are missing the logical realm, and messing with the physical realm without the demanded understanding of what they are messing with.
In a word, forget about the pretty graphics for the Transactions, both yours and mine, and seek to understand the Logic (this principle) and the Accounting Standard [3].
"Test driven development" aka "code the minimum" aka "trial and error" is a totally bankrupt method, it has no scientific basis (marketing, yes, but science, no), and it is guaranteed to fail. Dangerous, because the cost is ongoing, never finite.
And to keep failing, if you understand the above.
More precisely, it is anti-science, in that it contradicts the science for building apps and databases.
So the first step is to break that great big spreadsheet down into logical units that have a purpose. And certainly, link each referencing spreadsheet column to the right columns in the referenced spreadsheet ... such that any Amount value is never duplicated.
3 Third Principle
I've been keeping track of all of this in a big, ugly spreadsheet [...] - the spreadsheet method doesn't work very well
Why does it not work, either as it stands, or when the spreadsheet has been divided into logical units ?
Reason 2 of 2.
Lack of Standards.
Since the subject matter is Accounting, we must use Accounting Standards.
That single great big ugly spreadsheet is ready evidence that you have not used an Accountant to set it up. And of course, you cannot set up a set of spreadsheets to do your Accounting without either understanding Accounting or using a qualified Accountant.
Therefore the second step is to either get an Accountant, or obtain a good understanding of Accounting. Note again, the ready evidence of your carefully thought out transactions: despite the fact that you are a very capable person, you cannot figure out the Accounting logic that is in the first and second linked Answers, let alone the Accounting that you need for your app (or your manual system).
So the best advice I can give you is, as stated in the Double-Entry Accounting Answer, find some good Tutorials on the web, and study them.
If you did that, or hired an Accountant to set up your books, you would split the single big fat spreadsheet into standard Accounting Spreadsheets:
Balance Sheet:
Asset or Liability
Profit & Loss:
Revenue or Expense
and one more set (later)
Another way of stating this principle is this. When one is ignorant that a Standard exists, or worse, when one knowingly chooses to not comply with it, one is left in the dangerous position of re-inventing the wheel, from scratch. Aka "Test driven development", aka "code the minimum possible", aka "trial and error". That means that one will go through an entire series of increments of development, which can be eliminated by observance of the Standard.
Problem & Solution
Now that we understand the principles, we can move on to determination of the specific problems, and their solutions. Each of these is a specific application of the Third Principle.
4 Property/Mortgage Treatment
I have no customers. I buy a property (usually cash goes out, a liability (loan) is created, an asset (the property) is created), spend a bunch of money to fix it up (either cash out at a store, credit card charges at a store, or a check written to a vendor, which debits the property asset and debits or credits the funding source), and then sell it
I am not saying that you have not heeded the advice I have given in the Double-Entry Answer. I am saying you have not appreciated the gravity of the advice; what it means in an Accounting context (before we venture into the database context).
Money represents value. Money; value, cannot be created or destroyed. It can only be moved. From one bucket to another. The demand is to have your buckets defined and arranged properly, according to [3].
The property is not created, it already exists. When you buy a property, there is a movement of your cash to the bank, and a movement of their property to you. In the naïve sense only, the property is now an "asset", the mortgage is now a "liability". That naïveté will be clarified into proper accounting buckets later.
You are, in fact, operating as a small single-branch bank; a cooperative; a casino. The precise context for the Double-Entry Accounting Answer. The following is true for
either a corrected set of spreadsheets,
or for following and implementing the Double-Entry Accounting Answer (if you go directly into the app ... without testing the correction to your single spreadsheet).
This is really important to understand, because it has to do with legislation in your country, which you have not mentioned. That legislation will be known to you as Taxation, or your Tax Return for the business. Even if you hold just one property at any one time.
Your "customer" is each bank that is engaged for each property. Name it for the property.
Each mortgage (property) should be set up as an External Account. That will allow you to conduct only those transactions that are actually related to it, against it. Loan Payments; Bank Charges; Expenses; etc. There will be no incoming money, until the property is sold.
In any case, the External Account will match the Bank Statement that the bank gives you for the mortgage account (which you did not mention, but which is a fundamental requirement of Accounting).
As defined in the Double-Entry Accounting Answer, every transaction on an ExternalAccount will have one Double-Entry leg in the Ledger. More, later.
Whether it is an Asset or a Liability in Accounting terms, is a function of the Ledger entry, not a function of the External Account. (By all means, we know it represents a property, which by a naïve perspective is an "asset", until it starts losing money, when it by naïve perspective, becomes a "liability".)
Another way of defining this point is, the bank loan represents a contract, upon which money (value) "changes hands" (is moved). The bank which you engaged is the "customer", the External Account. You must keep all income and expense related to the contract, with the contract.
niche-market app ...
I have a few other specific use-cases ...
No, you don't. There is nothing new under the sun. If you set up your books correctly (multiple linked spreadsheets using Accounting Standards), this is a vanilla use case. Hopefully my explanation has demonstrated that fact.
5 Ledger
Where the above points have to do with the intellectual realm, the understanding of each problem and therein the solution, which causes little work in the physical realm, this point, which has the same demand for the intellectual, is onerous at the physical level. That is, the number of keystrokes; checking; changes; checking ... before you get it set up correctly.
Although the first linked Answer deals with:
Derived vs Stored Account Balance (efficient and audit-able processing re month end),
and the second linked Answer deals with:
Double-Entry Accounting (implementation of an over-arching Accounting Standard in an existing Accounting system, a higher level of audit-ability),
neither explains the Ledger in detail.
The Ledger is the central article of any Accounting system.
The Double-Entry system is not a stand-alone article, but an advancement to that Ledger.
The data model is the specific how to set the database up correctly for both the app, and any reporting client s/w to use, uneventfully.
You do not have a true Ledger. The single big spreadsheet is not a Ledger.
You must set up the Ledger, according to [3]. At best, some of the items in that spreadsheet will be entries in the Ledger, but note, you will perceive them quite differently, due to the corrections set forth in [1][2][3].
Note that when we say "put that in the Ledger" or "that is not in the Ledger", which is for simplicity, what we mean precisely is a reference to a single Ledger Entry, which is identified by a specific Account Number in the Ledger.
In the data model, this is LedgerNo.
Likewise, when we say "Accounts", we mean precisely a single Account Number in the Ledger.
If a transaction is not in the Ledger (a specific Account Number, a LedgerNo, one leg of the DEA Credit/Debit), it is not in the "accounts", it is not accounted for.
This is where you will set up genuine Accounts for Assets, and for Liabilities. This is for Internal purposes, in the Ledger, as declared in the margin for Internal in the data model.
The best advice I can give you is, trawl the web for Tutorials on Accounting; determine which are good; study them carefully, with a view to setting up a proper Ledger for your purposes.
The simple answer is, the Ledger is an Hierarchy of Account Numbers.
Wherein the leaf level is an actual AccountNo that can be transacted against,
and the non-leaf levels exist for the purpose of aggregation, no transactions allowed.
Whenever the Ledger is reported (or any derivative of the Ledger, such as BalanceSheet or Profit & Loss):
the hierarchy is shown by indentation,
the transactional Account entries show the Current Balance for the current month
and the aggregate Account entries show the aggregate for the tree under it
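The hierarchy rules just listed can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: the `LedgerNode` class and its method names are hypothetical, a node is treated as a leaf simply because it has no children, and the account names are borrowed from the Expense tree used later in this Answer. The actual data model enforces these rules with tables and constraints instead.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class LedgerNode:
    name: str
    children: list = field(default_factory=list)
    balance: Decimal = Decimal("0")  # meaningful on leaf accounts only

    def add(self, child):
        """Attach a child account and return it, for convenient chaining."""
        self.children.append(child)
        return child

    def is_leaf(self):
        return not self.children

    def post(self, amount):
        """Only leaf accounts may be transacted against."""
        if not self.is_leaf():
            raise ValueError(f"{self.name} is an aggregate account")
        self.balance += amount

    def total(self):
        """Leaf: its own balance. Non-leaf: the aggregate of its subtree."""
        if self.is_leaf():
            return self.balance
        return sum((c.total() for c in self.children), Decimal("0"))

    def report(self, depth=0):
        """One line per account, with the hierarchy shown by indentation."""
        lines = [f"{'  ' * depth}{self.name}: {self.total():.2f}"]
        for child in self.children:
            lines.extend(child.report(depth + 1))
        return lines


expense = LedgerNode("Expense")
improvement = expense.add(LedgerNode("Property Improvement"))
structure = improvement.add(LedgerNode("Structure"))
material = structure.add(LedgerNode("Material"))
labour = structure.add(LedgerNode("Labour"))

material.post(Decimal("100.00"))
labour.post(Decimal("50.00"))
lines = expense.report()
```

Printing `lines` gives the indented tree, with each aggregate line showing the sum of the accounts beneath it.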
[your graphics re transactions]
First and foremost, every Transaction is in the Ledger. That means one leg of the Double-Entry Accounting Transaction is in the Ledger. Look at § 5 in my Double-Entry Accounting Answer, notice that every Business Transaction has at least one blue entry (do not worry about the other details).
Second, the other DEA leg is:
either in the Ledger, meaning that the money moved between one Ledger Account LedgerNo and another Ledger Account LedgerNo. Notice the Business Transactions where both sides are blue.
or in an External Account, meaning that the money moved between one Ledger Account LedgerNo and an External Account AccountNo. Notice the Business Transactions where one side is blue and the other is green.
When you understand that, and you have your Ledger set up, there will be no "??" in your graphics, and the blue/green will be shown. (Do not re-do your graphics, I expect that this Answer will suffice.)
Your "asset/liab" designation is not correct. More precisely, it is premature to make that declaration before the Ledger is fully defined and arranged. First set up your Ledger, with Asset/Liability for each entry in mind. Then you will not have to declare "asset/liab" on each transaction, because that is a function of the Ledger Account Number LedgerNo, not a function of the transaction.
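The rule that every Business Transaction has one leg in the Ledger, with the other leg in either the Ledger or an External Account, can be expressed as a simple validity check. This is a hedged sketch, not the data model: the `Leg` and `BusinessTransaction` classes, the `paid_to`/`paid_from` field names and the account strings are hypothetical, and the sketch deliberately says nothing about which side is the Debit or the Credit.

```python
from dataclasses import dataclass
from decimal import Decimal

VALID_KINDS = ("ledger", "external")


@dataclass(frozen=True)
class Leg:
    kind: str     # "ledger" (a LedgerNo) or "external" (an AccountNo)
    account: str  # hypothetical account identifier

    def __post_init__(self):
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unknown leg kind: {self.kind}")


@dataclass(frozen=True)
class BusinessTransaction:
    paid_to: Leg
    paid_from: Leg
    amount: Decimal

    def __post_init__(self):
        if self.amount <= 0:
            raise ValueError("amount must be positive")
        # Every transaction is in the Ledger: at least one leg must be there.
        if "ledger" not in (self.paid_to.kind, self.paid_from.kind):
            raise ValueError("at least one leg must be in the Ledger")

    def shape(self):
        """Name the transaction the way the Answer does."""
        kinds = {self.paid_to.kind, self.paid_from.kind}
        return "Ledger-Ledger" if kinds == {"ledger"} else "Ledger-ExternalAccount"


overhead = BusinessTransaction(
    paid_to=Leg("ledger", "Expense/Regular/Office Supplies"),
    paid_from=Leg("ledger", "Revenue/Monthly Payable"),
    amount=Decimal("42.00"),
)
job_expense = BusinessTransaction(
    paid_to=Leg("ledger", "Expense/Property Improvement/Structure/Material"),
    paid_from=Leg("external", "Property 1 mortgage"),
    amount=Decimal("310.00"),
)
```

Here `overhead.shape()` is "Ledger-Ledger" and `job_expense.shape()` is "Ledger-ExternalAccount"; a transaction with both legs in External Accounts is rejected.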
expenses that a) come from a specific account (i.e. company checking account or owner's Best Buy card etc.),
Ledger-ExternalAccount
(one DEA leg in the Ledger, the other leg in the External Account). Noting the caveats above. The other DEA leg will be to one of these (hierarchy):
Expense/Property Improvement/Structure/Material
Expense/Property Improvement/Structure/Labour
Expense/Property Improvement/Fitting/Material
Expense/Property Improvement/Fitting/Labour
Expense/Property Improvement/Furniture
expenses that c) are always associated with a cost code (i.e. '100.12 - Window Materials', '100.13 - Window Labor', etc.).
You will no longer have "cost codes", they will all be Ledger Account Numbers LedgerNos, because the Ledger is where you account for anything and everything.
One DEA leg in the Ledger, the other leg in the External Account for the particular property. The hierarchy will be the same as the previous point.
expenses that b) are generally associated with a specific job
Ledger-ExternalAccount
(one DEA leg in the Ledger, the other in the External Account).
(but not always - I do have the occasional overhead expense like office supplies)
Ledger-Ledger
one DEA leg in the Ledger for an Expense or Liability LedgerNo ... that the money was paid to
Expense/Regular/Office Supplies
the other leg in the Ledger for a Revenue or Asset LedgerNo ... that the money was paid from
Revenue/Monthly Payable
6 Credit & Other Card Treatment
credit card charge
Best Buy card
Each of your cards represents a contract, an Account that needs to be transacted against, that must be balanced against the monthly statement provided by the institution that issued the card.
Set up each one as an External Account, one DEA leg here, the other in the Ledger.
"owner's Best Buy card" is not clear to me (who is the owner, you or the property owner ... if the latter then the assumption thus far, that "you" buy and sell properties is incorrect.)
In any case, I believe I have given enough detail for you to figure it out.
Do not amalgamate an owner's property Account and their Best Buy card into one External Account: keep separate External Accounts for each.
7 Job Costing
Notice that I am addressing this last, because once you fix the big problems, the problems that remain, are small. What you set out as the big problems (job costing; profit/loss per property) are, once the Ledger has been set up correctly for your business, actually small problems.
As far as I can see, Job Costing is the only remaining point that I have not addressed. First, the issue to be understood here is the difference between Actuals and Estimates. Everything I have discussed thus far concerns Actuals.
For Estimates, the Standard procedure is to set up a separate Account structure (tree in the hierarchy) in the Ledger. These are often called Suspense Accounts, as in money that is held in suspense.
Treated properly, these Accounts will prevent you from closing or finalising an External Account before all the Estimates have been transferred to Actuals (Suspense to zero).
The Business Transactions are exactly the same as for Actuals.
This will provide precise tracking of such figures, and also the difference when an item moves from Estimate to Actual.
8 Data Model • Job Costing
Noting that the data models in the first and second linked Answers are complete for the purpose, wherein the Ledger is not expanded:
this Answer deals with explanation of the Ledger, and this data model gives the full definition of a Ledger
Arranged by AccountType
A single-parent hierarchy
Only the leaf level LedgerAccount may be transacted against
The intermediate level LedgerIntermediate is for summarising the tree below it.
I have further Normalised Transaction
expanded External Account to show a Person vs an Organisation
All constraints are made explicit.
Obviously too large for an inline graphic. Here is a PDF in two pages:
the Data Model alone (as above)
the Data Model with sample data and notes, it includes all the examples covered in the Answer
Note the indentation in the Ledger, which denotes the Account hierarchy
Comments
How do you insert the first ledger (e.g. 100 Asset, no parent)?
The Ledger is a Tree, a Single Parent Hierarchy (aka "one way" for strange reasons), as per Account Hierarchy. A root row is required. In a database build operation (using DDL from a file), we generally do all our CREATE TABLEs, followed by all our ADD CONSTRAINT FKs. Insert the root row in with the CREATE TABLE.
After the CREATE TABLE Ledger, do INSERT Ledger VALUES ( 0, 0, "I", "AL", "Root", ... ).
After the CREATE TABLE LedgerIntermediate, do INSERT LedgerIntermediate VALUES ( 0 ).
Given that the reverse of Comprises is belongs to, all first-level Ledgers eg. Fees, House, Interbank and your Asset would belong to this root row.
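For readers who want to try the root-row idea, here is a minimal runnable sketch using Python's built-in sqlite3 module. It is an assumption-laden simplification: the `ledger` table has only three hypothetical columns rather than the full definition in the data model, and the foreign key is declared inline because SQLite has no ADD CONSTRAINT. It shows a root row that is its own parent, a first-level account that belongs to it, and the constraint rejecting a row whose parent does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Simplified, hypothetical single-parent hierarchy table.
conn.execute("""
    CREATE TABLE ledger (
        ledger_no INTEGER PRIMARY KEY,
        parent_no INTEGER NOT NULL REFERENCES ledger (ledger_no),
        name      TEXT    NOT NULL
    )
""")

# The root row references itself, so it satisfies the foreign key on its own.
conn.execute("INSERT INTO ledger VALUES (0, 0, 'Root')")

# First-level accounts belong to the root row.
conn.execute("INSERT INTO ledger VALUES (100, 0, 'Asset')")

first_level = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM ledger "
        "WHERE parent_no = 0 AND ledger_no <> 0 ORDER BY ledger_no"
    )
]

# A row whose parent does not exist is rejected by the constraint.
try:
    conn.execute("INSERT INTO ledger VALUES (200, 999, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because the root is inserted immediately after the table is created, every later insert can rely on a valid parent already being present.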

Building an autonomic drugs widget for medical education

I've made my way over to this community because I'm planning on building a widget to help medical students with understanding the effects of various autonomic medications on cardiovascular metrics like heart rate (HR), BP (systolic, diastolic, and mean) and peripheral resistance (SVR). Some background - I'm a 3rd year med student in the US without fluency in any programming languages (which makes this particularly difficult), but am willing to spend the time to pick up what I need to know to make this happen.
Regarding the project:
The effects of autonomic medications like epinephrine, norepinephrine, beta-blockers, and alpha-blockers on the cardiovascular system are of great interest to physicians because these drugs can be used to resuscitate, to prep for surgery, to slow the progression of cardiovascular and respiratory disease, and even as antidotes for certain toxicities. There are four receptor types we are primarily concerned with - alpha1, alpha2, beta1, beta2. The receptor selectivity profile of any given drug is what governs its effects on the CV system. The way these effects are taught and tested in med school classrooms and by the United States board exams is in the form of graphs.
The impetus for this project is that many of my classmates and I struggled with this concept when we were initially learning it, and I believe a large part of that arises from the lack of a resource which shows the changes in the graphs from baseline, in real time.
When being taught this info, we are required to consider: a) the downstream effects when the receptor types listed above are stimulated (by an agonist drug) or inhibited (by an antagonist); b) the receptor specificities of each of the autonomic drugs of interest (there are about 8 that are very important); c) how to interpret the graphs shown above and how those graphs would change if multiple autonomics were administered in succession. (Exams and the boards love to show graphs with various points marked along them, then ask which drugs are responsible for the changes seen, just like the example above.)
The current methods of learning these three points are a mess, and having gone through it, I'd like to do what I can to contribute to building a more effective resource.
My goal is to create a widget that allows a user to visualize these changes with up to 3 drugs in succession. Here is a rough sketch of the goal.
In this example, norepinephrine has strong alpha1 agonist effects which causes an increase in systolic (blue line), diastolic (red line), and mean BP, as well as peripheral resistance. Due to the increased BP, there is a reflexive decrease in HR.
Upon the administration of phentolamine, a strong alpha1 antagonist, the BP and SVR decline while HR increases reflexively.
Regarding the widget, I would like the user to be able to choose up to 3 drugs from a drop down menu (eg. Drug 1, Drug 2, Drug 3), and the graphs to reflect the effects of those drugs on the CV metrics while ALSO taking into account the interactions of the drugs with themselves.
This is an IMPORTANT point: the order in which drugs are added is important because certain receptors become blocked, preventing other drugs from having their primary effect so they revert to their secondary effect.
If you're still following me on this, what I'm looking for is some help in figuring out how best to approach all the possibilities that can happen. Should I try to understand if-then statements and write a script to produce graphs based off those? (eg. if epi, then Psys = x, Pdia = y, MAP = z). Should I create a contingency table in excel in which I list the 8 drugs I'm focusing on and make values for the metrics and then plot those, essentially taking into account all the permutations? Any thoughts and direction would be greatly appreciated.
Thank you for your time.

How to handle properties that exist "between" entities (in a many-to-many relationship in this case)?

I've found a few questions on modelling many-to-many relationships, but nothing that helps me solve my current problem.
Scenario
I'm modelling a domain that has Users and Challenges. Challenges have many users, and users belong to many challenges. Challenges exist, even if they don't have any users.
Simple enough. My question gets a bit more complicated as users can be ranked on the challenge. I can store this information on the challenge, as a set of users and their rank - again not too tough.
Question
What scheme should I use if I want to query the individual rank of a user on a challenge (without getting the ranks of all users on the challenge)? At this stage, I don't care how I make the call in data access, I just don't want to return hundreds of rank data points when I only need one.
I also want to know where to store the rank information; it feels like it's dependent upon both a user and a challenge. Here's what I've considered:
1. The obvious: when instantiating a Challenge, just get all the rank information; slower but works.
2. Make a composite UserChallenge entity, but that feels like it goes against the domain (we don't go around talking about "user-challenges").
3. Third option?
I want to go with number two, but I'm not confident enough to know if this is really the DDD approach.
Update
I suppose I could call UserChallenge something more domain appropriate like Rank, UserRank or something?

The DDD approach here would be to reason in terms of the domain and talk with your domain expert/business analyst/whoever about this particular point to refine the model. Don't forget that the names of your entities are part of the ubiquitous language and need to be understood and used by non-technical people, so maybe "UserChallenge" is not the most appropriate term here.
What I'd first do is try to determine whether that "middle entity" deserves a place in the domain model and the ubiquitous language. For instance, if you're building a website and there's a dedicated Rankings page where the user can see a list of all his challenges with the associated ranks, chances are ranks are a key matter in the application and a Ranking entity will be a good choice to represent that. You can talk with your domain expert to see if Rankings is a good name for it, or go for another name.
On the other hand, if there's no evidence that such an entity is needed, I'd stick to option 1. If you're worried about performance issues, there are ways of reducing the multiplicity of the relationship. Eric Evans calls that qualifying the association (DDD, p.83-84). Technically speaking, it could mean that the Challenge holds a map (or dictionary) of ranks with the User as the key.
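As a rough illustration of the qualified association described above, here is a minimal Python sketch. The class shapes, the `user_id` field, and the `set_rank` / `rank_of` method names are assumptions made for the example, not something taken from the question or from Evans' book.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class User:
    # Hypothetical identity field; frozen so instances are hashable map keys.
    user_id: int


@dataclass
class Challenge:
    name: str
    # Qualified association: ranks are keyed by User instead of held in a list.
    _ranks: Dict[User, int] = field(default_factory=dict)

    def set_rank(self, user: User, rank: int) -> None:
        self._ranks[user] = rank

    def rank_of(self, user: User) -> Optional[int]:
        # Returns None when the user has no rank on this challenge.
        return self._ranks.get(user)


challenge = Challenge("push-ups")
alice, bob = User(1), User(2)
challenge.set_rank(alice, 1)
challenge.set_rank(bob, 2)
```

With this shape, `challenge.rank_of(alice)` is a single dictionary lookup. Whether the map is loaded eagerly or lazily from storage is a separate data-access decision that this sketch does not address.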

I would go with Option 2. You don't have to "go around talkin about user-challenges", but you do have to go around grabbin all them Users for a given challenge and sorting them by rank and this model provides you a great way to do it!
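One way Option 2 might look at the storage level is sketched below using Python's built-in `sqlite3` module. The table and column names (`rankings`, `position`, and so on) and the sample rows are invented for illustration; the point is only that a row per (user, challenge) pair supports both the single-rank lookup and the sorted list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE challenges (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- One row per (user, challenge) pair; the rank lives on the pair itself.
    CREATE TABLE rankings (
        user_id INTEGER NOT NULL REFERENCES users(id),
        challenge_id INTEGER NOT NULL REFERENCES challenges(id),
        position INTEGER NOT NULL,
        PRIMARY KEY (user_id, challenge_id)
    );
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO challenges VALUES (10, 'push-ups');
    INSERT INTO rankings VALUES (1, 10, 2), (2, 10, 1);
""")


def rank_of(user_id, challenge_id):
    # Fetches exactly one rank, not every rank on the challenge.
    row = conn.execute(
        "SELECT position FROM rankings WHERE user_id = ? AND challenge_id = ?",
        (user_id, challenge_id),
    ).fetchone()
    return row[0] if row else None


def leaderboard(challenge_id):
    # All users on a challenge, sorted by rank.
    return conn.execute(
        "SELECT u.name, r.position FROM rankings r "
        "JOIN users u ON u.id = r.user_id "
        "WHERE r.challenge_id = ? ORDER BY r.position",
        (challenge_id,),
    ).fetchall()
```

A challenge with no users is simply a row in `challenges` with no matching rows in `rankings`, which matches the requirement that challenges exist independently of their users.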

In RDBMS, is there a formal design principle for Concrete objects, such as Course vs CourseSession?

In designing an RDBMS schema, I wonder if there is a formal principle for concrete objects: for example, if it is a Persons table, then each record is very concrete and unique. Each record in fact represents a unique person.
But what about a table such as Courses (as in school)? It can have a description, number of units, whether it is offered only in Autumn (Fall) or Spring, etc., which are the "general properties" of a course.
And then there are the actual CourseSessions, which hold the time_from and time_to (such as 10 to 11am), whether the session meets Monday/Wednesday or Tuesday/Thursday, and the instructor teaching it, and which point back to the Courses table using a course_id.
So the above 2 tables are both needed.
Are there principles of table design for "concrete" vs "abstract"?
Update: what I mean by "abstract" here is that a course is an abstract idea... there can be multiple instances of it... such as the course Physics 10 from 10-11am, and another at 12-1pm.
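For reference, the two tables described above might be sketched as follows with Python's built-in `sqlite3` module. The exact column names, the unit count, and the instructor placeholders are assumptions for illustration; only the Courses/CourseSessions split and the `course_id` back-reference come from the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- General properties of a course (the "abstract" side).
    CREATE TABLE courses (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        units INTEGER NOT NULL,
        term_offered TEXT NOT NULL
    );
    -- One row per scheduled offering (the "concrete" side).
    CREATE TABLE course_sessions (
        id INTEGER PRIMARY KEY,
        course_id INTEGER NOT NULL REFERENCES courses(id),
        days TEXT NOT NULL,
        time_from TEXT NOT NULL,
        time_to TEXT NOT NULL,
        instructor TEXT NOT NULL
    );
    INSERT INTO courses VALUES (1, 'Physics 10', 4, 'Autumn');
    INSERT INTO course_sessions VALUES
        (1, 1, 'Mon/Wed', '10:00', '11:00', 'Instructor A'),
        (2, 1, 'Tue/Thu', '12:00', '13:00', 'Instructor B');
""")

# Each session row joins back to the single course row it is an instance of.
sessions = conn.execute(
    "SELECT c.title, s.time_from, s.time_to FROM course_sessions s "
    "JOIN courses c ON c.id = s.course_id ORDER BY s.time_from"
).fetchall()
```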

"for example, if it is a Persons table, then each record is very concrete and unique. Each record in fact represents a unique person."
That is the hope, but not the reality of the situation.
Because of immigration or legal death status, it is possible for two (or more) records to represent the same person. Uniquely identifying people is difficult: first, middle and surnames can match but actually refer to different people. SSN/SIN are not reliable, because they can change (immigration, legally dead). A name doesn't guarantee gender, and gender can be changed.
Are there principles of table design for "concrete" vs "abstract"
The classification of being "concrete" vs "abstract" is arbitrary, subject to interpretation. Does the start and end date really make a Course session "concrete"? Because I can book numerous things in [Calendaring software of choice] - doesn't mean class actually took place, or that final grades are legitimate values...
Table design is based on business rules, and the logical entities (which can become tables in the physical model) required to support those rules. Normalization helps make these entities more obvious.

The relational data model, based on mathematics, provides a way to design your data model so that certain operations are guaranteed to be correct.
Unfortunately, a purely relational design does not by itself solve performance issues in a database. Organizing tables for a given business domain means considering not only the abstract model of the objects and database normalization, but also performance planning for your system. Yes, this is a leaky abstraction.
For example, there are two design strategies for tree structures: the adjacency model and the materialized path model (see The Art of SQL). Which one is better depends on which operations need to be optimized.
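To make the trade-off concrete, here is a small sketch of both tree models using Python's built-in `sqlite3` module. The table names (`adj`, `mp`), the `/`-separated path encoding, and the sample tree are assumptions chosen for the example, not taken from The Art of SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Adjacency model: each row points at its parent.
    CREATE TABLE adj (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO adj VALUES (1, NULL, 'root'), (2, 1, 'a'), (3, 2, 'b'), (4, 1, 'c');

    -- Materialized path model: each row stores the full path from the root.
    CREATE TABLE mp (id INTEGER PRIMARY KEY, path TEXT, name TEXT);
    INSERT INTO mp VALUES (1, '1/', 'root'), (2, '1/2/', 'a'),
                          (3, '1/2/3/', 'b'), (4, '1/4/', 'c');
""")


def subtree_adjacency(node_id):
    # Needs a recursive query to walk down the parent links.
    rows = conn.execute("""
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM adj WHERE id = ?
            UNION ALL
            SELECT adj.id FROM adj JOIN sub ON adj.parent_id = sub.id
        )
        SELECT id FROM sub ORDER BY id
    """, (node_id,)).fetchall()
    return [r[0] for r in rows]


def subtree_materialized(node_id):
    # A single prefix match on the stored path finds the whole subtree.
    (prefix,) = conn.execute(
        "SELECT path FROM mp WHERE id = ?", (node_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT id FROM mp WHERE path LIKE ? ORDER BY id", (prefix + "%",)
    ).fetchall()
    return [r[0] for r in rows]
```

Both functions return the same subtree, but the costs differ: moving a node in the adjacency table updates one `parent_id`, while in the materialized path table every descendant's `path` has to be rewritten. Reading a subtree favors the path model; restructuring the tree favors the adjacency model.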
There is a good, classic article I recommend: The Law of Leaky Abstractions
Abstraction has its price (& it is often higher than expected)
By Keith Cooper
And The Art of SQL, of course, which is the soul of database design in my opinion.
