many-to-many relation in elasticsearch - mysql

I have Student model and Class model, and they have many-to-many relationship.
(A student can register for many classes, and a class can include many students)
I have Enrollment table as a join table.
(you can get the picture in the following website)
https://fmhelp.filemaker.com/help/18/fmp/en/index.html#page/FMP_Help/many-to-many-relationships.html
■ Student table
attributes:
・name
・age
■ Class table
attributes:
・className
・description
■ Enrollment table
attributes:
・studentId
・classId
I think this is a typical many-to-many relationship, and I'm implementing it with MySQL and Rails.
I would like to know if I could implement this relationship in Elasticsearch.
I read some articles which say that Elasticsearch does not support it, but are there any hacks or best practices for this?

Your use case is better suited to a relational database.
One option is to store and query the data separately in Elasticsearch and join it in the API (business layer).
Elasticsearch does not support relational-style joins the way SQL does. Its data model is based on denormalization, which improves the response time of a query at the expense of adding redundant data. Data from different tables is combined and stored in a single place, avoiding the need for joins, which results in faster retrieval at the cost of storage (duplication of data).
Your document, which is the equivalent of a row in a table, can be modeled as below:
{
  "studentName": "",
  "age": "",
  ....
  "classes": [
    {
      "className": "class1",
      ...
    },
    {
      "className": "class2",
      ...
    }
  ]
}
For each student store all the classes associated with him/her. This will cause duplication of class data across students. It will lead to faster search but slower update as any change in class data will need to be updated across students.
You can also model your data other way around with class as parent and array of students under it. Choose based on your use case.
For the sub-model you can choose between different data types:
Object -- the data is flattened, so the association between the properties of each sub-object is lost.
Nested -- the relation between the properties of each sub-object is maintained.
Child/Parent -- the sub-model becomes a separate document. It is used when the sub-model data changes frequently, because updating a child does not cause re-indexing of the parent document.
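As a rough illustration of the denormalization described above, here is a minimal Python sketch that turns rows shaped like the Student, Class and Enrollment tables into one document per student. The sample rows and the function name build_student_documents are made up for illustration, and sending the documents to Elasticsearch is left out.

```python
from collections import defaultdict

# Hypothetical rows, shaped like the Student, Class and Enrollment tables.
students = {1: {"name": "Alice", "age": 20}, 2: {"name": "Bob", "age": 22}}
classes = {
    10: {"className": "Math", "description": "Algebra"},
    11: {"className": "Art", "description": "Drawing"},
}
enrollments = [(1, 10), (1, 11), (2, 10)]  # (studentId, classId)


def build_student_documents(students, classes, enrollments):
    """Return one denormalized document per student, with classes embedded."""
    class_ids_by_student = defaultdict(list)
    for student_id, class_id in enrollments:
        class_ids_by_student[student_id].append(class_id)

    documents = {}
    for student_id, student in students.items():
        documents[student_id] = {
            "studentName": student["name"],
            "age": student["age"],
            # Class data is copied into the document of every enrolled student.
            "classes": [dict(classes[c]) for c in class_ids_by_student[student_id]],
        }
    return documents


documents = build_student_documents(students, classes, enrollments)
```

Note that the "Math" class ends up in both documents, which is exactly the duplication (and the update cost) mentioned above.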


Creating a temporary table from inside Controller in Laravel PHP

I'm making a Laravel website which is a musician directory. Every Musician has many Skills and Genres.
I want the user to be able to make advanced searches by specifying various skills and genres.
The results are then to be retrieved and sorted by relevance.
Example:
User specifies skills: "Guitar", "Theory", and genres "Jazz", "Rock", "Blues".
A Musician with the skills: "Guitar", "Bass", and genres "Jazz", "Funk" gets a relevance score of 2 because he has two matching tags.
My plan is to make a temporary table inside of the MusiciansController search() function which stores all the results for the user's search.
How would I go about doing this if even possible?
I suggest making this a little easier on yourself by designing this slightly differently. Instead of thinking about a 'temporary' table, create an actual Laravel model and create all the functionality (calculations of relevance, user, etc) within.
By having a model / DB connection, you can also use Laravel relationships and call the methods on that model in a scalable way in future, i.e. if you want to add new relevance calculations or different relationships, you don't have to modify an SQL view, you just add a method.
If you want to simulate a temporary table, just have a cleaner function at the start of the next search to wipe the previous records in the DB for that model.
Lastly, ask yourself if you even need to store this in a table, or can you calculate it using some set of formulae within the controller's methods.
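To show what calculating the relevance without any table could look like, here is a minimal sketch. It is written in Python rather than PHP purely for illustration; the musician records and the function names relevance and search are hypothetical, and the score is simply the number of matching skills plus the number of matching genres, as in the question's example.

```python
def relevance(musician, wanted_skills, wanted_genres):
    """Count how many of the requested skills and genres the musician has."""
    skill_hits = set(musician["skills"]) & set(wanted_skills)
    genre_hits = set(musician["genres"]) & set(wanted_genres)
    return len(skill_hits) + len(genre_hits)


def search(musicians, wanted_skills, wanted_genres):
    """Return (name, score) pairs with at least one match, best score first."""
    scored = [(m["name"], relevance(m, wanted_skills, wanted_genres)) for m in musicians]
    matching = [pair for pair in scored if pair[1] > 0]
    return sorted(matching, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data mirroring the example in the question.
musicians = [
    {"name": "Ann", "skills": ["Guitar", "Bass"], "genres": ["Jazz", "Funk"]},
    {"name": "Ben", "skills": ["Drums"], "genres": ["Metal"]},
    {"name": "Cy", "skills": ["Guitar", "Theory"], "genres": ["Jazz", "Rock", "Blues"]},
]

results = search(musicians, ["Guitar", "Theory"], ["Jazz", "Rock", "Blues"])
```

For larger data sets the same counting can usually be pushed into a query with joins and GROUP BY, but that depends on your schema.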

Is it better to store recursive data in one OR two tables?

Goal: For a simple to-do app, tasks and possible subtasks need to be stored (Model 1).
Is it "better" to have one table that uses a recursive relation OR to use two tables? What are the advantages and disadvantages in your opinion, e.g. positive or negative effects on performance, usability, etc.? Is it even correct to use the recursive one this way?
Model 1: Tasks and subtasks in two tables. More subtask levels are not necessary.
Model 2: Tasks and subtasks in one table. By the way, is it correct that this design allows unlimited subtask levels (besides technical boundaries)? task-subtask-subtask-...
I am not sure why you form your question this way and what confuses you.
A classical example of a database is one that stores employees. In the employees table you also store managers as managers are also employees. So what you describe as model 2 is not something "weird".
Self join is a common query.
Try to define the tables in a way that will make your queries as simple as possible and your model easy to understand and extend.
In your case you should define a second table only if each subtask has extra information that other tasks do not.
In your model 1 as you describe it you just duplicate the columns of your main table. This is not a good design IMO.
As far as I can see model 2 fits what you are trying to do.
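A minimal sketch of model 2, using Python's built-in sqlite3 module so it can be run as-is. The table name tasks, the column names and the sample rows are assumptions for illustration; the same schema idea applies to other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        parent_id INTEGER REFERENCES tasks(id)  -- NULL for a top-level task
    );
    INSERT INTO tasks (id, title, parent_id) VALUES
        (1, 'Plan trip', NULL),
        (2, 'Book flight', 1),
        (3, 'Book hotel', 1),
        (4, 'Buy milk', NULL);
""")

# Self join: pair each subtask with its parent task.
rows = conn.execute("""
    SELECT parent.title, child.title
    FROM tasks AS child
    INNER JOIN tasks AS parent ON child.parent_id = parent.id
    ORDER BY child.id
""").fetchall()
```

Because parent_id may point at any other task, nothing in this schema limits the nesting depth; if you only want one level of subtasks, that rule has to be enforced in the application or with an extra constraint.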

Better way to do MySQL Object layer

I am not a pro in MySQL, but I want to build something like an object layer on top of relational MySQL tables.
I want to have very many "structures" with fields of type "bigint", "longtext", "datetime" and "double", stored in just 7 tables.
entity_types (et_id, et_name) - list of "structures";
entity_types_fields (etf_id, parent_et_id, ....., etf_ident, etf_type) - list of structure properties stored in one table for ALL structures; etf_type contains int value (0,1,2,3) which referenced to one of 4 tables described below.
entities (e_id, et_id) - list of all available entities (id and type id of entity)
and 4 data tables (containing all data for entities) -
entities_props_bigint (parent_e_id, parent_etf_id, ep_data) - for BIGINT data properties
entities_props_longtext (parent_e_id, parent_etf_id, ep_data) - for LONGTEXT data properties
entities_props_datetime (parent_e_id, parent_etf_id, ep_data) - for DATETIME data properties
entities_props_double (parent_e_id, parent_etf_id, ep_data) - for DOUBLE data properties
What is the best way to select from such a data layer?
Say I have a list of e_id values (ids of entities), and each entity can have any type. I want to get a predefined list of properties. If some of the entities don't have such a property, I want it to be NULL.
Do you have any information about how to do this? Maybe you have some links or have already dealt with such things.
Thanks!
You're reinventing the wheel by implementing a whole metadata system on top of a relational database. Many developers have tried to do what you're doing and then use SQL to query it, as if it is relational data. But implementing a system of non-relational data and metadata in SQL is harder than you expect.
I've changed the relational tag of your question to eav, because your design is a variation of the Entity-Attribute-Value design. There's a limit of five tags in Stack Overflow. But you should be aware that your design is not relational.
A relational design necessarily has a fixed set of attributes for all instances of an entity. The right way to represent this in a relational database is with columns of a table. This allows you to give a name and a data type to each attribute, and to ensure that the same set of names and their data types apply to every row of the table.
What is the best way to select from such a data layer?
The only scalable way to query your design is to fetch the attribute data and metadata as rows, and reconstruct your object in application code.
SELECT e.e_id, f.etf_ident, f.etf_type,
  p0.ep_data AS data0,
  p1.ep_data AS data1,
  p2.ep_data AS data2,
  p3.ep_data AS data3
FROM entities AS e
INNER JOIN entity_types_fields AS f ON e.et_id = f.parent_et_id
LEFT OUTER JOIN entities_props_bigint AS p0 ON (p0.parent_e_id,p0.parent_etf_id) = (e.e_id,f.etf_id)
LEFT OUTER JOIN entities_props_longtext AS p1 ON (p1.parent_e_id,p1.parent_etf_id) = (e.e_id,f.etf_id)
LEFT OUTER JOIN entities_props_datetime AS p2 ON (p2.parent_e_id,p2.parent_etf_id) = (e.e_id,f.etf_id)
LEFT OUTER JOIN entities_props_double AS p3 ON (p3.parent_e_id,p3.parent_etf_id) = (e.e_id,f.etf_id)
In the query above, each entity field should match at most one property, and the other data columns will be null. If all four data columns are null, then the entity field is missing.
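Reconstructing the objects in application code could look roughly like the following Python sketch. It assumes the rows arrive as tuples in the column order of the query (e_id, etf_ident, etf_type, data0..data3); the function name rebuild_entities and the sample rows are hypothetical.

```python
def rebuild_entities(rows):
    """Group result rows by entity and pick the one non-null data column per field."""
    entities = {}
    for e_id, etf_ident, etf_type, *data in rows:
        # At most one of data0..data3 is non-null; if all are null,
        # the entity has no value for this field and it stays None.
        value = next((d for d in data if d is not None), None)
        entities.setdefault(e_id, {})[etf_ident] = value
    return entities


# Hypothetical result rows: (e_id, etf_ident, etf_type, data0, data1, data2, data3)
rows = [
    (1, "title", 1, None, "Dune", None, None),
    (1, "pages", 0, 412, None, None, None),
    (2, "title", 1, None, None, None, None),
]

entities = rebuild_entities(rows)
```

A missing property shows up as None here, which corresponds to the NULL asked for in the question.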
Re your comment, okay now I understand better what you are trying to do. You have a collection of entity instances in a tree, but each instance may be a different type.
Here's how I would design it:
Store any attributes that all your entity subtypes have in common in a sort of super-type table.
entities(e_id,entity_type,name,date_created,creator,sku, etc.)
Store any attributes specific to an entity sub-type in their own table, as in Martin Fowler's Class Table Inheritance design.
entity_books(e_id,isbn,pages,publisher,volumes, etc.)
entity_videos(e_id,format,region,discs, etc.)
entity_socks(e_id,fabric,size,color, etc.)
Use the Closure Table design to model the hierarchy of objects.
entity_paths(ancestor_e_id, descendant_e_id, path_length)
For more information on Class Table Inheritance and Closure Table, see my presentations Practical Object-Oriented Models in SQL and Models for Hierarchical Data in SQL, or my book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming, or Martin Fowler's book Patterns of Enterprise Application Architecture.
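As a small illustration of the Closure Table idea, here is a sketch using Python's built-in sqlite3 module. The entity_paths columns follow the definition above; the simplified entities table, the sample data and the helper name descendants are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (e_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE entity_paths (
        ancestor_e_id   INTEGER NOT NULL REFERENCES entities(e_id),
        descendant_e_id INTEGER NOT NULL REFERENCES entities(e_id),
        path_length     INTEGER NOT NULL,
        PRIMARY KEY (ancestor_e_id, descendant_e_id)
    );
    INSERT INTO entities VALUES (1, 'catalog'), (2, 'books'), (3, 'novel');
    -- One row per ancestor/descendant pair, including each node paired with itself.
    INSERT INTO entity_paths VALUES
        (1, 1, 0), (2, 2, 0), (3, 3, 0),
        (1, 2, 1), (2, 3, 1),
        (1, 3, 2);
""")


def descendants(conn, e_id):
    """Return (name, depth) for every entity below e_id, nearest first."""
    return conn.execute("""
        SELECT e.name, p.path_length
        FROM entity_paths AS p
        INNER JOIN entities AS e ON e.e_id = p.descendant_e_id
        WHERE p.ancestor_e_id = ? AND p.path_length > 0
        ORDER BY p.path_length, e.e_id
    """, (e_id,)).fetchall()


catalog_descendants = descendants(conn, 1)
```

The whole subtree comes back from a single non-recursive query, at the price of maintaining one path row per ancestor/descendant pair.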

CakePHP alternative to Class Table Inheritance?

I want to create a Class Table Inheritance model in CakePHP.
I would like to have a Model called something like ProductBase with the table product_bases to hold all the base information every product should have, like upc, price, etc.
Then have specific product type models extend that. For example ProductRing with the table product_rings to hold specific ring information like ring_size, center_stone, etc.
Then if I retrieve data directly from the ProductBase model, have it pull all types:
// pull all product types
$this->ProductBase->find('all');
Or find specific types only:
// pull only Rings or descendants of the Ring type.
$this->ProductRing->find('all');
Is anything like this possible in CakePHP? If not, what should I be doing instead?
What is the proper Cake way of doing something like this?
I worked with CakePHP for two years, and found no satisfactory solution for this, so one day I wrote a solution for it. I built a new kind of ORM that work as a plugin on top of CakePHP 2.x. I called it "Cream".
It works similar to the entities of CakePHP 3.0, but in addition supports multi table inheritance. It also supports very convenient data structure browsing (lazy loading) and is very easy to configure. In my opinion it is more powerful than what CakePHP 3.0 offers right now. Data structure browsing works as follows:
$entity = new Entity('SomeModel', $somePrimaryKeyValue);
$foo = $entity->RelatedModel()->YetAnotherRelatedModel()->someProperty();
However, it is important to note that in Cream, each entity object is a compound of a series of models and primary key values that are merged together, at least in the case where model inheritance is used. Such a compound looks like:
[<'SomeConcreteModel', primaryKeyValueA>, <'IntermediaryModel', primaryKeyValueB>, <'BaseModel', primaryKeyValueC>]
It is important to notice that you can pick up this entity by any of the given model/primaryKeyValue combinations. They all refer to the same entity.
Using this you can also solve your problem. You can use standard CakePHP find methods to find all primary key values you want from the base model, or you can use the find methods of models that inherit from it, and then go along and create the entities.
You set up the chain of inheritance/extension by simply writing in your model class:
public $extends = 'YourBaseModel';
In addition you also need to set up an ordinary CakePHP relationship between the models (hasOne or belongsTo). It works just like in normal OOP, with a chain of models that inherit from their bases. If you just use vanilla CakePHP you will just notice that these models are related, but when you start using the Cream interface, all entities merge model/primaryKeyValue pairs into one single object.
Within my GitHub repository there is a PowerPoint file that explains most of the basic features.
https://github.com/erobwen/Cream
Perhaps I should fork the CakePHP project and make a pull request, but for now it is a separate repository. Please feel free to comment or participate in developing "Cream".
Also, for those suggesting that it is best to just "work with the CakePHP flow as intended" I would argue the following. Common estimates suggest that C programs are 2.5 times bigger than the C++ counterpart. Given that the only feature that separates these languages is the OOP with inheritance etc, we can deduce that the lack of proper OOP with inheritance etc requires the programmer to do 150% additional work with repetition code etc. Therefore I would argue that a proper model inheritance mechanism in CakePHP is very much needed. Cream is an attempt at this.
You are referring to an ARC relationship (or at least a variation of it). Cake does not handle these types of relationships on the fly. This means you will have to implement your own logic to handle this.
The other option is to categorize the products. If the product can fit into multiple categories, then you will want a HABTM categories for each product. Otherwise, you can use a category column. I suspect it will be a HABTM you are looking for.
PRODUCTS: The table that holds the products.
CATEGORIES: The list of categories any given product can belong to.
CATEGORIES_PRODUCTS: The link between each product and their various categories.
TYPE: This is the flag that will define the type of product (i.e. ring, shoe, pants, etc.)
Then when you want ALL products, you query the products table. When you want a slice of the products (i.e. Rings) you select all the products that belong to the RING category.
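The two queries described above can be sketched as follows. This uses Python's built-in sqlite3 module instead of CakePHP so that it runs on its own; the column names, the sample rows and the helper products_in_category are assumptions, and in Cake the join would normally come from the HABTM association rather than hand-written SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE categories_products (
        category_id INTEGER NOT NULL REFERENCES categories(id),
        product_id  INTEGER NOT NULL REFERENCES products(id),
        PRIMARY KEY (category_id, product_id)
    );
    INSERT INTO products VALUES (1, 'Gold band'), (2, 'Running shoe'), (3, 'Silver band');
    INSERT INTO categories VALUES (1, 'RING'), (2, 'SHOE'), (3, 'GIFT');
    INSERT INTO categories_products VALUES (1, 1), (1, 3), (2, 2), (3, 1);
""")


def products_in_category(conn, category_name):
    """Return the names of all products linked to the given category."""
    rows = conn.execute("""
        SELECT p.name
        FROM products AS p
        INNER JOIN categories_products AS cp ON cp.product_id = p.id
        INNER JOIN categories AS c ON c.id = cp.category_id
        WHERE c.name = ?
        ORDER BY p.id
    """, (category_name,)).fetchall()
    return [name for (name,) in rows]


# ALL products: query the products table directly.
all_products = [name for (name,) in conn.execute("SELECT name FROM products ORDER BY id")]
# A slice of the products: everything linked to the RING category.
rings = products_in_category(conn, "RING")
```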
Now, we need to address the information about the product. For example, not all information will apply to every product. There are a number of ways to do this.
You can build multiple tables to hold the product information. When you pull a product of a given type, you pull its companion information from the table.
Store the information in a text field as serialized data. All of the information can be defined in a settings var and then you can use the serialized data to map to the information.
I hope this helps. Happy coding!

Request-specific relationship() filtering (or alternative) in SQLAlchemy?

I have a schema where most of the tables have associated users_*_meta tables which store per-user data like starred/unstarred, rating, and the like.
For example, stories -< users_stories_meta >- users.
However, I'm having trouble figuring out how to perform a joined load of a row and the related metadata row for the current user without either writing my own ORM on top of SQLAlchemy's expression builder or using a closure to generate a new session and schema for each request. (relationship() doesn't seem to support resolving components of primaryjoin lazily.)
What would the simplest, least wheel-reinventing way be to treat the appropriate subset of the many-to-many relationship as a one-to-many relationship (one user, many stories/authors/elements/etc.) specific to each request?
There's a recipe at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/GlobalFilter which illustrates how the criterion represented by a relationship() can be influenced at the per-query level.