How can we understand DP (Dynamic Programming)? Listing types of problems [closed] - language-agnostic

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 days ago.
I am doing some programming practice and going through Dynamic Programming theory. We always come across two properties:
Optimal Substructure (OSS)
Overlapping Subproblem (OSP)
Any optimization problem with these two characteristics can be solved by using DP techniques (Memoization or Tabulation).
But identifying which problems have these characteristics takes a lot of practice.
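To make the two techniques named above concrete, here is a minimal sketch that solves one small optimization problem (fewest coins to reach an amount) both ways. The problem choice and the function names `min_coins_memo` and `min_coins_table` are illustrative assumptions, not something from a specific source. The problem has optimal substructure (the best answer for an amount is built from the best answer for a smaller amount) and overlapping subproblems (the same smaller amounts are reached through many different coin choices).

```python
import math
from functools import lru_cache
from typing import Sequence


def min_coins_memo(coins: Sequence[int], amount: int) -> int:
    """Top-down DP (memoization). Assumes every coin value is positive."""
    coin_values = tuple(coins)

    @lru_cache(maxsize=None)  # caches each subproblem so it is solved only once
    def best(remaining: int) -> float:
        if remaining == 0:
            return 0
        # Optimal substructure: best(remaining) = 1 + min over best(remaining - c).
        return min(
            (1 + best(remaining - c) for c in coin_values if c <= remaining),
            default=math.inf,
        )

    # Note: recursion depth grows with `amount`, so this form suits small inputs.
    result = best(amount)
    return -1 if result == math.inf else int(result)


def min_coins_table(coins: Sequence[int], amount: int) -> int:
    """Bottom-up DP (tabulation) for the same recurrence."""
    table = [0] + [math.inf] * amount  # table[v] = fewest coins that sum to v
    for value in range(1, amount + 1):
        for c in coins:
            if c <= value and table[value - c] + 1 < table[value]:
                table[value] = table[value - c] + 1
    return -1 if table[amount] == math.inf else int(table[amount])


print(min_coins_memo([1, 2, 5], 11))   # 3  (5 + 5 + 1)
print(min_coins_table([1, 2, 5], 11))  # 3
```

Both functions evaluate the same recurrence; they differ only in whether subproblems are solved on demand and cached, or filled in from the smallest amount upward.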
Let's say we have 4 types:
TYPE 1  | TYPE 2  | PROBLEMS
OSS     | OSP     | All DP problems
NON-OSS | OSP     | ?
OSS     | NON-OSP | ?
NON-OSS | NON-OSP | ?
E.g., is there any problem that has Optimal Substructure but non-overlapping subproblems?
I need your help listing problems of each type. This will help me, and whoever reads this, get better at identifying and then solving such problems.
If you have come across any problem (LeetCode, CodeChef, SPOJ, etc.) that you think fits into a '?' category, please comment.
Also, please share any link or source for learning more about types based on OSS/OSP.

Related

Optimal way to check if given sentence(query) contains any of the predefined keywords [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
It might look like a simple question that has already been answered countless times, but I could not find the optimal way (using some db).
I have a list of a few thousand keywords (let's say abusive words). Whenever someone posts a message (a long sentence or a paragraph), I want to check whether it contains any of the keywords, so that I can block the user or take other actions.
I am looking for a db/schema that can solve the above problem and respond within a few milliseconds (<15 ms).
There are many dbs that solve the reverse of the above problem: given the keywords, find documents containing them (text search).
Try ClickHouse for your workload.
According to docs:
multiMatchAny(...) returns 0 if none of the regular expressions are matched and 1 if any of the patterns matches. It uses hyperscan library. For patterns to search substrings in a string, it is better to use multiSearchAny since it works much faster.
The length of any of the haystack string must be less than 2^32 bytes.
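For comparison, the check itself ("does this message contain any of the keywords as a substring?") can be sketched in-process with Python's standard `re` module. This is not ClickHouse and not its API; the function names `build_keyword_matcher` and `contains_any_keyword` are hypothetical, and case-insensitive matching is an assumption made here for the abusive-words use case.

```python
import re
from typing import Iterable


def build_keyword_matcher(keywords: Iterable[str]) -> re.Pattern:
    """Compile all keywords into one alternation so a message is scanned once."""
    # re.escape makes every keyword a literal substring, not a regex.
    escaped = [re.escape(word) for word in keywords if word]
    if not escaped:
        return re.compile(r"(?!)")  # empty keyword list: a pattern that never matches
    return re.compile("|".join(escaped), re.IGNORECASE)


def contains_any_keyword(matcher: re.Pattern, message: str) -> bool:
    """Return True if at least one keyword occurs anywhere in the message."""
    return matcher.search(message) is not None


matcher = build_keyword_matcher(["spam", "scam"])
print(contains_any_keyword(matcher, "This is a SCAM offer"))  # True
print(contains_any_keyword(matcher, "hello world"))           # False
```

The matcher is built once and reused for every incoming message; whether this meets a latency budget for a few thousand keywords would need to be measured on real data.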

SQL varchar column length for business/company names [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
Is there a standard for varchar length relating to storing company/business names?
I have looked everywhere and cannot find an answer.
If not, what would be an ideal length to cover the majority of scenarios?
I'm going to go out on a limb here:
No
There is not in general, though there are some guidelines for some of these kinds of fields, for some organisations, in some countries (see answers to List of standard lengths for database fields).
You'll have to use your best judgement. From a quick Google search, the longest name I could find was a little over 100 characters. If you're not stuck for space, allow a few hundred characters to be safe; otherwise, why are you strapped for space? Pull the name out into a lookup table, make the column in that table wide, and move on; agonizing over this will not earn you anything.

Best choice structure for MYSQL? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
I am trying to decide which of two table layouts will be more efficient for MySQL to handle my data. I can either:
1: Create around 200 tables, each having around 30 rows & 5 columns.
2: Create 1 table, having around 6000 rows & 5 columns.
I am using Laravel for this project and Eloquent will be handling this. Does anybody have any opinions on this matter? I appreciate any/all responses.
Option 2.
For such low row counts the overhead both in terms of programming effort and computation of joining 200(!) tables far outweighs the "flat file" approach. Additionally, MySQL will attempt to cache the entire 6000-row table in RAM, assuming you're not storing massive BLOBs.
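A minimal sketch of what Option 2 could look like, using Python's built-in `sqlite3` as a stand-in for MySQL so that the example runs anywhere. The table name `records`, the column names, and the `group_id` discriminator column (playing the role that the 200 separate tables would have played) are all assumptions made for illustration.

```python
import sqlite3
from typing import List, Tuple


def build_single_table(groups: int = 200, rows_per_group: int = 30) -> sqlite3.Connection:
    """Create one 5-column table holding every group's rows (about 6000 in total)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records ("
        " id INTEGER PRIMARY KEY,"
        " group_id INTEGER NOT NULL,"  # hypothetical: replaces 'which table am I in'
        " name TEXT NOT NULL,"
        " value INTEGER NOT NULL,"
        " note TEXT)"
    )
    # An index on the discriminator keeps per-group lookups cheap.
    conn.execute("CREATE INDEX idx_records_group ON records (group_id)")
    rows = [
        (g, f"item-{g}-{r}", r, None)
        for g in range(groups)
        for r in range(rows_per_group)
    ]
    conn.executemany(
        "INSERT INTO records (group_id, name, value, note) VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()
    return conn


def rows_for_group(conn: sqlite3.Connection, group_id: int) -> List[Tuple]:
    """Fetch what would otherwise have been the contents of one small table."""
    return conn.execute(
        "SELECT id, group_id, name, value, note FROM records"
        " WHERE group_id = ? ORDER BY id",
        (group_id,),
    ).fetchall()


conn = build_single_table()
print(len(rows_for_group(conn, 7)))  # 30
```

With this layout, one model and one query shape cover every group, instead of 200 near-identical table definitions.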

one database or many to make it more efficient? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I need to know if it is more or less efficient to have multiple databases with an index of databases relative to each dataset.
I do not know to what extent multicache can adversely affect performance.
Suppose there are 10 databases of 2 GB each rather than a single 20 GB database.
For example: the data of userid 293484 is in the third database.
Thanks.
Yes, this is a common technique known as sharding.
http://en.wikipedia.org/wiki/Shard_%28database_architecture%29
Ultimately, the code you will have to write to maintain such a structure will kill you.
Keep it simple, keep it in one database, and use proper design patterns and indexing.
Database engines are designed to deal with large amounts of data, so if your hardware is sufficient, your queries are well structured, and the design is good, you should not have too many performance problems.

Is there an algorithm for weighted reservoir sampling? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Is there an algorithm for how to perform reservoir sampling when the points in the data stream have associated weights?
The algorithm by Pavlos Efraimidis and Paul Spirakis solves exactly this problem. The original paper with complete proofs is published with the title "Weighted random sampling with a reservoir" in Information Processing Letters 2006, but you can find a simple summary here.
The algorithm works as follows. First observe that another way to solve unweighted reservoir sampling is to assign each element a random id R between 0 and 1 and incrementally (say, with a heap) keep track of the k largest ids. Now let's look at the weighted version, and let's say the i-th element has weight w_i. Then we modify the algorithm by choosing the id of the i-th element to be R^(1/w_i), where R is again uniformly distributed in (0,1).
Another article talking about this algorithm is this one by the Cloudera folks.
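The description above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's reference implementation: the function name `weighted_reservoir_sample`, the `(item, weight)` stream format, and the choice to skip non-positive weights are all made up for illustration.

```python
import heapq
import random
from typing import Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def weighted_reservoir_sample(
    stream: Iterable[Tuple[T, float]],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Pick up to k items from a stream of (item, weight) pairs in one pass."""
    if k <= 0:
        return []
    rng = rng or random.Random()
    # Min-heap of (key, arrival_index, item); the smallest kept key sits at heap[0].
    # arrival_index breaks ties so items themselves are never compared.
    heap: List[Tuple[float, int, T]] = []
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue  # assumption: items with non-positive weight are never sampled
        key = rng.random() ** (1.0 / weight)  # the id R^(1/w_i) from the text
        if len(heap) < k:
            heapq.heappush(heap, (key, index, item))
        elif key > heap[0][0]:
            # New key beats the smallest kept key: evict it and keep the new item.
            heapq.heapreplace(heap, (key, index, item))
    return [item for _, _, item in heap]


sample = weighted_reservoir_sample([("a", 1.0), ("b", 5.0), ("c", 0.5), ("d", 2.0)], k=2)
print(sample)  # two of the four items; heavier items are more likely to appear
```

Memory use is O(k) and each element costs O(log k) at worst, so the stream never needs to be stored. For very large weights or very small R, computing the key in log space (log(R) / w_i) is a common way to avoid floating-point precision problems.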
You can try the A-ES algorithm from this paper of S. Efraimidis. It's quite simple to code and very efficient.
Hope this helps,
Benoit