Can we specify different field weights while searching in RediSearch?

When we create an index in RediSearch, we specify fields with their weights, e.g. title TEXT WEIGHT 50 description TEXT WEIGHT 25, and as I understand it these weights determine the ordering of results.
Is there a way to specify field weights while searching, i.e. in FT.SEARCH?
I would like to change field weights at runtime, i.e. per query.
I would like to accomplish something like this:
FT.SEARCH idx john description weight 50 title weight 25
Note how I have tried to change the weights in the query itself.

You can use query attributes to specify a different weight for different parts of the query. See the documentation for details: https://oss.redislabs.com/redisearch/Query_Syntax.html#query_attributes
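As a rough sketch of what that could look like for the query in the question, the helper below builds a query string that attaches a `$weight` attribute to one clause per field. The helper name `weighted_query` and the choice to join the clauses with `|` (union) are assumptions made for illustration, not something stated in the answer; check the resulting string against the query syntax documentation for your RediSearch version.

```python
def weighted_query(term, field_weights):
    """Build one clause per field, each carrying its own $weight attribute.

    Hypothetical helper: it only formats a string, it does not talk to Redis.
    """
    clauses = [
        f"(@{field}:{term}) => {{ $weight: {weight}; }}"
        for field, weight in field_weights.items()
    ]
    # Union the per-field clauses so a match in either field counts.
    return " | ".join(clauses)


query = weighted_query("john", {"description": 50, "title": 25})
print(query)
```

The resulting string would be passed as the query argument, along the lines of FT.SEARCH idx "<query>".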

What does Caffe do with the mean-binary file?

In the Caffe input layer one can define a mean image that holds the mean values of all the images used. From the ImageNet example: "The model requires us to subtract the image mean from each image, so we have to compute the mean".
My question is: how is this subtraction implemented? Is it simply:
used_image = original_image - mean_image
or
used_image = mean_image - original_image
or
used_image = |original_image - mean_image|^2
If it is one of the first two, how are negative pixels handled? Since the pictures are usually stored as uint8, the value would simply wrap around, e.g.
200 - 255 = 201
Why do I need to know this? I ran tests and I know that the second or the third variant would work better.
It's the first one, a trivial normalization step. Using the second instead wouldn't really matter: the weights would invert.
There are no "negative pixels", per se: this is simply integer input to the matrix operations. You are welcome to interpret this as a visual alteration of some sort, but the arithmetic doesn't care.
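A minimal sketch of the difference between the two readings, in plain Python rather than Caffe code. It assumes the subtraction is carried out on floating-point values (an assumption about the pipeline, not something confirmed above); the function names are made up for illustration.

```python
def subtract_mean(pixels, mean_pixels):
    """Mean subtraction on floats: negative results are kept as they are."""
    return [float(p) - float(m) for p, m in zip(pixels, mean_pixels)]


def subtract_mean_uint8(pixels, mean_pixels):
    """What would happen if the result were forced back into uint8 (mod 256)."""
    return [(p - m) % 256 for p, m in zip(pixels, mean_pixels)]


image = [200, 10]
mean_image = [255, 5]

as_float = subtract_mean(image, mean_image)
as_uint8 = subtract_mean_uint8(image, mean_image)
print(as_float, as_uint8)
```

With float arithmetic the first pixel simply becomes negative; the wrap-around only appears if the result is stored back in an unsigned 8-bit type.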

Tableau Log Function Incorrect

Here is a sample data visualization:
Heatmap of linear order quantity (region vs quantity)
I created a calculated field logarithmic = int(log([Order Quantity])) and later changed it to logarithmic = int(log([Order Quantity],10))
Heatmap where size is based on logarithmic.
The size doesn't change and the number is incorrect. Please advise.
tl;dr Sum the order quantities before taking the logarithm.
int(log(SUM([Order Quantity])))
Otherwise you are taking the logarithm of each individual Order Item, and then adding the logarithms. The aggregation function, sum() in your case, is specified when you place the field on the shelf unless you make it explicit in the calculated field.
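The difference is easy to see with a few made-up numbers. The sketch below is plain Python, not Tableau; the quantities are invented for illustration, and base 10 is used to match the second version of the calculated field.

```python
import math

# Hypothetical order quantities for one cell of the heatmap.
quantities = [10, 20, 70]

# Aggregate first, then take the logarithm (the suggested fix).
log_of_sum = int(math.log10(sum(quantities)))

# Take the truncated logarithm per row, then let SUM() aggregate the results
# (what the original calculated field ends up doing).
sum_of_logs = sum(int(math.log10(q)) for q in quantities)

print(log_of_sum, sum_of_logs)
```

Here the first expression gives 2 (the log of 100) while the second gives 3 (1 + 1 + 1), which is why the per-row version produces numbers that look wrong.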
Here are a couple of ways to use the log field, dual or triple encoding the log by size, color and shape. A custom legend works better with multiple encoded symbols than the default legends.

the interaction of wordlist and top features selected based on weights

In the training process for a text classification case, the wordlist generated by the Process Documents module has a length of about 15000 words. I also applied the feature selection modules, i.e. Weight by Information Gain and Select by Weights, to select the top 500 features. Both the wordlist and the selected weights are stored. Is there a way to apply these 500 weights to the wordlist and construct a short wordlist that exactly matches the 500 weights? In other words, I would like the intersection of the original wordlist (about 15000 words) and the top 500 features (or top 500 words based on the weights).
The following shows the process I am using. The stored weight object (circled in red) has two columns: the first is the word (attribute) and the second is the corresponding weight value. Based on it, we can select the top 500 or any other number of top features. The original wordlist (circled in red) can have 15000 words, i.e. a matrix with 15000 rows.
My question is how to generate a filtered wordlist object based on the ranked weight object.
I have posted this question on Rapidminer forum. Please follow the update there.
You should post a representative process. In the absence of that it's difficult to give help but my view is that you could take the 500 word example set and process it again to make a word list from it.
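Outside of RapidMiner, the intersection being asked for can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the wordlist and the weights have been exported as a list and a word-to-weight mapping; the function names and sample data are hypothetical.

```python
def top_words(weights, k):
    """Return the k words with the highest weight."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return {word for word, _ in ranked[:k]}


def filter_wordlist(wordlist, weights, k):
    """Keep only the wordlist entries that are among the top-k weighted words,
    preserving the original wordlist order."""
    keep = top_words(weights, k)
    return [word for word in wordlist if word in keep]


wordlist = ["apple", "banana", "cherry", "date"]
weights = {"apple": 0.1, "banana": 0.9, "cherry": 0.5, "date": 0.7}

short_wordlist = filter_wordlist(wordlist, weights, k=2)
print(short_wordlist)
```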

MySQL what would the best approach to ranking highest to lowest possible match?

I have a MySQL database I'm searching through. Let's say this is a database of people. When querying for a specific record, it is possible to find a 100% match on each attribute. But the strategy I am after is querying the database for the closest match by probability (closest matches on table attributes).
In this scenario, does it make sense to create a temporary table (much like a tally sheet) to indicate which attributes match and which attributes are present? What is the typical approach to doing advanced searches on a database like this?
Example (below) of a hypothetical stored procedure
(The parameters are just to exemplify how I would search. I'm not concerned with how to perform my selects. The question is about approach, strategy, technique.)
call FindPerson ("Brown Eyes", "Brown hair", "Height:6'1", "white", "Name:Joe", "weight180", "Age 34", "sex m");
RESULT TABLE
NAME   AGE  HEIGHT  WEIGHT  HAIR   SKIN   SEX  RANK_MATCH
Joe    32   6'1     180     Brown  white  m    1
Mike   33   6'1     179     Brown  white  m    2
James  31   6'0     179     Brown  black  m    3
Off the top of my head: you can create your own score and sort by it. Something like:
SELECT `id`,
(IF(`age`=32,1,0)+IF(`height`="6'1",1,0)+...) as `score`
FROM `people`
HAVING `score` > 0
ORDER BY `score` DESC
LIMIT 10;
With this, you can handle every field with its own comparison, and also weight individual attributes by adding 2 or more instead of 1.
But I'm not quite sure how well this performs.
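The weighted variant can be tried out quickly with an in-memory SQLite database from Python. This is a sketch rather than MySQL code: the column names, the sample rows (loosely based on the result table in the question) and the weights 3/2/2/1 are all assumptions for illustration. It relies on a comparison evaluating to 0 or 1, which holds in SQLite and in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (name TEXT, age INTEGER, height_in INTEGER, "
    "weight_lb INTEGER, hair TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("Joe", 32, 73, 180, "Brown"),
        ("Mike", 33, 73, 179, "Brown"),
        ("James", 31, 72, 179, "Brown"),
        ("Ann", 50, 64, 130, "Red"),
    ],
)

# Each comparison yields 0 or 1; the multiplier is that attribute's weight.
query = """
SELECT name, score FROM (
    SELECT name,
           3 * (age = :age)
         + 2 * (height_in = :height_in)
         + 2 * (weight_lb = :weight_lb)
         + 1 * (hair = :hair) AS score
    FROM people
)
WHERE score > 0
ORDER BY score DESC, name
LIMIT 10
"""
wanted = {"age": 32, "height_in": 73, "weight_lb": 180, "hair": "Brown"}
ranked = conn.execute(query, wanted).fetchall()
print(ranked)
```

Rows that match nothing are dropped by the score > 0 filter, and ties are broken by name so the order is deterministic.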
The approach I would use would be to create a scoring function (your stored proc) that would evaluate the given input's standard distance from the mean.
In the proc, you would judge each criterion in a fashion similar to:
INPUT AGE: 32
calculate MEAN of AGE WHERE (sex = m): 34.5
calculate STANDARD DEVIATION of AGE WHERE (sex = m): 2.5
calculate how many STDEVs 32 is from the 34.5 (also known as z-score): 1
Repeat this process for all numeric datatypes, summing them and ORDER BY the sum.
In doing so, the following schema change would be required: height changed from foot/inch form to strictly inches.
Depending on your needs, you may also consider coming up with an arbitrary scale for sex and skin color/hair color. Of course, you may think that measures like these should NOT be factored in because of how drastically they would change the scoring function. If you chose to, you'd have to find some number that would be added to the above SUM, but it's hard because nominal variables don't translate easily into these kinds of things.
If you find that hair color/skin color can usefully be mapped onto, say, the continuous color spectrum, your scoring tidbit would be the same: color value of the input vs. the means and standard deviations of the color values.
The query that would find your matches would be something to the effect of:
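-- @input_age and @input_wt hold the searched-for values.
-- Each term is the row's distance from the input, in standard deviations.
SELECT t.*,
ABS(t.AGE - @input_age) / s.age_sd
+ ABS(t.WT - @input_wt) / s.wt_sd
+ ... AS score
FROM `table` AS t
CROSS JOIN (SELECT STD(AGE) AS age_sd, STD(WT) AS wt_sd FROM `table`) AS s
ORDER BY score ASC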
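The same idea can be sketched in Python with the standard library. The first function reproduces the worked z-score example above (mean 34.5, standard deviation 2.5, input 32); the second ranks rows by their summed distance from the searched-for values, with each distance expressed in standard deviations of that field. The sample rows, the field names and the use of the population standard deviation are assumptions for illustration.

```python
from statistics import mean, pstdev


def z_score(value, population):
    """Distance of value from the population mean, in standard deviations."""
    return abs(value - mean(population)) / pstdev(population)


def rank_by_distance(rows, wanted, fields):
    """Sort rows by summed per-field distance from the wanted values.

    Assumes every field varies across rows (a zero standard deviation
    would cause a division by zero).
    """
    spread = {f: pstdev([row[f] for row in rows]) for f in fields}

    def score(row):
        return sum(abs(row[f] - wanted[f]) / spread[f] for f in fields)

    return sorted(rows, key=score)


people = [
    {"name": "Joe", "age": 32, "weight": 180},
    {"name": "Mike", "age": 33, "weight": 179},
    {"name": "James", "age": 31, "weight": 179},
    {"name": "Ann", "age": 50, "weight": 130},
]

worked = z_score(32, [32, 37])  # mean 34.5, standard deviation 2.5
ranked = [
    p["name"]
    for p in rank_by_distance(people, {"age": 34, "weight": 180}, ["age", "weight"])
]
print(worked, ranked)
```

Dividing by the standard deviation puts age and weight on a comparable scale, so a one-year difference and a one-pound difference do not count equally by accident.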

Saving user's height and weight

How should I store a user's height and weight in a MySQL database such that I can use the information to find users within a certain height or weight range? Also, I will need to be able to display this information in either the English or the metric system.
My idea is to store height in centimeters and weight in kilograms (I prefer metric over English). I can even let users enter their information in the English system, but do the conversion to metric before saving. I think converting kilograms to pounds might be easy to do in SQL, but I'm not sure how easy it would be to convert 178 centimeters to 5'10" (rounded slightly down).
Should I be saving English and metric values in the database so that I don't need to do conversions when I do my queries? Sounds like a bad idea to store derived/computed values.
There are several ways... one is to just have two numeric columns, one for height, one for weight, then do the conversions (if necessary) at display time. Another is to create a "height" table and a "weight" table, each with a primary key that is linked from another table. Then you can store both English and metric values in these tables (along with any other meta info you want):
CREATE TABLE height (
id SERIAL PRIMARY KEY,
english VARCHAR(32),
inches INT,
cm INT,
hands INT -- as in, the height of a horse
);
INSERT INTO height VALUES
(1,'4 feet', 48, 122, 12),
(2,'4 feet, 1 inch', 49, 124, 12),
(3,'4 feet, 2 inches', 50, 127, 12),
(4,'4 feet, 3 inches', 51, 130, 12),
....
You get the idea...
Then your users table will reference the height and weight tables--and possibly many other dimension tables--astrological sign, marital status, etc.
CREATE TABLE users (
uid SERIAL PRIMARY KEY,
height INT REFERENCES height(id),
weight INT REFERENCES weight(id),
sign INT REFERENCES sign(id),
...
);
Then to do a search for users between 4 and 5 feet:
SELECT *
FROM users
JOIN height ON users.height = height.id
WHERE height.inches >= 48 AND height.inches <= 60;
Several advantages to this method:
You don't have to duplicate the "effort" (as if it were any real work) to do the conversion on display--just select the format you wish to display!
It makes populating drop-down boxes in an HTML select super easy--just SELECT english FROM height ORDER BY inches, for instance.
It makes your logic for various dimensions--including non-numerical ones (like astrological signs) obviously similar--you don't have special case code all over the place for each data type.
It scales really well
It makes it easy to add new representations of your data (for instance, to add the 'hands' column to the height table)
I would do it the way you have said you would like to do it, but for the conversion you would not convert 178 centimeters to 5'10" directly: you would convert it to 70", then, if need be, convert that into 5'10".
Think of 5'10" as either 70" or 5.8333333'. In that case, converting between centimeters and 70" or 5.83333' is just a multiplication, so it's easy to store in the DB as centimeters if you so choose.
The issue of what the user sees is a presentation issue and nothing to do with the database.
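A minimal sketch of that presentation-layer conversion in Python, assuming height is stored in whole centimeters and weight in kilograms. The function names are made up for illustration, and rounding to the nearest inch or pound is a design choice, not a requirement.

```python
CM_PER_INCH = 2.54
LB_PER_KG = 2.20462


def cm_to_feet_inches(cm):
    """Convert centimeters to (feet, inches), rounding to the nearest inch.

    Rounding the total inches first avoids results such as 5 ft 12 in.
    """
    total_inches = round(cm / CM_PER_INCH)
    return divmod(total_inches, 12)


def feet_inches_to_cm(feet, inches):
    """Convert user input in feet and inches to whole centimeters for storage."""
    return round((feet * 12 + inches) * CM_PER_INCH)


def kg_to_lb(kg):
    """Convert kilograms to pounds, rounded to the nearest pound."""
    return round(kg * LB_PER_KG)


feet, inches = cm_to_feet_inches(178)
print(f"{feet}'{inches}\"", feet_inches_to_cm(5, 10), kg_to_lb(80))
```

Range queries then run against the stored metric columns only; the English values exist just at input and display time.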
I agree that storing computed values in this case is not ok. Your choices are perfect.
However, I would do the computations at the application level and query the DB with those values. Depending on the language your application is written in, I am sure there are plenty of libraries/modules that can compute those transformations.
Edit - to address the issue of storing computed values in DB:
While this is considered to be a bad practice in working with DBs, I usually am not 100% against this practice - just 90%.
I tend to store computed values in DB only when the computations are complex and would take enormous resources to get to the result wanted - this is clearly not the case.
If you stored computed values here, you would get only the disadvantages of this technique: when modifying a record, you would have to modify the data in multiple places to keep your DB consistent.