Multi-parameter search with mysql and node.js - mysql

Let me preface by saying I'm very new to SQL (and back end design) in general. So for those annoyed with noob questions, please be gentle.
BACKGROUND:
I'm trying to build a product test database (storing test data for all our products) where I want a user to be able to refine a search to find test data they actually want. For example, they may start by searching for all products of a certain brand name, and then refine it with a product type, and/or refine it with a date range of when the test was done.
PROBLEM:
I'm having a hard time finding information on how to implement multi-parameter searches with mysql and node.js. I know you can do nested queries and joins and such within pure SQL syntax, but it's not abundantly clear to me how I would do this from node.js, especially when certain search criteria aren't guaranteed to be used.
Ex:
CREATE PROCEDURE `procedureName`(
IN brandname VARCHAR(20),
producttype VARCHAR(30))
BEGIN
SELECT * FROM products
WHERE brand = brandname
AND product_type = producttype;
END
I know how to pass data from node.js to this procedure, but what if the user didn't specify a product type? Is there a way to nullify this part of the query? Something like:
AND product_type = ALL;
WHAT I'VE TRIED:
I've also looked into nesting multiple SQL procedures, but passing in dynamic data to the "FROM" clause doesn't seem to be possible. Ex: if I had a brandname procedure, and a product type procedure, I don't know how/if I can pass the results from one procedure to the "FROM" clause of the other to actually refine the search.
One idea was to create tables with the results in each of these procedures, and pass those new table names to subsequent procedures, but that strikes me as an inefficient way to do this (Am I wrong? Is this a completely legit way to do this?).
I'm also looking into building a query string on the node side that would intelligently decide what search criteria have been specified by the front end, and figure out where to put SQL AND's and JOIN's and what-nots. The example below actually works, but this seems like it could get ugly quick as I add more search criteria, along with JOINS to other tables.
// Build a SQL query based on the parameters in a request URL
// Example request URL: http://localhost:3000/search?brand=brandName&type=productType
function qParams(req) {
let q = "SELECT * FROM products WHERE ";
let insert = [];
if(req.query.brand) {
brandname = req.query.brand; // get brandname from url request
q = q + `brand = ?`, // Build brandname part of WHERE clause
insert.push(brandname); // Add brandname to insert array to be used with query.
};
if(req.query.type) {
productType = req.query.type; // get product type from url request
insert.length > 0 ? q = q + ' AND ' : q = q; // Decide if this is the first search criteria, add AND if not.
q = q + 'product_type = ?'; // Add product_type to WHERE clause
insert.push(productType); // Add product_type variable to insert array.
}
// Return query string and variable insert array
return {
q: q,
insert: insert
};
};
// Send Query
async function qSend(req, res) {
const results = await qParams(req); // Call above function, wait for results
// Send query string and variables to MySQL, send response to browser.
con.query(results.q, results.insert, (err, rows) => {
if(err) throw err;
res.send(rows);
res.end;
})
};
// Handle GET request
router.use('/search', qSend);
CONCISE QUESTIONS:
Can I build 1 SQL procedure with all my search criteria as variables, and nullify those variables from node.js if certain criteria aren't used?
Is there way to nest multiple MySQL procedures so I can pick the procedures applicable to the search criteria?
Is creating tables of results in a procedure, and passing those new table names to other procedures a reasonable way to do that?
Building the query from scratch in node is working, but it seems bloated. Is there a better way to do this?
Googling "multi-parameter search mysql nodejs" is not producing useful results for my question, i.e. I'm not asking the right question. What is the right question? What do I need to be researching?

One option is to use coalesce():
SELECT p.*
FROM products p
WHERE
p.brand = COALESCE(:brandname, p.brand)
AND p.product_type = COALESCE(:producttype, p.producttype);
It may be more efficient do explicit null checks on the parameters:
SELECT p.*
FROM products p
WHERE
(:brandname IS NULL OR p.brand = :brandname)
AND (:producttype IS NULL OR p.product_type = :producttype);

Related

postgres crosstab query with $libdir/tablefunc crosstab_hash function

My crosstab query (see below) runs just fine. However, I have to generate a large number of such queries, and - crucially - the number of column definitions will vary from day to day. If the number of output columndefs does not match that of the second argument of the crosstab, the crosstab will throw and error and abort. Therefore, I cannot "hard-wire" the column definitions as in my current query, and I need instead a function which will ensure that column definitions will be synchronized on-the-fly. Is it possible to write a generic postgres function that will be reusable in all such instances? Here is my query:
SELECT *
FROM crosstab
('SELECT
to_char(ipstimestamp, ''mon DD HH24h'') As row_name,
ips.objectid::text As category,
COUNT(*)::integer As value
FROM loggingdb_ips_boolean As log
INNER JOIN IpsObjects As ips
ON log.Varid=ips.ObjectId
WHERE (( log.varid = 37551)
OR (log.varid = 27087)
OR (log.varid = 29469)
OR (log.varid = 50876)
OR (log.varid = 45096)
OR (log.varid = 54708)
OR (log.varid = 47475)
OR (log.varid = 54606)
OR (log.varid = 25528)
OR (log.varid = 54729))
GROUP BY to_char(ipstimestamp, ''yyyy MM DD HH24h''), row_name, objectid, category
ORDER BY to_char(ipstimestamp, ''yyyy MM DD HH24h''), row_name, objectid, category',
'SELECT DISTINCT varid
FROM loggingdb_ips_boolean ORDER BY 1;'
)
As CountsPerHour(row_name text,
"25528" integer,
"27087" integer,
"29469" integer,
"37551" integer,
"45096" integer,
"54606" integer,
"54708" integer,
"54729" integer)
PS: Note that this query can be run against test data at the following server:
host: bellariastrasse.com
database: IpsLogging
user: guest
password: guest
I am afraid what you want is not completely possible. If the return type varies, you can either
create a function returning a generic SETOF record.
But then you'd have to provide a column definition list with every call - bringing you right back to where you started.
create a new function with a matching return type for every different case.
But that's what you are trying to avoid ...
If you have to write "a large number of such queries" you could utilize a query-generator function instead, which would not return the results but the DDL script which you would execute in a second step. Basically a function that takes in the variable parts as parameters and generates the query string in your example .. RETURNS text.
This can get pretty complex. Several meta-levels on top of each other have to be considered, but it is absolutely possible. Be sure to make heavy use of dollar-quoting to keep the quoting madness at bay.

Linq to Sql: Join, why do I need to load a collection

I have 2 tables that I need to load together all the time, the both must exist together in the database. However I am wondering why Linq to Sql demands that I have to load in a collection and then do a join, I only want to join 2 single tables where a record where paramid say = 5, example...
var data = _repo.All<TheData>(); //why do I need a collection/IQueryable like this?
var _workflow = _repo.All<WorkFlow>()
.Where(x => x.WFID== paramid)
.Join(data, x => x.ID, y => y.WFID, (x, y) => new
{
data = x,
workflow = y
});
I gues then I need to do a SingleOrDefault()? If the record is not null pass it back?
I Understand the Sql query comes out correctly, is there a better way to write this?
NOTE: I need to search a table called Participants to see if the loggedonuser can actually view this record, so I guess I should leave it as this? (this is main requirement)
var participant = _repo.All<Participants>();
.Any(x=> x.ParticipantID == loggedonuser.ID); //add this to above query...
The line var data = _repo.All<TheData>(); is something like saying 'start building query against the TheData table'.
This function returns you an IQueryable which will contain a definition of the query against your database.
So this doesn't mean you load the whole TheData table data with this line!
The query will be executed the moment you do something like .Count(), .Any(), First(), Single(), or ToList(). This is called deferred execution.
If you would end your query with SingleOrDefault() this will create a sql query that joins the two tables, add the filter and select the top most record or null(or throw an error if there are more!).
You could also use Linq instead of query extension methods.
It would look like:
var data = _repo.All<TheData>();
var _workflow = from w in _repo.All<WorkFlow>()
join t in _repo.All<TheData> on w.Id equals t.WFID
where x.WIFD = paramid
select new
{
data = t,
workflow = x
});

Could not format node 'Value' for execution as SQL

I've stumbled upon a very strange LINQ to SQL behaviour / bug, that I just can't understand.
Let's take the following tables as an example: Customers -> Orders -> Details.
Each table is a subtable of the previous table, with a regular Primary-Foreign key relationship (1 to many).
If I execute the follow query:
var q = from c in context.Customers
select (c.Orders.FirstOrDefault() ?? new Order()).Details.Count();
Then I get an exception: Could not format node 'Value' for execution as SQL.
But the following queries do not throw an exception:
var q = from c in context.Customers
select (c.Orders.FirstOrDefault() ?? new Order()).OrderDateTime;
var q = from c in context.Customers
select (new Order()).Details.Count();
If I change my primary query as follows, I don't get an exception:
var q = from r in context.Customers.ToList()
select (c.Orders.FirstOrDefault() ?? new Order()).Details.Count();
Now I could understand that the last query works, because of the following logic:
Since there is no mapping of "new Order()" to SQL (I'm guessing here), I need to work on a local list instead.
But what I can't understand is why do the other two queries work?!?
I could potentially accept working with the "local" version of context.Customers.ToList(), but how to speed up the query?
For instance in the last query example, I'm pretty sure that each select will cause a new SQL query to be executed to retrieve the Orders. Now I could avoid lazy loading by using DataLoadOptions, but then I would be retrieving thousands of Order rows for no reason what so ever (I only need the first row)...
If I could execute the entire query in one SQL statement as I would like (my first query example), then the SQL engine itself would be smart enough to only retrieve one Order row for each Customer...
Is there perhaps a way to rewrite my original query in such a way that it will work as intended and be executed in one swoop by the SQL server?
EDIT:
(longer answer for Arturo)
The queries I provided are purely for example purposes. I know they are pointless in their own right, I just wanted to show a simplistic example.
The reason your example works is because you have avoided using "new Order()" all together. If I slightly modify your query to still use it, then I still get an exception:
var results = from e in (from c in db.Customers
select new { c.CustomerID, FirstOrder = c.Orders.FirstOrDefault() })
select new { e.CustomerID, Count = (e.FirstOrder != null ? e.FirstOrder : new Order()).Details().Count() }
Although this time the exception is slightly different - Could not format node 'ClientQuery' for execution as SQL.
If I use the ?? syntax instead of (x ? y : z) in that query, I get the same exception as I originaly got.
In my real-life query I don't need Count(), I need to select a couple of properties from the last table (which in my previous examples would be Details). Essentially I need to merge values of all the rows in each table. Inorder to give a more hefty example I'll first have to restate my tabels:
Models -> ModelCategoryVariations <- CategoryVariations -> CategoryVariationItems -> ModelModuleCategoryVariationItemAmounts -> ModelModuleCategoryVariationItemAmountValueChanges
The -> sign represents a 1 -> many relationship. Do notice that there is one sign that is the other way round...
My real query would go something like this:
var q = from m in context.Models
from mcv in m.ModelCategoryVariations
... // select some more tables
select new
{
ModelId = m.Id,
ModelName = m.Name,
CategoryVariationName = mcv.CategoryVariation.Name,
..., // values from other tables
Categories = (from cvi in mcv.CategoryVariation.CategoryVariationItems
let mmcvia = cvi.ModelModuleCategoryVariationItemAmounts.SingleOrDefault(mmcvia2 => mmcvia2.ModelModuleId == m.ModelModuleId) ?? new ModelModuleCategoryVariationItemAmount()
select new
{
cvi.Id,
Amount = (mmcvia.ModelModuleCategoryVariationItemAmountValueChanges.FirstOrDefault() ?? new ModelModuleCategoryVariationItemAmountValueChange()).Amount
... // select some more properties
}
}
This query blows up at the line let mmcvia =.
If I recall correctly, by using let mmcvia = new ModelModuleCategoryVariationItemAmount(), the query would blow up at the next ?? operand, which is at Amount =.
If I start the query with from m in context.Models.ToList() then everything works...
Why are you looking into only the individual count without selecting anything related to the customer.
You can do the following.
var results = from e in
(from c in db.Customers
select new { c.CustomerID, FirstOrder = c.Orders.FirstOrDefault() })
select new { e.CustomerID, DetailCount = e.FirstOrder != null ? e.FirstOrder.Details.Count() : 0 };
EDIT:
OK, I think you are over complicating your query.
The problem is that you are using the new WhateverObject() in your query, T-SQL doesnt know anyting about that; T-SQL knows about records in your hard drive, your are throwing something that doesn't exist. Only C# knows about that. DON'T USE new IN YOUR QUERIES OTHER THAN IN THE OUTER MOST SELECT STATEMENT because that is what C# will receive, and C# knows about creating new instances of objects.
Of course is going to work if you use ToList() method, but performance is affected because now you have your application host and sql server working together to give you the results and it might take many calls to your database instead of one.
Try this instead:
Categories = (from cvi in mcv.CategoryVariation.CategoryVariationItems
let mmcvia =
cvi.ModelModuleCategoryVariationItemAmounts.SingleOrDefault(
mmcvia2 => mmcvia2.ModelModuleId == m.ModelModuleId)
select new
{
cvi.Id,
Amount = mmcvia != null ?
(mmcvia.ModelModuleCategoryVariationItemAmountValueChanges.Select(
x => x.Amount).FirstOrDefault() : 0
... // select some more properties
}
Using the Select() method allows you to get the first Amount or its default value. I used "0" as an example only, I dont know what is your default value for Amount.

Populate JOIN into a list in one database query

I am trying to get the records from the 'many' table of a one-to-many relationship and add them as a list to the relevant record from the 'one' table.
I am also trying to do this in a single database request.
Code derived from Linq to Sql - Populate JOIN result into a List almost achieves the intended result, but makes one database request per entry in the 'one' table which is unacceptable. That failing code is here:
var res = from variable in _dc.GetTable<VARIABLE>()
select new { x = variable, y = variable.VARIABLE_VALUEs };
However if I do a similar query but loop through all the results, then only a single database request is made. This code achieves all goals:
var res = from variable in _dc.GetTable<VARIABLE>()
select variable;
List<GDO.Variable> output = new List<GDO.Variable>();
foreach (var v2 in res)
{
List<GDO.VariableValue> values = new List<GDO.VariableValue>();
foreach (var vv in v2.VARIABLE_VALUEs)
{
values.Add(VariableValue.EntityToGDO(vv));
}
output.Add(EntityToGDO(v2));
output[output.Count - 1].VariableValues = values;
}
However the latter code is ugly as hell, and it really feels like something that should be do-able in a single linq query.
So, how can this be done in a single linq query that makes only a single database query?
In both cases the table is set to preload using the following code:
_dc = _db.CreateLinqDataContext();
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<VARIABLE>(v => v.VARIABLE_VALUEs);
_dc.LoadOptions = loadOptions;
I am using .NET 3.5, and the database back-end was generated using SqlMetal.
This link may help
http://msdn.microsoft.com/en-us/vcsharp/aa336746.aspx
Look under join operators. You'll probably have to change from using extension syntax other syntax too. Like this,
var = from obj in dc.Table
from obj2 in dc.Table2
where condition
select

Linq to SQL Stored Procedures with Multiple Results

We have followed the approach below to get the data from multiple results using LINQ To SQL
CREATE PROCEDURE dbo.GetPostByID
(
#PostID int
)
AS
SELECT *
FROM Posts AS p
WHERE p.PostID = #PostID
SELECT c.*
FROM Categories AS c
JOIN PostCategories AS pc
ON (pc.CategoryID = c.CategoryID)
WHERE pc.PostID = #PostID
The calling method in the class the inherits from DataContext should look like:
[Database(Name = "Blog")]
public class BlogContext : DataContext
{
...
[Function(Name = "dbo.GetPostByID")]
[ResultType(typeof(Post))]
[ResultType(typeof(Category))]
public IMultipleResults GetPostByID(int postID)
{
IExecuteResult result =
this.ExecuteMethodCall(this,
((MethodInfo)(MethodInfo.GetCurrentMethod())),
postID);
return (IMultipleResults)(result.ReturnValue);
}
}
Notice that the method is decorated not only with the Function attribute that maps to the stored procedure name, but also with the ReturnType attributes with the types of the result sets that the stored procedure returns. Additionally, the method returns an untyped interface of IMultipleResults:
public interface IMultipleResults : IFunctionResult, IDisposable
{
IEnumerable<TElement> GetResult<TElement>();
}
so the program can use this interface in order to retrieve the results:
BlogContext ctx = new BlogContext(...);
IMultipleResults results = ctx.GetPostByID(...);
IEnumerable<Post> posts = results.GetResult<Post>();
IEnumerable<Category> categories = results.GetResult<Category>();
In the above stored procedures we had two select queries
1. Select query without join
2. Select query with Join
But in the above second select query the data which is displayed is from one of the table i.e. from Categories table. But we have used join and want to display the data table with the results from both the tables i.e. from Categories as well as PostCategories.
Please if anybody can let me know how to achieve this using LINQ to SQL
What is the performance trade-off if we use the above approach vis-à-vis implement the above approach with simple SQL
Scott Guthrie (the guy who runs the .Net dev teams at MS) covered how to do this on his blog some months ago much better than I ever could, link here. On that page there is a section titled "Handling Multiple Result Shapes from SPROCs". That explains how to handle multiple results from stored procs of different shapes (or the same shape).
I highly recommend subscribing to his RSS feed. He is pretty much THE authoritative source on all things .Net.
Heya dude - does this work?
IEnumerable<Post> posts;
IEnumerable<Category> categories;
using (BlogContext ctx = new BlogContext(...))
{
ctx.DeferredLoadingEnabled = false; // THIS IS IMPORTANT.
IMultipleResults results = ctx.GetPostByID(...);
posts = results.GetResult<Post>().ToList();
categories = results.GetResult<Category>().ToList();
}
// Now we need to associate each category to the post.
// ASSUMPTION: Each post has only one category (1-1 mapping).
if (posts != null)
{
foreach(var post in posts)
{
int postId = post.PostId;
post.Category = categories
.Where(p => p.PostId == postId)
.SingleOrDefault();
}
}
Ok. lets break this down.
First up, a nice connection inside a using block (so it's disposed of nicely).
Next, we make sure DEFERRED LOADING is off. Otherwise, when u try and do the set (eg. post.Category == blah) it will see that it's null, lazy-load the data (eg. do a rountrip the database) set the data and THEN override the what was just dragged down from the db, with the result of there Where(..) method. phew! Summary: make sure deferred loading is off for the scope of the query.
Last, for each post, iterate and set the category from the second list.
does that help?
EDIT
Fixed it so that it doesn't throw an enumeration error by calling the ToList() methods.
Just curious, if a Post have have one or many Categories, is it possible to instead of using the for loop, to load the Post.PostCategories with the list of Categories (one to many), all in one shot, using a JOIN?
var rslt = from p in results.GetResult<Post>()
join c in results.GetResult<Category>() on p.PostId = c.PostID
...
p.Categories.Add(c)