I'm trying to select the top ten most similar properties for a given property on a realty site, and I was wondering if you could help me out. The variables I'm working with are price (int), area (int), bathrooms (int), bedrooms (int), suites (int), and parking (int). At the moment I'm thinking of ordering by ABS(a - b), but wouldn't that be slow if I had to calculate it every time a property is viewed? (I'm not sure I could cache this, since the database is constantly being updated.) Is there another option?
Thanks for your help!
One solution could be to create a new table containing the results ready-made, like this:
property_id   similar_properties_ids
-----------   ----------------------
1             2,5,8
2             3,10
...           ...
A cron job running at regular intervals would do the calculation for all the properties and fill in similar_properties_ids.
So at runtime you don't have the calculation overhead, but the downside is that the results are a little old (as of the last cron run).
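For illustration, here is a minimal sketch, in MySQL syntax, of the distance query such a cron job could run for each property. The property table, its columns, and the weights are all assumptions; the weights in particular would need tuning so that large-valued fields like price do not dominate the score.
-- Hypothetical schema: property(id, price, area, bathrooms, bedrooms, suites, parking).
-- Finds the ten nearest neighbours of property 123 by weighted absolute difference.
SELECT b.id,
       ABS(a.price - b.price) / 1000     -- illustrative weights: scale each
     + ABS(a.area - b.area) / 10         -- field so no single one dominates
     + ABS(a.bathrooms - b.bathrooms)
     + ABS(a.bedrooms - b.bedrooms)
     + ABS(a.suites - b.suites)
     + ABS(a.parking - b.parking) AS distance
FROM property AS a
JOIN property AS b ON b.id <> a.id
WHERE a.id = 123
ORDER BY distance
LIMIT 10;
The cron job would run this once per property and store the ten ids in similar_properties_ids.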
OK, so what is the best practice when it comes to pagination in MySQL? Let me make it clearer: say that at a given time I have 2000 records, with more being inserted, and I am displaying 25 at a time. I know I have to use LIMIT to paginate through the records, but what am I supposed to do about the total count of my records? Do I count the records every time a user clicks to request the next 25? Please don't tell me the answer straight up, but rather point me in the right direction. Thanks!
The simplest solution would be to just continue working with the result set normally as new records are inserted. Presumably, each page you display will use a query looking something like the following:
SELECT *
FROM yourTable
ORDER BY someCol
LIMIT 25
OFFSET 100
As the user pages back and forth, if new data were to come in it is possible that a page could change from what it was previously. From a logical point of view, this isn't so bad. For example, if you had an alphabetical list of products and a new product appeared, then the user would receive this information in a fairly nice way.
As for counting, your code can allow moving to the next page so long as data is there to support a new page being added. Having new records added might mean more pages required to cover the entire table, but it should not affect your logic used to determine when to stop allowing pages.
If your table has a date or timestamp column representing when a record was added, then you might be able to restrict the entire result set to a snapshot in time. In this case, new data would simply not appear for the duration of a given session.
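For example, a sketch of that snapshot approach, reusing the hypothetical yourTable query from above and assuming a created_at timestamp column; the session would record the time of the first page view and reuse it for every subsequent page:
SELECT *
FROM yourTable
WHERE created_at <= '2017-03-01 12:00:00'  -- snapshot time saved in the session
ORDER BY someCol
LIMIT 25
OFFSET 100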
3 suggestions:
1. Refresh only the data grid when the Next button is clicked (via AJAX), or store the count in the session for the chosen search parameters.
2. Use Memcached, which can be shared across all users. Generate a unique key based on the filter parameters and cache the count under it, so you won't hit the database. When a new record gets added, clear the existing cache key. This requires a Memcached server to be running.
3. Add an index, so that even if you do hit the database for the count alone, there won't be much impact on performance (sketched below).
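For suggestion 3, a sketch reusing the hypothetical yourTable/someCol names from the earlier answer: with an index on the filtered column, a count-only query stays cheap enough to run on every page view.
CREATE INDEX idx_yourtable_somecol ON yourTable (someCol);

-- The count can be answered from the index rather than the whole table.
SELECT COUNT(*)
FROM yourTable
WHERE someCol > 100;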
I'm hoping this will be a rather simple question to answer, as I'm not looking for any specific code. I have a table on a classic ASP page populated from a SQL Server database. I've just set the table up so that each row is clickable and takes you to a page to edit the data in that row. My question is this: would I be better off trying to use the recordset that populated the table, or should I reconnect to the database and pull just the record I want to edit?
As always: it depends. It depends on what you need to edit about the record. It depends on how far apart your DB and site are from each other. It depends on which machine, if the DB and site are on separate machines, is more powerful.
That being said, you should make a new call for that specific record, mainly because of a specification you made in your question:
...and takes you to a page to edit the data in the row
You should not try to pass a recordset between pages. There are a few reasons for this:
1. Only collect what you need
2. Make sure data is fresh
3. Consider how your program will scale
On point 1, there are two ways to look at this. One is that you are passing an entire recordset between pages when you only need one record; there are few situations where another DB call would cost more than that. The other is that you are passing only one record, which would make me question your design: why does this recordset have every item related to a record? You are selecting way too much for just a result list. And if the record really is that small, why do you need a new page at all? Why not just reveal an edit template for the item, if it is that minimal?
On point 2, consider the following scenario. You are discussing with a coworker how you need to change a customer's record. You pull up the result set in an application, but then nature calls and you step away from your desk. The coworker gets called by the customer and asked why the record is not updated yet. To placate the customer, your coworker makes the changes. Now you are using an old recordset and may overwrite the additional changes your coworker made while you were away. This all happens because you never refresh the recordset; you just keep passing the old one from page to page.
On point 3, we can look back at point 1 a bit. Let us say that you are passing 5 fields now, and you then decide you need a comments field attached to one of your existing fields. Do you intend to pass 2,000 characters of that comment field to the next page? How about if each of the 5 fields needs a comment? Do you intend to pass 10,000 characters per record for a properly paged recordset of 10? And if you do not page the recordset, will you pass that much for each of a full 126 records?
There are more reasons too. Will you be able to keep your records secure passing them this way? Will this affect the experience of users who have a weak computer and cannot build that large a POST request quickly? Generally it is better to keep only what you need, and in most situations your result set should not contain everything you need for editing.
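As a sketch, the edit page would make one fresh call along these lines, assuming a hypothetical id column whose value arrives with the request from the clicked row (bind it as a parameter rather than concatenating it into the SQL):
SELECT *
FROM yourTable
WHERE id = ?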
I have a project table which lists every project, a cost center table which lists every cost center, and an analyst table which shows the project, the cost center, and the analyst assigned to them. The projects and cost centers are dropdown lists. Every project should include every cost center, and for every project and cost center combination there should be an analyst assigned. How do I see which combinations I have missed? The query I keep trying has two outer joins, and Access doesn't like that. With 30 projects and 15 cost centers it is easy to forget to assign an analyst to one of the combinations.
It would also be helpful to have some kind of query that easily shows who is assigned to which projects, preferably in a crosstab format (similar to a pivot table). I think I can do that once I have the correct query that links these 3 tables together and shows every project with every cost center and the analyst assigned to them.
If my setup with 3 tables is the main problem, I can redo the database design. I thought I was designing it correctly by having a separate table for projects and cost centers and a 3rd table that combines them with the analysts. But now that I can't figure out how to get this query to work, I am thinking maybe that wasn't the best design idea.
Sorry. I figured it out. I guess writing the question helped me think it through.
I used one query that had the project table and the cost center table in it with no join between them. This created a list of every possible combination.
I then made a second query that linked the first query to the analyst table. I forced the query to show every combination from the first query and then tell me when an analyst matched that combination. This way I get a blank everywhere I missed adding an analyst. It was also very easy to turn this second query into a pivot table that shows all of the blanks.
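For anyone who lands here with the same problem, a sketch of the two queries in Access SQL, with hypothetical table and column names (Projects, CostCenters, Analysts):
Query1, every possible project/cost center combination (no join, so Access produces the cross product):
SELECT Projects.ProjectID, CostCenters.CostCenterID
FROM Projects, CostCenters;
Query2, which keeps every combination from Query1 and shows the analyst where one exists; a blank AnalystName marks a missed assignment:
SELECT Query1.ProjectID, Query1.CostCenterID, Analysts.AnalystName
FROM Query1 LEFT JOIN Analysts
ON (Query1.ProjectID = Analysts.ProjectID)
AND (Query1.CostCenterID = Analysts.CostCenterID);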
Sorry again for posting this question.
I thought about using a trigger or function to solve the problem but I don't really know how to go about coding it. Any help would be much appreciated.
Did you try creating a view? A view is re-evaluated every time a row is added or updated; it is stored as a query but you can use it like a table.
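For example, a minimal sketch, assuming a hypothetical orders(customer_id, amount) table:
CREATE VIEW customer_totals AS
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id;
Because the view is just a stored query, the total is recomputed on every read and can never go stale.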
IMHO you should not persist the running totals; the possibility of the value becoming stale is high. Instead you should go the route of a view, as suggested by #m-farhan. Still, if you do want to persist the running total, then a trigger would be the safest bet. The trigger would ideally fire a SQL query to accumulate the customer's total and update the totals column.
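A sketch of such a trigger in MySQL syntax, again assuming hypothetical orders and customers tables; adjust for your actual schema and RDBMS:
-- Keeps customers.total in step with newly inserted orders.
CREATE TRIGGER trg_accumulate_total
AFTER INSERT ON orders
FOR EACH ROW
UPDATE customers
SET total = total + NEW.amount
WHERE id = NEW.customer_id;
Note that updates and deletes on orders would each need a similar trigger, otherwise the persisted total drifts away from the truth.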
I wrote a VBA script that runs in an Access database. The script looks up values on various tables and assigns an attribute to a main table based on the combination of values.
The script works as intended, however, I am working with millions of records so it takes an unacceptably long time.
I would like to break the process up into smaller parts and run the script concurrently on separate threads.
Before I start attempting to build a solution, I would like to know:
Based on your experience, would this increase performance? Or would the process take just as long?
I am looking at using PowerShell or VBScript to accomplish this. Are there any obstacles to look out for?
Please note: due to the client this will run on, I have to use Access for the back end, and if I use PowerShell it will have to be version 1.0.
I know these are very vague questions, but any feedback based on prior experience is appreciated. Thanks!
Just wanted to post back with my final solution on this...
I tried the following ways to assign an attribute to a main table, based on a combination of values from other tables, for a 60,000-record sample size:
Solution 1: Used a combination of SQL queries and FSO Dictionary objects to assign attribute
Result: 60+ minutes to update 60,000 records
Solution 2: Ran script from Solution 1 concurrently from 3 separate instances of Excel
Result: CPU was maxed out (Instance 1 - 50% of CPU, Instances 2 and 3 - 25% each); stopped the code after an hour since it wasn't a viable solution
Solution 3: Tried using SQL UPDATE queries to update main table with the attribute
Result: This failed because apparently Access does not allow for a join on an UPDATE sub-query (or I just stink at writing SQL)
Solution 4 (Best Result): Selected all records from the main table that matched the criteria for each attribute, output the records to csv, and assigned the attribute to all records in the csv file. This created a separate file for each attribute, all in the same format. I then imported and appended all of the records from the csv files into a new main table.
Result: 2.5 minutes to update 60,000 records
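For reference, a sketch of the kind of per-attribute query Solution 4 relies on, with hypothetical table and field names; each attribute gets its own copy with a different literal and WHERE clause, and each result set is exported to its own csv file:
SELECT m.*, "AttributeA" AS AttributeValue
FROM MainTable AS m INNER JOIN LookupTable AS l
ON m.KeyField = l.KeyField
WHERE l.SomeValue = "A";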
Special thanks to Pynner and Remou who suggested writing the data out to csv.
I never would have thought that this would be the quickest way to update the records with the attribute. I probably would have scrapped the project thinking it was impossible to accomplish with Access and VBA had you not made this suggestion. Thank you so much for sharing your wisdom!