I have run into problem with selecting large data from SQL Server. I have a view with 200 columns and 200 000 rows. And I am using the view for Solr indexing.
I tried to select data with paging but it took a lot of time(more then 6 hours). Now i am selecting it without paging and it take 1 hour. But SQL Server takes a lot of memory.
What is the best method or approach to select large data in such situations from SQL Server 2008 R2?
Thanks in advance.
200k rows is not that much and definitely shouldn't take 6 hours, not even 1 hour.
I did not understant if the problem is on the actuall select or bringing the result to the application.
I would recomend running the select with NOLOCK to ignore blockings, maybe your table is beeing accessed by other processes when you are running the query
SELECT * FROM TABLE WITH(NOLOCK)
If the problem is on bringing the data to the application you'll need to provide mroe details on how you are doing it
I'd suggest taking a look at your execution plan. Look at the properties in the very first operator and see if you're getting "Good Enough Plan Found" or a timeout. If it's the latter, you may have a very complicated view or you may be nesting views (a view calling a view). See what you can do to simplify the query in order to give the optimizer a chance to create a good execution plan.
Related
I have complex query and big database.I execute my query on the Sql Server 2008 it took time 8-10 minute.But I execute my query on the Sql Server 2000 it took time 1-2 hour.Why?I used index and I used execution plan but I didn't solve this problem.Anybody can help me? or Does anyone have a suggestion?
You can possibly create an auxiliary table for this query. That way the query would run faster, since the work is done before the query in background. Note: This usually works if the query retrieving the data doesnt have to be in sync with the DB.
Also, It depends on how you want to use the data, you might be able to cache or precache the results.
I have a MySQL Table on an Amazon RDS Instance with 250 000 Rows. When I try to
SELECT * FROM tableName
without any conditions (just for testing, the normal query specifies the columns I need, but I need most of them) , the query takes between 20 and 60 seconds to execute. This will be the base query for my report, and the report should run in under 60 seconds, so I think this will not work out (it times out the moment I add the joins). The report runs without any problems in our smaller test environments.
Could it be that the Query is taking so long because MySQL is trying to lock the table and waiting for all writes to finish? There might be quite a lot of writes on this table. I am doing the query on a MySQL slave, since I do not want to lockup the production system with my queries.
I have no experience with how much rows are much for a relational DB. Are 250 000 Rows with ~30 columns (varchar, date and integer types) much?
How can I speedup this query (hardware, software, query optimization ...)
Can I tell MySQL that I do not care that the Data might be inconsistent (It is a snapshot from a Reporting Database)
Is there a chance that this query will run under 60 seconds, or do I have to adjust my goals?
Remember that MySQL has to prepare your result set and transport it to your client. In your case, this could be 200MB of data it has to shuttle across the connection, so 20 seconds is not bad at all. Most libraries, by default, wait for the entire result being received before forwarding it to the application.
To speed it up, fetch only the columns you need, or do it in chunks with LIMIT. SELECT * is usually a sign that someone's being super lazy and not optimizing at all.
If your library supports streaming resultsets, use that, as then you can start getting data almost immediately. It'll allow you to iterate on rows as they come in without buffering the entire result.
A table with 250,000 rows is not too big for MySQL at all.
However, waiting for those rows to be returned to the application does take time. That is network time, and there are probably a lot of hops between you and Amazon.
Unless your report is really going to process all the data, check the performance of the database with a simpler query, such as:
select count(*) from table;
EDIT:
Your problem is unlikely to be due to the database. It is probably due to network traffic. As mentioned in another answer, streaming might solve the problem. You might also be able to play with the data formats to get the total size down to something more reasonable.
A last-resort step would be to save the data in a text file, compress the file, move it over, and uncompress it. Although this sounds like a lot of work, you might get 5x - 10x compression on the data, saving oodles of time on the transmission and still have a large improvement in performance with the rest of the processing.
I got updated specs from my client and was able to reduce the amount of users returned to 250, which goes (with a lot of JOINS) though in 60 seconds.
So maybe the answer is really: Try to not dump a whole table with a query, fetch only the exact data your need. The Client has SQL access, and he will have to update his queries, so only relevant users are returned.
I should never really use * as a wildcard. Choose the fields that you actually want and then create an index of these fields combined.
If you have thousands of rows, another option is implement pagination.
If result data directly using for report , no one can look more than 100 rows in single shot.
I have a MySQL (InnoDB) database that contains tables with rows count between 1 000 000 and 50 000 000.
At night there is aggregating job which counts some information and writes them to reporting tables.
Fist job execution is very fast. Every query executes between 100ms and 1s.
After that almost every single query is very slow.
The example query is:
SELECT count(*) FROM tableA
JOIN tableB ON tableA.id = tableB.tableA_id
execution plan for that query shows that for both tables indexes will be used.
Important thing is that CPU, I/O, memory usage is very low.
MySQL server version: 5.5.28 with default setup (just installed on windows 7 developer computer).
It is difficult to tell from the information provided. I am assuming you have done EXPLAIN etc. In a previous experience, one of my queries suddenly slowed down and I realized that a certain field was suddenly populated with a huge amount of data. Instead of using count(*) maybe try count(tableA.id).
See if this helps or provide more information to debug.
Maybe, it's not really the query but the writing to reporting tables, which is slow.
I would try two things:
Measure the performance of the inserts or updates of your reporting tables
Reorder your jobs. Take a slow one to the front, to see whether the first job is fast or whether the job, which is run first, is fast
Can anyone explain to me why there is a dramatic difference in performance between MySQL and SQL Server for this simple select statement?
SELECT email from Users WHERE id=1
Currently the database has just one table with 3 users. MySQL time is on average 0.0003 while SQL Server is 0.05. Is this normal or the MSSQL server is not configured properly?
EDIT:
Both tables have the same structure, primary key is set to id, MySQL engine type is InnoDB.
I tried the query with WITH(NOLOCK) but the result is the same.
Are the servers of the same level of power? Hardware makes a difference, too. And are there roughly the same number of people accessing the db at the same time? Are any other applications using the same hardware (databases in general should not share servers with other applications).
Personally I wouldn't worry about this type of difference. If you want to see which is performing better, then add millions of records to the database and then test queries. Database in general all perform well with simple queries on tiny tables, even badly designed or incorrectly set up ones. To know if you will have a performance problem you need to test with large amounts of data and many simulataneous users on hardware similar to the one you will have in prod.
The issue with diagnosing low cost queries is that the fixed cost may swamp the variable costs. Not that I'm a MS-Fanboy, but I'm more familiar with MS-SQL, so I'll address that, primarily.
MS-SQL probably has more overhead for optimization and query parsing, which adds a fixed cost to the query when decising whether to use the index, looking at statistics, etc. MS-SQL also logs a lot of stuff about the query plan when it executes, and stores a lot of data for future optimization that adds overhead
This would all be helpful when the query takes a long time, but when benchmarking a single query, seems to show a slower result.
There are several factors that might affect that benchmark but the most significant is probably the way MySQL caches queries.
When you run a query, MySQL will cache the text of the query and the result. When the same query is issued again it will simply return the result from cache and not actually run the query.
Another important factor is the SQL Server metric is the total elapsed time, not just the time it takes to seek to that record, or pull it from cache. In SQL Server, turning on SET STATISTICS TIME ON will break it down a little bit more but you're still not really comparing like for like.
Finally, I'm not sure what the goal of this benchmarking is since that is an overly simplistic query. Are you comparing the platforms for a new project? What are your criteria for selection?
I have table in a SQL Server database with only 900 record with 4 column.
I am using Linq-to-SQL. Now I am trying retrieve data from that table for this I have written a select query.
Its not querying data from database and its showing time out error.
Please give me idea for this. First how can I increase time and second how can increase performance of query so can it easily access.
Thanks
That is a tiny table, there is either something very wrong with your database, or your application.
Try seeing what is happening in the database with SQL Profiler.
If you have just 900 records and four columns then unless you are storing many megabytes of data in each field the query should be very fast. I think your problem is that the connection is failing, possibly due to a firewall or other networking problem.
To debug I'd suggest running a simpler query and see if you can get any data at all. Also try running the same query from the SQL Server Management Studio to see if it works there.