Primefaces Autocomplete from huge database not acting fast - mysql

I am using the PrimeFaces autocomplete component with POJOs; the suggestions come from a database table with a huge number of rows.
When I query a table that contains millions of entries (SELECT synonym FROM synonyms WHERE synonym LIKE '%:query%'), autocomplete takes a very long time to find the word, and the table will only get bigger in the future.
Are there any suggestions for making autocomplete respond faster?

Limiting the number of rows is a great way to speed up autocomplete. I'm not clear on why you'd limit to 1000 rows, though: you can't show 1000 entries in a dropdown, so shouldn't you be limiting to maybe 10 entries?
Based on your comments below, here is an example database query that you should be able to adapt to your situation:
String queryString = "select distinct b.title from Books b where b.title like :userValue";
Query query = entityManager.createQuery(queryString);
query.setParameter("userValue", userValue + "%");
query.setMaxResults(20);
List<String> results = query.getResultList();
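To illustrate why a prefix match with a small limit is cheap: a pattern with a leading wildcard ('%query%') generally cannot use an ordinary B-tree index, while a prefix pattern ('query%') can seek to the first candidate and stop after a handful of rows. Below is a minimal in-memory sketch of that idea in Java, with a TreeSet standing in for the index. The names PrefixSuggest and suggest are made up for the illustration and are not part of PrimeFaces or JPA.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class PrefixSuggest {

    /** Returns at most `limit` entries that start with `prefix`, in sorted order. */
    static List<String> suggest(TreeSet<String> sorted, String prefix, int limit) {
        List<String> out = new ArrayList<>();
        if (prefix.isEmpty() || limit <= 0) {
            return out;
        }
        // tailSet seeks directly to the first entry >= prefix, much like an index range scan.
        for (String candidate : sorted.tailSet(prefix, true)) {
            // Entries sharing the prefix are contiguous, so the first miss ends the scan.
            if (!candidate.startsWith(prefix) || out.size() >= limit) {
                break;
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> words = new TreeSet<>(
                Arrays.asList("band", "apple", "banana", "apricot", "apply"));
        System.out.println(suggest(words, "ap", 2)); // prints [apple, apply]
    }
}
```

The loop touches only the matching entries plus one, regardless of how large the set is, which is the behavior you want from the database query as well.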

I finally switched to a Solr index for fast lookups, since my table will contain more than 4 million entries that must be searched quickly and without consuming a lot of memory.
Here is my solution, in case someone else has the same problem.
public List<Synonym> completeSynonym(String query) {
List<Synonym> filteredSynonyms = new ArrayList<Synonym>();
// ResultSet result;
// SolrQuery solrQ=new SolrQuery();
String sUrl = "http://......solr/synonym_core";
SolrServer solr = new HttpSolrServer(sUrl);
ModifiableSolrParams parameters = new ModifiableSolrParams();
parameters.set("q", "*:*"); // query everything
parameters.set("fl", "id,synonym");// send back just the id
//and synonym values
parameters.set("wt", "json");// this in json format
parameters.set("fq", "synonym:\"" + query+"\"~0"); //my conditions
QueryResponse response;
try {
if (query.length() > 1) {
response = solr.query(parameters);
SolrDocumentList dl = response.getResults();
for (int i = 0; i < dl.size(); i++) {
Synonym s = new Synonym();
s.setSynonym_id((int) dl.get(i).getFieldValue("id"));
s.setSynonymName(dl.get(i).getFieldValue("synonym")
.toString());
filteredSynonyms.add(s);
}
}
} catch (SolrServerException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
return filteredSynonyms;
}


Pagination in Dynamo db while scan operation

I want to scan my DynamoDB table with pagination applied. In my request I want to send the position from which pagination should start. For example, I send a request with start = 3 and limit = 10, where start means the scan should begin at the third item in the table and limit means it should return up to 10 items. The limit I can implement with the .withLimit() method (I am using Java). I followed this AWS document. Here is the code for what I want to achieve:
List<Map<String, AttributeValue>> mapList = new ArrayList<>();
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
Gson gson = new GsonBuilder().serializeNulls().create();
Map<String, AttributeValue> expressionAttributeValues = new HashMap<String, AttributeValue>();
expressionAttributeValues.put(":name", new AttributeValue().withS(name));
List<ResponseDomain> domainList = new ArrayList<>();
ResponseDomain responseDomain = null;
//lastKeyEvaluated = start
Map<String, AttributeValue> lastKeyEvaluated = null;
do {
    ScanRequest scanRequest = new ScanRequest().withTableName(STUDENT_TABLE)
        .withProjectionExpression("studentId, studentName")
        .withFilterExpression("begins_with(studentName, :name)")
        .withExpressionAttributeValues(expressionAttributeValues)
        .withExclusiveStartKey(lastKeyEvaluated);
    ScanResult result = client.scan(scanRequest);
    for (Map<String, AttributeValue> item : result.getItems()) {
        responseDomain = gson.fromJson(gson.toJson(item), ResponseDomain.class);
        domainList.add(responseDomain);
    }
    lastKeyEvaluated = result.getLastEvaluatedKey();
} while (lastKeyEvaluated != null);
//lastKeyEvaluated = size
return responseDomain;
In the above code I am stuck in three places:
How can I set lastKeyEvaluated to my start value, i.e. 3?
In the while condition, how can I specify my limit, i.e. 10?
When I try to map an item from JSON to my domain class, I get an error.
Am I misinterpreting the concept of pagination in DynamoDB, or doing something wrong in the code? Any guidance will be highly appreciated, as I am a newbie.
You can only start reading from a given position by setting ExclusiveStartKey. This key is the primary key of an item in your table, not a numeric offset. If you know the item's key, you can use it like this (assuming your table's primary key is studentId):
Map<String, AttributeValue> lastKeyEvaluated = new HashMap<String,AttributeValue>();
lastKeyEvaluated.put("studentId", new AttributeValue(STUDENTID));
When you specify limit = N in DynamoDB, you are telling it to read at most N items from the table. Any filtering is applied after those N items have been read. See Limiting the Number of Items in the Result Set.
That might leave you with fewer results than expected. So you could keep sending requests in a loop until you reach your expected count, and cut off any extra results.
int N = 10;
List<Map<String, AttributeValue>> itemsList = new ArrayList<>();
do {
    // scanRequest.withLimit(N)
    ...
    itemsList.addAll(result.getItems());
    if (itemsList.size() >= N) {
        itemsList = itemsList.subList(0, N);
        break;
    }
} while (lastKeyEvaluated != null && itemsList.size() < N);
// process the itemsList
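To make the "limit is applied before the filter" behavior concrete, here is a small self-contained simulation in plain Java. It does not call the AWS SDK: the table is just a list, the key is the list index, and FilteredScanDemo, scanPage and firstN are hypothetical names used only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class FilteredScanDemo {

    /** One simulated page: the rows that survived the filter, plus the key to resume from. */
    static final class Page {
        final List<String> items;
        final Integer lastEvaluatedKey; // null once the table is exhausted

        Page(List<String> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Reads at most `limit` rows after `exclusiveStartKey`, THEN filters them by prefix. */
    static Page scanPage(List<String> table, Integer exclusiveStartKey, int limit, String prefix) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        int from = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
        int to = Math.min(from + limit, table.size());
        List<String> matches = new ArrayList<>();
        for (int i = from; i < to; i++) {
            if (table.get(i).startsWith(prefix)) {
                matches.add(table.get(i));
            }
        }
        Integer last = (to < table.size()) ? Integer.valueOf(to - 1) : null;
        return new Page(matches, last);
    }

    /** Keeps requesting pages until n matches are collected or the table ends. */
    static List<String> firstN(List<String> table, String prefix, int n, int pageLimit) {
        List<String> result = new ArrayList<>();
        Integer key = null;
        do {
            Page page = scanPage(table, key, pageLimit, prefix);
            result.addAll(page.items);
            key = page.lastEvaluatedKey;
        } while (key != null && result.size() < n);
        // Cut off any extra matches from the final page.
        return result.size() > n ? new ArrayList<>(result.subList(0, n)) : result;
    }
}
```

With the table [Ann, Bob, Al, Cy, Abe, Dan, Amy], prefix "A" and a page limit of 2, a single page returns only one match (Ann) even though the limit is 2, which is why the loop is needed.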
Dynamo uses its own JSON structure. See Dynamo response syntax.
You can get the value of an attribute the way you stored it in dynamo. If studentId is a string then it could be something like this:
for (Map<String, AttributeValue> item : result.getItems()) {
responseDomain = new ResponseDomain();
responseDomain.setId(item.get("studentId").getS());
domainList.add(responseDomain);
}
Pagination doesn't quite work the way you are thinking.
Use ScanRequest.withLimit(X) to choose the maximum number of items evaluated for each page of results. For example, ScanRequest.withLimit(10) means that each page of results you get will have up to 10 items in it (fewer if a filter expression discards some of them).
The lastKeyEvaluated is not a number of a page, it is the actual key of an item in the table. Specifically it is the key of the last item in the last set of results you retrieved.
Think about it like this. Imagine your results are:
Dog
Chicken
Cat
Cow
Rhino
Buffalo
Now let's say I did ScanRequest.withLimit(2) with lastKeyEvaluated = null, so each page of results has 2 items and I will retrieve the first page of results. My first scan returns
Dog
Chicken
And
lastKeyEvaluated = result.getLastEvaluatedKey();
Returns
Chicken
And now to get the next page of results I would use ScanRequest.withExclusiveStartKey(Chicken). And the next set of results would be
Cat
Cow
Your code above uses a do/while loop to retrieve every page of results and collect them all. Most likely you will want to remove that do/while loop so that you can handle one page at a time, then retrieve the next page when you are ready.
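The animal walkthrough above can be replayed with a tiny simulation. This is a sketch in plain Java, not the AWS SDK: the "table" is a list in scan order, the key is the item itself, and CursorPagingDemo and scan are hypothetical names. One simplification to be aware of: the sketch returns a null key as soon as the last row has been read.

```java
import java.util.ArrayList;
import java.util.List;

class CursorPagingDemo {

    /** One page of results and the key of its last item (null when nothing is left). */
    static final class Page {
        final List<String> items;
        final String lastEvaluatedKey;

        Page(List<String> items, String lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /** Returns up to `limit` items that come after `exclusiveStartKey` in scan order. */
    static Page scan(List<String> table, String exclusiveStartKey, int limit) {
        int from = 0;
        if (exclusiveStartKey != null) {
            int index = table.indexOf(exclusiveStartKey);
            if (index < 0) {
                throw new IllegalArgumentException("unknown key: " + exclusiveStartKey);
            }
            from = index + 1; // exclusive: start just after the given key
        }
        int to = Math.min(from + Math.max(limit, 0), table.size());
        List<String> items = new ArrayList<>(table.subList(from, to));
        String last = (to < table.size() && !items.isEmpty()) ? items.get(items.size() - 1) : null;
        return new Page(items, last);
    }
}
```

Calling scan(table, null, 2) yields Dog and Chicken with Chicken as the key; passing that key back in yields Cat and Cow. The caller holds on to the key between requests, so each request handles exactly one page.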

Deleting duplicate data in MySQL

I'm trying to emulate the accepted answer in this SO question: Delete all Duplicate Rows except for One in MySQL? [duplicate] with a twist, I want the data (auto-incrementing ID's) of one table to determine which rows to delete in another table. SQLFiddle here showing data.
In the fiddle referenced above, the end result I'm looking for is the rows in eventdetails_new with Event_ID = 4 & 6 to be deleted (EVENTDETAILS_ID's 5 & 6, and 9 & 10), leaving rows 3 & 5 (EVENTDETAILS_ID's 3 & 4 and 7 & 8). I hope that made sense. Ideally the rows in events_new with those same Event_ID's would get deleted as well (which I haven't started working on yet, so no code samples).
This is the query I'm trying to make work, but I'm a bit over my head:
SELECT *
FROM eventdetails_new AS EDN1, eventdetails_new AS EDN2
INNER JOIN events_new AS E1 ON `E1`.`Event_ID` = `EDN1`.`Event_ID`
INNER JOIN events_new AS E2 ON `E2`.`Event_ID` = `EDN2`.`Event_ID`
WHERE `E1`.`Event_ID` > `E2`.`Event_ID`
AND `E1`.`DateTime` = `E2`.`DateTime`
AND events_new.EventType_ID = 6;
Here's the same SQLFiddle with the results of this query. Not good. I can see the Event_ID in the data, but the query cannot for some reason. Not sure how to proceed to fix this.
I know it's a SELECT query, but I couldn't figure out a way to have two aliased tables in the DELETE query (which I think I need?). I figured if I could get a selection, I could delete it with some C# code. However ideally it could all be done in a single query or set of statements without having to go outside of MySQL.
Here's my first cut at the query, but it's just as bad:
DELETE e1 FROM eventdetails_new e1
WHERE `events_new`.`Event_ID` > `events_new`.`Event_ID`
AND events_new.DateTime = events_new.DateTime AND events_new.EventType_ID = 6;
SQLFiddle won't let me run this query at all, so it's not much help. However, it gives me the same error as the one above: Error Code: 1054. Unknown column 'events_new.Event_ID' in 'where clause'
I'm by no means married to either of these queries if there's a better way. The end result I'm looking for is deleting a bunch of duplicate data.
I have hundreds of thousands of these results, and I know that roughly 1/3 of them are duplicates that I need to get rid of before we go live with the database.
Here's what I eventually ended up doing. My co-worker & I came up with a query that would give us a list of Event_ID's that had duplicate data (we actually used Access 2010's query builder and MySQL-ified it). Bear in mind this is a complete solution where the original question didn't have as much detail as far as linked tables. If you've got questions about this, feel free to ask & I'll try to help:
SELECT `Events_new`.`Event_ID`
FROM Events_new
GROUP BY `Events_new`.`PCBID`, `Events_new`.`EventType_ID`, `Events_new`.`DateTime`, `Events_new`.`User`
HAVING (((COUNT(`Events_new`.`PCBID`)) > 1) AND ((COUNT(`Events_new`.`User`)) > 1) AND ((COUNT(`Events_new`.`DateTime`)) > 1))
From this I processed each Event_ID to remove the duplicates iteratively. Basically I had to delete all the child rows, starting from the lowest-level table, so that I didn't run afoul of foreign key constraints.
This chunk of code was written in LinqPAD as C# statements: (sbCommonFunctions is an inhouse DLL designed to make most (but not all as you'll see) database functions be handled the same way or easier)
sbCommonFunctions.Database testDB = new sbCommonFunctions.Database();
testDB.Connect("production", "database", "user", "password");
List<string> listEventIDs = new List<string>();
List<string> listEventDetailIDs = new List<string>();
List<string> listTestInformationIDs = new List<string>();
List<string> listTestStepIDs = new List<string>();
List<string> listMeasurementIDs = new List<string>();
string dtQuery = (String.Format(@"SELECT `Events_new`.`Event_ID`
FROM Events_new
GROUP BY `Events_new`.`PCBID`,
`Events_new`.`EventType_ID`,
`Events_new`.`DateTime`,
`Events_new`.`User`
HAVING (((COUNT(`Events_new`.`PCBID`)) > 1)
AND ((COUNT(`Events_new`.`User`)) > 1)
AND ((COUNT(`Events_new`.`DateTime`)) > 1))"));
int iterations = 0;
DataTable dtEventIDs = getDT(dtQuery, testDB);
while (dtEventIDs.Rows.Count > 0)
{
Console.WriteLine(dtEventIDs.Rows.Count);
Console.WriteLine(iterations);
iterations++;
foreach(DataRowView eventID in dtEventIDs.DefaultView)
{
listEventIDs.Add(eventID.Row[0].ToString());
DataTable dtEventDetails = testDB.QueryDatabase(String.Format(
"SELECT * FROM EventDetails_new WHERE Event_ID = {0}",
eventID.Row[0]));
foreach(DataRowView drvEventDetail in dtEventDetails.DefaultView)
{
listEventDetailIDs.Add(drvEventDetail.Row[0].ToString());
}
DataTable dtTestInformation = testDB.QueryDatabase(String.Format(
@"SELECT TestInformation_ID
FROM TestInformation_new
WHERE Event_ID = {0}",
eventID.Row[0]));
foreach(DataRowView drvTest in dtTestInformation.DefaultView)
{
listTestInformationIDs.Add(drvTest.Row[0].ToString());
DataTable dtTestSteps = testDB.QueryDatabase(String.Format(
@"SELECT TestSteps_ID
FROM TestSteps_new
WHERE TestInformation_TestInformation_ID = {0}",
drvTest.Row[0]));
foreach(DataRowView drvTestStep in dtTestSteps.DefaultView)
{
listTestStepIDs.Add(drvTestStep.Row[0].ToString());
DataTable dtMeasurements = testDB.QueryDatabase(String.Format(
@"SELECT Measurements_ID
FROM Measurements_new
WHERE TestSteps_TestSteps_ID = {0}",
drvTestStep.Row[0]));
foreach(DataRowView drvMeasurements in dtMeasurements.DefaultView)
{
listMeasurementIDs.Add(drvMeasurements.Row[0].ToString());
}
}
}
}
testDB.Disconnect();
string mysqlConnection =
"server=server;\ndatabase=database;\npassword=password;\nUser ID=user;";
MySqlConnection connection = new MySqlConnection(mysqlConnection);
connection.Open();
//start unwinding the duplicates from the lowest level upward
whackDuplicates(listMeasurementIDs, "measurements_new", "Measurements_ID", connection);
whackDuplicates(listTestStepIDs, "teststeps_new", "TestSteps_ID", connection);
whackDuplicates(listTestInformationIDs, "testinformation_new", "testInformation_ID", connection);
whackDuplicates(listEventDetailIDs, "eventdetails_new", "eventdetails_ID", connection);
whackDuplicates(listEventIDs, "events_new", "event_ID", connection);
connection.Close();
//update iterator from inside the clause in case there are more duplicates.
dtEventIDs = getDT(dtQuery, testDB); }
}//goofy curly brace to allow LinqPAD to deal with inline classes
public void whackDuplicates(List<string> listOfIDs,
string table,
string pkID,
MySqlConnection connection)
{
foreach(string ID in listOfIDs)
{
MySqlCommand command = connection.CreateCommand();
command.CommandText = String.Format(
"DELETE FROM " + table + " WHERE " + pkID + " = {0}", ID);
command.ExecuteNonQuery();
}
}
public DataTable getDT(string query, sbCommonFunctions.Database db)
{
return db.QueryDatabase(query);
//}/*this is deliberate, LinqPAD has a weird way of dealing with inline
classes and the last one can't have a closing curly brace (and the
first one has to have an extra opening curly brace above it, go figure)
*/
Basically this is a giant while loop, and the clause iterator is updated from inside the clause until the number of Event_ID's drops to zero (it takes 5 iterations, some of the data has as many as six duplicates).
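For comparison, the "keep the lowest Event_ID in each duplicate group" rule from the question can be computed in a single pass instead of several iterations. The following is only an in-memory Java sketch of that rule, not the code that was actually run: DuplicateEvents, Event and idsToDelete are hypothetical names, and the grouping columns (PCBID, EventType_ID, DateTime, User) are taken from the GROUP BY query above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateEvents {

    /** The columns of one event row that matter for duplicate detection. */
    static final class Event {
        final int id;
        final String pcbId;
        final int eventTypeId;
        final String dateTime;
        final String user;

        Event(int id, String pcbId, int eventTypeId, String dateTime, String user) {
            this.id = id;
            this.pcbId = pcbId;
            this.eventTypeId = eventTypeId;
            this.dateTime = dateTime;
            this.user = user;
        }

        /** Two events are duplicates when every grouping column is equal. */
        List<Object> groupKey() {
            return Arrays.<Object>asList(pcbId, eventTypeId, dateTime, user);
        }
    }

    /** Returns the ids of all events that duplicate an event with a lower id. */
    static List<Integer> idsToDelete(List<Event> events) {
        Map<List<Object>, Integer> lowestId = new HashMap<>();
        for (Event e : events) {
            lowestId.merge(e.groupKey(), e.id, Math::min);
        }
        List<Integer> doomed = new ArrayList<>();
        for (Event e : events) {
            if (lowestId.get(e.groupKey()) != e.id) {
                doomed.add(e.id);
            }
        }
        Collections.sort(doomed);
        return doomed;
    }
}
```

The resulting ids would still have to be removed child tables first (measurements, test steps, test information, event details, then events) to respect the foreign keys, as in the C# code above.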

SQL WHERE LIKE clause in JSF managed bean

Hi, I have a managed bean that runs MySQL queries. The problem is that the SQL statement behaves like an '=' condition instead of 'LIKE'.
Here is the code in my managed bean.
Connection con = ds.getConnection();
try{
if (con == null) {
throw new SQLException("Can't get database connection");
}
}
finally {
PreparedStatement ps = con.prepareStatement(
"SELECT * FROM Clients WHERE Machine LIKE '53'");
//get customer data from database
ResultSet result = ps.executeQuery();
con.close();
List list;
list = new ArrayList();
while (result.next()) {
Customer cust = new Customer();
cust.setMachine(result.getLong("Machine"));
cust.setCompany(result.getString("Company"));
cust.setContact(result.getString("Contact"));
cust.setPhone(result.getLong("Phone"));
cust.setEmail(result.getString("Email"));
//store all data into a List
list.add(cust);
}
return list;
Here the SELECT command does not return all the rows whose 'Machine' column contains 53, but if I enter the complete number (53544) in place of 53, the row is returned. I am confused!
Also, if I replace the above SELECT statement with SELECT * FROM Clients, the entire table is stored in the list. Any ideas?
Use wildcards:
LIKE '%53%' - it contains 53
LIKE '%53' - it ends with 53
LIKE '53%' - it starts with 53
Without a wildcard, LIKE '53' matches only the exact value 53, which is why it behaves like '='.
You can also use _ if you want to match exactly one character.
You can find a description HERE
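If the search fragment comes from user input, it is usually safer to bind the pattern as a parameter and escape any % or _ the user typed. Here is a minimal Java sketch under some assumptions: LikePatterns and its methods are hypothetical helper names, '!' is an arbitrarily chosen escape character declared with an ESCAPE clause, and the Clients/Machine names are taken from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class LikePatterns {
    private static final char ESC = '!';

    /** Escapes %, _ and the escape character so the text is matched literally. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (c == '%' || c == '_' || c == ESC) {
                sb.append(ESC);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    static String contains(String raw)   { return "%" + escape(raw) + "%"; }
    static String startsWith(String raw) { return escape(raw) + "%"; }
    static String endsWith(String raw)   { return "%" + escape(raw); }

    /** Returns the Machine values that contain the given fragment. */
    static List<Long> machinesContaining(Connection con, String fragment) throws SQLException {
        String sql = "SELECT Machine FROM Clients WHERE Machine LIKE ? ESCAPE '!'";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, contains(fragment));
            // Read every row before the statement and result set are closed.
            try (ResultSet rs = ps.executeQuery()) {
                List<Long> machines = new ArrayList<>();
                while (rs.next()) {
                    machines.add(rs.getLong("Machine"));
                }
                return machines;
            }
        }
    }
}
```

For example, contains("53") produces the pattern %53%, which would match 53544.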
Your SQL query should be
"SELECT * FROM Clients WHERE Machine LIKE '%53%'"

JSON results are returned in a different order than expected

I am following Phil Haack's example on using jQuery Grid with ASP.NET MVC. I have it working and it works well, except for one minor problem. When I sort the columns by something other than the ID, the JSON data returned from the server is wrong. Here is my controller method.
[HttpPost]
public ActionResult PeopleData(string sidx, string sord, int page, int rows)
{
int pageIndex = Convert.ToInt32(page) - 1;
int pageSize = rows;
int totalRecords = repository.FindAllPeople().Count();
int totalPages = (int)Math.Ceiling((float)totalRecords / (float)pageSize);
var people = repository.FindAllPeople()
.OrderBy(sidx + " " + sord)
.Skip(pageIndex * pageSize)
.Take(pageSize);
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows = (
from person in people
select new
{
i = person.PersonID,
cell = new List<string> { SqlFunctions.StringConvert((double) person.PersonID), person.PersonName }
}
).ToArray()
};
return Json(jsonData);
}
When I sort by PersonID in the jsGrid table, I get this data back (I just used the name of the current ID as the name - e.g. 1, One; 2, Two, etc.)
{"total":1,"page":1,"records":6,"rows":[{"i":1,"cell":[" 1","One"]},{"i":2,"cell":[" 2","Two"]},{"i":3,"cell":[" 3","Three"]},{"i":4,"cell":[" 4","Four"]},{"i":5,"cell":[" 5","Five"]},{"i":6,"cell":[" 6","Six"]}]}
When I sort by PersonName, however, every other row has the order (the ID vs. the name) flipped around. So when I show it in the table, the PersonName is in the ID column and the ID is in the person column. Here is the JSON result.
{"total":1,"page":1,"records":6,"rows":[{"i":5,"cell":[" 5","Five"]},{"i":4,"cell":["Four"," 4"]},{"i":1,"cell":[" 1","One"]},{"i":6,"cell":["Six"," 6"]},{"i":3,"cell":[" 3","Three"]},{"i":2,"cell":["Two"," 2"]}]}
Anybody have any insight into what I've done wrong that causes this to happen?
Update
So, I have learned that my array values are flipped for every other item in the array. For example, if I populate my database with:
[A, B, C]
then for every even-numbered result (or odd, if you're counting from 0), my data is coming back:
[C, B, A]
So, ultimately, my JSON row data is something like:
[A, B, C]
[C, B, A]
[A, B, C]
[C, B, A]
...etc
This is always happening and always consistent. I am going a bit crazy trying to figure out what's going on because it seems like it should be something simple.
I have the same problem with my data, which is of INT type.
If the elements in my queue (A, B, C) are of NVARCHAR type, I do not have this problem.
So the problem is apparently in the SqlFunctions.StringConvert function.
Try the method described here. If you use fields instead of properties in repository.FindAllPeople(), look at the commented part of the code, which uses FieldInfo and GetField instead of PropertyInfo and GetProperty.
I found the solution here: linq to entities orderby strange issue
The issue ultimately stems from the fact that Linq to Entities has trouble handling strings. When I was using the SqlFunctions.StringConvert method, this was incorrectly performing the conversion (although, I must admit that I don't fully understand why the order was then switched around).
In either case, per the above post, the solution for fixing the problem was to do the selection locally so that I could "force" Linq to Entities to work with strings properly. From this, my final code is:
var people = repository.FindAllPeople()
.OrderBy(sidx + " " + sord)
.Skip(pageIndex * pageSize)
.Take(pageSize);
// Due to a problem with Linq to Entities working with strings,
// all string work has to be done locally.
var local = people.AsEnumerable();
var rowData = local.Select(person => new
{
id = person.PersonID,
cell = new List<string> {
person.PersonID.ToString(),
person.PersonName
}
}
).ToArray();
var jsonData = new
{
total = totalPages,
page = page,
records = totalRecords,
rows = rowData
};
return Json(jsonData);
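The same shape (sort, skip, take, and only then build the string cells in application code) can be sketched in Java for readers more familiar with that language. This is an analogy rather than a translation of the controller: GridRows, Person, totalPages and pageCells are hypothetical names, and an in-memory list stands in for the repository.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class GridRows {

    static final class Person {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Integer ceiling of totalRecords / pageSize. */
    static int totalPages(int totalRecords, int pageSize) {
        return (totalRecords + pageSize - 1) / pageSize;
    }

    /** Sorts, pages, and only then converts each row to its [id, name] string cells. */
    static List<List<String>> pageCells(List<Person> people, Comparator<Person> order,
                                        int page, int pageSize) {
        return people.stream()
                .sorted(order)
                .skip((long) (page - 1) * pageSize) // page is 1-based, as in the grid
                .limit(pageSize)
                .map(p -> Arrays.asList(String.valueOf(p.id), p.name))
                .collect(Collectors.toList());
    }
}
```

Because the conversion to strings happens locally and in a fixed order, each cell list always has the ID first and the name second, whatever the sort column is.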

LINQ variable to list of string without using column names?

In a C# ASP.NET MVC project, I'm trying to make a List<string> from a LINQ variable.
Now this might be a pretty basic thing, but I just cannot get it to work without using the actual column names for the data in that variable. The thing is that, in the interest of making the program as dynamic as possible, I'm leaving it up to a stored procedure to get the data out. There can be any number of columns with arbitrary names, depending on where the data is fetched from. All I care about is taking all of their values into a List<string>, so that I can compare user-input values with them in the program.
Pointing to the columns by their names in the code means I'd have to make dozens of overloaded methods that all basically do the same thing. Below is non-functioning pseudocode, but it should convey the idea of what I mean.
// call for stored procedure
var courses = db.spFetchCourseInformation().ToList();
// if the data fails a check on a single row, it will not pass the check
bool passed = true;
foreach (var i in courses)
{
// each row should be cast into a list of string, which can then be validated
// on a row-by-row basis
List<string> courseRow = new List<string>();
courseRow = courses[i]; // yes, obviously this is wrong syntax
int matches = 0;
foreach (string k in courseRow)
{
if (validator.checkMatch(courseRow[k].ToString()))
{
matches++;
}
}
if (matches == 0)
{
passed = false;
break;
}
}
Now below is an example of how I currently have to do it because I need to use the names for the columns
for (int i = 0; i < courses.Count; i++)
{
int matches = 0;
if (validator.checkMatch(courses[i].Name))
matches++;
if (validator.checkMatch(courses[i].RandomOtherColumn))
matches++;
if (validator.checkMatch(courses[i].RandomThirdColumn))
matches++;
if (validator.checkMatch(courses[i].RandomFourthColumn))
matches++;
/* etc...
* etc...
* you get the point
* and one of these for each and every possible variation from the stored procedure, NOT good practice
* */
Thanks for help!
I'm not 100% sure what problem you are trying to solve (matching user data to a particular record in the DB?), but I'm pretty sure you're going about this in slightly the wrong fashion by putting the data in a List.
It should be possible to get your user input into an IDictionary, with the key being the column name and the value being the input data field.
Then when you get the data from the SP, you can get the data back in a DataReader (a la http://msmvps.com/blogs/deborahk/archive/2009/07/09/dal-access-a-datareader-using-a-stored-procedure.aspx).
DataReaders are indexed on column name, so if you run through the keys in the input data IDictionary, you can check the DataReader to see if it has matching data.
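using (SqlDataReader reader = Dac.ExecuteDataReader("CustomerRetrieveAll", null))
{
    while (reader.Read())
    {
        bool rowMatches = true;
        foreach (var key in userInputDictionary.Keys)
        {
            // reader[key] looks the column up by name
            var data = reader[key];
            if (!Equals(data, userInputDictionary[key]))
            {
                rowMatches = false;
                break;
            }
        }
        // rowMatches is true when every user-supplied value equals the column value
    }
}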
Still not sure about the problem you are solving but, I hope this helps!
A little creative reflection should do the trick.
var courses = db.spFetchCourseInformation();
var values = courses.SelectMany(c => c.GetType().GetProperties() // gets the properties for your object
.Select(property => property.GetValue(c, null))); // gets the value of each property
List<string> stringValues = new List<string>(
values.Select(v => v == null ? string.Empty : v.ToString()) // some of those values will likely be null
.Distinct()); // remove duplicates
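The same reflection idea, sketched in Java for comparison: read every instance field of a row object without naming any column, turn the values into strings, and run a check over them. RowValues, Course, valuesOf and rowPasses are hypothetical names for this sketch, and the predicate stands in for validator.checkMatch.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class RowValues {

    /** Example row type; any class with instance fields would work. */
    static final class Course {
        final String name;
        final Integer credits;
        final String room;

        Course(String name, Integer credits, String room) {
            this.name = name;
            this.credits = credits;
            this.room = room;
        }
    }

    /** Returns the string form of every instance field of `row`; null becomes "". */
    static List<String> valuesOf(Object row) {
        List<String> values = new ArrayList<>();
        for (Field field : row.getClass().getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue; // skip compiler-generated and static fields
            }
            field.setAccessible(true);
            try {
                Object value = field.get(row);
                values.add(value == null ? "" : value.toString());
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + field.getName(), e);
            }
        }
        return values;
    }

    /** A row passes when at least one of its values satisfies the check. */
    static boolean rowPasses(Object row, Predicate<String> check) {
        return valuesOf(row).stream().anyMatch(check);
    }
}
```

Reflection is slower than direct property access, so for large result sets it may be worth caching the field list per type.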