Extract all words from text field in mysql - mysql

I have a table that contains text fields. In those fields I store text. There are around 20 to 50 sentences in each field depending on the row. I am making an auto-complete HTML object with HTML and PHP, and I would like to start typing the beginning of a word and that the database return sentences containing those words (Like Microsoft office 2007/2010 navigation pane).
I need mysql to return those words or sentences as a separate result, so i can manipulate them further.
Example:
--------------------------------------------------------------------
| id | title |content |
--------------------------------------------------------------------
1 | test 1 | PHP is a very nice language and has nice features.
2 | test 2 | Spain is a nice country to visit and has nice language.
3 | test 3 | Perl isn\'t as nice a language as PHP.
I need mysql query to return following as different result:
1,"nice language"
1,"nice features"
2,"nice country"
2,"nice langugage"
3,"nicea a language"
Here is my sql query:
SELECT id, SUBSTR(content,POSITION('nice' IN content),50)
FROM entries
MATCH (title,entry) AGAINST ('nice' WITH QUERY EXPANSION)

New Answer
OP is actually asking nothing to do with php and javascript - his question concerns doing string manipulation directly within MySQL.
String manipulation isn't really the main focus of a DBMS. When dealing with "words" in a fluid text sense, there's a lot of logic required to determine where the next word boundary is, and you don't want your database doing this really. Plus, any queries written to do this will probably be incredibly difficult to read.
It depends exactly what you are doing, but it's quite likely that a DB only approach will be slower because there will be more function calls: SQL functions are pretty limited.
And for re-usability and best practice, what if you wanted to change your database in the future to say MongoDB? You'd need to re-write the whole damned awkward query.
No, my suggestion would be to pull the whole value using standard MySQL into PHP, throw it into PCRE, very simple regex, job done. It's better to show what you're actually doing in your PHP code as it's more "intention revealing".
At least 33% of a developer's work is picking the right tool for the job. PHP is the right tool in this example.
Original Answer
You have included the tags php and javascript, so I'm guessing (although your question needs more clarification on this) that you obviously want this 'autocomplete' running client-side. So as a result, you have to get your data from server-side to client-side first.
Twitter Bootstrap has something really cool called Typeahead. This uses JavaScript to perform (what I think) you require: the example on that page shows how you can type a country and it'll auto-complete it for you. It looks like this:
How do you get this working? Include the required JavaScript file first, and then write your HTML.
Here's some from the source code of the bootstrap page so you can see how it works:
<input type="text" data-provide="typeahead" data-items="4" data-source='["Alabama","Alaska","Arizona","Arkansas","California"]'>
Can you see how the data-source attribute is the one that gives the typeahead the information you want? You want to connect to MySQL, grab your data, and shove these into the data-source array for the JavaScript to work with, as above.
So, on your page load, you connect to MySQL and you pull all the relevant strings you would like to be "auto-complete-able" from the Database. You then put these as new Data attributes for the typeahead, and that's pretty much it!
--
Edit: There's a fork of twitter bootstrap's typeahead that allows AJAX calls, so you could use this to perform the data retrieval asynchronously (if you can figure it out, I'd recommend this approach).

Related

Generating truth tables for basic logic circuits

Let's say I have a text file that looks like this:
<number> <name> <type> <inputs...>
1 XOR1 XOR A B
2 SUM XOR 1 C
What would be the best approach to generate the truth table for this circuit?
That depends on what you have available, and how big your file is.
Perl is optimized for reading files and generating simple text output. It doesn't have a library of boolean operators, but they're easy enough to write. I'd use that if I just wanted text-in, text-out.
If I wanted to display the data online AND generate a results file, I'd use PHP to read the data and write the table to a CSV file that could either be opened in Excel, or posted online in an HTML table.
If your data is in a REALLY BIG data file, I'd use SQL.
If your data is in a really huge file that you want to be accessible to authorized users online, and you want THEM to be able to create truth tables, I'd use Oracle's APEX to create an easy interface for them to build their own truth tables and play around with the data without altering it.
If you're in an electrical engineering environment, use the tools designed for your problem -- Verilog or similar.
Whatcha got? Whatcha wanna do with it?
-- Ada
I prefer using C#. I already have the code to 'parse' the input
text file. I just don't know where to start in terms of
actually 'simulating' it. The output can simply be a text file
with inputs and output values – Don 12 mins ago
How many inputs and how many outputs in the circuit you want to simulate?
The size of the simulation determines how it can most easily be run. If the circuit is small(ish), you can enter the inputs and circuit values into vector arrays, then cross them to get the output matrix.
Matlab is ideal for this, as it was written for processing arrays.
Again: Whatcha got, and whatcha wanna do with it?
-- Ada

How to create RegEx with SubMatches of the same Match that capture 2 different types of output?

I'm trying to get my Jira data via JSON REST API into Excel, i.e. using VBA, and I'm parsing JSON output using RegEx. There are plenty of useful tutorials on the web, and after a couple of days I do have more or less working solution I'm happy with, except one minor obstacle. Long story short:
Among many issue fields I need friendly Assignee name, but some issues in my projects may be Unassigned, that obviously results in TWO VERY different kinds of JSON output:
Unassigned issue:
..."assignee":null,"updated"...
Assigned issue:
"assignee":{
"self":...
<Lots of NOT needed fields here>
...
},
"displayName":"Doe, John", <-- That's what I need, name only part
"active":...
<Lots of NOT needed fields here>
...
},
"updated"...
Well, I suppose that something like:
"assignee".*?"displayName":"(.*?)"|"assignee":(.*?),"updated"
will handle the job by producing TWO possible Matches, but... Is there a way to create RegEx where ANY of output options will result in SubMatches of ONE Match?
I'm a total newbie to RegEx, so sorry if the wording of my question is silly due to incorrectly used terms. Anyway, I hope the sample part is more or less clear, and I'll be extremely grateful for useful suggestions.
After an hour of tryouts on regex101 I ended up with the following RegEx:
"assignee":(null|.*?"displayName":"(.*?)","active")
Probably it's ugly and may be improved - but it DOES the job, and does NOT ruin in the process the indexes of subsequent Matches in collection, therefore keeping the rest of code working as it is now.

Passing a command line argument (as a string) into my Perl script

I'm extremely new at Perl and trying to prove I can pick it up quickly. What I was asked to do is add a string as an argument on my command line, and then feed that into my script. From there it is supposed to search a MySQL table I've made for matches in one column, and spit the contents of another column into an array. It was suggested I used the Getops:Std but I'm uncertain how exactly to do that, and if that's the best technique.
For example: I have a MySQL table with car manufacturers, and car models. I want to run, Perl myscript.pl Ford, and then have it shoot me back an array with
Mustang
Escape
Focus
But I'm uncertain how to get that string input in the first place. Would Getops:Std be best? If so how would it be written? I'm picking this up quickly, but I've been at it less than a week, so the simpler the explanation, the better.
Edit: Basically I was confused why it was suggested I should use GetOpts::Std for this. It seems to be completely inappropriate for what I'm trying to do.
GetOpts::Std is overkill for this. Your command line arguments are in #ARGV. If you haven't been able to work that out after a week, then you need better references for Perl.
The first argument will be in $ARGV[0], the second in $ARGV[1] , and so on.
You should check the DBI module. Google for some tutorial out there.
Then try to write your script and post more specific questions with some code if you need more help.

Parsing HTML content into a MySQL database using a parser

I want to be able to parse specific content from a website into a mySQL database. For example, on site http://allrecipes.com/Recipe/Fluffy-Pancakes-2/Detail.aspx I want to parse into my database (which has a table with columns RecipeName, Ingredients 1-10).
So basically my database will contain the name and all the ingredients for that recipe. There is no need to edit the content, simply parse them in as is (i.e. 3/4 cup milk) since i am using character in my database.
How exactly do I go about doing this? I was looking a pre-built parsers and it seems its tough to find one that's easy to use since I am fairly new to programming. Of course, I can manually enter values in but I want to parse them in.
Would it be possible to just parse this content and write a file that has a RecipieName, Ingredient string which I can then parse into my database? Or should I just do it directly into the database? I am unsure as to how to connect a database to a parser also directly, but I might be able to find some information online.
Basically, I am looking for help on how to exactly go about doing this since I am not very well versed in programming and this seems to be a lot more complicated than it might be.
I am using Java as my main language right now, although I can't say I am very good at it. But I should be able to understand the basic concepts.
Any suggestions on what parser to use or how to do this?
Thanks!
This is how I would do it in PHP. This is almost certainly NOT the most efficient way to do it, nor has it been debugged.
function parseHTML($rawHTML){
$startPosition = strpos($rawHTML,'<div class="ingredients"'); //Find the position of the beginning of the ingredients list, return the character number.
$endPosition = strpos($rawHTML,'</div>',$startPosition); //Find the position of the end of the ingredients list, begin searching from the beginning of the list (found in step 1)
$relevantPart = substr($rawHTML,$startPosition,$endPosition); //Isolate the ingredients list
$parsedString = strip_tags($relevantPart); //Strip the HTML tags off of the ingredients list
return $parsedString;
}
Still to be done: You say you have a mySQL database with 10 separate ingredients columns. This code outputs everything as one big string. You would have to change the strip_tags($relevantPart) function to strip_tags($relevantPart,"<li>"). That would let the <li> tags through. Then, you would have to loop through every <li> tag, performing a similar function to this. It shouldn't be too hard, but I don't feel comfortable writing it with no functioning PHP server.

LinqToSql and full text search - can it be done?

Has anyone come up with a good way of performing full text searches (FREETEXT() CONTAINS()) for any number of arbitrary keywords using standard LinqToSql query syntax?
I'd obviously like to avoid having to use a Stored Proc or have to generate a Dynamic SQL calls.
Obviously I could just pump the search string in on a parameter to a SPROC that uses FREETEXT() or CONTAINS(), but I was hoping to be more creative with the search and build up queries like:
"pepperoni pizza" and burger, not "apple pie".
Crazy I know - but wouldn't it be neat to be able to do this directly from LinqToSql? Any tips on how to achieve this would be much appreciated.
Update: I think I may be on to something here...
Also: I rolled back the change made to my question title because it actually changed the meaning of what I was asking. I know that full text search is not supported in LinqToSql - I would have asked that question if I wanted to know that. Instead - I have updated my title to appease the edit-happy-trigger-fingered masses.
I've manage to get around this by using a table valued function to encapsulate the full text search component, then referenced it within my LINQ expression maintaining the benefits of delayed execution:
string q = query.Query;
IQueryable<Story> stories = ActiveStories
.Join(tvf_SearchStories(q), o => o.StoryId, i => i.StoryId, (o,i) => o)
.Where (s => (query.CategoryIds.Contains(s.CategoryId)) &&
/* time frame filter */
(s.PostedOn >= (query.Start ?? SqlDateTime.MinValue.Value)) &&
(s.PostedOn <= (query.End ?? SqlDateTime.MaxValue.Value)));
Here 'tvf_SearchStories' is the table valued function that internally uses full text search
Unfortunately LINQ to SQL does not support Full Text Search.
There are a bunch of products out there that I think could: Lucene.NET, NHibernate Search comes to mind. LINQ for NHibernate combined with NHibernate Search would probably give that functionality, but both are still way deep in beta.