Algorithm for suggesting new content - mysql

I have a list of articles with increasing ids. Some ids are missing because articles were deleted, so the ids always go up but not always in steps of 1.
I am trying to dynamically recommend content, such as related articles. I don't want to recommend the same articles everywhere, but I do want to make sure that:
1) Every article is recommended on at least one other article's page
2) A page always recommends the same article, so purely random selection does not help.
Is there a good way to do this?
Thanks!!

In your SQL, think of it like this:
- Select the list of categorically similar stories that match some key
- Grab a random one
However, the only way to ENSURE every article is attached to another is to set up a key list. Make the key list hold every article on the left and, on the right, a randomly assigned related article, again chosen within the same category of course. Make this a temp table, so that when you add articles the left side grows and the right side re-randomizes the related articles while still using every article in the list.
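A minimal sketch of such a key list, written in Python rather than SQL and under a few assumptions: the function name `build_recommendations`, the fixed `seed`, and the sample ids are all hypothetical, and categories are ignored (you could run it once per category). It shuffles the ids with a fixed seed and lets each article recommend the next one in the shuffled cycle, so every article appears exactly once on the right-hand side and a page keeps the same recommendation until the list is rebuilt.

```python
import random


def build_recommendations(article_ids, seed=0):
    """Return a dict mapping each article id to one other article id.

    Every id appears exactly once as a recommendation, and the result
    is the same on every call with the same ids and seed.
    """
    ids = sorted(article_ids)  # gaps in the id sequence do not matter
    if len(ids) < 2:
        return {}  # nothing else to recommend
    rng = random.Random(seed)  # fixed seed keeps the pairing stable
    rng.shuffle(ids)
    # Each article recommends the next one in the shuffled cycle;
    # the last article wraps around to the first.
    return {ids[i]: ids[(i + 1) % len(ids)] for i in range(len(ids))}


article_ids = [1, 2, 4, 7, 8, 11]  # hypothetical ids with gaps
recs = build_recommendations(article_ids)
```

Rebuilding the mapping after adding an article reshuffles the pairs, which matches the temp table idea above; storing the result in a table keeps page loads cheap.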

Related

Do I need a database to handle my website content?

So I'm building a website that contains information about a bunch of different animal species. I will have a list of 500 items, that should be able to be filtered and sorted by different criteria. For example, I will have a 'country selection' option. If Brazil is selected, the Capuchin monkey among other animals (living in Brazil) should be added to the list.
I could see myself making a list with 50 species with no problem, as the HTML would be manageable. But would having 500 items in a list with filterability even be possible without using some sort of database?
I was thinking of just pairing animal items from the list with certain filter criteria. For example, Capuchin monkey with "Brazil", "Mammal", "Omnivore", etc.
And when e.g. "Mammal" is selected in the filter, all animals paired with that property (all mammals in the list) are added to the list, and animals not paired with that property are removed from it.
As you can probably tell, I'm really uneducated on how to go about creating this filterable list. Down the road I might even look into adding a search function.
After plugging in all the content, I would never need to change anything. I've read that databases should only be used if you have dynamic content.
I wouldn't list all 500 items on the same page, as that would make it very slow. I would have 10 items per page.
I don't need a solution per se. I just wish to be pushed in the right direction.
Should I look into MySQL? Is a filterable list of 500 items possible with just HTML/CSS/JavaScript? I am somewhat familiar with JavaScript, and have read that JSON might be able to provide the things I need.
Sorry if my question is vague or if I'm in the wrong anywhere (this is my first post). Please ask for any clarification and any advice or suggestion is greatly appreciated.
Thanks,
Manne
No, you don't need a database. Have a look at this very robust jQuery plugin that will easily allow you to sort/filter/search 500 items in JavaScript alone:
https://datatables.net/
There are examples that are powered from JSON alone so I would suggest you simply store your data in a JSON file until you grow large enough that you need to change that (if you ever do).
Here is an example where the data is pulled from a .txt file:
https://datatables.net/examples/data_sources/ajax.html
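To show how little logic the filtering itself needs, here is a rough sketch of filtering and paginating records loaded from JSON. It is written in Python only to illustrate the idea; on the site the same steps would run in JavaScript in the browser, or be handled by the plugin. The field names (`countries`, `class`, `diet`), the helper names and the sample records are assumptions made up for this example.

```python
import json

# Hypothetical sample data; in practice this would live in a .json file.
ANIMALS_JSON = """
[
  {"name": "Capuchin monkey", "countries": ["Brazil"],
   "class": "Mammal", "diet": "Omnivore"},
  {"name": "Jaguar", "countries": ["Brazil", "Mexico"],
   "class": "Mammal", "diet": "Carnivore"},
  {"name": "Red kangaroo", "countries": ["Australia"],
   "class": "Mammal", "diet": "Herbivore"}
]
"""


def filter_animals(animals, country=None, animal_class=None, diet=None):
    """Keep only the animals that match every filter that is set."""
    result = []
    for animal in animals:
        if country is not None and country not in animal["countries"]:
            continue
        if animal_class is not None and animal["class"] != animal_class:
            continue
        if diet is not None and animal["diet"] != diet:
            continue
        result.append(animal)
    return result


def paginate(items, page, per_page=10):
    """Return the items for a 1-based page number."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


animals = json.loads(ANIMALS_JSON)
brazil_page_1 = paginate(filter_animals(animals, country="Brazil"), page=1)
```

With 500 records a plain loop like this is more than fast enough, so there is no performance reason to reach for a database.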

Storing site data in columns or rows

This is a question about best practice for storing the data of a web page, such as texts, image URLs, links, etc.
I have a CMS where you can create web pages. Here you can edit texts and upload images. In the future it would also be nice to be able to "add new elements", add links to a-tags, etc.
I need a robust and flexible solution that also has good performance, both in getting and receiving this data.
Let's say I have 1000 pages, each with around 25 elements that can be updated and stored in the database.
Alternative 1)
Create a table with one column for each element on these pages, for example columns like:
title_1, title_2,image_1,image_2.
Here we have a set of columns that we can update, these we can use on the web page.
Alternative 2)
Create 1 table with the columns (id, namespace, page_id, data)
For each element on the page, I store a namespace together with the page_id so that each entry is unique. The data column can hold any kind of information: text, links, etc.
What do you suggest as a good solution for this issue? I'm of course also open to other alternatives.
Thanks!
I would recommend option two, with the addition of a column identifying the element id or type, if indeed the element id is somehow comparable. That is to say, if anchor text (say) is always stored as element id = 4, then you would want to store element id = 4 with it so that you could compare anchor texts across multiple documents.
If, on the other hand (and this is the scenario I imagine is more likely), you may have 1-25 elements on a page and each of them could be different (eg document one has three anchor texts and four images, document two has one anchor text and no images, etc) it would make sense to add an element_type_id table that stores a bit of information about the element types. This is assuming that you ever have any interest in comparing (say) images across multiple documents, or anchor texts across multiple documents, etc.
Another thing to consider: if you are likely to see the same element over and over again, it actually makes more sense to effectively parameterize those elements by way of a lookup table. So basically store each (say) unique anchor text in one table and reference its id in your actual data table.
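As a rough illustration of option two with an element type lookup, here is a sketch using Python's built-in sqlite3 module. The table names (`element_type`, `page_element`), the column names other than those in the question, the helper functions and the sample rows are all assumptions for this example, not a recommended final schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE element_type (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE page_element (
    id              INTEGER PRIMARY KEY,
    page_id         INTEGER NOT NULL,
    namespace       TEXT NOT NULL,
    element_type_id INTEGER NOT NULL REFERENCES element_type(id),
    data            TEXT,
    UNIQUE (page_id, namespace)  -- one row per element per page
);
""")
conn.executemany(
    "INSERT INTO element_type (id, name) VALUES (?, ?)",
    [(1, "title"), (2, "image"), (3, "anchor_text")],
)
conn.executemany(
    "INSERT INTO page_element (page_id, namespace, element_type_id, data) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "title_1", 1, "Welcome"),
        (1, "image_1", 2, "/img/hero.png"),
        (2, "title_1", 1, "About us"),
    ],
)


def elements_for_page(conn, page_id):
    """Fetch every element of one page as a namespace -> data dict."""
    rows = conn.execute(
        "SELECT namespace, data FROM page_element WHERE page_id = ?",
        (page_id,),
    ).fetchall()
    return dict(rows)


def data_by_type(conn, type_name):
    """Compare one kind of element across all pages."""
    return conn.execute(
        "SELECT pe.page_id, pe.data FROM page_element pe "
        "JOIN element_type et ON et.id = pe.element_type_id "
        "WHERE et.name = ? ORDER BY pe.page_id",
        (type_name,),
    ).fetchall()
```

Rendering a page then needs a single query on page_id, and adding a new kind of element means inserting a row into the type table rather than altering the schema.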
If I may add one additional thing: SO may not be the best place for the particular question you are asking. I'm not totally sure of that and maybe I'm wrong... but I would poke around the Stack Exchange network and see if other forums more closely deal with the type of question you are asking. At the very least, I'd observe that your question is fairly vague and the goal of achieving a "robust and flexible solution that also {has} good performance. In both getting/receiving this data." is not likely to be accomplished simply by asking for advice on SO. There is a LOT that goes into data architecture, and certainly many of the details I would consider important in designing this myself are not present in your question. And if you're not sure what those details are, I am not sure if SO is really the best place to set about learning them. I think https://softwareengineering.stackexchange.com/ may be a better fit for this question.
Just my opinion, and I could be wrong. Either way, I would consider learning a bit about database normal forms (http://www.bkent.net/Doc/simple5.htm or Google it) as well as doing a little research on the types of design considerations that go into building a database (an old but still good SO article on that is here: What are the most important considerations when designing a database?)

Using SQL to search a column for keywords from another table and tag the column with an attribute

Basically, I have a table of keywords and a table of posts that I want tagged with attributes for display. For example, I want to draw a green border if #green# is present in the post. Is there a clean way for the DB to do this internally? I am prepared to do it all in C++ by fetching the entire table of keywords, putting it in a trie and scanning each word, but this approach seems a bit inelegant.
You are mixing storage and display concerns. There be dragons. You are looking for content like '%#green#%'. It would be better to set a bit column or some other flag as the content was inserted or updated from the application side. When reading, retrieve this information along with the content. Let your display logic do the colouring.
Take a look at "separation of concerns" (SoC), which is closely related to the single responsibility principle in the S.O.L.I.D. practices.
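A small sketch of the "flag it on write" idea, in Python for brevity rather than C++. The keyword table `KEYWORD_ATTRIBUTES`, the attribute names, the `#word#` pattern and the dict used as a stand-in for the posts table are all assumptions for illustration.

```python
import re

# Hypothetical keyword table: marker word -> display attribute.
KEYWORD_ATTRIBUTES = {"green": "border-green", "urgent": "border-red"}

# Markers are assumed to look like #word#.
TAG_PATTERN = re.compile(r"#(\w+)#")


def attributes_for(content):
    """Work out the display attributes once, when the post is written."""
    found = set(TAG_PATTERN.findall(content))
    return sorted(KEYWORD_ATTRIBUTES[k] for k in found if k in KEYWORD_ATTRIBUTES)


def save_post(store, post_id, content):
    """Store the content together with its precomputed attributes.

    `store` is a plain dict standing in for the posts table.
    """
    store[post_id] = {
        "content": content,
        "attributes": attributes_for(content),
    }


posts = {}
save_post(posts, 1, "Sale today #green# only")
save_post(posts, 2, "Nothing special here")
```

The display code then reads `posts[1]["attributes"]` and never has to scan the text or run a LIKE query at read time.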
Hope this helps.

Is it good to generate dynamic keywords every time the page is loaded? (SEO)

For example, generate 10 random keywords from the web content.
Thanks..
UPDATED.
For SEO.
I'm assuming you mean for display ...
The only advantage I can think of is that search engines might possibly go "This page is updated frequently, we should check it more often", maybe. I'm not up enough on the latest search engine workings to say if this would actually work or not. I wouldn't trust it to.
Disadvantages depend on usage, but I can't picture any scenario where it's immensely helpful to be "random". If you better describe the reasoning that led you to this conclusion, we can tell you whether it's right or not. My gut feeling however is ... no. If you want to display summary data, then "random" shouldn't fit into the equation, or at least, not at the top level. You should first filter the content based on some useful criteria, then apply random at the last step if necessary.
Example Process:
Filter out words on the stop list (if, is, you, etc).
Count occurrences of words, prefer words with high occurrence counts.
Prefer words which aren't featured prominently in other content items.
If there are more than 10 words remaining, randomly select 10 from the better scorers.
Keywords for this post: I, of, is, it, to, but, you, on, we, if.
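The example process above could be sketched roughly as follows. The stop list is a tiny illustrative sample, the scoring formula is just one simple way to combine "frequent here" with "not prominent elsewhere", and taking the candidate pool as the top `2 * limit` scorers is an assumption about what "the better scorers" means.

```python
import random
import re
from collections import Counter

# Tiny illustrative stop list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "if", "is", "it", "of", "on", "the", "to", "you"}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def pick_keywords(text, other_texts=(), limit=10, seed=None):
    # Step 1: filter out words on the stop list.
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    # Step 2: count occurrences in this content item.
    counts = Counter(words)
    # Step 3: penalize words that are prominent in other content items.
    other_counts = Counter(w for t in other_texts for w in tokenize(t))
    scores = {w: c / (1 + other_counts[w]) for w, c in counts.items()}
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    # Step 4: only randomize at the end, among the better scorers.
    pool = ranked[: limit * 2]
    if len(pool) <= limit:
        return pool
    return random.Random(seed).sample(pool, limit)


text = "the cat sat on the mat and the cat slept"
keywords = pick_keywords(text)
```

Passing a fixed `seed` (for example one derived from the page id) keeps the selection stable between page loads, which is probably what you want if the keywords are meant for search engines.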

User interface for addition/deletion of items to a list?

I have a ban list that I'm building as part of an application that displays articles. This ban list will contain keywords which, if found in an article, would lead to the article being disabled (the article will not be displayed on the front-end).
I'm having a tough time visualizing the UI. I could always display a textarea and ask the user to enter comma-separated keywords; when they want to delete a keyword, the textarea would be presented again and they could edit the entered keywords. But I find this idea very unfriendly to the user.
My question is how to program the UI so that it's easy to add new keywords. I would also like to be advised on a nifty way of showing the existing keywords and of deleting them.
This ban list will be part of the admin panel/backend and will be accessible only to the site administrator.
How many banned words will there be? If only a handful, then your suggestion of a comma-separated list makes sense, perhaps sorted alphabetically when re-presented for editing.
I speculate that the list could become quite extensive, in which case you would need to present several pages of excluded words: some form of paginated, alphabetical display, with a little (x) beside each entry to permit deletion.
A separate entry field that accepts single words, adds them to the list and then displays the relevant page might work.
One other thought: will your list contain profane or otherwise potentially offensive words? Is it possible that re-presenting the list could itself be offensive in some way? You may need to find a way to O??????e the O??????e. Which might present a few challenges.
I would display them as a list, with a textfield at the top or bottom to add new ones.
Add an icon to each to let the user delete it, and implement both adding and deletion by Ajax: then you can sort the list before redisplaying it.
(Actually, you could do that all in the browser with JavaScript and not use Ajax: in that case you'll have to pass the whole list to the server when it's needed).
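Whichever UI you pick, the list behind it can stay very simple. Here is a rough sketch of a server-side model that the add field, the per-entry delete icon and the paginated alphabetical display could call into; the class name `BanList`, its method names and the page size are hypothetical.

```python
import bisect


class BanList:
    """Alphabetically sorted, duplicate-free list of banned keywords."""

    def __init__(self, words=()):
        cleaned = {w.strip().lower() for w in words if w.strip()}
        self._words = sorted(cleaned)

    def add(self, word):
        """Insert a keyword in sorted position; return False if ignored."""
        word = word.strip().lower()
        if not word:
            return False
        i = bisect.bisect_left(self._words, word)
        if i < len(self._words) and self._words[i] == word:
            return False  # already banned
        self._words.insert(i, word)
        return True

    def remove(self, word):
        """Delete a keyword (the little (x) icon); return False if absent."""
        word = word.strip().lower()
        i = bisect.bisect_left(self._words, word)
        if i < len(self._words) and self._words[i] == word:
            del self._words[i]
            return True
        return False

    def page(self, number, per_page=20):
        """Return one 1-based page of the alphabetical display."""
        start = (number - 1) * per_page
        return self._words[start:start + per_page]

    def page_of(self, word, per_page=20):
        """Return the page that shows a keyword, e.g. right after adding it."""
        word = word.strip().lower()
        i = bisect.bisect_left(self._words, word)
        if i < len(self._words) and self._words[i] == word:
            return i // per_page + 1
        return None
```

An Ajax handler for the add field would call `add`, then use `page_of` to decide which page to redisplay.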