I am new to web programming...I have been asked to create a simple Internet search application which would allow transmit to the browser some data stored remotely in the server.
Considering the client/server architecture (which I am new to) I would like to know if the "client" is represented only by the Internet browser and therefore the entire code of the web application should be stored in the server. As it's a very generic question a generic answer is also well accepted.
As you note, this is a very generic and broad question. You'd be well-served by more complete requirements. Regardless:
Client/server architecture generally means that some work is done by the client, and some by the server. The client may be a custom application (such as iTunes or Outlook), or it might be a web browser. Even if it's a web browser, you typically still have some code executing client-side, Javascript usually, to do things like field validation (are all fields filled out?).
Much of the code, as you note, will be running on the server, and some of this may duplicate your client-side code. Validation, for instance, should be performed on the client-side, to improve performance (no need to communicate with the server to determine if the password meets length requirements), but should be performed on the server as well, since client-side code is easily bypassed.
Either you can put all the code on the server, and have it generate HTML to send back to the browser. Or you can include JavaScript in the HTML pages, so some of the logic runs inside the browser. Many web applications mix the two techniques.
You can do this with all the code stored on the server.
1)The user will navigate to a page on your webserver using an url you provide.
2)When the webserver gets the request for that page, instead of just returning a standard html file, it will run your code, perhaps some PHP, which inserts the server information, perhaps from a database, into a html template.
3) The resulting fully complete html file is sent to the client. To the client's browser, it looks like any other html page.
For an example of PHP the dynamically inserts information into HTML see: (this wont be exactly what you will do but it will give you an idea of how PHP can work)
code:
http://www.php-scripts.com/php_diary/example1.phps
see the result (refresh a few times to see it in action):
http://www.php-scripts.com/php_diary/example1.php3
You can see from this the "code file" looks just like a normal html file, except what is between angled brackets is actually PHP code, in this case it puts the time into the position it is at in the html file, in your case you would write code to pull the data you want into the file.
Related
I'm new to HTML and coding period. I've created a basic HTML page. In that page i want to create dropdown selections that produce outputs from my SQL database. MSSQL not MySQL.
EX: If I select a table or a column from dropdown one and then input a keyword for selection box 2. I want it to produce a table that shows the information in that table/column with that keyword.
If I select a medical name from dropdown and I want it to show only medical names that are equal to Diabetes. and then show me those rows from my database to a table. How would I show that in HTMl from connecting to the database, to creating the dropdown selection linked to the database, and then being able to select the criteria for what I want to be displayed. and then showing that in a table or list format.
Thank you in advance
OK, Facu Carbonel's answer is a bit... chaotic, so since this question (suprisingly) isn't closed yet, I'll write one myself and try to do better.
First of all - this is a VERY BROAD topic which I cannot answer directly. I could give a pile of code, but walking through it all would take pages of text and in the end you'd just have a solution for this one particular problem and could start from scratch with the next one.
So instead I'll take the same path that Factu Carbonel took and try to show some directions. I'll put keywords in bold that you can look up and research. They're all pieces of the puzzle. You don't need to understand each of them completely and thoroughly from the beginning, but be aware what they are and what they do, so that you can google finer details when you need them.
First of all, you need to understand the roles of the "server side" and "client side".
The client side is the browser (Chrome, Firefox, Internet Explorer, what have you). When you type an address in the address bar (or click a link or whatever), what the browser does is it parses the whole thing and extracts the domain name. For example, the link to this question is https://stackoverflow.com/questions/59903087/sql-drop-down-selections-in-html?noredirect=1#comment105933697_59903087 and the domain part of that is stackoverflow.com. The rest of this long jibberish (it's called an "URL" by the way) is also relevant, but later.
With the domain in hand the browser then uses the DNS system to convert that pretty name into an IP address. Then it connects via network to the computer (aka "server") designated by that IP address and issues a HTTP request (HTTP, not HTML - don't mix these up, they're not the same thing).
HTTP, by the way, is the protocol that is used on the web to communicate between the server and the browser. It's like a language that they both understand, so that the browser can tell the server hey, give me the page /questions/59903087/sql-drop-down-selections-in-html. And the server then returns the HTML for that page.
This, by the way, is another important point to understand about HTTP. First the browser makes its request, and the server listens. Then the server returns its response, and the browser listens. And then the connection is closed. There's no chit-chat back and forth. The browser can do another request immediately after that, but it will be a new request.
Now, the browser is actually pretty limited in what it can do. Through these HTTP requests it gets from the server the HTML code, the CSS code and the Javascript code. It also can get pictures, videos and sound files. And then it can display them according to the HTML and CSS. And Javascript code runs inside the browser and can manipulate the HTML and CSS as needed, to respond to the user's actions. But that's all.
It might seem that the Javascript code that runs inside the browser is all powerful, but that is only an illusion as well. It's actually quite limited, and on purpose. In order to prevent bad webpages from doing bad things, the Javascript in each page is essentially limited to that page only.
Note a few things that it CANNOT do:
It cannot connect to something that doesn't use HTTP. Like an SQL server.
It can make HTTP requests, but only to the same domain as the page (you can get around this via CORS, but that's advanced stuff you don't need to worry about)
It cannot access your hard drive (well, it can if the user explicitly selects a file, but that's it)
It cannot affect other open browser tabs
It cannot access anything in your computer outside the browser
This, by the way, is called "sandboxing" - like, the Javascript code in the browser is only allowed to play in its sandbox, which is the page in which it was loaded.
OK, so here we can see, that accessing your SQL server directly from HTML/CSS/Javascript is impossible.
Fortunately, we still need to talk about the other side of the equation - the web server which responded to the browser's requests and gave it the HTML to display.
It used to be, far back in the early days of the internet, that web servers only returned static files. Those days are long gone. Now we can make the webserver return -- whatever we want. We can write a program that inspects the incoming request from the browser, and then generates the HTML on the fly. Or Javascript. Or CSS. Or images. Or whatever. The good thing about the server side is - we have FULL CONTROL over it. There are no sandboxes, no limits, your program can do anything.
Of course, it can't affect anything directly in the browser - it can only respond to the browsers requests. So to make a useful application, you actually need to coordinate both sides. There's one program running in the browser and one program running on the web server. They talk through HTTP requests and together they accomplish what they need to do. The browser program makes sure to give the user a nice UI, and the server program talks to all the databases and whatnot.
Now, while in browser you're basically limited to just Javascript and the features the browser offers you, on the server side you can choose what web server software and what programming language you use. You can use the same Javascript, or you can go for something like PHP, Java (not the same as Javasctipt!), C#, Ruby, Python, and thousands of others. Each language is different and does things its own way, but at the end of the day what it will do is that it will receive the incoming requests from the browser and generate some sort of output that the browser expects.
So, I hope that this at least gives you some starting point and outlines where to go from here.
First of all there is something that you need to know to do this, and that is the difference between a front-end and a back-end.
Html is a front-end technology, they are called like that because that's what is shown to the user and the back-end it's all mechanisms that run behind the hood.
The thing is, in your front-end you can't do things of back-end, like do querys from a database, manage sessions and that kind of thing.
For that you need a back-end running behind, like php, ruby, node.js or some technology like that.
From the html you can only call functions on the server using things like <form action="/log" method="POST"> this wold call the action /log that you should have already program on your back-end. Don't get confuse with this, there is plenty of ways to sending request to your back-end and this is just one way to do it.
For your specific case I recommend you to look up for ajax, to do the query on your database with no need of the browser to refresh after the query is made.
Some topics you need to know to understand this is:
-what's front-end and back-end and their differences.
-what is client-server architecture
-ajax
-http requests
-how to work with a back-end, doing querys to the database, making routes, etc.
-and for last, wile your server it's not open to the world with your own domain name, what is localhost and how to use it.
I hope that this clarify a bit this, that is no easy thing, but with a bit of research and practice you will accomplish!
I am not able to understand the difference between active and dynamic web pages.
I know that Active web pages are first downloaded on the client machine and then executed.
Dynamic web pages are executed on the server and then sent to the client.
But I am not able to correlate it with some real time example.
Kindly explain me the difference with some simple examples.
Also explain what is Applet and why it is active web page not dynamic.
As you said, dynamic is what's being executed on the server and then the result is being sent back to the client (browser). So for example when using PHP, your browser isn't able to execute PHP, so the server executes the PHP file and performs all logic in your code. The result will be an HTML file, which is then sent back to the client. The important thing to understand is that when the result is served to the client, the information in it won't change.
An active web page is a page where the browser performs the logic instead of the server. So for example when you've got a page where you're showing share prices, then you want it to update e.g. every 5 seconds. A solution would be to use AJAX with JavaScript. In contrast to PHP, your browser is able to execute JavaScript, so it is happening without reloading the page. So with an active page, everything is happening inside your browser without the need to reload the page every time you want new information.
An applet is an embedded application, like Flash or Java (not to be confused with JavaScript). To execute an applet, you most likely need a browser plugin. Because the applet is executed by the plugin and your browser, it is active and not dynamic (you don't need to request a new applet for information in it to change). The advantages of using an applet is that the programming language (like Java) has more possibilities than HTML. A lot of browser games are made with applets, but nowedays it is used less and less because we can achieve the same with techniques like JavaScript, HTML5 and WebGL.
I've had a thought of using Wordpress as a CMS backend, because well a lot of people know it and it is easy to use and then using Node.JS as the front-end. You're probably thinking now why would I want to do that in the first place, what is the advantage?
I want to use websockets and the wonderful Socket.io library for Node.JS provides beautiful cross-browser websockets support. Essentially I want a user to come to a site, a websocket is created and then content is fed to the frontend asynchronously as JSON and then decoded on the frontend all without page refreshing.
Effectively I am making Wordpress become a real-time CMS. You visit a site, but every link you click fetches the page as JSON and returns it via a websocket to save multiple requests and of course, page size.
How do I go about getting Node.JS talking to a MySQL database, pulling out info and then showing it? Any tutorials, resources and other useful tips would be gratefully appreciated. A few of my colleagues have wondered the same thing, so I think the answers will be a big help to everyone.
To be exact, you can't use Node.js for a front-end solution, since it runs on the server, not the browser (think of it like any other server-side language such as PHP, JSP etc).
You can, however, create the described solution with jQuery or any other Javascript library, you just have to implement data transfer with Socket.IO. On the server-side you'd need something to handle websockets, so the most native way would be to use Node.js, but since you want to use Wordpress, it gets really complicated, as Wordpress is not meant to be used in the way you described, so I'm afraid you'd have to write your CMS from ground up in Node.
Also, the way you described has a huge flaw. Search engine crawlers are still unable to parse and run Javascript, so if all of your content is loaded dynamically, it would seem empty to Google and others, so it would be impossible to ever make it in the search results rendering your site pretty much useless.
For MySQL and other modules for Node, you should check NPM registry and the Node modules page.
EDIT
After Dwayne explained his solution in comments, this is how I'd do it:
I'd use jQuery for front-end. Binding the document with .on(), and setting the selector to 'a', so that every anchor on the webpage would fire the handler.
The handler parses the a.href attribute and figures out whether it's an external link, which shouldn't be handled by Javascript, or if it's a link to the next page, to an article etc. You can prevent the default action by calling e.preventDefault() in the handler, which prevents the browser from redirecting to the location.
Then the handler would get the content in JSON by calling .getJSON() to the URL based on the article. The easiest way would be to have a certain pattern (such as all urls like www.domain.com/api) redirect to the Node service via .htaccess, to prevent cross-domain problems.
Node would then see the request, extract the parameters and figure out what the user wants. Then connect to the MySQL database with this module (it's as simple as it can get) and return the corresponding content formatted as JSON. Don't forget to set the Content-Type headers to 'application/json'.
jQuery gets the response, figure out the type of the request and updates the content accordingly. Profit.
As you can see, I wouldn't use WebSockets in this case, since you wouldn't really benefit much from it. They are mostly meant for small real-time updates (no huge HTTP headers to reduce the bandwidth) that are both-ways. This means that the server could also push data into the browser, without the browser asking for it. In a blog context, this is not required, and you won't have too many request, so the difference in bandwidth wouldn't be noticeable anyway. If, however, you would like to use it for educational purposes, just basically replace the getJSON part with SocketIO, I'm not sure whether Apache supports proxying WebSockets, though. Extra information about SocketIO basics are here.
Edit: I overlooked the part with 'using Node.js on the front-end'. As Vahur Roosimaa said, Node.js is on the server-side (think of it as Nginx / Apache + PHP combination). Node isn't a frontend library like jQuery.
If you want you can use it just for the websockets functionality (I suggest using Socket.IO).
Nice tutorials about Node.js and MySQL:
http://www.giantflyingsaucer.com/blog/?p=2596
http://mclear.co.uk/2011/01/26/very-simple-nodejs-mysql-select-query-example/
http://www.hacksparrow.com/using-mysql-with-node-js.html
This SO question might also help: MySQL with Node.js
Also check the examples from the github repo of node-mysql.
If you want something more advanced like an ORM, I recommend Sequelize.
Another good question from SO: Which ORM should I use for Node.js and MySQL?
You should check out Wordscript which I recently added a Node JS example which can act as a simple front end for doing basic post retrieval from a Wordpress database.
It uses a common mysql library for node, and generates MySQL queries from get parameters and renders data as it is retrieved from the database; including tags.
Wordscript aims to free backend/frontend developers from being forced to work with the Wordpress PHP codebase, but still allows for Wordpress'es administrative interface to be used when needed (and prudent to do so). API's have been written in Ruby and PHP that both return JSON feeds and function generally the same way the node version does; so thats an additional option where a scripting language is available.
One option you have, if you want to have wordpress as the CMS and keep its admin UI, is to write your wordpress templates to output JSON instead of HTML.
In contrast to Wordscript, this is more solution specific, since you will need to write your JSON output for every template/data you want. The upside is that you can create the JSON specifically for your needs.
On the node side, you write a small server that will consume the JSON, letting you use whatever javascript template language you want. Nodejs will also help out with performance, since you can save the rendered content and/or the JSON output in memory, saving you roundtrips to the wordpress templates.
I wrote a blog about this, which describes more of the benefits of using nodejs and wordpress together.
http://www.1001.io/improve-wordpress-with-nodejs/
This all goes back to some of my original questions of trying to "index" a webpage. I was originally trying to do it specifically in java but now I'm opening it up to any language.
Before I tried using HTML unit and other methods in java to get the information I needed but wasn't successful.
The information I need to get from a webpage I can very easily find with firebug and I was wondering if there was anyway to duplicate what firebug was doing specifically for my needs. When I open up firebug I go to the NET tab, then to the XHR tab and it shows a constantly updating page with the information the server is updating. Then when I click on the request and look at the response it has the information I need, and this is all without ever refreshing the webpage which is what I am trying to do(not to mention the variables it is outputting do not show up in the html of the webpage)
So can anyone point me in the right direction of how they would go about this?
(I will be putting this information into a mysql database which is why i added it as a tag, still dont know what language would be best to use though)
Edit: These requests on the server are somewhat random and although it shows the url that they come from when I try to visit the url in firefox it comes up trying to open something called application/jos
Jon, I am fairly certain that you are confusing several technologies here, and the simple answer is that it doesn't work like that. Firebug works specifically because it runs as part of the browser, and (as far as I am aware) runs under a more permissive set of instructions than a JavaScript script embedded in a page.
JavaScript is, for the record, different from Java.
If you are trying to log AJAX calls, your best bet is for the serverside application to log the invoking IP, useragent, cookies, and complete URI to your database on receipt. It will be far better than any clientside solution.
On a note more related to your question, it is not good practice to assume that everyone has read other questions you have posted. Generally speaking, "we" have not. "We" is in quotes because, well, you know. :) It also wouldn't hurt for you to go back and accept a few answers to questions you've asked.
So, the problem is?:
With someone else's web-page, hosted on someone else's server, you want to extract select information?
Using cURL, Python, Java, etc. is too painful because the data is continually updating via AJAX (requires a JS interpreter)?
Plain jQuery or iFrame intercepts will not work because of XSS security.
Ditto, a bookmarklet -- which has the added disadvantage of needing to be manually triggered every time.
If that's all correct, then there are 3 other approaches:
Develop a browser plugin... More difficult, but has the power to do everything in one package.
Develop a userscript. This is much easier to do and technologies such as Greasemonkey deal with the XSS problem.
Use a browser macro technology such as Chickenfoot. These all have plusses and minuses -- which I won't get into.
Using Greasemonkey:
Depending on the site, this can be quite easy. The big drawback, if you want to record data, is that you need your own web-server and web-application. But this server can be locally hosted on an XAMPP stack, or whatever web-application technology you're comfortable with.
Sample code that intercepts a page's AJAX data is at: Using Greasemonkey and jQuery to intercept JSON/AJAX data from a page, and process it.
Note that if the target page does NOT use jQuery, the library in use (if any) usually has similar intercept capabilities. Or, listening for DOMSubtreeModified always works, too.
If you're using a library such as jQuery, you may have an option such as the jQuery ajaxSend and ajaxComplete callbacks. These could post requests to your server to log these events (being careful not to end up in an infinite loop).
I need to write a script that go to a web site, logs in, navigates to a page and downloads (and after that parse) the html of that page.
What I want is a standalone script, not a script that controls Firefox. I don't need any javascript support in that just simple html navigation.
If nothing easy to do this exists.. well then something that acts though a web browser (firefox or safari, I'm on mac).
thanks
I've no knowledge of pre-built general purpose scrapers, but you may be able to find one via Google.
Writing a web scraper is definitely doable. In my very limited experience (I've written only a couple), I did not need to deal with login/security issues, but in Googling around I saw some examples that dealt with them - afraid I don't remember URL's for those pages. I did need to know some specifics about the pages I was scraping; having that made it easier to write the scraper, but, of course, the scrapers were limited to use on those pages. However, if you're just grabbing the entire page, you may only need the URL(s) of the page(s) in question.
Without knowing what language(s) would be acceptable to you, it is difficult to help much more. FWIW, I've done scrapers in PHP and Python. As Ben G. said, PHP has cURL to help with this; maybe there are more, but I don't know PHP very well. Python has several modules you might choose from, including lxml, BeautifulSoup, and HTMLParser.
Edit: If you're on Unix/Linux (or, I presume, CygWin) You may be able to achieve what you want with wget.
If you wanted to use PHP, you could use the cURL functions to build your own simple web page scraper.
For an idea of how to get started, see: http://us2.php.net/manual/en/curl.examples-basic.php
This is PROBABLY a dumb question, since I have no knowledge of mac but what language are we talking about here, and also is this a website that you have control over, or something like a spider bot that google might use when checking page content? I know that in C# you can load in objects on other sites using an HttpWebRequest and a stream reader... In java script (this would only really work if you know what is SUPPOSED to be there) you could open the web page as the source of an iframe, and using java script traverse the contents of all the elements on the page... or better yet, use jquery.
I need to write a script that go to a web site, logs in, navigates to a page and downloads (and after that parse) the html of that page.
To me this just sounds like a POST or GET request to the URL of the login page could do the job.With the proper parameters username and password (depending on the form input names used on the page) set in the request, the result will be the html of the page that you can then parse as you please.
This can be done with virtually any language. What language do you want to use?
I recently did exactly what you’re asking for in a C# project. If login is required your first request is likely to be a post and include credentials. The response will usually include cookies which persist the identity across subsequent requests. Use Fiddler to look at what form data (field names and values) is being posted to the server when you logon normally with your browser. Once you have this you can construct an HttpWebRequest with the form data and store the cookies from the response in a CookieContainer.
The next step is to make the request for the content you actually want. This will be another HttpWebRequest with the CookieContainer attached. The response can be read by a StreamReader which you can than read and convert to a string.
Each time I’ve done this it has usually been a pretty laborious process to identify all the relevant form data and recreate the requests manually. Use Fiddler extensively and compare the requests your browser is making when using the site normally with the requests coming from your script. You may also need to manipulate the request headers; again, use Fiddler to construct these by hand, get them submitting correctly and the response as you expect then code it. Good luck!