What are sessions? How do they work? - language-agnostic

I am just beginning to learn web application development, using Python. I keep coming across the terms 'cookies' and 'sessions'. I understand cookies in that they store some info as key-value pairs in the browser. But I am a little confused about sessions: in a session too, we store data in a cookie in the user's browser.
For example - I login using username='rasmus' and password='default'. In such a case the data will be posted to the server which is supposed to check and log me in if authenticated. However during the entire process the server also generates a session ID which will be stored in a cookie on my browser. Now the server also stores this session ID in its file system or datastore.
But based on just the session ID, how would it be able to know my username during my subsequent traversal through the site? Does it store the data on the server as a dict where the key would be a session ID and details like username, email etc. be the values?
I am getting quite confused here. Need help.

Because HTTP is stateless, in order to associate a request to any other request, you need a way to store user data between HTTP requests.
Cookies or URL parameters (for example http://example.com/myPage?asd=lol&boo=no) are both suitable ways to transport data between two or more requests.
However, they are no good if you don't want that data to be readable/editable on the client side.
The solution is to store that data server side, give it an "id", and let the client only know (and pass back at every http request) that id. There you go, sessions implemented. Or you can use the client as a convenient remote storage, but you would encrypt the data and keep the secret server-side.
Of course there are other aspects to consider, like you don't want people to hijack other's sessions, you want sessions to not last forever but to expire, and so on.
In your specific example, the user id (could be username or another unique ID in your user database) is stored in the session data, server-side, after successful identification. Then for every HTTP request you get from the client, the session id (given by the client) will point you to the correct session data (stored by the server) that contains the authenticated user id - that way your code will know what user it is talking to.
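The lookup described above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular framework does it: the in-memory `SESSIONS` dict and the function names `log_in` and `current_user` are hypothetical, and a real application would usually keep session data in a database or cache.

```python
import secrets

# Hypothetical in-memory store: session ID -> session data.
# A real deployment would typically use a database or cache instead.
SESSIONS = {}


def log_in(username):
    """Create a session once the credentials have been verified elsewhere."""
    session_id = secrets.token_urlsafe(32)  # random, hard-to-guess ID
    SESSIONS[session_id] = {"username": username}
    return session_id  # this is the only value the browser gets, e.g. in a cookie


def current_user(session_id):
    """Map the session ID sent back by the browser to the stored user."""
    session = SESSIONS.get(session_id)
    return session["username"] if session else None


sid = log_in("rasmus")
print(current_user(sid))        # rasmus
print(current_user("made-up"))  # None
```

So yes, conceptually it is a dict keyed by session ID, with the username and other details as the value.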

Explanation via Pictures:
You can think of a session kinda like a library ID card. Every time you go to a library, you show them your ID card, which was issued by that particular library.
Simple Explanation by analogy
Imagine you are in a bank, trying to get some money out of your account. But it's dark; the bank is pitch black: there's no light. You are surrounded by another 20 people. They all look the same. And everybody has the same voice. And everyone is a potential bad guy. In other words, HTTP is stateless.
This bank is a funny type of bank - for the sake of argument here's how things work:
you talk to your teller and make a request to withdraw money, and then
you have to wait briefly on the sofa, and 20 minutes later
you collect your money from the teller.
But how will the teller tell you apart from everyone else?
The teller can't see or readily recognise you, remember, because the lights are all out.
What if your teller gives your $10,000 withdrawal to someone else - the wrong person?! It's absolutely vital that the teller can recognise you as the one who made the withdrawal, so that you can get the money (or resource) that you asked for.
Solution:
When you first appear to the teller, he or she tells you something in secret:
"Whenever you are talking to me," says the teller, "you should first identify yourself as GNASHEU329 - that way I know it's you".
Nobody else knows the secret passcode.
Example of How I Withdrew Cash:
So I decide to go and chill out for 20 minutes, and then later I go to the teller and say "I'd like to collect my withdrawal"
The teller asks me: "who are you??!"
"It's me, Mr. George Banks!"
"Prove it!"
And then I tell them my passcode: GNASHEU329
"Certainly Mr. Banks!"
That basically is how a session works. It allows one to be uniquely identified in a sea of millions of people. You need to identify yourself every time you deal with the teller.
Difference between Sessions and Cookies
Sessions: You can think of sessions as the temporary passcode in the above example. Once the bank (i.e. server) sees the passcode - they will be able to identify who you are, what you want etc.
Cookie: You can think of a cookie as simply a plastic card upon which information is printed. You can store anything on that card, like:
name / age / sex / marital status
passcodes
Security Concerns with Cookies
The bank can write information onto your card - and so can you. But this can be dangerous:
name: Ben Koshy
sex: male
bank balance: $1.99.
If I wanna be sneaky, I could edit my ID card:
name: Ben Koshy
sex: male
bank balance: $1 billion bucks. <------ new line
Hooray! I could print more money than Yellen and Powell combined. This presents a security risk: it is for this reason that banks "encrypt" (or, more precisely, sign) the information in cookies, so that if you tampered with it, the bank would know. As a general rule you should never put anything sensitive that can be tampered with into a cookie - the bank balance should be stored on the server, where nobody can directly tamper with it.
In this case, Powell decided to tamper with bank balance in his cookie. The bank can now invalidate his session:
name: Jerome Powell
session: tampering with bank balance, session invalid. log him out immediately. And fire him too.
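How the bank "would know" about tampering can be sketched with a signature. Note this example swaps encryption for signing with an HMAC, which only detects changes and does not hide the value; the key, the cookie layout (`value.signature`) and the function names are assumptions made for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # assumption: known only to the server


def sign(value):
    """Append an HMAC so later changes to the value can be detected."""
    mac = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(cookie):
    """Return the original value, or None if the cookie was tampered with."""
    value, _, mac = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return value
    return None


cookie = sign("balance=1.99")
tampered = cookie.replace("1.99", "1000000000")
print(verify(cookie))    # balance=1.99
print(verify(tampered))  # None
```

Even with a signature, the safer design is still the one described above: keep the balance on the server and put only an identifier in the cookie.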

"Session" is the term used to refer to a user's time browsing a web site. It's meant to represent the time between their first arrival at a page in the site until the time they stop using the site. In practice, it's impossible to know when the user is done with the site. In most servers there's a timeout that automatically ends a session unless another page is requested by the same user.
The first time a user connects, some kind of session ID is created (how it's done depends on the web server software and the type of authentication/login you're using on the site).
Like cookies, this usually doesn't get sent in the URL anymore because it's a security problem. Instead it's stored along with a bunch of other stuff that collectively is also referred to as the session. Session variables are like cookies - they're name-value pairs sent along with a request for a page, and returned with the page from the server - but their names are defined in a web standard.
Some session variables are passed as HTTP headers. They're passed back and forth behind the scenes of every page browse so they don't show up in the browser and tell everybody something that may be private. Among them are User-Agent (the type of browser requesting the page), Referer (the page that linked to the page being requested), etc. Some web server software adds its own headers or transfers additional session data specific to the server software. But the standard ones are pretty well documented.
Hope that helps.

HTTP is a stateless protocol; that is, the server cannot differentiate between the connections of different users.
This is where cookies come in: the first time a client connects to a server, the server generates a new session ID, which is then sent to the client as a cookie value. From then on, this session ID identifies that client, because the server will see the same session ID in the cookies of each HTTP request.
For each session ID, the server keeps a data structure in which it can store data specific to that user; this data structure is what you can abstractly call the session.

Think of HTTP as a person(A) who has SHORT TERM MEMORY LOSS and forgets every person as soon as that person goes out of sight.
Now, to remember different persons, A takes a photo of each person and keeps it. Each person's pic has an ID number. When that person comes into sight again, that person tells A their ID number and A finds their picture by that number.
And voila! A knows who that person is.
It is the same with HTTP. It suffers from SHORT TERM MEMORY LOSS. It uses sessions to record everything you did while using a website, and then, when you come again, it identifies you with the help of cookies (a cookie is like a token).
Picture is the Session here, and ID is the Cookie here.

Session is a broad technical term which can refer to state stored either on the server side, using an in-memory cache, or on the client side, using a cookie, local storage or session storage.
There is nothing specific in the browser or on the server that is called a session. A session is a kind of data which represents a user's session on the web, and that data can be stored on the server or the client.
How it is stored and shared is another topic. But in brief: when a user logs in, the server creates the session data and generates a session ID. The session ID is sent back to the user in a custom header or in a Set-Cookie header, which takes care of automatically storing it in the user's browser. The next time the user visits, the session ID is sent along with the request, and the server checks whether there is an existing session with that ID and processes the request accordingly.
You can store whatever you want in a session, but the main purpose is to remember the user (browser) who has previously visited your site, whether it's about login, shopping cart, or other activities.
That's also why it is important to protect the session ID from being intercepted by a hacker, who would use it to pass himself off as another user.
Reading about cookies will give you the idea of sessions: https://developer.mozilla.org/en-US/docs/Web/HTTP/Cookies
Excerpt from MDN:
Cookies are mainly used for three purposes:
Session management: logins, shopping carts, game scores, or anything else the server should remember
Personalization: user preferences, themes, and other settings
Tracking: recording and analyzing user behavior

Related

Secure AND Stateless JWT Implementation

Background
I am attempting to implement token authentication with my web application using JSON Web tokens.
There are two things I am trying to maintain with whatever strategy I end up using: statelessness and security. However, from reading answers on this site and blog posts around the internet, there appear to be some folks who are convinced that these two properties are mutually exclusive.
There are some practical nuances that come into play when trying to maintain statelessness. I can think of the following list:
Invalidating compromised tokens on a per-user basis before their expiration date.
Allowing a user to log out of all of their "sessions" on all machines at once and having it take immediate effect.
Allowing a user to log out of the current "session" on their current machine and having it take immediate effect.
Making permission/role changes on a user record take immediate effect.
Current Strategy
If you utilize an "issued time" claim inside the JWT in conjunction with a "last modified" column in the database table representing user records, then I believe all of the points above can be handled gracefully.
When a web token comes in for authentication, you could query the database for the user record and:
if (token.issued_at < user.last_modified) then token_valid = false;
If you find out someone has compromised a user's account, then the user can change their password and the last_modified column can be updated, thus invalidating any previously issued tokens. This also takes care of the problem with permission/role changes not taking immediate effect.
Additionally, if the user requests an immediate log out of all devices then, you guessed it: update the last_modified column.
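The check described above could look roughly like this in Python. It is only a sketch under stated assumptions: the token's claims are passed in as already-verified values (signature verification is not shown), and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def token_is_valid(issued_at, expires_at, last_modified, now):
    """Apply the rule from the question to already-verified token claims.

    issued_at / expires_at come from the token; last_modified comes from
    the user record in the database.
    """
    if now >= expires_at:
        return False  # ordinary expiry
    # Any token issued before the last critical change is rejected.
    return issued_at >= last_modified


now = datetime(2024, 1, 2, tzinfo=timezone.utc)
issued = now - timedelta(hours=1)
expires = now + timedelta(hours=1)

# The user record has not changed since the token was issued: still valid.
print(token_is_valid(issued, expires, issued - timedelta(days=1), now))     # True
# The password changed 10 minutes ago, after the token was issued: invalid.
print(token_is_valid(issued, expires, now - timedelta(minutes=10), now))    # False
```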
The final problem that this leaves is per-device log out. However, I believe this doesn't even require a trip to the server, let alone a trip to the database. Couldn't the sign out action just trigger some client-side event listener to delete the secure cookie holding the JWT?
Problems
First of all, are there any security flaws that you see in the approach above? How about a usability issue that I am missing?
Once that question is resolved, I'm really not fond of having to query the database each time someone makes an API request to a secure end point, but this is the only strategy that I can think of. Does anyone have any better ideas?
You have made a very good analysis of how some common needs break the statelessness of JWT. I can only propose some improvements to your current strategy.
Current strategy
The drawback I see is that a query to the database is always required. Also, trivial modifications to user data could change last_modified and invalidate tokens.
An alternative is to maintain a token blacklist. Usually an ID is assigned to each token, but I think you can use last_modified instead. As token revocations are probably rare, you could keep a light blacklist (even cached in memory) with just userId and last_modified.
You only need to set an entry after updating critical data on the user (password, permissions, etc.) and currentTime - maxExpiryTime < last_login_date. The entry can be discarded when currentTime - maxExpiryTime > last_modified (no more non-expired tokens will be sent).
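A minimal in-memory version of such a blacklist might look like this. The dict `revoked_before`, the function names and the one-hour `MAX_TOKEN_LIFETIME` are assumptions for illustration, and timestamps are plain seconds to keep the sketch short.

```python
MAX_TOKEN_LIFETIME = 3600  # seconds; assumed maximum lifetime of any token

# user_id -> timestamp of the last critical change (the "last_modified" value)
revoked_before = {}


def revoke_tokens(user_id, now):
    """Call after a critical change such as a password or permission update."""
    revoked_before[user_id] = now


def is_revoked(user_id, issued_at):
    """A token is rejected if it was issued before the recorded change."""
    cutoff = revoked_before.get(user_id)
    return cutoff is not None and issued_at < cutoff


def prune(now):
    """Drop entries once every token issued before the change has expired."""
    for user_id, cutoff in list(revoked_before.items()):
        if now - MAX_TOKEN_LIFETIME > cutoff:
            del revoked_before[user_id]
```

Because entries are pruned after the maximum token lifetime, the list stays small, and only users with a recent critical change need any lookup beyond a dict access.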
Couldn't the sign out action just trigger some client-side event listener to delete the secure cookie holding the JWT?
If you are in the same browser with several open tabs, you can use the localStorage events to sync info between tabs to build a logout mechanism (or login / user changed). If you mean different browsers or devices, then you would need to send some kind of event from server to client. But that means maintaining an active channel, for example a WebSocket, or sending a push message to a native mobile app.
Are there any security flaws that you see in the approach above?
If you are using a cookie, note you need to set an additional protection against CSRF attacks. Also if you do not need to access cookie from client side, mark it as HttpOnly
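As an illustration of those cookie flags, the standard library's http.cookies module can build the header value. The cookie name `token` and the helper name are assumptions; SameSite is included here as one common CSRF mitigation, not as a complete defence.

```python
from http.cookies import SimpleCookie


def build_token_cookie(token):
    """Return a Set-Cookie header value for a cookie carrying the JWT."""
    cookie = SimpleCookie()
    cookie["token"] = token
    cookie["token"]["httponly"] = True      # not readable from client-side scripts
    cookie["token"]["secure"] = True        # only sent over HTTPS
    cookie["token"]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie["token"]["path"] = "/"
    return cookie["token"].OutputString()


print(build_token_cookie("abc.def.ghi"))
```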
How about a usability issue that I am missing?
You also need to deal with rotating tokens when they are close to expiring.

Move information-resource stored in the database tables with two step using 'reservation'

I need to architect a database and a service. I have resources that I need to deliver to the users, and the delivery takes some time or requires the user to do some more work.
These are the tables I store information into.
Table - Description
_______________________
R - to store resources
RESERVE - to reserve requested resources
HACK - to track some requests that couldn't be made with my client application (statistics)
FAIL - to track requests that can't be resolved, but the user isn't guilty (statistics)
SUCCESS - to track successful delivery (statistics)
The first step, when a user requests a resource
IF (condition1 is true - user has the right to request resource) THEN
    IF (I've successfully RESERVE-d resource and committed the transaction) THEN
        nothing more to do
    ELSE
        save request into FAIL
ELSE
    save request into HACK
Then the second step
IF (condition2 is true - user has done his job and requests the reserved resource) THEN
    IF (the resource delivered successfully) THEN
        save request into SUCCESS
    ELSE
        save request into FAIL
        depending on application logic move resource from RESERVE to R or not
ELSE
    save request into HACK, contact the user,
    if this is really a hacker move resource from RESERVE to R
This is how I plan to implement the system. I've put the transactions into stored procedures, but the main application logic, where I decide which procedure to call, lives in the application/service layer.
Am I on the right track? Is such a division of code between the db and the service layers normal? Your experienced opinions are very important.
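For illustration, the service-layer decisions from the two steps above could be sketched like this. Everything here is an assumption made to keep the example self-contained: the tables are replaced by in-memory structures, a failed reservation is modelled as "no free resource left", and on a failed delivery the sketch always moves the resource back to R.

```python
# In-memory stand-ins for the tables in the question (assumption: a real
# system would call stored procedures inside transactions instead).
free_resources = ["r1", "r2"]                   # table R
reserved = {}                                   # table RESERVE: resource -> user
log = {"HACK": [], "FAIL": [], "SUCCESS": []}   # statistics tables


def request_resource(user, has_right):
    """Step one: reserve a resource for the user."""
    if not has_right:                # condition1 is false
        log["HACK"].append(user)
        return None
    if not free_resources:           # the reservation could not be made
        log["FAIL"].append(user)
        return None
    resource = free_resources.pop()
    reserved[resource] = user
    return resource


def collect_resource(user, resource, job_done, deliver):
    """Step two: deliver a previously reserved resource."""
    if not job_done or reserved.get(resource) != user:   # condition2 is false
        log["HACK"].append(user)
        return False
    del reserved[resource]
    if deliver(resource):
        log["SUCCESS"].append(user)
        return True
    log["FAIL"].append(user)
    free_resources.append(resource)  # this sketch always returns it to R
    return False
```

Here `deliver` is a hypothetical callback that performs the actual delivery and returns True on success.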
Clarifying and answering RecentCoin's questions.
1. The difference between the HACK and FAIL tables is that I store more information in the HACK table, like user IP and XFF. I'm not going to penalize each user that appears in that table. There can be 2 reasons that a user (request) is tracked as a hack. The first is that I have a bug (mainly in the client app), and this will help me to fix it. The second is that someone makes requests manually and tries to bypass the rules. If he tries 'harder' I'll be able to take some precautions.
2. The separation of the reserve and the success tables has these reasons.
2.1. I use the reserve table in some transactions and queries without using the success table, so I can lock them separately.
2.2. The data stored in success will not slow down my queries while I'm querying the reserve table.
2.3. The success table is a kind of log for statistics, which I can delete or move to another database for future analysis.
2.4. I delete the rows from reserve after I move them to the success table. So I can roughly estimate the max row count in that table, because I have a max limit for reservations per user.
Points 2.3 and 2.4 could also be achieved by keeping everything in one table.
So are reasons 2.1 and 2.2 good enough to keep the data separate?
3. The resource "delivered successfully" means that the admin and the service have done everything they could do successfully; if they couldn't, then the reservation fails.
4 and 6. The restrictions and rights are simple, they are like city and country restrictions. The users are 'flat', they don't have any roles or hierarchy.
5. I have some tables to store users and their information. I don't have LDAP or AD.
You're going in the right direction, but there are some other things that need to be more clearly thought out.
1. You're going to have to define what constitutes a "hack" vs a "fail". Especially with new systems, users get confused and it's pretty easy for them to make honest mistakes. This seems like something you want to penalize them for in some fashion so I'd be extremely careful with this.
2. You will want to consider having "reserve" and "success" be equivalent. Why store the same record twice? You should have a really compelling reason to do that.
3. You will need to define "delivered successfully" since that could be anything from an entry in a calendar to getting more pens and post notes.
4. You will want to define your resources as well as which user(s) have rights to them. For example, you may have a conference room that only managers are allowed to book, but you might want to include the managers' administrative assistants in that list since they would be booking the room for the manager(s).
5. Do you have a database of users? LDAP or Active Directory, or will you need to create all of that yourself? If you do have LDAP or AD, can you use something like SAML?
6. You are going to want to consider how you want to assign those rights. Will they be group based, where group membership confers the rights to reserve, request, or use a given thing? For example, you may only want architects printing to the large format printer.

How to maintain client session for shopping cart even when user has not logged in

I have a question. I looked on Google and other threads on Stack Overflow as well, but did not get any accurate answer. Please help me redirect to ..
I use Amazon.com (just an example, could be any other e-commerce site) and keep adding items to my wishlist (while I have not yet logged in).
I close the browser and the system and then come back the next day, or maybe after 10 days or longer.
I can see the wish list is there as it was.
Is Amazon maintaining a cookie on the client side only to save the data? But the data would be too big and not reliable for accuracy.
Are they using a GUID on the client side (for 1 year) and sending it to the server, where they already have a sessionId mapping for it in their DB?
Once we log in to Amazon, my wishlist and cart are always there, which means they might have been updated from the client cookie.
I was asked this in an interview as a design/architecture question and I got confused. So I wanted to clear it up with you, friends.
Ideally you need HTML5 and make use of Local Storage (http://dev.w3.org/html5/webstorage/)
A key value pair can be saved like below:
localStorage.setItem("Basket", JSON.stringify({"item":"Product 1","price":100.50,"qty":2,"currency":"USD"}));
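The other option raised in the question, a long-lived GUID cookie mapped to data kept on the server, can be sketched as follows. This is not a claim about how Amazon works; the cookie name `visitor_id`, the in-memory `carts` dict and the function names are assumptions made for illustration.

```python
import uuid
from http.cookies import SimpleCookie

ONE_YEAR = 365 * 24 * 60 * 60  # cookie lifetime in seconds

carts = {}  # hypothetical server-side table: visitor ID -> list of items


def identify_visitor(cookie_header):
    """Return (visitor_id, set_cookie_value); the second item is None
    when the browser already sent a visitor ID."""
    cookie = SimpleCookie(cookie_header or "")
    if "visitor_id" in cookie:
        return cookie["visitor_id"].value, None
    visitor_id = uuid.uuid4().hex
    out = SimpleCookie()
    out["visitor_id"] = visitor_id
    out["visitor_id"]["max-age"] = ONE_YEAR
    out["visitor_id"]["path"] = "/"
    return visitor_id, out["visitor_id"].OutputString()


def add_to_cart(visitor_id, item):
    """Store the item on the server under the anonymous visitor ID."""
    carts.setdefault(visitor_id, []).append(item)
```

On login, the server could then merge `carts[visitor_id]` into the cart stored for the authenticated account.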

good approach in tracking data for unregistered users

This is how the system works:
I have a catalog of items. A guest user can choose to add an item from the catalog to what we call the inquiry bin. The system keeps track of the items added to the inquiry bin for that particular session. The user can delete items from the bin.
I was wondering what may be the most optimal way of storing these items. Database? Sessions? or Cookies?
Thanks in advance!
Are these inquiry items required to be available to everyone? Or just the particular user that created them?
If they have to be globally available, then you'd have to stick them in the database, with appropriate flag fields to mark them as temporary and which session created them. If it's per user, then it's best to stick them in the session.
Cookies shouldn't be used for major data storage, even if it's just a few items. The less data the client has, the less chance there is to mess around with the innards of your system by feeding bad data via the cookie. If there's just a session ID, then there's essentially no chance of doing anything, other than guessing someone else's session ID.
Client-side cookies have the best performance: no round trip to the web server is a big win. But cookies have a size limitation; see the following link about the limitation in IE. Other browsers should have similar limitations.
http://support.microsoft.com/kb/306070. Cookies are used for small amounts of data storage, like a session key.
Session normally means the memory of a single server process; if you run a web farm, such a session cannot be shared across multiple web servers. If you have a single web server, session should be the best way to store information on the server side.
A database is the most flexible solution, but it has a performance hit. For a high-performance website, proper caching is key.

Safely store credentials between website visits

I'm building a website which allows users to create accounts and access the site's content. I don't want users to log in each time they visit the site, so I'm planning on storing the username and password in a cookie -- however, I've heard this is bad practice, even if the password is hashed in the cookie.
What "best practices" should I follow to safely remember a user's credentials between visits to my website?
Don't ever do that. Throwing around passwords in the open.
Safest method:
Store the username in a database, in the same row a randomly generated salt value, in the same row a hash checksum of the password including the salt. Use another table for sessions that references the table with user credentials. You can insert in the sessions table when the user logs in a date you want the session to expire (eg. after 15 days). Store the session id in a cookie.
Next time the user logs in, you get the password, add the user's salt to it, generate the hash, and compare it to the one you have. If they match, open a session by inserting a row in the sessions table and sending the session id in a cookie. You can check whether the user has logged in, and which user it is, by this cookie.
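The salt-and-hash check described above might be sketched like this. As an assumption of this example, the plain salted checksum is swapped for PBKDF2 from Python's standard library, which is deliberately slow to compute; the iteration count and function names are illustrative only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumption: tune to your hardware and current guidance


def hash_password(password, salt=None):
    """Return (salt, digest); both would be stored in the user's row."""
    if salt is None:
        salt = os.urandom(16)  # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def check_password(password, salt, stored_digest):
    """Re-hash the submitted password with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)


salt, stored = hash_password("default")
print(check_password("default", salt, stored))  # True
print(check_password("wrong", salt, stored))    # False
```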
Edit:
This method is the most popular in use on most sites. It hits a good balance between being secure and practical.
You don't simply use an autoincrement value for the session id. You make it by using some complicated checksum which is hard to repeat. For example concatenate username, timestamp, salt, another random salt, and make an md5 or sha checksum out of it.
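As an alternative to building the ID from a checksum of username, timestamp and salts, a session ID can simply be drawn from a cryptographically secure random source. The sketch below does that and also applies the 15-day expiry mentioned earlier; the in-memory `sessions` dict stands in for the sessions table and all names are hypothetical.

```python
import secrets
from datetime import datetime, timedelta, timezone

SESSION_LIFETIME = timedelta(days=15)

sessions = {}  # stand-in for the sessions table: session_id -> (user_id, expires_at)


def open_session(user_id, now):
    """Insert a session row and return the ID to send in the cookie."""
    session_id = secrets.token_hex(32)  # 256 random bits, not derived from user data
    sessions[session_id] = (user_id, now + SESSION_LIFETIME)
    return session_id


def user_for_session(session_id, now):
    """Return the user ID for a live session, or None if unknown or expired."""
    row = sessions.get(session_id)
    if row is None:
        return None
    user_id, expires_at = row
    if now >= expires_at:
        del sessions[session_id]  # clean up the expired row
        return None
    return user_id
```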
In order to implement a feature that involves user credentials in a website/service, there must be some exchange of credential-related data between the client and the server. This exposes the data to man-in-the-middle attacks etc. Additionally, cookies are stored on the user's hard drive. No method can be 100% safe.
If you want additional security you can make your site go over https. This will prevent people from stealing cookies and passwords using man in the middle attacks.
Note:
Involving IP addresses in the mix is not a really good idea. Most often multiple clients will come from the same IP address over NATs etc.
You shouldn't need to store the password, just an identifier for the user that your application can interpret to be them.
Things you need to be aware of:
If the cookie is copied, will another user be able to pretend to be that user?
A user shouldn't be able to construct a cookie that would authenticate them as another user.
A possible solution to deal with these would be to create a one-time key for each user that is changed when they next use the application.
You will probably never be able to remember a user fully securely, so this should only be used if there is no sensitive data involved.
Passwords in any form shouldn't be stored in cookies. Cookies can easily be stolen.
Some browsers already support saving passwords. Why not let the user use that instead?
Storing a hash of the username in a cookie could provide this "remember me" functionality.
However for sensitive areas of the system you would need to know that a user entered the system on cached credentials so that you could offer a username/password prompt before you let them cause any real damage. This could be held as a session based flag.