Cookies solve this problem. A cookie is a small piece of information, often no more than a short session identifier, that the HTTP server sends to the browser when the browser connects for the first time. Thereafter, the browser returns a copy of the cookie to the server each time it connects. Typically the server uses the cookie to remember the user and to maintain the illusion of a "session" that spans multiple pages. Because cookies are not part of the standard HTTP specification, only some browsers support them: currently Microsoft Internet Explorer 3.0 and higher, and Netscape Navigator 2.0 and higher. The server and/or its CGI scripts must also know about cookies in order to take advantage of them.
Cookies contain attributes that tell the browser what servers to send them to. The "domain" attribute tells the browser which host names the cookie should be returned to, and the "path" attribute indicates what URL paths within that domain are valid. For instance, a domain of "megacorp.com" and a path of "/users" tells the browser to return the cookie to hosts with names like "ftp.megacorp.com" and "www.megacorp.com", and to do so only when requesting URLs that start with the path "/users". An important security measure prevents the cookie's domain from being set to top-level domains like ".com". This prevents someone from creating a promiscuous cookie that will be returned to any server.
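The matching rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the logic of any particular browser: the function name cookie_applies is hypothetical, and the "no top-level domains" check is reduced to requiring at least one embedded dot in the domain.

```python
def cookie_applies(cookie_domain: str, cookie_path: str,
                   host: str, url_path: str) -> bool:
    """Return True if a cookie with this domain and path would be sent."""
    domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    # Refuse bare top-level domains such as ".com": a usable domain
    # must contain at least one embedded dot.
    if "." not in domain:
        return False
    # The host must be the domain itself or a name ending in ".domain".
    domain_ok = host == domain or host.endswith("." + domain)
    # The requested URL path must start with the cookie's path.
    path_ok = url_path.startswith(cookie_path)
    return domain_ok and path_ok
```

With a domain of "megacorp.com" and a path of "/users", this function accepts a request to www.megacorp.com for /users/fred and rejects a request to the same host for /index.html.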
However, cookies can be used for more controversial purposes. Each access your browser makes to a Web site leaves some information about you behind, creating a gossamer trail across the Internet. Among the tidbits of data left along this trail are the name and IP address of your computer, the brand of browser you're using, the operating system you're running, the URL of the Web page you accessed, and the URL of the page you were last viewing. Without cookies, it would be nearly impossible for anyone to follow this trail systematically to learn much about your Web browsing habits. They would have to reconstruct your path by correlating hundreds or thousands of individual server logs. With cookies, the situation changes considerably.
The DoubleClick Network is a system created by the DoubleClick Corporation to build profiles of individuals using the World Wide Web and to present them with advertising banners customized to their interests. DoubleClick's primary customers are Web sites looking to advertise their services. Each member of the DoubleClick Network becomes a host for the advertising of other members of the network. When a Web site joins DoubleClick, it creates advertisements for its services and submits them to DoubleClick's server. The Web site then modifies its HTML pages to include an <IMG> graphic that points to DoubleClick. When a user goes to view one of these modified HTML pages, her browser makes a call to DoubleClick's server to retrieve the graphic. The server chooses one of its members' advertisements and returns it to the browser. If the user reloads the page, a different advertisement appears. If the user clicks on the graphic, her browser jumps to the advertised site. Currently many hundreds of sites belong to DoubleClick.
From the user's point of view DoubleClick's graphics appear no different from any other Web advertisement, and there's no visible indication of anything special about the graphic. However, there is an important difference. When a user first connects to the DoubleClick server to retrieve a graphic, the server assigns the browser a cookie that contains a unique identification number. From that time forward whenever the user connects to any Web site that subscribes to the DoubleClick Network, her browser returns the identification number to DoubleClick's server, allowing the server to recognize her. Over a period of time DoubleClick compiles a list of which member sites the user has visited and revisited, using this information to create a profile of the user's tastes and interests. With this profile in hand the DoubleClick server can select advertising that is likely to be of interest to the user. It can also use this information to compile valuable feedback for its member Web sites, such as providing them with audience profiles and rating the effectiveness of the advertisements.
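The mechanism can be illustrated with a toy model. The sketch below is not DoubleClick's software; the class AdServer, its method names, and the example site names are all hypothetical. It only shows how a single identification number, returned from every member site, lets one server accumulate a browsing history.

```python
import itertools
from collections import defaultdict


class AdServer:
    """Toy ad server that recognizes browsers by an ID cookie."""

    def __init__(self):
        self._next_id = itertools.count(1)
        # Maps each cookie ID to the member sites where it was seen.
        self.profiles = defaultdict(list)

    def serve_banner(self, member_site, cookie_id=None):
        # First contact: no cookie came back, so issue a new unique ID.
        if cookie_id is None:
            cookie_id = next(self._next_id)
        # Record which member site embedded the banner.
        self.profiles[cookie_id].append(member_site)
        # The browser stores this ID and returns it on later requests.
        return cookie_id


server = AdServer()
uid = server.serve_banner("news.example")          # first visit, ID assigned
uid = server.serve_banner("sports.example", uid)   # ID returned by browser
```

After these two requests, server.profiles[uid] lists both member sites, even though the user never visited the ad server directly.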
Although names and e-mail addresses are not part of the
information that DoubleClick records, other information that the browser leaves
behind is sufficient, in many cases, to identify the user. See Server
Logs and Privacy for more information. For this reason many people are
uncomfortable with DoubleClick's use of cookies. To find out whether you have
been tracked by DoubleClick, examine your browser's cookies file. On Unix
systems using Netscape, the cookies file can be found in your home directory in
the file ~/.netscape/cookies. If a line like this appears:
    ad.doubleclick.net FALSE / FALSE 942195440 IAA d2bbd5
then you are carrying a DoubleClick cookie.
Windows users will find the equivalent information in the file cookies.txt,
located in their C:\Programs\Netscape\Navigator directory, while
Macintosh users should look in their System Folder under Preferences:Netscape.
Users of Microsoft Internet Explorer should examine the files located in C:\Windows\Cookies.
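Checking the file by hand works, but the search is easy to script. The following sketch assumes the Netscape cookies file layout of seven whitespace-separated fields per line (domain, subdomain flag, path, secure flag, expiration time, name, value); the names CookieLine, parse_cookie_line and has_doubleclick_cookie are hypothetical, and cookies whose values contain spaces or are empty are not handled.

```python
from typing import Iterable, NamedTuple


class CookieLine(NamedTuple):
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int      # seconds since the Unix epoch
    name: str
    value: str


def parse_cookie_line(line: str) -> CookieLine:
    """Split one line of a Netscape-style cookies file into its fields."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    domain, flag, path, secure, expires, name, value = fields
    return CookieLine(domain, flag == "TRUE", path,
                      secure == "TRUE", int(expires), name, value)


def has_doubleclick_cookie(lines: Iterable[str]) -> bool:
    """Return True if any parsable line belongs to doubleclick.net."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        try:
            cookie = parse_cookie_line(line)
        except ValueError:
            continue  # skip lines this simple parser cannot handle
        if cookie.domain.endswith("doubleclick.net"):
            return True
    return False
```

To check a real file, pass an open file object, for example has_doubleclick_cookie(open(path)), where path is the location of your own cookies file.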
Current versions of both Netscape Navigator and Internet Explorer offer the option of alerting you whenever a server attempts to give your browser a cookie. If you turn this alert on, you will have the option of refusing cookies. You should also manually delete any cookies that you have already collected. The easiest way to do this is to remove the cookies file entirely.
The drawback to this scheme is that many servers will offer the same cookie repeatedly even after you refuse to accept the first one. This rapidly leads to a nuisance situation. Before you panic over cookies, it's worth remembering that the vast majority of cookies are benign attempts to improve your Web browsing experience, not intrusions on your privacy. Netscape Navigator 4.0 provides a new feature that allows you to refuse cookies that are issued from sites other than the main page you are viewing. This foils most DoubleClick schemes without interfering with the more benign cookies. To access this option, select Edit->Preferences->Advanced, and select the appropriate radio button from the cookies section.
Some people might want to allow transient cookies (ones active only during a browsing session) but forbid persistent ones (ones that store user identification information over an extended period). On Unix systems, you can do this easily by replacing the cookies file with a symbolic link to the Unix "bit bucket" device, /dev/null. Users of other operating systems may have to invest in third-party products that intercept cookies. A representative listing of such products follows:
However, unless this type of system is implemented carefully, it may be vulnerable to exploitation by unscrupulous third parties. For instance, an eavesdropper armed with a packet sniffer could simply intercept the cookie as it passes from your browser to the server, using it to obtain free access to the site. Because browsers use the domain name system (DNS) to determine what cookies belong to a server, it is possible to trick a browser into sending a cookie to a rogue server by temporarily subverting the DNS. If the cookie is persistent, of course, it is also vulnerable to being stolen from the user's cookie database file.
Now consider a transaction processing system that uses cookies as session IDs to preserve state during the steps of a multi-part transaction. Examples of such systems include a system that allows authorized employees to update records in a corporate database, an on-line ordering system, or a bank transaction system. If care is not taken to protect the cookie from interception, it is possible for an interloper to hijack the session and use it to make unauthorized transactions.
Designers of systems that use cookies for authentication and state-preservation should be alert to the possibility of cookie interception. Cookies should always contain as little private information as possible. In particular, it is never appropriate for cookies to contain plaintext user names and passwords. In ISP environments where many users share the same Web server, it is important to use as specific a path as possible in the cookie. For instance, if the program that processes cookies lives at URL http://bigISP.com/users/fred/order.cgi, then the developer should arrange for the cookie path to be set to /users/fred/order.cgi rather than a more general path like /.
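As an illustration of the path advice, here is a minimal sketch using Python's standard http.cookies module. The cookie name "session", the value, and the helper name build_cookie_header are assumptions made for the example; only the path comes from the text.

```python
from http.cookies import SimpleCookie


def build_cookie_header(name: str, value: str, path: str) -> str:
    """Return a Set-Cookie header restricted to one URL path."""
    cookie = SimpleCookie()
    cookie[name] = value            # an opaque ID, never a password
    cookie[name]["path"] = path     # as specific as possible
    return cookie.output()


header = build_cookie_header("session", "d2bbd5", "/users/fred/order.cgi")
```

The resulting header names the script's own path, so the browser will not return the cookie to other users' scripts on the same host.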
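If possible, cookies should contain information that allows the system to verify that the person using them is authorized to do so. A popular scheme is to include the following information in cookies: the session ID, the issue date, an expiration time, the IP address the cookie was issued to, and a message authentication code (MAC).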
The MAC code is there to ensure that none of the fields of the cookie have been tampered with. There are many ways to compute a MAC, most of which rely on one-way hash algorithms such as MD5 or SHA to create a unique fingerprint for the data within the cookie. Here's a simple but relatively secure technique that uses MD5:
MAC = MD5("secret key" +
          MD5("session ID" + "issue date" +
              "expiration time" + "IP address" +
              "secret key")
         )
This algorithm first performs a string concatenation of all the data fields in
the cookie, then adds to it a secret string known only to the Web server. The
whole is then passed to the MD5 function to create a unique hash. This value is
again concatenated with the secret key, and the whole thing is rehashed. (The
second round of MD5 hashing is necessary in order to avoid an attack in which
additional data is appended to the end of the cookie and a new hash recalculated
by the attacker.)
This hash value is now incorporated into the cookie data. Later, when the cookie is returned to the server, the software should verify that the cookie hasn't expired and is being returned by the proper IP address. Then it should regenerate the MAC from the data fields, and compare that to the MAC in the cookie. If they match, there's little chance that the cookie has been tampered with.
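The scheme above might be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the names SECRET_KEY, compute_mac and verify_cookie are hypothetical, timestamps are taken to be integer seconds, and the fields are concatenated without separators exactly as in the formula (a production version would likely delimit them to avoid ambiguity).

```python
import hashlib
import hmac
import time

SECRET_KEY = b"example-secret"  # hypothetical; known only to the server


def compute_mac(session_id, issued, expires, ip, key=SECRET_KEY):
    """Two-round MD5 MAC: MD5(key + MD5(fields + key))."""
    fields = (session_id + str(issued) + str(expires) + ip).encode()
    inner = hashlib.md5(fields + key).hexdigest().encode()
    return hashlib.md5(key + inner).hexdigest()


def verify_cookie(session_id, issued, expires, ip, mac,
                  client_ip, now=None, key=SECRET_KEY):
    """Check expiry, the client's IP address, then the MAC itself."""
    now = time.time() if now is None else now
    if now >= expires:
        return False            # cookie has expired
    if client_ip != ip:
        return False            # returned from a different address
    expected = compute_mac(session_id, issued, expires, ip, key)
    # compare_digest compares in constant time, which avoids leaking
    # how many leading characters of the MAC were correct.
    return hmac.compare_digest(expected.encode(), mac.encode())
```

Changing any field, or presenting the cookie from another address or after its expiration time, causes verify_cookie to return False.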
Perl programmers can take advantage of the HMAC algorithm, a slightly more sophisticated technique that has been incorporated into a module by Gisle Aas. It is available at CPAN in the module Digest::HMAC.
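HMAC is also available outside Perl. As a rough equivalent, the sketch below uses the hmac module from Python's standard library with MD5 as the underlying hash. The function name hmac_cookie_mac and the "|" field separator are assumptions for illustration, and the field values are assumed not to contain "|" themselves.

```python
import hashlib
import hmac


def hmac_cookie_mac(fields, key: bytes) -> str:
    """Return an HMAC-MD5 hex digest over the cookie's data fields."""
    # Join the fields with a separator so that field boundaries are
    # part of the authenticated message.
    message = "|".join(fields).encode()
    return hmac.new(key, message, hashlib.md5).hexdigest()


tag = hmac_cookie_mac(["abc123", "1000", "2000", "10.0.0.1"], b"example-secret")
```

The server stores tag in the cookie and later recomputes it from the returned fields, comparing the two with hmac.compare_digest.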
An alternative method is to encrypt the entire cookie with an encryption algorithm such as DES, IDEA or RC4. For more information on using encryption and hash algorithms, see the cryptography references at the end of this document.
For extremely sensitive applications, developers should probably encrypt the entire channel between browser and server using a non-crippled version of SSL. The cookie will be encrypted along with the rest of the data stream in such a way that network eavesdroppers cannot intercept the cookie without first cracking the encryption. To avoid the cookie being inadvertently disclosed across a non-secure channel, developers should set the "secure" attribute so that the browser only transmits the cookie when SSL is in effect.
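Setting the "secure" attribute is a one-line change when the cookie is generated. A minimal sketch with Python's standard http.cookies module follows; the cookie name, the value, and the helper name secure_cookie_header are hypothetical.

```python
from http.cookies import SimpleCookie


def secure_cookie_header(name: str, value: str) -> str:
    """Return a Set-Cookie header marked for SSL-only transmission."""
    cookie = SimpleCookie()
    cookie[name] = value
    # With this flag set, the browser sends the cookie back only
    # over an SSL-protected connection.
    cookie[name]["secure"] = True
    return cookie.output()


secure_header = secure_cookie_header("session", "d2bbd5")
```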