《HTTP Programming Recipes for C# Bots》
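Chapter 1
Whether to use GET or POST depends on how much data is sent to the server. GET carries only a small amount of data, while POST places almost no limit on the amount of data sent.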
It is important to note that only one physical file is transferred per HTTP request.
Call sequence:
• Step 1: Obtain an HttpWebRequest object.
• Step 2: Set any HTTP request headers.
• Step 3: POST data, if this is a POST request.
• Step 4: Obtain an HttpWebResponse object.
• Step 5: Read HTTP response headers.
• Step 6: Read HTTP response data.
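The call sequence above can be sketched in C# roughly as follows. This is a minimal illustration, not the book's own listing: the class name FetchSketch, the method names, and the "ExampleBot/0.1" user agent are hypothetical, and the sketch assumes the classic System.Net.HttpWebRequest API (newer .NET versions mark it obsolete in favor of HttpClient, but it still compiles with a warning).

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class FetchSketch
{
    // Steps 1 and 2: obtain the request object and set request headers.
    // Nothing is sent over the network in this method.
    public static HttpWebRequest BuildRequest(string url, string postData = null)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "ExampleBot/0.1";             // hypothetical bot name
        request.Headers.Add("Accept-Language", "en-us");
        if (postData != null)
        {
            request.Method = "POST";                      // the default method is GET
            request.ContentType = "application/x-www-form-urlencoded";
        }
        return request;
    }

    // Steps 3 to 6: send POST data (if any), obtain the response,
    // print the response headers, and return the response body.
    public static string Send(HttpWebRequest request, string postData = null)
    {
        if (postData != null)
        {
            byte[] body = Encoding.UTF8.GetBytes(postData);
            using (Stream output = request.GetRequestStream())
            {
                output.Write(body, 0, body.Length);       // Step 3
            }
        }

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse()) // Step 4
        {
            foreach (string name in response.Headers.AllKeys)                     // Step 5
            {
                Console.WriteLine(name + ": " + response.Headers[name]);
            }
            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();                                        // Step 6
            }
        }
    }
}
```

A GET would then look like FetchSketch.Send(FetchSketch.BuildRequest(url)). Building and sending are split here only so that the header setup can be inspected without touching the network.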
Listing 1.3: Typical Request Headers
GET /1/1/typical.php HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://www.httprecipes.com/1/1/
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Host: www.httprecipes.com
Connection: Keep-Alive
There are really two parts to the headers: the first line and then the rest of the header lines. The first line, which begins with the request type, is the most important line in the header block, and it has a slightly different format than the other header lines. The request type can be GET, POST, HEAD, or one of the other less frequently used request types. Browsers will always use GET or POST. Following the request type is the file that is being requested. In the above request, the following URL is being requested:
http://www.httprecipes.com/1/1/typical.php
The above URL does not appear in the request header exactly as written. The “Host” header line names the web server that contains the file. The request line shows the remainder of the URL, which in this case is /1/1/typical.php.
Finally, the third thing that the first line provides is the version of the HTTP protocol being used. As of the writing of this book there are only two versions in widespread use:
• HTTP/1.1
• HTTP/1.0
This book only deals with HTTP 1.1. Because this book is about writing programs to connect to web servers, it will be assumed that HTTP 1.1 is being used, which is what C# uses when the C# HTTP classes are used.
The lines after the first line make up the actual HTTP headers. Their format is colon delimited. The header name is to the left of the colon and the header value is to the right. It is valid to have two of the same header names in the same request. Two headers of the same name are used when cookies are specified. Cookies will be covered in Chapter 8, “Handling Sessions and Cookies.”
The headers give a variety of information. Examining the headers shows the type of browser being used, the operating system, and other information. In the headers listed above in Listing 1.3, the Internet Explorer 7 browser was being used on the Windows XP platform.
The headers finally terminate with a blank line. If the request had been a POST, any posted data would follow the blank line. Even when there is no posted data, as is the case with a GET, the blank line is still required.
A web server should respond to every HTTP request from a web browser. The web server’s response is discussed in the next section.
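To make the request format concrete, here is a small sketch that splits a raw request header block into its request line (request type, path, HTTP version) and its colon-delimited header lines, stopping at the blank line. It then joins the Host header with the path to recover the full URL. The names RequestHeaderSketch, ParsedRequest, Parse, and RebuildUrl are made up for illustration, and the sketch assumes a plain "http://" scheme; it is not a complete HTTP parser.

```csharp
using System;
using System.Collections.Generic;

public static class RequestHeaderSketch
{
    public class ParsedRequest
    {
        public string Method;   // request type, e.g. GET
        public string Path;     // e.g. /1/1/typical.php
        public string Version;  // e.g. HTTP/1.1
        // A list rather than a dictionary, because the same header name may appear twice.
        public List<KeyValuePair<string, string>> Headers =
            new List<KeyValuePair<string, string>>();
    }

    public static ParsedRequest Parse(string raw)
    {
        string[] lines = raw.Replace("\r\n", "\n").Split('\n');

        // The first line is space delimited: request type, path, version.
        string[] first = lines[0].Split(' ');
        ParsedRequest result = new ParsedRequest
        {
            Method = first[0],
            Path = first[1],
            Version = first[2]
        };

        for (int i = 1; i < lines.Length; i++)
        {
            if (lines[i].Length == 0) break;      // a blank line terminates the headers
            int colon = lines[i].IndexOf(':');    // split at the FIRST colon only
            if (colon < 0) continue;              // skip malformed lines in this sketch
            string name = lines[i].Substring(0, colon).Trim();
            string value = lines[i].Substring(colon + 1).Trim();
            result.Headers.Add(new KeyValuePair<string, string>(name, value));
        }
        return result;
    }

    // Assumption: the request was made over plain HTTP.
    public static string RebuildUrl(ParsedRequest request)
    {
        foreach (KeyValuePair<string, string> header in request.Headers)
        {
            if (string.Equals(header.Key, "Host", StringComparison.OrdinalIgnoreCase))
            {
                return "http://" + header.Value + request.Path;
            }
        }
        return request.Path;  // no Host header: only the path is known
    }
}
```

Splitting at the first colon matters because header values such as the Referer URL contain colons of their own.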
HTTP Response Headers
When the web server responds to an HTTP request, HTTP response header lines are sent. The HTTP response headers look very similar to the HTTP request headers. Listing 1.4 shows the contents of typical HTTP response headers.
Listing 1.4: Typical Response Headers
HTTP/1.1 200 OK
Date: Sun, 02 Jul 2006 22:28:58 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Sat, 29 Jan 2005 04:13:19 GMT
ETag: "824319-509-c6d5c0"
Accept-Ranges: bytes
Content-Length: 1289
Connection: close
Content-Type: text/html
As can be seen from the above listing, at first glance, response headers look nearly the same as request headers. However, look at the first line. Although the first line is space delimited as in the request, the information is different. The first line of HTTP response headers contains the HTTP version and status information about the response. The HTTP version is reported as 1.1, and the status code, 200, means “OK,” no error. Also, this is where the famous error code 404 (page not found) comes from. Status codes can be grouped according to the digit in their hundreds position:
• 1xx: Informational - Request received, continuing process
• 2xx: Success - The action was successfully received, understood, and accepted
• 3xx: Redirection - Further action must be taken in order to complete the request
• 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
• 5xx: Server Error - The server failed to fulfill an apparently valid request
Immediately following the headers will be a blank line, just as was the case with HTTP requests. Following the blank line delimiter will be the data that was requested. It will be of the length specified in the Content-Length header. The Content-Length header in Listing 1.4 indicates a length of 1289 bytes. For a list of HTTP codes, refer to Appendix E, “HTTP Response Codes.”
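The response format can be illustrated with a short sketch that reads the status code from the first line, maps it to its class by the hundreds digit, and uses the blank line plus Content-Length to pick out the body. The class and method names (ResponseSketch, StatusCode, StatusClass, Body) are hypothetical. The sketch also assumes the whole response is already in a string of single-byte (ASCII) characters, so that character counts and byte counts agree; real code would work on bytes.

```csharp
using System;

public static class ResponseSketch
{
    // "HTTP/1.1 200 OK" -> 200. The line is space delimited: version, code, reason.
    public static int StatusCode(string statusLine)
    {
        string[] parts = statusLine.Split(new[] { ' ' }, 3);
        return int.Parse(parts[1]);
    }

    // Groups a status code by the digit in its hundreds position.
    public static string StatusClass(int code)
    {
        switch (code / 100)
        {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default: return "Unknown";
        }
    }

    // Returns the data after the blank line, limited to Content-Length characters
    // (assumption: one character per byte).
    public static string Body(string rawResponse)
    {
        int split = rawResponse.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        if (split < 0) return "";                       // no blank line, so no body
        string head = rawResponse.Substring(0, split);
        string rest = rawResponse.Substring(split + 4);

        int length = rest.Length;                       // fallback if no Content-Length
        foreach (string line in head.Split(new[] { "\r\n" }, StringSplitOptions.None))
        {
            if (line.StartsWith("Content-Length:", StringComparison.OrdinalIgnoreCase))
            {
                length = int.Parse(line.Substring("Content-Length:".Length).Trim());
            }
        }
        return rest.Substring(0, Math.Min(length, rest.Length));
    }
}
```

When HttpWebResponse is used, this work is already done by the library; the sketch only shows what the raw text of a response contains.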