5.1 Downloading the Contents of a URL

Example 5-1 shows how you can download the network resource referred to by a URL using the URL class. This class serves mainly to represent and parse URLs, but it also has several important methods for downloading them. The highest-level of these methods is getContent( ), which downloads the content of a URL, parses it, and returns the parsed object. This method relies on special content handlers having been installed to perform the parsing. By default, the Java SDK has content handlers for plain text and for several common image formats. When you call the getContent( ) method of a URL object that refers to a plain-text file or to a GIF or JPEG image, the method returns an object representing the text or image, nominally a String or Image, although the exact class depends on the installed content handler. More commonly, when getContent( ) doesn't know how to handle the data type, it simply returns an InputStream so that you can read and parse the data yourself.
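Because getContent( ) is declared to return a plain Object, callers normally check the returned type before using it. The following is a minimal sketch of that check, not part of Example 5-1; the class name GetContentSketch and the helper describe( ) are hypothetical, and the code assumes Java 7 or later for the try-with-resources statement.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;

public class GetContentSketch {
    /** Returns a short description of whatever getContent() produced. */
    static String describe(URL url) throws IOException {
        Object content = url.getContent();   // Type depends on the content handler
        if (content instanceof InputStream) {
            // No higher-level object was produced: read the raw bytes ourselves
            try (InputStream in = (InputStream) content) {
                byte[] buffer = new byte[4096];
                int total = 0;
                int n;
                while ((n = in.read(buffer)) != -1) total += n;
                return "stream of " + total + " bytes";
            }
        }
        // Some handler parsed the data into an object of its own choosing
        return "object of type " + content.getClass().getName();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe(URI.create(args[0]).toURL()));
    }
}
```

Run it with a single URL argument; which branch is taken depends on the content handlers available in your Java runtime.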

Example 5-1 doesn't use the getContent( ) method. Instead, it calls openStream( ), which returns an InputStream from which the contents of the URL can be read. This InputStream is connected, through the network, to the remote resource named by the URL, but the URL class hides all the details of setting up this connection. (In fact, the connection is set up by a protocol handler class; the Java SDK has default handlers for the most common network protocols, including http:, ftp:, mailto:, and file:.)
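To make the hidden connection step visible, note that the URL class documents openStream( ) as shorthand for openConnection( ).getInputStream( ). The sketch below reads the same URL both ways; the class name OpenStreamSketch and its helper methods are hypothetical, and Java 7 or later is assumed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;

public class OpenStreamSketch {
    /** Reads a stream to its end and returns the bytes. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) out.write(buffer, 0, n);
        return out.toByteArray();
    }

    /** The one-step form used by Example 5-1. */
    static byte[] viaOpenStream(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return drain(in);
        }
    }

    /** The two-step form: obtain the connection object, then its stream. */
    static byte[] viaConnection(URL url) throws IOException {
        URLConnection conn = url.openConnection();   // Chosen by the protocol handler
        try (InputStream in = conn.getInputStream()) {
            return drain(in);
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create(args[0]).toURL();
        System.out.println(viaOpenStream(url).length + " bytes via openStream()");
        System.out.println(viaConnection(url).length + " bytes via openConnection()");
    }
}
```

The two-step form is useful when you also need the URLConnection itself, for example to inspect the content type before reading.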

Example 5-1 is a simple standalone program that downloads the contents of a specified URL and either saves them in a file or writes them to the console. You'll note that most of this program looks like it belongs in Chapter 3. In fact, as we'll see in this and other examples in this chapter, almost all networking involves the stream-based I/O techniques we learned about in that chapter.

Example 5-1. GetURL.java
package je3.net;
import java.io.*;
import java.net.*;

/**
 * This simple program uses the URL class and its openStream( ) method to
 * download the contents of a URL and copy them to a file or to the console.
 */
public class GetURL {
    public static void main(String[  ] args) {
        InputStream in = null;   
        OutputStream out = null;
        try {
            // Check the arguments
            if ((args.length != 1) && (args.length != 2))
                throw new IllegalArgumentException("Wrong number of args");
            // Set up the streams
            URL url = new URL(args[0]);   // Create the URL
            in = url.openStream( );        // Open a stream to it
            if (args.length == 2)         // Get an appropriate output stream
                out = new FileOutputStream(args[1]);
            else out = System.out;
            // Now copy bytes from the URL to the output stream
            byte[  ] buffer = new byte[4096];
            int bytes_read;
            while((bytes_read = in.read(buffer)) != -1)
                out.write(buffer, 0, bytes_read);
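        }
        // On exceptions, print error message and usage message.
        catch (Exception e) {
            System.err.println(e);
            System.err.println("Usage: java GetURL <URL> [<filename>]");
        }
        finally {  // Always close the streams, no matter what.
            // Close each stream separately so a failure on one doesn't skip the other
            try { if (in != null) in.close( ); } catch (Exception e) {  }
            try { if (out != null) out.close( ); } catch (Exception e) {  }
        }
    }
}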
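On Java 7 and later, the same download can be written more compactly. The following sketch swaps in two techniques that Example 5-1 does not use: a try-with-resources statement in place of the finally block, and java.nio.file.Files.copy( ) in place of the hand-written copy loop. The class name GetURLSketch and the download( ) helper are hypothetical, and unlike Example 5-1 this variant always writes to a file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class GetURLSketch {
    /** Copies the contents of the URL to the target file; returns the byte count. */
    static long download(URL url, Path target) throws IOException {
        // The stream is closed automatically, even if the copy fails
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: java GetURLSketch <URL> <filename>");
            return;
        }
        try {
            long n = download(URI.create(args[0]).toURL(), Paths.get(args[1]));
            System.out.println("Copied " + n + " bytes");
        } catch (Exception e) {
            System.err.println(e);   // Malformed URL, network failure, or file error
        }
    }
}
```

Because REPLACE_EXISTING is specified, an existing target file is overwritten, which matches the behavior of the FileOutputStream used in Example 5-1.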