Eric Bidelman, G Suite APIs team
February 2010
Introduction
Current web standards provide no reliable mechanism to facilitate the HTTP upload of large files. As a result, file uploads at Google and other sites have traditionally been limited to moderate sizes (e.g. 100 MB). For services such as the YouTube and Google Documents List APIs, which support large file uploads, this presents a major hurdle.
The Google Data resumable protocol directly addresses these issues by supporting resumable POST/PUT HTTP requests in HTTP/1.0. The protocol was modeled after the ResumableHttpRequestsProposal suggested by the Google Gears team.
This document describes how to incorporate Google Data's resumable upload feature into your applications. The examples below use the Google Documents List Data API. Note that other Google APIs that implement this protocol may have slightly different requirements, response codes, and so on. Please consult the service's documentation for the specifics.
The Resumable Protocol
Initiating a resumable upload request
To initiate a resumable upload session, send an HTTP POST request to the resumable-post link. This link is found at the feed level.
The DocList API's resumable-post link looks like:
<link rel="http://schemas.google.com/g/2005#resumable-create-media" type="application/atom+xml" href="https://docs.google.com/feeds/upload/create-session/default/private/full"/>
The body of your POST request should be empty or contain an Atom XML entry and must not include the actual file contents.
The example below creates a resumable request to upload a large PDF, and includes a title for the future document using the Slug header.
POST /feeds/upload/create-session/default/private/full HTTP/1.1
Host: docs.google.com
GData-Version: version_number
Authorization: authorization
Content-Length: 0
Slug: MyTitle
X-Upload-Content-Type: content_type
X-Upload-Content-Length: content_length

<empty body>
The X-Upload-Content-Type and X-Upload-Content-Length headers should be set to the MIME type and size of the file you will eventually upload. If the content length is unknown at the creation of the upload session, the X-Upload-Content-Length header can be omitted.
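The header rules above can be sketched as a small helper. This is a minimal illustration, not part of any client library: the function name `build_session_headers` and its parameters are hypothetical, and the caller is assumed to supply the real GData-Version and Authorization values.

```python
def build_session_headers(gdata_version, authorization, content_type,
                          content_length=None, slug=None, atom_body=b""):
    """Return headers for the POST that opens a resumable upload session.

    Hypothetical helper: the request body is either empty or Atom XML,
    never the file contents themselves.
    """
    headers = {
        "GData-Version": gdata_version,
        "Authorization": authorization,
        "Content-Length": str(len(atom_body)),
        "X-Upload-Content-Type": content_type,
    }
    if atom_body:
        # Atom metadata is being sent along with the session request.
        headers["Content-Type"] = "application/atom+xml"
    if slug is not None:
        headers["Slug"] = slug
    if content_length is not None:
        # Omitted when the final size is unknown at session creation.
        headers["X-Upload-Content-Length"] = str(content_length)
    return headers
```

Calling it with a size of `None` leaves out X-Upload-Content-Length, which matches the unknown-length case described above.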
Here is another example request that instead uploads a Word document. This time, Atom metadata is included and will be applied to the final document entry.
POST /feeds/upload/create-session/default/private/full?convert=false HTTP/1.1
Host: docs.google.com
GData-Version: version_number
Authorization: authorization
Content-Length: atom_metadata_content_length
Content-Type: application/atom+xml
X-Upload-Content-Type: application/msword
X-Upload-Content-Length: 7654321

<?xml version='1.0' encoding='UTF-8'?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:docs="http://schemas.google.com/docs/2007">
<category scheme="http://schemas.google.com/g/2005#kind"
term="http://schemas.google.com/docs/2007#document"/>
<title>MyTitle</title>
<docs:writersCanInvite value="false"/>
</entry>
The server's response to the initial POST is a unique upload URI in the Location header and an empty response body:
HTTP/1.1 200 OK
Location: <upload_uri>
The unique upload URI will be used to upload the file chunks.
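As a rough sketch, a client might pull the upload URI out of that response like this. The function name is hypothetical, and the response is assumed to be available as a status code plus a plain dictionary of headers.

```python
def upload_uri_from_response(status, headers):
    """Return the unique upload URI from the session-creation response.

    Hypothetical helper. HTTP header names are case-insensitive, so the
    lookup does not depend on how the server capitalizes "Location".
    """
    if status != 200:
        raise ValueError("session was not created (HTTP %d)" % status)
    for name, value in headers.items():
        if name.lower() == "location":
            return value
    raise ValueError("response carried no Location header")
```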
Note: The initial POST request does not create a new entry in the feed. This only happens when the entire upload operation has completed.
Note: A resumable session URI expires after one week.
Uploading a file
The resumable protocol allows, but doesn't require, content to be uploaded in 'chunks', because there are no inherent
restrictions in HTTP on request sizes. Your client is free to choose its chunk size or just upload the file as a whole.
The following example uses the unique upload URI to issue a resumable PUT, sending the first 100000 bytes of a 1234567-byte PDF file:
PUT upload_uri HTTP/1.1
Host: docs.google.com
Content-Length: 100000
Content-Range: bytes 0-99999/1234567

<bytes 0-99999>
If the size of the PDF file were unknown, this example would use Content-Range: bytes 0-99999/*. The Content-Range header is defined in the HTTP/1.1 specification.
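The Content-Range value for a chunk can be computed as in the following sketch. The helper name `content_range` is an assumption made for illustration; the byte positions are inclusive, as in the example request.

```python
def content_range(start, end, total=None):
    """Format a Content-Range value for one chunk.

    `start` and `end` are inclusive byte positions; `total` is the full
    file size, or None when the size is not known yet.
    """
    if start < 0 or end < start:
        raise ValueError("invalid byte range %r-%r" % (start, end))
    size = "*" if total is None else str(total)
    return "bytes %d-%d/%s" % (start, end, size)
```

For instance, `content_range(0, 99999, 1234567)` gives the value used in the request above, and leaving out `total` gives the `/*` form.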
The server responds with the current byte range that has been stored:
HTTP/1.1 308 Resume Incomplete
Content-Length: 0
Range: bytes=0-99999
Your client should continue to PUT each chunk of the file until the entire file has been uploaded. Until the upload is complete, the server will respond with an HTTP 308 Resume Incomplete and the byte range it knows about in the Range header. Clients must use the Range header to determine where to start the next chunk. Therefore, do not assume that the server received all bytes originally sent in the PUT request.
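One way to derive the next starting offset from the server's Range header is sketched below. The function name is hypothetical, and the sketch assumes the stored range always starts at byte 0, as it does in the examples in this document. A missing header is treated as "no bytes stored", which is the behavior described for status queries.

```python
import re

# Matches values such as "bytes=0-99999"; assumes the range starts at 0.
_STORED_RANGE = re.compile(r"^bytes=0-(\d+)$")


def next_offset(range_header):
    """Return the index of the first byte the server still needs."""
    if range_header is None:
        return 0  # no Range header: nothing has been committed yet
    match = _STORED_RANGE.match(range_header.strip())
    if match is None:
        raise ValueError("unexpected Range value: %r" % range_header)
    return int(match.group(1)) + 1
```

The next chunk then starts at `next_offset(...)`, regardless of how many bytes the previous PUT attempted to send.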
Note: The server may issue a new unique upload URI in the Location header during a chunk. Your client should check for an updated Location and use that URI to send the remaining chunks to the server.
When the upload is complete, the response will be the same as if the upload had been made using the API's non-resumable upload mechanism. That is to say, a 201 Created will be returned along with the <atom:entry>, as created by the server. Subsequent PUTs to the unique upload URI will return the same response as what was returned when the upload completed. After a period of time, the response will be 410 Gone or 404 Not Found.
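Putting the pieces of this section together, a chunked upload loop might look like the sketch below. It is an illustration under stated assumptions rather than a reference client: `upload_in_chunks`, the `send(uri, headers, body)` callable returning `(status, headers, body)`, and the in-memory `FakeServer` are all hypothetical, and header lookups use exact-case dictionary keys for brevity.

```python
def upload_in_chunks(data, upload_uri, send, chunk_size=100000):
    """PUT `data` chunk by chunk until the server answers 201 Created."""
    if not data:
        raise ValueError("nothing to upload")
    total = len(data)
    offset = 0
    while True:
        chunk = data[offset:offset + chunk_size]
        headers = {
            "Content-Length": str(len(chunk)),
            "Content-Range": "bytes %d-%d/%d"
                             % (offset, offset + len(chunk) - 1, total),
        }
        status, resp_headers, body = send(upload_uri, headers, chunk)
        if status == 201:
            return body  # the entry as created by the server
        if status != 308:
            raise RuntimeError("unexpected status %d" % status)
        # The server may hand out a new upload URI at any time.
        upload_uri = resp_headers.get("Location", upload_uri)
        # Trust the server's Range, not the number of bytes we sent.
        stored = resp_headers.get("Range")
        offset = int(stored.rsplit("-", 1)[1]) + 1 if stored else 0
        if offset >= total:
            raise RuntimeError("server holds every byte but did not finish")


class FakeServer:
    """In-memory stand-in that keeps at most `accept` bytes per request."""

    def __init__(self, total, accept):
        self.total = total
        self.accept = accept
        self.stored = b""
        self.uris = []

    def send(self, uri, headers, body):
        self.uris.append(uri)
        self.stored += body[:self.accept]
        if len(self.stored) == self.total:
            return 201, {}, "<atom:entry/>"
        resp = {"Range": "bytes=0-%d" % (len(self.stored) - 1)}
        if len(self.uris) == 1:
            resp["Location"] = "upload_uri_2"  # simulate a moved session
        return 308, resp, ""
```

Because the fake server deliberately stores fewer bytes than each PUT carries, the loop only finishes correctly if it restarts from the reported Range each time.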
Resuming an upload
If your request is terminated prior to receiving a response from the server or if you receive an HTTP 503 response from the server, you can query the current status of the upload by issuing an empty PUT request on the unique upload URI.
The client polls the server to determine which bytes it has received:
PUT upload_uri HTTP/1.1
Host: docs.google.com
Content-Length: 0
Content-Range: bytes */content_length
Use * as the content_length if the length is not known.
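The headers of such a status query are simple enough to build by hand. The helper below is a hypothetical sketch of that step.

```python
def status_query_headers(total=None):
    """Return headers for the empty PUT that asks how much is stored.

    `total` is the full file size, or None when it is not known.
    """
    size = "*" if total is None else str(total)
    return {
        "Content-Length": "0",
        "Content-Range": "bytes */%s" % size,
    }
```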
The server responds with the current byte range:
HTTP/1.1 308 Resume Incomplete
Content-Length: 0
Range: bytes=0-42
Note: If the server has not committed any bytes for the session, it will omit the Range header.
Note: The server may issue a new unique upload URI in the Location header during a chunk. Your client should check for an updated Location and use that URI to send the remaining chunks to the server.
Finally, the client resumes where the server left off:
PUT upload_uri HTTP/1.1
Host: docs.google.com
Content-Length: 57
Content-Range: bytes 43-99/100

<bytes 43-99>
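The arithmetic behind this resume request can be sketched as follows. `resume_request` is a hypothetical helper that sends everything the server has not yet stored in a single PUT; a real client could just as well continue in smaller chunks.

```python
def resume_request(data, range_header=None):
    """Return (headers, body) for the PUT that resumes an upload.

    `range_header` is the Range value from the status query, for example
    "bytes=0-42", or None if the server omitted it.
    """
    total = len(data)
    start = 0
    if range_header is not None:
        # The last stored byte is the number after the final hyphen.
        start = int(range_header.rsplit("-", 1)[1]) + 1
    if start >= total:
        raise ValueError("server already holds every byte")
    body = data[start:]
    headers = {
        "Content-Length": str(len(body)),
        "Content-Range": "bytes %d-%d/%d" % (start, total - 1, total),
    }
    return headers, body
```

With a 100-byte file and a reported range of `bytes=0-42`, this yields a 57-byte body covering bytes 43-99, matching the request above.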
Canceling an upload
If you want to cancel the upload and prevent any further action on it, issue a DELETE request on the unique upload URI.
DELETE upload_uri HTTP/1.1
Host: docs.google.com
Content-Length: 0
If successful, the server responds that the session is canceled, and responds with the same code for further PUTs or status query requests:
HTTP/1.1 499 Client Closed Request
Note: If an upload is abandoned without cancelation, it naturally expires one week after creation.
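The status codes mentioned so far suggest a simple dispatch on the response to a PUT against the upload URI. The mapping below is only a sketch: the action names are made up for illustration, and other APIs that implement this protocol may use different codes.

```python
def next_action(status):
    """Map the status of a PUT on the upload URI to a client action.

    The returned strings are hypothetical labels, not protocol terms.
    """
    if status == 201:
        return "done"             # upload finished, entry created
    if status == 308:
        return "send_next_chunk"  # continue from the reported Range
    if status == 503:
        return "query_status"     # ask the server what it has stored
    if status == 499:
        return "canceled"         # session was canceled with DELETE
    if status in (404, 410):
        return "expired"          # session no longer exists
    return "error"
```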
Updating an existing resource
Similar to initiating a resumable upload session, you can use the resumable upload protocol to replace an existing file's content. To start a resumable update request, send an HTTP PUT to the entry's link with rel='...#resumable-edit-media'. Each media entry will contain such a link if the API supports updating the resource's content.
As an example, a document entry in the DocList API will contain a link similar to:
<link rel="http://schemas.google.com/g/2005#resumable-edit-media" type="application/atom+xml" href="https://docs.google.com/feeds/upload/create-session/default/private/full/document%3A12345"/>
Thus, the initial request would be:
PUT /feeds/upload/create-session/default/private/full/document%3A12345 HTTP/1.1
Host: docs.google.com
GData-Version: version_number
Authorization: authorization
If-Match: ETag | *
Content-Length: 0
X-Upload-Content-Length: content_length
X-Upload-Content-Type: content_type

<empty body>
To update a resource's metadata and content at the same time, include Atom XML instead of an empty body. See the example in the Initiating a resumable upload request section.
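The update request differs from session creation mainly in the If-Match header. A hypothetical helper for building its headers might look like this; passing no ETag falls back to `*`, which is assumed here to mean the update is not conditional on a specific version of the entry.

```python
def build_update_headers(gdata_version, authorization, content_type,
                         content_length=None, etag=None, atom_body=b""):
    """Return headers for the PUT that opens a resumable update session.

    Hypothetical helper: `etag` is the entry's ETag, or None to send "*".
    """
    headers = {
        "GData-Version": gdata_version,
        "Authorization": authorization,
        "If-Match": etag if etag is not None else "*",
        "Content-Length": str(len(atom_body)),
        "X-Upload-Content-Type": content_type,
    }
    if atom_body:
        # Metadata and content are being updated together.
        headers["Content-Type"] = "application/atom+xml"
    if content_length is not None:
        headers["X-Upload-Content-Length"] = str(content_length)
    return headers
```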
When the server responds with the unique upload URI, send a PUT with your payload. Once you have the unique upload URI, the process for updating the file's content is the same as uploading a file.
This particular example will update the existing document's content in one shot:
PUT upload_uri HTTP/1.1
Host: docs.google.com
Content-Length: 1000
Content-Range: bytes 0-999/1000

<bytes 0-999>
Client library examples
Below are samples of uploading a movie file to Google Docs (using the resumable upload protocol) in the Google Data client libraries. Note that not all of the libraries support the resumable feature at this time.
Java

int MAX_CONCURRENT_UPLOADS = 10;
int PROGRESS_UPDATE_INTERVAL = 1000;
int DEFAULT_CHUNK_SIZE = 10485760;

DocsService client = new DocsService("yourCompany-yourAppName-v1");
client.setUserCredentials("user@gmail.com", "pa$$word");

// Create a listener
FileUploadProgressListener listener = new FileUploadProgressListener(); // See the sample for details on this class.

// Pool for handling concurrent upload tasks
ExecutorService executor = Executors.newFixedThreadPool(MAX_CONCURRENT_UPLOADS);

// Create {@link ResumableGDataFileUploader} for each file to upload
List<ResumableGDataFileUploader> uploaders = Lists.newArrayList();

File file = new File("test.mpg");
String contentType = DocumentListEntry.MediaType.fromFileName(file.getName()).getMimeType();
MediaFileSource mediaFile = new MediaFileSource(file, contentType);
URL createUploadUrl = new URL("https://docs.google.com/feeds/upload/create-session/default/private/full");
ResumableGDataFileUploader uploader = new ResumableGDataFileUploader(createUploadUrl, mediaFile, client, DEFAULT_CHUNK_SIZE, executor, listener, PROGRESS_UPDATE_INTERVAL);
uploaders.add(uploader);

listener.listenTo(uploaders); // attach the listener to list of uploaders

// Start the upload(s); the loop variable must not reuse the name "uploader"
for (ResumableGDataFileUploader u : uploaders) {
  u.start();
}

// wait for uploads to complete
while (!listener.isDone()) {
  try {
    Thread.sleep(100);
  } catch (InterruptedException ie) {
    listener.printResults();
    throw ie; // rethrow
  }
}
.NET

// Chunk size in MB
int CHUNK_SIZE = 1;

ClientLoginAuthenticator cla = new ClientLoginAuthenticator(
    "yourCompany-yourAppName-v1", ServiceNames.Documents, "user@gmail.com", "pa$$word");

// Set up resumable uploader and notifications
ResumableUploader ru = new ResumableUploader(CHUNK_SIZE);
ru.AsyncOperationCompleted += new AsyncOperationCompletedEventHandler(this.OnDone);
ru.AsyncOperationProgress += new AsyncOperationProgressEventHandler(this.OnProgress);

// Set metadata for our upload.
Document entry = new Document();
entry.Title = "My Video";
entry.MediaSource = new MediaFileSource("c:\\test.mpg", "video/mpeg");

// Add the upload uri to document entry.
Uri createUploadUrl = new Uri("https://docs.google.com/feeds/upload/create-session/default/private/full");
AtomLink link = new AtomLink(createUploadUrl.AbsoluteUri);
link.Rel = ResumableUploader.CreateMediaRelation;
entry.DocumentEntry.Links.Add(link);

ru.InsertAsync(cla, entry.DocumentEntry, userObject);
Objective-C

- (void)uploadAFile {
  // expand "~" explicitly; it is not resolved automatically when reading the file
  NSString *filePath = [@"~/test.mpg" stringByExpandingTildeInPath];
  NSString *fileName = [filePath lastPathComponent];

  // get the file's data
  NSData *data = [NSData dataWithContentsOfMappedFile:filePath];

  // create an entry to upload
  GDataEntryDocBase *newEntry = [GDataEntryStandardDoc documentEntry];
  [newEntry setTitleWithString:fileName];

  [newEntry setUploadData:data];
  [newEntry setUploadMIMEType:@"video/mpeg"];
  [newEntry setUploadSlug:fileName];

  // to upload, we need the entry, our service object, the upload URL,
  // and the callback for when upload has finished
  GDataServiceGoogleDocs *service = [self docsService];
  NSURL *uploadURL = [GDataServiceGoogleDocs docsUploadURL];
  SEL finishedSel = @selector(uploadTicket:finishedWithEntry:error:);

  // now start the upload
  GDataServiceTicket *ticket = [service fetchEntryByInsertingEntry:newEntry
                                                        forFeedURL:uploadURL
                                                          delegate:self
                                                 didFinishSelector:finishedSel];

  // progress monitoring is done by specifying a callback, like this
  SEL progressSel = @selector(ticket:hasDeliveredByteCount:ofTotalByteCount:);
  [ticket setUploadProgressSelector:progressSel];
}

// callback for when uploading has finished
- (void)uploadTicket:(GDataServiceTicket *)ticket
   finishedWithEntry:(GDataEntryDocBase *)entry
               error:(NSError *)error {
  if (error == nil) {
    // upload succeeded
  }
}

- (void)pauseOrResumeUploadForTicket:(GDataServiceTicket *)ticket {
  if ([ticket isUploadPaused]) {
    [ticket resumeUpload];
  } else {
    [ticket pauseUpload];
  }
}
Python

import os.path
import atom.data
import gdata.client
import gdata.docs.client
import gdata.docs.data

CHUNK_SIZE = 10485760

client = gdata.docs.client.DocsClient(source='yourCompany-yourAppName-v1')
client.ClientLogin('user@gmail.com', 'pa$$word', client.source)

# Open in binary mode so the video bytes are read unmodified.
f = open('test.mpg', 'rb')
file_size = os.path.getsize(f.name)

uploader = gdata.client.ResumableUploader(
    client, f, 'video/mpeg', file_size, chunk_size=CHUNK_SIZE,
    desired_class=gdata.docs.data.DocsEntry)

# Set metadata for our upload.
entry = gdata.docs.data.DocsEntry(title=atom.data.Title(text='My Video'))
new_entry = uploader.UploadFile('/feeds/upload/create-session/default/private/full', entry=entry)
print('Document uploaded: ' + new_entry.title.text)
print('Quota used: %s' % new_entry.quota_bytes_used.text)
For complete samples and source code reference, see the following resources:
- Java library sample app and source
- Objective-C library sample app
- .NET library source