WAS StartCopyFromBlob operation and Transaction Compensation

The latest Windows Azure SDKs, v1.7.1 and v1.8, have a nice feature called “StartCopyFromBlob” that lets us instruct the Windows Azure data center to copy a blob across storage accounts.  Before this, we had to download the blob content in chunks and then upload it to the destination storage account.  “StartCopyFromBlob” is therefore more efficient in terms of both cost and time.

The notable difference in version 2012-02-12 is that the copy operation is now asynchronous.  Once you make a copy request to the Windows Azure Storage service, it returns a copy ID (a GUID string), a copy state, and HTTP status code 202 (Accepted).  This means only that your request has been scheduled.  If you check the copy state immediately after this call, it will most probably be “pending”.

StartCopyFromBlob – A Transaction Compensation Operation

Extra care is required when using this API, since it is a real-world example of a service operation that needs transaction compensation.  After making the copy request, you need to verify the actual status of the copy operation at a later point in time.  That point may be anywhere from a few seconds to 2 weeks later, depending on constraints such as source blob size, permissions, connectivity, etc.
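Checking the status "at a later point in time" usually comes down to a polling loop with a growing delay.  The following is only a minimal sketch in Python, not SDK code: wait_for_copy and its get_status callback are hypothetical names, and the callback is assumed to re-read the destination blob's copy status from the service.  The status strings are the ones the storage service reports for a copy.

```python
import time
from typing import Callable

# Status strings as reported by the storage service for a copy operation.
PENDING, SUCCESS, ABORTED, FAILED = "pending", "success", "aborted", "failed"


def wait_for_copy(get_status: Callable[[], str],
                  first_delay: float = 1.0,
                  max_delay: float = 300.0,
                  timeout: float = 14 * 24 * 3600,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until the copy leaves the pending state.

    get_status is a hypothetical callback (an assumption of this sketch)
    that fetches the current copy status of the destination blob.
    Returns the last status seen; this is still "pending" if the
    timeout elapsed before the copy finished.
    """
    delay, waited = first_delay, 0.0
    status = get_status()
    while status == PENDING and waited < timeout:
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
        status = get_status()
    return status
```

The default timeout of two weeks simply mirrors the upper bound mentioned above; a real application would more likely persist the copy ID and re-check from a scheduled job than keep a process sleeping that long.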

The figure below shows a typical sequence of StartCopyFromBlob operation invocation.

The CloudBlockBlob and CloudPageBlob classes in the Windows Azure storage SDK v1.8 provide a StartCopyFromBlob() method, which in turn calls the WAS REST service operation (http://msdn.microsoft.com/en-us/library/windowsazure/dd894037.aspx).  According to the Windows Azure Storage Team blog post (http://blogs.msdn.com/b/windowsazurestorage/archive/2012/06/12/introducing-asynchronous-cross-account-copy-blob.aspx), the request is placed on an internal queue, and the call returns a copy ID and a copy state.  The copy ID is a unique ID for the copy operation.  It can be used later to verify the copy ID on the destination blob, and it is also the way to abort the copy operation at a later point in time.  CopyState gives you the status of the copy operation, the number of bytes copied so far, etc.

Note that step 3, “PushCopyBlobMessage”, in the above figure is my assumption about the operation.

ListBlobs – Way for Compensation

Although you have the copy ID in hand, there is no simple API that takes an array of copy IDs and returns the corresponding copy states.  Instead, you have to call CloudBlobContainer’s ListBlobs() or GetXXXBlobReference() to get the copy state.  If the blob was created by a copy operation, it will have a CopyState.

CopyState might be null for blobs that were not created by a copy operation.
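Matching the copy IDs you hold against a container listing can be sketched as follows.  This is an illustrative Python model, not the SDK: BlobEntry, CopyInfo and copy_states_by_id are hypothetical names standing in for a listed blob, its CopyState, and the lookup itself.  The null check reflects the note above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class CopyInfo:
    """Hypothetical stand-in for the CopyState of a destination blob."""
    copy_id: str
    status: str  # "pending", "success", "aborted" or "failed"


@dataclass
class BlobEntry:
    """Hypothetical stand-in for one item of a container listing."""
    name: str
    copy: Optional[CopyInfo] = None  # None if not created by a copy


def copy_states_by_id(blobs: Iterable[BlobEntry],
                      copy_ids: Iterable[str]) -> Dict[str, Optional[CopyInfo]]:
    """Map each requested copy ID to the copy info found in the listing.

    IDs with no matching blob map to None, so the caller can tell
    "not found" apart from any real copy status.
    """
    found: Dict[str, Optional[CopyInfo]] = {cid: None for cid in copy_ids}
    for blob in blobs:
        # Skip blobs without copy info and copies we did not ask about.
        if blob.copy is not None and blob.copy.copy_id in found:
            found[blob.copy.copy_id] = blob.copy
    return found
```

A copy ID that maps to None deserves the same attention as a failed copy, since the destination blob may have been overwritten or never created.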

The compensation action is what we need to do when a blob copy operation has neither succeeded nor is still pending.  In most cases, calling StartCopyFromBlob() again will result in a successful blob copy.  Otherwise, further remedial action should be taken.
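The decision described above can be written down as a small function.  This is a sketch under assumptions: next_action, the action names, and the max_attempts limit are all made up for illustration, and "escalate" stands for whatever further remedy the application chooses.

```python
from typing import Optional


def next_action(status: Optional[str], attempts: int,
                max_attempts: int = 3) -> str:
    """Decide what to do for one copy, given its last observed status.

    status is "pending", "success", "aborted", "failed", or None when
    the destination blob has no copy state at all. attempts counts how
    many times the copy has been started so far. The retry limit is an
    arbitrary choice for this sketch.
    """
    if status == "success":
        return "done"
    if status == "pending":
        return "wait"  # still in progress; check again later
    # Neither succeeded nor pending: this is where compensation starts.
    if attempts < max_attempts:
        return "retry"  # start the copy again
    return "escalate"  # retries exhausted; further remedy needed
```

For example, a copy that failed on its first attempt yields "retry", while one that has already been started three times yields "escalate".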

Final Words

It is a real pleasure to use StartCopyFromBlob().  It would be an even greater pleasure if the SDK or the REST API provided simple operations like the following:

  • GetCopyState(string[] copyIDs) : CopyState[]
  • RetryCopyFromBlob(string failedCopyId) : void