Wednesday, October 29, 2008

Deadlock in OpenXML When Called Twice Within the Same Transaction in each of two Concurrent Instances of a Stored Procedure

THE PROBLEM

We encountered a deadlock in a stored procedure that occurs when two instances of the stored procedure run at the same time in different threads. The stored procedure itself is a fairly simple operation that takes an XML document as a parameter and uses OpenXML to extract two hierarchical levels of content from the XML document. The procedure performs the following tasks in order:

  1. Opens a transaction;
  2. Extracts Order information from the XML document using OpenXml and INSERTS a row into the Order table;
  3. Uses SCOPE_IDENTITY to retrieve the new Primary Key from the insert as @OrderID;
  4. Extracts Order Item information from the XML document in a second OpenXML call and INSERTS 0 or more rows into the OrderItem table.
  5. Commits the transaction.

The procedure deadlocks with another instance of itself in Step 4 if two calls are made at the same time from two different threads (i.e. the calling Web Service receives two orders from two users at the same time.)

Using SQL Server Profiler, we have observed that both procedures make it past step 3 and into step 4. There is a foreign key constraint between Order.OrderID (PK) and OrderLine.OrderID (FK). The execution plan suggests that a Clustered Index INSERT on the child table and a Clustered Index SEEK on the parent table at the same time. By design, SQL Server takes a shared lock on the Order table and an exclusive lock on the OrderItem table, not exactly the recipe for a deadlock situation. Yet there it was, time after time after time.

THE SOLUTION

The SQL statement that deadlocked was a standard INSERT INTO dbo.OrderItem (OrderID, (columns)) SELECT (@orderID, (columnList)) FROM OpenXML WITH (ColumnList) where @OrderID is the new OrderID from Step 3 and (ColumnList) is the result set of order items from the XML document. To solve the problem, we moved access to the OpenXml document outside of the transaction and stored its result set in a table variable. We did this for both OpenXml accesses (i.e. Step 2 and Step 4). The new step order for the procedure became:

  1. Create two table variables: @OrderTable (OrderColumns) and @OrderItem (OrderItemColumns)
  2. INSERT INTO @OrderTable SELECT ((orderColumns) FROM OpenXML WITH (OrderColumns)
  3. INSERT INTO @OrderItemTable SELECT ((orderItemColumns) FROM OpenXML WITH (OrderItemColumns)
  4. BEGIN TRANSACTION
  5. INSERT INTO dbo.Order(orderColumns) SELECT (orderColumns) FROM @OrderTable
  6. Use SCOPE_IDENTITY to retrieve the new Primary Key from the insert as @OrderID
  7. INSERT INTO dbo.OrderItem(@OrderID, orderItemColumns) SELECT (orderItemColumns) FROM @OrderItemTable
  8. COMMIT TRANSACTION.

The OpenXML calls were moved outside of the Transaction scope. This limited the scope of any shared locks on each OpenXml instance to the queries that accessed them.

So WHY Did This Fix the Problem?

We're not quite sure. A search of Microsoft Technet did not reveal any articles that described the specific problem. However, it feels as if the two concurrent procedure calls, each running in its own process and accessing its own instance of OpenXML twice from within its own transaction, created a thread deadlock inside the OpenXML component.

In closing, we add that while OpenXML is still supported, Microsoft now considers XQuery to be the preferred method to do what has traditionally been done using OpenXML. We'll have more on XQuery as a replacement for OpenXML in a future post. Stay Tuned.

No comments: