Using the Worker Process' ThreadPool or not?

Dec 13, 2007 at 6:15 PM
Edited Dec 13, 2007 at 6:26 PM
According to MSDN (see below), some Fx BCL functions used by fbasync use the managed threadpool internally. Wouldn't it be better to use a custom threadpool to avoid ASP.NET worker process threadpool contention? Web requests to Facebook usually take between 250 and 500(?) ms; isn't that a bit long to be tying up a threadpool thread?

http://msdn2.microsoft.com/en-us/library/system.net.httpwebrequest.begingetresponse.aspx

EDIT: Actually, after reading up on the subject, I can't find a verifiable source as to whether or not (e.g.) BeginGetResponse() uses the same managed threadpool available to ASP.NET. It seems to vary with at least the IIS version and the Windows version. I'm assuming that on Windows Server 2003 and its IIS, BeginGetResponse() uses an IOCP thread outside of the worker process. Could you point me somewhere that has hard facts on this?
Coordinator
Dec 13, 2007 at 7:59 PM
I'm fairly certain it will end up back in the ASP.NET worker threadpool through the synchronization context, but I don't know for certain either. It's definitely unclear in the documentation. In either case, there is one thing not quite right about your statement above: "isn't that a bit long to be tying up a threadpool thread?". There is in fact never a thread tied up. That's the whole point of the asynchronous patterns. :) Notice in the sample code (and way down deep in the source) it is never waiting for a request to return. We delegate that responsibility all the way down to the Windows socket layer, which uses overlapped I/O, IOCP, and other magic to fire a callback when data has been received on the socket. Nothing is ever tying up a thread.

Oh, and Facebook calls don't typically take 250ms in my experience, though I have seen that happen on occasion. Just like any big service, they, or the internet, can have slowdowns.
Dec 13, 2007 at 10:23 PM
Edited Dec 13, 2007 at 10:27 PM
Hm, just to make sure we're talking about the same thing, I'm going to state a lot of obvious things: The managed threadpool consists of worker threads and IOCP threads. According to MSDN, the default is 25 for both, multiplied by the number of CPUs. As long as BeginProcessRequest() in an asynchronous ASP.NET page is executing, 1 managed worker thread will be in use by the page (i.e. for a very short time). Also, as long as Facebook and the page are engaged in HTTP conversation, 1 managed IOCP thread will be in use (a few hundred milliseconds for me; it might be faster for someone who doesn't have to cross an ocean to reach the Facebook servers). A managed IOCP thread is used because BeginGetResponse() uses Winsock2 and IOCP internally, but if it were to, say, queue a user work item, a managed worker thread would be used instead (on Mono, perhaps it will). This is what I mean by a thread being "tied up". So if I'm not mistaken, for 25 managed IOCP threads, there can only be 25 (actually, I think it's 23, because of the contention safeguards) Facebook API calls in action simultaneously. Right? If I monitor the threadpool when using fbasync, I can verify all this. Or have I missed something?
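
For anyone who wants to do the same kind of monitoring, here is a minimal sketch of one way to take the measurement. It only uses ThreadPool.GetMaxThreads() and ThreadPool.GetAvailableThreads(); the ThreadPoolSnapshot class and its method names are made up for illustration and are not part of fbasync.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of fbasync): reports how many managed
// threadpool threads of each kind are busy at the moment of the call.
public static class ThreadPoolSnapshot
{
    public static void GetInUse(out int workersInUse, out int iocpInUse)
    {
        int maxWorkers, maxIocp, freeWorkers, freeIocp;

        // Upper limits of the managed pool: worker threads and IOCP threads.
        ThreadPool.GetMaxThreads(out maxWorkers, out maxIocp);

        // How many of each kind are currently free.
        ThreadPool.GetAvailableThreads(out freeWorkers, out freeIocp);

        workersInUse = maxWorkers - freeWorkers;
        iocpInUse = maxIocp - freeIocp;
    }

    public static string Describe()
    {
        int workers, iocp;
        GetInUse(out workers, out iocp);
        return string.Format(
            "worker threads in use: {0}, IOCP threads in use: {1}",
            workers, iocp);
    }
}
```

Logging ThreadPoolSnapshot.Describe() periodically while the page is under load should show whether the IOCP count climbs with the number of outstanding Facebook calls. The numbers are a snapshot, so they can change between the two calls.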

The point I was trying to raise is that using custom threads is more expensive but perhaps also more scalable. I was wondering whether you had given any thought to that, and if so, what conclusions you had reached. =]

http://msdn2.microsoft.com/en-us/library/ms979194.aspx
Coordinator
Dec 19, 2007 at 8:54 PM
I see your point. It was the intent of the library to never tie up a thread unless it was doing CPU-intensive tasks. However, in real life, it appears there is one spot where that is not so.

I think the effects of an early design decision (laziness) are popping up with the amount of time the IOCP thread is tied up. Right now, the initial request is asynchronous (so your worker threads exit immediately). However, the reading of the response is not. I am just shoving the response stream into an XML reader, which is going to read synchronously. That is likely what is tying up your IOCP thread. What the library should be doing, to be fully asynchronous, is using BeginRead on the response stream and shoving chunks of the response at a time into a local MemoryStream. That way we would effectively never be blocking in an IOCP thread. I'll make this change soon. Or, if you're feeling adventurous, have a look at the code and give it a whirl! :)

If you have a look at the FacebookConnection.WebResponseCallback method, you'll see that I'm just grabbing the response stream and marking the request as complete. Instead of marking the request complete there, we should kick off a BeginRead process on the response stream and write the data into a MemoryStream. When we're done reading (asynchronously) from the response stream, the result should be set to the MemoryStream we created, instead of the HTTP response stream. This will eliminate the last of the blocking in the library, and you should hardly ever see any IOCP threads in use.
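
To make the BeginRead idea concrete, here is a rough sketch of a read loop that drains a stream in chunks into a MemoryStream. It is not the fbasync code: AsyncStreamBuffer, its Start method, the 4096-byte chunk size, and the shape of the completion callback are all assumptions made for illustration. Only Stream.BeginRead/EndRead and MemoryStream are real framework APIs here.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of fbasync): copies a stream into a
// MemoryStream using BeginRead/EndRead, so no thread blocks on the read.
public sealed class AsyncStreamBuffer
{
    private readonly Stream source;
    private readonly MemoryStream buffered = new MemoryStream();
    private readonly byte[] chunk = new byte[4096]; // assumed chunk size
    private readonly Action<MemoryStream, Exception> onDone;
    private bool finished;

    private AsyncStreamBuffer(Stream source, Action<MemoryStream, Exception> onDone)
    {
        this.source = source;
        this.onDone = onDone;
    }

    // onDone receives the filled MemoryStream (rewound to position 0),
    // or null plus the exception if the read failed.
    public static void Start(Stream source, Action<MemoryStream, Exception> onDone)
    {
        new AsyncStreamBuffer(source, onDone).ReadNext();
    }

    private void ReadNext()
    {
        try
        {
            // Returns immediately; OnRead fires when data has arrived.
            source.BeginRead(chunk, 0, chunk.Length, OnRead, null);
        }
        catch (Exception ex)
        {
            Finish(ex);
        }
    }

    private void OnRead(IAsyncResult ar)
    {
        int bytesRead;
        try
        {
            bytesRead = source.EndRead(ar);
        }
        catch (Exception ex)
        {
            Finish(ex);
            return;
        }

        if (bytesRead > 0)
        {
            buffered.Write(chunk, 0, bytesRead);
            ReadNext(); // only one read is ever in flight
        }
        else
        {
            Finish(null); // zero bytes means end of stream
        }
    }

    private void Finish(Exception error)
    {
        if (finished) return; // report completion only once
        finished = true;
        source.Dispose();
        buffered.Position = 0;
        onDone(error == null ? buffered : null, error);
    }
}
```

In a response callback one could pass the stream from GetResponseStream() to AsyncStreamBuffer.Start and hand the resulting MemoryStream to the XML reader, which can then read synchronously without waiting on the network. One caveat with this simple version: if reads complete synchronously, the callbacks nest, so a production version would probably check IAsyncResult.CompletedSynchronously and loop instead.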

Here's a great article on the entire ASP.NET life-cycle, specifically related to threading and request processing: http://blogs.msdn.com/nicd/archive/2007/04/16/dissection-of-an-asp-net-2-0-request-processing-flow.aspx It's even up to date!
Coordinator
Dec 19, 2007 at 8:58 PM
Oh, to answer your question, using our own threadpool in this case would be a bad idea. Instead we should fix the design flaw and read the response data asynchronously.

There are very few instances where using your own threadpool is a good idea. The major one is when you are performing tasks that will block on I/O and you have no control over them. For example, if you're making a lot of calls out to LDAP or using another library that does I/O without exposing asynchronous interfaces, it might be a good idea to have a small threadpool of your own and queue requests through it.
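
As an illustration of that last case, here is a minimal sketch of a small dedicated pool for blocking calls. It is only one possible shape, not something fbasync ships: DedicatedWorkQueue and its members are hypothetical names, and it assumes BlockingCollection<T> from System.Collections.Concurrent is available, which it is not on older framework versions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: a fixed set of dedicated threads for blocking calls,
// so those calls never occupy the managed threadpool that ASP.NET uses.
public sealed class DedicatedWorkQueue : IDisposable
{
    private readonly BlockingCollection<Action> queue =
        new BlockingCollection<Action>();
    private readonly Thread[] threads;

    public DedicatedWorkQueue(int threadCount)
    {
        threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(Run);
            threads[i].IsBackground = true; // do not keep the process alive
            threads[i].Start();
        }
    }

    // Queues a blocking call; it runs on one of the dedicated threads.
    public void Enqueue(Action blockingCall)
    {
        queue.Add(blockingCall);
    }

    private void Run()
    {
        // Blocks until work arrives; ends once CompleteAdding() has been
        // called and the queue is empty.
        foreach (Action work in queue.GetConsumingEnumerable())
        {
            try
            {
                work();
            }
            catch (Exception)
            {
                // A real implementation would log or report the failure.
            }
        }
    }

    // Stops accepting work, lets queued items finish, then joins the threads.
    public void Dispose()
    {
        queue.CompleteAdding();
        foreach (Thread t in threads)
        {
            t.Join();
        }
        queue.Dispose();
    }
}
```

A pool of two or three threads created once at application start would cap how many blocking LDAP-style calls run at the same time, and the rest simply wait in the queue. The trade-off is the one raised earlier in the thread: dedicated threads cost memory and context switches, so this only pays off when the blocking calls cannot be made asynchronous.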
Coordinator
Feb 16, 2008 at 6:18 PM
The latest version in source control has the async fix I was talking about. Sorry I took so long to get to this. Finally had a reason to use this in the real world! The data is now streamed asynchronously from Facebook and buffered into a MemoryStream. The MemoryStream is then read to hydrate the objects. You should see a big improvement in the usage of IOCP threads.