Quite often I see people making programming mistakes that will cost them dearly in the long run. They'll do things like use App.DoEvents to keep the UI updating properly, or they'll make their program unresponsive by letting tight loops run with no way to cancel them. These situations are perfect examples of when you should be using a thread.
Before we get to code, I'm going to explain a bit more about how threads work under the hood. A thread is nothing terribly special or magical. It's simply a way to slice program execution into discrete bits. I'm going to speak in some general terms (in some cases not 100% technically correct, but easier to understand than the technical explanation) about how threads work in REALbasic.
When you create a thread, we create a data structure that points to the next bit of code to execute. Initially, this points to the Run event of the thread, and as the thread is executing, this instruction pointer is adjusted. The data structure keeps other pieces of information around as well (such as priority, current state, etc) so that the thread scheduler can do its job. In actuality, we're keeping a reference to an OS thread around and our thread scheduler actually controls (via OS calls) the OS thread the data structure owns. However, the concept is still generally the same -- it's only due to mundane reasons that we're forced to do it this way.
One thing to keep in mind is that a CPU can only execute one "bit" of code at a time. So even when using a thread, your application is running in a linear fashion. However, because each thread executes for such a small slice of time (small in human terms, but not in computer terms), it appears that threads run in parallel to one another. But it's important to keep in mind that only a single chunk of code is executing at a time.
There are two different threading models that are commonly used: preemptive threads and cooperative threads. With a preemptive thread, the OS has a set time slice that each thread executes in. When the thread's time slice is done, the OS switches to another thread immediately. This causes all sorts of headaches for the programmer because that context switch could happen in the middle of executing a line of code. This means you need to take special care when accessing any shared data, and you can't make assumptions about what code has been executed between threads. Basically, it's much (much) easier to crash your application in strange and hard-to-reproduce ways when using preemptive threads.
Thankfully, REALbasic uses the other threading model: cooperative. In this model, the threads need to cooperate with one another by telling the OS "I'm done now, please switch me out." We do this switching for you at certain boundary cases -- while evaluating a looping construct (like for or while loops), at the end of methods, etc. The downside to cooperative threads is that external (to the REALbasic framework) code needs to be able to alert the framework that a context switch is ok to do. We give plugin authors a way to do this; however, not all of them use it. And calling into declares certainly doesn't have a way to do this. More on this later.
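To picture where those switch points fall, here is a rough sketch of a worker thread's Run event. The class name CounterThread and the mTotal property are made up for illustration, and the comments simply restate the boundary cases described above rather than documenting exact scheduler behavior.

[rbcode]
// Hypothetical worker thread; CounterThread and mTotal are
// illustration-only names, not part of the framework.
Class CounterThread Inherits Thread
  Private Dim mTotal as Integer

  Sub Run() Handles Event
    Dim i as Integer
    for i = 1 to 1000000
      // No context switch is expected in the middle of this line...
      mTotal = mTotal + i
      // ...but the framework may switch to another thread here,
      // as the loop comes back around.
    next
    // The end of the method is another point where a switch may happen.
  End Sub
End Class
[/rbcode]

Under a preemptive model, the update to mTotal could be interrupted halfway through, which is exactly the kind of shared-data headache mentioned earlier.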
So now that you know a bit more about threads in a general sense, let's talk about some design decisions you can make for your application so that you're using threads properly.
Let's say you've got a number crunching application -- you're calculating the next digit of PI. But since this is a long operation, you want your UI to still be usable so that the user can do things like select menu items or click on a Cancel button.
This is a perfect time to use threads! If we put the processing code into a thread, then it can run "in the background" while your application's main thread is allowed to keep the UI functional. The way that I usually structure this type of solution is with a message passing interface. This lets the thread update whatever is controlling it (so you can do things like update a StaticText, etc), and the controller can update the thread by calling methods on it.
So let's start with our interface:
[rbcode]
Interface MessageReceiver
  Function ReceiveMessage( msg as String, data as Variant ) as Variant
  End Function
End Interface
[/rbcode]
The interface is quite simple -- it lets the thread send a generic message along with a "cookie" of data. The receiver can then return some information back to the calling Thread. With this sort of generic interface, you can write code that both pushes information onto the controller as well as pulls data from it. Like this:
[rbcode]
call mReceiver.ReceiveMessage( "UI Update", "93% done processing" )
someInput = mReceiver.ReceiveMessage( "Get Input", nil )
[/rbcode]
Now, let's take a look at the thread.
[rbcode]
Class MySpiffyThread Inherits Thread
  Private Dim mReceiver as MessageReceiver
  Private Dim mDone as Boolean

  Sub Constructor( msg as MessageReceiver )
    mReceiver = msg
    mDone = false
  End Sub

  Sub Terminate()
    // Called by the controller to ask the thread to stop
    mDone = true
  End Sub

  Sub Run() Handles Event
    // Do some initialization stuff here
    // finished starts out false; without it the loop could only end
    // by cancellation and "Done" would never be sent
    Dim finished as Boolean
    while not mDone and not finished
      // Do some processing here, and set finished = true
      // once there is no more work left to do
      call mReceiver.ReceiveMessage( "Update", someUpdateValue )
    wend
    // Do some clean up processing here.
    if not mDone then
      call mReceiver.ReceiveMessage( "Done", results )
    else
      call mReceiver.ReceiveMessage( "Cancelled", nil )
    end if
  End Sub
End Class
[/rbcode]
Now we just need to see how to implement the other end of the example -- the controller. We'll use a Window to complete this example, but technically, you could use anything to control our Thread. First, create a Window and set its interfaces property to MessageReceiver. Now, in the form editor add a PushButton and set its text to Start and name it StartButton. Then, add another PushButton and set its text to Cancel (be certain to set its Cancel property to True as well), and its name should be CancelButton. Also, let's add a StaticText so we have a way to display status information. Then switch into the code editor. We need to add some code so that we satisfy the interface contract, as well as control our thread.
[rbcode]
Class Window1 Inherits Window Implements MessageReceiver
  Private Dim mThread as MySpiffyThread

  Function ReceiveMessage( msg as String, data as Variant ) as Variant
    select case msg
    case "Update"
      StaticText1.Text = data.StringValue
    case "Done"
      MsgBox "Finished, with results: " + data.StringValue
    case "Cancelled"
      MsgBox "User cancelled the operation"
    end select
    return nil
  End Function

  Sub CancelButton.Action()
    // We want to cancel the thread (if one has been started)
    if mThread <> nil then mThread.Terminate
  End Sub

  Sub StartButton.Action()
    if mThread <> nil then return
    // Create the thread
    mThread = new MySpiffyThread( self )
    // Run it
    mThread.Run
  End Sub
End Class
[/rbcode]
Ta da! You now have a threaded application where the thread does all the processing, and the main thread can still update all the UI.
However, if you try this out, you'll notice that your processing takes a lot longer. That's because the main thread has the same priority as the thread you created. The default priority for threads is 5. So let's talk about priorities a bit.
The way priorities work in REALbasic is that they are all relative to one another. So if you have two threads whose priorities are 5, then there are 10 total time slices for the application, and each thread gets 5 of them. If you have one thread with a priority of 10 and another with its priority as 5, then there are 15 total time slices and one thread gets twice as much CPU time as the other.
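Put as a formula (my own notation, not something taken from the REALbasic documentation), a thread's share of the time slices is its priority divided by the sum of all the priorities. Plugging in the two cases above:

```latex
\[
\text{share}_i = \frac{p_i}{\sum_j p_j},
\qquad
\frac{5}{5+5} = 50\%,
\qquad
\frac{10}{10+5} = \frac{2}{3} \approx 66.7\%
\]
```

In the second case the other thread gets the remaining third, which is where "twice as much CPU time" comes from.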
Back to our example. The problem we want to solve is that we want the worker thread to get a lot more processing time than the main thread. The main thread is just there to process the UI which doesn't take much effort. But the worker thread is there to crunch numbers and do other hard things, so we want it to get most of the CPU's time. So let's change the worker's priority!
I'd say that the UI only needs to get about 10% of the CPU time to be considered responsive. I just pulled this number out of thin air though, so you may want to fiddle with the numbers. In any event, that means we want the main thread, whose Priority is fixed at 5, to get roughly a tenth of the time slices. A worker priority of 50 gets us close: the main thread then receives 5 of every 55 slices, or about 9%. Let's set it:
[rbcode]
Sub StartButton.Action()
  if mThread <> nil then return
  // Create the thread
  mThread = new MySpiffyThread( self )
  // Set its priority
  mThread.Priority = 50
  // Run it
  mThread.Run
End Sub
[/rbcode]
Now the thread runs and uses roughly 90% of the application's total CPU time (50 of every 55 time slices) to do its processing. You should see a significant boost in your thread's execution speed now since the application isn't "wasting" a lot of its time in updating the UI.
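If you'd rather start from the share you want and work backwards to a priority, the relative-priority rule can be inverted. The helper below is only a sketch: PriorityForShare is a made-up name, and it assumes just two threads are competing (the worker and the main thread).

[rbcode]
// Hypothetical helper, not part of the framework.
// share is the fraction of CPU time the worker should get (0 < share < 1).
// Solves p / ( p + mainPriority ) = share for p.
Function PriorityForShare( share as Double, mainPriority as Integer ) as Integer
  return Round( mainPriority * share / ( 1 - share ) )
End Function
[/rbcode]

For an exact 90% share against the main thread's priority of 5, PriorityForShare( 0.9, 5 ) works out to 45. A priority of 50 lands a touch higher, at 50/55 or about 90.9%.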
So now that you've seen how to properly use a thread, let's talk about some of the pitfalls that people run into when working with threads in REALbasic.
One common mistake that people make when using threads is putting declare code into a thread. Don't bother. Declares have no way of calling back into the REALbasic framework to let it know when to do a context switch. So your thread will continue to use the CPU the entire time you're in the declared code. It's not until execution control comes back to REALbasic that the context switch will happen. So if you have a drawn-out declare, putting it into a thread won't help in any way. There are certain calls that REALbasic code makes that have this issue as well -- for example, FolderItem.CopyFileTo or MoveFileTo. Once the framework calls into an OS function, thread context switches cannot happen. It's unfortunate, but it's the price you pay for having an easy threading model that's very forgiving.
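As a concrete picture of the problem, here is a sketch of a thread whose only job is a file copy. CopyThread, mSource and mDestination are hypothetical names; the two FolderItems are assumed to be set up before the thread is run.

[rbcode]
// Hypothetical thread that copies one file.
Class CopyThread Inherits Thread
  Private Dim mSource as FolderItem
  Private Dim mDestination as FolderItem

  Sub Constructor( source as FolderItem, destination as FolderItem )
    mSource = source
    mDestination = destination
  End Sub

  Sub Run() Handles Event
    // The framework hands this call to the OS. No context switch can
    // happen until it returns, so the UI still stalls for the whole
    // copy even though we are "in a thread".
    mSource.CopyFileTo( mDestination )
  End Sub
End Class
[/rbcode]

The thread only gives the rest of the application a chance to run before and after the call, never during it.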
Another common mistake that I've seen is people thinking that threads are a panacea and will solve all sorts of speed problems. They won't -- remember, threads take up CPU time and all your code is executed in a linear fashion. This means that threads will actually slow your code down! Think about it for a moment: if your computing operation takes 10 seconds while using 100% of the CPU, it'll take 20 seconds if it only gets 50% of the CPU. However, the slowdown usually isn't a big deal for two reasons: 1) you can control how much CPU time a thread gets by changing priorities and 2) the slowdown doesn't matter since the goal is to keep the UI updating. Users only notice "lockups" when you're processing because of some visual or feedback cues that tell them the application is locked up. If the application never locks up, they're usually happy (within reason, of course).
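The arithmetic behind that is just the original running time divided by the thread's share of the CPU. This is a back-of-the-envelope estimate in my own notation, and it assumes the other threads actually use all of the slices they are entitled to:

```latex
\[
T_{\text{threaded}} = \frac{T_{\text{alone}}}{\text{share}},
\qquad
\frac{10\,\text{s}}{0.5} = 20\,\text{s},
\qquad
\frac{10\,\text{s}}{50/55} = 11\,\text{s}
\]
```

The last case uses the priorities from the example (50 for the worker against 5 for the main thread), so the cost of keeping the UI alive there is about one extra second on a ten second job.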
I think that covers the majority of the information about threads themselves and how to use them within REALbasic. Perhaps in a future article I'll talk about resource management between threads and how to properly use the locking mechanisms REALbasic provides. Until next time, happy coding!
This article is great! I'm finally starting to understand how threads work and how to use them. And, as a side effect, I think I'm beginning to understand interfaces more. I'm from the VB6 camp, so threads and interfaces were never something I had to or could deal with.
Just a quick question: So interfaces are a way to define how two methods communicate with each other without having to lock them into a specific instance?
Oh, one other comment: it seems whenever you have regular text right after code without a line between, the regular text seems to run right off to the right. I'm guessing it's a CSS problem. I'm using IE6, don't know what those with FireFox, Mozilla, or Safari see.
Glad it helped you! :-)
As for your question: an interface is a way for you to generically define methods on a class. For example, there's a Readable interface which supports a method called Read on it. This means you can pass classes around as a Readable object and not have to worry what the actual class is.
So if you have Foo, Bar and Baz which all implement Readable, then you can do this:
Sub SpiffyAwesomeCool( r as Readable )
MsgBox r.Read( 10 )
End Sub
SpiffyAwesomeCool( new Foo )
SpiffyAwesomeCool( new Bar )
SpiffyAwesomeCool( new Baz )
Did that make sense?
Hmm, with FireFox, I don't see the issue. The only run-on problem I see is with code lines themselves, once they get too long. I'll try to remember to put an extra space between code and text though, since it sounds like that solves the issue.
Yup... I think I'm getting it with interfaces... Guess I'll know for sure after I try :P
>> code needs to be able to alert the framework that a context switch is ok to do.
>> We give plugin authors a way to do this, however, not all of them do it. And
>> calling into declares certainly don’t have a way to do this. More on this later.
I'm missing that essential google keyword today...what's the plug-in code for alerting the framework that a context switch is OK? #pragma? http://support.realsoftware.com/listarchives/realbasic-nug/2004-11/msg01631.html
I read this article looking for more insight into the best/proper/optimal way for a multi-threaded plug-in to call back into an RB event. Notes from this perspective would be appreciated.
Future articles would gain a lot of readability if you added a header for each of your main points. There are a lot of "rules" here which could be emphasized by 5 or so bold words atop the descriptive paragraph. "RB executables are Cooperative multitasking", "Don't put Declares into threads", for example.
If you're looking for more filler for future thread articles, please consider a compare and contrast regarding Sleep and Suspend.
I believe it's REALYieldToRB or something along those lines.
I wasn't anticipating the post being as long as it turned out to be, otherwise I would have just written it as an article.
But you bring up an interesting idea. Perhaps I could post a few more blogs about threads and then combine them all into a coherent article.
Great article Aaron! It helped me a lot to understand how threads work and how and when to use them.
Could you please explain in what circumstances we should change the thread's "StackSize" property default value (64K Mac/Linux - 940K Win)? The LR does not help a lot on this :)
I think I'll write another post or two about threading since it seems like there's a fair amount of confusion on the topic. However, to answer your question quickly, the stack size is something you don't usually need to monkey with. So you probably shouldn't be changing it (just leave it set to 0 so we'll pick the right thing). However, you would change the stack size if you had to create a whole bunch of local variables or other things that go on the stack. It's not something you really run into often.
>> Perhaps I could post a few more blogs about threads and then combine them all into a coherent article.
Yep. Blog more, collate later. :D Blogging's just a time-sliced article...
Sub CancelButton.Action()
// We want to cancel the thread
Thread1.Terminate
shouldn't this be : mThread.Terminate ??
Further, I don't get the calculation of the 90% CPU time. IMHO the Spiffy thread gets 50 time slices and the main thread gets 5, so Spiffy gets 50/55 of the total, which is 90.9%. For exactly 90%, the Spiffy thread should get 45 time slices. OK, just a rounding error.
And why isn't my listing showing the same layout as yours, e.g.
Class MySpiffyThread
Inherits Thread
shown in two lines? And why don't we have colored or bold keywords anymore as we had in RB5.5.5?
Don't get me wrong, this is a very good article and it shows me a better insight in the handling of threads.
And I even think that the explanation of the use of interfaces justifies a separate article, as I can imagine now that it has enormous possibilities I never saw myself before.
Congratulations, again a SPLENDID explanation.
You're absolutely right -- it should be mThread, not Thread1. Good catch!
And yes, there's a rounding error there as well, thanks for pointing it out. :-)
I'm not certain I follow you about the layout issues. I just typed all that up by hand in the web browser, I didn't grab it from the IDE in any way. I usually try to keep my code listings as close to RB-syntax as possible, which is why you see things like Handles Event, but sometimes I take shortcuts as well (such as the Interface code -- it wouldn't have an End Function on it since it's a declaration, not a definition).
Aha that clarifies the little differences in the layout.
I guessed you copied right from the IDE but that you used RB5.9xx because your keywords are blue colored and with RB2005 in the print options, the option bold and colored keywords aren't available anymore.
IMO you could use a virtual PDF printer to create program listings for this blog. It probably would save you some time.
Again thanks for the excellent explanation and please keep going!
Does that mean that having multiple threads will not be spread across multiple processors (as that will force them to be pre-emptive)?
Correct, cooperative threads are not spread across multiple CPUs. In order to take advantage of multi-CPUs, you'd need a preemptive thread.
A suggestion for the article, if you don't mind :)
How to use multiple threads to perform a job.
If you could provide an example that would be great - it could be an array of 20 URLs to perform a specific job (downloading the pages with an HTTPSocket and store on disk) with only 3 threads, using the above interface method. I think that using a group of URLs would be fine as any user could try it.
BTW, I don’t know if using threads for this kind of job is the best way to do it :)
>>I guessed you copied right from the IDE but that you used RB5.9xx because your keywords are blue colored and with RB2005 in the print options, the option bold and colored keywords aren’t available anymore.
Actually, I think someone wrote a PHP syntax highlighter for RB code. Aaron's probably not using it here, though, since this is WordPress. How do you do that, Aaron? Copying right from the IDE wouldn't format the code properly.
I use a version of Jon's RB syntax highlighter: found here.