I posted this on the other site some time ago. Here it is again, slightly edited:
I believe there is a misconception about what multi-core means, especially in regard to what runs on what core. It would be more accurate to imagine each core in your CPU as a checkout counter at a bank or market, with a single line (queue) serviced in "next available cashier" fashion. It matters not whether there is one core or a hundred: the person/task/thread at the head of the line goes to whichever cashier/core is not currently in use.
Unlike people with a shopping cart full of things, at the computer's cashier only one thing can be purchased (task performed); then the customer (program), if it still has more to buy (things to do), has to go back to the end of the line. The next time through the loop, the customer may or may not get the same cashier. In addition, in the time-sharing, multi-tasking OSes we use, not only is the customer restricted to purchasing one item per visit, but if they take too long, they get clipped mid-transaction and again have to go to the end of the line (the time limit is often called the process quantum). Of course, they get to resume the transaction exactly where they left off on the next visit, no matter which core they end up being assigned.
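For the curious, on Linux you can ask the kernel for a timeslice-like value with the POSIX call sched_rr_get_interval(). This is only a rough sketch: the call is defined for round-robin tasks, and what it reports for an ordinary process is kernel-dependent, so treat the number as illustrative rather than as "the" quantum.

```c
/* Rough sketch: ask Linux for a timeslice-like value for this process.
 * For ordinary (SCHED_OTHER) tasks the value reported is kernel-dependent,
 * so this is illustrative, not a precise measure of the quantum. */
#include <stdio.h>
#include <sched.h>
#include <time.h>

int main(void) {
    struct timespec ts;
    if (sched_rr_get_interval(0, &ts) == 0)   /* 0 = the calling process */
        printf("reported timeslice: %ld.%09ld seconds\n",
               (long)ts.tv_sec, ts.tv_nsec);
    else
        perror("sched_rr_get_interval");
    return 0;
}
```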
You can see this behavior in the CPU usage monitors. Those with multi-core CPUs running a single-threaded program, say RW2 or MSTS, may see each CPU (core) being used roughly equally; i.e., a dual core will show each CPU at roughly 50% and a quad core will show each core at roughly 25%. This is because there is only one "person" (the single-threaded program) in line, and each time it gets put back in line, it ends up at the head and is immediately routed to the next available core. It uses the core for its quantum, gets put back in line, is immediately routed to the next CPU, and so on. Any kind of wait required of the task (any kind of I/O, a mutex wait, or whatnot) will also kick the task off the core, but in that case it is placed in a "go wait over there, we'll call you when your <wait> is finished" spot. When whatever the task was waiting for finishes (i.e., the I/O completes), the task is placed back at the end of the line. I should point out that if a core becomes available but no process is in line, the core spins in an idle loop (this shows up in the monitors as idle time).
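You can watch this bouncing directly. The sketch below (Linux/glibc only, using sched_getcpu()) busy-loops in a single thread and prints the core it is on whenever that changes; run it while watching a CPU monitor and you will typically see the one thread smeared across cores.

```c
/* Sketch: a single busy thread that reports whenever the scheduler
 * moves it to a different core. Linux/glibc only (sched_getcpu()). */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>

int main(void) {
    int last = -1;
    for (unsigned long i = 0; i < 1000000000UL; i++) {  /* single busy loop */
        int cpu = sched_getcpu();          /* which core are we on right now? */
        if (cpu != last) {
            printf("iteration %lu: now on core %d\n", i, cpu);
            last = cpu;
        }
    }
    return 0;
}
```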
If for some odd reason one wanted to restrict a program to always run on one core no matter what, the program can assign an 'affinity' to itself (or have it assigned), confining its scheduling to one or more cores, but this is not typical and should be avoided. Even though one can assign a program's affinity to one core, that won't prevent other programs from using that core too. I liken it to confining yourself to the express lane at a market or the merchant's window at a bank: even though there might be other cashiers with no lines, you confine yourself to the express lane regardless of whether there is anyone in line ahead of you. It strikes me as less than optimum.*
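If you did want to pin a process anyway, the Linux call is sched_setaffinity() (Windows has the analogous SetProcessAffinityMask, or the "Set affinity" menu in Task Manager). A minimal sketch, assuming core 0 exists:

```c
/* Sketch: pin the calling process to core 0 ("always use the express
 * lane"). Linux-specific; note other programs may still be scheduled
 * onto core 0 as well. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                      /* allow core 0 only */
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("now confined to core 0; running on core %d\n", sched_getcpu());
    return 0;
}
```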
Multi-threaded (or multi-process) programs can take advantage of multiple cores. Continuing with the cashier analogy, imagine yourself with your cart full of things preparing to get in line, but this time you have 3 other individuals with you. Each of you grabs one item from the cart and gets in line, and each of you will be routed through however many cashiers there are. You can complete your checkout in 1/4 the time if there are at least 4 cores; in 1/2 the time with 2 cores; and in no less time with 1 core. Except for the time it takes, you don't notice or change anything about your procedure whether there is 1 cashier or 100. Likewise, the computer won't do anything different whether there is one of you or 100 involved in the purchasing. In addition, having more cores than you have threads isn't going to help much unless you are also running additional programs.
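As a sketch of the four-shoppers case: the toy program below splits a sum over four POSIX threads. On a machine with at least 4 cores it can finish in roughly a quarter of the single-thread time; the thread count and workload here are made up for illustration.

```c
/* Sketch: four "shoppers" (threads) each take a quarter of the work.
 * Compile with -pthread. With 4+ cores the threads run in parallel. */
#include <stdio.h>
#include <pthread.h>

#define NTHREADS 4
#define N 400000000UL

static unsigned long partial[NTHREADS];

static void *worker(void *arg) {
    long id = (long)arg;
    unsigned long sum = 0;
    /* each thread sums its own quarter of the range */
    for (unsigned long i = id * (N / NTHREADS); i < (id + 1) * (N / NTHREADS); i++)
        sum += i;
    partial[id] = sum;
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    unsigned long total = 0;
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (long i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);          /* wait for all four shoppers */
    for (int i = 0; i < NTHREADS; i++)
        total += partial[i];
    printf("total = %lu\n", total);
    return 0;
}
```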
* I am not sure why there is such a thing as affinity, but I speculate it is to allow programs that were written multi-threaded but not multi-core aware to run without error, i.e., multi-threaded programs that were only ever tested on single-core CPUs. It is pretty easy to make mistakes while writing multi-threaded applications that will still work on a single-core processor, because the things the program expects may still happen in the intended order. With multi-core CPUs, parts of the program may (and likely will) complete out of order, producing unexpected results: "Something bad has happened", "MSTS Phone home", BSODs, race conditions, mutex deadlocks (deadly embraces), etc.
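A classic example of the kind of bug that can hide on a single core: two threads incrementing a shared counter with no mutex. On one core the interleaving often happens to look correct; on multiple cores the lost updates show up quickly. The counts below are made up for illustration; compile with -pthread.

```c
/* Sketch of a race condition: two threads each increment a shared
 * counter 10 million times with no locking. On a multi-core machine
 * the final value is usually well short of 20,000,000 because the
 * load/add/store steps interleave. Compile with -pthread. */
#include <stdio.h>
#include <pthread.h>

#define LOOPS 10000000

/* volatile only keeps the compiler from folding the loop away;
 * it does NOT make the increment thread-safe. */
static volatile long counter = 0;

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < LOOPS; i++)
        counter++;                       /* not atomic: load, add, store */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("expected %d, got %ld\n", 2 * LOOPS, counter);
    return 0;
}
```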