27 Aug 2009 @ 11:14 PM 

In my previous post, we saw the basics of the Task Parallel Library, where I mentioned the new task scheduler in the .NET 4.0 thread pool.  In this post I briefly describe it, along with the “Task” class.  Before that, let us see how to make a LINQ query parallel using PLINQ.

PLINQ

PLINQ implements all of the LINQ to Objects extension methods in System.Linq.ParallelEnumerable.  To make an IEnumerable<T> parallel, you simply call the AsParallel() extension method.  AsParallel() internally creates a System.Linq.Parallel.ParallelEnumerableWrapper, which is of type System.Linq.ParallelQuery<TSource>.  Let us take the same customer-order example shown in my previous post, this time with PLINQ enabled.

var customersForOrderId7 =
    from c in customers.AsParallel()
    from o in c.Orders
    where o.OrderId == 7
    select c;
 

customersForOrderId7.ForAll(c =>
    Console.WriteLine(c.ToString()));

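The above code selects the customers who have ordered OrderId 7.  The AsParallel() call in the first from clause makes the Customer[] parallel enabled.  In this example, you can see that instead of the typical foreach, I’ve used ForAll().  The difference is that foreach merges the results back onto the calling thread and enumerates them one by one, whereas ForAll() runs the delegate on the worker threads as results become available, so the output order is not guaranteed.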
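To make the ordering point concrete, here is a minimal sketch.  It assumes the PLINQ API as it shipped in the final .NET 4.0 release, and the class and method names (PlinqOrderingSketch and so on) are mine, not part of the original sample.  AsOrdered() asks PLINQ to keep the source order, while ForAll() needs a thread-safe sink because its delegate runs on several threads.

```csharp
using System.Collections.Concurrent;
using System.Linq;

public static class PlinqOrderingSketch
{
    // AsOrdered() asks PLINQ to preserve the source order in the merged results.
    public static int[] SquaresInSourceOrder(int[] numbers)
    {
        return numbers.AsParallel().AsOrdered().Select(n => n * n).ToArray();
    }

    // ForAll() invokes the delegate on worker threads, so the sink must be thread-safe.
    // No ordering is promised here; we only count what arrived.
    public static int CountEvenViaForAll(int[] numbers)
    {
        var sink = new ConcurrentBag<int>();
        numbers.AsParallel().Where(n => n % 2 == 0).ForAll(n => sink.Add(n));
        return sink.Count;
    }
}
```

Use AsOrdered() only when the order matters, since preserving it costs some of the parallel speed-up.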

Thread Pool and Work Stealing Queue

Until .NET 3.5, we could queue our work items to the ThreadPool using QueueUserWorkItem().  In .NET 4.0, improvements have been made to the thread pool.  It is now based on System.Threading.Tasks.Task.

Task

A task is a work item which can be executed independently alongside other tasks; technically, it represents an asynchronous operation, typically wrapping a delegate such as System.Action.  It is defined as System.Threading.Tasks.Task, which enables you to do the following on a task instance:

  • start – to start the instance
  • wait – to wait for a task to complete
  • cancel – cancel the task asynchronously

A task can be thought of as a lightweight alternative to a thread.
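The three actions above can be sketched as follows.  This is a minimal example under one assumption: it uses the API as it shipped in the final .NET 4.0 release, where cancellation goes through CancellationTokenSource, and the beta may differ.  The class and method names are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskBasicsSketch
{
    // start + wait: create a task, start it, and block until it completes.
    public static int RunAndWait()
    {
        int result = 0;
        var task = new Task(() => { result = 6 * 7; });
        task.Start();
        task.Wait();
        return result;
    }

    // cancel: cancellation is cooperative; the task observes the token itself.
    public static bool CancelCooperatively()
    {
        var cts = new CancellationTokenSource();
        var token = cts.Token;
        var task = Task.Factory.StartNew(() =>
        {
            while (!token.IsCancellationRequested)
            {
                Thread.Sleep(10);
            }
            token.ThrowIfCancellationRequested();
        }, token);

        cts.Cancel();
        try
        {
            task.Wait();
        }
        catch (AggregateException)
        {
            // Wait() reports the cancellation wrapped in an AggregateException.
        }
        return task.IsCanceled;
    }
}
```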

Thread Pool

The following diagram shows the conceptual view of the .NET 4.0 thread pool.

.NET 4.0 Thread Pool

Previously, the thread pool had only one queue, on which all work items were enqueued and dequeued in FIFO order (of course, that is why it is a queue).  The worker threads allocated to the work items all took them from this one queue.  In .NET 4.0, this has been improved by introducing a local queue for every worker thread, in addition to the global queue.  Tasks created by a program thread are queued on the global queue.  The task scheduler dequeues tasks from the global queue in FIFO order and hands them to the worker threads; tasks created from inside a running task (child tasks) go to that worker thread’s local queue.  The worker thread dequeues tasks from its local queue in LIFO order.  Interesting!

The introduction of local queues lets worker threads run on different processors without the contention that normally occurs in a single-queue thread pool.  The worker thread picks up its tasks in LIFO order on the assumption that the “last-in” task is still hot in the cache.  This gives no guarantee of task ordering, but better performance.

Work Stealing Queue

Let us assume that worker thread 1 (WT1) has completed all the tasks in its local queue.  It scans the global queue for a task and, if there is nothing there, it scans the other worker threads’ local queues.  In this case, let us assume there are two tasks in the queue of worker thread 2 (WT2).  WT1 dequeues the first-in task from WT2’s queue, that is, from the end opposite to the one WT2 itself works on, which avoids contention.  Taking tasks from another thread’s local queue is called work stealing.  This again results in better performance.
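The LIFO-for-the-owner, FIFO-for-the-thief idea can be sketched with a small double-ended queue.  This is an illustration only: it uses a plain lock, whereas the real thread pool queue is more sophisticated, and the type name WorkStealingQueueSketch is hypothetical.

```csharp
using System.Collections.Generic;

// Illustrative only: a lock-based deque, not the CLR's actual implementation.
public sealed class WorkStealingQueueSketch<T>
{
    private readonly LinkedList<T> items = new LinkedList<T>();
    private readonly object gate = new object();

    // The owning worker thread adds new work at the tail.
    public void Push(T item)
    {
        lock (gate) { items.AddLast(item); }
    }

    // The owner takes the most recently added item (LIFO) from the tail.
    public bool TryPopLocal(out T item)
    {
        lock (gate)
        {
            if (items.Count == 0) { item = default(T); return false; }
            item = items.Last.Value;
            items.RemoveLast();
            return true;
        }
    }

    // Another worker steals the oldest item (FIFO) from the head.
    public bool TrySteal(out T item)
    {
        lock (gate)
        {
            if (items.Count == 0) { item = default(T); return false; }
            item = items.First.Value;
            items.RemoveFirst();
            return true;
        }
    }
}
```

With tasks 1, 2 and 3 pushed in that order, a thief gets 1 while the owner gets 3, so the two rarely compete for the same end.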

References

MSDN article: Task Parallel Library Overview: http://msdn.microsoft.com/en-us/library/dd460717(VS.100).aspx

Daniel Moth’s videocast and PPT deck: http://channel9.msdn.com/pdc2008/TL26/ and his article about thread pool at http://www.danielmoth.com/Blog/2008/11/new-and-improved-clr-4-thread-pool.html

My Code sample for PLINQ as used in this post is at http://www.udooz.net/file-drive/doc_details/9-tpl-and-plinq-example.html

Categories: .NET
Posted By: udooz
Last Edit: 27 Aug 2009 @ 11 18 PM

 26 Aug 2009 @ 11:49 PM 

We are in the multi-core era, where our applications are expected to use these cores effectively.  Simply put, going parallel means partitioning the work being done into smaller pieces that are executed on the available processors in the target system.  Until .NET 3.5, parallel meant using the ThreadPool and Thread classes.  We had no component that partitions work into work items and schedules them on different cores.  A craft-and-weave parallel framework from Intel is available, but only for MC++, and I am not sure how easy it is to use.

Even though threads are basic to parallelism, from a developer’s perspective what parallel programming requires is not the thread but the work items within a piece of work, each named a “Task”.

.NET 4.0 Parallel Extensions 

A task represents a work item that is performed independently alongside other work items.  This is the approach taken by the experts in Microsoft’s Parallel Computing division, who released the parallel extensions (Task Parallel Library and Parallel LINQ) in .NET 4.0 beta 1.  As with Microsoft’s other frameworks, developers need not worry about the internals.  The parallel extensions take care of partitioning the work and scheduling the work items.

Task and Thread

Let us first understand the relationship between task and thread.  See the following figure.

Thread and Task Relationship

Typically, a task resides in a thread.  A task may contain one or more child tasks, which do not necessarily reside in the parent task’s thread.  In the above figure, child task M2 of Task M resides in Thread B.
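A minimal sketch of a parent task with one attached child is shown below.  It assumes the final .NET 4.0 API (TaskCreationOptions.AttachedToParent), and the class name is mine.  The two thread ids it records may or may not differ on a given run, which is exactly the point: the child is not tied to the parent’s thread.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ChildTaskSketch
{
    // Returns { parent thread id, child thread id }.
    public static int[] RunParentWithChild()
    {
        int parentThread = -1;
        int childThread = -1;

        var parent = Task.Factory.StartNew(() =>
        {
            parentThread = Thread.CurrentThread.ManagedThreadId;

            // The attached child may be executed by any worker thread.
            Task.Factory.StartNew(() =>
            {
                childThread = Thread.CurrentThread.ManagedThreadId;
            }, TaskCreationOptions.AttachedToParent);
        });

        // Waiting on the parent also waits for its attached children.
        parent.Wait();
        return new[] { parentThread, childThread };
    }
}
```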

Task Parallel Library (TPL)

This library provides an API for task-based parallel programming under the System.Threading and System.Threading.Tasks namespaces.  The partitioned tasks are automatically scheduled on the available processors by the task scheduler, which is in the ThreadPool.  The work-stealing queue algorithm in the task scheduler makes life easier.  I’ll explain this in the next post.

Scheduling the tasks on the processors is runtime behaviour of the task scheduler, so a program scales up on a target machine with fewer or more cores without being recompiled.
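As a small illustration of this runtime behaviour, the sketch below never hard-codes a core count.  It assumes the ParallelOptions type from the final .NET 4.0 release; passing Environment.ProcessorCount is only there to make the idea visible, since the scheduler picks a sensible degree of parallelism on its own when no options are given.  The names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ScaleUpSketch
{
    // Sums 1..n in parallel; the same binary adapts to however many cores exist.
    public static long SumTo(int n)
    {
        long total = 0;
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.For(1, n + 1, options, (int i) =>
        {
            // Interlocked keeps the shared total correct across threads.
            Interlocked.Add(ref total, i);
        });

        return total;
    }
}
```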

There are two types of parallelism you can do using TPL.

  • Data Parallelism
  • Task Parallelism

Data Parallelism

It is very common to perform a set of actions against each element in a collection of data using for and foreach loops.  Parallel.For() and Parallel.ForEach() let you run such collection processing in parallel.  The following figure depicts this.

 Data Parallelism
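Before moving to the customer-order collection, here is a minimal Parallel.For() sketch.  The class name is made up; each iteration writes to its own array slot, so no locking is needed.

```csharp
using System.Threading.Tasks;

public static class ParallelForSketch
{
    // Fills result[i] with i * i; iterations may run on different cores.
    public static int[] Squares(int count)
    {
        var result = new int[count];
        Parallel.For(0, count, (int i) =>
        {
            result[i] = i * i;
        });
        return result;
    }
}
```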

Let us see this with a typical customer-order collection.  The following code shows the Customer and Order declarations.

public class Customer
{
    public string Name;
    public string City;
    public Order[] Orders;
 

    public override string ToString()
    {
        return String.Format("Name: {0} - City: {1}",
            Name, City);
    }
}

public class Order
{
    public int OrderId;
    public int Quantity;

    public override string ToString()
    {
        return String.Format("Order Id: {0} - Quantity: {1}",
            OrderId, Quantity);
    }
}

Let us create an in-memory collection for this, as in the code below.

var customers = new Customer[]
{
    new Customer{Name="Hussain", City="Vizak", Orders =
        new Order[]
        {
            new Order{OrderId=1, Quantity=10},
            new Order{OrderId=10, Quantity=5}
        }},
    new Customer{Name="Abdul", City="Chennai", Orders =
        new Order[]
        {
            new Order{OrderId=7, Quantity=2},
            new Order{OrderId=10, Quantity=4}
        }},
    new Customer{Name="Daniel", City="Texas", Orders =
        new Order[]
        {
            new Order{OrderId=12, Quantity=3},
            new Order{OrderId=7, Quantity=1},
            new Order{OrderId=10, Quantity=1}
        }}
};

Let us write some simple parallel code which iterates through each customer and its orders.

Parallel.ForEach<Customer>(customers,
    customer =>
    {
        Console.WriteLine("**** {0} ****", customer.ToString());
        Parallel.ForEach<Order>(customer.Orders,
            order =>
            {
                Console.WriteLine("\t{0}", order.ToString());
            });
    });

In the above code, printing the customer details is one task, and printing the order details for a given customer is its child task.  I’ve used the simple overload of Parallel.ForEach():

public static ParallelLoopResult ForEach<TSource>(IEnumerable<TSource> source, Action<TSource> body);

The task scheduler partitions the customers and orders, and schedules them on the available processors.

Task Parallelism

Task parallelism applies when multiple distinct actions are to be performed concurrently on the same or different sources.  The following figure depicts this.  Note that the task count need not be equal to the number of actions.

Task Parallelism

Parallel.Invoke() is used for this.  See the following code.

Parallel.Invoke(() =>
    {
        Console.WriteLine("Task 1.  Getting total quantity ordered for OrderId 10");
        var orderId10Sum =
            (from c in customers
            from o in c.Orders
            where o.OrderId == 10                       
            select o.Quantity).Sum();
        Console.WriteLine("Quantity: {0}", orderId10Sum);
    },() =>
    {
        Console.WriteLine("Task 2.  Getting number of customers and their city ordered OrderId 7");
        var items =
            from c in customers
            from o in c.Orders
            where o.OrderId == 7
            select c.City;
        Console.WriteLine("Count: {0}\nFrom following City:", items.Count());
 

        foreach (string city in items)
            Console.WriteLine(city);
    }
);

I’ve taken the following overloaded version of Parallel.Invoke().

public static void Invoke(params Action[] actions);

I’ve specified two different actions that act on customers.

The source code for this article is available at http://www.udooz.net/file-drive/doc_details/8-net-40-tlp-demo-1.html.

In the next post, I’ll explain more about tasks, the Parallel LINQ portion and the task scheduler.

Categories: .NET
Posted By: udooz
Last Edit: 27 Aug 2009 @ 08 13 PM

 15 Aug 2009 @ 7:23 AM 

Two security issues really surprised me.  One is in Linux and the other is in Adobe Flash.

Linux Kernel and NULL Pointers

For operations that some protocols do not implement, the Linux kernel dereferences function pointers without first checking them for NULL.  An attacker can exploit this to get his own code executed with kernel privileges.  For more details, visit: http://blog.cr0.org/2009/08/linux-null-pointer-dereference-due-to.html.

Flash’s Vulnerability Pitch

Flash is one of the premier vehicles for web sites with extravagant content.  A critical vulnerability allows attackers to compromise systems running Flash 9.x and 10.x on all platforms.  Visit http://www.adobe.com/support/security/bulletins/apsb09-10.html to download the patch.

Finally, some good news about IE 8.

IE8 – The Most Secure Browser in the Universe (Google’s promo style!)

NSS Labs, one of the leading independent product security testing and certification bodies, has published a comparative browser security test of IE 8, Firefox 3, Safari 4, Chrome 2 and Opera 10.  The report says that IE 8 (83%), followed by FF 3 (80%), is the most consistent in offering a high phishing URL block rate.  Chrome and Safari score 26% and 2% respectively.

The socially engineered malware block rate for IE8 is 81%, which surpasses all the other browsers on earth (again, Google’s promo style!).  FF3 scores 27% and Chrome 2 scores 7%.

Read the complete report at http://www.nsslabs.com/browser-security.

Okay, now let me briefly explain the reason for this post’s title.  People from OS (open source) have always said that they are stronger in skills than the engineers at Microsoft and the other CS (closed source) No. 1s.  Now they have to understand that skill is not at all related to open source.  It is a myth.

PS: I am not against OS.

Categories: General
Posted By: udooz
Last Edit: 16 Aug 2009 @ 07 39 PM

 12 Aug 2009 @ 8:47 AM 

In response to the growing Bing and Facebook search engines, Google is developing a next-generation search engine with a new architecture.  It is keeping the features and internals secret.

The development version is available at http://www2.sandbox.google.com/.  The only difference I found when I googled “Next generation Google search engine” is in the number of search results.  The current engine gives 37,300,000 results, and the new one gives 10,100,000 results (it could be more accurate!).

The differences between the two are explained at http://www.betanews.com/article/Googles-next-search-engine-Whats-the-difference/1250012846.

Categories: General
Posted By: udooz
Last Edit: 12 Aug 2009 @ 08 50 AM

 03 Aug 2009 @ 8:42 AM 

Erik Meijer, the man behind LINQ, has now come up with a framework called “Rx Framework”, which contains an API that is the mathematical dual of LINQ2Objects.  Let us see it in detail.

LINQ2O and Mathematics Dual

Wikipedia says “A morphism f:A->B is a monomorphism if f.g = f.h implies g = h. Performing the dual operation, we get the statement that g.f = h.f implies g = h for a morphism f: B->A. This is precisely what it means for f to be an epimorphism. In short, the property of being a monomorphism is dual to the property of being an epimorphism.”

LINQ to Objects is a set of extension methods for manipulating an IEnumerable<T>.  From a layman’s point of view, a collection is nothing but a data source from which we pull data using LINQ to Objects and/or, of course, foreach.  In terms of duality, this is f:A->B.

It is very common that we need to notify a data source when new items are to be added or existing items updated.  The interaction may happen synchronously but, in general, it happens asynchronously, since we live in a non-deterministic, disconnected programming world.

Updating a data source is usually triggered by some user interaction in an application, through an event.  An event is the programming world’s idiom for asynchronous invocation; even an application without user interaction is again based on an “event” driven architecture.  This model is based on GoF’s observer pattern.

Observer Pattern

GoF says “Define a one-to-many dependency between objects so that when one object changes state, all its dependents are notified and updated automatically”.

Observer Pattern

Observers that want to be notified whenever the subject undergoes a change in state attach themselves to the subject.  Take a button click event: we can write one or more event handlers for it.  Whenever the button is clicked, it sends EventArgs to its subscribed observers (the event handlers).
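The button analogy can be sketched with a plain .NET event.  ClickSource is a stand-in I made up so that the example runs without any UI framework: attaching is +=, detaching is -=, and raising the event notifies every attached handler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subject: stands in for a UI button.
public class ClickSource
{
    public event EventHandler Clicked;

    public void RaiseClick()
    {
        var handler = Clicked;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class ObserverPatternSketch
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var button = new ClickSource();

        EventHandler first = (sender, e) => log.Add("first");
        EventHandler second = (sender, e) => log.Add("second");

        button.Clicked += first;    // attach
        button.Clicked += second;   // attach
        button.RaiseClick();        // both observers are notified

        button.Clicked -= second;   // detach
        button.RaiseClick();        // only the first observer is notified

        return log;
    }
}
```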

Introducing IObservable and IObserver

Let us come back to the actual problem.  IEnumerable<T> can only be used to pull data from a data source, read-only, i.e. f:A->B, where A is the data source and B is my IEnumerable<T>.  However, I want f: B->A, which means that I would like to send/push data from “B” to “A”.  Here, I treat the callbacks or event handlers as “A” and the events as “B”.  In other words, I need something for pushing data that is the counterpart of what IEnumerable is for pulling.

As part of the Reactive Framework, Meijer introduced two interfaces, IObservable<T> and IObserver<T>.  In terms of the observer pattern, IObservable<T> is the subject (the source).  See the following figure.

IObservable and IObserver

The Subscribe() method is used to register one or more IObservers for notification, which is similar to Subject.Attach() in the observer pattern.  The IDisposable it returns is the .NET idiom that mimics Subject.Detach().  To understand the fundamental objective of these objects, think of IEnumerable in reverse.  To traverse an IObservable, create an IObserver and give it to the IObservable, and the IObservable “pushes” data into the IObserver by invoking its methods.

OnNext() is called for each item in the data source, with the item pushed as the argument “value”.  OnCompleted() is the post-iteration handler.
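A minimal, hand-written pair of these interfaces could look like the sketch below.  It assumes the interface shapes that later shipped in the final .NET 4.0 base class library as System.IObservable<T> and System.IObserver<T>, which may differ in detail from the preview assembly discussed here.  SimpleObservable and RecordingObserver are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical source: pushes values to whoever has subscribed.
public sealed class SimpleObservable<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    // Like Subject.Attach(); the returned IDisposable plays the role of Detach().
    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new Unsubscriber(observers, observer);
    }

    public void Publish(T value)
    {
        foreach (var observer in observers.ToArray()) observer.OnNext(value);
    }

    public void Complete()
    {
        foreach (var observer in observers.ToArray()) observer.OnCompleted();
        observers.Clear();
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly List<IObserver<T>> list;
        private readonly IObserver<T> observer;

        public Unsubscriber(List<IObserver<T>> list, IObserver<T> observer)
        {
            this.list = list;
            this.observer = observer;
        }

        public void Dispose() { list.Remove(observer); }
    }
}

// Hypothetical observer: records everything that is pushed into it.
public sealed class RecordingObserver<T> : IObserver<T>
{
    public readonly List<T> Values = new List<T>();
    public bool Completed { get; private set; }

    public void OnNext(T value) { Values.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { Completed = true; }
}
```

Disposing the subscription before a later Publish() means that value never reaches the observer, just as a detached observer stops being notified.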

An Example

Enough theory.  Let us see an example in Silverlight 3.0.  I create a button whose mouse move event acts as the IObservable, and a TextBox which acts as the IObserver.  Whenever the mouse moves over the button, the textbox observer appends the coordinates to the TextBox.Text property.

To create a mouse move observable, I used System.Linq.Observable, which is used to create IObservables.  I extend System.Windows.Controls.Button with GetMouseMoves(), which returns IObservable<Event<MouseEventArgs>>, like the following:

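using System;
using System.Linq;              // Observable (preview System.Reactive.dll)
using System.Windows.Controls;  // Button
using System.Windows.Input;     // MouseEventArgs, MouseEventHandler

public static class ButtonExtension
{
    // Exposes the button's MouseMove event as an observable sequence.
    public static IObservable<Event<MouseEventArgs>>
      GetMouseMoves(this Button button)
    {
        return Observable.FromEvent((EventHandler<MouseEventArgs> genericHandler)
          => new MouseEventHandler(genericHandler),
            mouseHandler => button.MouseMove += mouseHandler,
            mouseHandler => button.MouseMove -= mouseHandler);
    }
}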

The FromEvent() overload takes three delegates as arguments.  The first converts a generic event handler (EventHandler<MouseEventArgs>) to a MouseEventHandler.  The remaining two register and unregister the handler.  For the observer side, I create a custom observer for handling the mouse coordinates:

public class CoordinateObserver : IObserver<Event<MouseEventArgs>>
{
    public TextBox CoordinateTextBox { get; set; }
 

    #region IObserver<Event<MouseEventArgs>> Members

    public void OnCompleted()
    {

    }

    public void OnError(Exception exception)
    {

    }

    public void OnNext(Event<MouseEventArgs> value)
    {
        CoordinateTextBox.Text += (string.Format("Clicked at ({0},{1})\n",
          value.EventArgs.GetPosition(CoordinateTextBox).X,
          value.EventArgs.GetPosition(CoordinateTextBox).Y));
    }

    #endregion
}

I create an instance of CoordinateObserver and subscribe it to the button’s GetMouseMoves():

rxButton.GetMouseMoves().Subscribe
  (new CoordinateObserver { CoordinateTextBox = this.rxResult });

The results would be

 The output

Download and Resources

As of now, there is no official release of the Reactive Framework.  However, the Silverlight 3.0 Unit Test Framework, which comes as part of the Silverlight 3.0 Toolkit, uses it.  The assembly name is System.Reactive.dll, which unfortunately is built against Silverlight’s System.Core, Version=2.0.5.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e.  See the following image:

Assembly

  1. First download Silverlight 3.0 Toolkit and install it.
  2. Extract source code from $Program Files$\Microsoft SDKs\Silverlight\v3.0\Toolkit\Jul09\Source\Source Code.zip.
  3. Get the System.Reactive.dll from $Source Code Folder$\Binaries.

The full source code of the example is available at http://www.udooz.net/index.php?option=com_docman&task=doc_download&gid=5&Itemid=5.

Categories: .NET, Silverlight
Posted By: udooz
Last Edit: 03 Aug 2009 @ 08 58 PM
