Electronic Design

  


[Technology Report]
Multicore Projects Mean Multiple Choices
Multicore solutions may be finding their way into more projects, but opinions vary on the best architecture to use.

Daniel Harris  |   ED Online ID #17695  |   December 13, 2007


When it comes to multiprocessing, what’s good for the hardware goose is not necessarily good for the software gander. The ideal hardware architecture for a multicore design is a heterogeneous (asymmetric) single instruction-set architecture (ISA) that includes both high- and low-complexity cores to achieve lower power and higher throughput, somewhat mitigating Amdahl’s Law [1].

Now imagine that Amdahl’s Law (which gives a system’s maximum expected improvement when only part of the system is improved) were of no concern and we had unlimited die sizes. The ideal multicore from a programming perspective would be homogeneous (symmetric), so code wouldn’t become dependent on a specific ISA. Courtesy of IBM, Sony, and Toshiba, the Cell microprocessor has a heterogeneous architecture, though it isn’t a single ISA. Yet programming the device can be rather arduous, leaving you with code that’s heavily architecture-dependent. According to Dave Haas, principal architect at Raza Microelectronics, you should be careful not to pigeonhole yourself into a given vendor or architecture when you can avoid it, making homogeneous architectures a safer bet when given a choice.
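To make the Amdahl’s Law limit mentioned above concrete, here is a minimal Python sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of cores. The function and parameter names are illustrative and do not come from the article or any vendor.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work scales with core count.

    parallel_fraction: share of the original run time that can be parallelized (0..1).
    cores: number of identical cores applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of core count;
    # only the parallel part is divided among the cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

With 90% of the work parallelizable, the result can never reach 10x no matter how many cores are added, because the remaining 10% serial portion dominates. That bound is why a few high-complexity cores for the serial work can still matter in a heterogeneous design.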

Regardless of the best approach, there’s a limited number of options for today’s embedded and general-purpose system designers. If you’re in the embedded space, several of the multicore choices are heterogeneous. If you live in a general-purpose world, you might only be able to get a homogeneous multicore.

DECISIONS • When it comes to multiprocessing, several tradeoffs determine how much performance you can squeeze out of your transistors (see the table). For example, there’s the thread-versus-core tradeoff. According to Kevin Kissell, MIPS principal architect, you must start by analyzing your system to determine which applications can be decomposed into a number of constituent tasks or threads.

“Parallelization of monolithic applications is often possible, but seldom easy, and it’s generally easier for a big scientific code than a small embedded real-time application,” says Kissell. And to save on area, consider utilizing a more thread-heavy architecture. The idea is to maximize the performance per watt and choose an architecture that will saturate the memory and power envelope.

“To the extent that a single-threaded core cannot keep its pipeline fully utilized because of delays from memory and slow functional units, multithreading can extract throughput with a relatively modest increase in area, and in many cases the payback is superlinear,” he says.
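Kissell’s point about recovering idle pipeline cycles can be illustrated with a deliberately idealized model. The Python sketch below assumes each thread alternates a fixed burst of useful work with a fixed stall (for example, a memory access), that another ready thread can fill a stall, and that thread switching is free. These assumptions and all the names are hypothetical simplifications, not a description of any MIPS core.

```python
def pipeline_utilization(compute_cycles: int, stall_cycles: int, threads: int) -> float:
    """Fraction of cycles spent on useful work in an idealized multithreaded core.

    compute_cycles: cycles of useful work a thread does between stalls.
    stall_cycles: cycles a thread then waits (e.g., on memory).
    threads: hardware threads sharing the pipeline (switching assumed free).
    """
    if compute_cycles <= 0 or stall_cycles < 0 or threads < 1:
        raise ValueError("need compute_cycles > 0, stall_cycles >= 0, threads >= 1")
    # One thread keeps the pipeline busy only while it is not stalled.
    single_thread = compute_cycles / (compute_cycles + stall_cycles)
    # Extra threads fill the stalls until the pipeline is saturated.
    return min(1.0, threads * single_thread)
```

In this model a thread that works for 10 cycles and then stalls for 30 keeps the pipeline 25% busy on its own, and four such threads saturate it. Real cores fall short of this because threads contend for caches and functional units.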

For instance, you might achieve 30% more throughput for 15% more area in the CPU and cache subsystem. “This can be converted into a power optimization if that recovery of lost bandwidth allows the multithreaded core to run at a lower frequency than an equivalent single-threaded core, and still meet performance targets,” says Kissell.
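The arithmetic behind the 30%-for-15% example can be sketched as follows. The second function assumes throughput scales linearly with clock frequency, which is a simplification (memory latency does not scale with the core clock). The function names are illustrative only.

```python
def throughput_per_area_gain(throughput_gain: float, area_increase: float) -> float:
    """Ratio of new to old throughput per unit area.

    Both arguments are fractional increases, e.g. 0.30 for 30%.
    """
    return (1.0 + throughput_gain) / (1.0 + area_increase)


def iso_throughput_frequency(throughput_gain: float) -> float:
    """Relative clock frequency that delivers the original throughput.

    ASSUMPTION: throughput is proportional to clock frequency.
    """
    return 1.0 / (1.0 + throughput_gain)
```

For the figures quoted above, `throughput_per_area_gain(0.30, 0.15)` is about 1.13, meaning roughly 13% more throughput per unit of area. Under the linear-scaling assumption, `iso_throughput_frequency(0.30)` is about 0.77, so the multithreaded core could match the single-threaded core’s throughput at roughly 77% of its clock rate, which is the power optimization Kissell describes.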

So if your application doesn’t require significant amounts of shared data or instructions, a distributed memory scheme is probably the best candidate. “Each processing element’s memory can be sized to its dedicated tasks,” Kissell says, “and one can use different processor frequencies, different processor models, and even different processor architectures for the different processing elements to achieve the best area/power/performance values.”

But if there’s an abundance of code and/or data sharing, a symmetric configuration may be your best bet. According to Kissell, this approach “adds complexity and loses a bit of peak performance relative to a distributed memory model, because there will be some contention for the shared memory array, and because a cache-coherency protocol must be used among the cores to ensure that they all see the same values at each memory location, despite the presence of caches.”

But according to Chuck Moore, senior fellow for Advanced Micro Devices, end users may have misaligned expectations about multicore technology.

“Multicore is very good for throughput and responsiveness, but given that most applications are still serial, these actually won’t speed up on multicore,” says Moore. “Over time, there will be an increasing number of parallel applications available, but this is going to take more time than people seem to realize.”

DIFFERENT VIEWS • When it comes to multiprocessing, all “coaches” believe their team has the best strategy for winning (see “Multicore My Way” at www.electronicdesign.com, ED Online 14631). Take AMD and Intel, which have gone public about their opposite approaches to next-generation cores. Intel believes homogeneous cores are the way to go, while AMD believes the future lies in heterogeneous cores.

“Multicore solutions of tomorrow will be heterogeneous,” says AMD’s Moore. “They will initially involve the use of architecturally compatible cores with varying capabilities, but will grow to include more special-purpose and power-efficient hardware that is accessed through well-defined APIs (application programming interfaces).”

Intel and Vivace Semiconductor also have radically different views of the embedded space. “Intel’s Embedded and Communications Group estimates the percentage of multicore designs that will utilize asymmetric multiprocessing (AMP) in the next three to four years of all Embedded and Communications Group-deployed multicore platforms to be about 10%,” says Edwin Verplanke, platform solution architect with Intel’s Embedded and Communications Group.
