2008/10/26

LineRate Systems mentioned by Rocky Radar Media

Rocky Radar Media mentioned the presentation that I gave for LineRate Systems on 10/1 as part of the CU Innovation Alliance Breakfast at ESPRIT '08.

Paper: Visualizing Potential Parallelism in Sequential Programs

Graham Price, a fellow student at CU Boulder, will be presenting his paper on "Visualizing Potential Parallelism in Sequential Programs" at PACT in Toronto tomorrow (Monday 10/27).

This paper is related to my last post, which discussed how parallelism is an optimization. It presents a high-performance visualization technique, based on Binary Decision Diagrams (BDDs) and Quad Trees, that allows programmers to identify regions of code as candidates for coarse-grain parallelization.

More information on the paper can be found at the group's website http://ce.colorado.edu/core/

Visualizing Potential Parallelism in Sequential Programs
Graham D. Price, John Giacomoni, and Manish Vachharajani
The 17th International Conference on Parallel Architectures and Compilation Techniques (PACT), October 2008.


Abstract:
This paper presents ParaMeter, an interactive program analysis and visualization system for large traces. Using ParaMeter, a software developer can locate and analyze regions of code that may yield to parallelization efforts and to possibly extract performance from multicore hardware. The key contributions in the paper are (1) a method to use interactive visualization of traces to find and exploit parallelism, (2) interactive-speed visualization of large-scale trace dependencies, (3) interactive-speed visualization of code interactions, and (4) a BDD variable ordering for BDD-compressed traces that results in fast visualization, fast analysis, and good compression. ParaMeter's effectiveness is demonstrated by finding and exploiting parallelism in 175.vpr. Measurements of ParaMeter's visualization algorithms show that they are up to seventy-five thousand times faster than prior approaches.

2008/10/23

Parallelism is an optimization

Today I will start a series of posts on the complexities of writing parallel software, based on the introductory material I used in my recent Ph.D. dissertation proposal and the technology I am developing for LineRate Systems. This series of posts will begin by surveying the core parallel structures used in programming and then iteratively dig deeper into more complex parallel structures. I will also interleave posts on the elements of computer architecture necessary to understand the complexities of writing parallel programs on modern computer architectures.

*deep breath*

Before we can discuss the complexities of writing parallel programs, one point needs to be made clear, because it seems to escape the notice of most programmers and even graduate students in computer science (myself included).

Parallelism is an optimization and not a correctness criterion.

You might ask yourself, why is parallelism not included in the list of criteria for a program to be correct?  This makes no sense!

The answer is that the academic definition of program correctness only concerns itself with ensuring that a program terminates with the correct output for a given input. Notice that duration of computation is not part of this informal definition of correctness -- even though performance may be part of the list of requirements for a program to be considered useful. This is because a computation is nothing more than an ordered (total) set of deterministic transformations on a given set of inputs.

How often have you seen the following situation occur?

A software architect is given a set of requirements describing what the program must do to be considered a success (hopefully). Note that this list can range from extremely detailed to as vague a directive as, "Customer X has Problem Y, solve it." Most requirements lists fall somewhere in between these two extremes, resulting in many iterations of prototypes and specification changes before an acceptable result is produced, and there are very good reasons why this state of affairs is quite acceptable.

The architect immediately begins to design a parallel program because the requirements include performance metrics.

*ouch*

This is both problematic and perplexing, as parallel programs are often so complex that they defy the reasoning abilities of programmers and verification tools. This means that determining the cause of a perplexing result generated by a parallel program is difficult, or even impossible, leaving all parties in an uncomfortable lose-lose position.

A sane, albeit potentially more expensive, approach is to maintain a reference sequential (non-parallel) implementation against which the results can be verified. This approach has an additional advantage: benchmarking the sequential version often reveals the subsections that actually *need* parallelization -- most parallel implementations that I have seen are over-architected.
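As a minimal sketch of the reference-implementation idea, the Python below keeps a plain sequential function next to a parallel variant and checks that both produce identical output. The per-frame checksum, the function names, and the test data are all hypothetical and chosen only for illustration; they are not taken from any real system.

```python
from concurrent.futures import ThreadPoolExecutor


def frame_checksum(frame):
    """Per-item work (hypothetical): a 16-bit additive checksum of one frame."""
    return sum(frame) % 65536


def checksums_sequential(frames):
    """Reference implementation: a plain loop that is easy to reason about."""
    return [frame_checksum(frame) for frame in frames]


def checksums_parallel(frames, workers=4):
    """Optimized variant: the same per-frame work spread over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the output can be
        # compared element by element with the sequential reference.
        return list(pool.map(frame_checksum, frames))


def matches_reference(frames):
    """True when the parallel result is identical to the sequential one."""
    return checksums_parallel(frames) == checksums_sequential(frames)


if __name__ == "__main__":
    # Hypothetical test data: 200 frames of varying length.
    frames = [bytes([i % 256]) * (64 + i) for i in range(200)]
    print(matches_reference(frames))
```

The sequential function doubles as the thing to benchmark first: if it is already fast enough, the parallel variant is not needed at all.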

So again, parallelism is an often needed optimization and not a correctness issue.

Entrepreneurs Unplugged: With Todd Vernon and Walter Knapp of Lijit

Entrepreneurs Unplugged: With Todd Vernon, CEO and Walter Knapp, COO of Lijit

@ Engineering Center Room ECCR 245, University of Colorado
October 23, 2008, 5:00pm

Please join us on October 23, 2008 as the Silicon Flatirons Center presents Todd Vernon, CEO and Walter Knapp, COO of Lijit as our featured entrepreneurs in our Entrepreneurs Unplugged series. Mr. Vernon and Mr. Knapp, leading Colorado entrepreneurs, have a wide breadth of experience on the business and technical sides of creating start-ups.

Entrepreneurs Unplugged is a meeting place where faculty, students and community members with technical backgrounds learn about and get involved in entrepreneurship. In particular, the program offers students and faculty an opportunity to learn how a successful start-up is created as well as an opportunity to network. Each Entrepreneurs Unplugged meeting features food, drink and - most importantly - an experienced entrepreneur to discuss his/her start-up experiences.

For additional info, see http://www.silicon-flatirons.org/events.php?id=476.

2008/10/16

CUNVC Workshop #2: Your Business Concept: Opportunities vs. Ideas

The CU New Venture Challenge's second workshop, "Your Business Concept: Opportunities vs. Ideas," will be tonight (10/16) at 19:00 in the ATLAS Auditorium. The event will be preceded by networking in the lobby at 18:30.

2008/10/13

Candidacy

Today I gave my oral Ph.D. dissertation proposal defense exam... and passed :)
(well technically I will collect the 5th signature tomorrow, but I have 4 out of 5 which is sufficient *g*)

All in all not too bad :)  Advice to future students in this situation is to schedule the defense earlier than 10am so the committee isn't awake enough to ask detailed questions ;)

Now to get cranking on the work needed for my PhD and LineRate Systems as the core technology is the same :)

Stay tuned!

2008/10/06

CU Innovation Alliance Breakfast Redux

Last week I presented a brief overview on what LineRate Systems is all about at the CU Innovation Alliance Breakfast that was part of ESPRIT '08.

I wish that I had had time to present my thoughts last week while everything was still fresh in my mind; alas, my pesky PhD dissertation proposal consumed my attention for the remainder of the week and weekend ;)


So briefly here are my thoughts :)

Suvica - Has a system to quickly design experiments that combine different anti-cancer agents to reduce the number of trials needed to find true positives. The system leans towards eliminating false positives at the cost of more false negatives. But this is not so bad, as the efficiency with which true positives are found is high.

Ion Engineering: Colorado Carbon Capture - They talked mostly about "sweetening" natural gas (removing CO2 and H2S), although their techniques apply to a wide range of purification systems. The net effect is that a company tapping a dirty source of natural gas can add purification equipment and deliver sweetened natural gas from a source that was previously too dirty.

TissueFusion - They use lasers to fuse tissues back together without the need for the surgeon's usual arsenal of tools that are best left in the hands of a tailor ;)  Their initial product is focused on septoplasty and rhinoplasty although they see a wide range of applicability.

LineRate Systems - this was my presentation and I will talk more about it below.

3QMatrix - They are tackling the problem of wounds that refuse to heal and remain open. Existing solutions are very expensive, and wounds often go without proper treatment because of the expense. Their product is a new type of packing which is designed to help the healing process and can be "functionalized" with medications and other substances to help the wound heal. The product delivers dramatically accelerated healing at a fraction of the cost.

XenoPur Systems - Has a technique to remove heavy, precious, and radioactive metals from a solution (including the sludge left behind by mines).

KMLabs - Builds femtosecond lasers for use in research facilities. They've been around since 1994, have been gaining in popularity, and have been expanding to meet the needs of university research labs, homeland security, and other labs.





Ok, now onto my presentation of LineRate Systems, which, given Dave Taylor's post on his blog, wasn't delivered as well as it could've been :/ and thus deserves a bit of clarification.

Apparently I gave the impression that we thought we had no competition and that the 40 or so companies that I did identify could not compete with us in terms of software innovation @.@ 

My co-founder and I consider ourselves infrastructure people (aka "plumbers"): we make everything flow smoothly and quickly inside your system. We are under no illusion about our ability to innovate on full products against the teams of Cisco and Juniper *ouch*

Our focus is on delivering high-value low-cost network appliances with no-hassle support and sales. period.

The confusing marketing slide (I admit it - we had been warned about it before) was supposed to show that our plan is to drive the total revenues of the markets we are interested in to the level of existing vendors' cost-of-goods-sold. This then leads us to believe that the incumbents must respond in one of the following ways: 1) retreat up-market, 2) license our acceleration technology (not the full products), 3) develop their own low-cost solution (I may have forgotten to mention this), or 4) die. The dangers are twofold: a) we fail to establish a sustainable business at this level of revenue, or b) account control locks us out of the market.

That being said, there are a couple of companies that we are closely tracking as competitors :)

My PhD dissertation proposal submitted and network appliance performance ratings

Today I finally submitted my PhD Dissertation Proposal to my committee!
After next Monday's oral presentation I'll have only one pesky "little" exam left ;)

The document is titled, "Supporting Fine-Grain Parallelism on Commodity General-Purpose Multicore Hardware," and will wrap up my research on using systems based on multicore processors (e.g., Intel quad-core Xeon processors) as network processors (e.g., Intel IXP series).

The two key technologies so far are FastForward and the Frame Shared Memory architecture (FShm). FastForward is a cache-optimized core-to-core software communication mechanism (a concurrent lock-free data structure, for those that care) that decreases the observable latency by an order of magnitude (10x) over the previous gold standard described by Leslie Lamport in 1983. FShm is an organization that allows safe sharing of buffers between processes and network interfaces. The two have been used to demonstrate true line-rate bridging of 4 Gigabit Ethernet links without dropping a single frame. (In the context of networking, a frame represents the link-layer message, while a packet usually refers to the higher-layer message, e.g., an IP packet.) One of the goals of the proposal is to increase the performance to support a single 10 Gigabit Ethernet stream, where frames can be arriving at a rate of one every 67 nanoseconds.
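For the curious, the 67 nanosecond figure can be reproduced with a little arithmetic. The sketch below assumes (the post does not say so explicitly) that the worst case is a stream of minimum-size 64-byte Ethernet frames, each accompanied on the wire by the standard 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap. The function names are made up for this example.

```python
# Standard Ethernet per-frame sizes, in bytes.
MIN_FRAME_BYTES = 64        # smallest legal frame, header and FCS included
PREAMBLE_SFD_BYTES = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def frame_interval_ns(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Time between back-to-back frame arrivals on one link, in nanoseconds."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return wire_bits * 1e9 / link_bps


def frames_per_second(link_bps, frame_bytes=MIN_FRAME_BYTES):
    """Worst-case frame arrival rate on one link."""
    return 1e9 / frame_interval_ns(link_bps, frame_bytes)


if __name__ == "__main__":
    ten_gigabit = 10e9  # bits per second
    print(frame_interval_ns(ten_gigabit))   # about 67.2 ns per frame
    print(frames_per_second(ten_gigabit))   # about 14.88 million frames/s
```

Under those assumptions each frame occupies 84 bytes (672 bits) of wire time, which at 10 gigabits per second works out to roughly 67.2 nanoseconds, or close to 14.9 million frames per second that must each be handled without a drop.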

Note! This is very different from a device that claims to support 10 gigabits per second in aggregate over many ports while also being allowed to randomly drop frames to achieve this level of performance.

Why is this distinction important?  There are two scenarios.

1) Raw firewall performance. If a firewall drops any link-layer frame of a large Internet Protocol (IP) packet, the entire packet needs to be retransmitted. If you are using a cheap wireless router at home I'm sure you've seen this problem in action; a transmission runs quickly and then suddenly chokes and hangs. What is happening is that the system falls back to a back-off mode meant to relieve network congestion (the usual reason frames are lost) when there really is none.

2) Consider an attacker who wishes to remain as anonymous as possible. If the attacker knows that the intrusion detection/prevention system cannot sustain line-rate performance on a single link, they could initiate a Denial-of-Service attack whose entire purpose is to probabilistically hide the attacker's true malicious network packets. If the security system misses some or all of the attack frames, it may be impossible to prevent the attack from completing or to perform forensic analysis on the data.

Therefore, before you buy your next network appliance, consider what your needs really are and ask your vendor what they mean by their performance ratings.
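To put rough numbers on the two scenarios above, here is a back-of-the-envelope sketch. It assumes that frames are dropped independently with a fixed probability, which is a simplification (real drops tend to come in bursts), and the function name and the example figures are hypothetical.

```python
def all_frames_survive(frame_count, frame_drop_rate):
    """Probability that none of frame_count frames is dropped.

    Assumes independent drops at a fixed rate; an illustration only.
    """
    if frame_count < 0:
        raise ValueError("frame_count must be non-negative")
    if not 0.0 <= frame_drop_rate <= 1.0:
        raise ValueError("frame_drop_rate must be between 0 and 1")
    return (1.0 - frame_drop_rate) ** frame_count


if __name__ == "__main__":
    # Hypothetical case: a large IP packet carried in 45 link-layer frames
    # through a device that drops 1% of the frames it sees.
    print(all_frames_survive(45, 0.01))
```

With these made-up numbers a 1% frame drop rate leaves only about a 64% chance that the whole packet arrives intact, so roughly one large packet in three has to be retransmitted. Read the other way around for the second scenario, the same expression is the chance that a security appliance sees every frame of a 45-frame attack.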