This is Joe Mambretti’s third IMR article on advances in communications technologies for the 21st century. Prof. Mambretti is director of the International Center for Advanced Internet Research (iCAIR) at Northwestern University. His first article described the significant improvements in legacy communication services and technology being driven by the capabilities needed for 21st century communications (Jan/Feb 2013, pp. 18-21). His second highlighted macro trends that are rapidly transforming the communications landscape (March/April 2015, pp. 3-4, 13-14).
His latest article, below, focuses on how enhanced capacity and programmability are facilitating the transport of large-scale Big Data on today’s networks.
Advanced economies are increasingly based on digital information, which is growing exponentially in volume as new network-based enterprise and consumer services, applications and devices proliferate. In response to this growth, new networking services and technologies are being developed to transport ever-larger volumes of data across local and wide area networks.
Some of these new developments directly address capacity itself, for example, the trend toward networking at 100 Gbps and beyond as a replacement for commonly deployed 10 Gbps paths. A related trend, however, is the provision of new architectures, technologies and techniques for programmable networking, such as Software Defined Networking (SDN), which is required to ensure optimal utilization of large-scale network capacity.
When general communities assess network quality and performance, usually only raw capacity is measured. Increasingly, however, the programmability of a network is an equally critical attribute. Both are required to appropriately address the requirements for transporting large-scale “Big Data” on today’s networks.
Emergence of 100 Gbps Networking
Over the last few years, 100 Gbps networking has emerged from a set of research projects, prototypes and industry standards to a wide spectrum of commercial products. These products include 100 Gbps routers, switches, optical transport equipment, and even server Network Interface Controllers (NICs) that provide 100 Gbps network interfaces for compute and storage nodes.
Today, the majority of wide-area networks are based on 100 Gbps technologies, and increasingly 100 Gbps paths are being implemented not only in the network core but also to facilities at the edge of the network, especially computer clouds with data centers. Also, the majority of large-scale data centers have 100 Gbps core networks, and many have implemented Wide Area Network (WAN) links among data centers that enable traffic to be transmitted at 100 Gbps.
Major exchanges are also implementing 100 Gbps switching capabilities. For instance, the StarLight International/National Communications Exchange in Chicago currently supports more than forty 100 Gbps connections.
Given that 100 Gbps connections have become the “currency of the realm,” standards group researchers are planning for capacity beyond 100 Gbps, exploring options for connections at 300 Gbps, 400 Gbps and 1 Tbps (terabit/second). These enhanced capacities are supporting many millions of individual data flows, including a large share of digital media traffic.
However, this capability can also support specialized applications, including data-intensive science, ultra-high-resolution media, and next-generation medicine based on real time analytics.
Transmitting 300 Gbps
In May and June of 2016, in partnership with Ciena, CANARIE (the national research and education network of Canada), and the StarLight consortium, iCAIR at Northwestern University undertook a trial directed at transmitting 300 Gbps over 1440 km using coherent optical modulation (8QAM, quadrature amplitude modulation) to place 150 Gbps of capacity on each of two light paths.
The trial was conducted successfully on a live research network testbed connecting the StarLight facility in Chicago to Ciena’s research labs in Ottawa. This demonstrated the potential for providing efficient, flexible large-scale network capacity over long distances.
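The arithmetic behind such a trial can be sketched with the standard coherent-optics line-rate relation (symbol rate × bits per symbol × polarization states). The 25 Gbaud symbol rate below is an illustrative assumption, not a figure from the trial, and real systems carry additional forward-error-correction overhead:

```python
import math

def channel_capacity_gbps(baud_gbaud, modulation_order, polarizations=2):
    """Per-carrier line rate: symbol rate x bits/symbol x polarization states."""
    bits_per_symbol = math.log2(modulation_order)   # 8QAM -> 3 bits per symbol
    return baud_gbaud * bits_per_symbol * polarizations

# With an assumed 25 Gbaud carrier, dual-polarization 8QAM yields 150 Gbps
# per light path, so two light paths total 300 Gbps.
per_path = channel_capacity_gbps(baud_gbaud=25, modulation_order=8)
total = 2 * per_path
print(per_path, total)  # 150.0 300.0
```

Raising the modulation order (e.g., to 16QAM) or the symbol rate increases the per-carrier rate, which is the trade-off space such trials explore.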
For the SC14 supercomputing conference in November 2014, iCAIR, CalTech and multiple other research partners showcased a Tbps optical fiber ring composed of ten 100 Gbps channels. In 2016, iCAIR, the StarLight consortium, SCinet, and other research partners implemented another Tbps optical network for SC16, Nov. 16-17 in Salt Lake City, using variable-capacity light paths (e.g., 200 Gbps paths) as bonded paths within super channels.
This demonstration showcased large-scale data-intensive science applications, including high-energy physics, radio astronomy, weather prediction, geophysical sciences, and computational genomics. Several demonstrations showcased capabilities for enhanced network services for high-energy physics experiments using data from the Large Hadron Collider at CERN. Others demonstrated new capabilities for precision medicine by enabling advanced techniques for bioinformatics using genomic data.
As noted, it is important to provide not only large-scale capacity in networking but also programmability at 100 Gbps. Many techniques for programmable networking have been developed over many years by the Grid networking community, through initiatives directed at ensuring that networks were “first class Grid resources” managed by Grid middleware (1). A number of these architectures and capabilities were formalized by the Open Grid Forum (OGF) standards organization (2,3).
Other programmable network techniques have emerged from research test beds such as the National Science Foundation’s Global Environment for Network Innovations (GENI), a large-scale distributed environment created to support experimental network science research (4,5).
Programmable networking has proliferated through the architecture, protocols and technologies that comprise Software Defined Networking (SDN) and the associated OpenFlow protocol, which separate the control plane from the data plane. This separation allows more flexible and granular control over individual data flows in a network, enabling traffic optimization, efficient load balancing, and customized applications and services through the segmentation of individual flows (6).
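The control/data-plane split can be illustrated with a minimal sketch: a controller program installs match-action rules, and the “switch” then forwards packets purely by table lookup. This is a toy model of the idea, not a real OpenFlow controller or switch, and the packet fields and rule actions are hypothetical:

```python
class Switch:
    """Toy data plane: forwards only by consulting an installed flow table."""
    def __init__(self):
        self.flow_table = []          # (match_fn, action) pairs, in priority order

    def install_rule(self, match_fn, action):
        self.flow_table.append((match_fn, action))   # control plane writes rules

    def forward(self, packet):
        for match_fn, action in self.flow_table:     # data plane: lookup only
            if match_fn(packet):
                return action
        return "send_to_controller"                  # table miss -> ask controller

sw = Switch()
# Controller policy: steer one large science flow onto a dedicated 100G path,
# load-balance everything else across the default link aggregation group.
sw.install_rule(lambda p: p["dst"] == "10.0.0.42" and p["port"] == 5001,
                "output:100g_path")
sw.install_rule(lambda p: True, "output:default_lag")

print(sw.forward({"dst": "10.0.0.42", "port": 5001}))  # output:100g_path
print(sw.forward({"dst": "10.0.0.7", "port": 80}))     # output:default_lag
```

Because the rules live outside the forwarding device, the controller can redirect or rebalance individual flows at any time without touching the data plane logic, which is the property the benefits below depend on.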
Benefits of Software Defined Networking
SDN is being widely deployed in provider networks, within large data centers supporting services based on large-scale clouds, and across WANs (e.g., SD-WANs) that interconnect data centers.
A major proven benefit of SDN is optimal network resource utilization, which provides exceptional cost efficiency. Another is a granular view into all individual flows across and among data centers, along with precise control over those flows. Such control allows operators to direct flows so that network resources are fully utilized, adjust dynamically to spikes in demand, balance flows in accordance with specific schedules during the day, and redirect flows when problems arise.
In addition, SDN is useful for providing tenant (organizationally owned) networks in data centers, in part based on Infrastructure-as-a-Service (IaaS) techniques. Large cloud providers support many thousands, even hundreds of thousands, of individual organizational tenants, each of which requires capabilities for designing, implementing, configuring, and managing its networks. Consequently, cloud providers are implementing SDN-based “self-service” networks, which enable them to provide tenants with management, control and data planes as well as provisioning, configuration, monitoring and analysis tools.
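As a sketch of what such a self-service interface might accept, the structure below models a tenant network request and a provider-side validation step. All field names, addresses and policies are hypothetical and do not correspond to any particular provider’s API:

```python
# Hypothetical self-service tenant network request (illustrative fields only).
tenant_network = {
    "tenant": "genomics-lab",
    "segments": [
        {"name": "analysis", "cidr": "10.10.0.0/24"},
        {"name": "storage",  "cidr": "10.10.1.0/24"},
    ],
    "policies": [
        # Only the analysis segment may reach storage, and only over NFS.
        {"from": "analysis", "to": "storage", "allow": ["tcp/2049"]},
    ],
    "monitoring": {"flow_export": True},
}

def validate(req):
    """Provider-side sanity check before provisioning the virtual network."""
    names = {seg["name"] for seg in req["segments"]}
    for policy in req["policies"]:
        if policy["from"] not in names or policy["to"] not in names:
            raise ValueError("policy references unknown segment")
    return True

print(validate(tenant_network))  # True
```

In a real deployment, an accepted request would be compiled down to SDN rules and virtual network state; the tenant designs the network, while the provider’s controllers enforce it.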
Currently, SDN is a single-domain set of architectures and protocols, a limitation that results in multiple isolated SDN “islands.” To address this issue, Software Defined Network Exchanges (SDXs) are being developed as prototypes to integrate these islands and to provide new capabilities for exchanges, e.g., highly granular views into individual data flows at exchanges and enhanced control over those flows.
For example, within the framework of the GENI initiative, the StarLight exchange has implemented a prototype SDX and is currently developing an NSF-supported international SDX to interconnect other SDXs emerging around the world, especially to support large-scale collaborative data-intensive science research, such as high-energy physics, computational bioinformatics, atmospheric modeling, and radio astronomy.
Innovative Specialized Services
The high degree of virtualization and programmability within these SDXs enables them to be customized for innovative specialized services. For example, for the SC15 supercomputing conference in Austin, Texas, the StarLight SDX demonstrated capabilities for switching individual 100 Gbps streams through a customized SDX to support extremely-large-capacity streams of data, which cannot be supported in traditional networks. Such streams are required by many types of data-intensive science research, as well as by ultra-high-resolution digital media.
Another type of service SDXs can support is end-to-end (E2E) encrypted channels, which can transmit large amounts of highly secure data.
An example of the practical implications of closely integrating extremely-high-capacity channels with high levels of programmability within SDXs is another demonstration showcased at SC15: a prototype BioInformatics SDX designed to support the complex workflow environments of precision medicine. Precision medicine enabled by precision networking!
These types of workflows require exceptionally high capacity and flexibility to support complex dynamic transfers of processes, extremely large files, and large collections of small files among multiple analytic and data repository sites. Such workflows are required by basic research into fundamental life processes, for example, investigations into genomic functions and abnormalities, and promising explorations of developing unique pharmaceutical compounds for patients based on their individual genomic sequence.
Such approaches require the transport of data over many 100 Gbps paths because sample sequence data must be analyzed within the context of large collections of related data at remote repositories.
Opportunities Impossible with Traditional Networks
Meeting the requirements of Big Data services and applications with a combination of large-scale capacity (100 Gbps and beyond) and highly agile programmability through technologies such as SDN is providing opportunities for many innovative capabilities not possible with traditional network architectures.
Also, SD techniques are being extended beyond networks through efforts to develop architectures termed “Software Defined Infrastructure” (SDI). These types of architectures and techniques are already being placed into production by large-scale cloud-based service providers and increasingly by communication providers to support general internet service.
Several specialized emerging applications have been noted, but there are still more opportunities. For example, the tools and techniques being developed for tenant networks, based on high levels of abstraction and virtualization, could lead to the development of worldwide “personal networks,” not just local personal networks.
To some degree, such personal networks can be seen in development in environments such as NSF’s GENI, where individual researchers can design, deploy and operate their own large-scale distributed networks, even to the extent of deploying their own customized protocols, including alternatives to TCP/IP.
For example, because the majority of internet use is based on obtaining information, some research groups are developing methods of transmitting information through object identifiers (e.g., content routing) instead of using physical addresses, i.e., an IP address.
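A minimal sketch of the idea follows: the request names the object, and the lookup resolves that name to the nearest replica rather than to one fixed host address. The object identifiers, site names and distance metric are all hypothetical:

```python
# Illustrative content (name-based) routing: the network tracks which sites
# hold copies of a named object and serves each request from the closest one.

REPLICAS = {   # object id -> sites holding a copy (hypothetical data)
    "doi:10.1000/genome-ref-38": ["chicago", "ottawa"],
    "video:lecture-901":         ["amsterdam"],
}

def route_by_name(object_id, requester_site, distance):
    """Resolve an object identifier to the nearest replica site, or None."""
    copies = REPLICAS.get(object_id)
    if not copies:
        return None
    return min(copies, key=lambda site: distance(requester_site, site))

# Toy distance metric: 0 to the local site, 1 to a known near site, else 5.
dist = lambda a, b: 0 if a == b else {("chicago", "ottawa"): 1}.get((a, b), 5)

print(route_by_name("doi:10.1000/genome-ref-38", "chicago", dist))  # chicago
print(route_by_name("video:lecture-901", "chicago", dist))          # amsterdam
```

The contrast with IP is that the requester never learns or cares which host answered; the name itself is the routing key, so replicas can be added or moved without changing how the content is requested.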
- F. Travostino, J. Mambretti, G. Karmous-Edwards (Eds.), Grid Networks: Enabling Grids with Advanced Communication Technology, John Wiley & Sons, July 2006.
- Roberts et al., “NSI Connection Service v2.0,” Open Grid Forum, Group Working Draft (GWD), Candidate Recommendation Proposed (R-P), 2013.
- Krzywania et al., “Network Service Interface – Gateway for Future Network Services,” TERENA White Paper, Feb. 2012.
- R. McGeer et al. (Eds.), The GENI Book, Springer International Publishing, 2016.
- N. McKeown et al., “OpenFlow: Enabling Innovation in Campus Networks,” ACM SIGCOMM Computer Communication Review, vol. 38, no. 2, 2008, pp. 69-74.
Joe Mambretti, Director, International Center for Advanced Internet Research (iCAIR), Northwestern University, http://www.icair.org; Director, Metropolitan Research and Education Network; firstname.lastname@example.org