Communications Crossroads - Fall 2009
Network Storage Technology: A Primer
The late, great George Carlin used to talk about “stuff.” A house is just a place for your stuff, he’d say. If you didn’t have so much stuff, you wouldn’t need a house. You go on vacation, you bring some stuff – but while you’re there, you buy more stuff. In roughly five minutes, he brilliantly and hilariously dissected the preposterousness of humankind’s perceived “need” for all manner of things.
Things haven’t changed much in the 20+ years since that sketch debuted. These days, people still “need” all kinds of stuff. Content-rich stuff, in fact. Your customers need IPTV. They need video on demand. They need to share photos from their company’s annual meeting in real time. And they really need to be able to update their Facebook pages from the airplane.
Customers and their personal and professional needs vary. In other words, not everyone needs the same stuff. But telecommunications providers need to have it all on hand – just in case. So where do you keep such a vast collection of rich data? How might you be storing it in the future? The experts weigh in here, outlining the challenges and projecting the future of network storage.
But first, a brief glossary, as provided by Jieming Zhu, a distinguished technologist for HP StorageWorks, to help you muddle through.
Storage Area Networks
A storage area network (SAN) is a network architecture that attaches different computer storage devices, such as disk arrays and tape libraries, to servers in such a way that the devices appear as locally attached to the operating system.
A SAN is usually clustered in close proximity to other computing resources but also may extend to remote locations for backup and archival storage, using wide area network carrier technologies.
SANs support disk mirroring, backup and restore, storage and retrieval of archived data, data migration from one storage device to another as well as the sharing of data among different servers in a network. SANs also can incorporate sub-networks with network-attached storage (NAS) systems.
Although the cost and complexity of SANs are dropping, they are typically part of the overall network of computing resources for larger enterprises.
Network Attached Storage
NAS is a complete storage system that is designed to be attached to a traditional data network to provide file access to heterogeneous network clients.
NAS uses file-based protocols, such as Network File System (NFS) or Common Internet File System (CIFS), where it is clear that the storage is remote and computers request a portion of an abstract file rather than a disk block. This differentiates a NAS device from a SAN, which presents storage to servers as raw disk blocks over a dedicated network.
In most cases, a NAS system is less expensive to purchase and less complex to operate than a SAN. However, a SAN can typically provide better performance and a larger range of configuration options.
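The file-versus-block distinction is easiest to see from the client's side. The sketch below (Python, with a hypothetical NFS mount point and block device path, neither tied to any product mentioned here) shows a NAS client asking for part of an abstract file, while a SAN client reads raw blocks from what the operating system treats as a locally attached disk.

    # NAS: the client works with abstract files on a remote share,
    # e.g. an NFS export mounted at the hypothetical path /mnt/nfs_share.
    with open("/mnt/nfs_share/reports/q3.csv", "r") as f:
        header = f.readline()      # request a portion of a file

    # SAN: the client sees a raw block device (here the hypothetical
    # /dev/sdb) that the operating system treats as a local disk.
    with open("/dev/sdb", "rb") as disk:
        disk.seek(512 * 2048)      # seek to block 2048 (512-byte blocks)
        block = disk.read(512)     # request a disk block, not a file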
Fibre Channel
Fibre Channel (FC) is a set of standards for connecting storage devices in a fabric network. It has become the standard connection type for SANs in enterprise storage. FC supports connectivity over fiber optic cabling or copper wiring.
FC technology can transmit data between computer devices at data rates of up to 8 Gbps. FC is especially suited for connecting computer servers to shared storage devices and for interconnecting storage controllers and drives. Since FC is fast, reliable, scalable and flexible, it has become the preferred transmission interface between servers and clustered storage devices in enterprise data centers.
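As a rough back-of-the-envelope check (assuming 8b/10b encoding, which FC uses through the 8 Gbps generation, and ignoring framing overhead), that top line rate works out to roughly 800 MB/s of usable payload:

    # Rough payload throughput for an 8 Gbps FC link; framing overhead
    # is ignored, so treat the result as an upper bound.
    line_rate_bps = 8e9            # nominal 8 Gbps line rate
    encoding_efficiency = 8 / 10   # 8b/10b: 8 data bits per 10 line bits
    payload_mb_per_s = line_rate_bps * encoding_efficiency / 8 / 1e6
    print(f"Usable payload: ~{payload_mb_per_s:.0f} MB/s")  # ~800 MB/s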
The emerging Fibre Channel Over Ethernet (FCoE) standard will further enable the FC protocol to be tunneled through enhanced Ethernet link layers, achieving the converged network infrastructure vision of carrying both local area network and SAN traffic over the same physical media.
iSCSI
iSCSI (pronounced “eye-scuzzy”) stands for Internet Small Computer Systems Interface. iSCSI is an IP-based storage networking standard for linking data storage facilities. By carrying SCSI commands over IP networks, iSCSI is used to facilitate data transfers over intranets and to manage storage over long distances.
When an application attempts to read from an iSCSI device, the SCSI read command is encapsulated inside a TCP packet. The TCP packet is then routed just like any other TCP packet on the network. When the TCP packet reaches its destination, the encapsulation is stripped off and the SCSI read command is interpreted by the iSCSI drive. SCSI write commands are handled in the same manner.
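To make the encapsulation concrete, here is a heavily simplified Python sketch of an iSCSI SCSI Command PDU carrying a SCSI READ(10) command, following the basic header layout defined in RFC 3720. Everything a real initiator needs beyond the header itself – login, session negotiation, sequence-number bookkeeping, digests – is omitted, and the function names are illustrative.

    import struct

    def build_read10_cdb(lba, num_blocks):
        # SCSI READ(10) command descriptor block: opcode 0x28, a 32-bit
        # logical block address and a 16-bit transfer length.
        return struct.pack(">BBIBHB", 0x28, 0, lba, 0, num_blocks, 0)

    def build_iscsi_scsi_command(cdb, task_tag, xfer_len):
        # Minimal 48-byte Basic Header Segment for a SCSI Command PDU.
        bhs = struct.pack(
            ">BBHBBH8sIIII",
            0x01,         # opcode: SCSI Command
            0x80 | 0x40,  # flags: Final bit + Read bit
            0, 0, 0, 0,   # reserved, AHS length, data segment length
            b"\x00" * 8,  # logical unit number (LUN 0)
            task_tag,     # initiator task tag
            xfer_len,     # expected data transfer length in bytes
            1,            # CmdSN (sequencing omitted in this sketch)
            1,            # ExpStatSN
        )
        return bhs + cdb.ljust(16, b"\x00")  # CDB field is 16 bytes

    pdu = build_iscsi_scsi_command(build_read10_cdb(lba=2048, num_blocks=8),
                                   task_tag=1, xfer_len=8 * 512)
    assert len(pdu) == 48
    # A real initiator would write this PDU to a TCP socket (port 3260)
    # after login; from there it is routed like any other TCP traffic.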
iSCSI is one of two main approaches to storage data transmission over IP networks; the other method, Fibre Channel over IP (FCIP), translates FC control codes and data into IP packets for transmission between geographically distant FC SANs. FCIP can only be used in conjunction with FC technology; in comparison, iSCSI can run over existing Ethernet networks.
iSCSI is a popular SAN protocol, allowing organizations to consolidate storage into data center storage arrays while providing hosts (such as database and web servers) with the illusion of locally attached disks. Unlike traditional FC, which requires a special-purpose FC infrastructure (host bus adapters, switches and cables), iSCSI can be run over long distances using existing Ethernet network infrastructures.
Solid State Drives
A solid state drive (SSD) is a storage device that stores persistent data on solid state random access media. Most SSD technology today is based on flash memory. SSDs actually are not hard drives at all, in the traditional sense of the term, as there are no moving parts involved. Instead, an SSD has an array of semiconductor memory organized as a disk drive, using integrated circuits rather than magnetic or optical media.
This arrangement has many advantages. Data transfer to and from solid state drives is much faster than with electromechanical disk drives. Input/Output (I/O) latencies are substantially reduced, as there is no seek time or rotational delay. Users typically enjoy much faster boot times as well. In general, SSDs also are more durable and much quieter, with no moving parts to break or spin up or down.
Development and adoption of SSDs have been driven by a rapidly growing need for higher I/O performance. High-performance laptops, and any application that needs to deliver information in or near real time, can benefit from SSDs. Historically, SSDs have been much more expensive than conventional hard drives. Due to improvements in manufacturing technology and expanded chip capacity, however, prices have dropped, leading both consumers and enterprise-level customers to re-evaluate SSDs as viable, if still somewhat expensive, alternatives to conventional storage.
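A quick, hedged calculation shows the scale of the latency difference. The hard drive numbers below are typical for a 7,200 RPM drive of this era; the flash access time is an order-of-magnitude assumption rather than a measured figure.

    rpm = 7200
    avg_rotational_ms = 0.5 * 60000 / rpm   # half a revolution on average
    avg_seek_ms = 8.5                        # typical average seek time
    hdd_access_ms = avg_seek_ms + avg_rotational_ms   # ~12.7 ms

    ssd_access_ms = 0.1                      # ~100 microseconds per read

    print(f"HDD: ~{hdd_access_ms:.1f} ms/access "
          f"(~{1000 / hdd_access_ms:.0f} random reads per second)")
    print(f"SSD: ~{ssd_access_ms} ms/access "
          f"(~{1000 / ssd_access_ms:.0f} random reads per second)")
    # Roughly two orders of magnitude: ~80 vs ~10,000 reads per second.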
In recent years, SSDs have been used in enterprise storage to speed up applications and performance without the cost of adding additional servers.
Storage Virtualization
Storage virtualization is generally defined as the “transparent abstraction of storage at the block level.” In essence, virtualization separates logical data access from physical data access, enabling users to create large storage pools from physical storage. Virtual disks are created from these pools and are allocated to servers on the network as logical storage when needed.
Virtual storage reduces the physical one-to-one relationship between servers and storage devices. The physical disk devices and distribution of storage capacity become transparent to servers and applications. According to John Webster, a storage industry analyst with Illuminata, “The single most important attribute of any storage virtualization solution is the ability to mask complexity and thereby make manageable that which is increasingly becoming unmanageable.” The goal of products and solutions using storage virtualization, therefore, is to simplify management.
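In code, the core idea is just a level of indirection. The sketch below (Python; the class and method names are illustrative, not any vendor's API) pools physical extents and hands out virtual disks whose block-to-device mapping is invisible to the server.

    class StoragePool:
        def __init__(self):
            self.free_extents = []       # available (device, block) pairs

        def add_physical_disk(self, device, num_blocks):
            self.free_extents += [(device, b) for b in range(num_blocks)]

        def create_virtual_disk(self, num_blocks):
            # Map each virtual block to a physical (device, block) pair.
            if num_blocks > len(self.free_extents):
                raise ValueError("pool exhausted")
            return {v: self.free_extents.pop() for v in range(num_blocks)}

    pool = StoragePool()
    pool.add_physical_disk("/dev/sdb", 1000)
    pool.add_physical_disk("/dev/sdc", 1000)
    vdisk = pool.create_virtual_disk(1500)   # spans both physical disks
    # The server sees only virtual blocks 0..1499; the physical layout
    # behind them stays hidden, which is the point of the abstraction.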
There are three levels within a networked storage environment in which virtualization can occur: the server level, the storage network or SAN fabric level, and the storage system level. These levels can be used together or independently to substantially increase the benefits to users.
1. Server-based virtualization
Example: HP PolyServe Software for Microsoft SQL Server
2. Storage network or SAN-based virtualization
Example: HP StorageWorks SAN Virtualization Services Platform
3. Storage-based virtualization
Example: HP StorageWorks EVA, Virtual Library System
Solutions for Broadband
Consider the products, services and applications customers are using around the clock today – all that data flying this way and that. “Streaming media providers have an unexpected ramp of usage at any given time due to technology advances in media standards,” says HP’s Zhu. “This dynamic environment changes the requirements of business models rapidly.”
In the past, according to Zhu, businesses had to spend two to five times what they would have spent on traditional storage solutions in order to establish the scale-out or cloud-computing environments required to manage large, content-rich data files. “Alternately, many developed their own ‘white-box’-based solutions, resulting in massive hardware sprawl that raised power and cooling costs and significantly increased the number of staff required to manage this expanding environment.”
Cloud-based storage, Zhu believes, is changing how companies commoditize and store the exponentially growing volumes of data generated by video sharing and digital media. “The need to store massive amounts of data with real-time access capabilities will change the storage industry, as we [will] see a greater uptake in cloud-based storage solutions that address the challenges of data growth, but at a fraction of the cost of traditional infrastructures.” HP’s StorageWorks 9100 Extreme Data Storage System, Zhu says, will provide the scalability to power new cloud-based projects.
Storage solutions for broadband service providers can be very diverse, according to Lance Smith, senior vice president of product marketing for Fusion-io. “It just depends on whether one is looking at augmenting existing deployments or building new solutions. Either way, Fusion-io’s ioDrive solid state storage products can be very effective in video-serving applications.”
Unlike hard disk drives (HDDs), he says, the ioDrive can support very large bandwidths “and can be attached directly to video servers for a distributed storage solution or centralized in storage directors. The more scalable approach attaches ioDrives directly to the streaming servers.” A single ioDrive supports more than 700 MB/s of bandwidth, equivalent to more than 370 HD video streams.
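The per-stream arithmetic behind those figures is easy to check:

    # Per-stream rate implied by the quoted figures, assuming the 700 MB
    # number means 700 MB/s of sustained bandwidth.
    drive_mb_per_s = 700
    streams = 370
    per_stream_mbps = drive_mb_per_s / streams * 8    # MB/s to Mbps
    print(f"~{per_stream_mbps:.0f} Mbps per stream")  # ~15 Mbps, a
                                                      # typical HD rate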
Points to Ponder
Price, performance and reliability, the experts say, are among the most critical factors affecting a company’s storage technology decision.
“Storage architecture will be determined by the applications and business needs it supports,” says Eric Rapisarda, EMC director, TME Industry Marketing & Solution Development. “Many back-office applications such as billing, e-mail and customer information systems require faster-performing, highly available storage. Then you have other, higher-tier options for replication and backup (for business continuity and disaster recovery) environments, depending on the criticality of the data.”
Options that meld tiers of storage on single arrays are now available, says Rapisarda. Companies are able to combine different technologies depending on the importance of the data. Less frequently accessed data with lower performance requirements, he says, can rely on more budget-friendly tier-2 storage, while “stale” data can be archived. “For example, in some cases Call Detail Records must be kept for compliance reasons and require searchable, retrievable archives.
“Storage tiering and ILM [information lifecycle management] are becoming increasingly popular as power and space limitations force carriers to use storage more efficiently and effectively. Our carrier customers tell us a file system assessment is one of the easiest ways to get a handle on storage utilization so they can be sure their infrastructures are optimized.”
Fusion-io’s Smith says it boils down to two simple rules of reliability: “Don’t lose data and don’t deliver bad data.” Performance, he says, addresses the single largest bottleneck in today’s modern server. “High-performance processors typically lie idle waiting for data from the storage subsystem. The ioDrive allows two to 10 times the application-level performance.”
Green is Good
What are the biggest technical challenges according to industry pros?
“Power,” says an emphatic Smith.
“Data centers are out of space and out of power, but the demand for performance and capacity continues to grow. Mechanical HDDs have hit their limits. Adoption of solid state storage in new architectures will take time but ultimately will allow higher computational capacity within a server, lower data center costs and, perhaps more importantly, lower carbon footprints.”
Says Zhu, “Moving to energy-efficient storage isn’t just a nice, green idea but an essential part of using less power to reduce energy bills, bottom line. Key storage technologies such as deduplication and solid state storage eliminate unnecessary hardware, not only improving performance but also reducing power consumption.”
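Deduplication is simple enough to sketch in a few lines. In the hedged example below (Python; the chunk size and names are illustrative choices), identical chunks of data are stored once and referenced by a content hash, so redundant copies consume no additional capacity:

    import hashlib

    CHUNK = 4096
    store = {}   # content hash -> one stored copy of the chunk

    def write_dedup(data):
        # Store data as chunk references; duplicate chunks cost nothing.
        refs = []
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)   # keep only the first copy
            refs.append(digest)
        return refs

    # Two "files" with mostly identical content barely grow the store.
    a = write_dedup(b"A" * 16384)
    b = write_dedup(b"A" * 12288 + b"B" * 4096)
    print(len(store), "unique chunks for", len(a) + len(b), "references")
    # -> 2 unique chunks backing 8 block references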
In spite of hurdles that include a weak economy, compliance issues and shifts in technology, companies still need to control the flood of information while successfully cultivating their businesses.
So, in the end, when it comes to storage solutions, do the telecoms “get it”?
According to research conducted by HP in February, many respondents claim that the first step in restructuring in the current economic climate is cutting discretionary spending. “[This means] prioritizing projects, consolidation and extending the useful life of their current technology,” explains Zhu. Even so, many of the entities polled say that among the capital expenditures that will remain on the priority list are server and storage consolidation (56 percent) and virtualization (49 percent).
It should probably be on your radar screen, too. For industry professionals like Zhu, it’s a matter of giving their clients some simplicity and control. “We are helping our customers manage the data center’s growing complexity with unified technologies that maximize storage investments by lowering the cost of capacity, reducing networking costs, improving bandwidth utilization and providing ease of management.”
With solid state technology positioned to become the new engine of the enterprise marketplace, and storage virtualization primed to transform the data center into a more flexible entity for storing information, telcos making the necessary changes to keep up with the light-speed evolution of network storage will be poised to provide the vast and ever-increasing array of services and information – or “stuff” – that customers will continue to demand.