## **SNIA SSSI - PCIe Round Table**

- Standards
- Technology / Architecture
- Deployment Strategies

Presentations by:

Fusion-io - Intel - Micron - Sata-IO - Seagate - Tailwind

Solid State Storage Initiative

SNIA Winter Symposium, SSSI Face to Face
Monday 23 January 2012, 10:00 AM - 1:00 PM
St. Claire Hotel, San Jose, CA

Webex: https://snia.webex.com, Meeting No. 795 947 658, Password: sssi2012
Telecon: 1-877-270-2716, ID: 0021, Password: 8520

#### Agenda

| #  | Time                | Topic                                                 | Presenter                    |
|----|---------------------|-------------------------------------------------------|------------------------------|
| 1. | 10:15 AM - 10:30 AM | Introduction - SSS Performance                        | Eden Kim, Chair SNIA SSS TWG |
| 2. | 10:30 AM - 10:45 AM | PCIe SSD Form Factor                                  | Mark Meyers, Intel           |
| 3. | 10:45 AM - 11:00 AM | Standards & Deployment Models                         | Marty Czekalski, Seagate     |
| 4. | 11:00 AM - 11:15 AM | SATA-IO & SATA Express - PCIe for Client Storage      | Paul Wassenberg, Sata-IO     |
| 5. | 11:30 AM - 11:45 AM | PCIe 2.5" Form Factor                                 | Janene Ellefson, Micron      |
| 6. | 11:45 AM - 12:00 PM | Convergence of Memory & Storage IO Architecture       | Moon Kim, Tailwind           |
| 7. | 12:15 PM - 12:30 PM | Lessons from the Front Lines & Lessons for the Future | Gary Orenstein, Fusion-io    |
| 8. | 12:30 PM - 1:00 PM  | Panel Question & Answers / Working Lunch              |                              |



PCIe Solid State Storage - Higher Performance / Lower Latencies

#### Solid State Storage PCIe . . . a Round Table



What are the issues facing adoption of PCIe Solid State Storage devices?

- Standards for PCIe Attached Storage
- Technology & Architectural Issues
- Mass Storage Ecosystem Adoption & Optimization
- Market & Product Positioning
- Deployment Strategies



SNIA Solid State Storage Performance Test Specification (PTS)

- PTS-E: PTS Enterprise ver 1.0
- PTS Client ver 1.0

[Image: specification document dated April 26, 2011]

#### SNIA PTS-C & PTS-E Specifications: Standardizing SSD Performance Test

SNIA SSSI Solid State Performance Test Spec link:

www.snia.org/tech_activities/standards/curr_standards/pts

Understanding SSD Performance Project link:

www.snia.org/forums/sssi/pts

Understanding SSD Performance White Paper & Powerpoint link:

www.snia.org/forums/sssi/knowledge/education

Understanding SSD Performance Webcast link:

www.brighttalk.com/webcast/663/40549

#### PTS Provides a Standardized Methodology to Compare SSD Performance





#### IOs Traverse the SW / HW Stack

- Storage IOs Must Traverse the SW/HW Stack
- IOs are subject to cache, OS task switching & timing, driver fragmentation & coalescing
- IO can be different at the Device & System level
- Can lose the 1:1 correspondence between the original IO & the Physical Device IO
- Performance is Heavily influenced by SW / HW Stack
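The loss of 1:1 correspondence can be illustrated with a minimal sketch. It assumes a hypothetical stack that first coalesces contiguous IOs and then splits anything larger than a maximum transfer size; the `IO` type, the function names, and the 16-block limit are illustrative only and are not taken from any particular driver or from the PTS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IO:
    lba: int     # starting logical block address, in blocks
    blocks: int  # transfer length, in blocks


def coalesce(ios):
    """Merge IOs whose block ranges are contiguous (hypothetical driver behavior)."""
    merged = []
    for io in sorted(ios, key=lambda x: x.lba):
        if merged and merged[-1].lba + merged[-1].blocks == io.lba:
            last = merged.pop()
            merged.append(IO(last.lba, last.blocks + io.blocks))
        else:
            merged.append(io)
    return merged


def split(ios, max_blocks):
    """Fragment IOs larger than max_blocks into several smaller IOs."""
    out = []
    for io in ios:
        offset = 0
        while offset < io.blocks:
            n = min(max_blocks, io.blocks - offset)
            out.append(IO(io.lba + offset, n))
            offset += n
    return out


# Four application-level IOs of 8 blocks each.
app_ios = [IO(0, 8), IO(8, 8), IO(16, 8), IO(100, 8)]

# What reaches the device after coalescing and a 16-block transfer limit.
device_ios = split(coalesce(app_ios), max_blocks=16)
```

In this sketch four application IOs become three device IOs of different sizes, while the total number of blocks transferred is unchanged. This is why IO counts and sizes measured at the system level can differ from those measured at the device level.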

#### Solid State Performance Issues

- Solid State Performance is MUCH Faster than HDD Storage
- SSDs must be optimized for the Storage Ecosystem
- Solid State Storage employs Virtual Mapping of LBA to PBA
- Asymmetric Read / Write Response Times for Flash
- Response Time & Cost vary for DRAM, PCIe, SLC, MLC, HDD
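The virtual mapping between logical and physical block addresses can be sketched with a toy translation layer. This is a simplified illustration under stated assumptions, not how any specific device works: the `ToyFTL` class, its page-granular map, and its first-free allocation policy are all hypothetical.

```python
class ToyFTL:
    """Toy flash translation layer mapping logical (LBA) to physical (PBA) pages."""

    def __init__(self, physical_pages):
        self.l2p = {}                            # LBA -> PBA map
        self.free = list(range(physical_pages))  # unwritten physical pages
        self.invalid = set()                     # stale pages awaiting erase

    def write(self, lba):
        """Write out of place: take a fresh page and invalidate the old copy."""
        if not self.free:
            raise RuntimeError("no free pages: garbage collection needed")
        pba = self.free.pop(0)
        old = self.l2p.get(lba)
        if old is not None:
            self.invalid.add(old)
        self.l2p[lba] = pba
        return pba

    def read(self, lba):
        """Return the physical page currently holding this logical block."""
        return self.l2p[lba]
```

Writing the same LBA twice lands on two different physical pages and leaves the first one stale. In this toy model the physical location of data, and the reclaim work a later write must wait for, depend on write history, which is one reason small-block random write behavior is tested starting from a known device state.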



Reference Test Platform (RTP 2.0)

| Hardware    |                                        | Software                    |                   |
|-------------|----------------------------------------|-----------------------------|-------------------|
| Processor   | Single Intel Xeon 5580W 3.2 GHz 4 core | Operating System - Back End | CentOS 5.6        |
| Motherboard | Intel 5520 HC                          | Test Software - Back End    | CTS 6.5           |
| RAM         | 12 GB ECC DDR3                         | Front End - GUI             | Chrome Browser    |
| HBA         | 6 Gb/s LSI 9212-4e-4i                  | Front End: OS, Database     | Windows 7 / MySQL |



#### PTS Reference Test Platform - Allows Comparison of PCIe, SAS, SATA, HDD Performance



PTS rev 1.0 Performance Tests

| Test       | Test Description                                  | Purpose                                  | Metric |
|------------|---------------------------------------------------|------------------------------------------|--------|
| WSAT       | Continuous RND 4KiB W from FOB, No PC             | FOB Performance Evolution over Time      | IOPS   |
| IOPS       | Large & Small Block RND IOs at Steady State       | Steady State IO Transfer Rate per second | IOPS   |
| Throughput | Large Block SEQ R/W Data Transfer at Steady State | Steady State Bandwidth Speed             | MB/Sec |
| Latency    | AVE & MAX Response Times measured at a single OIO | Steady State IO Response Time Latency    | mSec   |
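The three metrics in the table are linked by simple arithmetic, sketched below. The function names are illustrative, MB is taken as 10^6 bytes (an assumption about the unit convention), and the latency relation is Little's Law, which holds for averages only when the stated number of IOs is kept outstanding throughout the measurement.

```python
def throughput_mb_per_s(iops, block_bytes):
    """Bandwidth implied by an IO rate at a fixed block size, in MB/s (10^6 bytes)."""
    return iops * block_bytes / 1_000_000


def avg_latency_ms(outstanding_ios, iops):
    """Little's Law: average response time = outstanding IOs / IO rate."""
    return outstanding_ios / iops * 1000


# Illustrative inputs only: 71,500 IOPS at 4 KiB (4096 bytes) per IO.
bandwidth = throughput_mb_per_s(71_500, 4096)   # 292.864 MB/s

# Illustrative inputs only: one outstanding IO sustained at 10,000 IOPS.
latency = avg_latency_ms(1, 10_000)             # about 0.1 ms
```

The same IOPS figure therefore implies very different bandwidth at different block sizes, so small-block results are usually reported in IOPS and large-block results in MB/Sec.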



WSAT Test is useful to Evaluate Solid State Small Block RND Write Behavior



#### Solid State Storage Technology - RND 4KiB Write Performance\*

\* All Data SNIA PTS-E 1.0 WSAT Test Compliant



#### WSAT: RND 4KiB W - IOPS v TGBW

\* All Data SNIA PTS-E 1.0 WSAT Test Compliant
























DRAM 71,500 IOPS







#### Abstract

The PCIe SSD Form Factor brings the attractive attributes of PCIe to SSD storage and adds capabilities from existing storage form factors.

Mark Meyers, Intel

PCIe SSD Form Factor

Mark is a Server Platform Architect working in Intel's Datacenter and Connected System group.

Mark is technical chair of the Enterprise SSD Form Factor WG, whose work includes the definition of the proposed SFF-8639 connector.

Mark has been at Intel for 12 years, working on various server and IO architecture projects.

Previous employers include Siemens Nixdorf, Pyramid Technology, and an early stint at Intel.







## PCIe SSD Form Factor for SNIA 2011 Winter Symposium

Mark Myers Intel Datacenter Platform Architect January 23, 2012

# Introduction

## Goal

- Summary of the PCIe SSD Form Factor WG status
- PCIe as a Storage Interface
- Common configurations
- Technical Attributes



# Enterprise PCIe SSD Form Factor WG Status

## Defined usages, requirements, and connector (SFF-8639)

- 5 promoters: Dell, IBM, Fujitsu, EMC, Intel; >50 contributor companies

Rev 1.0 Specification Approved: http://www.ssdformfactor.org/

Mechanical piece is SFF-8639: ftp://ftp.seagate.com/sff/SFF-8639.PDF

## Looks like existing SAS connector with pins all across both sides


Interoperates with existing SATA/SAS connector



# PCIe as a Storage Interface

## PCIe value

- Industry standard, high BW, multilane, low latency interconnect
- Flexible attach models, discoverable, and supports many form factors → Our work adds a classic 3.5" or 2.5" disk form factor

## PCIe as high performance interface; many storage interfaces

- Hard disks will stay on SATA/SAS for a long time, as will many SSDs
- High-performance SSDs will move to PCIe for higher BW & lower latency
- PCIe supports multiple device types: NVM-Express, SOP, proprietary
  - Advocate NVMe as standard block device
  - Expect interface models to evolve as devices improve
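The bandwidth argument can be made concrete with a rough per-direction calculation. The slides do not state a PCIe generation, so the Gen2 (5 GT/s, 8b/10b) and Gen3 (8 GT/s, 128b/130b) figures below are assumptions chosen for illustration. The result accounts for line encoding only and ignores packet and protocol overhead, so real devices deliver less.

```python
def usable_mb_per_s(gigatransfers_per_s, lanes, payload_bits, total_bits):
    """Per-direction bandwidth after line-encoding overhead, in MB/s (10^6 bytes)."""
    bits_per_s = gigatransfers_per_s * 1e9 * lanes * payload_bits / total_bits
    return bits_per_s / 8 / 1e6


# 6 Gb/s SATA or one 6 Gb/s SAS lane, 8b/10b encoding.
sata_6g = usable_mb_per_s(6.0, lanes=1, payload_bits=8, total_bits=10)

# Assumed PCIe Gen2 x4 link: 5 GT/s per lane, 8b/10b encoding.
pcie2_x4 = usable_mb_per_s(5.0, lanes=4, payload_bits=8, total_bits=10)

# Assumed PCIe Gen3 x4 link: 8 GT/s per lane, 128b/130b encoding.
pcie3_x4 = usable_mb_per_s(8.0, lanes=4, payload_bits=128, total_bits=130)
```

Under these assumptions a single 6 Gb/s link carries about 600 MB/s, while an x4 PCIe link carries roughly 2,000 MB/s at Gen2 and about 3,900 MB/s at Gen3 in each direction.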



## Common Usages: Servers x4, Dual Port storage

#### **Typical Server configuration**



### **Typical High Availability Storage configuration**








# **Drives Supported**

#### Supported drive types

- Enterprise PCIe x4 SSDs
  - Server x4, Storage Dual Port x2
- Existing SAS drives (dual port)
- Existing SATA drives
- Emerging SATA-Express x1-x2
- Emerging x4 SAS

#### Support Flexible Backplanes

- Enterprise x4 PCIe SSDs
- SAS/SATA HDDs



# **Technical Attributes of Specification**

- 6 High speed lanes
  - 4 new lanes for Enterprise PCIe
  - 2 existing lanes for SAS/SATA
- Side Band
  - Enterprise: RefClk, ePCIeRst#, SM-Bus, 3.3VAux, DualPort
  - Client/Shared: IfDet#, PRSNT#, cPCIeRst#, Rsvd (pwr mgt)
  - Removed 3.3V, Enterprise SSD supports 12V only
- Keying
  - Support universal receptacle
  - Key to block SATA Express cable to x4 drive
  - Key to block Enterprise x4 cable to SATA/SAS drive



# Conclusion

## **Enterprise PCIe SSD Form Factor Specification**

- Rev 1.0 Approved and Released
- Expect products this year based on standard

## Supports Flexible Storage Backplanes

- High Performance Enterprise x4 PCIe SSDs
  - Using existing PCIe root ports
- Existing SAS/SATA drives
- Emerging SATA-Express and x4 SAS





# Thank You


# Additional Detail



## **Overview of Connector Pins**

#### Primary Side (closest to drive edge)




## Pin out

| Drive  | Usage            | Signal Description                      | Name                    | Mating | Pin # |
|--------|------------------|-----------------------------------------|-------------------------|--------|-------|
|        |                  | Ground                                  | GND                     | 2nd    | S1    |
| input  | SAS+SATA         | SAS/SATA/SATAe 0 TX+                    | S0T+ (A+)               | 3rd    | S2    |
| input  | SAS+SATA         | SAS/SATA/SATAe 0 TX-                    | S0T- (A-)               | 3rd    | S3    |
|        |                  | Ground                                  | GND                     | 2nd    | S4    |
| output | SAS+SATA         | SAS/SATA/SATAe 0 Rcv -                  | S0R- (B-)               | 3rd    | S5    |
| output | SAS+SATA         | SAS/SATA/SATAe 0 Rcv +                  | S0R+ (B+)               | 3rd    | S6    |
|        |                  | Ground                                  | GND                     | 2nd    | S7    |
| input  | Dual Port        | ePCIe RefClk + (port B)                 | RefClk1+                | 3rd    | E1    |
| input  | Dual Port        | ePCIe RefClk – (port B)                 | RefClk1-                | 3rd    | E2    |
| input  | ePCIe opt        | 3.3V for SM bus                         | 3.3Vaux                 | 3rd    | E3    |
| input  | Dual Port        | ePCIe Reset (port B)                    | ePERst1#                | 3rd    | E4    |
| input  | ePCIe            | ePCIe Reset (port A)                    | ePERst0#                | 3rd    | E5    |
|        |                  | Reserved                                | RSVD                    | 3rd    | E6    |
| input  | SATAe<br>+SAS4   | Reserved(WAKE#/OBFF),<br>SASAct2        | RSVD(Wake#)<br>/SASAct2 | 3rd    | P1    |
| Bi-Dir | SATAe            | SATAe Client /SAS reset                 | sPCIeRst/SAS            | 3rd    | P2    |
| input  | SATAe            | Reserved (DevSLP#)                      | RSVD(DevSLP#)           | 2nd    | P3    |
| output | SATAe +<br>ePCIe | Interface Detect<br>(Was GND-precharge) | IfDet#                  | 1st    | P4    |
|        | all              | Ground                                  | GND                     | 2nd    | P5    |
|        | all              | Ground                                  | GND                     | 2nd    | P6    |
| NC     | SAS+SATA         | Precharge                               |                         | 2nd    | P7    |
| NC     | SAS+SATA         | SATA, SATAe, SAS only                   | 5 V                     | 3rd    | P8    |
| NC     | SAS+SATA         |                                         |                         | 3rd    | P9    |
|        | all              | Presence (Drive type)                   | PRSNT#                  | 2nd    | P10   |
| Bi-Dir | all              | Activity(output)/Spinup                 | Activity                | 3rd    | P11   |
|        | all              | Hot Plug Ground                         | GND                     | 1st    | P12   |
| input  | all              | Precharge                               |                         | 2nd    | P13   |
| input  | all              | All – 12V                               | 12 V                    | 3rd    | P14   |
| input  | all              | Only power for ePCIe SSD                |                         | 3rd    | P15   |


ePCIe → Enterprise PCIe (separate from SATA/SAS)

SATAe → SATA Express (Client PCIe, muxed on SATA/SAS signals)

SAS4 → SAS x4

| Pin # | Mating | Name        | Signal Description        | Usage      | Drive  |
|-------|--------|-------------|---------------------------|------------|--------|
| E7    | 3rd    | RefClk0+    | ePCIe Primary RefClk +    | ePCIe      | input  |
| E8    | 3rd    | RefClk0-    | ePCIe Primary RefClk -    | ePCIe      | input  |
| E9    | 2nd    | GND         | Ground                    |            |        |
| E10   | 3rd    | PETp0       | ePCIe 0 Transmit +        | ePCIe      | input  |
| E11   | 3rd    | PETn0       | ePCIe 0 Transmit -        | ePCIe      | input  |
| E12   | 2nd    | GND         | Ground                    |            |        |
| E13   | 3rd    | PERn0       | ePCIe 0 Receive -         | ePCIe      | output |
| E14   | 3rd    | PERp0       | ePCIe 0 Receive +         | ePCIe      | output |
| E15   | 2nd    | GND         | Ground                    |            |        |
| E16   | 3rd    | RSVD        | Reserved                  |            |        |
| S8    | 2nd    | GND         | Ground                    |            |        |
| S9    | 3rd    | S1T+        | SAS/SATAe 1 Transmit +    | SAS+SATAe  | input  |
| S10   | 3rd    | S1T-        | SAS/SATAe 1 Transmit -    | SAS+SATAe  | input  |
| S11   | 2nd    | GND         | Ground                    |            |        |
| S12   | 3rd    | S1R-        | SAS/SATAe 1 Receive -     | SAS+SATAe  | output |
| S13   | 3rd    | S1R+        | SAS/SATAe 1 Receive +     | SAS+SATAe  | output |
| S14   | 2nd    | GND         | Ground                    |            |        |
| E17   | 3rd    | RSVD        | Reserved                  |            |        |
| E18   | 2nd    | GND         | Ground                    |            |        |
| E19   | 3rd    | PETp1/S2T+  | ePCIe 1 / SAS 2 Transmit + | ePCIe+SAS4 | input  |
| E20   | 3rd    | PETn1/S2T-  | ePCIe 1 / SAS 2 Transmit - | ePCIe+SAS4 | input  |
| E21   | 2nd    | GND         | Ground                    |            |        |
| E22   | 3rd    | PERn1/S2R-  | ePCIe 1 / SAS 2 Receive - | ePCIe+SAS4 | output |
| E23   | 3rd    | PERp1/S2R+  | ePCIe 1 / SAS 2 Receive + | ePCIe+SAS4 | output |
| E24   | 2nd    | GND         | Ground                    |            |        |
| E25   | 3rd    | PETp2/S3T+  | ePCIe 2 / SAS 3 Transmit + | ePCIe+SAS4 | input  |
| E26   | 3rd    | PETn2/S3T-  | ePCIe 2 / SAS 3 Transmit - | ePCIe+SAS4 | input  |
| E27   | 2nd    | GND         | Ground                    |            |        |
| E28   | 3rd    | PERn2/S3R-  | ePCIe 2 / SAS 3 Receive - | ePCIe+SAS4 | output |
| E29   | 3rd    | PERp2/S3R+  | ePCIe 2 / SAS 3 Receive + | ePCIe+SAS4 | output |
| E30   | 2nd    | GND         | Ground                    |            |        |
| E31   | 3rd    | PETp3       | ePCIe 3 Transmit +        | ePCIe      | input  |
| E32   | 3rd    | PETn3       | ePCIe 3 Transmit -        | ePCIe      | input  |
| E33   | 2nd    | GND         | Ground                    |            |        |
| E34   | 3rd    | PERn3       | ePCIe 3 Receive -         | ePCIe      | output |
| E35   | 3rd    | PERp3       | ePCIe 3 Receive +         | ePCIe      | output |
| E36   | 2nd    | GND         | Ground                    |            |        |
| E37   | 3rd    | SMCK        | SM-Bus Clock              | PCIe opt   | Bi-Dir |
| E38   | 3rd    | SMDat       | SM-Bus Data               | PCIe opt   | Bi-Dir |
| E39   | 3rd    | DualPortEn# | ePCIe 2x2 Select          | Dual Port  | input  |



# Keying

- Prevent mating of combinations that will not work
- Support Universal Receptacle
  - accepts any drive
  - Drive carrier provides keying
- Cable block for client cables
  - Prevent client service calls

|                                  | SATA drive                                    | SATA Express drive                                      | SAS drive                                     | Enterprise PCIe<br>drive                  |
|----------------------------------|-----------------------------------------------|---------------------------------------------------------|-----------------------------------------------|-------------------------------------------|
| Enterprise backplane             | Works-<br>if system supports<br>(carrier key) | Works-<br>if system supports<br>(carrier key)           | Works-<br>if system supports<br>(carrier key) | Works                                     |
| SAS backplane                    | Works with STP                                | Mates-Nonfunctional<br>(requires STP+)<br>(carrier key) | Works                                         | Mates-nonfunctional<br>(carrier key)      |
| SATA Express<br>backplane/laptop | Works                                         | Works                                                   | Blocked-Key                                   | Blocked-Key                               |
| SATA backplane/laptop            | Works                                         | Blocked-Key                                             | Blocked-Key                                   | Blocked-Key                               |
| Enterprise cable                 | Blocked-Key                                   | Blocked-Key                                             | Blocked-Key                                   | Works                                     |
| SAS cable                        | Works                                         | Mates-Nonfunctional<br>(requires STP+)                  | Works                                         | Mates-nonfunctional & no detent retention |
| SATA Express cable               | Works                                         | Blocked-Key                                             | Blocked-Key                                   | Blocked-Key                               |
| SATA cable                       | Works                                         | Blocked-Key                                             | Blocked-Key                                   | Blocked-Key                               |
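A compatibility matrix like this one lends itself to a lookup table. The sketch below encodes a subset of the rows as read from the slide; the `Mating` enum, the `outcome` helper, and the collapsing of the qualified entries ("Works with STP", carrier-key notes) into four categories are simplifications made here for illustration and are not part of the specification.

```python
from enum import Enum


class Mating(Enum):
    WORKS = "works"
    CONDITIONAL = "works if system supports"
    NONFUNCTIONAL = "mates but nonfunctional"
    BLOCKED = "blocked by key"


# Column order of the matrix.
DRIVES = ("SATA", "SATA Express", "SAS", "Enterprise PCIe")

W, C, N, B = Mating.WORKS, Mating.CONDITIONAL, Mating.NONFUNCTIONAL, Mating.BLOCKED

# Subset of rows from the keying table; qualifiers are collapsed (assumption).
MATRIX = {
    "Enterprise backplane": (C, C, C, W),
    "SAS backplane":        (W, N, W, N),
    "Enterprise cable":     (B, B, B, W),
    "SAS cable":            (W, N, W, N),
    "SATA cable":           (W, B, B, B),
}


def outcome(receptacle, drive):
    """Look up what happens when a drive type meets a backplane or cable."""
    return MATRIX[receptacle][DRIVES.index(drive)]


def keying_gaps():
    """List pairings that physically mate but do not function."""
    return [(r, d) for r, row in MATRIX.items()
            for d, m in zip(DRIVES, row) if m is Mating.NONFUNCTIONAL]
```

Calling `keying_gaps()` on this subset returns the SAS backplane and SAS cable pairings with SATA Express and Enterprise PCIe drives, which are the cases where the table reports mating without function.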






## Layers of Standards










#### Marty Czekalski, Seagate

Standards and Deployment Models



#### Abstract:

Multiple standardization activities are under way for PCIe-based storage, and some aspects of them overlap. In addition, multiple deployment and provisioning options will exist in the marketplace. This talk gives an overview of these activities and issues.

Marty Czekalski brings over thirty years of senior engineering management experience in advanced architecture development for Storage and IO subsystem design, ASIC, and Solid State Storage Systems.

He is currently Sr. Staff Program Manager within Seagate's Strategic Planning and Development Group.





# **PCIe SSD Alternatives**

# SAS is the preferred SSD Interface for Storage Systems





[Chart: Storage-attached SSD Units]

### Server Attached SSDs

[Chart: Server-attached SSD Units]

Source: Forward-Insights 11-2011

### **Multi-Function Bay**

- Multi-function SAS/PCIe bay
  - Uses SFF-8639 Multi-function connector
  - High performance (up to 25W per slot)
  - Hot swap, serviceability (SAS)
  - High availability (2 fault domains)
  - Supports a range of devices (system dependent)
    - 12Gb/s SAS
    - 6Gb/s SATA
    - MultiLink SAS (4 SAS Ports)
    - PCIe SSDs (emerging)
      - NVMe, SOP-PQI, Proprietary
    - SATA Express





### SFF-8639 Signals

| Drive  | Usage            | Signal Description                      | Name                    | Mating | Pin # |
|--------|------------------|-----------------------------------------|-------------------------|--------|-------|
|        |                  | Ground                                  | GND                     | 2nd    | S1    |
| input  | SAS+SATA         | SAS/SATA/SATAe 0 Tx +                   | S0T+ (A+)               | 3rd    | S2    |
| input  | SAS+SATA         | SAS/SATA/SATAe 0 Tx -                   | S0T- (A-)               | 3rd    | S3    |
|        |                  | Ground                                  | GND                     | 2nd    | S4    |
| output | SAS+SATA         | SAS/SATA/SATAe 0 Rcv -                  | S0R- (B-)               | 3rd    | S5    |
| output | SAS+SATA         | SAS/SATA/SATAe 0 Rcv +                  | S0R+ (B+)               | 3rd    | S6    |
|        |                  | Ground                                  | GND                     | 2nd    | S7    |
| input  | Dual Port        | ePCIe RefClk + (port B)                 | RefClk1+                | 3rd    | E1    |
| input  | Dual Port        | ePCIe RefClk – (port B)                 | RefClk1-                | 3rd    | E2    |
| input  | ePCIe opt        | 3.3V for SM bus                         | 3.3Vaux                 | 3rd    | E3    |
| input  | Dual Port        | ePCIe Reset (port B)                    | ePERst1#                | 3rd    | E4    |
| input  | ePCIe            | ePCIe Reset (port A)                    | ePERst0#                | 3rd    | E5    |
|        |                  | Reserved                                | RSVD                    | 3rd    | E6    |
| input  | SATAe<br>+SAS4   | Reserved(WAKE#/OBFF),<br>SASAct2        | RSVD(Wake#)<br>/SASAct2 | 3rd    | P1    |
| Bi-Dir | SATAe            | SATAe Client /SAS reset                 | sPCIeRst/SAS            | 3rd    | P2    |
| input  | SATAe            | Reserved (DevSLP#)                      | RSVD(DevSLP#)           | 2nd    | P3    |
| output | SATAe +<br>ePCIe | Interface Detect<br>(Was GND-precharge) | IfDet#                  | 1st    | P4    |
|        | all              | Ground                                  | GND                     | 2nd    | P5    |
|        | all              | Ground                                  | GND                     | 2nd    | P6    |
| NC     | SAS+SATA         | Precharge                               |                         | 2nd    | P7    |
| NC     | SAS+SATA         | SATA, SATAe, SAS only                   | 5 V                     | 3rd    | P8    |
| NC     | SAS+SATA         |                                         |                         | 3rd    | P9    |
|        | all              | Presence (Drive type)                   | PRSNT#                  | 2nd    | P10   |
| Bi-Dir | all              | Activity(output)/Spinup                 | Activity                | 3rd    | P11   |
|        | all              | Hot Plug Ground                         | GND                     | 1st    | P12   |
| input  | all              | Precharge                               |                         | 2nd    | P13   |
| input  | all              | All – 12V                               | 12 V                    | 3rd    | P14   |
| input  | all              | Only power for ePCIe SSD                |                         | 3rd    | P15   |

ePCIe → Enterprise PCIe (separate from SATA/SAS)

SATAe → SATA Express (Client PCIe, muxed on SATA/SAS signals)

| Pin # | Mating | Name        | Signal Description         | Usage      | Drive  |
|-------|--------|-------------|----------------------------|------------|--------|
| E7    | 3rd    | RefClk0+    | ePCIe Primary RefClk +     | ePCIe      | input  |
| E8    | 3rd    | RefClk0-    | ePCIe Primary RefClk -     | ePCIe      | input  |
| E9    | 2nd    | GND         | Ground                     |            |        |
| E10   | 3rd    | PETp0       | ePCIe 0 Transmit +         | ePCIe      | input  |
| E11   | 3rd    | PETn0       | ePCIe 0 Transmit -         | ePCIe      | input  |
| E12   | 2nd    | GND         | Ground                     |            |        |
| E13   | 3rd    | PERn0       | ePCIe 0 Receive -          | ePCIe      | output |
| E14   | 3rd    | PERp0       | ePCIe 0 Receive +          | ePCIe      | output |
| E15   | 2nd    | GND         | Ground                     |            |        |
| E16   | 3rd    | RSVD        | Reserved                   |            |        |
| S8    | 2nd    | GND         | Ground                     |            |        |
| S9    | 3rd    | S1T+        | SAS/SATAe 1 Transmit +     | SAS+SATAe  | input  |
| S10   | 3rd    | S1T-        | SAS/SATAe 1 Transmit -     | SAS+SATAe  | input  |
| S11   | 2nd    | GND         | Ground                     |            |        |
| S12   | 3rd    | S1R-        | SAS/SATAe 1 Receive -      | SAS+SATAe  | output |
| S13   | 3rd    | S1R+        | SAS/SATAe 1 Receive +      | SAS+SATAe  | output |
| S14   | 2nd    | GND         | Ground                     |            |        |
| E17   | 3rd    | RSVD        | Reserved                   |            |        |
| E18   | 2nd    | GND         | Ground                     |            |        |
| E19   | 3rd    | PETp1/S2T+  | ePCIe 1 / SAS 2 Transmit + | ePCIe+SAS4 | input  |
| E20   | 3rd    | PETn1/S2T-  | ePCIe 1 / SAS 2 Transmit - | ePCIe+SAS4 | input  |
| E21   | 2nd    | GND         | Ground                     |            |        |
| E22   | 3rd    | PERn1/S2R-  | ePCIe 1 / SAS 2 Receive -  | ePCIe+SAS4 | output |
| E23   | 3rd    | PERp1/S2R+  | ePCIe 1 / SAS 2 Receive +  | ePCIe+SAS4 | output |
| E24   | 2nd    | GND         | Ground                     |            |        |
| E25   | 3rd    | PETp2/S3T+  | ePCIe 2 / SAS 3 Transmit + | ePCIe+SAS4 | input  |
| E26   | 3rd    | PETn2/S3T-  | ePCIe 2 / SAS 3 Transmit - | ePCIe+SAS4 | input  |
| E27   | 2nd    | GND         | Ground                     |            |        |
| E28   | 3rd    | PERn2/S3R-  | ePCIe 2 / SAS 3 Receive -  | ePCIe+SAS4 | output |
| E29   | 3rd    | PERp2/S3R+  | ePCIe 2 / SAS 3 Receive +  | ePCIe+SAS4 | output |
| E30   | 2nd    | GND         | Ground                     |            |        |
| E31   | 3rd    | PETp3       | ePCIe 3 Transmit +         | ePCIe      | input  |
| E32   | 3rd    | PETn3       | ePCIe 3 Transmit -         | ePCIe      | input  |
| E33   | 2nd    | GND         | Ground                     |            |        |
| E34   | 3rd    | PERn3       | ePCIe 3 Receive -          | ePCIe      | output |
| E35   | 3rd    | PERp3       | ePCIe 3 Receive +          | ePCIe      | output |
| E36   | 2nd    | GND         | Ground                     |            |        |
| E37   | 3rd    | SMClk       | SM-Bus Clock               | PCIe opt   | Bi-Dir |
| E38   | 3rd    | SMDat       | SM-Bus Data                | PCIe opt   | Bi-Dir |
| E39   | 3rd    | DualPortEn# | ePCIe 2x2 Select           | Dual Port  | input  |

From: SFF-8639 Rev. 0.5, January 3, 2012
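As a reading aid, the ePCIe lane assignments in the pin table above can be written as a small lookup structure. This is only a sketch transcribed from the draft Rev. 0.5 table; the names `LanePins`, `EPCIE_LANES`, and `pins_for_lanes` are hypothetical and not part of SFF-8639, and the pin numbers should be checked against the current specification before use.

```python
from typing import Dict, List, NamedTuple


class LanePins(NamedTuple):
    """Connector pins of one ePCIe lane, transcribed from the pin table."""
    tx_p: str  # PETp (Transmit +)
    tx_n: str  # PETn (Transmit -)
    rx_p: str  # PERp (Receive +)
    rx_n: str  # PERn (Receive -)


# Lane number -> differential pair pins (draft SFF-8639 Rev. 0.5 numbering).
EPCIE_LANES: Dict[int, LanePins] = {
    0: LanePins("E10", "E11", "E14", "E13"),
    1: LanePins("E19", "E20", "E23", "E22"),  # muxed with SAS 2 (S2T/S2R)
    2: LanePins("E25", "E26", "E29", "E28"),  # muxed with SAS 3 (S3T/S3R)
    3: LanePins("E31", "E32", "E35", "E34"),
}


def pins_for_lanes(lane_count: int) -> List[str]:
    """Return the signal pins used by lanes 0..lane_count-1."""
    if not 1 <= lane_count <= len(EPCIE_LANES):
        raise ValueError("lane_count must be between 1 and 4")
    pins: List[str] = []
    for lane in range(lane_count):
        pins.extend(EPCIE_LANES[lane])
    return pins
```

Calling `pins_for_lanes(4)` lists the sixteen high-speed signal pins of a four-lane device. Lanes 1 and 2 are the ones whose pins are shared with SAS ports 2 and 3 in the table.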

### T10/STA Standards Update



- Performance Enhancements
  - 12Gb/sec SAS (2013 Product Shipments)
  - Copy Offload
- Power management
  - Ability to adjust power consumption vs performance
- Multi-function (SAS/PCIe) serviceable bay
  - SFF-8639 Connector
- SCSI over PCIe (SOP-PQI)
  - Direct attached devices (e.g. SSDs)
  - HBAs, RAID controllers, and Bridge devices
- New device types SMR, SSD Commands & Hints

### Enterprise Interfaces: PCIe SSDs



|                                                   | Native                                         | Aggregator                                                |
|---------------------------------------------------|------------------------------------------------|-----------------------------------------------------------|
| Commands/Transport                                | PCIe<br>(FTL<sup>1</sup> in host/main memory)  | PCIe SCSI or SATA<br>(Multiple SSDs & controller on card) |
| Committee                                         | None                                           | None                                                      |
| Standards Based                                   | No                                             | Yes                                                       |
| Performance with Flash                            | High                                           | High                                                      |
| CPU/Memory Overhead                               | High-Low                                       | Low                                                       |
| Latency with short queue                          | Very Low                                       | Low                                                       |
| Latency with deep queue                           | Moderate                                       | Low                                                       |
| Use Case Extensibility                            | No                                             |                                                           |
| Maturity                                          | Evolving                                       | Based on Proven Industry Architectures                    |
| Enterprise feature set (PI, Security, Mgmt, etc.) | No                                             | Depends on implementation                                 |

<sup>1</sup> FTL: Flash Translation Layer



### LSI WarpDrive SLP-300 PCIe Solid State Storage Acceleration Card



Base LSI Data Protection Layer (DPL) & Storage Management



Application Acceleration for IO Intensive and Latency Sensitive Workloads

# Enterprise Interfaces: The Future of PCIe SSDs



|                                                   | SOP/PQI<sup>1</sup>                                          | NVMe<sup>2</sup>       |
|---------------------------------------------------|--------------------------------------------------------------|------------------------|
| Commands/Transport                                | PCIe SOP/PQI<sup>3</sup><br>Controller (FTL in controller)   | PCIe<br>Controller     |
| Committee                                         | T10/INCITS<sup>4</sup>                                       | Industry Working Group |
| Standards Based                                   | Yes (ANSI/ISO)                                               | No                     |
| Performance with Flash                            | Very High                                                    | Very High              |
| CPU Overhead                                      | Low                                                          | Low                    |
| Latency with short queue                          | Very Low                                                     | Very Low               |
| Latency with deep queue                           | Low                                                          | Low                    |
| Use Case Extensibility                            | Yes (RAID, HBA, etc.)                                        | No (NVM only)          |
| Maturity                                          | <b>Investment Protection</b>                                 | TBD                    |
| Enterprise feature set (PI, Security, Mgmt, etc.) | Full Support                                                 | Limited                |

<sup>1</sup> SOP: SCSI over PCI Express

<sup>2</sup> NVMe: Non-Volatile Memory Express

<sup>3</sup> PQI: PCIe Queuing Interface

<sup>4</sup> INCITS: International Committee for Information Technology Standards



- Preserves Storage Investment Logical SCSI
- Broad <u>Open</u> Industry Standards Support
- Dynamic Platform for Storage Innovation
- Enterprise Proven RAS (Hot Plug)
- Multi-Host, High Queue Depths, Concurrency
- Depth & Breadth of Infrastructure
- Ease of integration with existing management infrastructures & features
- Complements PCIe Attached Storage

### So who wins? - TBD



- NVMe has an early lead in development, but not hardened yet
- SOP is behind NVMe in development, but has a more robust ecosystem
- Support across industry is fragmented
- Market is still small, can it sustain the current level of investment?
- SAS controllers > 1 Million IOPS diminish PCIe SSD differentiation
- Once PCIe-capable bays are available, any PCIe device can be packaged in a 2.5" FF and used, as long as a driver exists.
  - Creates confusion and fragments the market
- Open issues remain
  - Interoperability: electrical spec for the bay?
  - Hot plug?
  - Compliance testing?
- Will additional form factors emerge and further fragment the market?






#### Paul Wassenberg, SATA-IO

#### SATA-IO & SATA Express – PCIe for Client Storage

#### Abstract:

Since its introduction in 2001, SATA technology has evolved from a solely client/server storage interface to provide low-cost, high performance storage solutions for a wide variety of applications. There is an emerging segment of the client storage market, SSDs and hybrid HDDs, that requires higher performance than today's 6Gb/s SATA. To meet the needs of this segment, SATA-IO introduced SATA Express, a new specification that provides higher performance by utilizing readily available, fast, and scalable PCI Express connectivity while preserving established SATA software compatibility. This presentation will describe the details of SATA Express and the implications for devices and systems that will support it.

Paul Wassenberg has over 20 years of experience in data storage and has been deeply involved with storage interface technology, including SATA since its inception. Early in his career, he was a storage controller designer, before moving into Marketing in the HDD industry, and eventually into storage semiconductors.

Paul currently holds the position of Director, Product Marketing with Marvell Semiconductor. In that role, he has responsibility for transceiver technology and HDD/SSD storage standards. He is on the SATA-IO board of directors and chairs the SNIA Solid State Storage Initiative. Paul holds BSEE and MBA degrees from San Jose State University.









# **SATA Express**

### **Evolving SATA for High Speed Storage**

January 23, 2012

# **SATA for PC Client Storage**

- A Mature Interface
  - SATA is the de facto standard for PC storage; also widely implemented in mobile and enterprise applications
  - Adoption of SATA 6Gb/s technology is strong



# A Growing Ecosystem



SATA implementations are becoming increasingly application specific

Since its introduction, SATA has evolved into new application spaces and now provides storage interface solutions for HDDs, ODDs, SSDs, and Hybrid HDDs in client PC, mobile, enterprise, CE, and embedded storage markets

# **Example Application-Specific Implementations**

mSATA (mobile applications)

SATA µSSD (embedded applications)

SATA Universal Storage Module (consumer electronics, PC applications)







# Application Speed Requirements

- Today, most applications are well-served by SATA 6Gb/s and will be for the foreseeable future
- However, SSDs and Hybrid HDDs will soon require greater speeds than those enabled by the current generation of SATA



# Introducing SATA Express™

- To meet speed requirements in SSD/hybrid drive applications, SATA-IO is developing SATA Express™
  - Combines SATA software infrastructure with the PCI Express® (PCIe®) interface
    - Utilizes standard register-level interface such as AHCI
  - Provides up to 8Gb/s and 16Gb/s
    - One lane or two lanes of PCIe
  - Defines new device and motherboard connectors to support both new SATA Express and current SATA devices
  - Will coexist with other application-specific SATA formats
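The 8Gb/s and 16Gb/s figures above are raw line rates for one and two PCIe lanes. The sketch below converts line rate to approximate usable bandwidth after line encoding. It assumes PCIe 3.0 signaling (8 GT/s per lane with 128b/130b encoding), which the slide does not state explicitly, and it ignores packet and protocol overhead; the function and table names are illustrative only.

```python
# (line rate in Gb/s per lane, line-encoding efficiency)
# SATA 6Gb/s and PCIe 2.0 use 8b/10b encoding; PCIe 3.0 uses 128b/130b.
INTERFACES = {
    "SATA 6Gb/s": (6.0, 8 / 10),
    "PCIe 2.0": (5.0, 8 / 10),
    "PCIe 3.0": (8.0, 128 / 130),  # assumed generation for the 8Gb/s figure
}


def usable_mb_per_s(interface: str, lanes: int = 1) -> float:
    """Approximate bandwidth in MB/s after line encoding.

    Ignores packet headers, flow control, and other protocol overhead,
    so real-world throughput is lower than the value returned here.
    """
    line_rate_gbps, efficiency = INTERFACES[interface]
    bits_per_second = line_rate_gbps * 1e9 * efficiency * lanes
    return bits_per_second / 8 / 1e6


if __name__ == "__main__":
    for name, lanes in [("SATA 6Gb/s", 1), ("PCIe 3.0", 1), ("PCIe 3.0", 2)]:
        print(f"{name} x{lanes}: {usable_mb_per_s(name, lanes):.0f} MB/s")
```

Under these assumptions, one PCIe 3.0 lane carries a little under 1 GB/s after encoding, compared with about 600 MB/s for SATA 6Gb/s, and two lanes double the single-lane figure.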



# **SATA Express Connectors**



SATA Express connector supports PCIe and SATA

- Mechanism to detect device interface
- Allows a single motherboard / backplane connector to support both interfaces

SATA Express supports HDD-compatible form factors

- Enables system-level mechanical compatibility

SATA-IO is developing backward compatible connectors for SATA Express motherboards & devices

# **SATA Express Benefits**

- Provides a cost-effective solution for increasing device interface speed
- Specification can be completed and implemented relatively quickly, since both SATA and PCIe are already widely implemented
- Helps ensure seamless coexistence between SATA and PCIe
- Protects developer investments in both interfaces

# **Next Steps And Timeline**

SATA Express is currently under development within the SATA-IO Cable & Connector Work Group

Completed specification expected within 2012

In the meantime, SATA-IO will continue to optimize the existing SATA infrastructure for a wide variety of applications

- SATA will continue to be the mainstream storage interface for the foreseeable future




#### Janene Ellefson, Micron

#### PCIe 2.5" Form Factor

#### Abstract:

A key factor standing in the way of widespread PCIe SSD adoption is serviceability of the current card form factor. In most hosts, the card form factor requires that the system be powered down and the unit be opened up to remove the existing card and insert a new card. This is not optimal given the widespread adoption of virtualization. Powering down a machine can disrupt overall system efficiency. Providing the industry with a robust form factor that can be serviceable and still provide PCIe high-performance capability will be a game changer and will increase adoption.

The 2.5-inch form factor is an overall industry standard, and when coupled with a PCIe interface and a SATA/SAS combo connector, it becomes a portable, compact, hot-pluggable PCIe device that is very compelling and enables better performance and serviceability in enterprise systems. Enterprise applications everywhere will benefit from the increased performance, lower energy consumption compared to HDDs, and hot plug serviceability.

Janene Ellefson is the Product Marketing Manager for Enterprise PCIe SSDs and is responsible for worldwide PCIe SSD marketing efforts.

She joined Micron in 1989 and has spent the majority of her Micron career in various marketing roles, supporting NOR Flash and NAND Flash products.

Ms. Ellefson holds a BS from the University of Phoenix in business and marketing.







# PCIe 2.5" Form Factor

Janene Ellefson

Product Marketing Manager – PCIe SSD



©2011 Micron Technology, Inc. All rights reserved. Products are warranted only to meet Micron's production data sheet specifications. Information, products, and/or specifications are subject to change without notice. All information is provided on an "AS IS" basis without warranties of any kind. Dates are estimates only. Drawings are not to scale. Micron and the Micron logo are trademarks of Micron Technology, Inc. All other trademarks are the property of their respective owners.

January 23, 2012

### Advantages of PCIe over SATA/SAS SSDs



Higher Performance



Lower power consumption



#### Lots of advantages for PCIe. Enterprise would use it more if they could.



# What's Holding it Back?



- Too much space
- Limited PCIe Slots
- Power down required

Today's PCIe form factors are not optimal for Enterprise serviceability

### Propose the 2.5" PCIe Form Factor





### 2.5" Advantages



- PCIe performance
- Common Form Factor
- Compactness

- Serviceability
- Lower TCO
- Supports RAID



# Summary

- PCIe offers lots of advantages, but adoption rates are low
- Today's PCIe form factors are not optimal for Enterprise serviceability
- 2.5" Form Factor: All the performance of PCIe with the serviceability standards of SATA/SAS
- 2.5" = increased PCIe adoption








January 2012

#### Dr. Moon Kim, PhD, *Tailwind*

#### Convergence of Memory and Storage IO Architecture

#### Abstract:

As storage devices use memory technologies (e.g., flash and DDR devices) in order to speed up data access, storage system designs have not changed. That is, conventional designs still utilize I/O interfaces such as PCIe. As such, conventional designs have storage access imbalances. Although memory system technology has been utilizing DRAM-based approaches, many business applications require even larger memory spaces in order to take advantage of more recent CPU technology advancements. In this presentation, the use of the extended memory access architecture will be introduced.

As a venture partner of Harbor Pacific Capital, Dr. Moon J. Kim serves as the CEO of TailWind Storage. Most recently, Dr. Kim served as the Vice-Chairman & CEO Technology Advisor of Samsung Electronics Corp., where he led several special projects. He also served as the executive technology advisor of LG and the senior managing executive of Exponent, a New York based technology consulting company. Dr. Kim specializes in IO and memory architecture for HPC and mainframe servers. During his 28 years in IBM R&D, he led and managed all aspects of IT technology and server development. He held the prestigious title of IBM Master Inventor and has led numerous Emerging Technology developments. He has produced over 130 inventions, authored several system and IT technology books, and published numerous technical papers. He is an expert on the technology industry in Asia. Recently he was awarded twice by the Chinese Academy of Science for his work on the China National Supercomputing Grid and multicore processor development projects. He can be reached at mikim@harborpac.com and (650) 690-0795 or (845) 702-2422.



TAILWIND STORAGE





### **SNIA** Presentation

January 2012

Dr. Moon J Kim



#### A New Era in Storage Architecture

- High IO demand causes IO congestion. DRAM has the highest bandwidth and lowest latency for CPUs, thus making it reasonable to exploit DRAM as an IO channel.
- Conventional IOs, such as PCIe, demand too many supporting resources for the IO itself, and several CPU cycles are required to move the data.
- New and innovative technologies are needed to bring IOs closer to the CPU.



#### **Expanded Storage Architecture**



George et al.

- [54] DYNAMIC RECONFIGURATION OF MAI STORAGE AND EXPANDED STORAGE BY MEANS OF A SERVICE CALL LOGICAL PROCESSOR
- [75] Inventors: Jonel George, Pleasant Valley; Stev Gardner Glassen, Wallkill; Matthe Anthony Krygowski, Hopewell Junction; Moon Ju Kim, Wappinger Falls; Allen Herman Preston; Davi Emmett Stucki, both of Poughkeeps all of N.Y.
- [73] Assignee: International Business Machines Corporation, Armonk, N.Y.
- [21] Appl. No.: 635,537
- [22] Filed: Apr. 22, 1996

**Related U.S. Application Data** 

[63] Continuation of Ser. No. 70,588, Jun. 1, 1993, abandoned.

**TAILWIND STORAGE** 

|      |                                                                                                                                                                                                                                       |                                                                                                                                                                                      | US006026462A                                                                                                                                                                          |                                                            |                         |  |
|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------|-------------------------|--|
| Un   | ited S                                                                                                                                                                                                                                | States Patent [19]                                                                                                                                                                   | [11]                                                                                                                                                                                  | Patent Number:                                             | 6,026,462               |  |
| Geo  | rge et a                                                                                                                                                                                                                              | l.                                                                                                                                                                                   | [45]                                                                                                                                                                                  | Date of Patent:                                            | *Feb. 15, 2000          |  |
| [54] |                                                                                                                                                                                                                                       | ORAGE AND EXPANDED<br>E REASSIGNMENT FACILITY                                                                                                                                        | [58] Fi                                                                                                                                                                               | eld of Search<br>711/206, 170; 74/                         |                         |  |
| [75] | Inventors:                                                                                                                                                                                                                            | Jonel George, Pleasant Valley; Steven                                                                                                                                                | [56]                                                                                                                                                                                  | References Ci                                              | ted                     |  |
|      |                                                                                                                                                                                                                                       | Gardner Glassen, Wallkill; Matthew<br>Anthony Krygowski, Hopewell<br>Junction; Moon Ju Kim, Wappingers<br>Falls; Allen Herman Preston; David<br>Emmett Stucki, both of Poughkeepsie, | U.S. PATENT DOCUMENTS                                                                                                                                                                 |                                                            |                         |  |
|      |                                                                                                                                                                                                                                       |                                                                                                                                                                                      |                                                                                                                                                                                       | 5,322 5/1990 Stimac et al.<br>1,055 12/1997 George et al.  |                         |  |
|      |                                                                                                                                                                                                                                       | all of N.Y.                                                                                                                                                                          |                                                                                                                                                                                       | Examiner—Tuan V. Thai                                      |                         |  |
| [73] | Assignee:                                                                                                                                                                                                                             | International Business Machines                                                                                                                                                      | Attorney,<br>L. Augst                                                                                                                                                                 | Agent, or Firm—Lane, A                                     | itken & McCann; Lynn    |  |
| . ,  | 0                                                                                                                                                                                                                                     | Corporation, Armonk, N.Y.                                                                                                                                                            | [57]                                                                                                                                                                                  | ABSTRAC                                                    | r                       |  |
| [*]  | Notice:                                                                                                                                                                                                                               | This patent is subject to a terminal dis-<br>claimer.                                                                                                                                | A data                                                                                                                                                                                | processing system has a<br>which provides a common         | processing unit and a   |  |
| [21] | Appl. No.:                                                                                                                                                                                                                            | : 08/897,449                                                                                                                                                                         |                                                                                                                                                                                       | age is initially assigned as<br>1 storage during power on. |                         |  |
| [22] | Filed:                                                                                                                                                                                                                                | Jul. 22, 1997                                                                                                                                                                        | assignme                                                                                                                                                                              | ent, storage assigned as ma                                | ain storage or expanded |  |
|      | Related U.S. Application Data<br>Division of application No. 08/635,537, Apr. 22, 1996, Pat.<br>No. 5,704,055, which is a continuation of application No.<br>08/070,588, Jun. 1, 1993, abandoned.<br>5,479,631 12/1995 Manners et al. |                                                                                                                                                                                      | storage may be unassigned and thus returned to the common<br>pool. Once returned to the common pool, the storage may be<br>reassigned as either main storage or expanded storage. The |                                                            |                         |  |
| [62] |                                                                                                                                                                                                                                       |                                                                                                                                                                                      | storage reassignment is done dynamically without requiring<br>a reset action and transparent to the operating system and<br>any active application programs                           |                                                            |                         |  |
|      | Assistant                                                                                                                                                                                                                             | Examiner—Tod R. Swann<br>Examiner—Tuan V. Thai<br>Agent, or Firm—Lynn L. Augspurger; La<br>r                                                                                         | aurence J.                                                                                                                                                                            |                                                            |                         |  |
|      | [57]                                                                                                                                                                                                                                  | ABSTRACT                                                                                                                                                                             |                                                                                                                                                                                       |                                                            |                         |  |
|      | A data p                                                                                                                                                                                                                              | rocessing system has a processing un                                                                                                                                                 | uit and a                                                                                                                                                                             |                                                            |                         |  |

CONFIDENTIAL

#### **Problem –** Increasing need for faster storage

As CPUs reach faster clock speeds, storage technologies have evolved to reduce the "Speed-Gap" between the CPU and the storage device.



Source : Shirish Jamthe , Director of System Engineering, Virident Systems, Inc., August 2011



#### **New Architecture Consideration**






#### **DDR+ Extension and Memory Mapper**






#### **Example Implementation**



- Slight modification and expansion of the SDRAM address scheme allows infinite address space extension, an additional command mode, status register space, etc.



#### **New Architecture**

- Memory space can be extended with an additional description tag stored in a specified register and memory location.
- Memory space and thread are virtualized within the limited memory space.
- Thread sees physical address space.
- OS maintains virtualization of threads.
- Storage is connected through memory-to-storage mapper.
- Memory can serve as a large off-CPU cache of storage.
- Storage should be fast enough to support memory operation.
- Storage can be accessed directly.



### **Tailwind Storage Company**

- Tailwind's <u>DDR-based storage technology</u> meets the increasing need for ultra-fast storage devices that match faster CPUs.
- Tailwind Storage prototypes have been approved by major OEM partners.
- Tailwind maintains a robust IP portfolio.
- Tailwind's team has over 100 years of IO & Memory experience in storage system technology.



#### **Problem – High Performance Computing Environment**

 Existing storage technologies are unable to fully meet demanding performance in <u>multi-threaded</u>, <u>data-heavy</u> <u>computing environments</u>.



#### NAND Flash Memory Technology

- NAND technology maintains a lower effective capacity.
- IOPS testing: latency is effective and favorable under NAND. It <u>may not reflect</u> the real memory operations.
- NAND scaling usually increases latency.



#### **Solution – Benefits of Our DDR Storage Products**

#### Expandable

#### Unbeatable Speed

- Much faster than flash based SSD
- Access to storage is closer to speed of CPU

#### Sustainable Performance

- No performance degradation\*
- Symmetric read/write performance
- Linear and transparent
- Consistent performance regardless of workload mix



#### **Our Solution – DDR Advantages**

#### Latency

Faster than Flash SSD and HDD, on the order of nanoseconds instead of milliseconds or microseconds

#### Sustainability of Performance\*\*

- No performance degradation
- Symmetric and linear read/write performance
  - Read / Write Parity
  - Consistent performance regardless of workload mix



### Fast transition in handling mixed block sizes\*\*



### No idle recovery required, minimum background garbage collection\*\*



\*\*Actual test results by an independent test service company with Tailwind's Pro-E

#### IT Storage Hierarchy, Trends and Opportunity



#### **Our Solution – Y2011 Early Adopter Products**

- Pro Extended
  - 64GB DDR, 700MB/s
  - Initial evaluation completed with prototype from OEM
- Hybrid SSD Storage & Server
  - 8 Core CPU, 512GB DDR, 5GB/s
  - Evaluation approved by major OEM for market development
- Super-Mini
  - 8 Core CPU, 1TB DDR, 16GB/s
  - Customer evaluation in progress









#### Y2012 TW Product Specification

| Feature           | *Pro-Extreme<br>Prototype | Backdraft             | 2 <sup>nd</sup> Backdraft |
|-------------------|---------------------------|-----------------------|---------------------------|
| Memory technology | DDR2 SDRAM                | DDR3 SDRAM            | DDR3 SDRAM                |
| Capacity          | 64GB                      | 512GB max.            | 1024GB max.               |
| Host interface    | PCIe Gen. 1, 4x           | PCIe Gen. 2, 8x       | PCIe Gen. 2, 16x          |
| Host bandwidth    | 0.8GB/s                   | 4GB/s                 | 8GB/s                     |
| Form factor       | Full length PCIe          | Half, full, dual PCIe | Half, full, dual PCIe     |



#### **Contact Information**

For more information:

Dr. Moon J Kim

- <u>mjkim@tailwindstorage.com</u>
- Tel: 650-690-0795
- 525 University Ave, Suite 100, Palo Alto, CA 94301






PCIe – Lessons from the Front Lines; and a Look to the Future

Gary Orenstein, Fusion-io

#### Abstract:

In a matter of no time, at least in storage years, NAND flash has emerged in the data center as a force changing the storage landscape. Perhaps nowhere has this impact been more visible and more dramatic than in the placement of NAND flash close to the CPUs. By placing process-critical data close to the CPU, customers see leap fold performance improvements for their applications and databases. This talk will explore customer input, reactions, and lessons on new models of deploying NAND flash using PCIe, along with taking a look at the future.

Today the industry is on the cusp of a new storage continuum. PCIe as a storage mechanism now spans everything from high end servers like the HP DL 980 with up to 16 PCIe I/O expansion slots, all the way down to Thunderbolt, a consumer focused link based on PCIe. There are also important industry initiatives underway like SCSI Express and activities within T10. This talk will cover some of the latest proposals and how the industry and customers stand to benefit from these developments.

VP of Products at Fusion-io, Gary has served in leadership roles at numerous data center infrastructure companies. Prior to Fusion-io he was the vice president of marketing at MaxiScale, focused on web scale file systems and acquired by Overland Storage.

Prior to MaxiScale, he was the vice president of marketing and business development at Gear6, focusing on storage and web caching. He also served as vice president of marketing at Compellent, which went public in 2007, and was a cofounder at Nishan Systems, acquired by McDATA/Brocade.





#### January 2012

### FUSION-iO

PCIe Storage - Lessons From the Front Lines and a Look to the Future Gary Orenstein, VP of Products

SNIA Winter Symposium January 2012


Lessons



#### **TRADITIONAL ARCHITECTURE**








#### **APPLICATION ACCELERATION EXAMPLES**

#### AVERAGE DATABASE THROUGHPUT







# Turning Point for PCIe





| Consumer | SMB/SME | Enterprise |
|----------|---------|------------|
| • USB    | • SATA  | • SAS      |
| • SATA   | • SAS   | • FC       |
| • IDE    | • FC    | • IB       |



#### **THUNDERBOLT PCIE DEVICES**




January 23, 2012




Up to 11 full height/full length slots supported

2410 GB x 11

26.5TB per server






### Consumer, SMB/SME, Enterprise: **PCIe**



# Express





#### A set of industry initiatives delivering a PCI Express based enterprise storage solution

| Industry Initiative          | Focus                                                    |
|------------------------------|----------------------------------------------------------|
| SCSI Over PCIe (SOP)         | Streamline SCSI command set optimized for solid state    |
| PCIe Queuing Interface (PQI) | Flexible and extensible transport layer                  |
| Universal drive connector    | Supporting current and emerging devices                  |
| PCIe physical layer          | Drive error handling and asynchronous hot add/remove     |
| Native OS support            | Standard drivers to support range of devices             |



#### SCSI EXPRESS AND NVM EXPRESS


| SCSI Express                             | NVM Express (NVMe)                                                                               |
|------------------------------------------|--------------------------------------------------------------------------------------------------|
| **A standard** to combine SCSI and PCIe  | A register level interface for host software to communicate with a non-volatile memory subsystem |
| Enterprise Roots (SCSI based)            | Consumer Roots (ATA based)                                                                       |
| SCSI reliability and dependability       |                                                                                                  |
| Extensible configurations                | Limited configuration support                                                                    |





- Embrace PCIe
- Fill gap between DRAM and HDD
- Embrace SCSI
- Work together on standards
- Ensure a quality ecosystem

### LINUX KERNEL WITH SCSI SUBSYSTEM








# Don't forget

# software



#### Is GPS technology a new map or new architecture?















#### **Applications and File Systems**

#### Storage Stack

#### **Physical Device Operations**














#### Asymmetric read/write latencies

Write-impact on durability

Unique erase characteristics



## **FLASH AS A NEW ARCHITECTURE**


#### Input

Logical Block Address (LBA)

### **Flash Translation Layer**

### Output Commands to Physical NAND flash



- Virtualize the storage layer
- Retain compatibility with conventional block I/O
- Deliver new flash-native capabilities
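The translation step on the preceding slide (logical block address in, physical NAND commands out) can be illustrated with a toy model. This is a minimal sketch under simplifying assumptions, not Fusion-io's implementation: the class `ToyFTL`, its fields, and the page-granular append-only layout are all hypothetical, and garbage collection and wear leveling are left out. It shows only the core idea that a write goes to a fresh physical page and the mapping is redirected, so the host keeps seeing conventional block I/O.

```python
class ToyFTL:
    """Hypothetical log-structured flash translation layer (illustration only)."""

    def __init__(self, num_pages: int):
        self.pages = [None] * num_pages  # simulated physical NAND pages
        self.mapping = {}                # LBA -> physical page number
        self.next_free = 0               # append-only write pointer
        self.stale = set()               # pages holding superseded data

    def write(self, lba: int, data: bytes) -> int:
        """Store data for an LBA in a fresh page and return that page number."""
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free pages; a real FTL would garbage-collect here")
        ppn = self.next_free
        self.next_free += 1
        self.pages[ppn] = data           # program a new page, never overwrite in place
        old = self.mapping.get(lba)
        if old is not None:
            self.stale.add(old)          # old copy stays until it is reclaimed
        self.mapping[lba] = ppn          # redirect the logical address
        return ppn

    def read(self, lba: int) -> bytes:
        """Return the most recently written data for an LBA."""
        return self.pages[self.mapping[lba]]


ftl = ToyFTL(num_pages=8)
ftl.write(5, b"v1")
ftl.write(5, b"v2")                      # same LBA, different physical page
print(ftl.read(5), ftl.mapping, ftl.stale)
```

Because the superseded page is only marked stale, the old data remains physically present for a while, which is the property the atomic-write discussion that follows relies on.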

# **Atomic Writes**



#### **ATOMIC WRITES**


#### 23 December 2011

11-229r1 SBC-3 SPC-4 Atomic writes

To: T10 Technical Committee

From: Rob Elliott, HP (elliott@hp.com) and Ashish Batwara, Fusion-io (abatwara@fusionio.com)
Date: 23 December 2011
Subject: 11-229r1 SBC-3 SPC-4 Atomic writes

#### Revision history

Revision 0 (7 May 2011) First revision

Revision 1 (23 December 2011) Incorporated feedback from CAP WG 2011-09-14; created a new ATOMIC WRITE command with multiple LBA ranges. Added Ashish Batwara as co-author.

#### References

Atomic Writes for data integrity and consistency in shared storage devices for clusters. Michael Okun and Amnon Barak. Future Generation Computer Systems 20(4), 539-547 (2004). Also in Conference on Algorithms and Architectures for Parallel Processing (ICA3PP-02), 2002.

Beyond Block I/O: Rethinking Traditional Storage Primitives. Xiangyong Ouyang (Fusion-io and Ohio State), David Nellans (Fusion-io), Robert Wipfel (Fusion-io), David Flynn (Fusion-io), Dhabaleswar K. Panda (Ohio State). 17th IEEE International Symposium on High-Performance Computer Architecture (HPCA-17), 2011. See http://david.nellans.org/files/HPCA-2011.pdf and http://nowlab.cse.ohio-state.edu/publications/conf-presentations/2011/ouyangx-hpca2011-slides.pdf.

#### **Overview**

Some types of storage devices (e.g., NAND-flash based SSDs) do not overwrite data in place like others (e.g., HDDs); new writes are directed to new storage locations, and the old locations maintain the old data until they are later reclaimed. These devices may have the ability to revert back to the old data in case something goes awry during the write (e.g., power is lost). If an application client is able to rely on this fact, it can avoid performing its own transactional logging operations, increasing performance.

The 2004 Okun/Barak paper defines a new atomic write operation that provides these semantics: "A storage device that supports Atomic Write (AW) guarantees that either all the blocks in a write operation are written or no blocks are written at all."
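The all-or-nothing semantics quoted above can be sketched on a device that writes out of place. The following is an illustration under stated assumptions, not the proposed ATOMIC WRITE command or any vendor's firmware: the class `AtomicWriteDevice`, its methods, and the `fail_after` fault-injection parameter are hypothetical. New data is appended to a log, and the logical-to-physical mapping is updated in a single step only after every block has been written, so a failure midway leaves the old data visible.

```python
from typing import Dict, Optional


class AtomicWriteDevice:
    """Toy out-of-place device with an all-or-nothing multi-block write."""

    def __init__(self) -> None:
        self.log = []      # append-only list of written block payloads
        self.mapping = {}  # committed LBA -> index into self.log

    def atomic_write(self, blocks: Dict[int, bytes],
                     fail_after: Optional[int] = None) -> None:
        """Write every (LBA -> data) pair or none of them.

        fail_after is a hypothetical test knob: simulate power loss once
        that many blocks of this request have been appended.
        """
        staged = {}
        for count, (lba, data) in enumerate(blocks.items()):
            if fail_after is not None and count >= fail_after:
                # Appended blocks become unreferenced garbage; the committed
                # mapping was never touched, so the old data is still visible.
                raise OSError("simulated power loss; write not committed")
            self.log.append(data)
            staged[lba] = len(self.log) - 1
        self.mapping.update(staged)  # single commit point for the whole group

    def read(self, lba: int) -> bytes:
        """Return the last committed data for an LBA."""
        return self.log[self.mapping[lba]]


dev = AtomicWriteDevice()
dev.atomic_write({1: b"old-1", 2: b"old-2"})
try:
    dev.atomic_write({1: b"new-1", 2: b"new-2"}, fail_after=1)
except OSError:
    pass
print(dev.read(1), dev.read(2))  # both reads return the old, committed data
```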

The 2011 Ouyang/Nellans/et al. paper implemented an atomic write primitive with a NAND-flash based storage device for a MySQL database (see http://www.mysql.com) with the InnoDB transactional storage engine (see http://www.innodb.com), measuring:

- a) 43% reduction in data written to storage;
- b) 20% reduction in transaction latency; and
- c) 33% throughput improvement

#### **Benefits**

An atomic multi-block write (ATOMIC WRITE) command batches multiple write I/O operations into a single logical group written as a whole or rolled back upon failure. These multi-block writes, which are native to the hardware, resolve a problem of indeterminate status of failed writes that often requires a two-part write: one write for the data in place and another write to update the journal of the activity. Avoiding one extra write doubles the life of SSDs. Additionally, by moving the write atomicity down the stack into the storage device, it is possible to significantly simplify the applications, filesystems, or operating systems which conventionally do extra processing to guarantee the consistency and integrity of data. In summary, the atomic write command eliminates the major overhead, simplifies applications, increases the storage and write bandwidth, and doubles the life of SSDs.

Benefits of the atomic write command include:

- a) increased write endurance
- b) increased performance
- c) fewer write I/Os
- A) simplify applications
- B) keep applications from managing atomicity
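The endurance argument above is simple arithmetic, sketched here under idealized assumptions: every logical block costs exactly two device writes when journaled (data plus journal) and one when written atomically, and the device has a fixed budget of block programs. The function names and the budget figure are illustrative only. Real workloads fall short of this ideal; the Ouyang/Nellans measurement cited in the overview reports a 43% reduction in data written rather than 50%.

```python
def device_writes(logical_blocks: int, journaled: bool) -> int:
    """Blocks physically written for a given number of logical block updates."""
    writes_per_block = 2 if journaled else 1  # data + journal vs. data only
    return logical_blocks * writes_per_block


def lifetime_logical_blocks(program_budget: int, journaled: bool) -> int:
    """Logical block updates a device can absorb before its budget is spent."""
    writes_per_block = 2 if journaled else 1
    return program_budget // writes_per_block


budget = 1_000_000  # hypothetical total block programs the device can endure
with_journal = lifetime_logical_blocks(budget, journaled=True)
with_atomic = lifetime_logical_blocks(budget, journaled=False)
print(f"journaled: {with_journal} updates, atomic: {with_atomic} updates")
print(f"lifetime ratio: {with_atomic / with_journal:.1f}x")
```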


### http://www.t10.org/

#### Doc:11-229R1

### IT IS ABOUT TRANSACTIONS

- Building block of applications and databases
- Transactional Semantics
  - Data Integrity
  - Concurrency
  - Crash Recovery

- Applications
- File Systems
- Databases
- Web Services
- Search Engines
- Mission Critical Computing







### Batch multiple I/O operations into a single logical group

 Multiple I/Os are persisted as a whole or rolled back upon failure

### **ATOMIC WRITES - OPTIMIZED**




Moving the Atomic-Write Primitive into Storage Stack



#### MySQL Extension for Atomic Writes





#### **Gary Orenstein**

#### go@fusionio.com

@garyorenstein

#### THANK YOU




#### PCIe Round Table . . Questions for the Panel



- Will any one of the competing PCIe interface standards prevail as the Industry Standard and why?
- Is PCIe SSS suitable for both Client and Enterprise Applications?
- How does the higher cost per GB of PCIe Solid State Storage affect adoption?
- Will PCIe SSS become standardized as a Block IO device driver?
- What does one DO with a million IOPS? i.e. limitations of bus, bandwidth, system optimization
- Doesn't the move to virtualization work against the adoption of DAS-oriented PCIe SSD?
- Does PCIe flash make more sense as a memory or as a storage element?



#### Thank You for your Participation in the PCIe Round Table at the 2012 SNIA Face-to-Face

