Patent application number | Description | Published |
20080235430 | Creation and Management of Routing Table for PCI Bus Address Based Routing with Integrated DID - A method is provided for creating and managing tables for routing packets through an environment that includes multiple hosts and shared PCI switches and adapters. A Destination Identification (DID) field in the PBA is appended to a transaction packet dispatched through the PCI switches, wherein a particular DID is associated with a particular host or system image, and thus identifies the physical or virtual end point of its packet. In one embodiment, packets are routed through PCI switches in a distributed computer system comprising multiple root nodes, wherein each root node includes one or more hosts. The embodiment includes the step of creating a table or like data structure in a specified one of the switches. When a particular host of one of the root nodes becomes connected to the specified switch, a PCI Configuration Master (PCM), residing in one of the root nodes, is operated to enter a destination identifier or DID into the table. The DID is then appended as an address component, to packets directed through the specified switch from the particular host to one of the adapters. The destination identifier is also used to determine that a PCI packet, routed through the specified switch from one of the adapters, is intended for the particular root node. | 09-25-2008 |
20080235431 | Method Using a Master Node to Control I/O Fabric Configuration in a Multi-Host Environment - A method is directed to use of a master root node, in a distributed computer system provided with multiple root nodes, to control the configuration of routings through an I/O switched-fabric. One of the root nodes is designated as the master root node or PCI Configuration Manager (PCM), and is operable to carry out the configuration while each of the other root nodes remains in a quiescent or inactive state. In one useful embodiment pertaining to a system of the above type, that includes multiple root nodes, PCI switches, and PCI adapters available for sharing by different root nodes, a method is provided wherein the master root node is operated to configure routings through the PCI switches. Respective routings are configured between respective root nodes and the PCI adapters, wherein each of the configured routings corresponds to only one of the root nodes. A particular root node is enabled to access each of the PCI adapters that are included in any configured routing that corresponds to the particular root node. At the same time, the master root node writes into a particular root node only the configured routings that correspond to the particular root node. Thus, the particular root node is prevented from accessing an adapter that is not included in its corresponding routings. | 09-25-2008 |
20080235785 | Method, Apparatus, and Computer Program Product for Routing Packets Utilizing a Unique Identifier, Included within a Standard Address, that Identifies the Destination Host Computer System - A computer-implemented method, apparatus, and computer program product are disclosed in a data processing environment that includes host computer systems that are coupled to adapters utilizing a switched fabric for routing packets between the host computer systems and the adapters. A unique destination identifier is assigned to one of the host computer systems. A portion of a standard format packet destination address is selected. Within a particular packet, the portion is set equal to the unique identifier that is assigned to the host computer system. The particular packet is then routed through the fabric to the host computer system using the unique destination identifier. | 09-25-2008 |
20080270853 | Method of Routing I/O Adapter Error Messages in a Multi-Host Environment - A method and apparatus is provided for routing error messages in a distributed computer system comprising multiple root nodes, and further comprising one or more PCI switches and one or more I/O adapters, wherein each root node includes one or more system images. In one useful embodiment, a method is provided for routing I/O error messages to root nodes respectively associated with the errors contained in the messages. The method includes detecting occurrence of an error at a specified one of the adapters, wherein the error affects one of the system images, and generating an error message at the specified adapter. The method further comprises routing the error message from the specified adapter to the particular root node that includes the affected system image. The error message is then selectively processed at the particular root node, in order to identify the affected system image. Usefully, the step of routing the error message includes using a bus/device/function number associated with the error, together with a routing table located in one of the PCI switches, to route the error message to the correct root node and system image. | 10-30-2008 |
20080307116 | Routing Mechanism in PCI Multi-Host Topologies Using Destination ID Field - Method and system for address routing in a distributed computing system, such as a distributed computing system that uses PCI Express protocol to communicate over an I/O fabric. A destination identifier is provided to identify a physical or virtual host or end point. When a physical or virtual host or end point receives a PCI data packet it compares a list of source identifiers with destination identifiers to determine if a source identifier included in the transaction packet is associated with a destination identifier included in the transaction packet to determine if the transaction packet has a valid association. If the transaction packet has a valid association, it is routed to the target device. The present invention enables each host that attaches to PCI bridges or switches and shares a set of common PCI devices to have its own PCI 64-bit address space and enables the routing of PCI transaction packets between multiple hosts and adapters, through a PCI switched-fabric bus using a destination identifier. | 12-11-2008 |
20090100204 | Method, Apparatus, and Computer Usable Program Code for Migrating Virtual Adapters from Source Physical Adapters to Destination Physical Adapters - A computer-implemented method, apparatus, and computer usable program code are disclosed for migrating a virtual adapter from a source physical adapter to a destination physical adapter in a data processing system where multiple host computer systems share multiple adapters and communicate with those adapters through a PCI switched-fabric bus. The virtual adapter is first caused to stop processing transactions. All in-flight transactions that are associated with the virtual adapter are then captured. The configuration information that defines the virtual adapter is moved from the source physical adapter to the destination physical adapter. The in-flight transactions are then restored to their original locations on the destination virtual adapter. The virtual adapter is then restarted on the destination physical adapter such that the virtual adapter begins processing transactions. | 04-16-2009 |
20090119551 | Broadcast of Shared I/O Fabric Error Messages in a Multi-Host Environment to all Affected Root Nodes - A method, mechanism and computer usable medium is provided for distributing I/O fabric errors to the appropriate root nodes in a multi-root environment. The case where the I/O fabric is attached to more than one root node and where each root can potentially share with the other roots the I/O adapter (IOA) resources which are attached to the I/O is addressed. Additionally, a method, mechanism and computer usable medium is provided by which errors detected in an I/O fabric may be routed to all root nodes which may be affected by the error, while not being reported to the root nodes that will not be affected by those errors. In particular, distributed computing system which uses the PCI Express protocol to communicate over the I/O fabric is addressed. | 05-07-2009 |
20090133016 | System and Method for Management of an IOV Adapter Through a Virtual Intermediary in an IOV Management Partition - The system and method address the situation where an input/output (I/O) fabric is shared by more than one logical partition (LPAR) and where each LPAR can share with the other LPARs an I/O adapter (IOA). In particular, each LPAR is assigned its own separate address space to access a virtual function (VF) assigned to it such that each LPAR's perception is that it has its own independent IOA. Each VF may be shared across multiple LPARs. Facilities are provided for management of the shared resources of the IOA via a Physical Function (PF) of the IOA by assignment of that PF to an I/O Virtualization Management Partition (IMP). The code running in the IMP acts as a virtual intermediary to the VFs for fully managing the VF error handling, VF reset, and configuration operations. The IMP also acts as an interface to the PF for accessing common VF functionality. | 05-21-2009 |
20090133028 | SYSTEM AND METHOD FOR MANAGEMENT OF AN IOV ADAPTER THROUGH A VIRTUAL INTERMEDIARY IN A HYPERVISOR WITH FUNCTIONAL MANAGEMENT IN AN IOV MANAGEMENT PARTITION - A system and method which provide a mechanism for an I/O virtualization management partition (IMP) to control the shared functionality of an I/O virtualization (IOV) enabled I/O adapter (IOA) through a physical function (PF) of the IOA while the virtual functions (VFs) are assigned to client partitions for normal I/O operations directly. A hypervisor provides device-independent facilities to the code running in the IMP and client partitions. The IMP may include device specific code without the hypervisor needing to sacrifice its size, robustness, and upgradeability. The hypervisor provides the virtual intermediary functionally for the sharing and control of the IOA's control functions. | 05-21-2009 |
20090144508 | PCI Express Address Translation Services Invalidation Synchronization with TCE Invalidation - A PCI Express (PCIe) computer system utilizes address translation services to translate virtual addresses from I/O device adaptors to physical addresses of system memory. A combined memory controller and host bridge uses a translation agent to convert the I/O addresses via translation control entries (TCEs) in a TCE table (also known as an address translation and protection table). Some of the I/O device adaptors have address translation caches for local storage of TCEs. The TCE definition includes a new non-cacheable control bit which is set active in the TCE table when the TCE is in the process of being invalidated. The memory controller prevents further caching of the TCE while the non-cacheable control bit is active. A further implementation utilizes a change-in-progress control bit of the TCE to indicate that the TCE is in the process of being changed to allow simultaneous invalidation of the previously TCE information. | 06-04-2009 |
20090144731 | SYSTEM AND METHOD FOR DISTRIBUTION OF RESOURCES FOR AN I/O VIRTUALIZED (IOV) ADAPTER AND MANAGEMENT OF THE ADAPTER THROUGH AN IOV MANAGEMENT PARTITION - The system and method address the situation where an input/output (I/O) fabric is shared by more than one logical partition (LPAR) and where each LPAR can share with the other LPARs an I/O adapter (IOA). In particular, each LPAR is assigned its own separate address space to access a virtual function (VF) assigned to it such that each LPAR's perception is that it has its own independent IOA. Each VF may be shared across multiple LPARs. Facilities are provided for management of the shared resources of the IOA via a Physical Function (PF) of the IOA by assignment of that PF to an I/O Virtualization Management Partition (IMP). The code running in the IMP acts as a virtual intermediary to the VFs for fully managing the VF error handling, VF reset, and configuration operations. The IMP also acts as an interface to the PF for accessing common VF functionality. Furthermore, the functions of resource assignment and management relative to the VFs and the client partitions that use those VFs, which might normally be done by an entity like a hypervisor, are implemented by this IMP. | 06-04-2009 |
20090276544 | Mapping a Virtual Address to PCI Bus Address - Registering memory space within a data processing system is performed. One or more open calls are received from an application to access one or more input/output (I/O) devices. Responsive to receiving the one or more open calls, one or more I/O map and pin calls are sent in order to register memory space for the one or more I/O devices within at least one storage area that will be accessed by the application. At least one virtual I/O bus address is received for each registered memory space of the one or more I/O devices. At least one I/O command is executed using the at least one virtual I/O bus address without intervention by an operating system or operating system image. | 11-05-2009 |
20090276551 | Native and Non-Native I/O Virtualization in a Single Adapter - Mechanisms for enabling both native and non-native input/output virtualization (IOV) in a single I/O adapter are provided. The mechanisms allow a system with a large number of logical partitions (LPARs) and system images to use IOV to share a native IOV enabled I/O adapter or endpoint that does not implement the necessary number of virtual functions (VFs) for each LPAR and system image. A number of VFs supported by the I/O adapter, less one, are assigned to LPARs and system images so that they may make use of native IOV using these VFs. The remaining VF is associated with a virtual intermediary (VI) which handles non-native IOV of the I/O adapter. Any remaining LPARs and system images share the I/O adapter using the non-native IOV via the VI. Thus, any number of LPARs and system images may share the same I/O adapter or endpoint. | 11-05-2009 |
20090276605 | Retaining an Association Between a Virtual Address Based Buffer and a User Space Application that Owns the Buffer - Registering memory space for an application is performed. One or more open calls are received from an application to access one or more input/output (I/O) devices. Responsive to receiving the one or more open calls, one or more I/O map and pin calls are sent in order to register memory space for the one or more I/O devices within at least one storage area that will be accessed by the application. A verification is made as to whether the memory space to be registered is associated with the application. Responsive to the memory space being associated with the application, at least one virtual I/O bus address is received for each registered memory space of the one or more I/O devices. At least one I/O command is executed using the at least one virtual I/O bus address without intervention by an operating system or operating system image. | 11-05-2009 |
20090276773 | Multi-Root I/O Virtualization Using Separate Management Facilities of Multiple Logical Partitions - Mechanisms are provided for implementing a multi-root PCI manager (MR-PCIM) in a multi-root I/O virtualization management partition (MR-IMP) to control the shared functionality of an multi-root I/O virtualization (IOV) enabled switch fabric and multi-root IOV enabled I/O adapter (IOA) through the base functions (BF) of the switches and IOAs. A hypervisor provides device-independent facilities to the code running in the I/O Virtualization Management Partition (IMP), Multi-Root (MR)-IMP and client partitions. The MR-IMP may include device specific code without the hypervisor needing to sacrifice its size, robustness, and upgradeability. The hypervisor provides the virtual intermediary functionally for the sharing and control of the switch and IOA's control functions. | 11-05-2009 |
20090276775 | PCI Function South-Side Data Management - A hypervisor, during device discovery, has code which can examine the south-side management data structure in an adapter's configuration space and determine the type of device which is being configured. The hypervisor may copy the south-side management data structure to a hardware management console (HMC) and the HMC can populate the data structure with south-side data and then pass the structure to the hypervisor to replace the data structure on the adapter. In another embodiment the hypervisor may copy the data structure to the HMC and the HMC can instruct the hypervisor to fill-in the data structure, a virtual function at a time, with south-side management data associations. The administrator can assign south-side data, such as a MAC address for a virtual instance of an Ethernet device, to LPARs sharing the adapter. Thus, a standard way to manage the south-side data of virtual functions is provided. | 11-05-2009 |
20100146089 | Use of Peripheral Component Interconnect Input/Output Virtualization Devices to Create High-Speed, Low-Latency Interconnect - A computer-implemented method for a high speed peripheral component interconnect input/output virtualization configuration creates a set of virtual function path authorization tables, receives a request including a virtual function, from a requester, to provide requested data, and identifies a source address in the source system and a target address in each target system of the target set of systems. A virtual function work queue entry for the source system is created containing the source and the target address and responsive to determining the virtual function is authorized, write the requested data from the source address of the source system through a firewall of an intermediate device into the target address of each target system, wherein the intermediate device is one of a multi-root peripheral component interconnect device and a single root peripheral component interconnect device, and issuing a notice of completion to the requester. | 06-10-2010 |
20100146170 | Differentiating Traffic Types in a Multi-Root PCI Express Environment - Mechanisms for differentiating traffic types in a multi-root PCI Express environment are provided. The mechanisms generate a first mapping data structure that, for each single-root virtual hierarchy in the multi-root data processing system, associates a plurality of traffic classes with a plurality of priority groups and maps each traffic class in the plurality of traffic classes to a corresponding virtual channel in a plurality of virtual channels. Moreover, a second mapping data structure is generated that maps each virtual channel in the plurality of virtual channels to corresponding virtual link in a plurality of virtual links of the multi-root data processing system. Traffic of a particular priority group is routed from a single-root virtual hierarchy to a particular virtual link in the plurality of the virtual links based on the first mapping data structure and second mapping data structure. | 06-10-2010 |
20100153592 | Use of Peripheral Component Interconnect Input/Output Virtualization Devices to Create Redundant Configurations - In one embodiment, a computer-implemented method for creating redundant system configurations is presented. The computer-implemented method creates a set of virtual function path authorization tables, and receives a request from a requester to provide requested data from a virtual function wherein the virtual function is performed by a single root or a multi-root peripheral component interconnect device. Further a receive buffer is created in a selected address range in a set of addresses ranges as well as a virtual function work queue entry for the virtual function containing an address of the receive buffer in the selected address range. Responsive to a determination that the virtual function is authorized, writing the requested data into the receive buffer of the selected address range in the one or more systems, and responsive to writing the requested data, issuing a notice of completion to the requester. | 06-17-2010 |
20100165874 | Differentiating Blade Destination and Traffic Types in a Multi-Root PCIe Environment - Mechanisms for differentiating traffic types per host system blade in a multi-root PCI Express environment are provided. The mechanisms generate a first mapping data structure that, for each single-root virtual hierarchy in the multi-root data processing system, associates a plurality of traffic classes with a plurality of priority groups and maps each traffic class in the plurality of traffic classes to a corresponding virtual channel in a plurality of virtual channels. Moreover, a second mapping data structure is generated that maps each virtual channel in the plurality of virtual channels to corresponding per host system blade virtual links in a plurality of virtual links of the multi-root data processing system. Traffic of a particular priority group is routed from a single-root virtual hierarchy to a particular virtual link in the plurality of the virtual links based on the first mapping data structure and second mapping data structure. | 07-01-2010 |
20110252167 | PHYSICAL TO HIERARCHICAL BUS TRANSLATION - In an embodiment, a translation of a physical bus number to a hierarchical bus number is written to a south chip. The south chip receives a configuration write command that comprises a physical bus number. The south chip sends the configuration write command to a device via the bus identified by the physical bus number, and the device stores the physical bus number in the device. In response to a received message from a device that comprises the physical bus number, the south chip replaces the physical bus number in the message with the hierarchical bus number. The south chip sends the message to a north chip via a point-to-point serial link. Both the physical bus number and the hierarchical bus number identify a bus with which the device connects to a bridge in the south chip. | 10-13-2011 |
20110252170 | HIERARCHICAL TO PHYSICAL BUS TRANSLATION - In an embodiment, a translation of a hierarchical bus number to a physical bus number and a bridge identifier of a bridge are written to a north chip. A request is received that comprises an identifier of a destination. A determination is made that the identifier comprises the hierarchical bus number. In response to the determination, the identifier of the destination is replaced in the request with the physical bus number and the bridge identifier. The request is sent to the bridge identified by the bridge identifier. A south chip comprises the bridge, and the south chip is connected to the north chip via a point-to-point serial link. The physical bus number identifies a bus that connects the bridge to a device. The request comprises a configuration write request that requests a write of data to the device. | 10-13-2011 |
20110252173 | TRANSLATING A REQUESTER IDENTIFIER TO A CHIP IDENTIFIER - In an embodiment a translation of RID (requester identifier) ranges to identifiers of north chips is stored into a south chip. A command that comprises a command RID is received at the south chip from a device. In response, a RID range is determined that encompasses the command RID, and a north chip identifier is found that is assigned a virtual function identified by the command RID. The command is sent from the south chip to the north chip identified by the north chip identifier. The translation comprises a RID compare value and a RID mask. A determination is made that the RID range encompasses the command RID by performing a logical-and operation on the command RID and the RID mask and comparing a result of the logical-and operation to the RID compare value. | 10-13-2011 |
20110252174 | HIERARCHICAL TO PHYSICAL MEMORY MAPPED INPUT/OUTPUT TRANSLATION - In an embodiment, a translation of a hierarchical MMIO address range to a physical MMIO address range and an identifier of a bridge in a south chip are written to a north chip. A transaction is received that comprises a hierarchical MMIO address. The hierarchical MMIO address that is within the hierarchical MMIO address range is replaced in the transaction with the identifier of the bridge and with a physical MMIO address that is within the physical MMIO address range in the south chip. The transaction is sent to the device that is connected to the bridge in the south chip. The physical MMIO address range specifies a range of physical MMIO addresses in memory in the device. | 10-13-2011 |
20110276779 | MEMORY MAPPED INPUT/OUTPUT BUS ADDRESS RANGE TRANSLATION - In an embodiment, a north chip receives a secondary bus identifier that identifies a bus that is immediately downstream from a bridge in a south chip, a subordinate bus identifier that identifies a highest bus identifier of all of buses reachable downstream of the bridge, and an MMIO bus address range that comprises a memory base and a memory limit. The north chip writes a translation of a bridge identifier and a south chip identifier to the secondary bus identifier, the subordinate bus identifier, and the MMIO bus address range. The north chip sends the secondary bus identifier, the subordinate bus identifier, the memory base, and the memory limit to the bridge. The bridge stores the secondary bus identifier, the subordinate bus identifier, the memory base, and the memory limit in the bridge. | 11-10-2011 |
20110296074 | MEMORY MAPPED INPUT/OUTPUT BUS ADDRESS RANGE TRANSLATION FOR VIRTUAL BRIDGES - In an embodiment, a south chip comprises a first virtual bridge connected to a shared egress port and a second virtual bridge also connected to the shared egress port. The first virtual bridge receives a first secondary bus identifier, a first subordinate bus identifier, and a first MMIO bus address range from a first north chip. The second virtual bridge receives a second secondary bus identifier, a second subordinate bus identifier, and a second MMIO bus address range from a second north chip. The first virtual bridge stores the first secondary bus identifier, the first subordinate bus identifier, and the first MMIO bus address range. The second virtual bridge stores the second secondary bus identifier, the second subordinate bus identifier, and the second MMIO bus address range. The first north chip and the second north chip are connected to the south chip via respective first and second point-to-point connections. | 12-01-2011 |
20110320671 | MOVING OWNERSHIP OF A DEVICE BETWEEN COMPUTE ELEMENTS - In an embodiment, a command is received that requests movement of ownership of a target device from an origin compute element to a destination compute element. From the origin compute element, a translation of a virtual bridge identifier to a first secondary bus identifier, a first subordinate bus identifier, and a first MMIO bus address range is removed. To the destination compute element, a translation of the target virtual bridge identifier to a second secondary bus identifier, a second subordinate bus identifier, and a second MMIO bus address range is added. From a south chip that comprises the target virtual bridge, a translation of the target virtual bridge identifier to an identifier of the origin compute element is removed. To the south chip, a translation of the target virtual bridge identifier to an identifier of the destination compute element is added. | 12-29-2011 |
20120185632 | IMPLEMENTING PCI-EXPRESS MEMORY DOMAINS FOR SINGLE ROOT VIRTUALIZED DEVICES - A method, system and computer program product are provided for implementing PCI-Express memory domains for single root virtualized devices. A PCI host bridge (PHB) includes a memory mapped IO (MMIO) domain descriptor (MDD) and an MMIO Domain Table (MDT) are used to associate MMIO domains with PCI memory VF BAR spaces. One MDD is provided for each unique VF BAR space size per bus segment connecting a single root IO virtualization (SRIOV) device to the PCI host bridge (PHB). The MDT used with the MDD includes having a number of entries limited to a predefined total number of SRIOV VFs to be configured. A VF BAR Stride, which may be further implemented as a VF BAR Stride Capability Structure, is provided to reduce the number of MDDs required to map SRIOV VF BAR spaces. A particular definition of the MDD is provided to reduce the number of MDDs required to at most one per SRIOV bus segment below a PHB. | 07-19-2012 |
20120265916 | DYNAMIC ALLOCATION OF A DIRECT MEMORY ADDRESS WINDOW - A computer-implemented method may include determining that a slot coupled to a peripheral component interconnect host bridge is occupied by an input/output adapter. The computer-implemented method may include determining one or more characteristics of the input/output adapter and determining whether the input/output adapter is capable of using additional memory based on the one or more characteristics of the input/output adapter. The computer-implemented method may also include allocating the additional memory for the input/output adapter in response to determining that the input/output adapter is capable of using the additional memory. | 10-18-2012 |
20130212308 | MEMORY MAPPED INPUT/OUTPUT BUS ADDRESS RANGE TRANSLATION - In an embodiment, a north chip receives a secondary bus identifier that identifies a bus that is immediately downstream from a bridge in a south chip, a subordinate bus identifier that identifies a highest bus identifier of all of buses reachable downstream of the bridge, and an MMIO bus address range that comprises a memory base and a memory limit. The north chip writes a translation of a bridge identifier and a south chip identifier to the secondary bus identifier, the subordinate bus identifier, and the MMIO bus address range. The north chip sends the secondary bus identifier, the subordinate bus identifier, the memory base, and the memory limit to the bridge. The bridge stores the secondary bus identifier, the subordinate bus identifier, the memory base, and the memory limit in the bridge. | 08-15-2013 |
20140129795 | CONFIGURABLE I/O ADDRESS TRANSLATION DATA STRUCTURE - In response to a determination to allocate additional storage, within a real address space employed by a system memory of a data processing system, for translation control entries (TCEs) that translate addresses from an input/output (I/O) address space to the real address space, a determination is made whether or not a first real address range contiguous with an existing TCE data structure is available for allocation. In response to determining that the first real address range is available for allocation, the first real address range is allocated for storage of TCEs, and a number of levels in the TCE data structure is retained. In response to determining that the first real address range is not available for allocation, a second real address range discontiguous with the existing TCE data structure is allocated for storage of the TCEs, and a number of levels in the TCE data structure is increased. | 05-08-2014 |
20140129797 | CONFIGURABLE I/O ADDRESS TRANSLATION DATA STRUCTURE - In response to a determination to allocate additional storage, within a real address space employed by a system memory of a data processing system, for translation control entries (TCEs) that translate addresses from an input/output (I/O) address space to the real address space, a determination is made whether or not a first real address range contiguous with an existing TCE data structure is available for allocation. In response to determining that the first real address range is available for allocation, the first real address range is allocated for storage of TCEs, and a number of levels in the TCE data structure is retained. In response to determining that the first real address range is not available for allocation, a second real address range discontiguous with the existing TCE data structure is allocated for storage of the TCEs, and a number of levels in the TCE data structure is increased. | 05-08-2014 |