Patent application number | Description | Published |
20110057939 | Reading a Local Memory of a Processing Unit - Disclosed herein are systems, apparatuses, and methods for enabling efficient reads to a local memory of a processing unit. In an embodiment, a processing unit includes an interface and a buffer. The interface is configured to (i) send a request for a portion of data in a region of a local memory of an other processing unit and (ii) receive, responsive to the request, all the data from the region. The buffer is configured to store the data from the region of the local memory of the other processing unit. | 03-10-2011 |
20110060879 | SYSTEMS AND METHODS FOR PROCESSING MEMORY REQUESTS - A processing system is provided. The processing system includes a first processing unit coupled to a first memory and a second processing unit coupled to a second memory. The second memory comprises a coherent memory and a private memory that is private to the second processing unit. | 03-10-2011 |
20110202724 | IOMMU Architected TLB Support - Embodiments allow a smaller, simpler hardware implementation of an input/output memory management unit (IOMMU) having improved translation behavior that is independent of page table structures and formats. Embodiments also provide device-independent structures and methods of implementation, allowing greater generality of software (fewer specific software versions, in turn reducing development costs). | 08-18-2011 |
20120147021 | GRAPHICS COMPUTE PROCESS SCHEDULING - A method, system, and computer program product are disclosed for providing improved access to accelerated processing device compute resources to user mode applications. The functionality disclosed allows user mode applications to provide commands to an accelerated processing device without the need for kernel mode transitions in order to access a unified ring buffer. Instead, applications are each provided with their own buffers, which the accelerated processing device hardware can access to process commands. With full operating system support, user mode applications are able to utilize the accelerated processing device in much the same way as a CPU. | 06-14-2012 |
20120159039 | Generalized Control Registers - Methods, systems, and computer readable media generalize control registers in the context of memory address translations for I/O devices. A method includes maintaining a table including a plurality of concurrently available control register base pointers each associated with a corresponding input/output (I/O) device, associating each control register base pointer with a first translation from a guest virtual address (GVA) to a guest physical address (GPA) and a second translation from the GPA to a system physical address (SPA), and operating the first and second translations concurrently for the plurality of I/O devices. | 06-21-2012 |
20120188258 | GRAPHICS PROCESSING DISPATCH FROM USER MODE - A method, system, and computer program product are disclosed for providing improved access to accelerated processing device compute resources to user mode applications. The functionality disclosed allows user mode applications to provide commands to an accelerated processing device without the need for kernel mode transitions in order to access a unified ring buffer. Instead, applications are each provided with their own buffers, which the accelerated processing device hardware can access to process commands. With full operating system support, user mode applications are able to utilize the accelerated processing device in much the same way as a CPU. | 07-26-2012 |
20120229481 | ACCESSIBILITY OF GRAPHICS PROCESSING COMPUTE RESOURCES - A method, system, and computer program product are disclosed for providing improved access to accelerated processing device compute resources to user mode applications. The functionality disclosed allows user mode applications to provide commands to an accelerated processing device without the need for kernel mode transitions in order to access a unified ring buffer. Instead, applications are each provided with their own buffers, which the accelerated processing device hardware can access to process commands. With full operating system support, user mode applications are able to utilize the accelerated processing device in much the same way as a CPU. | 09-13-2012 |
20120246381 | Input Output Memory Management Unit (IOMMU) Two-Layer Addressing - Embodiments of the present invention provide methods, systems, and computer readable media for input output memory management unit (IOMMU) two-layer addressing in the context of memory address translations for I/O devices. According to an embodiment, a method includes translating a guest virtual address (GVA) to a corresponding guest physical address (GPA) using a guest address translation table according to a process address space identifier associated with an address translation transaction associated with an I/O device, and translating the GPA to a corresponding system physical address (SPA) using a system address translation table according to a device identifier associated with the address translation transaction. | 09-27-2012 |
20130070515 | METHOD AND APPARATUS FOR CONTROLLING STATE INFORMATION RETENTION IN AN APPARATUS - A method and apparatus for controlling state information retention determines at least a state information save or restore condition for at least one processing circuit such as one or more CPU or GPU cores or pipelines, in an integrated circuit. In response to determining the state information save or restore condition, the method and apparatus controls either or both of saving or restoring of state information for different virtual machines operating on the processing circuit, into corresponding on-die persistent passive variable resistance memory. The state information save or restore condition is a virtual machine level state information save or restore condition. State information for each of differing virtual machines is saved or restored from differing on-die passive variable resistance memory cells that are assigned on a per-virtual machine basis. | 03-21-2013 |
20130138840 | Efficient Memory and Resource Management - The present system enables passing a pointer, associated with accessing data in a memory, to an input/output (I/O) device via an input/output memory management unit (IOMMU). The I/O device accesses the data in the memory via the IOMMU without copying the data into a local I/O device memory. The I/O device can perform an operation on the data in the memory based on the pointer, such that I/O device accesses the memory without expensive copies. | 05-30-2013 |
20130145051 | Direct Device Assignment - A system is enabled for configuring an IOMMU to provide direct access to system memory data by at least one I/O device/peripheral. Further, the IOMMU is configured to pass a pointer to at least one I/O device without having to translate the pointer. Further, commands are sent from a process within a guest operating system (OS) directly to a peripheral without intervention from a hypervisor. Further, the IOMMU is configured to grant peripherals access permissions to memory blocks to maintain isolation among peripherals. | 06-06-2013 |
20130145055 | Peripheral Memory Management - The present system enables an input/output (I/O) device to request memory for performing a direct memory access (DMA) of system memory. Further, the system uses an input/output memory management unit (IOMMU) to determine whether or not the system memory is available. The IOMMU notifies an operating system associated with the system memory if the system memory is not available, such that the operating system allocates non-system memory for use by the I/O device to perform the DMA. | 06-06-2013 |
20130147821 | Methods and Systems to Facilitate Operation in Unpinned Memory - In an embodiment, a method of processing memory requests in a first processing device is provided. The method includes generating a memory request associated with a memory address located in an unpinned memory space managed by an operating system running on a second processing device; and responsive to a determination that the memory address is not resident in a physical memory, transmitting a message to the second processing device. In response to the message, the operating system controls the second processing device to bring the memory address into the physical memory. | 06-13-2013 |
20130166834 | SUB PAGE AND PAGE MEMORY MANAGEMENT APPARATUS AND METHOD - A method and apparatus for managing a virtual address to physical address translation utilize a subpage level fault detecting and access. The method and apparatus may also use an additional subpage and page store Non-Volatile Store (NVS). The method and apparatus determines whether a page fault occurs or whether a subpage fault occurs to effect an address translation and also operates such that if a subpage fault had occurred, a subpage is loaded corresponding to the fault from a NVS to a DRAM, such as DRAM or any other suitable volatile memory historically referred to as main memory. The method and apparatus, if a page fault has occurred, determines if a page fault has occurred without operating system assistance and is a hardware page fault detection system that loads a page corresponding to the fault from NVS to DRAM. | 06-27-2013 |
20130262736 | MEMORY TYPES FOR CACHING POLICIES - The present system enables receiving a request from an I/O device to translate a virtual address to a physical address to access the page in system memory. One or more memory attributes of the page defining a cacheability characteristic of the page is identified. A response including the physical address and the cacheability characteristic of the page is sent to the I/O device. | 10-03-2013 |
20130262775 | Cache Management for Memory Operations - Embodiments of the present invention provides for the execution of threads and/or workitems on multiple processors of a heterogeneous computing system in a manner that they can share data correctly and efficiently. Disclosed method, system, and article of manufacture embodiments include, responsive to an instruction from a sequence of instructions of a work-item, determining an ordering of visibility to other work-items of one or more other data items in relation to a particular data item, and performing at least one cache operation upon at least one of the particular data item or the other data items present in any one or more cache memories in accordance with the determined ordering. The semantics of the instruction includes a memory operation upon the particular data item. | 10-03-2013 |
20130262776 | Managing Coherent Memory Between an Accelerated Processing Device and a Central Processing Unit - Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter. | 10-03-2013 |
20130262784 | Memory Heaps in a Memory Model for a Unified Computing System - A method and system for allocating memory to a memory operation executed by a processor in a computer arrangement having a first processor configured for unified operation with a second processor. The method includes receiving a memory operation from a processor and mapping the memory operation to one of a plurality of memory heaps. The mapping produces a mapping result. The method also includes providing the mapping result to the processor. | 10-03-2013 |
20130262814 | Mapping Memory Instructions into a Shared Memory Address Place - Embodiments of the present invention provide a method of a first processor using a memory resource associated with a second processor. The method includes receiving a memory instruction from a first processor process, wherein the memory instruction refers to a shared memory address (SMA) that maps to a second processor memory. The method also includes mapping the SMA to the second processor memory, wherein the mapping produces a mapping result and providing the mapping result to the first processor. | 10-03-2013 |
20130263141 | Visibility Ordering in a Memory Model for a Unified Computing System - Provided is a method of permitting the reordering of a visibility order of operations in a computer arrangement configured for permitting a first processor and a second processor threads to access a shared memory. The method includes receiving in a program order, a first and a second operation in a first thread and permitting the reordering of the visibility order for the operations in the shared memory based on the class of each operation. The visibility order determines the visibility in the shared memory, by a second thread, of stored results from the execution of the first and second operations. | 10-03-2013 |
20130304841 | SERVER NODE INTERCONNECT DEVICES AND METHODS - Described are systems and methods for interconnecting devices. A switch fabric is in communication with a plurality of electronic devices. A rendezvous memory is in communication with the switch fabric. Data is transferred to the rendezvous memory from a first electronic device of the plurality of electronic devices in response to a determination that the data is ready for output from a memory at the first electronic device and in response to a location allocated in the rendezvous memory for the data. | 11-14-2013 |
20130339466 | DEVICES AND METHODS FOR INTERCONNECTING SERVER NODES - Described are aggregation devices and methods for interconnecting server nodes. The aggregation device can include an input region, an output region, and a memory switch. The input region includes a plurality of input ports. The memory switch has a shared through silicon via (TSV) memory coupled to the input ports for temporarily storing data received at the input ports from a plurality of source devices. The output region includes a plurality of output ports coupled to the TSV memory. The output ports provide the data to a plurality of destination devices. A memory allocation system coordinates a transfer of the data from the source devices to the TSV memory. The output ports receive and process the data from the TSV memory independently of a communication from the input ports. | 12-19-2013 |
20130346531 | SYSTEMS AND METHODS FOR INPUT/OUTPUT VIRTUALIZATION - Described is an aggregation device comprising a plurality of virtual network interface cards (vNICs) and an input/output (I/O) processing complex. The vNICs are in communication with a plurality of processing devices. Each processing device has at least one virtual machine (VM). The I/O processing complex is between the vNICs and at least one physical NIC. The I/O processing complex includes at least one proxy NIC and a virtual switch. The virtual switch exchanges data with a processing device of the plurality of processing devices via a communication path established by a vNIC of the plurality of vNICs between the at least one VM and at least one proxy NIC. | 12-26-2013 |
20140052808 | SPECULATION BASED APPROACH FOR RELIABLE MESSAGE COMMUNICATIONS - Described are a system and method for lossless message delivery between two processing devices. Each device includes a remote direct memory access (RDMA) messaging interface. The RDMA messaging interface at the first device generates one or more messages that are processed by the RDMA messaging interface of the second device. The RDMA messaging interface of the first device outputs a notification to the second device that a message of the one or more messages is available at the first device. A determination is made that the second device has resources to accommodate the message. The second device performs an operation in response to determining that the processing device has the resources to accommodate the message. | 02-20-2014 |
20140059160 | SYSTEMS AND METHODS FOR SHARING DEVICES IN A VIRTUALIZATION ENVIRONMENT - Described are systems and methods for communication between a plurality of electronic devices and an aggregation device. An aggregation device processes instructions related to a configuration of an electronic device in communication with the aggregation device. One or more virtual devices are generated in response to processing the instructions. The electronic device enumerates a configuration space to determine devices for use by the electronic device. The aggregation device detects an access of the configuration space by the electronic device. The one or more virtual devices are presented from the aggregation device to the electronic device in accordance with the instructions. | 02-27-2014 |
20140068088 | SYSTEMS AND METHODS FOR PROCESSING MEDIA ACCESS CONTROL (MAC) ADDRESSES - Described are a system and method for processing a media access control (MAC) address. A communication is established between a processing device and a network port of a data switching device. The data switching device assigns a MAC address to the processing device. The assigned MAC address is directly associated with the network port of the data switching device absent a learning mechanism. | 03-06-2014 |
20140068205 | SYSTEMS AND METHODS FOR MANAGING QUEUES - Described are systems and methods for transmitting data at an aggregation device. The aggregation device includes a record queue and an output bypass queue. The data is received from an electronic device. A record is generated of the received data. The record is placed in the record queue. A determination is made that the record in the record queue is blocked. The blocked record is transferred from the record queue to the output bypass queue. | 03-06-2014 |
20140137215 | DATA FLOW PROCESSING IN A NETWORK ENVIRONMENT - Described are a system and method for managing a data exchange in a network environment. A flowtag is assigned to a data packet at a source device. The flowtag includes a port identification corresponding to a port at an aggregation device. A destination device is in communication with the port at the aggregation device. The data packet is authenticated at the aggregation device. The data packet is output from the source device to the destination device via the aggregation device according to the port identification in the flowtag of the authenticated data packet. | 05-15-2014 |