Patent application number | Description | Published |
20130265309 | PATCHED SHADING IN GRAPHICS PROCESSING - Aspects of this disclosure generally relate to a process for rendering graphics that includes performing, with a hardware shading unit of a graphics processing unit (GPU) designated for vertex shading, vertex shading operations to shade input vertices so as to output vertex shaded vertices, wherein the hardware unit is configured to receive a single vertex as an input and generate a single vertex as an output. The process also includes performing, with the hardware shading unit of the GPU, a geometry shading operation to generate one or more new vertices based on one or more of the vertex shaded vertices, wherein the geometry shading operation operates on at least one of the one or more vertex shaded vertices to output the one or more new vertices. | 10-10-2013 |
20140040552 | MULTI-CORE COMPUTE CACHE COHERENCY WITH A RELEASE CONSISTENCY MEMORY ORDERING MODEL - A method includes storing, with a first programmable processor, shared variable data to cache lines of a first cache of the first processor. The method further includes executing, with the first programmable processor, a store-with-release operation, executing, with a second programmable processor, a load-with-acquire operation, and loading, with the second programmable processor, the value of the shared variable data from a cache of the second programmable processor. | 02-06-2014 |
20140204080 | INDEXED STREAMOUT BUFFERS FOR GRAPHICS PROCESSING - A graphics processing unit (GPU) includes an indexed streamout buffer. The indexed streamout buffer is configured to: receive vertex data of a primitive, and determine if any entries in a reuse table of the indexed streamout buffer reference the vertex data. Responsive to determining that an entry of in the reuse table references the vertex data, the buffer is further configured to: generate an index that references the vertex data, store the index in the buffer, and store a reference to the index in the reuse table. Responsive to determining that an entry does not reference the vertex data, the indexed streamout buffer is configured to: store the vertex data in the buffer, generate an index that references the vertex data, store the index in the buffer, and store a reference to the index in the reuse table. | 07-24-2014 |
20140267259 | TILE-BASED RENDERING - This disclosure describes techniques for using bounding regions to perform tile-based rendering with a graphics processing unit (GPU) that supports an on-chip, tessellation-enabled graphics rendering pipeline. Instead of generating binning data based on rasterized versions of the actual primitives to be rendered, the techniques of this disclosure may generate binning data based on a bounding region that encompasses one or more of the primitives to be rendered. Moreover, the binning data may be generated based on data that is generated by at least one tessellation processing stage of an on-chip, tessellation-enabled graphics rendering pipeline that is implemented by the GPU. The techniques of this disclosure may, in some examples, be used to improve the performance of an on-chip, tessellation-enabled GPU when performing tile-based rendering without sacrificing the quality of the resulting rendered image. | 09-18-2014 |
20150070369 | FAULT-TOLERANT PREEMPTION MECHANISM AT ARBITRARY CONTROL POINTS FOR GRAPHICS PROCESSING - This disclosure presents techniques and structures for preemption at arbitrary control points in graphics processing. A method of graphics processing may comprise executing commands in a command buffer, the commands operating on data in a read-modify-write memory resource, double buffering the data in the read-modify-write memory resource, such that a first buffer stores original data of the read-modify-write memory resource and a second buffer stores any modified data produced by executing the commands in the command buffer, receiving a request to preempt execution of the commands in the command buffer before completing all commands in the command buffer, and restarting execution of the commands at the start of the command buffer using the original data in the first buffer. | 03-12-2015 |
20150089146 | CONDITIONAL PAGE FAULT CONTROL FOR PAGE RESIDENCY - The present disclosure provides for systems and methods to process a non-resident page that may include attempting to access the non-resident page, an address for the non-resident page pointing to a memory page containing default values, determining that the non-resident page should not cause a page fault based on an indicator indicating that a particular non-resident page should not generate a page fault, returning an indication that a memory read did not translate and returning the default value when the access of the non-resident page is a read and the non-resident page should not cause a page fault. Another example may discontinue a write when the access of the non-resident page is a write and the non-resident page should not cause a page fault. | 03-26-2015 |
20150109293 | SELECTIVELY MERGING PARTIALLY-COVERED TILES TO PERFORM HIERARCHICAL Z-CULLING - This disclosure describes techniques for performing hierarchical z-culling in a graphics processing system. In some examples, the techniques for performing hierarchical z-culling may involve selectively merging partially-covered source tiles for a tile location into a fully-covered merged source tile based on whether conservative farthest z-values for the partially-covered source tiles are nearer than a culling z-value for the tile location, and using a conservative farthest z-value associated with the fully-covered merged source tile to update the culling z-value for the tile location. In further examples, the techniques for performing hierarchical z-culling may use a cache unit that is not associated with an underlying memory to store conservative farthest z-values and coverage masks for merged source tiles. The capacity of the cache unit may be smaller than the size of cache needed to store merged source tile data for all of the tile locations in a render target. | 04-23-2015 |
20150317157 | TECHNIQUES FOR SERIALIZED EXECUTION IN A SIMD PROCESSING SYSTEM - A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation. | 11-05-2015 |