Patent application number | Description | Published |
20110320869 | HOMOGENEOUS RECOVERY IN A REDUNDANT MEMORY SYSTEM - Providing homogeneous recovery in a redundant memory system that includes a memory controller, a plurality of memory channels in communication with the memory controller, an error detection code mechanism configured for detecting a failing memory channel, and an error recovery mechanism. The error recovery mechanism is configured for receiving notification of the failing memory channel, for blocking off new operations from starting on the memory channels, for completing any pending operations on the memory channels, for performing a recovery operation on the memory channels and for starting the new operations on at least a first subset of the memory channels. The memory system is capable of operating with the first subset of the memory channels. | 12-29-2011 |
20110320872 | HIERARCHICAL ERROR INJECTION FOR COMPLEX RAIM/ECC DESIGN - A computer-implemented method for verifying a RAIM/ECC design using a hierarchical injection scheme that includes selecting marks for generating an error mask, selecting a fixed bit mask based on the selected marks, determining whether to inject errors into at least one of a marked channel and at least one marked chip of a channel; and randomly injecting errors into the at least one of the marked channel and the at least one marked chip when determined. | 12-29-2011 |
20110320914 | ERROR CORRECTION AND DETECTION IN A REDUNDANT MEMORY SYSTEM - Error correction and detection in a redundant memory system that includes a memory controller; a plurality of memory channels in communication with the memory controller, the memory channels including a plurality of memory devices; a cyclical redundancy code (CRC) mechanism for detecting that one of the memory channels has failed, and for marking the memory channel as a failing memory channel; and an error correction code (ECC) mechanism. The ECC is configured for ignoring the marked memory channel and for detecting and correcting additional memory device failures on memory devices located on one or more of the other memory channels, thereby allowing the memory system to continue to run unimpaired in the presence of the memory channel failure. | 12-29-2011 |
20110320918 | ERROR CORRECTION AND DETECTION IN A REDUNDANT MEMORY SYSTEM - Error correction and detection in a redundant memory system including a a computer implemented method that includes receiving data including error correction code (ECC) bits, the receiving from a plurality of channels, each channel comprising a plurality of memory devices at memory device locations. The method also includes computing syndromes of the data; receiving a channel identifier of one of the channels; and removing a contribution of data received on the channel from the computed syndromes, the removing resulting in channel adjusted syndromes. The channel adjusted syndromes are decoded resulting in channel adjusted memory device locations of failing memory devices, the channel adjusted memory device locations corresponding to memory device locations. | 12-29-2011 |
20120173936 | CHANNEL MARKING FOR CHIP MARK OVERFLOW AND CALIBRATION ERRORS - Marking memory chips as faulty when a fault is detected in data from the memory chip. Upon detecting that a plurality of memory chips are faulty, determining which of a plurality of memory channels contains the faulty memory chips. Marking one of a plurality of memory channels as failing in response to determining that the number of failing memory chips has exceeded a threshold. | 07-05-2012 |
20120198309 | CORRECTING MEMORY DEVICE AND MEMORY CHANNEL FAILURES IN THE PRESENCE OF KNOWN MEMORY DEVICE FAILURES - Correcting memory device (chip) and memory channel failures in the presence of known memory device failures. A memory channel failure is located and corrected, or alternatively up to c chip failures are corrected and up to d chip failures are detected in the presence of up to u chips that are marked as suspect. A first stage of decoding is performed that results in recovering an estimate of correctable errors affecting the data or in declaring an uncorrectable error state. When an uncorrectable error state is declared, a second stage of decoding is performed to attempt to correct u erasures and a channel error in M iterations where the channel location is changed in each iteration. A correctable error is declared in response to exactly one of the M iterations being successful. | 08-02-2012 |
20130047040 | CHANNEL MARKING FOR CHIP MARK OVERFLOW AND CALIBRATION ERRORS - Marking memory chips as faulty when a fault is detected in data from the memory chip. Upon detecting that a plurality of memory chips are faulty, determining which of a plurality of memory channels contains the faulty memory chips. Marking one of a plurality of memory channels as failing in response to determining that the number of failing memory chips has exceeded a threshold. | 02-21-2013 |
20130191682 | HOMOGENEOUS RECOVERY IN A REDUNDANT MEMORY SYSTEM - A computer implemented method for providing homogeneous recovery in a redundant memory system. The method includes receiving a notification that a memory channel has failed, where the memory channel is one of a plurality of memory channels in a memory system. New operations are blocked from starting on the memory channels in response to the notification, and any pending operations on the memory channels are completed in response to the notification. A recovery operation is performed on the memory channels in response to the completing. The new operations are started on at least a first subset of the memory channels in response to the recovery operation completing. The memory system is configured to operate with the first subset of the memory channels. | 07-25-2013 |
20130191683 | HETEROGENEOUS RECOVERY IN A REDUNDANT MEMORY SYSTEM - Providing heterogeneous recovery in a redundant memory system that includes a memory controller, a plurality of memory channels in communication with the memory controller, an error detection code mechanism configured for detecting a failing memory channel, and an error recovery mechanism. The error recovery mechanism is configured for receiving notification of the failing memory channel, for performing a recovery operation on the failing memory channel while other memory channels are performing normal system operations, for bringing the recovered channel back into operational mode with the other memory channels for store operations, for continuing to mark the recovered channel to guard against stale data, for removing any stale data after the recovery operation is complete, and for removing the mark on the recovered channel to allow the normal system operations with all of the memory channels, the removing based on the removing any stale data being complete. | 07-25-2013 |
20130191685 | PER-RANK CHANNEL MARKING IN A MEMORY SYSTEM - Channel marking is provided in a memory system that includes a memory channel with a plurality of memory devices. The memory devices are arranged into a first group of memory devices and a second group of memory devices. The memory system is configured to perform a method that includes determining that more than a threshold number of memory devices in the first group are failing. An error correction code (ECC) is configured to compensate for errors associated with memory devices in the first group on the memory channel and to perform error correction on errors associated with memory devices in the second group on the memory channel. | 07-25-2013 |
20130191698 | HIERARCHICAL CHANNEL MARKING IN A MEMORY SYSTEM - Channel marking is provided in a memory system that includes a first memory channel, a second memory channel, and error correction code (ECC) logic. The memory system is configured to perform a method that includes receiving a request to apply a first channel mark to the first memory channel and determining a priority level of the first channel mark. A request is received to apply a second channel mark to the second memory channel, and a priority level of the second mark is determined. It is determined that the priority level of the first channel mark is higher than the priority level of the second channel mark. The first channel mark is supplied to the ECC logic while blocking the second channel mark from the ECC logic. | 07-25-2013 |
20130191703 | DYNAMIC GRADUATED MEMORY DEVICE PROTECTION IN REDUNDANT ARRAY OF INDEPENDENT MEMORY (RAIM) SYSTEMS - Dynamic graduated memory device protection in redundant array of independent memory (RAIM) systems that include a plurality of memory devices is provided. A first severity level of a first failing memory device in the plurality of memory devices is determined. The first failing memory device is associated with an identifier used to communicate a location of the first failing memory device to an error correction code (ECC). A second severity level of a second failing memory device in the plurality of memory devices is determined. It is determined that the second severity level is higher than the first severity level. The identifier from the first failing memory device is removed based on determining that the second severity level is higher than the first severity level. The identifier is applied to the second failing memory device based on determining that the second severity level is higher than the first severity level. | 07-25-2013 |
20140101481 | PER-RANK CHANNEL MARKING IN A MEMORY SYSTEM - Channel marking is provided in a memory system that includes a memory channel with a plurality of memory devices. The memory devices are arranged into a first group of memory devices and a second group of memory devices. The memory system is configured to perform a method that includes determining that more than a threshold number of memory devices in the first group are failing. An error correction code (ECC) is configured to compensate for errors associated with memory devices in the first group on the memory channel and to perform error correction on errors associated with memory devices in the second group on the memory channel. | 04-10-2014 |
20140101518 | DYNAMIC GRADUATED MEMORY DEVICE PROTECTION IN REDUNDANT ARRAY OF INDEPENDENT MEMORY (RAIM) SYSTEMS - Dynamic graduated memory device protection in redundant array of independent memory (RAIM) systems that include a plurality of memory devices is provided. A first severity level of a first failing memory device in the plurality of memory devices is determined. The first failing memory device is associated with an identifier used to communicate a location of the first failing memory device to an error correction code (ECC). A second severity level of a second failing memory device in the plurality of memory devices is determined. It is determined that the second severity level is higher than the first severity level. The identifier from the first failing memory device is removed based on determining that the second severity level is higher than the first severity level. The identifier is applied to the second failing memory device based on determining that the second severity level is higher than the first severity level. | 04-10-2014 |
20140310570 | STALE DATA DETECTION IN MARKED CHANNEL FOR SCRUB - Embodiments relate to stale data detection in a marked channel for a scrub. An aspect includes bringing the marked channel online, wherein the computer comprises a plurality of memory channels comprising the marked channel and a remaining plurality of unmarked channels. Another aspect includes performing a scrub read of an address in the plurality of memory channels. Another aspect includes determining whether data returned by the scrub read from the marked channel is valid or stale based on data returned from the unmarked channels by the scrub read. Another aspect includes based on determining that the data returned by the scrub read from the marked channel is valid, not performing a scrub writeback to the marked channel. Another aspect includes based on determining that the data returned by the scrub read from the marked channel is stale, performing a scrub writeback of corrected data to the marked channel. | 10-16-2014 |
20150019905 | STALE DATA DETECTION IN MARKED CHANNEL FOR SCRUB - Embodiments relate to stale data detection in a marked channel for a scrub. An aspect includes bringing the marked channel online, wherein the computer comprises a plurality of memory channels comprising the marked channel and a remaining plurality of unmarked channels. Another aspect includes performing a scrub read of an address in the plurality of memory channels. Another aspect includes determining whether data returned by the scrub read from the marked channel is valid or stale based on data returned from the unmarked channels by the scrub read. Another aspect includes based on determining that the data returned by the scrub read from the marked channel is valid, not performing a scrub writeback to the marked channel. Another aspect includes based on determining that the data returned by the scrub read from the marked channel is stale, performing a scrub writeback of corrected data to the marked channel. | 01-15-2015 |