Patent application number | Description | Published |
20120303576 | SYNCHRONOUS REPLICATION IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to synchronously replicating data in a distributed computing environment. To achieve synchronous replication, both an eventual-consistency approach and a strong-consistency approach are contemplated. Received data may be written to a log of a primary data store for eventual committal. The data may then be annotated with a record, such as a unique identifier, which facilitates the replay of the data at a secondary data store. Upon receiving an acknowledgment that the secondary data store has written the data to a log, the primary data store may commit the data and communicate an acknowledgment of success back to the client. In a strong-consistency approach, the primary data store may wait to send an acknowledgment of success to the client until it receives an acknowledgment that the secondary has not only written, but also committed, the data. | 11-29-2012 |
20120303577 | ASYNCHRONOUS REPLICATION IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to asynchronously replicating data in a distributed computing environment. To achieve asynchronous replication, data received at a primary data store may be annotated with information, such as an identifier of the data. The annotated data may then be communicated to a secondary data store, which may then write the data and annotated information to one or more logs for eventual replay and committal at the secondary data store. The primary data store may communicate an acknowledgment of success in committing the data at the primary data store as well as of success in writing the data to the secondary data store. Additional embodiments may include committing the data at the secondary data store in response to receiving an instruction that authorizes committal of data through an identifier. | 11-29-2012 |
20120303578 | Versioned And Hierarchical Data Structures And Distributed Transactions - Presented herein are methods of replicating versioned and hierarchical data structures, as well as data structures representing complex transactions. Due to interdependencies between data entities and a lack of guaranteed message ordering, simple replication methods employed for simple data types cannot be used. Operations on data structures exhibit dependencies between the messages making up the operations. This strategy can be extended to various types of complex transactions by considering certain messages to depend on other messages or on the existence of other entries at the data store. Regardless of origin, these dependencies can be enforced by suspending the processing of messages with unsatisfied dependencies until all of their dependencies have been met. Alternately, transactions can be committed immediately, creating entities that include versioned identifiers for each of their dependencies. These entities can then be garbage collected if the parent objects are not subsequently created. | 11-29-2012 |
20120303581 | REPLICATION PROCESSES IN A DISTRIBUTED STORAGE ENVIRONMENT - Embodiments of the present invention relate to systems, methods, and computer storage media for replicating data in a distributed computing environment utilizing a combination of replication methodologies. A full-object replication may be utilized to replicate a full state of an object from a primary data store to a secondary data store. A checkpoint created after initiating the full-object replication may be parsed to identify changes to the object that have been entered since initiating the full-object replication. This replication process is referred to as a delta-checkpoint replication methodology. Additionally, in an embodiment, a log-based replication methodology may be utilized. The log-based replication may communicate data from a log of the primary data store to the secondary data store. It is also contemplated in an exemplary embodiment that when the log-based replication fails to maintain a throughput threshold, one of the other replication methodologies may be initiated, at least temporarily. | 11-29-2012 |
20120303593 | Geo-Verification And Repair - Presented herein are methods of continuously verifying data and repairing errors introduced during replication. In a particular embodiment, a primary data store sends out information sufficient to create a checkpoint together with a checksum for the data being verified at that checkpoint. At the secondary data store, a checkpoint is created in accordance with the checkpointing information, and a checksum is calculated over the indicated data at the created checkpoint. If the calculated checksum disagrees with the received checksum, additional checksums are calculated over subranges of the indicated data and compared with corresponding checksums over the data at the primary data store. The checksums at the primary data store may be requested from the primary data store or calculated locally based on the received overall checksum. Once an erroneous entry is identified, it can then be re-replicated from the primary data store to restore data consistency. | 11-29-2012 |
20120303791 | LOAD BALANCING WHEN REPLICATING ACCOUNT DATA - Embodiments of the present invention relate to invoking and managing load-balancing operation(s) applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the load-balancing operation(s) are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are located in geographically distinct areas and are equipped to replicate the storage account's data therebetween. The load-balancing operation(s) include splitting partitions into child partitions upon detecting an increased workload as a result of active replication, merging partitions to form parent partitions upon detecting a reduction in workload as a result of decreased processing-related resource consumption, or offloading partitions based on resource consumption. A service within a partition layer of the storage stamps is responsible for determining when to invoke these load-balancing operation(s). | 11-29-2012 |
20120303912 | STORAGE ACCOUNT MIGRATION BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing migration operations applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the migration operations are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are equipped to replicate the storage account's data therebetween upon initiating a migration. Upon substantial completion of a bootstrapping phase of replication, one migration operation includes designating the secondary storage stamp as a new primary storage stamp such that the destination partitions commence processing client requests, sending resultant transactions to the source partitions, and providing read and write access thereto. Another migration operation includes designating the primary storage stamp as a new secondary storage stamp such that the source partitions commence replaying the transactions. | 11-29-2012 |
20120303999 | IMPLEMENTING FAILOVER PROCESSES BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing a failover of a storage account between partitions within a distributed computing environment, where each partition represents a key range of data for the storage account. The partitions affected by the failover include source partitions hosted on a primary storage stamp and destination partitions hosted on a secondary storage stamp, where the storage account's data is being actively replicated from the primary to the secondary storage stamp. Upon receiving a manual or automatic indication to perform the failover, the source partitions are configured to independently perform flush-send operations (e.g., distributing pending messages as a group), and the destination partitions are then configured to independently perform flush-replay operations (e.g., aggressively replaying currently pending transactions). Upon completion of the flush-replay operations, the secondary storage stamp is designated as a new primary storage stamp such that live traffic is directed to the new primary storage stamp. | 11-29-2012 |
20140258499 | LOAD BALANCING WHEN REPLICATING ACCOUNT DATA - Embodiments of the present invention relate to invoking and managing load-balancing operation(s) applied to partitions within a distributed computing environment, where each partition represents a key range of data for a storage account. The partitions affected by the load-balancing operation(s) are source partitions hosted on a primary storage stamp and/or destination partitions hosted on a secondary storage stamp, where the primary and secondary storage stamps are located in geographically distinct areas and are equipped to replicate the storage account's data therebetween. The load-balancing operation(s) include splitting partitions into child partitions upon detecting an increased workload as a result of active replication, merging partitions to form parent partitions upon detecting a reduction in workload as a result of decreased processing-related resource consumption, or offloading partitions based on resource consumption. A service within a partition layer of the storage stamps is responsible for determining when to invoke these load-balancing operation(s). | 09-11-2014 |
20140289554 | IMPLEMENTING FAILOVER PROCESSES BETWEEN STORAGE STAMPS - Embodiments of the present invention relate to invoking and managing a failover of a storage account between partitions within a distributed computing environment, where each partition represents a key range of data for the storage account. The partitions affected by the failover include source partitions hosted on a primary storage stamp and destination partitions hosted on a secondary storage stamp, where the storage account's data is being actively replicated from the primary to the secondary storage stamp. Upon receiving a manual or automatic indication to perform the failover, the source partitions are configured to independently perform flush-send operations (e.g., distributing pending messages as a group), and the destination partitions are then configured to independently perform flush-replay operations (e.g., aggressively replaying currently pending transactions). Upon completion of the flush-replay operations, the secondary storage stamp is designated as a new primary storage stamp such that live traffic is directed to the new primary storage stamp. | 09-25-2014 |
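
Several of the mechanisms described above are compact enough to sketch in code. First, a minimal sketch of the synchronous-replication acknowledgment flow of 20120303576; the `DataStore` class and its method names are illustrative assumptions, not patent terminology. The point it shows: the primary withholds the client's success acknowledgment until the secondary has written the data (eventual-consistency mode) or both written and committed it (strong-consistency mode).

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict

_ids = count()  # source of the unique identifiers used to annotate data


@dataclass
class DataStore:
    log: Dict[int, bytes] = field(default_factory=dict)        # write-ahead log
    committed: Dict[int, bytes] = field(default_factory=dict)  # committed state

    def write_to_log(self, record_id: int, data: bytes) -> None:
        self.log[record_id] = data

    def commit(self, record_id: int) -> None:
        self.committed[record_id] = self.log[record_id]  # replay the log entry


def synchronous_write(primary: DataStore, secondary: DataStore,
                      data: bytes, strong: bool) -> str:
    record_id = next(_ids)                   # annotation that enables replay
    primary.write_to_log(record_id, data)
    secondary.write_to_log(record_id, data)  # secondary acks once data is logged
    if strong:
        secondary.commit(record_id)          # strong mode: also wait for committal
    primary.commit(record_id)                # only now commit at the primary...
    return "success"                         # ...and acknowledge the client
```

In a real system the calls to the secondary would be remote and the acknowledgments asynchronous; the linear call order here just makes the ordering constraint visible.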
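By contrast, a sketch of the asynchronous flow of 20120303577, reusing the toy `DataStore` class above (`authorize_committal` is an assumed name): the primary commits and acknowledges right away, and committal at the secondary waits for a later instruction carrying an identifier.

```python
def asynchronous_write(primary: DataStore, secondary: DataStore,
                       data: bytes) -> int:
    record_id = next(_ids)                   # annotate the data with an identifier
    primary.write_to_log(record_id, data)
    primary.commit(record_id)                # commit locally without waiting
    secondary.write_to_log(record_id, data)  # logged remotely, not yet committed
    # The success acknowledgment covers the local commit and the remote write.
    return record_id


def authorize_committal(secondary: DataStore, up_to_id: int) -> None:
    # A later instruction authorizes replay and committal up to an identifier.
    for record_id in sorted(secondary.log):
        if record_id <= up_to_id:
            secondary.commit(record_id)
```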
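For 20120303578, a sketch of dependency-gated replay under assumed `Message` and `ReplayQueue` shapes: a message whose dependencies are unsatisfied is parked, and applying any message re-checks whether parked messages have become applicable.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Message:
    entity_id: str               # entity this message creates or updates
    depends_on: FrozenSet[str]   # entities that must already exist at the store


class ReplayQueue:
    def __init__(self) -> None:
        self.store: set = set()          # entity ids present at the data store
        self.parked: List[Message] = []  # messages awaiting their dependencies

    def receive(self, msg: Message) -> None:
        if msg.depends_on <= self.store:
            self._apply(msg)
        else:
            self.parked.append(msg)      # suspend until dependencies are met

    def _apply(self, msg: Message) -> None:
        self.store.add(msg.entity_id)
        # Applying a message may satisfy the dependencies of parked messages.
        ready = [m for m in self.parked if m.depends_on <= self.store]
        self.parked = [m for m in self.parked if m not in ready]
        for m in ready:
            self._apply(m)
```

The abstract's alternative, committing immediately with versioned identifiers and garbage-collecting entities whose parent objects never arrive, trades this suspension for later cleanup.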
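A sketch of the methodology switch contemplated in 20120303581; the threshold value and the three `ship_*` callables are placeholders standing in for whatever transport the system uses.

```python
THROUGHPUT_THRESHOLD = 100  # log entries per second; an illustrative figure


def replicate(obj, log_entries, observed_rate,
              ship_log_entry, ship_full_object, ship_delta_checkpoint):
    if observed_rate >= THROUGHPUT_THRESHOLD:
        for entry in log_entries:        # log-based replication is keeping up
            ship_log_entry(entry)
    else:
        # Fall back, at least temporarily: send the object's full state, then
        # a delta parsed from a checkpoint created after the full-object
        # replication was initiated.
        checkpoint = ship_full_object(obj)
        ship_delta_checkpoint(checkpoint, log_entries)
```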
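The verify-and-repair loop of 20120303593 can be sketched as a checksum comparison that subdivides on mismatch until the erroneous entries are isolated, then re-replicates just those entries. The hashing choice and helper names are assumptions; stores are modeled as plain dicts.

```python
import hashlib


def checksum(entries: dict) -> str:
    h = hashlib.sha256()
    for key in sorted(entries):
        h.update(key.encode())
        h.update(entries[key])
    return h.hexdigest()


def find_bad_keys(primary: dict, secondary: dict, keys: list) -> list:
    """Return the keys whose data disagrees between the two stores."""
    sub_p = {k: primary[k] for k in keys}
    sub_s = {k: secondary.get(k, b"") for k in keys}
    if checksum(sub_p) == checksum(sub_s):
        return []                    # this range verifies clean
    if len(keys) == 1:
        return list(keys)            # isolated an erroneous entry
    mid = len(keys) // 2             # disagreement: subdivide and recurse
    return (find_bad_keys(primary, secondary, keys[:mid]) +
            find_bad_keys(primary, secondary, keys[mid:]))


def repair(primary: dict, secondary: dict) -> None:
    for key in find_bad_keys(primary, secondary, sorted(primary)):
        secondary[key] = primary[key]  # re-replicate to restore consistency
```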
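For the load-balancing decisions of 20120303791 and 20140258499, a sketch of the partition-layer service's choice among split, merge, and offload. The load thresholds and the `Partition` shape are illustrative assumptions.

```python
from dataclasses import dataclass

SPLIT_LOAD, OFFLOAD_LOAD, MERGE_LOAD = 0.8, 0.6, 0.2  # assumed capacity fractions


@dataclass
class Partition:
    low: str       # inclusive low key of the partition's key range
    high: str      # exclusive high key
    load: float    # observed fraction of server capacity consumed


def plan_actions(partitions):
    """Return per-partition decisions for a key-sorted list of partitions."""
    actions, prev = [], None
    for p in partitions:
        if p.load > SPLIT_LOAD:
            actions.append(("split", p))        # hot, e.g. from active replication
        elif p.load > OFFLOAD_LOAD:
            actions.append(("offload", p))      # move to a less loaded server
        elif prev is not None and prev.load + p.load < MERGE_LOAD:
            actions.append(("merge", prev, p))  # cold neighbors form a parent
        prev = p
    return actions
```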
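A sketch of the designation swap in 20120303912, with storage stamps and roles modeled as plain dicts (all names assumed):

```python
def migrate_account(account: str, roles: dict, stamps: dict) -> None:
    """roles maps 'primary'/'secondary' to stamp names; stamps maps each
    stamp name to an {account: partition_data} dict."""
    src, dst = roles["primary"], roles["secondary"]

    # Bootstrapping phase: replicate the account's existing key ranges from
    # the primary stamp to the secondary stamp.
    stamps[dst][account] = dict(stamps[src][account])

    # Once bootstrapping is substantially complete, swap designations: the
    # destination partitions start serving client requests and send resultant
    # transactions back to the source partitions, which replay them as the
    # new secondary.
    roles["primary"], roles["secondary"] = dst, src
```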
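Finally, a sketch of the failover sequence shared by 20120303999 and 20140289554, with each partition modeled as a dict of message lists (shapes assumed): flush-send, then flush-replay, then redesignation.

```python
def failover(source_partitions, dest_partitions, roles):
    # Flush-send: each source partition independently distributes its pending
    # outbound messages as a group to its destination partition.
    for src, dst in zip(source_partitions, dest_partitions):
        dst["pending"].extend(src["outbound"])
        src["outbound"].clear()

    # Flush-replay: each destination partition independently and aggressively
    # replays all currently pending transactions.
    for dst in dest_partitions:
        while dst["pending"]:
            dst["committed"].append(dst["pending"].pop(0))

    # Designate the secondary stamp as the new primary so that live traffic
    # is directed to it.
    roles["primary"], roles["secondary"] = roles["secondary"], roles["primary"]
```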