Patent application number | Description | Published |
20090193427 | MANAGING PARALLEL DATA PROCESSING JOBS IN GRID ENVIRONMENTS - Method, system, and computer program product for managing parallel data processing jobs in grid environments are provided. A request to deploy a parallel data processing job in a grid environment is received. A plurality of resource nodes in the grid environment are dynamically allocated to the parallel data processing job. A configuration file is automatically generated for the parallel data processing job based on the allocated resource nodes. The parallel data processing job is then executed in the grid environment using the generated configuration file. | 07-30-2009 |
20110295867 | KEY-BREAK AND RECORD-LOOP PROCESSING IN PARALLEL DATA TRANSFORMATION - Embodiments of the invention provide a method and apparatus for providing additional functionality to a data processing program. This is achieved by various means, including preprocessing records in a data volume, designating certain records with a key-break, and creating an aggregation structure that user programs may use to store previously-processed records from the data volume. | 12-01-2011 |
20120259875 | KEY-BREAK AND RECORD-LOOP PROCESSING IN PARALLEL DATA TRANSFORMATION - Embodiments of the invention provide a method and apparatus for providing additional functionality to a data processing program. This is achieved by various means, including preprocessing records in a data volume, designating certain records with a key-break, and creating an aggregation structure that user programs may use to store previously-processed records from the data volume. | 10-11-2012 |
20140007121 | LIGHT WEIGHT WORKLOAD MANAGEMENT SERVER INTEGRATION | 01-02-2014 |
20140280441 | DATA INTEGRATION ON RETARGETABLE ENGINES IN A NETWORKED ENVIRONMENT - Techniques are disclosed for data integration on retargetable engines in a networked environment. The networked environment includes data processing engines of different types and having different sets of characteristics. A request is received to execute a data flow model in the networked environment. The data flow model includes data flow objects. A first data processing engine is programmatically selected based on a predefined set of criteria and the sets of characteristics of the data processing engines. The data flow model is executed using the selected data processing engine and responsive to the request. | 09-18-2014 |
20140281704 | DEPLOYING PARALLEL DATA INTEGRATION APPLICATIONS TO DISTRIBUTED COMPUTING ENVIRONMENTS - System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system. | 09-18-2014 |
20140282563 | DEPLOYING PARALLEL DATA INTEGRATION APPLICATIONS TO DISTRIBUTED COMPUTING ENVIRONMENTS - System, method, and computer program product to process parallel computing tasks on a distributed computing system, by computing an execution plan for a parallel computing job to be executed on the distributed computing system, the distributed computing system comprising a plurality of compute nodes, generating, based on the execution plan, an ordered set of tasks, the ordered set of tasks comprising: (i) configuration tasks, and (ii) execution tasks for executing the parallel computing job on the distributed computing system, and launching a distributed computing application to assign the tasks of the ordered set of tasks to the plurality of compute nodes to execute the parallel computing job on the distributed computing system. | 09-18-2014 |
20140282604 | QUALIFIED CHECKPOINTING OF DATA FLOWS IN A PROCESSING ENVIRONMENT - Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed. | 09-18-2014 |
20140282605 | QUALIFIED CHECKPOINTING OF DATA FLOWS IN A PROCESSING ENVIRONMENT - Techniques are disclosed for qualified checkpointing of a data flow model having data flow operators and links connecting the data flow operators. A link of the data flow model is selected based on a set of checkpoint criteria. A checkpoint is generated for the selected link. The checkpoint is selected from different checkpoint types. The generated checkpoint is assigned to the selected link. The data flow model, having at least one link with no assigned checkpoint, is executed. | 09-18-2014 |
20140325519 | VARIABLE WAIT TIME IN AN ASYNCHRONOUS CALL-BACK SYSTEM - A method includes a workload management (WLM) server that receives a first CHECK WORKLOAD command for a workload in a queue of the WLM server. The server determines whether the workload is ready to run on a WLM client. If the workload is not ready to run, a wait time for the workload with the WLM server is dynamically estimated and sent to the WLM client. If the workload is ready to run, a response is sent to the WLM client indicating that the workload is ready to run. | 10-30-2014 |
20150052530 | TASK-BASED MODELING FOR PARALLEL DATA INTEGRATION - System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment. | 02-19-2015 |
20150074669 | TASK-BASED MODELING FOR PARALLEL DATA INTEGRATION - System, method, and computer program product to perform an operation for task-based modeling for parallel data integration, by determining, for a data flow, a set of processing units, each of the set of processing units defining one or more data processing operations to process the data flow, generating a set of tasks to represent the set of processing units, each task in the set of tasks comprising one or more of the data processing operations of the set of processing units, optimizing the set of tasks based on a set of characteristics of the data flow, and generating a composite execution plan based on the optimized set of tasks to process the data flow in a distributed computing environment. | 03-12-2015 |
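
The CHECK WORKLOAD exchange described in application 20140325519 can be illustrated with a minimal Python sketch. Everything here is a hypothetical illustration, not the method claimed in the application: the class name `WlmServer`, the `slots` and `avg_run_seconds` parameters, and in particular the wait-time formula are assumptions, since the abstract does not say how the wait time is estimated.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WlmServer:
    """Hypothetical workload management (WLM) server; all names are illustrative."""
    slots: int                      # assumed number of workloads that may run at once
    avg_run_seconds: float = 30.0   # assumed average run time, used only for the estimate
    queue: deque = field(default_factory=deque)
    running: set = field(default_factory=set)

    def submit(self, workload_id):
        """Place a workload in the server queue."""
        self.queue.append(workload_id)

    def check_workload(self, workload_id):
        """Handle a CHECK WORKLOAD command for a queued workload.

        Returns a READY response if a slot is free for this workload,
        otherwise a WAIT response carrying an estimated wait time.
        Raises ValueError if the workload is not in the queue.
        """
        position = self.queue.index(workload_id)
        free = self.slots - len(self.running)
        if position < free:
            self.queue.remove(workload_id)
            self.running.add(workload_id)
            return {"status": "READY"}
        # Assumed estimate: workloads that must finish first, spread over all slots.
        ahead = position - free + 1
        wait = self.avg_run_seconds * ahead / self.slots
        return {"status": "WAIT", "wait_seconds": wait}

    def finish(self, workload_id):
        """Mark a running workload as finished, freeing its slot."""
        self.running.discard(workload_id)


server = WlmServer(slots=1)
server.submit("job-a")
server.submit("job-b")
first = server.check_workload("job-a")    # a slot is free, so job-a may run
second = server.check_workload("job-b")   # slot is taken, so the client is told to wait
server.finish("job-a")
third = server.check_workload("job-b")    # slot is free again
```

In this sketch the client would sleep for `wait_seconds` before sending the next CHECK WORKLOAD command, rather than polling at a fixed interval. Because the estimate is recomputed on each call, the returned wait time changes as the queue drains.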