Patent application number | Description | Published |
20110047525 | QUALITY-DRIVEN ETL DESIGN OPTIMIZATION - A method for quality objective-based ETL pipeline optimization is provided. An improvement objective is obtained from user input into a computing system. The improvement objective represents a priority optimization desired by a user for improved ETL flows for an application designed to run in memory of the computing system. An ETL flow is created in the memory of the computing system. The ETL flow is restructured for flow optimization with a processor of the computing system. The flow restructuring is based on the improvement objective. Flow restructuring can include application of flow rewriting optimization or application of an algebraic rewriting optimization. The optimized ETL flow is stored as executable code on a computer readable storage medium. | 02-24-2011 |
20110209149 | OPTIMIZATION OF INTEGRATION FLOW PLANS - Computer-based methods, computer-readable storage media and computer systems are provided for optimizing integration flow plans. An initial integration flow plan, one or more objectives and/or an objective function related to the one or more objectives may be received as input. A computing cost of the initial integration flow plan may be compared with the objective function. Using one or more heuristics, a set of close-to-optimal integration flow plans may be identified from all possible integration flow plans that are functionally equivalent to the initial integration flow plan. A close-to-optimal integration flow plan with a lowest computing cost may be selected from the set as a replacement for the initial integration flow plan. | 08-25-2011 |
20110264667 | COLUMN-ORIENTED STORAGE IN A ROW-ORIENTED DATABASE MANAGEMENT SYSTEM - Systems, methods, and computer-readable storage media are provided for column-oriented storage in a row-oriented database management system. Data may be provided in one or more columns, each datum associated with a position within a column. A list may be created of one or more records per column, each record including a plurality of values stored in an order of position within the column and a first positional indicator. An index may be created to access a value stored in a record, wherein the index includes an index parameter derived from each record in the list and the index parameters are ordered in accordance with an order of records in the list. | 10-27-2011 |
20120072391 | APPARATUS AND METHOD FOR AN AUTOMATIC INFORMATION INTEGRATION FLOW OPTIMIZER - An apparatus and method provides automatic information integration flow optimization. The apparatus may include an input/output port connecting the information integration flow optimizer to extract-transform-load tools. The information integration flow optimizer includes a parser unit to create a tool-agnostic input file containing rich semantics, a converter to transform the tool-agnostic input file into an input DAG, and a QoX-driven optimizer applying one or more heuristic algorithms to the input DAG to develop an optimum information integration flow design based on the rich semantics. The method may include receiving a tool-specific input file representing a physical information integration flow, parsing and converting the tool-specific input file into an input DAG containing tool-agnostic rich semantics, and applying heuristic algorithms to the input DAG to develop an optimum information integration flow design based on the rich semantics. | 03-22-2012 |
20120101978 | SYSTEM AND METHOD FOR GENERATING AN INFORMATION INTEGRATION FLOW DESIGN USING HYPERCUBES - A system, method, and computer readable medium for generating an information integration flow design (IIFD). The system includes a processor to receive a conceptual model of the IIFD, having an extract phase, a load phase, and a transformation phase, an extract unit to model an interface between a data source information object and a transformation function based on at least one extract hypercube, a load unit to specify at least one load hypercube and a data warehouse target object, a transformation unit to express one or more steps as a hypercube operation, and a translation unit to generate the IIFD based on the conceptual model. The method includes receiving a conceptual model of the IIFD having an extract phase, a load phase, and a transformation phase. The method generates logical information integration operations based on the conceptual model. A computer readable medium may include instructions to generate the IIFD. | 04-26-2012 |
20130003965 | SURROGATE KEY GENERATION - A method for surrogate key generation performed by a physical computing system includes creating a lookup record for a production key of an input record, a key of the lookup record including the production key and a value of the lookup record including both a record identifier for the input record and a unique identifier of the production key within the input record. The method further includes sending the lookup record to a first node of a distributed computing system, the first node determined by hashing the production key with a first hash function, and with the first node, determining a surrogate key for the production key. | 01-03-2013 |
20130047161 | SELECTING PROCESSING TECHNIQUES FOR A DATA FLOW TASK - A method for data flow processing includes determining values for each of a set of parameters associated with a task within a data flow processing job, and applying a set of rules to determine one of a set of processing techniques that will be used to execute the task. The set of rules is determined through a set of benchmark tests for the task using each of the set of processing techniques while varying the set of parameters. | 02-21-2013 |
20130093771 | MODIFIED FLOW GRAPH DEPICTION - A method and apparatus apply a transition to an initial information integration flow graph to form a modified information integration flow graph which is visually depicted in a modified design canvas. The initial information integration flow graph has nodes, each node having initial location coordinates for visual depiction in an initial design canvas, wherein nodes of the modified information integration flow graph have location coordinates based upon the initial location coordinates. | 04-18-2013 |
20130096967 | OPTIMIZER - A method and apparatus: (1) select and apply a transition from a set of first objective enhancing transitions to an initial information integration flow graph based upon how application of each transition impacts a length of a chain of nodes to produce a first set of modified information integration flow graphs that satisfy a first objective; (2) select and apply a second transition from the set of first objective transitions and a set of second objective enhancing transitions to the first set of modified information integration flow graphs to produce a second set of modified information integration flow graphs that satisfy the first objective and the second objective; and (3) identify an information integration flow graph from the first set and the second set having a lowest cost. | 04-18-2013 |
20130097592 | USER SELECTED FLOW GRAPH MODIFICATION - A computer implemented method and apparatus display an information integration flow graph, receive user input selecting a modification to apply to the displayed information integration flow graph and modify the information integration flow graph based on the selected modification to form a modified information integration flow graph, wherein the modified information integration flow graph is displayed. | 04-18-2013 |
20130097604 | INFORMATION INTEGRATION FLOW FRESHNESS COST - A computer implemented method and apparatus calculate a freshness cost for each of a plurality of information integration flow graphs and select one of the plurality of information integration flow graphs based upon the calculated freshness cost. | 04-18-2013 |
20130179394 | System and Method for Interpreting and Generating Integration Flows - There is provided a computer system for generating an extract, transform, and load (ETL) workflow. The computer system includes a processor configured to receive ( | 07-11-2013 |
20130191306 | Providing Operational Business Intelligence - Embodiments described herein can be used to provide business intelligence. For example, a tangible, computer-readable medium may include code configured to direct a processor to create a conceptual model of a business process. The code may be configured to direct the processor to parse the conceptual model to create a logical model of the business process, and to parse the logical model to create a physical model of the business process. | 07-25-2013 |
20130290296 | NESTING LEVEL - A system, method, and non-transitory computer readable medium are provided to access a graph comprising a plurality of nodes and at least one edge. Each node is associated with at least one database operation. Computer code is constructed that corresponds to the graph in accordance with a nesting level. The nesting level represents a degree of temporary storage to be allocated for intermediate output produced by the at least one database operation. | 10-31-2013 |
20140068055 | RESOURCE SHARING IN COMPUTER CLUSTERS ACCORDING TO OBJECTIVES - A method of assigning resources of a computer cluster with resource sharing according to objectives. The method includes monitoring resources of each of a plurality of cloud nodes, providing information descriptive of the cloud node resources, receiving a reservation, determining whether resources are available to satisfy the reservation and any other pending reservations, if resources are available, using a rapid search to determine resource assignments for the reservation and any other pending reservations according to one or more objectives, and allocating resources according to the resource assignments. | 03-06-2014 |
20140068056 | COMPUTER CLUSTER WITH OBJECTIVE-BASED RESOURCE SHARING - A computer cluster with objectives-based resource sharing. The cluster includes cloud nodes each with one or more resources, a terminal, data storage, and an allocation node to monitor cloud node resources, provide information descriptive of the cloud node resources to a customer through the terminal, receive a reservation for cloud node resources from the customer, store the reservation in the data storage, determine assignments of the cloud node resources for the reservation and any other pending reservations according to one or more objectives, and allocate the cloud node resources to customers according to the resource assignments. | 03-06-2014 |
20140068550 | SELECTING EXECUTION ENVIRONMENTS - Disclosed herein are techniques for selecting execution environments. Each operation in a sequence of operations is implemented using a selected execution environment. Each operation is converted into code executable in the selected execution environment. If some operations in the sequence were implemented in different execution environments, execution of the operations is coordinated. | 03-06-2014 |
20140101092 | ADJUSTMENT OF MAP REDUCE EXECUTION - Disclosed herein are techniques for adjusting a map reduce execution environment. It is determined whether some operations in a sequence of operations should be implemented in a map reduce execution environment. If it is determined that some operations in a sequence of operations should be implemented in a map reduce execution environment, the map reduce execution environment is adjusted to achieve a predefined performance objective. | 04-10-2014 |
20140156589 | DIVIDING AND COMBINING OPERATIONS - Disclosed herein are techniques for arranging a series of operations. It is determined whether an operation executes more efficiently when divided. It is further determined whether a plurality of operations execute more efficiently when combined. | 06-05-2014 |
20140181080 | COSTS OF OPERATIONS ACROSS COMPUTING SYSTEMS - Disclosed herein are techniques for measuring or assessing the costs of executing operations across a plurality of computing systems. The cost of transferring data across at least one arrangement of computing systems is determined. The cost of executing at least one arrangement of the operations is also determined. | 06-26-2014 |
20140215473 | OBJECTIVES OF OPERATIONS EXECUTING ACROSS ENVIRONMENTS - Disclosed herein are techniques for managing operations. A distribution of operations across a plurality of execution environments is determined in order to achieve a performance objective. Another distribution of the operations is determined, if the status of the execution environments renders the distribution suboptimal or incapable of achieving the performance objective. | 07-31-2014 |
20140244570 | OPTIMIZING AND MANAGING EXECUTION OF HYBRID FLOWS - Disclosed herein are techniques for optimizing and managing the execution of hybrid flows. An execution plan is generated for each hybrid flow based at least partially on attributes associated therewith. The execution of each hybrid flow is managed in accordance with the execution plan. | 08-28-2014 |
20140280110 | REQUESTS FOR SOURCE CODE TEXT - Disclosed herein are a system, non-transitory computer readable medium and method for fulfilling requests for source code. A description is associated with each section of source code text. A section of source code, whose description at least partially matches a source code request, is obtained and displayed. | 09-18-2014 |
20140303933 | OPTIMIZING ANALYTIC FLOWS - A technique of optimizing analytic flows includes sampling source data using a sampling method, executing a flow over the sampled data, obtaining runtime statistics from the executed flow, and combining runtime statistics with historical statistics. | 10-09-2014 |
20140324839 | DETERMINING CANDIDATE SCRIPTS FROM A CATALOG OF SCRIPTS - According to an example, candidate scripts may be determined from a catalog of scripts to perform a requested operation. In determining the candidate scripts, a request for an operation may be received, in which the request includes an input and an output. In addition, based upon the input and the output, a plurality of candidate scripts that are to perform the requested operation may be identified from the catalog of scripts, in which each of the plurality of candidate scripts comprises at least one of a script that is to perform the requested operation individually or a number of scripts that, in combination, are to perform the requested operation. Moreover, a score for each of the plurality of candidate scripts may be calculated based upon a plurality of factors respectively corresponding to the plurality of candidate scripts, and the plurality of candidate scripts and the calculated scores may be outputted. | 10-30-2014 |
20140325476 | MANAGING A CATALOG OF SCRIPTS - According to an example, a catalog of scripts may be managed. Management of the catalog of scripts may include the addition of a script description into the catalog of scripts. In one example, the script description may be directly added to the catalog of scripts. In another example, the script description may be added through generation of a merged query of scripts. | 10-30-2014 |
20140344817 | CONVERTING A HYBRID FLOW - Converting a hybrid flow can include combining each of a plurality of task nodes with a plurality of corresponding operators of the hybrid flow and converting the combined plurality of task nodes and the plurality of corresponding operators of the hybrid flow to a data flow graph using a code template. | 11-20-2014 |
20140379322 | CONVERTING AN INPUT SCRIPT - Converting an input script includes obtaining an input script comprising at least one variable, obtaining at least one translation transformation rule from a library, converting the input script into a tree representation, folding the tree representation to hide a subset of variables in the input script to create a folded tree, and generating a natural language text by applying at least one translation transformation rule from the library to the folded tree. | 12-25-2014 |
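
As an illustration of the SURROGATE KEY GENERATION entry (20130003965) in the table above, the following is a minimal sketch of the general idea, not the patented method itself. It builds a lookup record keyed by the production key, routes it to a node chosen by hashing that key, and lets the node assign the surrogate key. The names `node_for`, `make_lookup_record`, `Node`, and `assign_surrogate_keys`, the use of SHA-256 as the "first hash function", the cluster size of four, and the per-node counter scheme are all assumptions made for this example; the abstract does not specify them.

```python
import hashlib
from itertools import count

NUM_NODES = 4  # assumed cluster size; not specified in the abstract


def node_for(production_key, num_nodes=NUM_NODES):
    """Pick a node by hashing the production key (SHA-256 is an assumed stand-in)."""
    digest = hashlib.sha256(production_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def make_lookup_record(record_id, position, production_key):
    """Lookup record: key is the production key, value identifies where it occurred."""
    return production_key, (record_id, position)


class Node:
    """Hypothetical node that hands out surrogate keys from its own disjoint range."""

    def __init__(self, node_id, num_nodes):
        # Node i issues i, i + num_nodes, i + 2 * num_nodes, ... so that
        # keys issued by different nodes never collide (an assumed scheme).
        self._next = count(node_id, num_nodes)
        self._assigned = {}

    def surrogate_key(self, production_key):
        # The same production key always maps to the same surrogate key.
        if production_key not in self._assigned:
            self._assigned[production_key] = next(self._next)
        return self._assigned[production_key]


def assign_surrogate_keys(input_records, num_nodes=NUM_NODES):
    """Map (record_id, position) to a surrogate key for every production key."""
    nodes = [Node(i, num_nodes) for i in range(num_nodes)]
    result = {}
    for record_id, record in enumerate(input_records):
        for position, production_key in enumerate(record):
            key, value = make_lookup_record(record_id, position, production_key)
            node = nodes[node_for(key, num_nodes)]
            result[value] = node.surrogate_key(key)
    return result
```

Because every occurrence of a production key hashes to the same node, that node alone decides the key's surrogate value, so no coordination between nodes is needed in this sketch.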