Patent application number | Description | Published |
20100082642 | Classifier Indexing - Provided are, among other things, systems, methods and techniques for document-based processing. In one implementation, a document is input; features are extracted from it; an index is queried using at least a subset of the extracted features and, in response, identifications for selected document classifiers are received from a larger pool of document classifiers; the document is processed using individual ones of the selected document classifiers, thereby generating corresponding classifier outputs; and then, based on such classifier outputs, (1) the document is categorized within a computer database and/or (2) feedback information is provided to a user. | 04-01-2010 |
20100083346 | Information Scanning Across Multiple Devices - Provided are, among other things, systems, methods and techniques for scanning information across multiple different devices. In one representative system, remote data-processing devices are provided with scanning applications that repeatedly scan information on their respective data-processing devices to identify matching data units that satisfy a specified matching criterion, the specified matching criterion including required matches against a set of screening digests, and then transmit characteristic information regarding the matching data units; and a central processing facility receives the characteristic information from the remote data-processing devices and determines whether the corresponding matching data units satisfy a policy criterion. | 04-01-2010 |
20100153314 | SYSTEMS AND METHODS FOR COLLABORATIVE FILTERING USING COLLABORATIVE INDUCTIVE TRANSFER - Systems and methods are disclosed that are configured to access a database that includes a list of members of a first group, a list of members of a second group, and ratings for at least some of the members of the second group. The ratings are attributed to the members of the first group. A machine learning training set is built for a particular member of the first group. The training set includes class labels corresponding to the particular member's ratings for the members of the second group, and features that include supplied and predicted ratings from at least a subset of processed members of the first group. A predictor for the particular member of the first group is trained based on the machine learning training set. The predictor corresponding to the particular member is used to generate predicted ratings for one or more members of the second group the particular member has not rated. | 06-17-2010 |
20100199350 | Federated Scanning of Multiple Computers - A data processing apparatus and associated computer-executed method are adapted for federated scanning of multiple computers. The data processing apparatus comprises a logic that controls scanning among a plurality of data objects distributed among a plurality of distributed electronic data storage systems. The logic maintains a data set of paired location identifiers and intrinsic references corresponding to individual data objects of the plurality of data objects and controls scanning so that redundant scanning of duplicate data objects with matching intrinsic references occurring in multiple locations is avoided. | 08-05-2010 |
20110029515 | METHOD AND SYSTEM FOR PROVIDING WEBSITE CONTENT - An exemplary embodiment of the present invention provides a method of receiving Website content. The method includes generating a user profile comprising a cluster type obtained from a list of cluster types, wherein the list of cluster types is generated by processing a database of search queries. The method includes providing the relevant cluster types included in the user profile to a selected Website, wherein the cluster type sent to the Website is used by the Website at least in part to determine the content provided by the Website. | 02-03-2011 |
20110060717 | SYSTEMS AND METHODS FOR IMPROVING WEB SITE USER EXPERIENCE - Methods, systems, and computer program products are provided for personalizing web sites. A model based on mining web usage data is accessed. The model defines associations between web sites. Interest associations extracted from web interactions are stored. The interest associations comprise interest indications and web sites associated with the interest indications. An interest indication from the interest associations is selected. The interest indication is associated with an associated web site. The associated web site has an association with a target web site as defined by the model. The interest indication is sent to the target web site. | 03-10-2011 |
20110078152 | METHOD AND SYSTEM FOR PROCESSING TEXT - An exemplary embodiment of the present invention provides a method of processing an electronic text document. The method includes obtaining a character from the document. The method also includes obtaining a hash input code from a character map, the hash input code corresponding to the character. The method also includes modifying a hash value based on the hash input code if the hash input code indicates that the character is part of a token, or asserting the hash value if the hash input code indicates that the character is not part of a token. | 03-31-2011 |
20110119208 | METHOD AND SYSTEM FOR DEVELOPING A CLASSIFICATION TOOL - An exemplary embodiment of the present invention provides a computer implemented method of developing a classifier. The method includes receiving input for a case, the case comprising a plurality of instances and an example, the example comprising a plurality of data fields each corresponding to one of the plurality of instances, wherein the input indicates which, if any, of the instances includes a data field belonging to a target class. The method also includes training the classifier based, at least in part, on the input from the trainer. | 05-19-2011 |
20110119209 | METHOD AND SYSTEM FOR DEVELOPING A CLASSIFICATION TOOL - An exemplary embodiment of the present invention provides a computer implemented method of developing a classifier. The method includes obtaining a set of training data comprising labeled cases. The method also includes training a classifier based, at least in part, on the training data. The method also includes applying the classifier to a plurality of unlabeled cases to generate classification scores for each of the unlabeled cases, wherein each classification score corresponds with an instance of a corresponding case. Furthermore, the classification score corresponding to a first instance in a case is computed based, at least in part, on a value of a case-centric feature corresponding to the first instance, wherein the value of the case-centric feature is based, at least in part, on characteristics of the first instance and a second instance in the case. | 05-19-2011 |
20110119267 | METHOD AND SYSTEM FOR PROCESSING WEB ACTIVITY DATA - The present disclosure provides a computer-implemented method of processing Web activity data. The method includes obtaining a collection of Web activity data generated by a plurality of users at a plurality of Webpages, wherein the Webpages are from a plurality of unaffiliated Websites. The method also includes extracting a plurality of search terms from the Web activity data and associating each of the plurality of search terms with a corresponding Webpage. The method also includes generating statistical data from the Web activity data based, at least in part, on the search terms, the statistical data corresponding to the online activity at one or more Webpages. | 05-19-2011 |
20110119268 | METHOD AND SYSTEM FOR SEGMENTING QUERY URLS - A computer implemented method of grouping query URLs is provided. The method includes obtaining a plurality of query URLs generated at a plurality of Websites. The method also includes analyzing the query URLs to identify similarities between the URLs. The method also includes grouping the query URLs into cases based, at least in part, on the similarities, wherein each case comprises a plurality of instances, and each instance comprises a plurality of data field values corresponding to data fields with a same data field name. | 05-19-2011 |
20110126122 | SYSTEMS AND METHODS FOR GENERATING PROFILES FOR USE IN CUSTOMIZING A WEBSITE - Systems and methods for constructing a profile are disclosed that obtain text associated with web content. Logic instructions are provided by a party that is unaffiliated with a party that provides the web content to allow a profile associated with a user to include information from two or more web sites that are unaffiliated with one another. A match between the text and a target in a target set is detected. The profile associated with the user is modified based on the match. | 05-26-2011 |
20110173532 | GENERATING A LAYOUT OF TEXT LINE IMAGES IN A REFLOW AREA - A decomposition specification is received. The decomposition specification includes specifications of locations of text line images corresponding to complete lines of text in a document image. Based on the decomposition specification, a layout of the text line images in respective lines of a reflow area is generated, where each of the lines of the reflow area has a respective maximum line length. In this process, successive ones of the text line images are packed onto the lines of the reflow area with divisions of one or more of the text line images into respective portions that are concatenated with text image content of other ones of the text line images to fill respective ones of the lines of the reflow area without exceeding the respective maximum line lengths. | 07-14-2011 |
20110182513 | WORD-BASED DOCUMENT IMAGE COMPRESSION - Locations of word images corresponding to words in a document image are ascertained. The word images are grouped into clusters. For each of multiple of the clusters, a respective compressed word image cluster is determined based on a joint compression of respective ones of the word images that are grouped into the cluster. The positions of the word images in the document image are associated with the respective ones of the compressed word image clusters corresponding to the clusters respectively containing the word images. | 07-28-2011 |
20110295762 | Predictive performance of collaborative filtering model - For each first entity of a subset of a number of first entities, an estimate is made of the expected improvement in the predictive performance of a collaborative filtering model if additional ratings of the first entity in relation to a plurality of second entities were obtained. Particular first entities from the subset of the first entities of which to obtain the additional ratings in relation to the second entities are selected based at least on the expected improvements that have been determined. The additional ratings of the particular first entities in relation to the second entities are obtained. | 12-01-2011 |
20120020561 | METHOD AND SYSTEM FOR OPTICAL CHARACTER RECOGNITION USING IMAGE CLUSTERING - The present disclosure provides a computer-implemented method of translating an image-based electronic document into a text-based electronic document. The method includes electronically scanning an image-based document to determine positions of word images in the image-based document. The method also includes extracting the word images from the image-based document and storing the word images to an electronic storage device. The method also includes grouping a subset of the word images into a word cluster based on a similarity of the word images, wherein the word images in the word cluster correspond to a same actual word. The method also includes generating a character-encoded transcription for the word cluster based on the word images in the word cluster. The method also includes adding the character-encoded transcription to a text-based electronic document at locations corresponding to the positions of the word images in the image-based document. | 01-26-2012 |
20130091088 | MAKING A RECOMMENDATION TO A USER THAT IS CURRENTLY GENERATING EVENTS BASED ON A SUBSET OF HISTORICAL EVENT DATA - A method and a system of making a recommendation to a user that is currently generating events based on a subset of historical event data are provided. Historical event data, which is segmented into a set of sessions, is received. Each session includes events. The sessions are associated with clusters that represent the users that generated the historical event data. Each of the associated sessions is associated with one cluster and the number of the clusters is the same as the number of the users. A determination as to which cluster is associated with events currently being generated by a current user's behavior is made. The determining does not require identification of the current user. A recommendation is made to the current user based on the cluster that is associated with the events currently being generated. | 04-11-2013 |
20140040169 | ACTIVE LEARNING WITH PER-CASE SYMMETRICAL IMPORTANCE SCORES - A method for classifying cases includes receiving a pool of unlabeled cases with associated per-case symmetrical importance scores, applying a selection algorithm with a classifier to a training set and the pool, but without the per-case symmetrical importance scores, to determine selection scores for the unlabeled cases, and combining the selection scores and the corresponding per-case symmetrical importance scores to form combined scores for the unlabeled cases. The method further includes providing a high scoring unlabeled case to an oracle to label, receiving a labeled case back from the oracle and augmenting the training set with the labeled case, training the classifier with the augmented training set, and applying the classifier to an additional unlabeled case. | 02-06-2014 |
20140278055 | UPDATING ROAD MAPS - A technique for updating road maps is disclosed. A number of GPS traces can be matched with a number of roads in a map. Matched GPS traces may be processed by a matched segment module to produce proposed changes to the map. The map can be updated using a map updating module based on the proposed changes from the matched segment module. Unmatched GPS traces may be processed by an unmatched segment module to produce proposed changes to the map. The map can be updated using a map updating module based on the proposed changes from the unmatched segment module. The proposed changes to the map may include metadata defining new roads in the map, new intersections in the map, updates to turn restrictions in the map, updates to the allowable directional traffic flow on the roads within the map, and updates to road closures in the map. | 09-18-2014 |
20140279734 | Performing Cross-Validation Using Non-Randomly Selected Cases - A technique to perform cross-validation using a set of randomly selected labeled cases and a set of non-randomly selected labeled cases. A training set for use during cross-validation can include cases from both sets. A test set for use during cross-validation can include cases from the randomly selected set but exclude cases from the non-randomly selected set. | 09-18-2014 |
20140279742 | DETERMINING AN OBVERSE WEIGHT - A technique for determining an obverse weight. A set of cases can be divided into bins. An obverse weight for a bin can be determined based on an importance weight of the bin and a variance of an error estimate of the bin. | 09-18-2014 |
20140307880 | MONITOR AN EVENT THAT PRODUCES A NOISE RECEIVED BY A MICROPHONE - A computing system includes a component that performs a function and generates a noise, and a microphone to receive an input including the noise. The computing system can monitor the component for an event that produces a noise. | 10-16-2014 |
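
The character-map hashing summarized for application 20110078152 (METHOD AND SYSTEM FOR PROCESSING TEXT) can be illustrated with a minimal sketch. The abstract does not say which hash function is used or what the character map contains, so the FNV-1a-style update, the `CHAR_MAP` table, the case folding, and the function name `token_hashes` are all assumptions made for illustration, not the patented method.

```python
# Illustrative sketch only: the hash update rule (FNV-1a style), the character
# map contents, and all names here are assumptions, not taken from the patent.
FNV_OFFSET = 0x811C9DC5
FNV_PRIME = 0x01000193

# Character map: token characters map to a nonzero hash input code
# (letters and digits here); every other character maps to 0.
CHAR_MAP = [0] * 256
for c in "abcdefghijklmnopqrstuvwxyz0123456789":
    CHAR_MAP[ord(c)] = ord(c)
for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    CHAR_MAP[ord(c)] = ord(c.lower())  # fold case through the map itself


def token_hashes(text):
    """Return one 32-bit hash per token, scanning the text a single time."""
    hashes = []
    value = FNV_OFFSET
    in_token = False
    for ch in text:
        code = CHAR_MAP[ord(ch)] if ord(ch) < 256 else 0
        if code:
            # Character is part of a token: fold its code into the hash.
            value = ((value ^ code) * FNV_PRIME) & 0xFFFFFFFF
            in_token = True
        elif in_token:
            # Character is not part of a token: emit the finished hash, reset.
            hashes.append(value)
            value = FNV_OFFSET
            in_token = False
    if in_token:
        # Emit the hash of a token that runs to the end of the text.
        hashes.append(value)
    return hashes
```

Because the map both classifies each character and supplies its hash input code, tokenizing, case folding, and hashing happen in one pass with a single table lookup per character. For example, `token_hashes("Hello, world!")` yields two hashes, and `"Hello"` and `"hello"` hash identically under this assumed map.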