Abstract:
A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
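The flow described above (extract a feature, cluster on it, use the cluster as context for finding synonyms, then standardize) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the embodiments: the feature (the trailing token, treated as a postal code), the synonym heuristic (two addresses in one cluster that differ in a single alphabetic token sharing a first letter), and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def feature(address):
    # Assumed feature: the last token, treated here as a postal code.
    return address.split()[-1]


def cluster_addresses(addresses):
    # Group addresses that share the same feature value.
    clusters = defaultdict(list)
    for address in addresses:
        clusters[feature(address)].append(address)
    return dict(clusters)


def infer_synonyms(cluster):
    # Hypothetical heuristic: if two addresses in a cluster differ in exactly
    # one alphabetic token and those tokens share a first letter, treat them
    # as synonyms and map the rarer (or shorter) one to the other.
    tokenized = [[tok.lower() for tok in a.split()] for a in cluster]
    counts = Counter(tok for toks in tokenized for tok in toks)
    synonyms = {}
    for i, a in enumerate(tokenized):
        for b in tokenized[i + 1:]:
            if len(a) != len(b):
                continue
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) != 1:
                continue
            x, y = diffs[0]
            if not (x.isalpha() and y.isalpha() and x[0] == y[0]):
                continue
            canonical = max((x, y), key=lambda t: (counts[t], len(t)))
            variant = y if canonical == x else x
            synonyms[variant] = canonical
    return synonyms


def standardize(address, synonyms):
    # Lowercase every token and replace known variants with the canonical form.
    return " ".join(synonyms.get(t.lower(), t.lower()) for t in address.split())


addresses = [
    "12 Main St 90210",
    "12 Main Street 90210",
    "12 Main Street 90210",
    "5 Oak Ave 10001",
]
clusters = cluster_addresses(addresses)
synonyms = infer_synonyms(clusters["90210"])
print(standardize("12 Main St 90210", synonyms))  # 12 main street 90210
```

The cluster supplies the context here: "St" is mapped to "Street" only because both forms occur in otherwise identical addresses within the same cluster.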
Abstract:
A system (30) and method are provided for single-pass execution of dynamic pages across multiple request-response cycles. The system (30) comprises a client (32) and server (34) in communication with one another. A container (35) resides on the server and handles requests made for the result of a dynamic page (36). The container controls the processing of the dynamic page. If the dynamic page requires additional information to continue processing, an intermediate request (44) is transmitted to the client, which responds with an intermediate response (46) containing the additional information. A notifier servlet (38) receives the intermediate response and passes the information to the dynamic page so that execution can resume without interruption.
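The pause-and-resume behavior described above can be illustrated with a Python generator, which suspends at the point where more information is needed and continues from that same point once it arrives. This is a minimal in-process sketch, assuming a callable stands in for the client and no real network or servlet API is involved; `dynamic_page`, `Container`, and the request dictionary layout are hypothetical names chosen for illustration.

```python
def dynamic_page(params):
    # The page runs once, top to bottom. If a value is missing, it yields an
    # intermediate request and is resumed with the client's answer.
    name = params.get("name")
    if name is None:
        name = yield {"need": "name"}
    return f"Hello, {name}!"


class Container:
    def __init__(self, client):
        # `client` is a stand-in callable: intermediate request -> response.
        self.client = client

    def handle(self, page, params):
        execution = page(params)
        try:
            request = next(execution)  # run until the page needs more input
            while True:
                response = self.client(request)  # intermediate round trip
                request = execution.send(response)  # resume where it paused
        except StopIteration as finished:
            return finished.value  # final result of the dynamic page


asked = []


def fake_client(request):
    # Records each intermediate request and answers it with a fixed value.
    asked.append(request)
    return "Ada"


container = Container(fake_client)
print(container.handle(dynamic_page, {}))
print(container.handle(dynamic_page, {"name": "Bob"}))
```

In this sketch `execution.send(response)` plays the part that the abstract assigns to the notifier servlet: it hands the additional information to the suspended page so that the page is not restarted from the beginning.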
Abstract:
A source sentence is decoded iteratively. At each step, a set of partially constructed target sentences is collated, each with a score or associated probability computed from a language model score and a translation model score. At each iteration, a family of exponentially many alignments is constructed and the optimal translation for this family is found. To construct the alignment family, a set of transformation operators is employed. The described decoding algorithm is based on the Alternating Optimization framework and employs dynamic programming. Pruning and caching techniques may be used to speed up the decoding.
Abstract:
Methods, apparatuses and computer program products for decoding source text in a first language to target text in a second language are disclosed. The source text is decoded into an intermediate text portion based on a fixed alignment between words in the source text and words in the intermediate text portion. An alignment between words in the source text and words in the intermediate text portion is then determined. The decoding and alignment steps are alternately repeated for as long as a decoding improvement in the intermediate text portion can be obtained. Finally, the intermediate text portion is output as the target text. The alternating decoding and alignment steps may be repeated for each of a plurality of lengths of the intermediate text portion.
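The alternation described above can be sketched as a small coordinate-ascent loop: decode with the alignment held fixed, re-align with the text held fixed, and stop when the score no longer improves. This is a toy illustration only, not the disclosed method. The translation table `T`, the vocabulary, the probability floor, and the scoring by translation probabilities alone are all assumptions made for the example.

```python
import math

# Hypothetical toy translation table: T[(source_word, target_word)] = probability.
T = {
    ("la", "the"): 0.7,
    ("maison", "house"): 0.8,
    ("maison", "home"): 0.6,
    ("bleue", "blue"): 0.9,
}
VOCAB = ["the", "house", "home", "blue"]
FLOOR = 1e-6  # assumed probability for word pairs missing from the table


def t(f, e):
    return T.get((f, e), FLOOR)


def score(source, target, align):
    # Log-probability of the source words given their aligned target words.
    return sum(math.log(t(f, target[a])) for f, a in zip(source, align))


def decode_given_alignment(source, align, length):
    # With the alignment fixed, choose the best word for each target position.
    target = []
    for i in range(length):
        linked = [f for f, a in zip(source, align) if a == i]
        target.append(max(VOCAB, key=lambda e: sum(math.log(t(f, e)) for f in linked)))
    return target


def align_given_target(source, target):
    # With the target fixed, link each source word to its best target position.
    return [max(range(len(target)), key=lambda i: t(f, target[i])) for f in source]


def alternate(source, length):
    align = [min(j, length - 1) for j in range(len(source))]  # initial guess
    target = decode_given_alignment(source, align, length)
    best = score(source, target, align)
    while True:
        new_align = align_given_target(source, target)
        new_target = decode_given_alignment(source, new_align, length)
        new = score(source, new_target, new_align)
        if new <= best + 1e-12:  # no improvement: stop
            return target, best
        align, target, best = new_align, new_target, new


def translate(source, lengths):
    # Repeat the alternation for each candidate length and keep the best result.
    return max((alternate(source, n) for n in lengths), key=lambda r: r[1])[0]


print(translate(["la", "maison", "bleue"], [2, 3]))  # ['the', 'house', 'blue']
```

Each step can only keep or raise the score, so the loop terminates. A fuller decoder would presumably also weigh the fluency of the intermediate text, which this toy omits.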