Abstract:
Systems and methods are provided for generating a pseudo-labeled training dataset from a set of unlabeled speech data by at least one of: (1) extracting a set of intermediate outputs from an automatic speech recognition model by applying the automatic speech recognition model to the set of unlabeled speech data, clustering the set of intermediate outputs into different clusters, and generating a first set of pseudo-labels that comprise cluster assignments associated with the different clusters and that correspond to the unlabeled speech data, or (2) generating a set of decoded word sequences for the unlabeled speech data by applying the automatic speech recognition model to the set of unlabeled speech data, and generating a second set of pseudo-labels associated with the unlabeled speech data by applying the automatic speech recognition model to both (i) the set of decoded word sequences and (ii) the set of unlabeled speech data.
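As an illustration of option (1), the following is a minimal sketch of turning cluster assignments into pseudo-labels. It assumes k-means as the clustering method and uses hand-written 2-D vectors as a stand-in for the intermediate outputs; the abstract names neither a clustering algorithm nor a feature layout, and the function name `kmeans_pseudo_labels` is hypothetical.

```python
import math

def kmeans_pseudo_labels(features, k, iters=20):
    """Cluster feature vectors with k-means and return one cluster id per vector.

    Assumption: k-means with farthest-point initialization. The source only
    says the intermediate outputs are clustered, not how.
    """
    # Deterministic initialization: start from the first vector, then keep
    # adding the vector that is farthest from all centroids chosen so far.
    centroids = [tuple(features[0])]
    while len(centroids) < k:
        far = max(features, key=lambda x: min(math.dist(x, c) for c in centroids))
        centroids.append(tuple(far))

    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(x, centroids[j]))
            for x in features
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [x for x, lab in zip(features, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Hypothetical stand-in for intermediate outputs of a speech recognition model
# (for example, one hidden-layer vector per frame of unlabeled audio).
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
pseudo_labels = kmeans_pseudo_labels(frames, k=2)

# Pair each unlabeled frame with its cluster id to form the pseudo-labeled set.
pseudo_labeled_dataset = list(zip(frames, pseudo_labels))
```

The cluster ids carry no linguistic meaning on their own; they only group frames whose intermediate outputs are similar, which is what makes them usable as training targets without transcripts.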
Abstract:
This invention is concerned with biocompatible magnetic nanocrystals that are highly soluble and dispersible in a physiological buffer, with powders of biocompatible magnetic nanocrystals and of nanocrystals bearing a surface-reactive N-hydroxysuccinimide ester moiety, and with preparations thereof. The magnetic nanocrystals in powder form are highly soluble in a physiological buffer. The resultant aqueous colloidal solution presents long-term stability in ambient conditions. Moreover, the carboxyl group on the surface of the magnetic nanocrystals can be converted to an N-hydroxysuccinimide ester moiety in an organic solvent. The resultant powder of the magnetic nanocrystals carrying the surface N-hydroxysuccinimide ester moiety is soluble and dispersible in an aqueous solution. Different types of biomolecules bearing an amino group can be covalently attached to the magnetic nanocrystals simply by mixing them in aqueous solutions. Moreover, the powder of the magnetic nanocrystals bearing the surface N-hydroxysuccinimide ester moiety retains its reaction activity with biomolecules after long-term storage.
Abstract:
Several implementations relate, for example, to depth encoding and/or filtering for 3D video (3DV) coding formats. A sparse dyadic mode for partitioning macroblocks (MBs) along edges in a depth map is provided as well as techniques for trilateral (or bilateral) filtering of depth maps that may include adaptive selection between filters sensitive to changes in video intensity and/or changes in depth. One implementation partitions a depth picture, and then refines the partitions based on a corresponding image picture. Another implementation filters a portion of a depth picture based on values for a range of pixels in the portion. For a given pixel in the portion that is being filtered, the filter weights a value of a particular pixel in the range by a weight that is based on one or more of location distance, depth difference, and image difference.
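The weighting described in the last sentence can be sketched as follows. This is a minimal illustration that assumes each of the three factors is a Gaussian kernel and that the factors are multiplied; the abstract states only that the weight is based on one or more of location distance, depth difference, and image difference. The function names and the sigma parameters are hypothetical.

```python
import math

def trilateral_weight(dist, depth_diff, image_diff,
                      sigma_s=2.0, sigma_d=10.0, sigma_i=10.0):
    """Weight of a neighbouring pixel (assumed form: product of three Gaussians)."""
    return (math.exp(-dist ** 2 / (2 * sigma_s ** 2))
            * math.exp(-depth_diff ** 2 / (2 * sigma_d ** 2))
            * math.exp(-image_diff ** 2 / (2 * sigma_i ** 2)))

def filter_depth_pixel(depth, image, r, c, radius=1):
    """Return the filtered depth at (r, c) as a normalized weighted average.

    `depth` and `image` are equally sized 2-D lists: the depth picture and
    the intensity of the corresponding image picture.
    """
    rows, cols = len(depth), len(depth[0])
    num = den = 0.0
    for i in range(max(0, r - radius), min(rows, r + radius + 1)):
        for j in range(max(0, c - radius), min(cols, c + radius + 1)):
            w = trilateral_weight(
                math.hypot(i - r, j - c),     # location distance
                depth[i][j] - depth[r][c],    # depth difference
                image[i][j] - image[r][c],    # image (intensity) difference
            )
            num += w * depth[i][j]
            den += w
    # den is never zero: the centre pixel always contributes weight 1.
    return num / den

# A depth edge between the second and third columns, matched by an image edge.
depth_map = [[10, 10, 100], [10, 10, 100], [10, 10, 100]]
intensity = [[50, 50, 200], [50, 50, 200], [50, 50, 200]]
filtered_centre = filter_depth_pixel(depth_map, intensity, 1, 1)
```

Because pixels across the edge differ strongly in both depth and intensity, their weights are close to zero and the filtered value stays near the depth on the centre pixel's own side of the edge. Dropping the image factor from the product would give a bilateral variant.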
Abstract:
A temporal information integration dis-occlusion system and method are provided for using historical data to reconstruct a virtual view containing an occluded area. Embodiments of the system and method use temporal information of the scene captured previously to obtain a total history. This total history is warped onto information captured by a camera at a current time in order to help reconstruct the dis-occluded areas. The historical data (or frames) from the total history match only a portion of the frames contained in the captured information. This warping yields warped history information. Warping is performed by using one of two embodiments to match points in an estimation of the current information to points in the captured information. Next, regions of current information are split using a classifier. The warped history information and the captured information are then merged to obtain an estimate for the current information and the reconstructed virtual view.
Abstract:
Embodiments are provided for building a configurable multilingual model. A computing system obtains a plurality of language-specific automatic speech recognition modules and a universal automatic speech recognition module trained on a multi-language training dataset comprising training data corresponding to each of a plurality of different languages. The computing system then compiles the universal automatic speech recognition module with the plurality of language-specific automatic speech recognition modules to generate a configurable multilingual model that is configured to selectively and dynamically utilize a subset of the plurality of language-specific automatic speech recognition modules with the universal automatic speech recognition module to process audio content in response to user input identifying one or more target languages associated with the audio content.
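The selection behaviour can be sketched as below. This is an illustration under stated assumptions, not the described system: the modules are modelled as plain callables that return a list of scores, and their outputs are combined by element-wise addition, which the abstract does not specify. The class name, method names, and toy modules are all hypothetical.

```python
class ConfigurableMultilingualModel:
    """Holds one universal module plus a registry of language-specific modules."""

    def __init__(self, universal, language_modules):
        self.universal = universal                      # callable: audio -> scores
        self.language_modules = dict(language_modules)  # language code -> callable

    def select(self, target_languages):
        """Return only the language-specific modules the user asked for."""
        missing = [lang for lang in target_languages
                   if lang not in self.language_modules]
        if missing:
            raise KeyError(f"no module for language(s): {missing}")
        return [self.language_modules[lang] for lang in target_languages]

    def process(self, audio, target_languages):
        """Run the universal module together with the selected subset.

        Assumption: outputs are summed element-wise. The source only says the
        selected modules are used with the universal module.
        """
        scores = list(self.universal(audio))
        for module in self.select(target_languages):
            scores = [s + d for s, d in zip(scores, module(audio))]
        return scores

# Toy stand-ins for trained modules; real modules would be neural networks.
model = ConfigurableMultilingualModel(
    universal=lambda audio: [1.0, 2.0],
    language_modules={
        "en": lambda audio: [0.5, 0.0],
        "de": lambda audio: [0.0, 0.25],
    },
)
english_only = model.process(audio=None, target_languages=["en"])
bilingual = model.process(audio=None, target_languages=["en", "de"])
```

Only the modules named in `target_languages` are evaluated for a given request, so the same compiled model can serve a monolingual user and a bilingual user without retraining.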
Abstract:
Various implementations address depth coding and related disciplines. In one particular implementation, a segmentation is determined for a particular portion of a video image in a sequence of video images. The segmentation is determined based on reference depth indicators that are associated with at least a portion of one video image in the sequence of video images. Target depth indicators associated with the particular portion of the video image are processed. The processing is based on the determined segmentation in the particular portion of the video image. In another particular implementation, a segmentation is determined for at least a given portion of a video image based on depth indicators associated with the given portion. The segmentation is extended from the given portion into a target portion of the video image based on pixel values in the given portion and on pixel values in the target portion.
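The first implementation can be illustrated with a minimal sketch. It assumes a two-segment split obtained by thresholding the reference depth indicators at their mean, and it assumes that the processing step is a per-segment average of the target depth indicators; the abstract fixes neither choice, and both function names are hypothetical.

```python
def segment_by_depth(ref_depth):
    """Label each position 0 or 1 by comparing reference depth with its mean.

    Assumption: a simple mean threshold stands in for the segmentation rule.
    """
    flat = [v for row in ref_depth for v in row]
    threshold = sum(flat) / len(flat)
    return [[1 if v > threshold else 0 for v in row] for row in ref_depth]

def per_segment_mean(target_depth, segmentation):
    """Process the target depth indicators segment by segment (here: average them)."""
    sums, counts = {}, {}
    for depth_row, seg_row in zip(target_depth, segmentation):
        for d, s in zip(depth_row, seg_row):
            sums[s] = sums.get(s, 0.0) + d
            counts[s] = counts.get(s, 0) + 1
    return {s: sums[s] / counts[s] for s in sums}

# Reference depth from one picture in the sequence, target depth for the
# portion being processed.
reference = [[10, 10, 90], [10, 90, 90]]
target = [[12, 8, 88], [10, 92, 90]]

segmentation = segment_by_depth(reference)
segment_means = per_segment_mean(target, segmentation)
```

Representing each segment by a single value is one way the segmentation could reduce the amount of depth data to process; other per-segment operations would plug into the same structure.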