-
Publication No.: US20230386168A1
Publication date: 2023-11-30
Application No.: US18192393
Application date: 2023-03-29
Inventor: Yipeng SUN , Mengjun CHENG , Longchao WANG , Xiongwei ZHU , Kun YAO , Junyu HAN , Jingtuo LIU , Errui DING , Jingdong WANG , Haifeng Wang
IPC: G06V10/42 , G06F16/583 , H04N19/176
CPC classification number: G06V10/42 , G06F16/5846 , H04N19/176
Abstract: A pre-training method for a Vision and Scene Text Aggregation model includes: acquiring a sample image-text pair; extracting a sample scene text from a sample image; inputting a sample text into a text encoding network to obtain a sample text feature; inputting the sample image and an initial sample aggregation feature into a visual encoding subnetwork and inputting the initial sample aggregation feature and the sample scene text into a scene encoding subnetwork to obtain a global image feature of the sample image and a learned sample aggregation feature; and pre-training the Vision and Scene Text Aggregation model according to the sample text feature, the global image feature of the sample image, and the learned sample aggregation feature.
-
Publication No.: US20230169351A1
Publication date: 2023-06-01
Application No.: US18060705
Application date: 2022-12-01
Inventor: Haifeng Wang , Zhihua Wu , Dianhai Yu , Yanjun Ma , Tian Wu
IPC: G06N3/098
CPC classification number: G06N3/098
Abstract: A distributed training method based on end-to-end adaption, a device and a storage medium. The method includes: obtaining slicing results by slicing a model to be trained; obtaining an attribute of computing resources allocated to the model for training by parsing the computing resources, in which the computing resources are determined based on a computing resource requirement of the model, computing resources occupied by another model being trained, and idle computing resources, and the attribute of the computing resources is configured to represent at least one of a topology relation and a task processing capability of the computing resources; determining a distribution strategy of each of the slicing results in the computing resources based on the attributes of the computing resources; and performing distributed training on the model using the computing resources based on the distribution strategy.
-
Publication No.: US20230120985A1
Publication date: 2023-04-20
Application No.: US18083313
Application date: 2022-12-16
Inventor: Yanwen Fan , Xiyu Yu , Gang Zhang , Jingtuo Liu , Haifeng Wang , Errui Ding , Junyu Han
IPC: G06V10/774 , G06V40/16 , G06V10/26 , G06V10/77
Abstract: A method for training a face recognition model includes: acquiring a plurality of first training images being uncovered face images, and acquiring a plurality of covering object images; generating a plurality of second training images by separately fusing the plurality of covering object images with the uncovered face images; and training the face recognition model by inputting the plurality of first training images and the plurality of second training images into the face recognition model.
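The fusion step in this abstract can be illustrated with a minimal sketch. It assumes that "fusing" means alpha-blending a covering object image (for example, a mask with a transparency channel) onto an uncovered face image at a fixed position, and that every covering image is paired with every face; the abstract specifies neither. The names `fuse` and `build_training_set` and the nested-list pixel representation are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]            # RGB
CoverPixel = Tuple[int, int, int, int]  # RGBA; alpha 0 means fully transparent
Image = List[List[Pixel]]
Cover = List[List[CoverPixel]]


def fuse(face: Image, cover: Cover, top: int, left: int) -> Image:
    """Alpha-blend `cover` onto a copy of `face`, anchored at (top, left)."""
    out = [row[:] for row in face]  # leave the uncovered image untouched
    for dy, cover_row in enumerate(cover):
        for dx, (r, g, b, a) in enumerate(cover_row):
            y, x = top + dy, left + dx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                alpha = a / 255
                fr, fg, fb = out[y][x]
                out[y][x] = (
                    round(alpha * r + (1 - alpha) * fr),
                    round(alpha * g + (1 - alpha) * fg),
                    round(alpha * b + (1 - alpha) * fb),
                )
    return out


def build_training_set(
    faces: List[Image], covers: List[Cover], top: int, left: int
) -> Tuple[List[Image], List[Image]]:
    """Return (first training images, second training images).

    Assumption: each covering image is fused with each face separately,
    so the second set holds len(faces) * len(covers) images.
    """
    second = [fuse(face, cover, top, left) for face in faces for cover in covers]
    return faces, second
```

Both returned lists would then be fed to the face recognition model during training; the model and the training loop are outside the scope of this sketch.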
-
Publication No.: US20230115163A1
Publication date: 2023-04-13
Application No.: US17989644
Application date: 2022-11-17
Inventor: Haifeng Wang , Xiaoguang Hu , Dianhai Yu , Xiang Lan , Yanjun Ma
Abstract: The disclosure provides a method for processing data, and an electronic device. The method includes: obtaining first attribute information of input data and second attribute information of a computing device corresponding to the input data; selecting a target operator implementation mode from a plurality of candidate operator implementation modes based on the first attribute information and the second attribute information; determining a plurality of sub-operators included in an operator required for the input data from an operator library based on the target operator implementation mode, to generate the operator; and obtaining an operation result by performing an operation on the input data by the computing device based on the operator.
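The selection flow in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the attribute fields, the two candidate modes (`"composed"` and `"fused"`), the selection rule, and the toy operator library are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]


@dataclass(frozen=True)
class InputAttrs:
    """First attribute information (hypothetical fields)."""
    dtype: str
    num_elements: int


@dataclass(frozen=True)
class DeviceAttrs:
    """Second attribute information (hypothetical field)."""
    kind: str  # e.g. "cpu" or "gpu"


# Hypothetical operator library: sub-operators keyed by name.
OPERATOR_LIBRARY: Dict[str, Callable[[Vector], Vector]] = {
    "scale": lambda xs: [2.0 * x for x in xs],
    "shift": lambda xs: [x + 1.0 for x in xs],
    "scale_shift_fused": lambda xs: [2.0 * x + 1.0 for x in xs],
}

# Candidate implementation modes, each listing the sub-operators it needs.
CANDIDATE_MODES: Dict[str, List[str]] = {
    "composed": ["scale", "shift"],
    "fused": ["scale_shift_fused"],
}


def select_mode(inp: InputAttrs, dev: DeviceAttrs) -> str:
    """Pick a target mode from both attribute sets (invented rule)."""
    if dev.kind == "gpu" and inp.num_elements >= 1024:
        return "fused"
    return "composed"


def generate_operator(mode: str) -> Callable[[Vector], Vector]:
    """Look up the mode's sub-operators and chain them into one operator."""
    subs = [OPERATOR_LIBRARY[name] for name in CANDIDATE_MODES[mode]]

    def operator(xs: Vector) -> Vector:
        for sub in subs:
            xs = sub(xs)
        return xs

    return operator


def run(xs: Vector, dtype: str, dev: DeviceAttrs) -> Vector:
    """Select a mode, generate the operator, and apply it to the input."""
    mode = select_mode(InputAttrs(dtype, len(xs)), dev)
    return generate_operator(mode)(xs)
```

In this toy setup both modes compute the same result, so the choice affects only how the operator is assembled, which mirrors the idea of choosing an implementation rather than a different computation.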
-
Publication No.: US20230015112A1
Publication date: 2023-01-19
Application No.: US17933152
Application date: 2022-09-19
Inventor: Jiankang Hou , Tao Sun , Zhipeng Nie , Liqiang Zhang , Lei Jia , Haifeng Wang
IPC: G10L21/10 , G10L13/02 , G10L21/0208 , G10L25/51
Abstract: A method for processing a speech includes: acquiring an original speech; extracting a spectrogram from the original speech; acquiring a speech synthesis model, where the speech synthesis model comprises a first generation sub-model and a second generation sub-model; generating a harmonic structure of the spectrogram, by invoking the first generation sub-model to process the spectrogram; and generating a target speech, by invoking the second generation sub-model to process the harmonic structure and the spectrogram.
-
Publication No.: US20250103959A1
Publication date: 2025-03-27
Application No.: US18885339
Application date: 2024-09-13
Inventor: Liang Shen , Dianhai Yu , Weibao Gong , Jinle Zeng , Haifeng Wang
IPC: G06N20/00
Abstract: Provided are a performance optimization method for a model training device, an electronic device, and a storage medium, relating to the fields of deep learning, large model training, and distributed parallel strategies. The method includes: determining communication timing of a current model training device with respect to a target model block at a target sorting position, so as to synchronously perform collective communication with the other model training devices of a plurality of model training devices with respect to model blocks at the target sorting position; and performing the collective communication on a backward gradient of the target model block at the communication timing.
-
Publication No.: US12205025B2
Publication date: 2025-01-21
Application No.: US17211146
Application date: 2021-03-24
Inventor: Haifeng Wang , Xiaoguang Hu , Dianhai Yu
Abstract: The present application discloses a processor video memory optimization method and apparatus for deep learning training tasks, and relates to the technical field of artificial intelligence. In the method, by determining an optimal path for transferring a computing result, the computing result of a first computing unit is transferred to a second computing unit by using the optimal path. Thus, occupying the video memory is avoided, and meanwhile, a problem of low utilization rate of the computing unit of a GPU caused by video memory swaps is avoided, so that training speed of most tasks is hardly reduced.
-
Publication No.: US11887376B2
Publication date: 2024-01-30
Application No.: US17517702
Application date: 2021-11-03
Inventor: Jizhou Huang , Hui Zhao , Deguo Xia , Haifeng Wang
IPC: G06V30/00 , G06V20/56 , G06V20/40 , G06V10/40 , G06V30/262 , G06V20/10 , B60R16/023 , G06F18/24 , G06F18/214
CPC classification number: G06V20/56 , B60R16/0231 , G06F18/214 , G06F18/24 , G06V10/40 , G06V20/176 , G06V20/41 , G06V20/46 , G06V30/274
Abstract: The present disclosure provides a method and apparatus of estimating a road condition, and a method and apparatus of establishing a road condition estimation model, which relate to the fields of big data and intelligent traffic. The method includes: acquiring, for a first preset duration before a first moment, a sequence of user tracks for a road and a sequence of road images for the road; extracting a track-related feature of the road from the sequence of the user tracks, and extracting an image-related feature of the road from the sequence of the road images; and inputting the track-related feature of the road and the image-related feature of the road into a pre-trained road condition estimation model, so as to determine, for a second preset duration after the first moment, road condition information of the road by using an estimated result of the road condition estimation model.
-
Publication No.: US20230368523A1
Publication date: 2023-11-16
Application No.: US18152119
Application date: 2023-01-09
Inventor: Xiaoqing Ye , Deguo Xia , Jizhou Huang , Haifeng Wang
CPC classification number: G06V20/182 , G06V10/7715 , G06V20/13 , G06V10/25 , G01C21/3852 , G01C21/3819 , G06V10/42
Abstract: Provided are a road network extraction method, a device, and a storage medium, which relate to the technical field of artificial intelligence and, in particular, to the fields of image processing, computer vision, and the like and are specifically applicable to scenarios such as intelligent transportation and a smart city. A specific implementation scheme includes: extracting a first road network of a target region according to user trajectories of the target region; extracting a second road network of the target region according to a satellite aerial image of the target region; and extracting a target road network of the target region according to the first road network, the second road network, and the user trajectories. Efficient and accurate road network extraction can be achieved through techniques in embodiments of the present disclosure.
-
Publication No.: US20230186024A1
Publication date: 2023-06-15
Application No.: US17874394
Application date: 2022-07-27
Inventor: Zeyu Chen , Haifeng Wang , Tian Wu , Dianhai Yu , Yanjun Ma , Xiaoguang Hu
IPC: G06F40/284 , G06F40/47
CPC classification number: G06F40/284 , G06F40/47
Abstract: Provided are a text processing method, a device and a storage medium, relating to a field of computer technology, and especially to a field of artificial intelligence, such as natural language processing and deep learning. The specific implementation scheme includes: performing text processing on first text, by using a text processing acceleration operator; and processing, in parallel and faster, content after the text processing, by using the text processing acceleration operator. Text processing and parallel acceleration are carried out by the text processing acceleration operator, which can improve the speed of text processing.