-
Publication No.: US20190258718A1
Publication Date: 2019-08-22
Application No.: US16403281
Filing Date: 2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Lei Yu , Christopher James Dyer , Tomas Kocisky , Philip Blunsom
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.
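The abstract describes decoding over a grid of (input prefix length, output prefix length) cells, pruning first with a direct model and then rescoring with a noisy channel model. Below is a minimal Python sketch of that hypothesis-extension and two-stage pruning loop; the Hypothesis structure, the extension rule, and the direct_score/channel_score interfaces are illustrative assumptions, not the patented method.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hypothesis:
    input_prefix: tuple   # source tokens consumed so far
    output_prefix: tuple  # target tokens emitted so far

def decode(source, vocab, direct_score, channel_score, k1, k2, max_out_len):
    """Extend hypotheses cell by cell over (input length, output length),
    prune to k1 with the direct model, then rescore and keep k2 with the
    noisy channel model. direct_score(hyp) and channel_score(hyp) return
    floats and stand in for the two models (assumed interfaces)."""
    hyps = {(0, 0): [Hypothesis((), ())]}
    for i in range(len(source) + 1):
        for j in range(max_out_len + 1):
            if (i, j) == (0, 0):
                continue
            extended = []
            # Hypotheses reach (i, j) by consuming one more source token ...
            for hyp in hyps.get((i - 1, j), []):
                extended.append(Hypothesis(hyp.input_prefix + (source[i - 1],),
                                           hyp.output_prefix))
            # ... or by emitting one more target token.
            for hyp in hyps.get((i, j - 1), []):
                for tok in vocab:
                    extended.append(Hypothesis(hyp.input_prefix,
                                               hyp.output_prefix + (tok,)))
            if not extended:
                continue
            # First prune with the cheap direct model, then rescore the
            # survivors with the noisy channel model.
            survivors = sorted(extended, key=direct_score, reverse=True)[:k1]
            hyps[(i, j)] = sorted(survivors, key=channel_score, reverse=True)[:k2]
    finals = hyps.get((len(source), max_out_len), [])
    return max(finals, key=channel_score, default=None)
```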
-
Publication No.: US20240169211A1
Publication Date: 2024-05-23
Application No.: US18388180
Filing Date: 2023-11-08
Applicant: DeepMind Technologies Limited
Inventor: Domenic Joseph Donato , Christopher James Dyer , Lei Yu , Wang Ling
IPC: G06N3/092 , G06N3/0985
CPC classification number: G06N3/092 , G06N3/0985
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network to perform a machine learning task through reinforcement learning. In one aspect, the training uses importance weights generated using standardized absolute deviations of quality scores generated by the neural network for candidate network outputs.
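The abstract names importance weights computed from standardized absolute deviations of per-candidate quality scores. The sketch below shows one plausible reading; the exact standardization (here, the absolute deviation from the batch mean divided by the batch standard deviation) is an assumption.

```python
import numpy as np

def importance_weights(quality_scores, eps=1e-6):
    """Importance weights from standardized absolute deviations of the
    quality scores the model assigns to a batch of candidate outputs."""
    scores = np.asarray(quality_scores, dtype=np.float64)
    deviations = np.abs(scores - scores.mean())   # absolute deviation from the batch mean
    return deviations / (scores.std() + eps)      # standardize by the batch spread

# Candidates whose scores sit far from the batch mean receive larger
# weights in the reinforcement-learning update.
weights = importance_weights([0.2, 0.9, 0.4, 0.35])
```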
-
Publication No.: US11423237B2
Publication Date: 2022-08-23
Application No.: US16746012
Filing Date: 2020-01-17
Applicant: DeepMind Technologies Limited
Inventor: Lei Yu , Christopher James Dyer , Tomas Kocisky , Philip Blunsom
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.
-
Publication No.: US12131248B2
Publication Date: 2024-10-29
Application No.: US18144810
Filing Date: 2023-05-08
Applicant: DeepMind Technologies Limited
Inventor: Yujia Li , Christopher James Dyer , Oriol Vinyals
CPC classification number: G06N3/047 , G06F16/9024 , G06F17/18 , G06N3/045 , G06N3/08
Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
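A minimal sketch of the sequential sampling the abstract describes: node-generating and edge-generating decisions are sampled one at a time until a stopping decision. The p_add_node and p_add_edge callables stand in for the learned probability distributions and are assumptions for illustration.

```python
import random

def sample_graph(p_add_node, p_add_edge, max_nodes=50):
    """Sample a graph node by node; p_add_node(nodes, edges) and
    p_add_edge(nodes, edges, new, other) stand in for the learned
    distributions over node- and edge-generating decisions."""
    nodes, edges = [], set()
    while len(nodes) < max_nodes and random.random() < p_add_node(nodes, edges):
        new = len(nodes)
        nodes.append(new)
        # For every existing node, sample whether to connect it to the new one.
        for other in nodes[:-1]:
            if random.random() < p_add_edge(nodes, edges, new, other):
                edges.add((other, new))
    return nodes, edges

# Toy stand-ins for the learned decision probabilities.
nodes, edges = sample_graph(lambda n, e: 0.9, lambda n, e, a, b: 0.3)
```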
-
Publication No.: US20240013769A1
Publication Date: 2024-01-11
Application No.: US18038631
Filing Date: 2021-11-22
Applicant: DeepMind Technologies Limited
Inventor: Ian Michael Gemp , Yoram Bachrach , Roma Patel , Christopher James Dyer
IPC: G10L13/047 , G10L13/08
CPC classification number: G10L13/047 , G10L13/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting an input vocabulary for a machine learning model using power indices. One of the methods includes computing a respective score for each of a plurality of text tokens in an initial vocabulary and then selecting the text tokens in the input vocabulary based on the respective scores.
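The abstract describes scoring tokens in an initial vocabulary with power indices and selecting the input vocabulary from those scores. The sketch below uses a Monte-Carlo Shapley-style index as one possible power index; the value_fn interface and the sampling scheme are assumptions, not the claimed method.

```python
import random

def power_index_scores(tokens, value_fn, num_samples=200):
    """Monte-Carlo Shapley-style power index per token: the average
    marginal gain in value_fn from adding the token to a random prefix
    of the vocabulary. value_fn(vocab_subset) is an assumed task metric."""
    scores = {t: 0.0 for t in tokens}
    for _ in range(num_samples):
        order = random.sample(tokens, len(tokens))
        included, prev_value = set(), value_fn(set())
        for tok in order:
            included.add(tok)
            value = value_fn(included)
            scores[tok] += (value - prev_value) / num_samples
            prev_value = value
    return scores

def select_vocabulary(tokens, value_fn, k):
    # Keep the k tokens with the highest power-index scores.
    scores = power_index_scores(tokens, value_fn)
    return sorted(tokens, key=scores.get, reverse=True)[:k]
```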
-
Publication No.: US20200151398A1
Publication Date: 2020-05-14
Application No.: US16746012
Filing Date: 2020-01-17
Applicant: DeepMind Technologies Limited
Inventor: Lei Yu , Christopher James Dyer , Tomas Kocisky , Philip Blunsom
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.
-
Publication No.: US10572603B2
Publication Date: 2020-02-25
Application No.: US16403281
Filing Date: 2019-05-03
Applicant: DeepMind Technologies Limited
Inventor: Lei Yu , Christopher James Dyer , Tomas Kocisky , Philip Blunsom
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a target sequence from an input sequence. In one aspect, a method comprises maintaining a set of current hypotheses, wherein each current hypothesis comprises an input prefix and an output prefix. For each possible combination of input and output prefix length, the method extends any current hypothesis that could reach the possible combination to generate respective extended hypotheses for each such current hypothesis; determines a respective direct score for each extended hypothesis using a direct model; determines a first number of highest-scoring hypotheses according to the direct scores; rescores the first number of highest-scoring hypotheses using a noisy channel model to generate a reduced number of hypotheses; and adds the reduced number of hypotheses to the set of current hypotheses.
-
Publication No.: US12287795B2
Publication Date: 2025-04-29
Application No.: US18401120
Filing Date: 2023-12-29
Applicant: DeepMind Technologies Limited
Inventor: Domenic Joseph Donato , Christopher James Dyer , Rémi Leblond
IPC: G06F16/00 , G06F16/2457 , G06F40/284
Abstract: Methods and systems for beam search decoding. One of the methods includes initializing beam data specifying a set of k candidate output sequences and a respective total score for each of the candidate output sequences; updating the beam data at each of a plurality of decoding steps, comprising, at each decoding step: generating a score distribution that comprises a respective score for each token in the vocabulary; identifying a plurality of expanded sequences; generating, for each expanded sequence, a respective backwards-looking score; generating, for each expanded sequence, a respective forward-looking score; computing, for each expanded sequence, a respective total score from the respective forward-looking score for the expanded sequence and the respective backwards-looking score for the expanded sequence; and updating the set of k candidate output sequences using the respective total scores for the expanded sequences.
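A compact sketch of the beam update the abstract describes: each candidate is expanded using a per-token score distribution, every expanded sequence receives a backwards-looking and a forward-looking score, and the k best totals form the next beam. The step_scores, backward_score, and forward_score interfaces are assumed for illustration, not taken from the patent.

```python
import heapq

def beam_search(step_scores, backward_score, forward_score,
                k=4, expansions_per_hyp=8, num_steps=10):
    """Beam-search decoding with separate backwards- and forward-looking
    scores. step_scores(seq) returns a dict of per-token scores;
    backward_score(seq) and forward_score(seq) are the two scoring
    components named in the abstract (assumed interfaces)."""
    beam = [((), 0.0)]  # (candidate output sequence, total score)
    for _ in range(num_steps):
        expanded = []
        for seq, _ in beam:
            dist = step_scores(seq)  # respective score for each vocabulary token
            # Identify expanded sequences from the highest-scoring tokens.
            top_tokens = heapq.nlargest(expansions_per_hyp, dist, key=dist.get)
            for tok in top_tokens:
                new_seq = seq + (tok,)
                # Total score combines the backwards- and forward-looking scores.
                total = backward_score(new_seq) + forward_score(new_seq)
                expanded.append((new_seq, total))
        # Update the set of k candidate output sequences.
        beam = heapq.nlargest(k, expanded, key=lambda x: x[1])
    return beam
```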
-
Publication No.: US20240220506A1
Publication Date: 2024-07-04
Application No.: US18401120
Filing Date: 2023-12-29
Applicant: DeepMind Technologies Limited
Inventor: Domenic Joseph Donato , Christopher James Dyer , Rémi Leblond
IPC: G06F16/2457 , G06F40/284
CPC classification number: G06F16/24573 , G06F40/284
Abstract: Methods and systems for beam search decoding. One of the methods includes initializing beam data specifying a set of k candidate output sequences and a respective total score for each of the candidate output sequences; updating the beam data at each of a plurality of decoding steps, comprising, at each decoding step: generating a score distribution that comprises a respective score for each token in the vocabulary; identifying a plurality of expanded sequences; generating, for each expanded sequence, a respective backwards-looking score; generating, for each expanded sequence, a respective forward-looking score; computing, for each expanded sequence, a respective total score from the respective forward-looking score for the expanded sequence and the respective backwards-looking score for the expanded sequence; and updating the set of k candidate output sequences using the respective total scores for the expanded sequences.
-
Publication No.: US11704541B2
Publication Date: 2023-07-18
Application No.: US16759525
Filing Date: 2018-10-29
Applicant: DeepMind Technologies Limited
Inventor: Yujia Li , Christopher James Dyer , Oriol Vinyals
CPC classification number: G06N3/047 , G06F16/9024 , G06F17/18 , G06N3/045 , G06N3/08
Abstract: There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
-