Low power wireless communication method
    1.
    Invention application
    Low power wireless communication method (pending, published)

    Publication number: US20080084836A1

    Publication date: 2008-04-10

    Application number: US11906809

    Filing date: 2007-10-04

    IPC classification: G08C17/00

    Abstract: A low power wireless communication method has a remote device with a simple receiver that listens for a wake-up signal. When the wake-up signal is received, a complex receiver is turned on to communicate with the control device. In another embodiment, the simple receiver powers up periodically (or aperiodically) to listen for the wake-up signal. In addition, a wireless modem can communicate with a device, such as an electronic lock, in a number of modes to save power. In one mode the wireless modem simply passes any incoming messages through to the device in real time. However, if power needs to be conserved, incoming messages can be saved in a cache and forwarded to the device over a low power bus, such as a serial bus. In another embodiment, the incoming message can be filtered to determine whether it needs to be forwarded to the device.
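    The abstract above describes three power-saving forwarding modes (real-time pass-through, cache-and-forward over a low power serial bus, and filtering) plus a duty-cycled wake-up receiver. The Python sketch below illustrates that control flow only; ForwardMode, WirelessModem, wake_up_loop, and the callables passed into them are hypothetical names invented for this example, not interfaces taken from the patent.

from collections import deque
from enum import Enum, auto

class ForwardMode(Enum):
    PASS_THROUGH = auto()   # relay each incoming message to the device in real time
    CACHE = auto()          # buffer messages, forward later over a low power bus
    FILTER = auto()         # forward only messages the device actually needs

class WirelessModem:
    """Hypothetical modem front end for a device such as an electronic lock."""

    def __init__(self, deliver, bus_send, keep, mode=ForwardMode.PASS_THROUGH):
        self.deliver = deliver      # callable: push a message to the device now
        self.bus_send = bus_send    # callable: send over the low power serial bus
        self.keep = keep            # callable: predicate used in FILTER mode
        self.mode = mode
        self.cache = deque()

    def on_message(self, msg):
        if self.mode is ForwardMode.PASS_THROUGH:
            self.deliver(msg)
        elif self.mode is ForwardMode.CACHE:
            self.cache.append(msg)              # held until flush_cache() runs
        elif self.mode is ForwardMode.FILTER and self.keep(msg):
            self.deliver(msg)

    def flush_cache(self):
        """Drain cached messages over the low power bus when power allows."""
        while self.cache:
            self.bus_send(self.cache.popleft())

def wake_up_loop(heard_wake_up, run_complex_receiver, sleep, cycles=3):
    """Duty-cycled listening: the simple receiver polls briefly, and the
    power-hungry complex receiver is enabled only after a wake-up signal."""
    for _ in range(cycles):
        if heard_wake_up():             # simple receiver powered on briefly
            run_complex_receiver()      # full session with the control device
        sleep()                         # everything off between listen windows

if __name__ == "__main__":
    modem = WirelessModem(
        deliver=lambda m: print("to device:", m),
        bus_send=lambda m: print("over serial bus:", m),
        keep=lambda m: m.get("target") == "lock",
        mode=ForwardMode.CACHE,
    )
    modem.on_message({"target": "lock", "cmd": "unlock"})
    modem.flush_cache()                 # forwards the cached message over the bus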


    Learning controller with advantage updating algorithm
    2.
    Invention grant
    Learning controller with advantage updating algorithm (expired)

    Publication number: US5608843A

    Publication date: 1997-03-04

    Application number: US283729

    Filing date: 1994-08-01

    CPC classification: G05B13/0265 G06N99/005

    Abstract: A new algorithm for reinforcement learning, advantage updating, is proposed. Advantage updating is a direct learning technique; it does not require a model to be given or learned. It is incremental, requiring only a constant amount of calculation per time step, independent of the number of possible actions, possible outcomes from a given action, or number of states. Analysis and simulation indicate that advantage updating is applicable to reinforcement learning systems working in continuous time (or discrete time with small time steps) for which Q-learning is not applicable. Simulation results are presented indicating that for a simple linear quadratic regulator (LQR) problem with no noise and large time steps, advantage updating learns slightly faster than Q-learning. When there is noise or small time steps, advantage updating learns more quickly than Q-learning by a factor of more than 100,000. Convergence properties and implementation issues are discussed. New convergence results are presented for R-learning and algorithms based upon change in value. It is proved that the learning rule for advantage updating converges to the optimal policy with probability one.
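    The abstract above describes advantage updating as a direct, incremental learner that keeps separate value and advantage estimates and scales its temporal-difference error by the time step, which is what lets it cope with small time steps where Q-learning degrades. The tabular Python sketch below conveys only that structure; the exact update and normalization rules of the published algorithm may differ, and every name and constant here is an assumption made for the example.

import numpy as np

def make_advantage_learner(n_states, n_actions, gamma=0.99, dt=0.1,
                           alpha=0.2, beta=0.2, omega=0.2):
    """Tabular sketch in the spirit of advantage updating (not the exact
    published update rules): V(s) holds state values, A(s, a) holds
    per-unit-time advantages, with max_a A(s, a) pushed toward zero."""
    V = np.zeros(n_states)
    A = np.zeros((n_states, n_actions))

    def update(s, a, r, s_next):
        a_max_old = A[s].max()
        # Temporal-difference error per unit time: dividing by dt keeps the
        # advantage signal well scaled as dt shrinks, the property the
        # abstract contrasts with Q-learning.
        td = (r + gamma ** dt * V[s_next] - V[s]) / dt
        A[s, a] += alpha * (a_max_old + td - A[s, a])
        # Fold any change in the best advantage into the value estimate,
        # then renormalize so the best action's advantage stays near zero.
        a_max_new = A[s].max()
        V[s] += beta * (a_max_new - a_max_old)
        A[s] -= omega * A[s].max()

    def greedy_action(s):
        return int(A[s].argmax())   # constant work per step, no model needed

    return update, greedy_action

# Tiny usage example on a made-up 2-state, 2-action problem.
update, act = make_advantage_learner(n_states=2, n_actions=2)
for _ in range(1000):
    update(s=0, a=act(0), r=1.0, s_next=1)
    update(s=1, a=act(1), r=0.0, s_next=0)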
