Patent Licensing, National Tsing Hua University Operations Center for Industry Collaboration

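Patent title (English)	MASTER POLICY TRAINING METHOD OF HIERARCHICAL REINFORCEMENT LEARNING WITH ASYMMETRICAL POLICY ARCHITECTURE
Patent family	Taiwan (R.O.C.): I835638; United States: 2023-0362196 (publication number)
Patent holder	National Tsing Hua University 100.00%
Inventor	李濬屹
Technical field	Information engineering; electronics and electrical engineering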

Patent abstract (English)
The present invention comprises the following steps: loading a master policy, a plurality of sub-policies, and environment data, wherein the sub-policies have different inference costs; selecting one of the sub-policies as a selected sub-policy by using the master policy; generating at least one action signal according to the selected sub-policy; applying the at least one action signal to an action executing unit; detecting at least one reward signal from a detecting module; and training the master policy using at least one real inference cost of the at least one reward signal and an expected inference cost of the selected sub-policy, so as to minimize inference cost. The master policy is thus trained using hierarchical reinforcement learning with an asymmetrical policy architecture, which allows it to reduce inference cost while maintaining satisfactory performance for a deep neural network model.
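
The steps in the abstract can be illustrated with a toy sketch. This is not the patented method: the abstract does not specify how the inference costs enter the training objective, so the sketch assumes a simple cost-penalized reward (task reward minus a weighted inference cost) and a tabular, epsilon-greedy master policy. The sub-policy names, cost values, the two-state environment, and all function names are hypothetical.

```python
import random

# Hypothetical sub-policies with different inference costs (illustrative values).
SUB_POLICY_COSTS = {"small": 0.1, "large": 0.5}
STATES = ("easy", "hard")


def task_reward(state, sub_policy):
    """Toy environment: the small sub-policy only succeeds on easy states."""
    if sub_policy == "large" or state == "easy":
        return 1.0
    return 0.0


def train_master(episodes=2000, lr=0.1, epsilon=0.1, cost_weight=1.0, seed=0):
    """Train a tabular master policy on a cost-penalized reward (an assumption)."""
    rng = random.Random(seed)
    # One value estimate per (state, sub-policy) pair.
    q = {(s, p): 0.0 for s in STATES for p in SUB_POLICY_COSTS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        # The master policy selects a sub-policy (epsilon-greedy).
        if rng.random() < epsilon:
            chosen = rng.choice(list(SUB_POLICY_COSTS))
        else:
            chosen = max(SUB_POLICY_COSTS, key=lambda p: q[(state, p)])
        # Assumed training signal: task reward minus weighted inference cost.
        cost = SUB_POLICY_COSTS[chosen]
        signal = task_reward(state, chosen) - cost_weight * cost
        # Move the estimate for the chosen pair toward the observed signal.
        q[(state, chosen)] += lr * (signal - q[(state, chosen)])
    return q


def select_sub_policy(q, state):
    """Greedy master-policy decision after training."""
    return max(SUB_POLICY_COSTS, key=lambda p: q[(state, p)])


if __name__ == "__main__":
    q_values = train_master()
    for s in STATES:
        print(s, "->", select_sub_policy(q_values, s))
```

In this toy setting the master policy learns to call the cheaper sub-policy where it suffices and the costlier one only where it is needed, which is the trade-off the abstract describes. Raising the hypothetical `cost_weight` makes the master policy favor the cheaper sub-policy more strongly.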

Contact information
Contact person	李曉琪
Contact phone	03-5715131 ext. 31061
Contact email	hsiaochi@mx.nthu.edu.tw