A method and an apparatus for renewable energy allocation based on reinforcement learning, adapted for an energy aggregator having an energy storage system (ESS) to determine renewable energy allocation between multiple energy suppliers and multiple energy demanders, are provided. In the method, historical power generation data of each energy supplier is collected and used to generate renewable energy indexes representing the uncertainty of the renewable energy. Multiple market indexes associated with a state of a renewable energy market are collected and integrated with the renewable energy indexes and electricity information of the ESS to generate multiple states of a Markov decision process. The states are input to a reinforcement learning model to determine bid prices for the energy suppliers and the energy demanders. According to the supplies and demands submitted by the energy suppliers and the energy demanders in response to the prices, the ESS is adjusted to coordinate supply and demand of the renewable energy, and the reinforcement learning model is updated.
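The state construction and ESS adjustment described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how the renewable energy index is computed (the coefficient of variation is used here as a stand-in for an uncertainty measure), how the state is laid out, or how the ESS is controlled, and the function names `renewable_index`, `build_state`, and `settle_with_ess` are hypothetical. The reinforcement learning model that maps states to bid prices is not shown.

```python
from statistics import mean, pstdev


def renewable_index(history):
    """Assumed uncertainty measure: coefficient of variation of past generation."""
    m = mean(history)
    return pstdev(history) / m if m > 0 else 0.0


def build_state(histories, market_indexes, ess_charge, ess_capacity):
    """Assumed state layout: per-supplier indexes, market indexes, ESS state of charge."""
    indexes = [renewable_index(h) for h in histories]
    return tuple(indexes) + tuple(market_indexes) + (ess_charge / ess_capacity,)


def settle_with_ess(supply, demand, ess_charge, ess_capacity):
    """Charge the ESS with surplus, or discharge it to cover a shortfall.

    Returns (new_charge, unmet): unmet > 0 is demand left unserved,
    unmet < 0 is surplus that could not be stored.
    """
    imbalance = supply - demand
    # Clamp the new charge level to the physical limits of the storage.
    new_charge = min(max(ess_charge + imbalance, 0.0), ess_capacity)
    unmet = (new_charge - ess_charge) - imbalance
    return new_charge, unmet
```

In a full loop under these assumptions, the tuple from `build_state` would be passed to the learning model, the submitted supplies and demands would be summed and passed to `settle_with_ess`, and the resulting imbalance could feed into the reward used to update the model.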