Optimining through Reinforcement Learning

Architex aims to make its new theoretical concept of "Optimining" possible through the Morpheus protocol.

Q-learning Algorithms

Application: Dynamically adjusting mining parameters to maximize rewards (e.g., mining profits) while minimizing costs (such as energy consumption).

Operation: The reinforcement learning agent estimates the expected value (Q-value) of each possible action in the current state and follows a strategy that maximizes the discounted sum of future rewards.
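As an illustration of this decision step, the minimal sketch below picks the action with the highest estimated value. The action names and Q-values are hypothetical, and the epsilon-greedy exploration step is a common Q-learning convention assumed here, not something specified for the Morpheus protocol.

```python
import random

def select_action(q_values, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over a dict mapping action -> estimated value.

    With probability epsilon a random action is explored (an assumption,
    not part of the source text); otherwise the highest-valued action is used.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)

# Hypothetical Q-values for a single state; names are illustrative only.
q_values = {"lower_power": 0.4, "keep_settings": 0.7, "raise_power": 0.2}

# epsilon=0.0 disables exploration, so the greedy action is always returned.
best = select_action(q_values, epsilon=0.0)
```

Setting epsilon above zero trades some immediate reward for the chance to discover better mining parameters over time.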

Basic Formula: The update of the Q-value for a state-action pair is given by:

$$Q(s,a) \leftarrow Q(s,a) + \alpha \left[ r + \gamma \max_{a'} Q(s',a') - Q(s,a) \right]$$

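Where s and a are the current state and action, s′ is the next state, a′ ranges over the actions available in s′, r is the immediate reward, γ is the discount factor, and α is the learning rate.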
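The update rule can be sketched as a tabular Q-learning step. This is a minimal illustration under stated assumptions: the state labels, action names, reward values, and the defaults for α and γ are hypothetical and do not come from the Morpheus protocol.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one Q-learning update in place and return the new Q(s, a).

    Q maps (state, action) pairs to estimated values; unseen pairs default to 0.
    """
    # max over a' of Q(s', a')
    best_next = max(Q[(next_state, a)] for a in actions)
    # r + gamma * max_a' Q(s', a')
    td_target = reward + gamma * best_next
    # Q(s, a) <- Q(s, a) + alpha * (target - Q(s, a))
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]

# Hypothetical setup: two actions and string state labels.
Q = defaultdict(float)
actions = ["lower_power", "raise_power"]

# One observed transition: raising power in s0 earned reward 1.0 and led to s1.
new_q = q_update(Q, "s0", "raise_power", reward=1.0, next_state="s1", actions=actions)
```

Because every entry starts at zero, this first update moves Q(s0, raise_power) a fraction α of the way toward the observed reward. Later updates also pull in the discounted value of the best action in the next state.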
