Q-learning

QLearner

Description: A simple reinforcement learning framework that can be used to learn optimal policies for Markov decision processes using Q-learning. Q-learning is a model-free reinforcement learning algorithm that learns an optimal action-value function from experience by repeatedly updating estimates of the Q-value of state-action pairs.

Class Object: QLearner Class.

Inherits from: Object.

matrix

Type: Readonly Field.

Description: The matrix that stores state, action, and Q-value.

Signature: const matrix: {{ --[[state]] integer, --[[action]] integer, --[[Q-value]] number }}

update

Type: Function.

Description: Update Q-value for a state-action pair based on received reward.

Signature: update: function(self: QLearner, state: integer, action: integer, reward: number)

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| state | integer | Representing the state. |
| action | integer | Representing the action. Must be greater than 0. |
| reward | number | Representing the reward received for the action in the state. |

getBestAction

Type: Function.

Description: Returns the best action for a given state based on the current Q-values.

Signature: getBestAction: function(self: QLearner, state: integer): integer

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| state | integer | The current state. |

Returns:

| Return Type | Description |
| --- | --- |
| integer | The action with the highest Q-value for the given state. Returns 0 if no action is available. |

load

Type: Function.

Description: Load Q-values from a matrix of state-action pairs.

Signature: load: function(self: QLearner, values: {{ --[[state]] integer, --[[action]] integer, --[[Q-value]] number }})

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| values | {{integer - The state, integer - The action, number - The Q-value for the given state-action pair}} | The matrix of state-action pairs to load. |
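
To make the roles of update, getBestAction, load, and matrix concrete, here is a minimal sketch in plain Lua. It does not use the real class: the `QLearner` table below is a hypothetical stand-in written for illustration. The constructor `QLearner.new`, the `alpha` and `gamma` parameters, the `getMatrix` method (standing in for the read-only `matrix` field), and the update rule itself are all assumptions, since this page does not specify them. Because update receives no next state, the sketch assumes the rule Q(s,a) ← (1 − α)·Q(s,a) + α·(reward + γ·max Q(s,·)).

```lua
-- Hypothetical stand-in for QLearner, for illustration only.
-- alpha (learning rate), gamma (discount), the constructor, and the update
-- rule are assumptions; the documented class may behave differently.
local QLearner = {}
QLearner.__index = QLearner

function QLearner.new(alpha, gamma)
  return setmetatable({ alpha = alpha or 0.5, gamma = gamma or 0.5, q = {} }, QLearner)
end

-- Highest Q-value stored for a state, or 0 if the state has no entries.
local function maxQ(self, state)
  local best = nil
  for _, value in pairs(self.q[state] or {}) do
    if best == nil or value > best then best = value end
  end
  return best or 0
end

function QLearner:update(state, action, reward)
  assert(action > 0, "action must be greater than 0")
  self.q[state] = self.q[state] or {}
  local old = self.q[state][action] or 0
  -- Assumed rule: blend the old estimate with reward plus discounted best value.
  local target = reward + self.gamma * maxQ(self, state)
  self.q[state][action] = (1 - self.alpha) * old + self.alpha * target
end

function QLearner:getBestAction(state)
  local bestAction, bestValue = 0, nil
  for action, value in pairs(self.q[state] or {}) do
    -- Ties are broken toward the smaller action index to stay deterministic.
    if bestValue == nil or value > bestValue
      or (value == bestValue and action < bestAction) then
      bestAction, bestValue = action, value
    end
  end
  return bestAction -- 0 when no action is recorded for the state
end

function QLearner:load(values)
  for _, entry in ipairs(values) do
    local state, action, value = entry[1], entry[2], entry[3]
    self.q[state] = self.q[state] or {}
    self.q[state][action] = value
  end
end

-- Stand-in for the read-only `matrix` field: a list of {state, action, Q-value}.
function QLearner:getMatrix()
  local rows = {}
  for state, row in pairs(self.q) do
    for action, value in pairs(row) do
      rows[#rows + 1] = { state, action, value }
    end
  end
  return rows
end

-- Usage: reward action 1 and penalize action 2 in state 1.
local learner = QLearner.new(0.5, 0.5)
learner:update(1, 1, 1.0)
learner:update(1, 2, -1.0)
print(learner:getBestAction(1))  -- best action learned for state 1
print(learner:getBestAction(99)) -- state with no recorded actions

-- Persist and restore: feed the exported rows into a fresh learner.
local restored = QLearner.new()
restored:load(learner:getMatrix())
print(restored:getBestAction(1))
```

The export and reload at the end reflects how the documented matrix field and load function appear to pair up: both use the same {state, action, Q-value} row layout, so the rows read from one learner can be passed directly to another.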