Indicate which option describes the following action-selection method best.
Keep track of a count, $K_{s,a}$, for each state-action tuple $(s,a)$, of the number of times that tuple has been seen, and select $\arg\max_a \left[ Q(s,a) - K_{s,a} \right]$.
Group of answer choices
a. Mix of both
b. Mostly exploitation
c. Mostly exploration
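
To see how the rule in the question behaves, here is a minimal Python sketch. It rests on assumptions that are not in the question: the state name "s0", the two actions, and their Q-values are hypothetical; the Q-values are held fixed (no learning update); and a tuple counts as "seen" each time its action is selected. The helper name select_action is also made up for illustration.

```python
from collections import defaultdict


def select_action(Q, K, state, actions):
    """Pick argmax_a [Q(s, a) - K_{s,a}] and record the visit.

    Assumption: a tuple is "seen" whenever its action is selected.
    """
    best = max(actions, key=lambda a: Q[(state, a)] - K[(state, a)])
    K[(state, best)] += 1
    return best


# Hypothetical, fixed action values for a single state.
Q = defaultdict(float)
Q[("s0", "left")] = 1.0
Q[("s0", "right")] = 0.5

# Visit counts K_{s,a}, all starting at zero.
K = defaultdict(int)

choices = [select_action(Q, K, "s0", ["left", "right"]) for _ in range(6)]
print(choices)
# → ['left', 'right', 'left', 'right', 'left', 'right']
```

With these fixed values the rule alternates between the two actions, because every selection lowers the chosen action's score by one while its Q-value stays the same.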
